
PEtab test suite

The PEtab test suite is a collection of test models for the PEtab parameter estimation data format. It is intended to be used by developers of tools for model simulation and parameter estimation to verify and quantify their tools' PEtab support.

Downloading and installing the test suite

The PEtab test suite can be downloaded from GitHub via

git clone https://github.com/petab-dev/petab_test_suite

The test suite comes with all necessary files pregenerated.

Python library

The PEtab test suite comes with a Python package named petabtests, located in the subdirectory of the same name. It contains Python functions for generating the tests and evaluating results. It can be installed via

cd petab_test_suite
pip3 install -e .

Using the test suite

Files

The petabtests/cases subdirectory contains different test suites for different PEtab versions and model formats. Each test suite is a collection of enumerated tests. Each test consists of a single PEtab problem, defined in the file XXXX/_XXXX.yaml, and the expected results (chi2 value, log-likelihood, simulation table reference, and tolerances) in XXXX/_XXXX_solution.yaml. XXXX/README.md contains a short description of the respective test problem; it is not relevant for the execution of the test itself.
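
For orientation, a minimal Python sketch of reading a case's solution file with PyYAML; the case number 0001 is just an illustrative example, and the exact key names should be taken from the actual solution files:

import yaml

case = "0001"  # illustrative case number

# The solution file stores the expected chi2 value, log-likelihood,
# simulation table reference, and tolerances.
with open(f"{case}/_{case}_solution.yaml") as f:
    solution = yaml.safe_load(f)

print(solution)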

Evaluate results

To evaluate how a tool performs on a given test problem, three metrics are employed: simulations, the chi2 value, and the log-likelihood. A tool can be said to cover a test problem if any of these values matches the ground truth up to the specified tolerance.

The Python package provides convenience functions for evaluation in petabtests/evaluate.py.
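
As a rough sketch of that comparison logic (illustrative only, not the actual petabtests API; see petabtests/evaluate.py for the real functions):

import numpy as np

def case_passes(chi2, llh, simulations,
                gt_chi2, gt_llh, gt_simulations,
                tol_chi2, tol_llh, tol_simulations):
    # A case counts as covered if any of the three reported
    # quantities matches the ground truth within its tolerance.
    chi2_ok = abs(chi2 - gt_chi2) < tol_chi2
    llh_ok = abs(llh - gt_llh) < tol_llh
    sim_ok = np.allclose(simulations, gt_simulations, atol=tol_simulations)
    return chi2_ok or llh_ok or sim_ok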

See the overview of passed test cases for different tools supporting PEtab.

Contributing

Contributions to the PEtab test suite are welcome.

Adding a new test case

To add a new test case, create a new subdirectory in the respective test suite directory under petabtests/cases/. The subdirectory name should be the next available four-digit number.

Most files in the test suite are generated automatically; these start with an underscore, except for the README.md files. The only files that need to be created manually are the XXXX/XXXX.py files, which contain the script to generate the test problem and solution files. Their content should be self-explanatory. To add a new test case, copy XXXX/XXXX.py from an existing test case and adjust it accordingly.

All remaining files are generated by the petabtests_create script which will be available on your $PATH after installing the provided Python library (see above).
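
For example, after installing the Python package (assuming the script regenerates all cases when invoked without arguments):

petabtests_create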

Contributors

dweindl, ffroehlich, leonardschmiester, marcusrosenblatt, yannikschaelte


Issues

#0007 tests multiple settings at once

Hi all,
in addition to #11, we would suggest splitting this test case into

7a) two data points, one on linear, one on log10 scale
7b) two data points, one on linear, one on log scale.

This would make it transparent that d2d natively supports log10 transformations, but not natural-logarithm transformations (although the latter is doable in principle).

Best,
Lukas

replace simplesbml by antimony

... for creating SBML test models.

Then, remove the default SBML file and create models on the fly, specific to each test case.
Currently, some models have initial assignments where we would not want them, or they have some unused components that are distracting.

Add tests with compartment updates in the condition table

Compartment size assignments are tricky, since concentrations need to be updated via conversion to amounts. The tests should also cover compartment size updates due to changes in states/parameters that occur in size assignments, as well as the concurrent update of compartment size AND concentration (in which case no conversion should be performed, if assignments in the condition table are handled the same way as event/initial assignments).
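
A minimal sketch of the conversion being described (illustrative, not tied to any particular simulator):

def rescale_concentration(conc_old, vol_old, vol_new):
    # Changing only the compartment size preserves the amount,
    # so the concentration must be rescaled via amounts.
    amount = conc_old * vol_old
    return amount / vol_new

# If the condition table sets both the compartment size and the
# species concentration, the given concentration should be used
# as-is, and no conversion should be performed.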

Check gradients

Compare gradients/sensitivities to analytic values or finite differences

follow-up of discussion in #6

Add readme

Should contain at least:

  • how to install and use the library
  • how tool support for a test case is defined (correct simulations, chi2, llh?)

Add test case for objectivePriors

I haven't seen a test case that checks whether the cost is calculated correctly when various objective prior types are present. I think it would be good to add a test case for this.
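
A sketch of what such a test would exercise, assuming normal objective priors (the function and the prior bookkeeping are made up for illustration; other prior types would add analogous terms):

import numpy as np

def neg_log_posterior(neg_llh, parameters, normal_priors):
    # normal_priors: parameter id -> (mean, std) of a normal
    # objective prior.
    nlp = neg_llh
    for pid, (mean, std) in normal_priors.items():
        x = parameters[pid]
        nlp += 0.5 * np.log(2 * np.pi * std**2) + 0.5 * ((x - mean) / std)**2
    return nlp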

Present details on which tools support what

How best to present this? One could create a table with a row for each test case (with a short description) and then check, for each tool, whether the test case passes. Other ideas?

#0007 likelihood calculation

We (dMod & d2d teams) find a different likelihood value for test case 0007. Could someone please check if there is an error in the solutions file?

data = [0.2000   -0.0969]; %second datapoint log10-transformed
simus = [0.4286   -0.2430]; %second simulation log10-transformed
errors = [0.5000    0.6000];

DataContribution = -((data-simus).^2)./(2*errors.^2) %[-0.1045   -0.0297]
ErrorContribution = -1/2 * log(2*pi*errors.^2) %[-0.2258   -0.4081]

llh = sum(DataContribution + ErrorContribution) %-0.7681
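
The same arithmetic in Python, for cross-checking (values copied from the snippet above):

import numpy as np

data = np.array([0.2000, -0.0969])    # second data point log10-transformed
simus = np.array([0.4286, -0.2430])   # second simulation log10-transformed
errors = np.array([0.5000, 0.6000])

data_contribution = -(data - simus)**2 / (2 * errors**2)    # [-0.1045, -0.0297]
error_contribution = -0.5 * np.log(2 * np.pi * errors**2)   # [-0.2258, -0.4081]

llh = np.sum(data_contribution + error_contribution)
print(llh)  # approximately -0.7681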

Test interpretation of parameterScale

I am wondering whether it would be good to add a test case that checks whether an optimizer interprets the (lowerBound, upperBound, and) nominalValue of log- and log10-scaled parameters on a linear scale.
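
For reference, a small sketch of the transformation a tool would have to apply if it works on the estimation scale internally (lin, log, and log10 are the PEtab parameterScale values; the helper itself is illustrative):

import numpy as np

def to_estimation_scale(value, scale):
    # lowerBound, upperBound, and nominalValue are given on the
    # linear scale; parameterScale only defines the scale on
    # which the parameter is estimated.
    if scale == "lin":
        return value
    if scale == "log":
        return np.log(value)
    if scale == "log10":
        return np.log10(value)
    raise ValueError(f"unknown parameterScale: {scale}")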

challenge for 0007

Hi,

for test model 0007, I get the correct chi2 value, but the likelihood differs.

As you can see in this R code snippet:

chi2 <- obj(pouter)$value
chi2
[1] 3.226562
-0.5*(chi2 + sum(log(2*pi*c(0.5, 0.6, 0.7)^2)))
[1] -2.809449

dMod obtains -2.8 instead of -1.38 as in the given solution.

All other test cases give the correct results for both chi2 and llh. Presumably it has something to do with the different log transformations. Could it be that you internally rescale the noise parameter for log10 or log scale? I think this would be wrong.

Best,
Marcus

Add further tests for preequilibration and reinitialization

From the PEtab documentation:

If a species ID is provided, it is interpreted as the initial concentration/amount of that species and will override the initial concentration/amount given in the SBML model or given by a preequilibration condition. If NaN is provided for a condition, the result of the preequilibration (or initial concentration/amount from the SBML model, if no preequilibration is defined) is used.

This is insufficiently covered by the current tests. Case 10 is closest, but it contains no NaN entries for species in the condition table.
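
The quoted override rule as a small sketch (names are illustrative):

import math

def initial_value(condition_value, preeq_result, sbml_initial):
    # NaN in the condition table: keep the preequilibration result,
    # or the SBML initial value if no preequilibration is defined.
    if math.isnan(condition_value):
        return sbml_initial if preeq_result is None else preeq_result
    # Otherwise, the condition table entry overrides both.
    return condition_value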

Performance criterion

What exactly should be the criterion for whether a tool supports a feature, i.e., for whether a test case passes? We have simulation tables, chi2 values, and llh values. However, not all tools support computing chi2 or llh values, or extracting the simulations. One option would be to consider a case passed if any of the three values fits, and to add a note, e.g. "tool does not support likelihoods". Other ideas?

question to test case 12

Hi,

maybe I do not understand this specific situation in the test.

Changing the compartment size from 1 to 3 when both A and B are inside this one compartment should not change the trajectories of A and B. The simulation solution of 0.42857 for obs_a is the same as for other cases. Could it be that the initial value of B should be 0 instead of 1 in test case 12? I think then everything would be correct, and the frameworks should also be able to capture this test case.

Best,
Marcus

bug in test model 0006!?

Hi,

maybe I am missing something, but I think the solution value (e.g. chi^2) for model 0006 is not correct. The result suggests that the data in the measurement file have to be scaled by the observable parameters (once 10 and once 15). In my understanding, the observable parameters are to be applied to the observables only, and the data should already be on the right scale (at the moment it is 1 at t=0, but it should be 10).

Most likely, the data points just have to be adjusted accordingly in the measurement file.

By the way: do we actually have benchmark models where something like this appears? Different scaling factors for one and the same combination of simulationConditionID and observable?

Best,
Marcus

file names within test cases

Hi,

at the moment, we have "_model.tsv", "_parameters.tsv", "_observables.tsv" as file names for the test cases. Is there a reason why we do not use the same naming scheme as in the benchmark collection? Then we would, e.g., have "model_0001.tsv" and "experimentalCondition_0001.tsv", with "0001" being the model name.

This would make it easier to write consistent import/export functions.

Best,
Marcus
