brain-score / brain-score

A framework for evaluating models on their alignment to brain and behavioral measurements (50+ benchmarks)

Home Page: http://brain-score.org

License: MIT License

brain-score's Introduction

Brain-Score is a platform to evaluate computational models of brain function on their match to brain measurements in primate vision. The intent of Brain-Score is to adopt many (ideally all) of the experimental benchmarks in the field for the purpose of model testing, falsification, and comparison. To that end, Brain-Score operationalizes experimental data into quantitative benchmarks that any model candidate following the BrainModel interface can be scored on.

Note that you can only access a limited set of public benchmarks when running locally. To score a model on all benchmarks, submit it via the brain-score.org website.

See the documentation for more details, e.g. for submitting a model or benchmark to Brain-Score. For a step-by-step walkthrough on submitting models to the Brain-Score website, see these web tutorials.

See these code examples on scoring models, retrieving data, and using and defining benchmarks and metrics. The previous examples may still be helpful, but they have been deprecated since the 2.0 update.

Brain-Score is made by and for the community. To contribute, please send in a pull request.

Local installation

You will need Python >= 3.7 and pip >= 18.1.

pip install git+https://github.com/brain-score/vision

Test if the installation is successful by scoring a model on a public benchmark:

from brainscore_vision.benchmarks import public_benchmark_pool

benchmark = public_benchmark_pool['dicarlo.MajajHong2015public.IT-pls']
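# my_model() stands in for any model implementing the BrainModel interface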
model = my_model()
score = benchmark(model)

# >  <xarray.Score ()>
# >  array(0.07637264)
# >  Attributes:
# >      error:                 <xarray.Score ()>\narray(0.00548197)
# >      raw:                   <xarray.Score ()>\narray(0.22545106)\nAttributes:\...
# >      ceiling:               <xarray.DataArray ()>\narray(0.81579938)\nAttribut...
# >      model_identifier:      my-model
# >      benchmark_identifier:  dicarlo.MajajHong2015public.IT-pls

Some steps may take minutes because data has to be downloaded during first-time use.

Environment Variables

Variable             Description
RESULTCACHING_HOME   directory to cache results (benchmark ceilings) in; ~/.result_caching by default (see https://github.com/brain-score/result_caching)
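For example, to redirect the cache to a larger disk, the variable can be set in Python before Brain-Score is imported (the path below is purely illustrative):

import os

# illustrative cache location; set it before importing Brain-Score in case the
# variable is read at import time
os.environ['RESULTCACHING_HOME'] = '/data/brainscore_cache'

import brainscore_vision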

License

MIT license

Troubleshooting

`ValueError: did not find HDF5 headers` during netcdf4 installation: pip seems to fail to properly set up the HDF5_DIR required by netcdf4. Install it via conda instead: `conda install netcdf4`.

Repeated runs of a benchmark / model do not change the outcome even though code was changed: results (scores, activations) are cached on disk using https://github.com/mschrimpf/result_caching. Delete the corresponding file or directory to clear the cache.

CI environment

Add CI-related build commands to test_setup.sh. The script is executed in the CI environment to run the unit tests.

References

If you use Brain-Score in your work, please cite "Brain-Score: Which Artificial Neural Network for Object Recognition is most Brain-Like?" (technical) and "Integrative Benchmarking to Advance Neurally Mechanistic Models of Human Intelligence" (perspective) as well as the respective benchmark sources.

@article{SchrimpfKubilius2018BrainScore,
  title={Brain-Score: Which Artificial Neural Network for Object Recognition is most Brain-Like?},
  author={Martin Schrimpf and Jonas Kubilius and Ha Hong and Najib J. Majaj and Rishi Rajalingham and Elias B. Issa and Kohitij Kar and Pouya Bashivan and Jonathan Prescott-Roy and Franziska Geiger and Kailyn Schmidt and Daniel L. K. Yamins and James J. DiCarlo},
  journal={bioRxiv preprint},
  year={2018},
  url={https://www.biorxiv.org/content/10.1101/407007v2}
}

@article{Schrimpf2020integrative,
  title={Integrative Benchmarking to Advance Neurally Mechanistic Models of Human Intelligence},
  author={Schrimpf, Martin and Kubilius, Jonas and Lee, Michael J and Murty, N Apurva Ratan and Ajemian, Robert and DiCarlo, James J},
  journal={Neuron},
  year={2020},
  url={https://www.cell.com/neuron/fulltext/S0896-6273(20)30605-X}
}

brain-score's People

Contributors

benlonnqvist, chengxuz, dapello, deirdre-k, dmayo, ernestobocini, fksato, franzigeiger, gaspto, jjpr-mit, kvfairchild, lchahuas, linus-md, mike-ferguson, mschrimpf, qbilius, samwinebrake, shehadak, stothe2, susanwys, tiagogmarques, yingtiandt, yudixie

brain-score's Issues

Move configuration external to code

A .ini or .yml (or similar) configuration file, plus infrastructure for making the configuration available throughout the project.

Make a config.example.yml and add config.yml to .gitignore so that users can add credentials without accidentally committing them.

Things to include (a loader sketch follows this list):

  • directory for local data cache (e.g. needs to be a larger mount than home on OpenMind)
  • SQLite file location
  • Postgres connection information
  • Logging level and location
  • AWS credentials and which profile to use
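A minimal sketch of what such a loader could look like (the file name, keys, and environment-variable overrides are hypothetical, not existing project code):

import os
import yaml  # pyyaml

def load_config(path='config.yml'):
    # load the user's config.yml if present, otherwise fall back to defaults
    config = {}
    if os.path.exists(path):
        with open(path) as f:
            config = yaml.safe_load(f) or {}
    # hypothetical environment-variable overrides, e.g. for CI or cluster runs
    config.setdefault('data_cache', os.getenv('BRAINSCORE_DATA_CACHE', '~/.brain-score'))
    config.setdefault('logging_level', os.getenv('BRAINSCORE_LOG_LEVEL', 'INFO'))
    return config

config = load_config()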

sqlite I/O error

Sometimes (!), mkgu raises an sqlite3 disk I/O error.
It does not occur every time; it seems to happen mostly when running jobs in batches. Perhaps concurrent access to the SQLite database does not work? (A possible mitigation is sketched after the traceback.)

Traceback (most recent call last):
  File "neural_metrics/compare.py", line 39, in main
    hvm = mkgu.get_assembly(name="HvM")
  File "/om/user/msch/miniconda3/envs/neural-metrics/lib/python3.6/site-packages/mkgu/__init__.py", line 11, in get_assembly
    return fetch.get_assembly(name)
  File "/om/user/msch/miniconda3/envs/neural-metrics/lib/python3.6/site-packages/mkgu/fetch.py", line 247, in get_assembly
    assy_record = get_lookup().lookup_assembly(name)
  File "/om/user/msch/miniconda3/envs/neural-metrics/lib/python3.6/site-packages/mkgu/fetch.py", line 129, in lookup_assembly
    cursor.execute(self.sql_lookup_assy, (name,))
sqlite3.OperationalError: disk I/O error
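If concurrent access is indeed the culprit, one possible mitigation (a sketch only; the lookup code would have to adopt it) is to open the connection with a busy timeout and write-ahead logging:

import sqlite3

connection = sqlite3.connect('lookup.db', timeout=30)  # wait up to 30s instead of failing on a locked database
connection.execute('PRAGMA journal_mode=WAL')  # allows concurrent readers alongside a writer; note WAL does not work on some network filesystems
cursor = connection.cursor()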

numpy.linalg.LinAlgError: SVD did not converge

For models such as CORnet-R with many zeros in their activations, PLS regression fails with a numpy.linalg.LinAlgError: SVD did not converge.
The error originates from NaNs in the regression weights which in turn stem from https://github.com/scikit-learn/scikit-learn/blob/a7a834bdb7a51ec260ff005715d50ab6ed01a16b/sklearn/cross_decomposition/pls_.py#L67 where x_score = 0 and thus y_weights = ... / 0 = NaN.

Solution approaches

  1. pad activations with epsilon -- failed due to scaling
  2. pad activations with random numbers -- yields a score of 0 due to the randomness
  3. drop neuroids with zero values (see the sketch after this list) -- failed as well, not sure why; maybe due to scaling again
  4. drop neuroids with zero values after scaling -- there seem to be only 42 unique values in the first place, and after this step only one remains
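A sketch of approaches (3)/(4), dropping constant neuroids before the regression (activations stands for a 2D presentation x neuroid array or DataAssembly; names are illustrative):

import numpy as np

def drop_constant_neuroids(activations):
    # constant (e.g. all-zero) columns make PLS produce x_score = 0 and NaN weights
    values = np.asarray(activations)
    keep = np.where(values.std(axis=0) > 0)[0]  # indices of neuroids with non-zero variance
    return activations[:, keep]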

specify git dependencies with PEP 508

PEP 508 allows the specification of install_requires requirements like so:

install_requires=[
"result_caching @ git+https://github.com/mschrimpf/result_caching", 
]

This functionality has been supported since pip 18.1 and removes the need for --process-dependency-links (which will be removed in pip 19).
We need to update at least our setup.py accordingly.

code-arrangement for Similarity utilities

A Similarity takes as input assembly1, assembly2 and outputs a Score object.

As of now, there are two kinds of similarities:

  1. non-parametric, such as the RDMSimilarity: compute similarity of two assemblies directly
  2. parametric, such as the NeuralFit: first fit on a training set, then predict on a test set and compute similarity based on the predictions

We also have additional utility on top of the simple case:

  1. compute the outer product over combinations of all adjacent dimensions (i.e. dimensions that are not used for the Similarity, such as region)
  2. cross-validate with several folds over a dimension that the Similarity is computed over, e.g. object_name as part of presentation

There are several ways to organize the code around this:

  1. sub-classing (this is the way it is organized now): there is a parent class OuterCrossValidationSimilarity that all Similarity classes need to inherit from. This parent class implements (1) and (2) from above and sub-classes only need to implement the simple case. Drawbacks:
    1.1 harder to test since all sub-classes drag the parent code with them
    1.2 we can't just implement apply in our sub-classes but need to adjust the method name
    1.3 we can't compute similarity without the extra baggage, i.e. everything is always cross-validated
  2. chaining: each Similarity class implements exactly one operation in apply. A chain operator then takes all these classes, applies them one after another, and outputs only the final result. The result here would be a list of assemblies, which are then fed into a Score in Similarity.__call__. However, I don't know how to represent both parametric and non-parametric similarities with this approach (one has to fit, predict, and compare predictions; the other just has to compare).
  3. extract the computation from Similarity: all the specialized handling ((1) and (2) from the utilities above) goes into Similarity sub-classes, while the operation on simplified assemblies goes into a Computor class. It is hard to separate the two cleanly, though; for instance, Similarity still needs to call fit, predict, etc.

For now, approach (1) works but after NIPS, I would like to revisit the structuring here.
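For reference, a minimal sketch of the two kinds of similarities as plain classes (class names follow the description above, but the bodies are illustrative, not the actual implementation):

import numpy as np
from scipy.stats import pearsonr

class Similarity:
    # takes two assemblies (here: 2D presentation x neuroid arrays) and returns a scalar score
    def __call__(self, assembly1, assembly2):
        raise NotImplementedError()

class RDMSimilarity(Similarity):
    # non-parametric: compare representational dissimilarity matrices directly
    def __call__(self, assembly1, assembly2):
        rdm1, rdm2 = 1 - np.corrcoef(assembly1), 1 - np.corrcoef(assembly2)
        triu = np.triu_indices_from(rdm1, k=1)  # compare upper triangles only
        return pearsonr(rdm1[triu], rdm2[triu])[0]

class NeuralFit(Similarity):
    # parametric: fit a mapping on a training split, then compare predictions on a test split
    def __init__(self, regression):
        self.regression = regression  # e.g. sklearn.cross_decomposition.PLSRegression

    def __call__(self, train_source, train_target, test_source, test_target):
        self.regression.fit(train_source, train_target)
        prediction = self.regression.predict(test_source)
        # median per-neuroid correlation between prediction and target
        return np.median([pearsonr(prediction[:, i], test_target[:, i])[0]
                          for i in range(test_target.shape[1])])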

lookup.db not copied on installation

Installing mkgu with python setup.py install does not copy the lookup.db file to the site-packages directory. As a result, SQLiteLookup cannot find the table names.
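A possible fix (a sketch; it assumes lookup.db lives inside the mkgu package directory) is to declare the file as package data in setup.py:

from setuptools import setup, find_packages

setup(
    name='mkgu',
    packages=find_packages(),
    # copy the SQLite lookup table into site-packages alongside the code on install
    package_data={'mkgu': ['lookup.db']},
)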

Neural Fit identity score low

Using the neural fit to map hvm onto itself (i.e. neural_fit(hvm, hvm)) yields a comparatively low score of around .78 (see this unit test).
Did others experience something similar (@qbilius)? Is that an issue with the regression?

automated API

Following up on #1, it would be great if we could get some documentation. Ideally we'd generate it automatically through readthedocs.io.

Various suggestions

Bugs

  • test_mkgu has two unused declarations of test_load

Style

  • type is a built-in name in Python (not a reserved word, but shadowing it is still best avoided); consider replacing it with kind
  • Benchmark.calculate, Metric.apply etc. might be better served by a __call__ method
  • Things like _fetcher_types might be better declared at the top and in capitals (FETCHER_TYPES)
  • I strongly suggest not using rst for formatting README and HISTORY. Hardly anybody uses this format; Markdown is much more common.
  • return 0 as a status/placeholder value (e.g., in metrics.py) is not idiomatic Python

Other

  • Docs don't exist yet but aren't they supposed to render automatically on ReadTheDocs?
  • Any chance lookup.db could be a simple csv file? Since it is unlikely to grow too big, there would be no performance penalty, but there would be the advantage of being able to quickly see dataset names and available assets.
  • A Jupyter Notebook with an example is much desired.

running metrics is slow

Ideas for improving this:

  1. parallelize, e.g. across cross-validation folds (see the sketch after this list)
  2. provide a "quick-and-dirty" way of evaluating, e.g. run the fit without cross-validation
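A sketch of idea (1) using joblib (not currently a dependency; the data and regression below are placeholders):

import numpy as np
from joblib import Parallel, delayed
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

source = np.random.rand(200, 50)   # placeholder model activations (presentation x unit)
target = np.random.rand(200, 20)   # placeholder neural recordings (presentation x neuroid)

def score_fold(train, test):
    # fit the mapping on the training split, evaluate on the held-out split
    regression = LinearRegression().fit(source[train], target[train])
    return regression.score(source[test], target[test])

# folds are independent, so they can safely run in separate processes
fold_scores = Parallel(n_jobs=-1)(
    delayed(score_fold)(train, test) for train, test in KFold(n_splits=10).split(source))
print(np.mean(fold_scores))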

test data alignment

The RDMs seem to differ after the recent assembly re-formatting, suggesting an error in the data alignment.
Make sure that the data is aligned properly.

Add StimulusSets

  • With lookup
  • With fetching
  • Add reference(s) to StimulusSets in DataAssembly class and in lookup
  • Do some verification that relevant coordinates in a DataAssembly are valid in the associated StimulusSet (see the sketch after this list)
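A sketch of the verification step (it assumes the assembly carries an image_id coordinate and the StimulusSet exposes an image_id column):

def validate_assembly(assembly, stimulus_set):
    # every image referenced by the assembly must exist in the attached StimulusSet
    assembly_ids = set(assembly['image_id'].values)
    stimulus_ids = set(stimulus_set['image_id'])
    missing = assembly_ids - stimulus_ids
    assert not missing, f"assembly references {len(missing)} image_ids missing from the StimulusSet"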

new version of xarray gives an error

(Submitting as requested.)
With the most recent xarray version, an error is raised that 'Score' does not have an attribute 'indexes'. Older xarray versions do not produce the error.

conda env create errors

Getting the following errors when creating the conda environment with conda env create -f environment.yml:

ResolvePackageNotFound:
  - netcdf4==1.2.4=np113py36_1

After removing that package:

UnsatisfiableError: The following specifications were found to be in conflict:
  - libnetcdf==4.4.1=1 -> jpeg=9
  - qt==5.6.2=2

Naming and formatting coordinates

Proposed coordinate names (a renaming sketch follows below):

  • presentation: presentation_id, image_id (currently it's called id; double-check it's not _id)
  • neuroid: neuroid_id

Further cleanup:

  • rename vars from 'V0' -> 0
  • filenames -> strip the full path
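A sketch of the renaming with xarray (the coordinate names follow the proposal above; the function is illustrative, not existing code):

import os

def normalize_coordinates(assembly):
    # presentation-level identifier: 'id' -> 'image_id'
    assembly = assembly.rename({'id': 'image_id'})
    # strip directories so filenames are comparable across machines
    filenames = [os.path.basename(f) for f in assembly['filename'].values]
    return assembly.assign_coords(filename=('presentation', filenames))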

necessity to configure AWS credentials

It seems that users still have to configure AWS credentials even when they only access public resources.
Can we get rid of that requirement, @jjpr-mit? If yes, how long would it take?
Alternatively, it would also be fine to put a note in the README detailing how to configure AWS.
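For public S3 resources, unsigned (anonymous) requests would avoid the need for credentials altogether; a sketch of what the fetching code could do (bucket and key names are illustrative):

import boto3
from botocore import UNSIGNED
from botocore.config import Config

# anonymous client: no AWS credentials required for publicly readable buckets
s3 = boto3.client('s3', config=Config(signature_version=UNSIGNED))
s3.download_file('some-public-bucket', 'some/object/key.nc', 'local_file.nc')  # illustrative names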

Website not mobile-friendly

Table doesn't fit on screen yet there is no horizontal scrollbar.
Tested on Firefox.

Possible solution (wild guess):

body {
  overflow:auto;
}
