openneuroorg / openneuro

A free and open platform for analyzing and sharing neuroimaging data

Home Page: https://openneuro.org/

License: MIT License

Languages: JavaScript 71.88%, TypeScript 19.19%, Python 4.22%, SCSS 3.67%, HTML 0.55%, Shell 0.41%, Dockerfile 0.08%
Topics: neuroimaging, neuroscience, react, graphql, datasets, bids

openneuro's Introduction


About

OpenNeuro is a free and open platform for analyzing and sharing neuroimaging data. It is built around the Brain Imaging Data Structure (BIDS) specification.

Development setup

This project is managed with Lerna and Yarn. To get started, install Yarn and bootstrap the repo.

yarn install

You can run tests with yarn test at the top level of the project. For each package, yarn test --watch will interactively run the tests for changes since the last commit.

Before starting up the services, you will need to copy .env.example to .env and config.env.example to config.env. Most of the values are optional, and most of those that aren't ship with defaults in their .example file. The values you must set yourself are:

  • JWT_SECRET in config.env must be set to a large random string (one way to generate one is shown below this list).
  • PERSISTENT_DIR in .env is an absolute path to a directory that will be used to store datasets. This should be a git-annex-compatible filesystem with enough space for some test datasets.
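For JWT_SECRET, any method that produces a long random string works. For example, a quick Node one-liner (just one option, not a project requirement):

// Prints a 64-character hex string suitable for JWT_SECRET
const crypto = require('crypto')
console.log(crypto.randomBytes(32).toString('hex'))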

To set up Google as an authentication provider, register a new client app and set the following variables. For development use, create a new Google project with OAuth credentials for a client-side JavaScript app, with "Authorized JavaScript Origins" set to http://localhost:9876 and "Authorized Redirect URIs" set to http://localhost:9876/crn/auth/google/callback for a site accessible at http://localhost:9876.

# Ending in .apps.googleusercontent.com
GOOGLE_CLIENT_ID=
# 24 character secret string
GOOGLE_CLIENT_SECRET=

podman-compose is used to run a local copy of all required services together.

# This will run podman-compose in the background (-d flag is --detach)
podman-compose up -d

For example, you can restart the server container with podman-compose restart server or view logs with podman-compose logs -f --tail=10 server.

Major Components

JavaScript packages are published in the @openneuro npm namespace.

OpenNeuro Command-line utility tool

OpenNeuro provides a Node.js-based CLI tool for uploading and downloading OpenNeuro datasets.

openneuro's People

Contributors

adswa, anibalsolon, bendhouseart, chrisgorgo, ckrountree, constellates, da5nsy, david-nishi, dependabot[bot], effigies, ehavener, exbotanical, franklin-feingold, jstiehl, mgxd, mih, nellh, olgn, robertoostenveld, rwblair, sappelhoff, surchs, thinknoack, tommydino93, tranttommy, xesme, yarikoptic


openneuro's Issues

Support server side pagination in the client

Most endpoints return all data and are only paginated client side. This is part of the issue with #6, but it is a more general problem that causes slow performance (especially if you are at all bandwidth constrained).
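A rough sketch of what a cursor-paginated resolver could look like, assuming a Mongoose-style Datasets model (every name here is hypothetical, not the current API):

// Return one page of datasets plus a cursor for fetching the next page.
// `Datasets` is an assumed Mongoose model; `first`/`after` follow the
// common GraphQL connection convention.
const datasets = async (obj, { first = 25, after }) => {
  const query = after ? { _id: { $gt: after } } : {}
  const page = await Datasets.find(query).sort({ _id: 1 }).limit(first + 1)
  const hasNextPage = page.length > first
  const edges = page.slice(0, first).map((node) => ({ cursor: node._id, node }))
  return {
    edges,
    pageInfo: {
      hasNextPage,
      endCursor: edges.length ? edges[edges.length - 1].cursor : null,
    },
  }
}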

Analysis Status Issue for long running analysis with multiple jobs

The application currently updates the overall status of an analysis only once all jobs in a multi-job analysis have completed. This can cause problems when finished jobs sit in the SUCCEEDED or FAILED state on Batch for longer than 24 hours while waiting for other jobs to finish, because Batch only persists job records for 24 hours. Expired jobs are no longer returned during status polling, causing a potentially inaccurate status to be reported back to the client.
For example, the job below displays a status of SUCCEEDED even though the job status in Mongo is RUNNING.
(screenshot)

We should track the completion of individual jobs in real time so that their statuses are not lost to this expiry.
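A minimal sketch of recording each job's terminal state as soon as it is observed, using the aws-sdk Batch client (the collection and document field paths are assumptions about the codebase):

const AWS = require('aws-sdk')
const batch = new AWS.Batch()

// Persist a job's terminal state immediately so it survives Batch's
// 24-hour job record expiry. `jobsCollection` is an assumed MongoDB
// collection holding one document per analysis.
const saveJobStatus = async (jobsCollection, analysisId, jobId) => {
  const { jobs } = await batch.describeJobs({ jobs: [jobId] }).promise()
  const status = jobs.length ? jobs[0].status : null
  if (status === 'SUCCEEDED' || status === 'FAILED') {
    await jobsCollection.updateOne(
      { _id: analysisId, 'analysis.jobs.id': jobId },
      { $set: { 'analysis.jobs.$.status': status } }
    )
  }
}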

Large uploads to S3 fail and are not automatically resumed

Sometimes a complete dataset is available in SciTran but is not successfully copied to S3 when a job is started. This is more likely to occur as the dataset size goes up, but it can be worked around by canceling and retrying the job. The retry usually succeeds because the upload resumes where it left off and the underlying S3 or connectivity errors are infrequent.

To improve upload performance and reliability, we can move this SciTran-to-S3 upload onto a queue and process the queue in the worker.
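A sketch of the worker-side upload step with simple retries, using the aws-sdk managed uploader (the bucket name is hypothetical):

const AWS = require('aws-sdk')
const s3 = new AWS.S3()

// `getStream` must return a fresh readable stream on each call, because a
// stream partially consumed by a failed attempt cannot be reused.
const uploadWithRetry = async (getStream, key, attempts = 3) => {
  for (let i = 0; i < attempts; i += 1) {
    try {
      // s3.upload transparently uses multipart uploads for large bodies
      return await s3
        .upload({ Bucket: 'openneuro-jobs', Key: key, Body: getStream() })
        .promise()
    } catch (err) {
      if (i === attempts - 1) throw err
    }
  }
}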

Analyses Sections Redesign

Currently the analyses sections in the dashboards are just lists of analyses that link to the dataset each analysis was run against. The list can be filtered by app and app version and sorted by date. See below.
(screenshot)

There is probably a better way to display this data. Perhaps grouping the analyses by dataset? Filtering by the user who ran the analyses (for the Public Dashboard)? Other ideas?

The expected outcome of this issue is to provide preliminary recommendations for UI updates that might improve the user experience with the Analyses sections of the application.

Job ran out of space

The MRIQC execution of this dataset (https://openneuro.org/datasets/ds001060/versions/00001) ran out of space for a few subjects:

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/nipype/pipeline/plugins/base.py", line 255, in run
  File "/usr/local/miniconda/lib/python3.6/site-packages/mriqc/bin/mriqc_run.py", line 325, in main
  File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/nipype/pipeline/plugins/base.py", line 308, in _clean_queue
    result=result)
  File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/nipype/pipeline/plugins/multiproc.py", line 190, in _report_crash
    traceback=result['traceback'])
  File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/nipype/pipeline/plugins/base.py", line 78, in report_crash
    crash2txt(crashfile, dict(node=node, traceback=traceback))
  File "/usr/local/miniconda/lib/python3.6/site-packages/niworkflows/nipype/utils/filemanip.py", line 596, in crash2txt
    fp.write(''.join(record['traceback']))
OSError: [Errno 28] No space left on device

Writing to path: /usr/local/src/mriqc/work

Event Logs Updates

  • Add a dataset upload event to the event logs.
  • Display more useful information in the event logs UI.

Store job logs outside of CloudWatch

CloudWatch has been very convenient, but to support multiple backends we'll need a way to collect those logs and display them from our own datastore. For Batch, the datastore should be populated from CloudWatch, but we should no longer depend on CloudWatch beyond the lifetime of a running job.
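One possible shape for the collection step, paging a finished job's log stream out of CloudWatch Logs into our own datastore ('/aws/batch/job' is the standard Batch log group; the logs collection is an assumption):

const AWS = require('aws-sdk')
const cloudwatchlogs = new AWS.CloudWatchLogs()

// Copy every event from a job's log stream into MongoDB so the logs
// outlive CloudWatch retention. `logs` is an assumed MongoDB collection.
const archiveJobLogs = async (logs, jobId, logStreamName) => {
  const events = []
  let token
  for (;;) {
    const page = await cloudwatchlogs
      .getLogEvents({
        logGroupName: '/aws/batch/job',
        logStreamName,
        nextToken: token,
        startFromHead: true,
      })
      .promise()
    events.push(...page.events)
    // getLogEvents returns the same forward token once the stream is exhausted
    if (page.nextForwardToken === token) break
    token = page.nextForwardToken
  }
  await logs.updateOne({ jobId }, { $set: { events } }, { upsert: true })
}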

PermissionError: [Errno 13] Permission denied: '/output/data/derivatives'

https://openneuro.dev.sqm.io/datasets/ds000105/versions/00001?app=mindboggle&version=1&job=418a5959-43ec-459f-a614-2c41ad701f4e

Namespace(analysis_level='participant', bids_dir='/snapshot/data', output_dir='/output/data', participant_label=['13'])
['13']
running
subject_label is 13
/snapshot/data/13/anat/sub-13_T1w.nii.gz
/snapshot/data/13/anat/sub-13_ses-*_T1w.nii.gz
images are ['/snapshot/data/sub-13/anat/sub-13_T1w.nii.gz']
Create missing output directory /output/data/derivatives/mindboggle
Traceback (most recent call last):
  File "/opt/conda/bin/mindboggle123", line 122, in <module>
    os.makedirs(OUT)
  File "/opt/conda/lib/python3.5/os.py", line 231, in makedirs
    makedirs(head, mode, exist_ok)
  File "/opt/conda/lib/python3.5/os.py", line 241, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/output/data/derivatives'
Traceback (most recent call last):
  File "~/code/run.py", line 76, in <module>
    run_mindboggle(t1, label, args.output_dir)
  File "~/code/run.py", line 25, in run_mindboggle
    check_call(cmd)
  File "/opt/conda/lib/python3.5/subprocess.py", line 581, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['mindboggle123', '/snapshot/data/sub-13/anat/sub-13_T1w.nii.gz', '--id', 'sub-13', '--out', '/output/data/derivatives/mindboggle', '--working', '/output/data/scratch']' returned non-zero exit status 1

This could be a problem with the app itself, but I don't see how... Creating this directory should be possible.

Upload Failure - Recurring Issue

I am currently trying to upload this dataset: https://openneuro.org/datasets/ds001067

It has failed three times thus far, but continues to make incremental progress each time I resume the upload. This is a recurring issue with each dataset I have previously uploaded.

My laptop is connected to a power source, has no disruption in connection, and has not gone to sleep.

Please advise. Thanks.

Google indexing

Currently, content on OpenNeuro is not properly indexed by Google.

(screenshot)

Enable Firefox support

Firefox 49+ supports the needed multi-file upload APIs. To enable it, we need to do a once-over to fix some minor CSS issues and update the browser check (a sketch of the check follows the list).

  • Firefox CSS fixes.
  • Remove the check preventing uploads from Firefox.
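For the browser check, one option is to feature-detect the directory upload attribute the uploader relies on instead of matching browser names (a sketch, not the current implementation):

// True when the browser supports directory selection on file inputs,
// which Firefox 50+ and Chrome both provide.
const supportsDirectoryUpload = () =>
  'webkitdirectory' in document.createElement('input')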

Provide a better way to get large results

Providing the S3 URL for the results, along with instructions for using that URL, will make getting results out of the system easier.

A prerequisite for this is using signed URLs, both to prevent excessive reuse of the S3 URLs (re-downloading the same dataset many times) and to protect private datasets.
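Generating such a link is straightforward with the aws-sdk (the bucket name here is hypothetical):

const AWS = require('aws-sdk')
const s3 = new AWS.S3()

// Returns a time-limited signed URL for one result object instead of a
// permanently reusable raw S3 URL.
const signedResultUrl = (key) =>
  s3.getSignedUrl('getObject', {
    Bucket: 'openneuro-results',
    Key: key,
    Expires: 60 * 60, // the link is valid for one hour
  })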

Develop HTCondor submit service

To interact with the queue from #50 on HTCondor systems, we need to run the consumer on a submit server local to the cluster. The scope of this should be limited to obtaining a job from the queue, translating it to DAGMan/submit data, and submitting that.

I think this should use the Python bindings, but one of the other interfaces may prove better if we can share some code with the existing Node implementation.
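If we do go the Node route, the consumer could be as simple as writing a submit description and shelling out to condor_submit (a sketch; the job shape is hypothetical):

const { execFile } = require('child_process')
const { writeFile } = require('fs').promises
const os = require('os')
const path = require('path')

// Translate a queued job into a minimal HTCondor submit description and
// hand it to condor_submit on the local submit server.
const submitToCondor = async (job) => {
  const description = [
    'universe = vanilla',
    `executable = ${job.executable}`,
    `arguments = ${job.arguments.join(' ')}`,
    'queue 1',
  ].join('\n')
  const submitFile = path.join(os.tmpdir(), `${job.id}.sub`)
  await writeFile(submitFile, description)
  return new Promise((resolve, reject) => {
    execFile('condor_submit', [submitFile], (err, stdout) =>
      err ? reject(err) : resolve(stdout)
    )
  })
}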

Uploading hangs on the last file during upload

From the user:

The upload process is stuck, so the dataset is not complete.

I deleted my draft and tried again, and sadly the same problem occurred. It seems to be stuck on the last file, although it has been uploading for over a day.
Has anyone else encountered this problem before?

(screenshot)

I've tried, but was not able to replicate the issue on my machine with the same dataset (which I will share with @nellh and @JohnKael).

App Definition Parameters UI Improvements

The interface for adding parameters to an app definition is currently a little confusing and needs improvement. The current UI is shown below.
(screenshot)
A common issue with this UI is adding a parameter but forgetting to press the ADD button before submitting the app definition. The interface should be redesigned to make the user experience clearer and less error-prone.

As part of this work, we should also address hidden options, as this will add another checkbox to the interface (a possible data shape is sketched after the quote). See below for details from Chris.

"Some BIDS App command line arguments (for example those regarding cpu and memory restrictions) should be always the same for all jobs and users submitting jobs should not be able to change them. The easiest way to implement this is to add a checkbox next to each parameter to determine if this parameter will be visible to users submitting jobs."

Reduce /api/projects bandwidth requirements and latency

When loading https://openneuro.org/dashboard/datasets while logged in as [email protected]:

app.min.1b7726f0.js:sourcemap:54 Uncaught TypeError: Cannot read property 'body' of undefined
    at app.min.1b7726f0.js:sourcemap:54
    at a (app.min.1b7726f0.js:sourcemap:54)
    at app.min.1b7726f0.js:sourcemap:54
    at d.callback (app.min.1b7726f0.js:sourcemap:45)
    at d.crossDomainError (app.min.1b7726f0.js:sourcemap:45)
    at XMLHttpRequest.n.onreadystatechange (app.min.1b7726f0.js:sourcemap:45)

This seems like a critical bug.

No UI feedback for missing required fields

When trying to submit a job without providing a required field (BARACUS, license key), clicking the "Start" button leads to no action, and no warning or other information is presented to the user (such as highlighting the missing required inputs).
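A sketch of the client-side handling we'd want instead (requiredFields, values, and showErrors are hypothetical names, not existing code):

// Collect the missing required fields and surface them to the user
// instead of silently ignoring the click.
const onStartClick = (requiredFields, values, showErrors, submitJob) => {
  const missing = requiredFields.filter((field) => !values[field])
  if (missing.length > 0) {
    showErrors(missing) // e.g. highlight the inputs and show a warning
    return
  }
  submitJob(values)
}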

Upgrade to React 15

Some front-end bugs (especially some UI performance issues) depend on upstream fixes in React or in other libraries that require a newer React to upgrade. The major things that need to be done here are:

  • Resolve dependency issues (switch to yarn and fix the dependency list to accurately reflect installed deps).
  • Write some tests for the UI components to help validate changes for the update.
  • Refactor routing to upgrade to react-router v4.
  • Handle all the now invalid uses of the React API and missing packages that were refactored out of React.
  • Extra padding below navbar in some views.
  • Switching snapshots does not correctly trigger the loading state and animations.
  • Active states for many tabs are not working.

Setup better performance tests

I'd like to set up a second dev server to do performance regression testing, which will require some automatic deployment work. Lighthouse CI looks excellent for this: Lighthouse is already one of the tools being used to optimize performance, and headless Chrome provides fairly realistic performance values for actual usage of the OpenNeuro site.

  • Deploy to dev and perf servers on every commit. Dev should track master, perf should track whatever branch is being tested.
  • Set up Lighthouse with CircleCI.
  • Graph that data somewhere so we can see how it's changing over time (comparing releases for example).

Collect execution metrics for each analysis

From @poldrack:

it would be useful to know what kind of instances are used for freesurfer, MRIQC, and fMRIPREP

That's something that can be saved but isn't at the moment. The metrics will need to be per-task, since a given analysis can be scheduled across several host systems. Some ideas for what can be collected:

  • Instance used (type, EC2 ARN)
  • Real run time (CPU time, not wall clock)
  • Percent utilization (CPU/memory allocated vs used)

@poldrack Anything else you'd like to see for this?
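For the first item, the instance type can be read from the EC2 instance metadata endpoint inside the job container; a minimal Node sketch:

const http = require('http')

// Query the EC2 instance metadata service, which is reachable from any
// process (including Batch job containers) running on the instance.
const instanceType = () =>
  new Promise((resolve, reject) => {
    http
      .get('http://169.254.169.254/latest/meta-data/instance-type', (res) => {
        let body = ''
        res.on('data', (chunk) => (body += chunk))
        res.on('end', () => resolve(body))
      })
      .on('error', reject)
  })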

intermittent button failure on analysis dropdown

I have had an intermittent problem where the Start button on an analysis fails to do anything when clicked. It seems to occur only when I run an analysis right after having run another within the same session in that window. Refreshing the window seems to resolve the problem. Low priority, but I can imagine it would flummox some people.

Fix inability to delete directories

I want to remove one entire subject from a dataset. I am able to delete the files within the subject folder but unable to remove the subject directory itself, which causes a validation failure. The UI should include the ability to delete folders.

S3 configuration is only updated by new app versions

This forces a new app version whenever any change is made to the S3 buckets, rather than applying the current configuration to each submitted job. The configuration needs to be added to job submission and removed from app definition registration.
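In other words, the bucket settings would be attached at submission time, along the lines of this sketch (startBatchJob and the field names are hypothetical):

// Merge the live S3 configuration into the job at submission, so bucket
// changes take effect without registering a new app version.
const submitJob = (job, config) =>
  startBatchJob({
    ...job,
    s3: {
      inputBucket: config.inputBucket,
      outputBucket: config.outputBucket,
    },
  })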

Uncaught TypeError: Cannot read property 'length' of undefined

When trying to load https://openneuro.org/datasets/ds001049

Uncaught TypeError: Cannot read property 'length' of undefined
    at t.value (/app.min.4f999727.js:50)
    at t.value (/app.min.4f999727.js:50)
    at t.value (/app.min.4f999727.js:50)
    at c._renderValidatedComponentWithoutOwnerOrContext (/app.min.4f999727.js:38)
    at c._renderValidatedComponent (/app.min.4f999727.js:38)
    at c.mountComponent (/app.min.4f999727.js:37)
    at Object.mountComponent (/app.min.4f999727.js:40)
    at v.mountChildren (/app.min.4f999727.js:39)
    at v._createContentMarkup (/app.min.4f999727.js:38)
    at v.mountComponent (/app.min.4f999727.js:38)

We have a user stuck on this - it would be great to get it fixed in a timely manner.

Create an internal jobs queue

Jobs are started out of process, but the queue mechanism is very specific to Batch endpoints. It should be replaced with a generic queue describing pending jobs that can be translated into requests to Batch or HTCondor.
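The consumer side could then dispatch on a backend name, along these lines (queue.pop, submitToBatch, and submitToCondor are hypothetical):

// Map each pending job to its backend-specific submission function.
const backends = {
  batch: (job) => submitToBatch(job),
  htcondor: (job) => submitToCondor(job),
}

const processQueue = async (queue) => {
  for (;;) {
    const job = await queue.pop() // resolves when a pending job is available
    await backends[job.backend](job)
  }
}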
