pennlinc / cubids

Curation of BIDS (CuBIDS): A sanity-preserving software package for processing BIDS datasets.

Home Page: https://cubids.readthedocs.io/

License: MIT License

Makefile 1.16% Python 97.91% Shell 0.93%
neuroimaging neuroimaging-data-science python-package data-curation data-organization neuroscience neuroscience-methods neuroinformatics

cubids's Introduction

CuBIDS: Curation of BIDS

About

CuBIDS (Curation of BIDS) is a workflow and software package designed to facilitate reproducible curation of neuroimaging BIDS datasets. CuBIDS breaks down BIDS dataset curation into five main components and addresses each one with command line programs that include version control capabilities. These components are not necessarily linear, but all are critical when preparing BIDS data for successful preprocessing and analysis pipeline runs.

  1. CuBIDS facilitates the validation of BIDS data.
  2. CuBIDS visualizes and summarizes the heterogeneity in a BIDS dataset.
  3. CuBIDS helps users test pipelines on the entire parameter space of a BIDS dataset.
  4. CuBIDS allows users to perform metadata-based quality control on their BIDS data.
  5. CuBIDS helps users clean protected information in BIDS datasets, in order to prepare them for public sharing.

https://github.com/PennLINC/CuBIDS/raw/main/docs/_static/cubids_workflow.png

For full documentation, please visit our ReadTheDocs: https://cubids.readthedocs.io/

Citing CuBIDS

If you use CuBIDS in your research, please cite the following paper:

Covitz, S., Tapera, T. M., Adebimpe, A., Alexander-Bloch, A. F., Bertolero, M. A., Feczko, E., ... & Satterthwaite, T. D. (2022). Curation of BIDS (CuBIDS): A workflow and software package for streamlining reproducible curation of large BIDS datasets. NeuroImage, 263, 119609. doi:10.1016/j.neuroimage.2022.119609.

Please also cite the Zenodo DOI for the version you used.

cubids's People

Contributors

cookpa, dependabot[bot], jaberbasma, krmurtha, mattcieslak, megardn, scovitz, scovitz1, tinashemtapera, tsalo, yarikoptic

cubids's Issues

add back precision

We want to round AFTER clustering so that params that don't belong in the acq-VARIANT string don't end up there.

[ENH] Make BIDS validator easy to use

Problem: NodeJS is hard to install and maintain and it is not secure

To Do:

  • Make a Docker wrapper that pulls and runs BIDS-Validator on a BIDS tree
  • Make the same wrapper, but with Singularity
  • Copy Tinashe's parsing code from RBC

Relational Params need to be boolean

IntendedForKeyXX and FieldmapKeyXX values are now True/False instead of filepaths. This should decrease the number of fmap param groups significantly.

How to read and write JSON files

Proof of concept for reading and writing JSON files in a Jupyter notebook:

  • Displaying the data in the sidecar
  • Editing this data
  • Checking that the sidecar is written as valid JSON
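
The three steps above can be sketched with the standard library alone. This is a minimal illustration, not the project's notebook: the helper names `load_sidecar` and `save_sidecar`, the file name, and the metadata values are all assumptions made up for the example.

```python
import json
import tempfile
from pathlib import Path


def load_sidecar(path):
    """Read a JSON sidecar into a dictionary."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def save_sidecar(path, metadata):
    """Write a sidecar, serializing first so a bad value never truncates the file."""
    # json.dumps raises TypeError if a value cannot be represented as JSON.
    text = json.dumps(metadata, indent=4, sort_keys=True)
    # Round-trip check: the text we are about to write parses back as JSON.
    json.loads(text)
    Path(path).write_text(text + "\n", encoding="utf-8")


# Demo on a throwaway sidecar (hypothetical file name and values).
workdir = Path(tempfile.mkdtemp())
sidecar = workdir / "sub-01_task-rest_bold.json"
sidecar.write_text('{"RepetitionTime": 2.0, "EchoTime": 0.03}', encoding="utf-8")

metadata = load_sidecar(sidecar)
print(json.dumps(metadata, indent=4))  # display the data in the sidecar
metadata["EchoTime"] = 0.031           # edit the data
save_sidecar(sidecar, metadata)        # write it back as valid JSON
reloaded = load_sidecar(sidecar)
```

Serializing to a string before opening the file for writing means a failed conversion leaves the existing sidecar untouched.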

Add cubids-make-exemplars CLI program

A CLI function that takes one subject per Acquisition Group and copies it into a new BIDS directory

  • Copy over everything else necessary to make a complete BIDS dataset (dataset_description.json, etc.)
  • Create other directories, YODA-style (e.g., code/, bidsdatasets/)

apply doesn't work if group uses a relative path

Paths in the files csv can't include '/gpfs/..', but they do if the path to the dataset was relative when group was run.

This would be an issue when running CuBIDS on a dataset stored on a different machine, because then you have to use relative paths.

IDEAS FOR FIXING THIS:

  • Change the way we build the new path so that it does not add "self.path" to the old path: take the front of the old path and add the new stem to it, instead of using self.path + new stem
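
One way to read the idea above is sketched below. It is only an illustration under assumptions, not the CuBIDS implementation: the function name `swap_stem` and the example paths are hypothetical. The front of the old path is recovered by stripping the old stem, so the result stays relative or absolute exactly as the original was.

```python
def swap_stem(old_path, old_stem, new_stem):
    """Replace the trailing old_stem of old_path with new_stem.

    The "front" (everything before the stem) is taken from the old path
    itself rather than from a separately stored dataset root.
    """
    old_path = str(old_path)
    if not old_path.endswith(old_stem):
        raise ValueError(f"{old_path!r} does not end with {old_stem!r}")
    front = old_path[: len(old_path) - len(old_stem)]
    return front + new_stem


# Hypothetical example with a relative dataset path.
old = "../data/bids/sub-01/func/sub-01_task-rest_bold.nii.gz"
new = swap_stem(
    old,
    "sub-01/func/sub-01_task-rest_bold.nii.gz",
    "sub-01/func/sub-01_task-rest_acq-VARIANT_bold.nii.gz",
)
print(new)
```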

[ENH] Add subject/session Acquisition Groups

A set of scans belonging to either a subject or a session will also be a set of Key/Param groups. The combination of Key/Param groups determines how a pipeline will run on that data.

TODO

  • A function that groups subjects or sessions based on the Key/Param groups contained
  • A CLI entrypoint for that function
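
A minimal sketch of the grouping function described above, assuming each scan has already been labeled with a Key group and a Param group. The function name `group_by_acquisition`, the tuple layout, and the example labels are assumptions for illustration only.

```python
from collections import defaultdict


def group_by_acquisition(scans):
    """Group subjects (or sessions) by the set of Key/Param groups they contain.

    scans: iterable of (unit, key_group, param_group) tuples, where unit is
    a subject or session label.
    Returns a dict mapping each distinct frozenset of (key_group, param_group)
    pairs to the sorted list of units that share it.
    """
    contents = defaultdict(set)
    for unit, key_group, param_group in scans:
        contents[unit].add((key_group, param_group))

    groups = defaultdict(list)
    for unit, pairs in contents.items():
        groups[frozenset(pairs)].append(unit)
    return {signature: sorted(units) for signature, units in groups.items()}


# Hypothetical labels: sub-03 has a different Param group for its BOLD scan.
scans = [
    ("sub-01", "task-rest_bold", 1), ("sub-01", "T1w", 1),
    ("sub-02", "task-rest_bold", 1), ("sub-02", "T1w", 1),
    ("sub-03", "task-rest_bold", 2), ("sub-03", "T1w", 1),
]
acquisition_groups = group_by_acquisition(scans)
for signature, units in acquisition_groups.items():
    print(sorted(signature), units)
```

Using a frozenset as the signature makes the grouping independent of the order in which scans are listed.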

make datalad optional for apply and purge

The only occurrence of datalad in the apply and purge functions is where we datalad run the merge commands for apply and the rm commands for purge.

Change those two lines to use subprocess.run instead of datalad_handle.run if the use_datalad flag is unset
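
The switch could look roughly like the sketch below. The wrapper name `run_command` and its parameters are hypothetical; `datalad_handle` is assumed to expose a `run(cmd, message=...)` method in the style of a DataLad dataset object, and is only touched when `use_datalad` is set.

```python
import subprocess
import sys


def run_command(cmd, use_datalad=False, datalad_handle=None, message=None):
    """Run cmd through DataLad when requested, otherwise as a plain subprocess."""
    if use_datalad:
        if datalad_handle is None:
            raise ValueError("use_datalad=True requires a datalad_handle")
        # Assumption: the handle records the command in the dataset history.
        return datalad_handle.run(cmd, message=message)
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(cmd, check=True)


# Without DataLad: a harmless stand-in for the merge or rm command.
result = run_command([sys.executable, "-c", "print('done')"])
```

Keeping the branch in one helper means apply and purge can share it instead of each testing the flag separately.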

[ENH] Infrastructure: make a class

Create an object that encapsulates the BIDS directory and operations on it.

It should

  • Find and validate a BIDS tree
  • List unique key-value pair sequences
  • Provide methods to alter json sidecars based on key-value pair sequences

Ideas:

  • Use AnyTree
  • PyBIDS - look into their caching feature

[ENH] Detect key groups and param groups

We need to be able to find all param groups associated with a key group

To Do:

  • Find the key groups in the testdata/inconsistent dataset
  • Find the param groups under each of these
    • Determine which metadata values are part of the param groups for each datatype

[ENH] Set up testing

To Do:

  • Create list of use cases
  • Write documentation for use cases
  • Find data for test cases
  • Write tests for each use case
  • Configure CI service to run

Key-Value Pairs Proof of Concept

  • Changing or rearranging the key-value pairs in BIDS filenames

Recommended steps:

  1. Read the name of a BIDS file
  2. Create a dictionary of each of the key value pairs
  3. Read multiple files
  4. Create 1 dictionary where the keys come from the files' keys, and the values are a list of the possible values

[ENH] Add image info to sidecars

The dimensions and voxel size are stored in nifti headers, which we don't want to have to read in order to group scans by this information.

  • Check BIDS spec to see if they have tags for image size/dimensions
  • Add a CLI bond-nifti-info command that adds image information from the NIfTI headers to sidecars
  • Add "Dimension1", etc to IMAGING_PARAMETERS in bond.constants
  • Add pytests
