miti's Introduction

MITI: Minimum Information about Tissue Imaging

Minimum Information about Tissue Images (MITI) reporting guidelines comprise minimal metadata for highly multiplexed tissue images and were developed in consultation with methods developers, experts in imaging metadata (e.g., DICOM and OME) and multiple large-scale atlasing projects; they are guided by existing standards and accommodate most multiplexed imaging technologies and both centralized and distributed data storage.

The YAML files in this repository provide a detailed specification of the standard, as outlined in the corresponding manuscript. Individual files capture attributes, their description and significance, and also additional information that is essential for validating specific files; this includes ensuring that data in each field has the correct data type and meets constraints on valid values. Each attribute is associated with one of the following data types: boolean, integer, float, string, filename, date, doi, rrid.

Valid values are specified as sets of predefined keywords for string and as [min, max] intervals for integer and float variables, where both min and max can be optionally omitted to define one-sided intervals.

Community Participation

This is a consensus-based community standard. Anyone with an interest can join the community, contribute to the schema design and implementation, and participate in the decision-making process. Community discussions should take place on GitHub and on the image.sc forum under #mcmicro.

Changes to MITI require a submission via GitHub Issues with the following information:

  • Scope and Field
  • Summary of changes
  • Example
  • Implementation as a pull request

Please reach out to the governance board with questions about how to engage in community discussions or submit revisions. If the implementation needs a revision, the community can discuss and vote on the submission via GitHub for at least 30 days.

Governance

The MITI governance board members are expected to participate in strategic planning, approve changes to the governance model, and respond to community feedback. The board will resolve revisions and issues on which the community cannot reach consensus in a reasonable timeframe. The board will serve for 18 months, followed by a community vote to elect a new governance board (3-5 board members). New candidates can be proposed by board members or community members, or can apply directly. We welcome and encourage participation by everyone!

As of 2023-10-18, the MITI governance board comprises (i) Denis Schapiro, PhD, Research Group Leader, Heidelberg University (Chair); (ii) Adam Taylor, PhD, Senior Research Scientist, Sage Bionetworks; and (iii) Sarah Arena, MS, Data Scientist, Harvard Medical School.

Diversity Statement

The MITI consortium welcomes and encourages participation by everyone. We are committed to being a community that everyone enjoys being part of. Although we may not always be able to accommodate each individual’s preferences, we try our best to treat everyone kindly.

No matter how you identify yourself or how others perceive you: we welcome you. Though no list can hope to be comprehensive, we explicitly honour diversity in: age, culture, ethnicity, genotype, gender identity or expression, language, national origin, neurotype, phenotype, political beliefs, profession, race, religion, sexual orientation, socioeconomic status, subculture and technical ability, to the extent that these do not conflict with this code of conduct.

Example 1

The following YAML block defines a required attribute Tumor tissue type that relates to collection and processing of biospecimens. A valid field must be specified as a character string, using one of the pre-defined keywords: Primary Tumor, Local Tumor Recurrence, etc.

Tumor tissue type:
  description: Text that describes the kind of disease present in the tumor specimen
    as related to a specific time point.
  category: Collection and Processing
  type: string
  valid-values:
  - Primary Tumor
  - Local Tumor Recurrence
  - Distant Tumor Recurrence
  - Metastatic
  - Premalignant
  significance: required

Example 2

The following YAML block defines a recommended attribute Cycle Number, which must be specified as a 1-based index (i.e., an integer belonging to the one-sided interval [1, Inf)).

Cycle Number:
  description: 'the cycle # in which the co-listed reagent(s) was(were) used'
  type: integer
  valid-values:
    min: 1.0
  significance: recommended
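
A minimal sketch of how a tool might check a value against the two attribute definitions above. It assumes the YAML has already been parsed into plain Python dictionaries; the names `SPEC` and `is_valid` are hypothetical and not part of MITI, and intervals are treated as closed on any bound that is given.

```python
import math

# Attribute definitions transcribed from Examples 1 and 2, in the shape a
# YAML parser would return them (descriptions omitted for brevity).
SPEC = {
    "Tumor tissue type": {
        "type": "string",
        "valid-values": [
            "Primary Tumor",
            "Local Tumor Recurrence",
            "Distant Tumor Recurrence",
            "Metastatic",
            "Premalignant",
        ],
        "significance": "required",
    },
    "Cycle Number": {
        "type": "integer",
        "valid-values": {"min": 1.0},
        "significance": "recommended",
    },
}


def is_valid(value, attr_spec):
    """Return True if value matches the attribute's type and valid-values."""
    kind = attr_spec["type"]
    allowed = attr_spec.get("valid-values")

    if kind == "string":
        if not isinstance(value, str):
            return False
        # Strings are constrained by a set of predefined keywords, if any.
        return allowed is None or value in allowed

    if kind in ("integer", "float"):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool):
            return False
        expected = int if kind == "integer" else (int, float)
        if not isinstance(value, expected):
            return False
        # A missing min or max leaves that side of the interval open-ended.
        bounds = allowed or {}
        low = bounds.get("min", -math.inf)
        high = bounds.get("max", math.inf)
        return low <= value <= high

    # Other MITI types (filename, date, doi, ...) are out of scope here.
    raise ValueError(f"type not handled in this sketch: {kind}")
```

For instance, `is_valid("Metastatic", SPEC["Tumor tissue type"])` is accepted, while `is_valid(0, SPEC["Cycle Number"])` is rejected because 0 falls below the minimum of 1.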

Endnotes

We are thankful to the groups behind the following documents, from which we drew content and inspiration:

miti's People

Contributors

adamjtaylor, arenasg, artemsokolov, clarenceyapp, denissch, seanderickson

miti's Issues

Governance

Changes to MITI require a submission (via GitHub Issues, labeled enhancement) with the following information:

  • Scope and Field
  • Summary of changes
  • Example
  • Implementation as a pull request

The community can discuss and vote on the submission via GitHub for at least 30 days.

The governance board - Denis Schapiro (chair, Heidelberg), Sarah Arena (Harvard), and Adam Taylor (Sage Bionetworks) - will decide, based on the community vote and response, whether to accept the submission, ask for a revision, or decline it.

If the implementation needs a revision, the revision must be submitted no later than 30 days after acceptance.

Governance Board Election Announcement!

It’s time for the MITI Governance Board to update its membership! The MITI Governance Board consists of 4-6 board members who will be responsible for holding regular board and community meetings, responding to community feedback on GitHub, and making group decisions about the maintenance of the MITI metadata standard and associated tools. Board membership is an 18-month term (January 2024-July 2025). We encourage all members of the community to apply.

If you are interested in serving on the Board, please submit your name and a brief statement about your interest in serving on the MITI board by 11:59 PM ET on October 15, 2023 in reply to this announcement. Voting will take place on GitHub after the submission period closes.

Add an automated test to validate YAML files

Add a GitHub Actions workflow that, for each file, verifies that:

  1. It can be loaded by a canonical YAML parser.
  2. Each entry contains description, type and significance fields.
  3. The type is specified as one of the pre-defined choices: boolean, integer, float, string, filename, date, doi, rrid.

Such a workflow will help with automatic validation of future contributions.
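
Checks 2 and 3 above could be sketched in Python as follows. This is an illustration only, not the workflow itself: it assumes each file has already been loaded into a dictionary (for example with PyYAML's `yaml.safe_load`, which would cover check 1), and `check_entries` is a hypothetical helper name.

```python
REQUIRED_FIELDS = ("description", "type", "significance")
ALLOWED_TYPES = (
    "boolean", "integer", "float", "string",
    "filename", "date", "doi", "rrid",
)


def check_entries(spec):
    """Return a list of human-readable problems found in one parsed file."""
    problems = []
    for name, entry in spec.items():
        if not isinstance(entry, dict):
            problems.append(f"{name}: entry is not a mapping")
            continue
        # Check 2: every entry carries the three mandatory fields.
        for field in REQUIRED_FIELDS:
            if field not in entry:
                problems.append(f"{name}: missing '{field}'")
        # Check 3: the declared type is one of the pre-defined choices.
        if "type" in entry:
            declared = entry["type"]
            if declared not in ALLOWED_TYPES:
                problems.append(f"{name}: unknown type {declared!r}")
    return problems
```

A workflow step could run this over every file and fail the build whenever the returned list is non-empty.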

valid-values for "Lost to follow up" in 01-clinical.yaml should be quoted

Currently, valid-values for the item Lost to follow up use unquoted values:

Lost to follow up:
  description: Yes/No/Unknown indicator to identify whether a patient was lost to
    follow up.
  category: Follow-Up
  type: string
  valid-values:
  - Yes
  - No
  - Unknown

When the file is loaded with yaml.load (PyYAML==6.0), the unquoted values Yes and No are converted to True and False:

>>> import yaml
>>> 
>>> yaml_str = """\
... valid-values:
...     - Yes
...     - No
...     - Unknown
... """
>>> loaded_data = yaml.load(yaml_str, Loader=yaml.FullLoader)
>>> 
>>> print(loaded_data)
{'valid-values': [True, False, 'Unknown']}
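
One way to catch this class of problem automatically is to scan the loaded data for keywords that are no longer strings. The sketch below is hypothetical (`non_string_keywords` is not part of MITI or PyYAML) and works on the already-loaded dictionary, using the values from the output above.

```python
def non_string_keywords(spec):
    """Map attribute name -> valid-values that did not load as str."""
    flagged = {}
    for name, entry in spec.items():
        values = entry.get("valid-values")
        # Only keyword lists of string-typed attributes are relevant here.
        if entry.get("type") == "string" and isinstance(values, list):
            bad = [v for v in values if not isinstance(v, str)]
            if bad:
                flagged[name] = bad
    return flagged


# The unquoted entry as it looks after loading, per the output above.
loaded = {
    "Lost to follow up": {
        "type": "string",
        "valid-values": [True, False, "Unknown"],
    }
}
print(non_string_keywords(loaded))
# prints {'Lost to follow up': [True, False]}
```

Once the keywords are quoted in the YAML file, they load as the strings 'Yes' and 'No' and the function returns an empty dictionary.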

More structured schema format?

It seems that the YAML format for table data specification is informed by the choice of Cerberus as an underlying validator.

Is the consortium willing to consider Frictionless Data instead, as the underlying data schema specification framework? It has a lot of environment support, with Python, JS, and bash tools to make FD data packages natively available in various settings (local prototyping, headless/remote computation, web applications). As an example of a project that makes use of FD under the hood, see the C2M2 (Cross Cut Metadata Model).

If such a move (albeit quite significant) could be on the table, in a few days I can make a PR to share some scripts that I am currently using to:

  1. Automatically convert the MITI spec into (i) a flat table of fields, (ii) a table of tables, and (iii) a few separate files for the "valid values" currently spread across various YAML files.
  2. Automatically convert (i), (ii), and (iii) into an FD data package ready to be populated with data.

(The above (1) is not a perfect translation yet.)
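
As a rough illustration of step 1(i), and not the scripts referred to above, a flat table of fields could be produced from an already-parsed spec along these lines. The column choice and the names `COLUMNS`, `flatten`, and `to_csv` are assumptions made for this sketch.

```python
import csv
import io

# Assumed column layout: the attribute name plus its scalar properties.
COLUMNS = ["field", "category", "type", "significance", "description"]


def flatten(spec):
    """Turn {attribute name: properties} into one row per attribute."""
    rows = []
    for name, entry in spec.items():
        row = {"field": name}
        for col in COLUMNS[1:]:
            # Properties absent from an entry become empty cells.
            row[col] = entry.get(col, "")
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

The valid-values lists are deliberately left out of the flat table, matching the idea in (iii) of keeping them in separate files.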

Moving to FD offers the advantage that the general purpose validation functionality is high quality and maintained by someone else. A specific advantage is the possibility of foreign-key integrity checks made possible by the schema's awareness of dependencies between tables.

Also, I think maintaining the schema as a flat-table-of-fields behind the scenes would simplify the schema designers' work, as it relieves them of the need to hand-edit schema specification files whose syntax is really designed for the use case of machine reading.

I don't have any kind of association with FD. I'm in the Nadeem lab at MSKCC, and we're making software that would stand to benefit from the MITI standards.

The MITI spec is really comprehensive, and many groups would benefit from standardization in this domain! Thank you for your work in this important effort.

Design a conditional structure for representing significance values

Whether a certain field is required may be conditioned on the value of other fields. Currently, these relationships are captured in plain text format, e.g.:

Object class Description:
  description: Free text description of object class
  type: string
  significance: If Object Class = 'other', required, otherwise recommended

Consider representing these relationships in a structured format instead, which would make them easier to parse for tools using MITI. For example, the above could be represented as

Object class Description:
  description: Free text description of object class
  type: string
  significance:
    required: Object class == 'other'

Another example:

PhysicalSizeZ:
  description: physical size of one pixel in z-dimension
  type: float
  significance:
    required: sizeZ > 1

where Object class and sizeZ refer to other existing fields at the top level.
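
To illustrate how a tool might consume the proposed structure, here is a minimal sketch of an evaluator for conditions of the form `field operator literal`. The condition grammar, the helper names `holds` and `significance`, and the `otherwise` fallback (taken from the "otherwise recommended" wording in the plain-text example) are all assumptions, since the proposal does not define them.

```python
import ast
import operator
import re

OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}
# Two-character operators come first so ">=" is not read as ">".
# The field part may contain spaces, e.g. "Object class".
CONDITION = re.compile(r"^(.+?)\s*(==|!=|>=|<=|>|<)\s*(.+)$")


def holds(condition, record):
    """Evaluate one condition string against a dict of field values."""
    match = CONDITION.match(condition)
    if match is None:
        raise ValueError(f"cannot parse condition: {condition!r}")
    field, op, literal = match.groups()
    if field not in record:
        return False
    # literal_eval safely parses quoted strings and numbers.
    return OPS[op](record[field], ast.literal_eval(literal))


def significance(spec_value, record, otherwise="recommended"):
    """Resolve a plain or conditional significance for one record."""
    if isinstance(spec_value, str):
        return spec_value  # unconditional, e.g. "required"
    if holds(spec_value["required"], record):
        return "required"
    return otherwise
```

With this sketch, `significance({"required": "sizeZ > 1"}, {"sizeZ": 5})` resolves to "required", while a record with `sizeZ` equal to 1 falls back to the `otherwise` value.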
