miti-consortium / miti

Minimum Information about Tissue Imaging

Home Page: https://www.miti-consortium.org/

License: MIT License

Python 100.00%

miti's Issues

Governance Board Election Announcement!

It’s time for the MITI Governance Board to update its membership! The MITI Governance Board consists of 4-6 board members who will be responsible for holding regular board and community meetings, responding to community feedback on GitHub, and making group decisions about the maintenance of the MITI metadata standard and associated tools. Board membership is an 18-month term (January 2024-July 2025). We encourage all members of the community to apply.

If you are interested in serving on the Board, please reply to this announcement with your name and a brief statement about your interest in serving by 11:59 PM ET on October 15, 2023. Voting will take place on GitHub after the submission period closes.

More structured schema format?

It seems that the YAML format for table data specification is informed by the choice of Cerberus as an underlying validator.

Would the consortium consider Frictionless Data (FD) instead as the underlying data schema specification framework? It has broad tooling support, with Python, JS, and bash tools that make FD data packages natively available in various settings (local prototyping, headless/remote computation, web applications). As an example of a project that uses FD under the hood, see the Crosscut Metadata Model (C2M2).

If such a move (albeit quite significant) could be on the table, in a few days I can make a PR to share some scripts that I am currently using to:

1. Automatically convert the MITI spec into (i) a flat table of fields, (ii) a table of tables, and (iii) a few separate files for the "valid values" currently spread across various YAML files.
2. Automatically convert (i), (ii), and (iii) into an FD data package ready to be populated with data.

(The above (1) is not a perfect translation yet.)
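
As a rough illustration of what (i) might look like, and not the scripts mentioned above, a spec that has already been loaded into nested dicts could be flattened into one row per field. The function names, the column list, the table name `imaging`, and the sample significance value are all assumptions made for this sketch.

```python
import csv
import io

COLUMNS = ["table", "field", "description", "type", "significance"]


def flatten_spec(tables):
    """Turn {table: {field: {attribute: value}}} into a list of flat rows."""
    rows = []
    for table_name, fields in tables.items():
        for field_name, attrs in fields.items():
            row = {"table": table_name, "field": field_name}
            for column in COLUMNS[2:]:
                # Missing attributes become empty cells instead of errors.
                row[column] = attrs.get(column, "")
            rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialize the flat rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical excerpt of a loaded spec; the table name and the
# significance value are invented for the example.
example = {
    "imaging": {
        "PhysicalSizeZ": {
            "description": "physical size of one pixel in z-dimension",
            "type": "float",
            "significance": "recommended",
        }
    }
}
flat_rows = flatten_spec(example)
csv_text = rows_to_csv(flat_rows)
```

A flat table like this can be edited in a spreadsheet and regenerated into whatever machine-readable schema format is chosen.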

Moving to FD offers the advantage that the general purpose validation functionality is high quality and maintained by someone else. A specific advantage is the possibility of foreign-key integrity checks made possible by the schema's awareness of dependencies between tables.
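
To make the foreign-key point concrete, the sketch below shows the shape of a `foreignKeys` entry as described in the Table Schema specification, together with a hand-rolled integrity check in plain Python. It does not call the Frictionless library itself, and the table names, field names, and rows are made up for illustration.

```python
# Hypothetical Table Schema fragment: every participant_id in the
# "biospecimen" table must also appear in the "participant" table.
biospecimen_schema = {
    "fields": [
        {"name": "biospecimen_id", "type": "string"},
        {"name": "participant_id", "type": "string"},
    ],
    "foreignKeys": [
        {
            "fields": "participant_id",
            "reference": {"resource": "participant", "fields": "participant_id"},
        }
    ],
}


def find_orphans(rows, schema, tables):
    """Return (row_index, field, value) for values absent from the referenced table.

    Only single-column foreign keys are handled in this sketch.
    """
    orphans = []
    for key in schema.get("foreignKeys", []):
        local_field = key["fields"]
        reference = key["reference"]
        known = {row[reference["fields"]] for row in tables[reference["resource"]]}
        for index, row in enumerate(rows):
            if row[local_field] not in known:
                orphans.append((index, local_field, row[local_field]))
    return orphans


tables = {"participant": [{"participant_id": "P1"}, {"participant_id": "P2"}]}
biospecimens = [
    {"biospecimen_id": "B1", "participant_id": "P1"},
    {"biospecimen_id": "B2", "participant_id": "P9"},  # no such participant
]
orphans = find_orphans(biospecimens, biospecimen_schema, tables)
```

With a general-purpose validator, a check of this kind would come from the schema declaration rather than from custom code.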

Also, I think maintaining the schema as a flat-table-of-fields behind the scenes would simplify the schema designers' work, as it relieves them of the need to hand-edit schema specification files whose syntax is really designed for the use case of machine reading.

I don't have any kind of association with FD. I'm in the Nadeem lab at MSKCC, and we're making software that would stand to benefit from the MITI standards.

The MITI spec is really comprehensive, and many groups would benefit from standardization in this domain! Thank you for your work in this important effort.

Add an automated test to validate YAML files

Add a GitHub Actions workflow that, for each file, verifies that:

It can be loaded by a canonical YAML parser.
Each entry contains description, type and significance fields.
The type is specified as one of the pre-defined choices: boolean, integer, float, string, filename, date, doi, rrid.

Such a workflow will help with automatic validation of future contributions.
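
The three checks could be sketched as a small script that the workflow runs on each file. This is one possible shape, assuming every file loads to a top-level mapping of field name to entry. The function names are hypothetical, and `check_file` needs PyYAML to be installed.

```python
ALLOWED_TYPES = {
    "boolean", "integer", "float", "string", "filename", "date", "doi", "rrid",
}
REQUIRED_KEYS = ("description", "type", "significance")


def check_entries(entries):
    """Return a list of human-readable problems found in one loaded file."""
    problems = []
    for name, entry in entries.items():
        if not isinstance(entry, dict):
            problems.append(f"{name}: entry is not a mapping")
            continue
        for key in REQUIRED_KEYS:
            if key not in entry:
                problems.append(f"{name}: missing '{key}'")
        if "type" in entry and entry["type"] not in ALLOWED_TYPES:
            problems.append(f"{name}: unknown type '{entry['type']}'")
    return problems


def check_file(path):
    """Load one YAML file and check its entries (requires PyYAML)."""
    import yaml  # imported here so check_entries works without PyYAML

    with open(path, encoding="utf-8") as handle:
        entries = yaml.safe_load(handle)  # raises yaml.YAMLError on invalid YAML
    return check_entries(entries or {})
```

A workflow step could call `check_file` on each YAML file in the repository and exit with a non-zero status whenever the returned list is not empty.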

Design a conditional structure for representing significance values

Whether a certain field is required may be conditioned on the value of other fields. Currently, these relationships are captured in plain text format, e.g.:

Object class Description:
  description: Free text description of object class
  type: string
  significance: If Object Class = 'other', required, otherwise recommended

Consider representing these relationships in a structured format instead, so that tools using MITI can parse them easily. For example, the above could be represented as

Object class Description:
  description: Free text description of object class
  type: string
  significance:
    required: Object class == 'other'

Another example:

PhysicalSizeZ:
  description: physical size of one pixel in z-dimension
  type: float
  significance:
    required: sizeZ > 1

where Object class and sizeZ refer to other existing fields at the top level.
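
To show how a tool might consume such conditions, here is a minimal evaluator for expressions of the form `<field> <operator> <literal>`. The grammar, the function names, and the rule that a missing field never triggers the condition are assumptions of this sketch, not part of the proposal.

```python
import operator
import re

OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}
# Field names may contain spaces, so match lazily up to the first operator.
CONDITION = re.compile(r"^(?P<field>.+?)\s*(?P<op>==|!=|>=|<=|>|<)\s*(?P<value>.+)$")


def parse_value(text):
    """Interpret a quoted literal as a string and anything else as a number."""
    text = text.strip()
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "'\"":
        return text[1:-1]
    return float(text)


def is_required(significance, record):
    """Evaluate the 'required' condition of a significance block against a record."""
    condition = significance["required"].strip()
    match = CONDITION.match(condition)
    if match is None:
        raise ValueError(f"cannot parse condition: {condition!r}")
    actual = record.get(match["field"])
    if actual is None:
        return False  # assumption: a missing field never triggers the condition
    return OPERATORS[match["op"]](actual, parse_value(match["value"]))


needs_description = is_required(
    {"required": "Object class == 'other'"}, {"Object class": "other"}
)
needs_size_z = is_required({"required": "sizeZ > 1"}, {"sizeZ": 1})
```

A restricted grammar like this keeps the conditions readable in YAML while avoiding `eval` on schema text.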

Governance

Changes to MITI require a submission (via GitHub Issues, with the label "enhancement") containing the following information:

Scope and Field
Summary of changes
Example
Implementation as a pull request

The community can discuss and vote on the submission via GitHub for at least 30 days.

The governance board - Denis Schapiro (chair - Heidelberg), Sarah Arena (Harvard), Adam Taylor (Sage Bionetworks) - will decide, based on the community vote and response, whether to accept, ask for a revision, or decline.

If the implementation needs a revision, it must be submitted no later than 30 days after acceptance.

valid-values for "Lost to follow up" in 01-clinical.yaml should be quoted

Currently, valid-values for the item Lost to follow up use unquoted values:

Lost to follow up:
  description: Yes/No/Unknown indicator to identify whether a patient was lost to
    follow up.
  category: Follow-Up
  type: string
  valid-values:
  - Yes
  - No
  - Unknown

When the file is loaded with yaml.load (PyYAML==6.0), the unquoted values Yes and No are converted to True and False:

>>> import yaml
>>> 
>>> yaml_str = """\
... valid-values:
...     - Yes
...     - No
...     - Unknown
... """
>>> loaded_data = yaml.load(yaml_str, Loader=yaml.FullLoader)
>>> 
>>> print(loaded_data)
{'valid-values': [True, False, 'Unknown']}

The affected entry is at MITI/yaml/01-clinical.yaml, line 3915 in 93201a7:

Lost to follow up:
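
To catch other occurrences of the same problem, a simple line-based scan can flag list items that PyYAML's default resolver would read as booleans. This is a heuristic sketch that only looks at single-word `- value` list items. The function name is hypothetical, and the word list reflects PyYAML's YAML 1.1-style boolean resolution.

```python
import re

# Plain scalars that PyYAML's default resolver reads as booleans.
BOOLEAN_WORDS = {
    "yes", "Yes", "YES", "no", "No", "NO",
    "true", "True", "TRUE", "false", "False", "FALSE",
    "on", "On", "ON", "off", "Off", "OFF",
}
# A list item consisting of a single unbroken token, e.g. "  - Yes".
LIST_ITEM = re.compile(r"^\s*-\s+(?P<value>\S+)\s*$")


def find_unquoted_booleans(yaml_text):
    """Return (line_number, value) for list items that would load as bool."""
    hits = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        match = LIST_ITEM.match(line)
        if match and match["value"] in BOOLEAN_WORDS:
            hits.append((number, match["value"]))
    return hits


before = "valid-values:\n  - Yes\n  - No\n  - Unknown\n"
after = "valid-values:\n  - 'Yes'\n  - 'No'\n  - Unknown\n"
hits_before = find_unquoted_booleans(before)
hits_after = find_unquoted_booleans(after)
```

Quoting the two values, as in `after`, keeps them as strings when the file is loaded.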
