
Comments (4)

nmenda commented on August 20, 2024

Actually, there are more missing terms.
We have 257 in the latest file from cassavabase, but only 239 variable terms in the working copy in this repo.
Working on fixing this.
These terms must have gotten lost in one of the reformatting cycles.

from co_334-cassava-traits.

leova commented on August 20, 2024

Good that you bring this issue up.
I think we should first be clear on the files involved in the issue. Let’s call:

  • File A: the "latest file from cassavabase", presumably the file from before the 2015 curation
  • File B: the TD in template v5, whose structure has been discussed and approved and whose curated content has been accepted by Afola
  • File C: the OBO file converted by Marie from file B. This file is currently on the planteome GitHub.

Is it correct that you identified 18 variables missing between file A and file C (257 variables in file A and 239 variables in file C)?

What is file A?

If you mean that file A is the OBO on https://github.com/nextgencassava/cassava_ontology, I cannot make sense of the count, because:

  • 1/ I count 256 terms, of which 241 are variables (even though they are called traits in this file). Indeed, CGIAR cassava trait ontology, agronomic trait, morphological trait, physiological trait, quality trait, stress trait, abiotic stress trait, biotic stress trait, bacterial disease, viral disease, fungal disease, insect damage, derives_from, method_of and scale_of are terms that are not variables. That would mean that only 2 variables are missing (241 - 239 = 2).
  • 2/ The OBO does not include the term 0000256 that you identified as missing.
    By the way, I cannot find 0000256 anywhere (neither on http://www.cassavabase.org/chado/cvterm?action=view&cvterm_id=70760 nor on http://www.cropontology.org/rdf/CO_334:0000256, nor in any file I have locally).
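The count in point 1/ can be reproduced with a short script. The following is only a minimal sketch: it assumes the OBO file uses the usual `[Term]` / `[Typedef]` stanzas with `id:` and `name:` lines, and the sample text (including the root ID `CO_334:0000000`) is made up for illustration, not taken from the real file.

```python
# Names listed above as terms that are not variables (15 in total).
NON_VARIABLE_NAMES = {
    "CGIAR cassava trait ontology", "agronomic trait", "morphological trait",
    "physiological trait", "quality trait", "stress trait",
    "abiotic stress trait", "biotic stress trait", "bacterial disease",
    "viral disease", "fungal disease", "insect damage",
    "derives_from", "method_of", "scale_of",
}


def parse_obo_stanzas(text):
    """Return {id: name} for every [Term] or [Typedef] stanza in OBO text."""
    entries = {}
    current_id = None
    in_stanza = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only Term and Typedef stanzas are counted.
            in_stanza = line in ("[Term]", "[Typedef]")
            current_id = None
        elif in_stanza and line.startswith("id:"):
            current_id = line[3:].strip()
            entries[current_id] = ""
        elif in_stanza and current_id and line.startswith("name:"):
            entries[current_id] = line[5:].strip()
    return entries


def variable_ids(entries):
    """Keep only the entries whose name is not a known non-variable term."""
    return {i for i, name in entries.items() if name not in NON_VARIABLE_NAMES}


# Hypothetical three-stanza excerpt, for illustration only.
SAMPLE = """\
format-version: 1.2

[Term]
id: CO_334:0000000
name: CGIAR cassava trait ontology

[Term]
id: CO_334:0000015
name: Harvest Index

[Typedef]
id: method_of
name: method_of
"""

entries = parse_obo_stanzas(SAMPLE)
print(len(entries), "terms,", len(variable_ids(entries)), "variables")
```

Run on the real OBO, the two numbers printed would be the ones to compare with 256 and 241.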

As I could not look at the file A you meant, I was not able to derive the full list of missing terms. Nevertheless, I have worked on the other examples you gave (15, 77, 123, 224) and I have identified 2 causes of loss.

1/ the terms were not present in the original working curation file

On 04/03/2015, Afola sent an updated version of the cassava ontology to Elizabeth. He sent 2 versions of this ontology:

  • File A1: an Excel TD under template version 4 with 242 Trait-Method-Scale triplets (remark: there are actually 143 triplets in the file, but it includes "days to flower 109", which Afola replaced with "root constriction 109")
  • File A2: an OBO file with 248 variables, back then called traits

Leaving aside CO_334:0000027 bacterial disease, CO_334:0000028 viral disease, CO_334:0000029 fungal disease and CO_334:0000030 insect damage, the OBO has 2 variables that are absent from the TD: CO_334:0000077 post-harvest physiological deterioration and CO_334:0000123 plant height with leaf.

At that time, I had assumed that these two files were equivalent, so I worked on file A1, the Excel TD, and not on file A2, the OBO. This might account for the loss of 2 variables (CO_334:0000077: post-harvest physiological deterioration and CO_334:0000123: plant height with leaf).

2/ the terms were lost during the curation/formatting process

I have looked for IDs that were lost while curating, converting and exchanging files by comparing file A1 and file B (I saw no conversion issue between file B and file C). I looked for IDs that were present in file A1 and had disappeared in file B, and found only 2 variables: CO_334:0000015 Harvest Index and CO_334:0000224 staygreen.
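This kind of comparison can be scripted as a set difference over the term IDs. The sketch below is an assumption about how one might do it, not the procedure actually used: it supposes both files are available as plain text (for example the Excel TD exported to CSV), and the two sample strings are toy excerpts, not the real file contents.

```python
import re

# A CO_334 term ID: the prefix followed by exactly seven digits.
ID_PATTERN = re.compile(r"CO_334:\d{7}(?!\d)")


def extract_ids(text):
    """Collect every well-formed CO_334 ID that appears in the text."""
    return set(ID_PATTERN.findall(text))


def lost_ids(before, after):
    """IDs present in the earlier file but absent from the later one."""
    return sorted(extract_ids(before) - extract_ids(after))


# Toy excerpts standing in for file A1 and file B (illustrative only).
file_a1 = (
    "CO_334:0000015,Harvest Index\n"
    "CO_334:0000109,root constriction\n"
    "CO_334:0000224,staygreen\n"
)
file_b = "CO_334:0000109,root constriction\n"

print(lost_ids(file_a1, file_b))
```

Swapping the arguments gives the IDs that were added rather than lost. Note that the pattern skips malformed IDs with the wrong number of digits, so those would need a separate check.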

I have not checked, so I cannot say when and why they got lost. But I apologize in advance if the loss of these 2 variables is my responsibility.

My conclusion

To the best of my knowledge and understanding, I can only make sense of this issue by saying that only CO_334:0000015 Harvest Index and CO_334:0000224 staygreen were lost during the curation/formatting/conversions, and that only CO_334:0000077 post-harvest physiological deterioration and CO_334:0000123 plant height with leaf were left out of the curation process.

Thanks in advance for sharing any further information that could help identify other missing variables.


nmenda commented on August 20, 2024

Leo,

I checked the versions again, and it looks like you are correct: the only missing terms are 0000015, 0000077, 0000123 and 0000224! We might have other terms that did not make it into the CO version of 4/3/15.
I will add these 4 now and will try to add proper methods and scales.
If there are more variables that fell through the cracks between April and now, we will add them back to this OBO file.


nmenda commented on August 20, 2024

7fa2a63 closes this issue
