obo stats (ols4 issue, closed)

ebispot commented on June 5, 2024
obo stats

from ols4.

Comments (12)

udp commented on June 5, 2024

Hi @matentzn

I checked the latest indexer run and these seem to be the OBO ontologies we still have a problem with:

| ontology id | purl | problem |
| --- | --- | --- |
| mamo | http://purl.obolibrary.org/obo/mamo.owl | OWL XML |
| vario | http://purl.obolibrary.org/obo/vario.owl | OWL ?? |
| gaz | http://purl.obolibrary.org/obo/gaz.obo | OBO |
| dinto | http://purl.obolibrary.org/obo/dinto.owl | Redirects to the github repo |
| eo | http://purl.obolibrary.org/obo/eo.owl | Redirects to https://raw.githubusercontent.com/Planteome/plant-environment-ontology/master/plant-environment-ontology.obo.owl which is 404 |
| epo | http://purl.obolibrary.org/obo/epo.owl | Redirects to https://epidemiology-ontology.googlecode.com/files/epidemiology_ontology.owl which is 404 |
| ero | http://purl.obolibrary.org/obo/ero.owl | Redirects to https://open.catalyst.harvard.edu/products/eagle-i/ which is an HTML page, not an ontology |
| flu | http://purl.obolibrary.org/obo/flu.owl | Imports http://purl.obolibrary.org/obo/ido/2010-12-02/ido-main-workaround.owl which is 404 |
| mfo | http://purl.obolibrary.org/obo/mfo.owl | Redirects to https://obofoundry.org/ which is not an ontology |
| mirnao | http://purl.obolibrary.org/obo/mirnao.owl | Redirects to http://mirna-ontology.googlecode.com/svn/trunk/src/ontology/mirnao.owl which is 404 |
| mo | http://purl.obolibrary.org/obo/mo.owl | Redirects to http://ontologies.berkeleybop.org/ which is not an ontology |
| nmr | http://purl.obolibrary.org/obo/nmr.owl | Redirects to http://ontologies.berkeleybop.org/ which is not an ontology |
| ogi | http://purl.obolibrary.org/obo/ogi.owl | Redirects to https://ontology-for-genetic-interval.googlecode.com/svn/trunk/src/OGI.owl which is 404 |
| sep | http://purl.obolibrary.org/obo/sep.owl | Redirects to http://ontologies.berkeleybop.org/sep.owl NoSuchKey |
| vhog | http://purl.obolibrary.org/obo/vhog.owl | Redirects to http://ontologies.berkeleybop.org/vhog.owl NoSuchKey |
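
The kinds of failure listed in the table (404, NoSuchKey, an HTML page instead of an ontology) could be labelled automatically with something like the sketch below. This is only an illustration, not how the OLS indexer does it: the function name `classify_purl_result` and its labels are made up here, and the inputs are assumed to be the final status code, Content-Type header and first bytes of the body after redirects have been followed.

```python
def classify_purl_result(status, content_type, body_start):
    """Label what a PURL finally resolved to (heuristic, hypothetical helper)."""
    head = body_start.lstrip().lower()
    # S3-style error documents carry a NoSuchKey code in the body.
    if "nosuchkey" in head:
        return "NoSuchKey"
    if status == 404:
        return "404"
    if status != 200:
        return f"HTTP {status}"
    # A 200 response can still be a landing page rather than an ontology file.
    if "html" in content_type.lower() or head.startswith(("<!doctype html", "<html")):
        return "HTML page, not an ontology"
    return "possibly an ontology"


print(classify_purl_result(200, "text/html; charset=utf-8", "<!DOCTYPE html><html>"))
```

A result of "possibly an ontology" only means nothing obviously wrong was seen; the file would still need to be parsed to know more.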

matentzn commented on June 5, 2024

I manually checked most of these. I would suggest restricting OLS to only the active ontologies in OBO:

https://obofoundry.org/

All of the ones you listed here (or most of them, I didn't check all) are obsolete or inactive. OBO Foundry does not recommend the use of non-active ontologies (i.e. they are hidden on https://obofoundry.org/).

anitacaron commented on June 5, 2024

@udp, can you confirm that RO is not having issues anymore, please?

udp commented on June 5, 2024

For that 6.5 GB JSON file, json2csv took 3 minutes and generated 765 MB of CSV.

This seems a suspiciously large difference, so I tried gzipping both to see how much ACTUAL data there was rather than just repetition:

  • The 6.5 GB JSON file compressed to 322 MB.
  • The 765 MB of CSV compressed to 285 MB.

Those numbers are firmly in the same ballpark, so I think no data has been lost. It also means all of OBO Foundry is actually pretty tiny, depending on how you represent it.

I also tried gzipping ALL of the OLS “downloads” folder from noah, so that’s all the OWL files from OBO and OLS’s ontologies, which also includes lots of obsolete stuff I didn’t index above. That compressed to 886 MB. So all of the data in OLS is actually only 886 MB when compressed!
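
The gzip comparison can be reproduced on a small scale. The sketch below is only an illustration with made-up records (the field names and IDs are invented, not OLS data): it writes the same records as indented JSON and as CSV, then compares raw and gzipped sizes.

```python
import csv
import gzip
import io
import json


def raw_and_gzip_sizes(data: bytes):
    """Return (raw size, gzipped size) in bytes."""
    return len(data), len(gzip.compress(data))


# Made-up records standing in for indexed terms.
records = [
    {"id": f"EX:{i:07d}", "label": f"term {i}", "ontology": "example"}
    for i in range(5000)
]

# JSON repeats every key name in every record.
as_json = json.dumps(records, indent=2).encode("utf-8")

# CSV states the column names once, in the header row.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "label", "ontology"])
writer.writeheader()
writer.writerows(records)
as_csv = buf.getvalue().encode("utf-8")

json_raw, json_gz = raw_and_gzip_sizes(as_json)
csv_raw, csv_gz = raw_and_gzip_sizes(as_csv)
print(f"JSON: {json_raw} bytes raw, {json_gz} bytes gzipped")
print(f"CSV:  {csv_raw} bytes raw, {csv_gz} bytes gzipped")
```

The raw sizes differ a lot because of the repeated keys and indentation, while the gzipped sizes end up much closer together, which is the same pattern as in the numbers above.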

matentzn commented on June 5, 2024

For

doid
cto
cvdo
mfmo
ons
ro
upheno
mamo
vario

can you list the import URLs that are not rdfxml? I may be able to fix these with a bit of a sledge hammer.
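
One way to collect the import URLs asked about here, assuming the top-level file itself is RDF/XML, is to read its `owl:imports` statements with the standard library. This is a rough sketch rather than what OLS does; `list_imports` is a hypothetical helper and the example.org URLs are placeholders.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def list_imports(rdfxml_text):
    """Return the rdf:resource of every owl:imports element in an RDF/XML string."""
    root = ET.fromstring(rdfxml_text)
    urls = []
    for element in root.iter(f"{{{OWL}}}imports"):
        url = element.get(f"{{{RDF}}}resource")
        if url:
            urls.append(url)
    return urls


# Placeholder document; the URLs are not real ontologies.
sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Ontology rdf:about="http://example.org/demo.owl">
    <owl:imports rdf:resource="http://example.org/a.owl"/>
    <owl:imports rdf:resource="http://example.org/b.owl"/>
  </owl:Ontology>
</rdf:RDF>"""

print(list_imports(sample))
```

Each returned URL would then need to be fetched and checked separately to see which syntax it is in.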

ogi OBOFoundry/OBOFoundry.github.io#1942
ero (inactive on obo, URL of ontology redirects to website) OBOFoundry/OBOFoundry.github.io#1942
rnao Resolves: http://purl.obolibrary.org/obo/rnao.owl

udp commented on June 5, 2024

@matentzn I also checked them manually to make the table. I didn't realise they were obsolete/inactive. However, they will be completely absent (= 404) from OLS when we ship OLS4 if we do not load them. Will this be an issue?

henrietteharmse commented on June 5, 2024

In general I am happy with not loading inactive ontologies. However, even if an ontology is inactive, it can still be in use and we cannot drop its availability, particularly when there seems to be no alternative. I think MAMO is a good example of this: it is used at EBI by the BioModels team.

A way around this is to not load inactive OBO ontologies. In a case like MAMO we can add it to the EBI OLS config with the URL pointing to the file system.
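
As a rough sketch of that selection logic: keep only the active OBO entries, then let a local list add back anything still needed. The field names `activity_status` and `ontology_purl`, the helper `select_ontologies` and the file path are all assumptions for illustration, not the actual OLS config format.

```python
def select_ontologies(obo_registry, local_overrides):
    """Map ontology id -> URL: active OBO entries plus locally configured ones."""
    selected = {}
    for entry in obo_registry:
        # Assumed field name; anything not marked active is skipped.
        if entry.get("activity_status") == "active":
            selected[entry["id"]] = entry["ontology_purl"]
    # Local entries win, so an inactive ontology can still be loaded from disk.
    selected.update(local_overrides)
    return selected


# Illustrative registry entries (statuses chosen for the example).
obo_registry = [
    {"id": "ro", "activity_status": "active",
     "ontology_purl": "http://purl.obolibrary.org/obo/ro.owl"},
    {"id": "mamo", "activity_status": "inactive",
     "ontology_purl": "http://purl.obolibrary.org/obo/mamo.owl"},
    {"id": "ogi", "activity_status": "inactive",
     "ontology_purl": "http://purl.obolibrary.org/obo/ogi.owl"},
]

# Hypothetical path to a copy kept on the file system.
local_overrides = {"mamo": "file:///path/to/mamo.owl"}

print(select_ontologies(obo_registry, local_overrides))
```

With these inputs, `ro` is loaded from its PURL, `mamo` from the local file, and `ogi` is not loaded at all.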

matentzn commented on June 5, 2024

@henrietteharmse I think your suggestion is the way to go.

Maybe be a bit more conservative for now and only exclude obsolete ontologies from OBO to start with. @udp, if you supply me with a list of the remaining ones (non-obsolete but breaking), I can maybe reach out to the groups and use OLS inclusion to get them to up their game a bit and fix their ontologies.

udp commented on June 5, 2024

@matentzn We currently have an issue with RO. Though the core file is RDF/XML:

https://raw.githubusercontent.com/oborel/obo-relations/master/ro.owl

it imports this file: https://raw.githubusercontent.com/oborel/obo-relations/master/chemical.owl which is in functional syntax.

Issue opened here: oborel/obo-relations#673
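
A cheap way to spot this kind of mismatch is to look at how a file starts. The sketch below is only a heuristic and not part of OLS: `sniff_syntax` is a hypothetical helper, and it assumes the conventional `rdf:` prefix for RDF/XML and that functional-syntax files begin with `Prefix(` or `Ontology(`.

```python
def sniff_syntax(text):
    """Guess the serialization of an ontology file from its first characters."""
    head = text.lstrip()[:2000]
    # OWL functional syntax documents normally open with Prefix(...) or Ontology(...).
    if head.startswith(("Prefix(", "Ontology(")):
        return "functional"
    # OBO format files normally open with a format-version header.
    if head.startswith("format-version:"):
        return "obo"
    if head.startswith("<"):
        if "<rdf:RDF" in head:
            return "rdfxml"
        if "<Ontology" in head:
            return "owlxml"
        return "other-xml"
    return "unknown"


print(sniff_syntax("Prefix(:=<http://example.org/demo#>)\nOntology(<http://example.org/demo>)"))
```

Anything reported as other than "rdfxml" would be a candidate for the kind of problem described above, though a real parser is the only reliable check.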

matentzn commented on June 5, 2024

This will be solved soon by @anitacaron. The solution is already there; we just need time to review and implement it.

allenbaron commented on June 5, 2024

For the Human Disease Ontology (doid), would loading the doid-merged.owl file (http://purl.obolibrary.org/obo/doid/doid-merged.owl), which has all imports loaded in, fix this issue?

@lschriml, fyi.

allenbaron commented on June 5, 2024

The doid file that isn't RDF/XML was our ext.owl file (in OFN). We recently switched it to RDF/XML because other people were experiencing parsing issues (DiseaseOntology/HumanDiseaseOntology#1112).
