biokg's People

Contributors

dimitrisalivas, friguzzi, pminervini, samehkamaleldin

biokg's Issues

Open variant of BioKG

I am interested in using BioKG. However, I would need a version built only from sources licensed under CC0, CC BY, CC BY-NC, or CC BY-NC-SA. For this reason, I would definitely have to exclude KEGG and HPA. Are there alternative sources that could replace the linking currently done through these two?

Loading BioKG in Neo4j

Hey folks,

First of all, I'd like to thank you for this contribution. Having a unified biomedical KG is an essential resource for research in this domain.

I would like to use BioKG in my work. Specifically, we would like to train a link predictor to perform the task of drug-target interaction prediction and utilise the benchmarks you so thoughtfully include, in order to compare the performance of our DTI approach vs others.

For this, I thought it would be useful to have the final BioKG data (in /data/biokg/) loaded into a Neo4j property graph store, to enable querying for specific benchmarks (using hyper-relations, for example: a relation DTI with qualifier (benchmark: 'FDA') or DDI with qualifier (benchmark: 'MINERAL')). Furthermore, having BioKG as a Neo4j-ready graph could increase usability and visibility, so I plan on making it public once I manage to get it done.

The 2 issues I'm facing:

  1. The number of unique entities/relations that I see after loading the .tsv data in Pandas differs from the numbers reported in the paper, so I've been looking into what could have gone wrong.

  2. The way I create the Neo4j graph is as follows:

  • Load all entity types from the metadata + properties files.
  • Get the unique IDs and use them to create nodes with Cypher.
  • Load the links.
  • Match each link against the already created nodes by node_id and, if both subject and object match, create the link.
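For the first issue, a minimal sketch of how the counts could be taken without Pandas, so that the parsing is explicit. It assumes a headerless, tab-separated file with three columns (head, relation, tail); the function name `summarize_triples` and the sample rows are hypothetical and not taken from BioKG. Duplicate rows, stray whitespace, and quote handling are some of the things that can shift such counts, though whether any of them explains the difference from the paper is not known here.

```python
import csv
import io


def summarize_triples(tsv_file):
    """Count distinct entities, relations, and triples in a head/relation/tail TSV."""
    entities, relations, triples = set(), set(), set()
    rows = 0
    # QUOTE_NONE treats quote characters as ordinary text, so a stray quote
    # inside a name cannot merge several fields or lines into one.
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            continue  # skip blank or malformed lines
        head, relation, tail = (field.strip() for field in row)
        rows += 1
        entities.update((head, tail))
        relations.add(relation)
        triples.add((head, relation, tail))
    return {
        "rows": rows,
        "entities": len(entities),
        "relations": len(relations),
        "unique_triples": len(triples),
    }


# Made-up rows, not real BioKG content; the second row repeats the first.
sample = io.StringIO(
    "D1\tDRUG_TARGET\tP1\n"
    "D1\tDRUG_TARGET\tP1\n"
    "P1\tPROTEIN_PATHWAY\tPW1\n"
)
summary = summarize_triples(sample)
print(summary)
```

Comparing `rows` with `unique_triples` shows whether duplicates are inflating a row-based count.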

Following the logic above, everything runs smoothly up to the point where I try to load the links that involve COMPLEXES + PATHWAYs, for which I cannot find any matching nodes.
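The loading logic described above can be sketched as follows, as an illustration under assumptions rather than the code actually used. The Cypher templates, the node label `Entity`, the relationship type `LINK`, and the function `plan_load` are all hypothetical. Because Cypher does not allow a relationship type to be passed as a parameter, the sketch stores the relation name as a property on a generic relationship. Links whose endpoints were never created as nodes are counted per relation instead of being dropped silently.

```python
from collections import Counter

# Hypothetical Cypher templates; `Entity` and `LINK` are placeholder names.
CREATE_NODE = "MERGE (n:Entity {id: $id})"
CREATE_LINK = (
    "MATCH (s:Entity {id: $head}), (o:Entity {id: $tail}) "
    "MERGE (s)-[:LINK {type: $relation}]->(o)"
)


def plan_load(node_ids, links):
    """Return parameter dicts for nodes and links, plus skipped links per relation."""
    known = set(node_ids)
    node_params = [{"id": node_id} for node_id in sorted(known)]
    link_params = []
    skipped = Counter()
    for head, relation, tail in links:
        if head in known and tail in known:
            link_params.append({"head": head, "relation": relation, "tail": tail})
        else:
            # At least one endpoint has no node, so MATCH would find nothing.
            skipped[relation] += 1
    return node_params, link_params, skipped


# Made-up ids for illustration only.
nodes, links_to_create, skipped = plan_load(
    {"D1", "P1"},
    [("D1", "DRUG_TARGET", "P1"), ("P1", "MEMBER_OF_COMPLEX", "C1")],
)
print(len(nodes), len(links_to_create), dict(skipped))
```

With the official Neo4j Python driver, each parameter dict would then be passed together with its template to `session.run`.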

If I understand the data model correctly, complex_ids exist only as part of the LINKS file and do not appear in the properties + metadata files (?).

Which identifiers should I use to create the unique Complex nodes?
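One way to check the suspicion above is to list every identifier that occurs in the links but has no node, together with the relations and positions in which it occurs. This is only a diagnostic sketch: `ids_missing_from_nodes` and the sample ids are hypothetical, and it does not say which identifiers the maintainers intend for Complex nodes.

```python
from collections import defaultdict


def ids_missing_from_nodes(node_ids, links):
    """Map each link endpoint without a node to the (relation, position) pairs it appears in."""
    known = set(node_ids)
    missing = defaultdict(set)
    for head, relation, tail in links:
        if head not in known:
            missing[head].add((relation, "head"))
        if tail not in known:
            missing[tail].add((relation, "tail"))
    return dict(missing)


# Made-up ids; only "P1" was created as a node from the metadata files.
missing = ids_missing_from_nodes(
    {"P1"},
    [("P1", "MEMBER_OF_COMPLEX", "C1"), ("C1", "COMPLEX_IN_PATHWAY", "PW1")],
)
for entity_id, usages in sorted(missing.items()):
    print(entity_id, sorted(usages))
```

If the missing ids all sit on the complex or pathway side of their relations, that would support the reading that these ids exist only in the links file.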

Apologies for the lengthy post and for potential inaccuracies on my end.

Minor comment:
A typo I found while reading your documentation:

<uniprot_acc> MEMBER_OF_COMPLEX <mesh_id>

The relation should be PROTEIN_DISEASE if I'm not mistaken.

Thank you again for your great contribution! I would greatly appreciate any help :-)

Cheers!

Missing Gene Ontology

Hi, thank you for developing BioKG. I'm currently reading several articles on biomedical knowledge graphs and came across your work. I noticed that the BioKG paper mentions Gene Ontology, but I couldn't find it in this repository. Is there a plan to add it soon? Thanks in advance.
