sssom2neo

Convert SSSOM TSV to nodes and edges CSV files that can be ingested by neo4j-admin import.

To build:

mvn clean package

To run, assuming you have some mappings called mappings.sssom.tsv:

java -jar target/sssom2neo-1.0-SNAPSHOT.jar \
    --input mappings.sssom.tsv \
    --output-edges edges.csv \
    --output-nodes nodes.csv

You can also run over a directory containing lots of mappings files, like the OLS SSSOM dataset:

java -jar target/sssom2neo-1.0-SNAPSHOT.jar \
    --input ./mappings/ \
    --output-edges edges.csv \
    --output-nodes nodes.csv

Now you have two files, nodes.csv and edges.csv.
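Before importing, it can be worth a quick look at what was generated. The following is a minimal Python sketch, not part of sssom2neo, that prints the header row of each file; the `csv_header` helper is a hypothetical name, and the sketch assumes the files are UTF-8 encoded CSV with the header on the first row.

```python
import csv
from pathlib import Path


def csv_header(path):
    """Return the first row (the header) of a CSV file as a list of strings.

    Hypothetical helper for illustration; returns [] for an empty file.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle), [])


if __name__ == "__main__":
    # Assumes the converter was run in the current directory.
    for name in ("nodes.csv", "edges.csv"):
        if Path(name).exists():
            print(name, csv_header(name))
        else:
            print(name, "not found; run the converter first")
```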

Let's load them into Neo4j! Assuming you already have Docker installed, this is straightforward. We will populate a new folder called neo with our Neo4j database. First, we use neo4j-admin to import the CSV files:

docker run \
    -v "$(pwd)/neo:/data" \
    -v "$(pwd)/nodes.csv:/mnt/nodes.csv" \
    -v "$(pwd)/edges.csv:/mnt/edges.csv" \
    neo4j:4.4.20-community \
    neo4j-admin import --force --database=neo4j \
        --array-delimiter="u+0000" \
        --nodes=/mnt/nodes.csv \
        --relationships=/mnt/edges.csv

If everything worked correctly, the neo folder should now contain a neo4j database populated with the SSSOM mappings from nodes.csv and edges.csv generated by the code in this repo. We can now start Neo4j:

docker run \
    -v "$(pwd)/neo:/data" \
    -p 7474:7474 \
    -p 7687:7687 \
    --env=NEO4J_AUTH=none \
    neo4j:4.4.20-community

Hit up http://localhost:7474 to go forth and cypher!

Examples

Get all mappings for a given subject

This query returns all mappings to or from MONDO:0005015 (diabetes mellitus). The pattern (a)<-[mapping]->(b) matches relationships in either direction, so the results include both outgoing mappings (defined by MONDO) and incoming mappings (defined by other ontologies).

MATCH (a)<-[mapping]->(b) WHERE a.id="MONDO:0005015" RETURN *

Get all mappings for a given subject (transitive)

We can use an arbitrary level of depth, e.g. to search for mappings up to 3 levels deep:

MATCH (a)<-[mapping*0..3]->(b) WHERE a.id="MONDO:0005015" RETURN *

This result set includes transitive mappings, e.g. MONDO:0005015 -hasDbXref-> UMLS:C0011849 <-hasDbXref- ORDO:101952 -hasDbXref-> UMLS:C0011860.

Therefore UMLS:C0011860 (Type 2 diabetes mellitus) is included in the result set. Note that this is a more specific term than the one we started with! This follows from the weak semantics of hasDbXref, which does not say whether a mapping is exact, broader, or narrower, and it is a good example of why ontologies should use richer mapping metadata.
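To see how the more specific term gets pulled in, here is a minimal Python sketch that does not use Neo4j at all. It assumes a toy in-memory edge list containing only the three hasDbXref mappings from the path above, and a hypothetical `reachable` helper that walks the edges in either direction up to a depth limit. It only approximates what the variable-length Cypher pattern does, but it reaches the same set of nodes on this small example.

```python
from collections import deque

# Toy edge list (subject, predicate, object), copied from the example path above.
EDGES = [
    ("MONDO:0005015", "hasDbXref", "UMLS:C0011849"),
    ("ORDO:101952", "hasDbXref", "UMLS:C0011849"),
    ("ORDO:101952", "hasDbXref", "UMLS:C0011860"),
]


def reachable(start, max_depth, edges=EDGES):
    """Breadth-first search that ignores edge direction.

    Returns a dict mapping each node within max_depth hops of start
    to its hop count. Hypothetical helper for illustration only.
    """
    neighbours = {}
    for subject, _predicate, obj in edges:
        neighbours.setdefault(subject, set()).add(obj)
        neighbours.setdefault(obj, set()).add(subject)

    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depths[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in neighbours.get(node, ()):
            if nxt not in depths:
                depths[nxt] = depths[node] + 1
                queue.append(nxt)
    return depths


for node, depth in sorted(reachable("MONDO:0005015", 3).items(), key=lambda kv: kv[1]):
    print(depth, node)
```

With a depth limit of 2 the walk stops at ORDO:101952; only the third hop reaches UMLS:C0011860, which is why capping the depth or filtering on a more precise mapping predicate changes the result.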

Contributors

jamesamcl
