dgidb / dgidb-v5
Providing interactions between drugs and genes sourced from a variety of publications and knowledgebases
Home Page: https://dgidb.org
License: MIT License
Current query is more of a Hello World demo, not ultimately how we'll organize queries
For TALC, the citation differs in three places, but the 'most correct' citation appears to be the one from the website:
Morgensztern D, Campo MJ, Dahlberg SE, Doebele RC, Garon E, Gerber DE, Goldberg SB, Hammerman PS, Heist RS, Hensing T, et al. Molecularly targeted therapies in non-small-cell lung cancer annual update 2014. J Thorac Oncol 2015; 10: S1-63. PMID: 25535693
Some other sources appear to have dead or incorrect links, or old/weird source citation data as well
Replace dgidb-v4's drug grouper with VICC therapy normalizer: (https://github.com/cancervariants/therapy-normalization)
Figure it out. Potentially replaced by the VICC gene normalizer, so lower priority.
Will remain on WUSTL AWS resources
Can be anything (such as a single GeneClaim or Source citation) and doesn't have to be pretty for now.
This is essentially to learn how to link everything together and show that we can render something on the top-layer front end that's stored in the bottom-layer database.
Similar to what we did for GeneClaims, bring over other data models from old version of DGIdb.
For now as learning exercise and general progress, we can just use the old data. We can refactor these as needed if changes to data structure occur.
Additionally, it'd be nice to implement better per-source deleters in this issue, so that you don't have to delete every grouping in order to delete/re-add a single source (unless this work is already done and I didn't copy them over correctly), as well as more optimized interaction grouping.
As was talked about previously, we should figure out a strategy for properly documenting all major functionality as we progress.
Currently the gene interaction query relies on IDs, which we need to derive from the symbols entered by the user.
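A minimal sketch of that symbol-to-ID step, not the project's actual code. The lookup table, the placeholder ID values, and the function name `resolve_gene_ids` are all hypothetical; upper-casing the input is also an assumption about how symbols are stored.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical symbol -> ID lookup. In practice this would be a database query;
# the ID values here are placeholders, not real DGIdb identifiers.
GENE_IDS_BY_SYMBOL: Dict[str, str] = {
    "BRAF": "gene-id-1",
    "EGFR": "gene-id-2",
}


def resolve_gene_ids(symbols: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (matched IDs, unmatched symbols) for user-entered gene symbols."""
    ids: List[str] = []
    unmatched: List[str] = []
    for raw in symbols:
        cleaned = raw.strip()
        if not cleaned:
            continue  # ignore empty entries
        # Assumption: symbols are stored upper-case, so fold the input to match.
        gene_id = GENE_IDS_BY_SYMBOL.get(cleaned.upper())
        if gene_id is None:
            unmatched.append(cleaned)
        else:
            ids.append(gene_id)
    return ids, unmatched
```

Returning the unmatched symbols separately lets the front end report terms that could not be resolved instead of silently dropping them.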
Currently, the base Importer class will raise an error if it encounters an interaction type or gene claim category that isn't already in the corresponding tables (see e.g.
). We should (in separate issues) ensure that the normalization of the values going into those fields has satisfactory results -- but I don't think a normalized value should have to be manually added to any tables, so the constraints above should be removed, and if the value isn't already in the table, the importer should add it.
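The proposed behavior amounts to a find-or-create lookup. The sketch below illustrates the idea with an in-memory stand-in for a table; the class `LookupTable` and its method are hypothetical and do not reflect the real Importer or its schema.

```python
from typing import Dict


class LookupTable:
    """In-memory stand-in for a table such as interaction types (hypothetical)."""

    def __init__(self) -> None:
        self._rows: Dict[str, int] = {}
        self._next_id = 1

    def find_or_create(self, value: str) -> int:
        """Return the row ID for `value`, inserting a new row if it is missing."""
        row_id = self._rows.get(value)
        if row_id is None:
            # Instead of raising on an unknown value, add it to the table.
            row_id = self._next_id
            self._rows[value] = row_id
            self._next_id += 1
        return row_id

    def __len__(self) -> int:
        return len(self._rows)
```

Repeated calls with the same normalized value return the same ID, so an importer could call this for every claim without checking first.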
Designing layout for results page following design goals laid out in user story exercise.
InteractionClaimType -- normalization defined in interaction claim type model
Clarity Biomarkers: "Biomarker"
Clarity Clinical Trials: "immunostimulator", "natigen", "radioimmunotherapy"
My Cancer Genome: "immunotherapy"
GeneCategory
Hopkins/Groom: "DNA DIRECTED DNA POLYMERASE"
It's a big XML file: https://wiki.nci.nih.gov/display/cageneindex/The+Cancer+Gene+Index+Gene-Disease+and+Gene-Compound+XML+Documents
Either recover the original TSV file, write something to rebuild the TSV file, or import directly from the source
In particular, try to identify cases where non-namespaced ID numbers are getting grouped into genes and drugs and fix the importer code accordingly
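One way to start looking for such cases is to scan claim names for bare numbers. This is only a sketch: the `prefix:local_id` convention, the regular expressions, and the function names are assumptions, and the sample values in the usage note are illustrative.

```python
import re
from typing import Iterable, List

# Assumed convention: a namespaced ID looks like "prefix:local_id".
NAMESPACED_ID = re.compile(r"[A-Za-z][A-Za-z0-9_.]*:\S+")
BARE_NUMBER = re.compile(r"\d+")


def is_namespaced(name: str) -> bool:
    """True if the name carries a namespace prefix such as 'hgnc:'."""
    return NAMESPACED_ID.fullmatch(name.strip()) is not None


def find_bare_numeric_ids(names: Iterable[str]) -> List[str]:
    """Return claim names that are plain numbers with no namespace prefix."""
    return [n.strip() for n in names if BARE_NUMBER.fullmatch(n.strip())]
```

For example, `find_bare_numeric_ids(["hgnc:1097", "673", "BRAF"])` flags only `"673"`, which is the kind of value that could collide across sources when grouped.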
Return data from multiple entries in front end search bar
Evaluate DGIdb current filtering strategy/language against planned approval enum expansion for all sources:
CHEMBL_1
CHEMBL_2
CHEMBL_3
CHEMBL_4
CHEMBL_WITHDRAWN
FDA_DISCONTINUED
FDA_PRESCRIPTION
FDA_OTC
FDA_TENTATIVE
GTOPDB_APPROVED
GTOPDB_WITHDRAWN
HEMONC_APPROVED
RXNORM_PRESCRIBABLE
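To make the evaluation concrete, the values above could be modeled as an enum and filtered against a caller-supplied set. This is a sketch only: the class name `ApprovalRating`, the function, and the drug names in the usage note are hypothetical, and which values should count as "approved" is deliberately left to the caller because that is the question under evaluation.

```python
from enum import Enum
from typing import Dict, List, Set


class ApprovalRating(Enum):
    """The planned approval values, copied from the list above."""

    CHEMBL_1 = "chembl_1"
    CHEMBL_2 = "chembl_2"
    CHEMBL_3 = "chembl_3"
    CHEMBL_4 = "chembl_4"
    CHEMBL_WITHDRAWN = "chembl_withdrawn"
    FDA_DISCONTINUED = "fda_discontinued"
    FDA_PRESCRIPTION = "fda_prescription"
    FDA_OTC = "fda_otc"
    FDA_TENTATIVE = "fda_tentative"
    GTOPDB_APPROVED = "gtopdb_approved"
    GTOPDB_WITHDRAWN = "gtopdb_withdrawn"
    HEMONC_APPROVED = "hemonc_approved"
    RXNORM_PRESCRIBABLE = "rxnorm_prescribable"


def drugs_with_any_rating(
    drug_ratings: Dict[str, Set[ApprovalRating]],
    accepted: Set[ApprovalRating],
) -> List[str]:
    """Return drug names (sorted) holding at least one accepted rating."""
    return sorted(
        name for name, ratings in drug_ratings.items() if ratings & accepted
    )
```

A filter such as "FDA prescription or OTC" then becomes `drugs_with_any_rating(ratings, {ApprovalRating.FDA_PRESCRIPTION, ApprovalRating.FDA_OTC})`, where `ratings` maps each drug name to its set of values.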
Big picture
Specific sources
I think we're currently using a condensed version of the input data because it's super large -- there's work already to pull from an API instead
Adding drugs@FDA as a new source to incorporate information and labeling relating to FDA approval status for drugs.
What fields do we want to search on? Fuzzy match vs exact match? Etc
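To compare the two options, here is a small sketch using Python's standard-library `difflib`. It is not a proposal for the production search: the function names, the 0.8 cutoff, and the case-insensitive comparison are all assumptions.

```python
from difflib import get_close_matches
from typing import List, Sequence


def exact_match(term: str, names: Sequence[str]) -> List[str]:
    """Case-insensitive exact match on the whole name."""
    wanted = term.strip().casefold()
    return [n for n in names if n.casefold() == wanted]


def fuzzy_match(
    term: str, names: Sequence[str], cutoff: float = 0.8, limit: int = 5
) -> List[str]:
    """Case-insensitive approximate match, best candidates first."""
    by_folded = {n.casefold(): n for n in names}
    hits = get_close_matches(
        term.strip().casefold(), list(by_folded), n=limit, cutoff=cutoff
    )
    return [by_folded[h] for h in hits]
```

With a misspelled query such as "imatinb", `exact_match` returns nothing while `fuzzy_match` can still surface "imatinib"; the trade-off is that a looser cutoff also admits unrelated names.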
Building first home screen in React
Construct a new table/set of tables to store application data
We've talked about designing a new front-end UI. It would be useful for us to come up with a mock-up for what that should look like and what components it should have.
From conversations with scientists in clinical research pipelines, it would be extremely useful for them to have a way to filter or sort drug claims by type/stage of development (e.g. FDA-approved drugs, drugs in clinical trials, research compounds, natural products).
From results page, clicking individual drug or gene should route to drug or gene record. Design these layouts following design goals laid out in user story exercise.
Due to DGIdb's prevalence in pipelines related to cancer cohorts, it would be relevant to identify new cancer-focused sources for gene curation and drug-gene interactions.
Move static pages over to new repo
Implement autocomplete for genes + drugs
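One simple approach to autocomplete is a prefix search over a sorted list of names. The sketch below is only an illustration: the function names and the combined gene and drug list are hypothetical, and a real implementation would more likely query the database or a search index.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple

Index = List[Tuple[str, str]]


def build_index(names: Sequence[str]) -> Index:
    """Sort (case-folded key, original name) pairs for prefix lookup."""
    return sorted((n.casefold(), n) for n in set(names))


def autocomplete(index: Index, prefix: str, limit: int = 10) -> List[str]:
    """Return up to `limit` names whose case-folded form starts with `prefix`."""
    folded = prefix.strip().casefold()
    if not folded:
        return []
    # Binary search for the first key that is >= the prefix.
    start = bisect_left(index, (folded,))
    results: List[str] = []
    for key, name in index[start:]:
        if not key.startswith(folded):
            break  # keys are sorted, so no later key can match
        results.append(name)
        if len(results) == limit:
            break
    return results
```

Because genes and drugs share one index here, a single query such as "e" can return both a gene symbol and a drug name.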