dgidb / dgidb-v5

Providing interactions between drugs and genes sourced from a variety of publications and knowledgebases

Home Page: https://dgidb.org

License: MIT License

Ruby 61.85% JavaScript 0.01% HTML 0.55% TypeScript 26.28% SCSS 3.97% PLpgSQL 7.09% CSS 0.24% Procfile 0.01%

bioinformatics biomedical-data-science drug-target-interactions genomics

dgidb-v5's Issues

Refactor Apollo stuff to match CIViC

Current query is more of a Hello World demo, not ultimately how we'll organize queries

OncoKB importer: handle breaking API changes

Check and update source citation data

For TALC, the citation differs in three places, but the 'most correct' citation appears to be the one from the website:

Morgensztern D, Campo MJ, Dahlberg SE, Doebele RC, Garon E, Gerber DE, Goldberg SB, Hammerman PS, Heist RS, Hensing T, et al. Molecularly targeted therapies in non-small-cell lung cancer annual update 2014. J Thorac Oncol 2015; 10: S1-63. PMID: 25535693

Some other sources appear to have dead or incorrect links, or old/weird source citation data as well

Replace drug grouper with therapy normalizer

Replace dgidb-v4's drug grouper with the VICC therapy normalizer (https://github.com/cancervariants/therapy-normalization)

Ensembl importer

Figure it out. Potentially replaced by the VICC gene normalizer, so lower priority.

Spin up staging box on AWS

Will remain on WUSTL AWS resources

Add environment-specific hostnames to request URLs in client (#93)
Add github -> s3 deployment pipeline (of some kind) for client
Write CloudFormation templates for Beanstalk and RDS
Add some kind of deployment pipeline for server -> Beanstalk
Add cloudfront to templates

Render sample data on a front-end page

Can be anything (such as a single GeneClaim or Source citation) and doesn't have to be pretty for now.

This is essentially to learn how to link everything together and to show that we can render something on the top-layer front end that's stored in the bottom-layer database.

Drug attributes

Examine overlap/conflicts with therapy normalizer
Double-check current DB structure
Write any needed migrations

Bring over remaining data models for drugs and genes

Similar to what we did for GeneClaims, bring over other data models from old version of DGIdb.

For now as learning exercise and general progress, we can just use the old data. We can refactor these as needed if changes to data structure occur.

Additionally, it would be nice to cover two more things in this issue. First, better per-source deleters, so that you don't have to delete every grouping in order to delete and re-add a single source (unless this work is already done and I didn't copy them over correctly). Second, more optimized interaction grouping.

Import latest version (Y5) of IDG data

Design and implement documentation strategy

As discussed previously, we should figure out a strategy for properly documenting all major functionality as we progress.

Get ID from symbols (or query by symbols)

Currently the gene interaction query relies on IDs, which we need to derive from the symbols entered by the user.
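
One way to derive IDs from user-entered symbols is sketched below. This is only an illustration under stated assumptions: `GENE_IDS_BY_SYMBOL`, `resolve_symbols`, and the placeholder IDs are hypothetical, and an in-memory hash stands in for whatever database lookup the server would actually use.

```ruby
# Hypothetical symbol index; the IDs are made-up placeholders, not real DGIdb IDs.
GENE_IDS_BY_SYMBOL = {
  'BRAF' => 'gene-id-1',
  'EGFR' => 'gene-id-2'
}.freeze

# Normalize user input (trim, upcase, de-duplicate), then split the symbols
# into those found in the index and those that could not be matched.
def resolve_symbols(symbols, index = GENE_IDS_BY_SYMBOL)
  normalized = symbols.map { |s| s.strip.upcase }.reject(&:empty?).uniq
  found, missing = normalized.partition { |s| index.key?(s) }
  { ids: found.map { |s| index[s] }, unmatched: missing }
end

resolved = resolve_symbols([' braf', 'EGFR', 'notagene', 'BRAF'])
```

Returning the unmatched symbols alongside the IDs lets the front end tell the user which entries were not recognized instead of silently dropping them.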

Implement responsive layout for home screen

Add interaction type and gene claim category if not already in DB

Currently, the base Importer class will raise an error if it encounters an interaction type or gene claim category that isn't already in the corresponding tables (see e.g. create_interaction_claim_type(interaction_claim, type), defined at line 92 of dgidb-v5/server/lib/genome/importers/base.rb in commit 2c5d36a).

We should (in separate issues) ensure that the normalization of the values going into those fields has satisfactory results -- but I don't think a normalized value should have to be manually added to any tables, so the constraints above should be removed, and if the value isn't already in the table, the importer should add it.
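
The add-if-missing behavior proposed above could look roughly like the following. This is a minimal sketch, not the importer's actual code: `TypeTable` and its `find_or_create` method are hypothetical stand-ins for the real tables, backed by an in-memory hash so the example runs on its own, and 'inhibitor' is just an example of a value assumed to exist already. In a Rails model, ActiveRecord's `find_or_create_by` plays the same role.

```ruby
# Hypothetical in-memory stand-in for a lookup table such as the
# interaction claim type table; this is not the actual DGIdb model.
class TypeTable
  def initialize(known = [])
    @rows = {}
    known.each { |value| @rows[value] = { id: @rows.size + 1, type: value } }
  end

  # Return the existing row for a value, or add a new row instead of raising.
  def find_or_create(value)
    @rows[value] ||= { id: @rows.size + 1, type: value }
  end

  def size
    @rows.size
  end
end

table = TypeTable.new(['inhibitor'])
existing = table.find_or_create('inhibitor')   # already present, no new row
added = table.find_or_create('immunotherapy')  # missing, so it is added
```

This only makes sense if the value has already been normalized, which is why the normalization checks are split out into separate issues.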

Add readthedocs

https://readthedocs.org/

Prototype results page

Designing layout for results page following design goals laid out in user story exercise.

Entrez importer

Refactor DrugBank importer

The importer currently uses a couple of Python modules. It should be rewritten in Ruby for a cleaner codebase.

Provide public-facing documentation

Backfill existing GraphQL models to provide descriptions within GraphiQL
Construct readthedocs/rdoc/YARD documentation

Process remaining new interaction claim types and gene categories

InteractionClaimType -- normalization defined in interaction claim type model
Clarity Biomarkers: "Biomarker"
Clarity Clinical Trials: "immunostimulator", "natigen", "radioimmunotherapy"
My Cancer Genome: "immunotherapy"

GeneCategory
Hopkins/Groom: "DNA DIRECTED DNA POLYMERASE"

Retrieve NCI data

It's a big XML file: https://wiki.nci.nih.gov/display/cageneindex/The+Cancer+Gene+Index+Gene-Disease+and+Gene-Compound+XML+Documents

Either recover the original TSV file, write something to rebuild the TSV file, or import directly from the source

Clean up gene and drug aliases

In particular, try to identify cases where non-namespaced ID numbers are getting grouped into genes and drugs and fix the importer code accordingly
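
A first pass at spotting the problem cases might simply flag aliases that are bare numbers. The sketch below assumes that "non-namespaced ID number" means an alias made up only of digits; `partition_aliases` is a hypothetical helper and the sample aliases are illustrative, not taken from the database.

```ruby
# Hypothetical helper: separate aliases that are bare numbers (no namespace
# prefix, so they could collide across sources) from everything else.
def partition_aliases(aliases)
  bare, usable = aliases.partition { |name| name.strip.match?(/\A\d+\z/) }
  { bare_ids: bare, usable: usable }
end

result = partition_aliases(['BRAF', '673', 'ncbigene:673', 'HGNC:1097'])
```

Running something like this over each source's aliases would show which importers are emitting bare numbers and therefore need fixing.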

Multiple parameters for interaction queries

Return data from multiple entries in front end search bar

Drug Approval

Evaluate DGIdb current filtering strategy/language against planned approval enum expansion for all sources:

CHEMBL_1
CHEMBL_2
CHEMBL_3
CHEMBL_4
CHEMBL_WITHDRAWN
FDA_DISCONTINUED
FDA_PRESCRIPTION
FDA_OTC
FDA_TENTATIVE
GTOPDB_APPROVED
GTOPDB_WITHDRAWN
HEMONC_APPROVED
RXNORM_PRESCRIBABLE
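
To compare this list against the current filtering language, it may help to group the planned values by source. The sketch below assumes only that each value is a source prefix followed by an underscore and a rating; `APPROVAL_RATINGS` and `ratings_by_source` are hypothetical names, and nothing here reflects how the server actually stores these values.

```ruby
# The planned approval enum values, copied from the list above.
APPROVAL_RATINGS = %w[
  CHEMBL_1 CHEMBL_2 CHEMBL_3 CHEMBL_4 CHEMBL_WITHDRAWN
  FDA_DISCONTINUED FDA_PRESCRIPTION FDA_OTC FDA_TENTATIVE
  GTOPDB_APPROVED GTOPDB_WITHDRAWN
  HEMONC_APPROVED
  RXNORM_PRESCRIBABLE
].freeze

# Split each value on its first underscore and group the ratings by source.
def ratings_by_source(ratings)
  ratings
    .group_by { |rating| rating.split('_', 2).first }
    .transform_values { |values| values.map { |rating| rating.split('_', 2).last } }
end

grouped = ratings_by_source(APPROVAL_RATINGS)
```

Grouping this way makes it easier to see that "approved" means something different for each source, which is the main thing a shared filter would have to reconcile.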

Add importers

Big picture

Consider swapping out any low-hanging TSV updaters for direct source or API imports
Figure out what this is: https://raw.githubusercontent.com/dgidb/dgidb/source_tsvs/source_tsvs/TTD_INTERACTIONS.tsv

Specific sources

Add drug and gene resolvers

Gene Resolver (kori)
Drug Resolver (dorian)

Add online updater for DTC

I think we're currently using a condensed version of the input data because it's super large -- there's work already to pull from an API instead

dgidb/dgidb#450

Add a license

Add drugs@FDA to source list (moving issue from old repo)

Adding drugs@FDA as a new source to incorporate information and labeling relating to FDA approval status for drugs.

Add groupers

Gene
Drug
Interaction

Determine filters for searches

What fields do we want to search on? Fuzzy match vs exact match? Etc

Import CGI data from API

Home screen

Building first home screen in React

Add gene loader

CIViC importer: create_new_source fails

Add FDA application #s to Drugs

Construct a new table/set of tables to store application data

Design a Front-end UI mock-up

We've talked about designing a new front-end UI. It would be useful for us to come up with a mock-up for what that should look like and what components it should have.

JAX-CKB importer: Fix API breaks

Identify way to filter drug claims by type/stage of development

From conversations with scientists working in clinical research pipelines, it would be extremely useful for them to have a way to filter or sort drug claims by type/stage of development (e.g. FDA-approved drugs, drugs in clinical trials, research compounds, natural products).

Write any necessary migrations

Guide to Pharmacology importer: Handle multiple input files

Search page autocomplete

Implement autocomplete for genes + drugs
