
deep_learning_metabolomics's Introduction

Deep Learning Accurately Predicts Estrogen Receptor Status in Breast Cancer Metabolomics Data. Fadhl M. Alakwaa, Kumardeep Chaudhary, and Lana X. Garmire. Journal of Proteome Research 2018, 17 (1), 337-347. DOI: 10.1021/acs.jproteome.7b00595

https://pubs.acs.org/action/showCitFormats?doi=10.1021%2Facs.jproteome.7b00595

Problems and objectives

  1. About 1 in 8 U.S. women (about 12.4%) will develop invasive breast cancer.
  2. In 2018, 266,120 new cases of invasive breast cancer were diagnosed.
  3. Breast cancer has more than one subtype. We are interested in two subtypes, ER+ and ER-.
  4. Diagnosis and further treatment depend on these subtypes.
  5. Classify breast cancer patients based on their metabolomics profiles.
  6. Assess the accuracy of deep learning (DL) in predicting estrogen receptor (ER) status from metabolomics data.

Image of Block Diagram

Preprocessing: this step removes sample-to-sample variation.
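The README does not say which normalization is used, so the following is only a minimal sketch of one common way to reduce sample-to-sample variation: log2-transform each sample, then subtract that sample's median. The function name `normalize_samples` and the intensity values are hypothetical, and the actual pipeline may use a different scheme.

```python
import math
from statistics import median


def normalize_samples(samples):
    """Log2-transform each sample, then subtract its median (an assumed scheme)."""
    normalized = []
    for intensities in samples:
        # Assumes strictly positive intensities; zeros would need separate handling.
        logged = [math.log2(value) for value in intensities]
        center = median(logged)
        normalized.append([value - center for value in logged])
    return normalized


# Hypothetical intensities: sample_b is sample_a measured at twice the overall signal.
sample_a = [100.0, 200.0, 400.0]
sample_b = [200.0, 400.0, 800.0]
norm_a, norm_b = normalize_samples([sample_a, sample_b])
```

After centering, the two samples have the same profile, because a constant scaling factor becomes a constant shift on the log scale and the median subtraction removes it.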

Pretraining: this step improves model performance, avoids random initialization of the weights, and selects the best model architecture.

Image of Block Diagram

Figure 2. (A) Average AUC on 10 hold-out test sets of the DL framework compared with six machine learning algorithms for prediction of ER status from metabolomics data: recursive partitioning and regression trees (RPART) (0.83), linear discriminant analysis (LDA) (0.74), support vector machine (SVM) (0.89), deep learning (DL) (0.93), random forest (RF) (0.89), generalized boosted models (GBM) (0.89), and prediction analysis for microarrays (PAM) (0.88). Each algorithm was run 10 times on different train/test splits. We used a pairwise Wilcoxon signed-rank test to estimate the statistical significance of the difference in performance between DL and the other methods (∗∗ p < 0.01, ∗ p < 0.1). (B) Bipartite graph of the top 20 important metabolites extracted from the DL model and the other machine learning algorithms. Large nodes represent the models and small nodes represent metabolites. A connection between a metabolite and a model means that the metabolite is among the top 20 most important metabolites extracted by that model.
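The comparison in panel (A) can be illustrated with a short standard-library sketch: AUC computed as the fraction of correctly ordered (positive, negative) pairs, and an exact two-sided Wilcoxon signed-rank p-value over paired per-split AUCs. The function names and all AUC values below are hypothetical, chosen only for illustration; they are not the study's per-split results, and the original analysis may have used different software.

```python
from itertools import product


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def signed_rank_p(a, b):
    """Exact two-sided Wilcoxon signed-rank p-value for paired values (small n only)."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # zero differences are dropped
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank for tied |differences|
        i = j + 1
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = min(w_plus, total - w_plus)
    # Enumerate every sign assignment (2**n of them) to get the exact null distribution.
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            hits += 1
    return hits / 2 ** len(ranks)


# Hypothetical AUCs of two models on the same 10 train/test splits.
dl_auc = [0.93, 0.95, 0.91, 0.94, 0.92, 0.96, 0.93, 0.90, 0.94, 0.92]
svm_auc = [0.89, 0.90, 0.88, 0.91, 0.87, 0.92, 0.88, 0.86, 0.90, 0.89]
mean_dl = sum(dl_auc) / len(dl_auc)
p_value = signed_rank_p(dl_auc, svm_auc)
```

Because the first model wins on every split in this toy data, only the two most extreme sign assignments out of 2**10 are as extreme as the observed one, so the exact p-value is 2/1024. The enumeration is practical only for a small number of splits.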

So, what is the best model? "All models are wrong, but some are useful." (George Box)

Image of Block Diagram

Image of Block Diagram1

Figure 3. Biological relevance of the DL hidden layers. (A) Activation levels of the high-variance nodes extracted from layer 1 of the DL model. Columns are samples and rows are the top 12 nodes with variance > 0.1. (B) Bipartite graph of significantly enriched metabolomics pathways and top hidden nodes. The nodes represent enriched pathways common to all top 12 nodes (green color) in the first hidden layer of the DL model, from KEGG pathway enrichment analysis (FDR < 0.05).
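The node selection in panel (A) can be sketched as a variance filter over hidden-layer activations. This is a minimal illustration, assuming rows are nodes, columns are samples, and the sample variance is used (the caption does not say which variance estimator applies). The function name `high_variance_nodes` and the activation values are hypothetical.

```python
from statistics import variance


def high_variance_nodes(activations, threshold=0.1, top_k=12):
    """Return indices of nodes whose activation variance exceeds the threshold,
    ordered from highest to lowest variance and truncated to top_k."""
    scored = [(variance(row), index) for index, row in enumerate(activations)]
    kept = [(var, index) for var, index in scored if var > threshold]
    kept.sort(key=lambda item: item[0], reverse=True)
    return [index for _, index in kept[:top_k]]


# Hypothetical layer-1 activations: 3 nodes (rows) across 4 samples (columns).
layer1 = [
    [0.0, 1.0, 0.0, 1.0],  # strongly varying node
    [0.5, 0.5, 0.5, 0.5],  # constant node, filtered out
    [0.1, 0.9, 0.2, 0.8],  # moderately varying node
]
selected = high_variance_nodes(layer1)
```

Here the constant node is dropped and the remaining two are returned in order of decreasing variance.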

Image of Block Diagram1

Figure 4. Joint pathway analysis between the top 20 DL metabolites and the highly differentiated enzymes. Only significant pathways with at least five overlapping metabolites are shown. The x-axis shows the number of overlapping metabolites, with the number of genes involved in the same pathway in parentheses; the y-axis shows the adjusted joint p-value calculated with the IMPALA tool (42). The size of each node represents the size of the metabolomic pathway (the number of metabolites involved in that pathway). The color of each node represents the database source of the pathway.

image

Figure 5. Circos plot of Spearman’s correlation values between top 20 DL metabolites and highly differentiated enzymes with cutoff = |0.35|.
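The filtering behind a plot like this can be sketched as follows: compute Spearman's correlation for every metabolite-enzyme pair and keep the pairs whose absolute correlation reaches the cutoff. This is a minimal standard-library illustration, assuming the cutoff is inclusive (|rho| >= 0.35). The function names, the metabolite and enzyme labels, and the values are all hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)  # undefined if either input is constant


def correlated_pairs(metabolites, enzymes, cutoff=0.35):
    """Keep (metabolite, enzyme, rho) triples with |rho| >= cutoff."""
    pairs = []
    for m_name, m_values in metabolites.items():
        for e_name, e_values in enzymes.items():
            rho = spearman(m_values, e_values)
            if abs(rho) >= cutoff:
                pairs.append((m_name, e_name, rho))
    return pairs


# Hypothetical measurements across the same five samples.
metabolites = {"metabolite_A": [1.0, 2.0, 3.0, 4.0, 5.0]}
enzymes = {
    "enzyme_X": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with metabolite_A
    "enzyme_Y": [3.0, 1.0, 4.0, 2.0, 3.0],   # only weakly related
}
pairs = correlated_pairs(metabolites, enzymes)
```

Only the pair with `enzyme_X` survives the cutoff in this toy data; each surviving pair would correspond to one link in the plot.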

image 2
