As part of the DRPPM-PATH-SURVEIOR family, this R Shiny app aims to identify clusters of pathways based on Jaccard distance.

DRPPM-Jaccard-Pathway-Connectivity

Introduction

The integration of patient genome expression data, phenotype data, and clinical data can serve as an integral resource for patient prognosis. DRPPM PATH SURVEIOR: Pathway level Survival Examinator serves to do just that, by examining the interaction of pathway analysis with patient expression and clinical data to discover prominent features that take part in patient outcome. This utility comprises 3 R Shiny apps and a pipeline script, which can be employed together in a cohesive manner to provide an in-depth pathway-level analysis of patient survival. Gene set pathways utilized in this workflow include the Molecular Signatures Database (MSigDB), LINCS L1000 Small-Molecule Perturbations, and Clue.io ER Stress signatures, as well as user-provided gene sets.

Here we focus on the Pathway Connectivity portion of this workflow with the DRPPM-Jaccard-Pathway-Connectivity R Shiny App. This app takes a list of gene sets as input and performs a Jaccard distance calculation to determine the proximity of the gene sets to one another. Working in tandem with the DRPPM-PATH-SURVEIOR pipeline, the user may subset a top portion of the gene sets that were output from the comprehensive Cox Proportional Hazard table and use that as input to the Jaccard Connectivity app. This gives the user another perspective on the top gene sets identified and how they cluster together, through visualizations such as heatmaps, dendrograms, and phylogeny-type branched outputs.
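As a minimal sketch of the Jaccard distance calculation described above (not the app's own code), the distance between two gene sets can be taken as one minus the number of shared genes divided by the number of distinct genes across both sets. The gene set names, gene lists, and helper functions below are hypothetical and only illustrate the idea.

```r
# Hypothetical gene sets; names and members are illustrative only.
gene_sets <- list(
  SET_A = c("TP53", "MYC", "EGFR", "KRAS"),
  SET_B = c("TP53", "MYC", "BRCA1"),
  SET_C = c("GAPDH", "ACTB")
)

# Jaccard distance = 1 - (size of intersection / size of union)
jaccard_distance <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  union_size <- length(union(a, b))
  if (union_size == 0) return(0)  # two empty sets are treated as identical
  1 - length(intersect(a, b)) / union_size
}

# Pairwise distance matrix over all gene sets
jaccard_matrix <- function(sets) {
  n <- length(sets)
  m <- matrix(0, n, n, dimnames = list(names(sets), names(sets)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      m[i, j] <- jaccard_distance(sets[[i]], sets[[j]])
    }
  }
  m
}

dist_mat <- jaccard_matrix(gene_sets)
print(round(dist_mat, 2))
```

In this toy example SET_A and SET_B share two of five distinct genes, so their distance is 0.6, while SET_A and SET_C share nothing and sit at the maximum distance of 1.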

An example Jaccard Connectivity R Shiny App is hosted here: http://shawlab.science/shiny/DRPPM_PATH_SURVEIOR_Jaccard_Connectivity_App/ where you are welcome to explore using your own inputs or the example inputs provided in the GitHub repository.

The DRPPM-PATH-SURVEIOR Family

[Figure: the DRPPM-PATH-SURVEIOR family]

Installation

Via Download

  1. Download the Zip File from this GitHub repository: https://github.com/shawlab-moffitt/DRPPM-Jaccard-Pathway-Connectivity
  2. Unzip the downloaded file into the folder of your choice.
  3. Set your working directory in R to the local version of the repository.
    • This can be done through the "More" settings in the bottom-right box in R Studio.
    • You may also use the setwd() function in the R Console.

Via Git Clone

  1. Clone the GitHub Repository into the destination of your choice.
    • This can be done in the R Studio Terminal or a terminal of your choice:
      git clone https://github.com/shawlab-moffitt/DRPPM-Jaccard-Pathway-Connectivity.git
  2. Set your working directory in R to the cloned repository.
    • This can be done through the "More" settings in the bottom-right box in R Studio.
    • You may also use the setwd() function in the R Console.

Requirements

R Dependencies

shiny_1.7.1 shinythemes_1.2.0 shinyjqui_0.4.1 shinycssloaders_1.0.0
DT_0.23 pheatmap_1.0.12 readr_2.1.2 dplyr_1.0.9
plotly_4.10.0 clusterProfiler_4.0.5 ggdendro_0.1.23 factoextra_1.0.7
reshape2_1.4.4 stringr_1.4.0 viridis_0.6.2 RColorBrewer_1.1-3
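One way to check which of these dependencies are already available is sketched below. It only reports missing packages and leaves the install calls commented out. Note that install.packages() and BiocManager::install() fetch current releases, not necessarily the exact versions listed above; clusterProfiler is distributed through Bioconductor, the rest through CRAN.

```r
# Packages listed in the R Dependencies section
cran_pkgs <- c("shiny", "shinythemes", "shinyjqui", "shinycssloaders",
               "DT", "pheatmap", "readr", "dplyr", "plotly", "ggdendro",
               "factoextra", "reshape2", "stringr", "viridis", "RColorBrewer")
bioc_pkgs <- c("clusterProfiler")

# TRUE if the package can be loaded from the current library paths
is_installed <- function(pkg) requireNamespace(pkg, quietly = TRUE)

missing_cran <- cran_pkgs[!vapply(cran_pkgs, is_installed, logical(1))]
missing_bioc <- bioc_pkgs[!vapply(bioc_pkgs, is_installed, logical(1))]

if (length(missing_cran) > 0) {
  message("Missing CRAN packages: ", paste(missing_cran, collapse = ", "))
  # install.packages(missing_cran)
}
if (length(missing_bioc) > 0) {
  message("Missing Bioconductor packages: ", paste(missing_bioc, collapse = ", "))
  # if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
  # BiocManager::install(missing_bioc)
}
```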

Required Files

  • Comprehensive Gene Set File (Provided):

    • This is the provided file Comprehensive_GeneSet.RData.
    • It is used in the back end and provides the genes for the gene sets that are input.
    • The gene set names should match the ones provided when running the DRPPM-PATH-SURVEIOR-Pipeline.
      • If you ran the pipeline with a user-provided gene set, the genes for those gene sets are unlikely to be found, so distances between those gene sets cannot be compared.
  • User Provided List of Gene Sets (.txt/.tsv):

    • The only requirements for the file are that it is tab delimited and that the first column contains the gene set names.
      • The app will only use the first column and ignore the other columns.
    • This can be the ranked Cox proportional hazards (CoxH) file that is output from the DRPPM-PATH-SURVEIOR-Pipeline.
      • The user can subset this file for the top number of gene sets in the app.
    • Example input files are provided here. These files were output from the example run of the DRPPM-PATH-SURVEIOR-Pipeline with the 50 MSigDB Hallmark gene sets, with and without the use of the "Responder" covariate.
  • Gene Annotation File (Optional):

    • A tab delimited file with gene symbols as the first column followed by annotation columns
    • It is recommended to use the output from the raw gene expression run of the DRPPM-PATH-SURVEIOR pipeline
    • The "Gene Clusters and Annotation" tab contains a table that the user can annotate with this file.
    • This table starts with three columns (gene set name, cluster, and gene), with the gene set name and cluster repeated for each gene within the gene set.
      • Uploading the annotation file merges the two tables by gene symbol.
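The gene set list requirement above (tab delimited, gene set names in the first column, optional subsetting to the top rows) can be sketched outside the app as follows. The file contents, column names, values, and the read_gene_set_names() helper are hypothetical; only the first column is used, mirroring what the app is described as doing.

```r
# Write a small hypothetical tab-delimited file to a temporary location.
input_path <- tempfile(fileext = ".txt")
writeLines(c(
  "GeneSet\tHazard_Ratio\tP_Value",
  "HALLMARK_HYPOXIA\t1.80\t0.001",
  "HALLMARK_APOPTOSIS\t1.50\t0.010",
  "HALLMARK_GLYCOLYSIS\t1.30\t0.040"
), input_path)

# Read the file, keep only the first column, and optionally take the top rows.
read_gene_set_names <- function(path, header = TRUE, top_n = NULL) {
  tab <- read.delim(path, header = header, stringsAsFactors = FALSE)
  gene_set_names <- tab[[1]]
  if (!is.null(top_n)) {
    gene_set_names <- head(gene_set_names, top_n)
  }
  gene_set_names
}

top_sets <- read_gene_set_names(input_path, header = TRUE, top_n = 2)
print(top_sets)
```

Setting header = FALSE corresponds to the case where the file has no header row, in which case the first line is treated as a gene set name.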

App Set-Up

  • It is important to ensure that the comprehensive gene set file that is provided is in the proper location for the app to locate it when running.
  • To run the app:
    • The user can select the "Run App" button at the top right of the script in R Studio
    • Or the user can use the runApp() function in the R Console
  • When the app is running the user can select to input a file in the user interface and proceed with analysis

App Features

Sidebar Panel

Pathway and Clustering Parameters

  1. The user may upload their pathways of interest here
    • Please select if the file has a header or not
    • The input file is described under Required Files
  2. The user can select the top number of gene sets to view from the input file
    • This takes the top number of rows from the file to perform the Jaccard Connectivity on
    • While the Jaccard calculation should not take long, the larger the subset the more time the Jaccard Connectivity will take to process
  3. The user has the ability to choose which clustering method they want to use from the hclust() function, as well as the number of clusters they want to form with their data
    • The cluster table can be viewed in the app, as well as downloaded
  4. A distance cutoff can be input to generate a SIF file
    • All gene set pairs below the designated cutoff will be included in the file
    • This file can be previewed in the app as well as downloaded
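The clustering and SIF steps above can be sketched in base R as follows. This is an illustration under assumptions, not the app's implementation: the distance matrix is made up, the "complete" method and k = 2 stand in for the user's choices, and the "jaccard" interaction label and column layout of the SIF output are hypothetical.

```r
# Hypothetical Jaccard distance matrix for four gene sets
set_names <- c("SET_A", "SET_B", "SET_C", "SET_D")
dist_mat <- matrix(
  c(0.00, 0.20, 0.90, 0.80,
    0.20, 0.00, 0.85, 0.90,
    0.90, 0.85, 0.00, 0.30,
    0.80, 0.90, 0.30, 0.00),
  nrow = 4, byrow = TRUE, dimnames = list(set_names, set_names)
)

# Hierarchical clustering with a chosen method, cut into k clusters
hc <- hclust(as.dist(dist_mat), method = "complete")
clusters <- cutree(hc, k = 2)
cluster_table <- data.frame(GeneSet = names(clusters),
                            Cluster = unname(clusters),
                            stringsAsFactors = FALSE)

# Keep each pair once (upper triangle) when its distance is below the cutoff
cutoff <- 0.5
pairs <- which(upper.tri(dist_mat) & dist_mat < cutoff, arr.ind = TRUE)
sif <- data.frame(
  source = rownames(dist_mat)[pairs[, "row"]],
  interaction = rep("jaccard", nrow(pairs)),  # hypothetical interaction label
  target = colnames(dist_mat)[pairs[, "col"]],
  stringsAsFactors = FALSE
)

# SIF files are plain tab-delimited text without a header row
sif_path <- tempfile(fileext = ".sif")
write.table(sif, sif_path, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

print(cluster_table)
print(sif)
```

With these toy distances, SET_A and SET_B fall into one cluster and SET_C and SET_D into the other, and only those two pairs pass the 0.5 cutoff.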

Figure Parameters

  1. Figure parameters for the heatmap may be adjusted, such as color palette, column and row names, and dendrogram height.
  2. The connectivity visualization can be customized to be viewed as a rectangular or circular dendrogram or a phylogeny figure.
    • If phylogeny is chosen, a variety of options for viewing the phylogeny figure are provided.

Main Panel

Jaccard Pathway Connectivity Table

  1. When a file is uploaded to the app, after a few moments a Jaccard Connectivity table will appear showing the Jaccard distance (from 0 to 1) between gene sets
    • The smaller the number, the more similar the gene sets
  2. The table can be downloaded for further use

Connectivity Heatmap

  1. The heatmap gives a global picture of similarity between gene sets

Clustering

  1. Clustering can be shown as a phylogenetic object, with or without names displayed. The names can also be displayed by hovering over the points
    • The visualization is a plotly object, so the user may zoom in to interact with the plot
  2. A dendrogram is another form of visualization available to see the clustering. This is also made with plotly, so the user may zoom in to examine the branches
  3. The clusters can also be viewed as a circular dendrogram. This is not a plotly object and cannot be interacted with.

Clustering Annotation

  • A data frame is displayed on the last tab with the gene set, cluster, and gene in each row. This allows users to see which genes are in each gene set and cluster.
  1. A table provided by the user can be uploaded to annotate the genes. The uploaded table must list the gene symbol in the first column; the remaining columns can be any annotation the user chooses.
    • It is recommended to use the raw gene expression ranking output from the DRPPM-PATH-SURVEIOR Pipeline.
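The merge by gene symbol described above can be illustrated with base R's merge(). The tables, column names, and values below are hypothetical, and keeping genes that have no annotation (all.x = TRUE) is an assumption about the desired behavior rather than a confirmed detail of the app.

```r
# Hypothetical cluster table: one row per gene within each gene set
cluster_genes <- data.frame(
  GeneSet = c("SET_A", "SET_A", "SET_B"),
  Cluster = c(1, 1, 2),
  Gene    = c("TP53", "MYC", "GAPDH"),
  stringsAsFactors = FALSE
)

# Hypothetical annotation table: gene symbol first, annotation columns after
annotation <- data.frame(
  Symbol       = c("TP53", "GAPDH"),
  Hazard_Ratio = c(1.7, 0.9),
  stringsAsFactors = FALSE
)

# Join on gene symbol, using whatever the first annotation column is called;
# all.x = TRUE keeps genes without a matching annotation row (filled with NA)
annotated <- merge(cluster_genes, annotation,
                   by.x = "Gene", by.y = names(annotation)[1],
                   all.x = TRUE)
print(annotated)
```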

Questions and Comments

Please email Alyssa Obermayer at [email protected] if you have any further comments or questions.
