A docker image flavor extended from the Galaxy docker image to include the CRAVAT-P tool and visualization plugin. https://jraysajulga.github.io/cravatp-galaxy-docker.

CRAVAT-P Galaxy Docker

A Docker image containing a fully-operational Galaxy instance with pre-installed demonstration material for CRAVAT-P.

[Figure: main screen]

Created as a demonstration for the following technical note for the Journal of Proteome Research:

Bridging the Chromosome-Centric and Biology and Disease Human Proteome Projects: Accessible and automated tools for interpreting biological and pathological impact of protein sequence variants detected via proteogenomics

Ray Sajulga, Subina Mehta, Praveen Kumar, James E. Johnson, Candace R. Guerrero, Michael C. Ryan, Rachel Karchin, Pratik D. Jagtap, and Timothy J. Griffin

Installation Guide

1.) Install Docker for Mac or PC. Open Docker.

2.) Open your terminal. Run the following command:

docker run -d -p 8080:80 galaxyp/cravatp

The image will now download from the public repository galaxyp/cravatp on Docker Hub. The download should take around 15 minutes.

In the meantime, feel free to take some time to understand the different components of this Docker command. You can also read up on CRAVAT-P background information in the next section.

Component        Type          Description
docker           Base command  The base command for the Docker CLI (Command Line Interface)
run              Command       Run a command in a new container
-d, --detach     OPTION        Run container in background and print container ID
-p, --publish    OPTION        Publish a container's port(s) to the host (here, host port 8080 maps to container port 80)
galaxyp/cravatp  IMAGE         galaxyp's cravatp image

More documentation can be found at Docker's documentation website.

3.) Once the command is finished, wait a few moments for the Docker image to initialize as a container. Open http://localhost:8080 and follow the CRAVAT-P tutorial to access the CRAVAT-P suite. If you do not see the Galaxy screen, wait a few seconds and then reload the page.
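The "wait a few seconds and then reload" advice in step 3 can also be automated. The following is a minimal sketch, not part of the image: it polls the address from step 3 until it answers. The function names, the number of attempts, and the delay between attempts are illustrative assumptions.

```python
import http.client
import time
import urllib.error
import urllib.request


def galaxy_is_up(url, timeout=5.0):
    """Return True if an HTTP GET on `url` ends with a 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, malformed reply, or HTTP error status:
        # treat all of these as "not ready yet".
        return False


def wait_for_galaxy(url="http://localhost:8080", attempts=30, delay=10.0,
                    probe=galaxy_is_up, sleep=time.sleep):
    """Call `probe(url)` up to `attempts` times, sleeping `delay` seconds
    between tries. Return True as soon as a probe succeeds, else False."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Example (run only once the container has been started):
# print("Galaxy is ready" if wait_for_galaxy() else "Galaxy did not respond")
```

The `probe` and `sleep` parameters exist only so the loop can be exercised without a running container.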

Once you are finished using this container, you can clean up your workspace by simply exiting out of Docker.


Background

CRAVAT-P

(Cancer Related Analysis of VAriants Toolkit - Proteomics)

CRAVAT-P is a proteomic extension of CRAVAT (http://cravat.us) developed for the Galaxy-P (http://galaxyp.org) bioinformatics platform. CRAVAT-P exists as a downstream analysis suite for peptide variants. Current support is tailored towards workflows that generate peptide sequences mapped to genomic locations.


Galaxy Tool

[Figure: Galaxy tool]

The figure above shows the Galaxy tool developed for submitting jobs to the CRAVAT server. It extends an earlier version of In Silico Solutions' Galaxy tool (cravat_score_and_annotate). In our CRAVAT-P tool, we added support for additional parameters: CHASM classifiers (e.g., breast, brain-glioblastoma-multiforme) and the older GRCh37/hg19 human genome build. We also added proteomic support, as highlighted by the red outlined box. Here, a proBED file can be provided for intersection with the genomic input file in VCF (Variant Call Format). You can specify whether to output the intersected VCF file or to submit only the intersected variants.

Example input files

VCF (Variant Call Format)

ID      Chr.   Position   Strand  Ref. base  Alt. base
VAR527  chr12  6561055    +       T          C
VAR529  chr12  110339630  +       C          T
VAR532  chr14  102083954  +       C          T
VAR539  chr19  17205335   +       A          T
VAR541  chr19  17205973   +       T          C
VAR542  chr19  18856059   +       C          T

ProBED (Proteomic Browser Extensible Data)

Chr.   Start      End        Peptide                   Strand
chr12  6561014    6561056    STGVILANDANAER            -
chr12  110339607  110339637  EWGSGSDILR                +
chr14  102083930  102083972  GVVDSENLPLNISR            -
chr19  17205327   17206022   GRMGEPGAEPGHFGVCVDSLTSDK  +
chr19  18856027   18856078   EAIDSPVSFLVLHNQIR         +
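
The intersection of variants with proBED peptides described in the Galaxy Tool section can be illustrated with the example rows above. This is a rough sketch and not the tool's actual implementation. It assumes that variant positions are 1-based, that proBED start/end follow the usual BED convention (0-based start, exclusive end), and that a peptide's whole span counts as covered. That last assumption ignores proBED block (exon) structure, which matters for spans such as chr19 17205327-17206022, which is far longer than a 24-residue peptide needs. The names `Variant`, `Peptide`, `overlaps`, and `intersect` are hypothetical.

```python
from collections import namedtuple

Variant = namedtuple("Variant", "id chrom pos strand ref alt")
Peptide = namedtuple("Peptide", "chrom start end sequence strand")

# Rows copied from the example input tables.
VARIANTS = [
    Variant("VAR527", "chr12", 6561055, "+", "T", "C"),
    Variant("VAR529", "chr12", 110339630, "+", "C", "T"),
    Variant("VAR532", "chr14", 102083954, "+", "C", "T"),
    Variant("VAR539", "chr19", 17205335, "+", "A", "T"),
    Variant("VAR541", "chr19", 17205973, "+", "T", "C"),
    Variant("VAR542", "chr19", 18856059, "+", "C", "T"),
]
PEPTIDES = [
    Peptide("chr12", 6561014, 6561056, "STGVILANDANAER", "-"),
    Peptide("chr12", 110339607, 110339637, "EWGSGSDILR", "+"),
    Peptide("chr14", 102083930, 102083972, "GVVDSENLPLNISR", "-"),
    Peptide("chr19", 17205327, 17206022, "GRMGEPGAEPGHFGVCVDSLTSDK", "+"),
    Peptide("chr19", 18856027, 18856078, "EAIDSPVSFLVLHNQIR", "+"),
]


def overlaps(variant, peptide):
    """True if the 1-based variant position lies in the 0-based,
    half-open interval [start, end) of the peptide (assumed convention)."""
    return (variant.chrom == peptide.chrom
            and peptide.start < variant.pos <= peptide.end)


def intersect(variants, peptides):
    """Return (variant, peptide) pairs where the variant falls in the span."""
    return [(v, p) for v in variants for p in peptides if overlaps(v, p)]


for variant, peptide in intersect(VARIANTS, PEPTIDES):
    print(variant.id, variant.chrom, variant.pos, peptide.sequence)
```

Under these assumptions, each of the six example variants falls inside one listed peptide span, so the intersected set equals the input set.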

Galaxy Workflow

[Figure: Galaxy workflows]

Galaxy workflows are tailored pipelines that promote reproducibility, ease of use, and preservation of complex analyses. Two workflows of differing complexity are shown above. The simple workflow (top left panel) was used for the paper and Docker image to keep the focus on the downstream analysis, i.e., CRAVAT-P's outputs and viewer. A full-fledged workflow (bottom panel) is shown as an example of a highly complex workflow. The top right panel shows how workflows can automate parameter selection and offer additional options such as e-mail notification and output cleanup.


Galaxy Viewer Plugin

Galaxy uses JavaScript-based visualization plugins to interactively explore your data.

Panel A shows the actual viewer, with panels B - E as blown-up images for further detail.

(A-i) Sidebar for showing additional information, mainly column visibility toggling. CRAVAT's annotation produces many columns to sift through.

(A-ii) An embedded webpage from the CRAVAT server, which its developers call the "Single Variants Page" feature.

(B) Leveraging the DataTable.js library, this table can be sorted and filtered. By default, it is sorted by p-values (based on the machine learning analysis i.e., VEST or CHASM) from most impactful to least. The selected box exhibits a peptide column that highlights the variant amino acid within a peptide hit. Since some cells may have large amounts of text, the full datum is shown in the display box at the top.

[Figure: CRAVAT viewer panels]

(C) CRAVAT uses Protein Diagrams to show lollipop mutations from your given protein variant. You can also choose TCGA (The Cancer Genome Atlas) tissue mutations. You can mouse over different parts to show domains, binding sites, and other regions of interest.

(D) CRAVAT uses the cytoscape.js library to display gene enrichment networks housed by the NDEx (Network Data Exchange) infrastructure. You can move elements around and examine different pathways.

(E) CRAVAT uses another project developed by the same lab (Professor Rachel Karchin's lab at Johns Hopkins University) called MuPIT (Mutation Position Imaging Toolbox), which is designed to show the location of single nucleotide variants (SNVs) on interactive three-dimensional protein structures. You can click on individual residues and adjust the display options.
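
Panel B's peptide column highlights the variant amino acid within a peptide hit. How the viewer computes this is not described here, so the following is only an illustrative sketch of one way to locate that residue from a proBED row. It assumes 1-based variant positions, 0-based half-open proBED coordinates, and a contiguous (unspliced) peptide whose span is exactly three bases per residue. The function names and the bracket notation are hypothetical.

```python
def variant_residue_index(pos, start, end, strand, peptide):
    """Return the 0-based index of the residue whose codon contains `pos`.

    Only handles contiguous peptides: the span [start, end) must be exactly
    3 bases per residue. Spliced peptides would need block-aware mapping.
    """
    if end - start != 3 * len(peptide):
        raise ValueError("span does not match peptide length (spliced peptide?)")
    pos0 = pos - 1  # convert the 1-based position to 0-based
    if not start <= pos0 < end:
        raise ValueError("variant lies outside the peptide span")
    # On the minus strand the first residue sits at the highest coordinates.
    offset = pos0 - start if strand == "+" else (end - 1) - pos0
    return offset // 3


def highlight(peptide, index):
    """Mark the residue at `index` with square brackets."""
    return peptide[:index] + "[" + peptide[index] + "]" + peptide[index + 1:]


# VAR529 (chr12:110339630) against the plus-strand peptide EWGSGSDILR
idx = variant_residue_index(110339630, 110339607, 110339637, "+", "EWGSGSDILR")
print(highlight("EWGSGSDILR", idx))  # prints EWGSGSD[I]LR
```

The length check deliberately rejects rows such as chr19 17205327-17206022, whose span is much longer than its peptide requires.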


CRAVAT-P Tutorial

Overview

Import the input files → Run the workflow → Access the viewer

1.) Import the input files from the data library

[Figure: step 1]

  • click Shared Data > Data Libraries
  • open Training Data > Input files for CRAVAT-P Demo
  • check the checkbox in the header to select both input files
  • click to History
  • optional: name your new history (e.g., mcf7_cancer_proteogenomics)
  • click import
  • click on the green pop-up window to go back to the homepage to analyze these datasets.

2.) Log in and run the workflow

[Figure: step 2]

  • The CRAVAT-P workflow was placed into an administrative account through Docker. To access it, click Login or Register > Login and log in using the following credentials:
  • click Workflow to show the list of workflows in this account. In this case, the only one is the CRAVAT Workflow.
  • click on the CRAVAT Workflow button and click Run from the resulting dropdown
  • click Run workflow. The analysis will start and will finish in a couple of minutes. This workflow was set to include proteogenomic input and automatically select the correct input file types (VCF and proBED) in the history.

3.) Access the viewer

[Figure: step 3]

  • Once the VCF output turns green (signifying completion), you can access the visualizer. Open the dataset collection, and click on any of the four datasets to expand it. The variant dataset is preferred, since it typically contains the most useful information. In the viewer, you will be able to access all the datasets anyway.
  • Click the "visualize" icon and select CRAVAT Viewer.
