Palimpsest

Cancer genomes are altered by various mutational processes and, like palimpsests, bear the signatures of these successive processes. The Palimpsest R package provides a complete workflow for the characterization and visualization of mutational signatures and their evolution along tumor development. The package covers a wide range of functions for extracting both base substitution and structural variant signatures, inferring the clonality of each alteration and analyzing the evolution of mutational processes between early clonal and late subclonal events. Palimpsest also estimates the probability of each mutation being due to each process to predict the mechanisms at the origin of driver events. Palimpsest is an easy-to-use toolset for reconstructing the natural history of a tumor using whole exome or whole genome sequencing data.

Installation

Install from the GitHub repository using devtools:

install.packages("devtools")
library(devtools)
devtools::install_github("FunGeST/Palimpsest")

Dependencies

The R package "bedr" is required to perform structural variant signature analysis. The bedr API gives access to BEDTools and offers additional utilities for genomic region processing. To use bedr, the BEDTools program must be installed and available in your default PATH.
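
Before running a structural variant analysis, it can help to confirm that both pieces are in place. The sketch below uses only base R (Sys.which and requireNamespace); the variable names are illustrative, and it assumes the BEDTools executable is named "bedtools" on your system.

```r
# Sys.which() returns an empty string when the executable is not on the PATH.
bedtools_path <- unname(Sys.which("bedtools"))
bedtools_found <- nzchar(bedtools_path)

# requireNamespace() returns FALSE (rather than raising an error)
# when the package is not installed.
bedr_installed <- requireNamespace("bedr", quietly = TRUE)

if (!bedtools_found) {
  message("bedtools was not found on the PATH; install BEDTools or adjust PATH.")
}
if (!bedr_installed) {
  message("The bedr package is not installed; it is needed for SV signature analysis.")
}
```

If both flags are TRUE, the dependencies described above should be visible to the current R session.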

Input files

Palimpsest requires three mandatory input files: a mutational catalogue file (mut_data) describing somatic mutations in the tumor series, a copy number alteration file (cna_data) providing genome-wide absolute copy number estimates, and a minimal sample annotation file (annot_data) indicating gender and tumor purity.

The input files should have the following columns (the header is required, but the order of the columns can change). Example input files are provided with the package.

1]. mut_data: somatic mutation data

  • Sample: Sample identifier. Any alphanumeric string.
  • Type: Mutation type [SNV/Indel].
  • CHROM: Chromosome. One of chr1 to chr22, chrX or chrY ('chr' prefix required).
  • POS: Mutation position. A positive integer.
  • REF: Reference base(s): Each base must be one of A,C,G,T (upper case). Multiple bases are permitted. The value in the POS field refers to the position of the first base in the string.
  • ALT: Alternate base(s): Each base must be one of A,C,G,T (upper case). Multiple bases are permitted. The value in the POS field refers to the position of the first base in the string.
  • Tumor_Varcount: Number of variant bases at the position in the tumor sample.
  • Tumor_Depth: Tumor sample sequencing depth at the position.
  • Normal_Depth: Normal sample sequencing depth at the position.
  • Gene_Name: OPTIONAL column for representing mutated gene name.
  • Driver: OPTIONAL column indicating the driver events to be annotated in tumor history plots.
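
As a rough illustration of the layout above, the following sketch builds a two-row mut_data table and checks it against the listed constraints. All values are invented for illustration, and the check itself is not part of Palimpsest; it is just one way to catch formatting problems before running the package.

```r
# Toy mutation table; every value here is made up for illustration.
mut_data <- data.frame(
  Sample = c("T1", "T1"),
  Type = c("SNV", "Indel"),
  CHROM = c("chr1", "chrX"),
  POS = c(1234567L, 7654321L),
  REF = c("C", "AT"),
  ALT = c("T", "A"),
  Tumor_Varcount = c(12L, 8L),
  Tumor_Depth = c(40L, 35L),
  Normal_Depth = c(38L, 30L),
  stringsAsFactors = FALSE
)

# Mandatory columns as listed above (Gene_Name and Driver are optional).
required_cols <- c("Sample", "Type", "CHROM", "POS", "REF", "ALT",
                   "Tumor_Varcount", "Tumor_Depth", "Normal_Depth")
allowed_chroms <- paste0("chr", c(1:22, "X", "Y"))

missing_cols <- setdiff(required_cols, colnames(mut_data))

# Rows that break any of the documented constraints.
bad_rows <- which(
  !(mut_data$Type %in% c("SNV", "Indel")) |
    !(mut_data$CHROM %in% allowed_chroms) |
    mut_data$POS <= 0 |
    !grepl("^[ACGT]+$", mut_data$REF) |
    !grepl("^[ACGT]+$", mut_data$ALT)
)

stopifnot(length(missing_cols) == 0, length(bad_rows) == 0)
```

Because column order does not matter, the check compares column names as a set rather than by position.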

2]. cna_data: copy number alteration data

  • Sample: Sample identifier. Any alphanumeric string.
  • CHROM: Chromosome. One of chr1 to chr22, chrX or chrY ('chr' prefix required).
  • POS_START: Start position of segmented chromosome.
  • POS_END: End position of segmented chromosome.
  • LogR: LogR information.
  • Nmin: Minor allele copy number.
  • Nmaj: Major allele copy number.
  • ntot: Total copy number of segmented chromosome.
  • Ploidy: Tumor ploidy.
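
The copy number columns are related to one another, so a few consistency checks can catch segmentation errors early. The sketch below assumes that ntot is the sum of Nmin and Nmaj and that the minor allele count never exceeds the major one; these are common conventions, not rules stated by Palimpsest, and the values are invented.

```r
# Toy segment table; every value here is made up for illustration.
cna_data <- data.frame(
  Sample = c("T1", "T1"),
  CHROM = c("chr1", "chr1"),
  POS_START = c(1L, 50000001L),
  POS_END = c(50000000L, 100000000L),
  LogR = c(0.02, 0.45),
  Nmin = c(1L, 1L),
  Nmaj = c(1L, 2L),
  ntot = c(2L, 3L),
  Ploidy = c(2.3, 2.3),
  stringsAsFactors = FALSE
)

# Assumption: total copy number is the sum of the two allele-specific counts.
inconsistent_total <- which(cna_data$Nmin + cna_data$Nmaj != cna_data$ntot)

# Assumption: the minor allele count is never larger than the major one.
minor_exceeds_major <- which(cna_data$Nmin > cna_data$Nmaj)

# Each segment should start at or before its end position.
bad_coords <- which(cna_data$POS_START > cna_data$POS_END)

stopifnot(length(inconsistent_total) == 0,
          length(minor_exceeds_major) == 0,
          length(bad_coords) == 0)
```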

3]. annot_data: sample annotation data

  • Sample: Sample identifier. Any alphanumeric string.
  • Gender: Gender information for patient [M/F].
  • Purity: Tumor purity estimate, expressed as a fraction between 0.01 and 1.
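
A short sketch of an annotation table and the checks implied by the column descriptions. The sample identifiers and purity values are invented, and mut_samples is a hypothetical stand-in for the sample identifiers found in the mutation file.

```r
# Toy annotation table; every value here is made up for illustration.
annot_data <- data.frame(
  Sample = c("T1", "T2"),
  Gender = c("F", "M"),
  Purity = c(0.65, 0.80),
  stringsAsFactors = FALSE
)

# Hypothetical sample identifiers, as they might appear in the mutation file.
mut_samples <- c("T1", "T2", "T3")

# Samples whose gender code is not one of the two accepted values.
bad_gender <- annot_data$Sample[!(annot_data$Gender %in% c("M", "F"))]

# Samples whose purity is missing or outside the 0.01 to 1 range
# (for example, a percentage such as 65 instead of the fraction 0.65).
bad_purity <- annot_data$Sample[
  is.na(annot_data$Purity) | annot_data$Purity < 0.01 | annot_data$Purity > 1
]

# Samples with mutations but no annotation row.
unannotated <- setdiff(mut_samples, annot_data$Sample)
```

In this toy case, unannotated contains "T3", which signals that an annotation row would need to be added for that sample.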

Optional:

Structural variant signature analysis

4]. sv_data: structural variant data

  • Sample: Sample identifier. Any alphanumeric string.
  • Type: Type of structural variant: INV/DEL/DUP/BND.
  • CHROM_1: Chromosome of the first breakpoint. One of chr1 to chr22, chrX or chrY ('chr' prefix required).
  • POS_1: Position of the first breakpoint. A positive integer.
  • CHROM_2: Chromosome of the second breakpoint. One of chr1 to chr22, chrX or chrY ('chr' prefix required).
  • POS_2: Position of the second breakpoint. A positive integer.
  • Tumor_Varcount: Column for variant allele count information.
  • Tumor_Depth: Column for tumor sequencing depth information.
  • Normal_Depth: Column for normal sequencing depth information.
  • Driver: OPTIONAL column indicating the driver events to be annotated in tumor history plots.
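
The structural variant table differs from the mutation table mainly in carrying two breakpoints per row. The sketch below, using invented values, shows one way to check the type codes and both chromosome columns, and to count events per type; it is an illustration, not a Palimpsest function.

```r
# Toy structural variant table; every value here is made up for illustration.
sv_data <- data.frame(
  Sample = c("T1", "T1", "T1"),
  Type = c("DEL", "BND", "DEL"),
  CHROM_1 = c("chr3", "chr8", "chr17"),
  POS_1 = c(1000000L, 2500000L, 7000000L),
  CHROM_2 = c("chr3", "chr14", "chr17"),
  POS_2 = c(1050000L, 9000000L, 7200000L),
  Tumor_Varcount = c(10L, 6L, 15L),
  Tumor_Depth = c(45L, 38L, 50L),
  Normal_Depth = c(40L, 36L, 42L),
  stringsAsFactors = FALSE
)

allowed_types <- c("INV", "DEL", "DUP", "BND")
allowed_chroms <- paste0("chr", c(1:22, "X", "Y"))

# Rows with an unknown type code or a chromosome outside the accepted set
# at either breakpoint.
bad_sv_rows <- which(
  !(sv_data$Type %in% allowed_types) |
    !(sv_data$CHROM_1 %in% allowed_chroms) |
    !(sv_data$CHROM_2 %in% allowed_chroms) |
    sv_data$POS_1 <= 0 |
    sv_data$POS_2 <= 0
)

# Number of events per structural variant type, including types with no events.
sv_counts <- table(factor(sv_data$Type, levels = allowed_types))

stopifnot(length(bad_sv_rows) == 0)
```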

Running Palimpsest

Reference

Shinde, J. et al. (2018) Palimpsest: an R package for studying mutational and structural variant signatures along clonal evolution in cancer. Bioinformatics.

Fig. (A) Workflow illustrating a typical analysis with Palimpsest. Taking as input somatic mutations, copy-number alterations (CNAs) and structural variants, the package classifies variants as clonal and subclonal, extracts mutational and structural variant signatures separately in early clonal and late subclonal events, and estimates the probability of each alteration being due to each process. The timing of chromosome duplications is also estimated from the ratio of duplicated/non-duplicated mutations to reconstruct the complete natural history of the tumor. (B) Example of output representing, for one tumor, the number of clonal and subclonal mutations, their distribution per mutation signature, the driver alterations (colored according to the most likely causal mutational process) and CNA timing.

License

Copyright (C) 2017 Jayendra Shinde

Palimpsest is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
