tumor-loh-app-dev's Introduction

Kids First Loss of Heterozygosity (LOH)

Kids First Loss of Heterozygosity (LOH) Preprocessing is a CWL workflow that assesses loss of heterozygosity in the tumor for rare germline calls. A germline call is treated as rare when its gnomad_3_1_1_AF_popmax is below a cutoff (typically < 0.01) or when gnomad_3_1_1_AF_popmax is not defined. The workflow computes variant allele frequency (VAF) for multiple proband tumor samples and can also map germline VAF for family trios if a trio germline VCF file is provided.

Basic info

Application Description

The Kids First Loss of Heterozygosity application is divided into two tools: a Germline tool and a Tumor tool.

Germline Tool

The Germline tool filters germline annotations, retaining variants whose gnomad_3_1_1_AF_popmax is below the cutoff (typically < 0.01) or is not defined. It takes a VCF file, the proband sample ID, and a RAM setting as inputs; a peddy file is optional in general but required for family trios. It outputs variant information (gene, chr, start, stop, ref/alt alleles, ref/alt allele depths, and variant allele frequency) together with a list of coordinates that serves as input to the Tumor tool.
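The retention rule described above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual code: the function names, the record layout, and the treatment of "" and "." as "not defined" are assumptions made for the example.

```python
from typing import Optional


def parse_popmax(raw: str) -> Optional[float]:
    """Convert an annotation string to a float.

    Assumption: an empty string or "." means the annotation is not defined.
    """
    raw = raw.strip()
    if raw in ("", "."):
        return None
    return float(raw)


def is_rare(popmax_af: Optional[float], cutoff: float = 0.01) -> bool:
    """Keep a variant when popmax AF is not defined or is below the cutoff."""
    return popmax_af is None or popmax_af < cutoff


# Hypothetical records; gene names and values are made up for illustration.
variants = [
    {"gene": "GENE_A", "gnomad_3_1_1_AF_popmax": "0.0004"},
    {"gene": "GENE_B", "gnomad_3_1_1_AF_popmax": "0.25"},
    {"gene": "GENE_C", "gnomad_3_1_1_AF_popmax": "."},
]

kept = [
    v["gene"]
    for v in variants
    if is_rare(parse_popmax(v["gnomad_3_1_1_AF_popmax"]))
]
print(kept)
```

Here GENE_A passes because it is below the default cutoff, GENE_C passes because its popmax value is not defined, and GENE_B is dropped.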

Tumor Tool

The Tumor tool searches the paired proband tumor sample for aligned reads in the regions where rare germline variants exist, extracts allele/reference counts and allele/reference depths, and calculates variant allele frequency (VAF). It can search multiple tumor samples for the proband and, if applicable, paternal and maternal tumor samples. To extract reads from the BAM/CRAM files, the tool uses bam-readcount, wrapped in a Python script that shapes the output into a tabular format.
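As a rough illustration of the VAF calculation, the sketch below divides the alternate-allele read count by the total number of reads overlapping the site. It does not reproduce the tool's wrapper script: the function name is hypothetical, and returning no value for sites below the minimum depth is an assumption about how a setting like minDepth might be applied.

```python
from typing import Optional


def variant_allele_frequency(
    alt_depth: int, depth: int, min_depth: int = 1
) -> Optional[float]:
    """Fraction of reads carrying the alternate allele.

    Returns None when the site has no coverage or fewer reads than
    min_depth (assumed behaviour, for illustration only).
    """
    if depth <= 0 or depth < min_depth:
        return None
    return alt_depth / depth


# Made-up counts: a heterozygous germline site and the same site in a tumor.
germline_vaf = variant_allele_frequency(alt_depth=20, depth=40)
tumor_vaf = variant_allele_frequency(alt_depth=36, depth=40)
print(germline_vaf, tumor_vaf)
```

A germline VAF near 0.5 paired with a tumor VAF far from 0.5 at the same site is the kind of shift this preprocessing is meant to surface for downstream LOH assessment.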

LOH Inputs

Germline tool
  # Required  
  BS_ID:{ doc: provide BS id for germline normal, type: string }
  frequency:{ doc: provide popmax cutoff for rare germline variants, type: 'float?', default: 0.01 }
  # Optional
  ram_germline:{  doc: Provide ram (in GB) based on the size of vcf, type: 'int?', default: 8}
  # Required for family trios otherwise not required
  peddy_file:{ doc: provide ped file for the trio, type: 'File?' }

Tumor tool
  # Required
  participant_id:{ doc: provide participant id for this run, type: string }
  bamscrams:{ doc: tumor input file in cram or bam format with their index file, type: 'File[]' , secondaryFiles: [ { pattern: ".crai", required: false }, { pattern: ".bai", required: false } ] }
  reference:{ doc: human reference in fasta format with index file, type: File,secondaryFiles: [ .fai ] }
  sample_vcf_file: { doc: provide germline vcf file for this sample, type: File }
  # Optional
  minDepth:{ doc: provide minDepth to consider for tumor reads, type: 'int?', default: 1 }
  bamcramsampleIDs: { doc: provide unique identifiers (in the same order) for cram/bam files provided under bamscrams tag. Default is sample ID pulled from bam/cram files., type: 'string[]?' }
  ram_tumor:{  doc: Provide ram (in GB) for tumor tool based on the number of cram/bam inputs, type: 'int?', default: 16}
  minCore:{ type: 'int?', default: 16, doc: "Minimum number of cores for tumor tool based on the number of cram/bam inputs" }

LOH schematic

LOH Output

The LOH application outputs a tab-separated values (TSV) file that combines the mapped data from the Germline tool and the Tumor tool.

output_file:{ type: File, doc: A tsv file with gathered data from germline and tumor tool }

Output headers

LOH Preprocessing generates a tab-separated values file with the following headers:

Headers                             Description
BS_ID                               Sample ID for the germline sample
gene                                Gene
chr                                 Chromosome
start                               Start position
end                                 End position
ref                                 Reference allele
alt                                 Alternate allele
proband_germline_ref_depth          Reference depth from proband germline
proband_germline_alt_depth          Alternate depth from proband germline
proband_germline_depth              Total number of reads overlapping a site from proband germline
proband_germline_vaf                Fraction of reads with the alternate allele from proband germline
paternal_germline_ref_depth         Reference depth from paternal germline
paternal_germline_alt_depth         Alternate depth from paternal germline
paternal_germline_depth             Total number of reads overlapping a site from paternal germline
paternal_germline_vaf               Fraction of reads with the alternate allele from paternal germline
maternal_germline_ref_depth         Reference depth from maternal germline
maternal_germline_alt_depth         Alternate depth from maternal germline
maternal_germline_depth             Total number of reads overlapping a site from maternal germline
maternal_germline_vaf               Fraction of reads with the alternate allele from maternal germline
proband_sample_id_tumor_vaf         Variant allele frequency from a specific proband tumor sample
proband_sample_id_tumor_depth       Depth of coverage from a specific proband tumor sample
proband_sample_id_tumor_alt_depth   Alternate allele count at site from a specific proband tumor sample
proband_sample_id_tumor_ref_depth   Reference allele count at site from a specific proband tumor sample

More information can be found here

Running it locally on a laptop?

We recommend running this CWL workflow on a system with many CPUs and at least 16 GB of memory. The basic requirements are a running Docker engine and CWL tools. The command to run the LOH workflow locally is:

cwltool workflow/run_LOH_app.cwl sample_input.yml

Note: Inputs to the workflow need to be defined in sample_input.yml.
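One way to produce such an input file is sketched below in Python. It writes the job as JSON, which is also valid YAML, so the file can keep the sample_input.yml name. All sample IDs and file paths are placeholders. The sketch also assumes that the workflow exposes the input names listed under "LOH Inputs" unchanged, which should be checked against workflow/run_LOH_app.cwl before use.

```python
import json


def cwl_file(path: str) -> dict:
    """Build a CWL File object for a job file."""
    return {"class": "File", "path": path}


# Placeholder IDs and paths; replace them with real values.
# Assumption: the workflow input ids match the names under "LOH Inputs".
job = {
    "BS_ID": "BS_EXAMPLE01",
    "participant_id": "PT_EXAMPLE01",
    "sample_vcf_file": cwl_file("germline.vcf.gz"),
    "reference": cwl_file("reference.fasta"),
    "bamscrams": [cwl_file("tumor_1.cram"), cwl_file("tumor_2.cram")],
    "bamcramsampleIDs": ["tumor_1", "tumor_2"],
    "frequency": 0.01,
    "minDepth": 1,
}

# JSON is a subset of YAML, so this file can be passed where a YAML job
# file is expected.
with open("sample_input.yml", "w") as handle:
    json.dump(job, handle, indent=2)
```

Index files (.fai, .crai or .bai) are expected next to their primary files, matching the secondaryFiles patterns in the input definitions.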

Contributors

sakshamphul, migbro