Giter Site home page Giter Site logo

oicr-gsi / sample-fingerprinting Goto Github PK

View Code? Open in Web Editor NEW
0.0 16.0 2.0 1.21 MB

workflow that generates genotype fingerprints consumed by SampleFingerprinting workflow

Shell 6.29% WDL 93.71%
qc variant-calling fingerprinting genotyping reporting

sample-fingerprinting's Introduction

fingerprintCollector

FingerprintCollector 2.1, workflow that generates genotype fingerprints consumed by SampleFingerprinting workflow

Overview

Fingerprint Collector workflow produces "fingerprint" data for input alignments passed as .bam files. It is a part of the original implementation and its task is to produce all intermediate data just before creation of similarity matrix and sample swap report. The goal is to decrease the stress on the system by splitting the workflow and collecting variation data independently for each input .bam file. The below graph describes the process:

sample-fingerprinting flowchart

Dependencies

Usage

Cromwell

java -jar cromwell.jar run fingerprintCollector.wdl --inputs inputs.json

Inputs

Required workflow parameters:

Parameter Value Description
inputBam File Input lane-level BAM file
inputBai File Index for the input BAM file
refFasta String Path to the reference fasta file
hotspotSNPs String Path to the gzipped hotspot vcf file
runHaplotypeCaller.modules String Names and versions of modules
runDepthOfCoverage.modules String Names and versions of modules
runFinCreator.modules String Names and versions of modules

Optional workflow parameters:

Parameter Value Default Description
outputFileNamePrefix String basename(inputBam,".bam") Output prefix, customizable. Default is the input file's basename.

Optional task parameters:

Parameter Value Default Description
runHaplotypeCaller.jobMemory Int 8 memory allocated for Job
runHaplotypeCaller.timeout Int 24 Timeout in hours, needed to override imposed limits
runHaplotypeCaller.stdCC Float 30.0 standard call confidence score, default is 30
runDepthOfCoverage.jobMemory Int 8 memory allocated for Job
runDepthOfCoverage.timeout Int 24 Timeout in hours, needed to override imposed limits
runFinCreator.chroms Array[String] ["chr1", "chr2", "chr3", "chr4", "chr5", "chr6", "chr7", "chr8", "chr9", "chr10", "chr11", "chr12", "chr13", "chr14", "chr15", "chr16", "chr17", "chr18", "chr19", "chr20", "chr21", "chr22", "chrX"] Canonical chromosomes in desired order (used for soting lines in .fin file)
runFinCreator.timeout Int 10 Timeout in hours, needed to override imposed limits
runFinCreator.jobMemory Int 8 memory allocated for Job

Outputs

Output Type Description
outputVcf File gzipped vcf expression levels for all genes recorded in the reference
outbutTbi File expression levels for all isoforms recorded in the reference
outputFin File Custom format file, shows which hotspots were called as variants

Commands

This section lists command(s) run by fingerprintCollector workflow

  • Running fingerprintCollector

GATK Haplotype Caller using a list of genotyping hotspots:

 gatk HaplotypeCaller
     -R REF_FASTA
     -I INPUT_BAM
     -O SAMPLE_ID.snps.raw.vcf
    --read-filter CigarContainsNoNOperator
    --stand-call-conf STD_CC
     -L HOTSPOT_SNPS

 bgzip -c SAMPLE_ID.snps.raw.vcf > SAMPLE_ID.snps.raw.vcf.gz
 tabix -p vcf SAMPLE_ID.snps.raw.vcf.gz 

```

Depth of Coverage analysis:

```
 java -jar GenomeAnalysisTK.jar 
      -R REF_FASTA
      -T DepthOfCoverage
      -I INPUT_BAM
      -o SAMPLE_ID
      -filterRNC
      -L HOTSPOT_SNPS 

```

Creation of a fingerprint file:

...

 Custom python code producing .fin file with hotspot calls

 please refer to the fingerprintCollector.wdl for source

```
## Support

For support, please file an issue on the [Github project](https://github.com/oicr-gsi) or send an email to [email protected] .

_Generated with generate-markdown-readme (https://github.com/oicr-gsi/gsi-wdl-tools/)_

sample-fingerprinting's People

Contributors

lheisler avatar mariacheng avatar mlaszloffy avatar morgantaschuk avatar pruzanov avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Forkers

g3chen m28feng

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.