Giter Site home page Giter Site logo

bamclipper's Introduction

BAMClipper

Remove gene-specific primer sequences from SAM/BAM alignments of PCR amplicons by soft-clipping

Download latest version in a ZIP package

Dependencies, as tested on 64-bit CentOS 5.5

Usage

bamclipper.sh soft-clips gene-specific primers from BAM alignment file based on genomic coordinates of primer pairs in BEDPE format.

./bamclipper.sh -b BAM -p BEDPE [-n NTHREAD] [-s SAMTOOLS] [-g GNUPARALLEL] [-u UPSTREAM] [-d DOWNSTREAM]

Given a BAM file called NAME.bam, a new BAM file (NAME.primerclipped.bam) and its associated index (NAME.primerclipped.bam.bai) will be generated in the current working directory.

Notes: For the sake of performance and simplicity, soft-clipping is performed solely based on genomic coordinates without involving the underlying sequence. Reference sequence names and coordinates of BAM and BEDPE are assumed to be derived from identical reference sequences (e.g. hg19).

Required arguments

  • -b FILE: indexed BAM alignment file
  • -p FILE: BEDPE file of primer pair locations

Options

  • -n INT: number of threads for clipprimer.pl (the workhorse Perl script of BAMClipper) and samtools sort [1]
  • -s FILE: path to samtools executable [samtools]
  • -g FILE: path to gnu parallel executable [parallel]
  • -u INT: number of nucleotide upstream to 5' most nucleotide of primer (in addition to 5' most nucleotide of primer) for assigning alignments to primers based on the alignment starting position. [1]
  • -d INT: number of nucleotide downstream to 5' most nucleotide of primer (in addition to 5' most nucleotide of primer) for assigning alignments to primers based on the alignment starting position. [5]

Example using demo data

# Clip primers by BAMClipper
>./bamclipper.sh -b examples/SRR2075598.bam -p examples/trusight_myeloid.bedpe -n 4
# done!
# SRR2075598.primerclipped.bam and its index SRR2075598.primerclipped.bam.bai are generated.

# the new SRR2075598.primerclipped.bam should be identical to the provided example (compare checksum of alignments and ignore headers)
>samtools view SRR2075598.primerclipped.bam | md5sum
6a431457fd6e892646c17d1c3029c24e  -
>samtools view examples/SRR2075598.primerclipped.bam | md5sum
6a431457fd6e892646c17d1c3029c24e  -

# An example line of primer pair BEDPE file (an amplicon targeting ASXL1)
>grep 31022896 examples/trusight_myeloid.bedpe
chr20   31022896        31022921        chr20   31023096        31023123

Details of demo data:

  • examples/SRR2075598.bam: original BWA-MEM v0.7.7 alignments of a TruSight Myeloid panel dataset (first 100 read pairs of SRR2075598) (reference: hg19)
  • examples/SRR2075598.primerclipped.bam: primer-clipped alignments
  • examples/trusight_myeloid.bedpe: primer locations derived from manufacturer's TruSight-Myeloid-Manifest.txt

Citation

Au CH, Ho DN, Kwong A, Chan TL and Ma ESK, 2017. BAMClipper: removing primers from alignments to minimize false-negative mutations in amplicon next-generation sequencing. Scientific Reports 7:1567 (doi:10.1038/s41598-017-01703-6)

bamclipper's People

Contributors

tommyau avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.