This project forked from biocore/deblur

Deblur is a greedy deconvolution algorithm based on known read error profiles.

License: BSD 3-Clause "New" or "Revised" License

Deblur

Deblur is a greedy deconvolution algorithm based on Illumina MiSeq/HiSeq error profiles.

Install

  • Deblur requires Python 3.5. If Python 3.5 is not installed, you can create a conda environment for deblur using:
conda create -n deblurenv python=3.5 numpy

and activate it using:

source activate deblurenv

(Note: you will need to activate this environment in every new shell session before using deblur.)

At the moment, the install is a two-stage process, because deblur is not yet available in a conda channel.

  • Install the deblur dependencies:
conda install -c bioconda VSEARCH MAFFT SortMeRNA==2.0 biom-format
  • Install deblur:
pip install deblur
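
After both install steps, it can be useful to confirm that the external tools are visible on your PATH. The following is a minimal sketch, not part of deblur itself; the executable names in TOOLS are assumptions based on the package names above and may differ in your installation.

```python
import shutil

# Assumed executable names (hypothetical); adjust them to match your install.
TOOLS = ("vsearch", "mafft", "sortmerna", "deblur")


def find_tools(names=TOOLS):
    """Map each executable name to its full path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


# Report which tools can be found from the currently active environment.
for name, path in find_tools().items():
    print(f"{name}: {path or 'NOT FOUND'}")
```

Run it inside the activated deblurenv environment; a tool reported as NOT FOUND usually means the environment is not active or the install step did not complete.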

Example usage

The input to the deblur workflow is either a directory of FASTA files (one per sample) or a single demultiplexed FASTA or FASTQ file. The output is a BIOM table with the sequences as the OTU IDs (final.biom in the output directory).

The simplest use case specifies only the input FASTA file (or directory) and the output directory name:

deblur workflow --seqs-fp all_samples.fna --output-dir output

If you are starting from a barcode file and a read file, you can first run the QIIME split_libraries_fastq.py command (we recommend using -q 19 to remove low-quality reads):

split_libraries_fastq.py -i XXX_R1_001.fastq -m map.txt -o split -b XXXX_I1_001.fastq -q 19

and then use split/seqs.fna as the input to the deblur workflow.

Important options

  • The sequence read length can be specified by the -t NNN flag, where NNN denotes the length all sequences will be trimmed to (default=150). Note that all reads shorter than this length will be discarded.

  • To run in parallel, specify the number of threads with the -O NNN flag (default is 1). Note that running more threads than there are available cores will not improve performance.

  • To get a full list of options, use:

deblur workflow --help
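
To make the effect of the -t flag concrete, here is a small illustrative sketch of the trimming rule described above: reads are cut to a fixed length, and reads shorter than that length are dropped. This is not deblur's own code; the function name trim_reads and the toy read IDs are hypothetical.

```python
def trim_reads(reads, trim_length=150):
    """Trim every read to trim_length and drop reads that are shorter."""
    kept = {}
    for read_id, seq in reads.items():
        if len(seq) < trim_length:
            continue  # shorter than the trim length: discarded
        kept[read_id] = seq[:trim_length]
    return kept


# Hypothetical toy reads, with a small trim length for readability.
example = {"s1_1": "ACGTACGTAC", "s1_2": "ACGT", "s2_1": "TTGGCCAA"}
trimmed = trim_reads(example, trim_length=8)
print(trimmed)
```

Here the 4-base read s1_2 is removed, the 10-base read s1_1 is cut to 8 bases, and the 8-base read s2_1 is kept unchanged. Choosing a larger trim length therefore keeps more of each read but discards more reads.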

Positive and Negative Filtering

By default, deblur uses positive filtering, keeping only 16S sequences (based on homology to the Greengenes 88% representative set). For example:

deblur workflow --seqs-fp all_samples.fna --output-dir output

Negative filtering can be selected using the '-n' flag. This causes deblur to keep all sequences except known artifact sequences (i.e. PhiX and adapter sequences), so other non-16S sequences are retained. For example:

deblur workflow --seqs-fp all_samples.fna --output-dir output -n
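
When deblur is run from a script over many datasets, it can help to assemble the command line programmatically. The sketch below uses only the flags mentioned in this README (--seqs-fp, --output-dir, -t, -O and -n); the helper build_deblur_command is a hypothetical convenience function, not part of deblur.

```python
import shlex


def build_deblur_command(seqs_fp, output_dir, trim_length=None,
                         threads=None, negative_filter=False):
    """Build an argument list for 'deblur workflow' from the README's flags."""
    cmd = ["deblur", "workflow",
           "--seqs-fp", seqs_fp,
           "--output-dir", output_dir]
    if trim_length is not None:
        cmd += ["-t", str(trim_length)]   # trim length for all reads
    if threads is not None:
        cmd += ["-O", str(threads)]       # number of threads
    if negative_filter:
        cmd.append("-n")                  # negative instead of positive filtering
    return cmd


cmd = build_deblur_command("all_samples.fna", "output",
                           trim_length=150, negative_filter=True)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to launch the workflow. Keeping the arguments as a list, rather than a single string, avoids shell quoting problems with unusual file names.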

Code Development Note

Some of the code in the deblur package has been derived from QIIME. The contributors to these specific QIIME modules have granted permission for the code to be ported and released under the BSD license.
