OTU-clustering

Scripts for benchmarking and comparing the short-read OTU clustering tools available via QIIME 1.9.0. The full benchmark can be launched with the script OTU-clustering/shell_scripts/launch_benchmark.sh, which executes all software tools and performs the analysis using the datasets and scripts listed below. Before executing this script, the user must modify the working directory path in launch_benchmark.sh. Only the commands that launch software, in OTU-clustering/shell_scripts/commands_16S.sh and OTU-clustering/shell_scripts/commands_18S.sh, are submitted through qsub; these can easily be modified to run on any system.

Dependencies

  1. QIIME 1.9

Simulate reads:

  1. PrimerProspector (1.0.1)
  2. ART (VanillaIceCream-03-11-2014)

Benchmark:

  1. BLAST (2.2.29+)
  2. USEARCH UCHIME (7.0.1090)
  3. UPARSE (7.0.1090)

Install

git clone https://github.com/ekopylova/OTU-clustering.git

Datasets

  1. Read datasets: ftp://ftp.microbio.me/pub/supplemental_otu_clustering_datasets.tar.gz
  2. Greengenes 13.8: ftp://ftp.greengenes.microbio.me/greengenes_release/gg_13_5/gg_13_8_otus.tar.gz
  3. SILVA 111: ftp://ftp.microbio.me/pub/QIIME_nonstandard_referencedb/Silva_111.tgz
  4. 16S PyNAST template: http://greengenes.lbl.gov/Download/Sequence_Data/Fasta_data_files/core_set_aligned.fasta.imputed
  5. 18S PyNAST template: ftp://ftp.microbio.me/pub/core_Silva119_alignment.fna.gz
  6. For chimera checking 16S, the gold database: http://drive5.com/uchime/uchime_download.html
  7. For chimera checking 18S, the SILVA 97% representative set from SILVA 111 (see 3)

Scripts

The benchmark and comparative analysis can be executed using the following scripts, in the order given. All scripts other than launch_benchmark.sh require input arguments, all of which are defined in launch_benchmark.sh.

  1. Launch full benchmark (executes all scripts below):
    OTU-clustering/shell_scripts/launch_benchmark.sh

Alternatively, each script can be launched individually:

  1. Simulate even and staggered community reads:
    OTU-clustering/shell_scripts/simulate_reads.sh
  2. Launch all software (via QIIME’s pick_closed_reference_otus.py, pick_de_novo_otus.py and pick_open_reference_otus.py) on 16S datasets:
    OTU-clustering/shell_scripts/commands_16S.sh
  3. Launch all software on 18S datasets:
    OTU-clustering/shell_scripts/commands_18S.sh
  4. Remove singleton OTUs (OTUs consisting of only 1 read) from the final OTU tables generated in steps 2 and 3:
    OTU-clustering/python_scripts/run_filter_singleton_otus.py
  5. Summarize taxonomy using filtered OTU tables:
    OTU-clustering/python_scripts/run_summarize_taxa.py
  6. Summarize filtered OTU tables:
    OTU-clustering/python_scripts/run_summarize_tables.py
  7. Compute the true positive, false positive, false negative, precision, recall and F-measure metrics, as well as the FP-chimera, FP-known and FP-other metrics, using the summarized taxonomy results:
    OTU-clustering/python_scripts/run_compute_precision_recall.py
  8. Generate alpha diversity plots:
    OTU-clustering/python_scripts/run_single_rarefaction_and_plot.py
  9. Generate beta diversity plots:
    OTU-clustering/python_scripts/run_beta_diversity_and_procrustes.py
  10. Generate taxonomy comparison tables:
    OTU-clustering/python_scripts/run_compare_taxa_summaries.py
  11. Generate taxonomy stacked bar plots:
    OTU-clustering/python_scripts/run_generate_taxa_barcharts.py
  12. Plot TP, FP-chimera, FP-known and FP-other results:
    OTU-clustering/python_scripts/plot_tp_fp_distribution.py
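
The singleton filtering and precision/recall steps in the list above can be illustrated with a minimal Python sketch. This is not the code in run_filter_singleton_otus.py or run_compute_precision_recall.py: the function names, the dictionary-based OTU table and the example taxa are hypothetical. It assumes the standard definitions precision = TP / (TP + FP), recall = TP / (TP + FN) and F-measure as their harmonic mean, computed over the sets of expected and observed taxa. The FP-chimera, FP-known and FP-other breakdown is not covered.

```python
from __future__ import division  # true division on Python 2 as well


def filter_singleton_otus(otu_table):
    """Return a copy of the table without singleton OTUs.

    otu_table is assumed to map OTU ID -> {sample ID: read count}.
    An OTU is kept only if it has at least 2 reads summed over all samples.
    """
    return {
        otu_id: counts
        for otu_id, counts in otu_table.items()
        if sum(counts.values()) >= 2
    }


def precision_recall_f(expected_taxa, observed_taxa):
    """Compare observed taxa against the taxa known to be in the community."""
    expected = set(expected_taxa)
    observed = set(observed_taxa)

    tp = len(expected & observed)   # expected and found
    fp = len(observed - expected)   # found but not expected
    fn = len(expected - observed)   # expected but missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    return {
        "tp": tp,
        "fp": fp,
        "fn": fn,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    table = {
        "OTU_1": {"S1": 5, "S2": 0},
        "OTU_2": {"S1": 1, "S2": 0},  # singleton: removed
        "OTU_3": {"S1": 1, "S2": 1},
    }
    print(sorted(filter_singleton_otus(table)))

    expected = ["Bacillus", "Escherichia", "Staphylococcus", "Pseudomonas"]
    observed = ["Bacillus", "Escherichia", "Streptococcus"]
    print(precision_recall_f(expected, observed))
```

In the benchmark itself the inputs would be the filtered OTU tables and summarized taxonomy files produced by the earlier steps; the in-memory structures used here are only a stand-in for those files.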

Citing

If you use any of the data or code included in this repository, please cite with the URL: https://github.com/ekopylova/OTU-clustering.git
