qiime_pipeline's Introduction

QIIME2 pipeline

To extract and wrangle sequence data, follow the steps below.

1. Clone QIIME2 pipeline

git clone https://github.com/TracyRage/qiime_pipeline.git

Enter the cloned directory and make all the .sh files executable

cd qiime_pipeline && chmod +x *.sh

2. Extract barcodes from raw sequences

To proceed with the QIIME2 analysis, you need a metadata tsv file that assigns a barcode to each sample. This repo already contains an example metadata.tsv, so feel free to modify it by adding your own sample abbreviations and barcodes.

  • Create a temporary directory in seqs/ and decompress the raw *fastq.gz files
cd seqs/ && mkdir temporary_dir && gunzip *gz && cp *fastq temporary_dir && gzip *fastq && cd temporary_dir
# Do it for each sample: write the read labels to labels.txt, then print the first three
usearch -fastx_getlabels YOUR_SAMPLE_NAME.fastq -output labels.txt && head -n 3 labels.txt

To avoid redundancy, extract barcodes from the FORWARD reads only (i.e. R1).

  • Change sample names and add barcodes in metadata.tsv. Optionally, change the other metadata entries.

  • Rename your sequence files properly; otherwise, QIIME2 won’t see them. See the example files in seqs/.

  • Each file name must contain the sample, barcode and forward (R1) / reverse (R2) fields: sample_barcode_L001_{R1 or R2}_001.fastq.gz

  • When you are done, delete the example files.
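The barcode lookup and renaming described above can be sketched as two small shell helpers. This is a minimal sketch, not part of the repository: the function names barcode_from_label and qiime_fastq_name are hypothetical, and the first one assumes Illumina-style read labels in which the barcode is the last colon-separated field. Check that assumption against your own labels.txt before relying on it.

```shell
# Hypothetical helper: print the barcode field of one read label.
# Assumes the barcode is everything after the last colon, as in
# "M01234:55:000000000-ABCDE:1:1101:15589:1332 1:N:0:ACGTACGT".
barcode_from_label() {
    label="$1"
    printf '%s\n' "${label##*:}"
}

# Hypothetical helper: build a file name that follows the
# sample_barcode_L001_{R1 or R2}_001.fastq.gz pattern.
qiime_fastq_name() {
    sample="$1"
    barcode="$2"
    read_dir="$3"
    case "$read_dir" in
        R1|R2) ;;
        *) echo "read direction must be R1 or R2" >&2; return 1 ;;
    esac
    printf '%s_%s_L001_%s_001.fastq.gz\n' "$sample" "$barcode" "$read_dir"
}

# Example with a made-up label and sample name (not real data).
label="M01234:55:000000000-ABCDE:1:1101:15589:1332 1:N:0:ACGTACGT"
barcode="$(barcode_from_label "$label")"
qiime_fastq_name soil1 "$barcode" R1
# prints soil1_ACGTACGT_L001_R1_001.fastq.gz
```

The generated name can then be used in a rename, for example mv -- raw_forward.fastq.gz "$(qiime_fastq_name soil1 ACGTACGT R1)", where raw_forward.fastq.gz stands in for your own file.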

3. Import data in QIIME2

bash import.sh

4. Process data

Go to QIIME 2 View (https://view.qiime2.org), open your demuz.qzv file, and decide where to trim the sequences (keep the positions whose median quality score is >= 28). The default settings in process.sh are usually good enough, so you may skip this check.

bash process.sh
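The trimming decision in this step can be sketched as a small filter. This is an illustrative sketch only: the function name trim_position is hypothetical, and it assumes you have prepared the per-position median quality as two tab-separated columns (position, median quality), which is not necessarily the layout QIIME 2 View exports.

```shell
# Hypothetical helper: print the last read position before the median
# quality first drops below a threshold (default 28).
# Reads "position<TAB>median_quality" lines on stdin.
trim_position() {
    threshold="${1:-28}"
    awk -F'\t' -v t="$threshold" '
        $2 + 0 < t { exit }      # stop at the first position below the threshold
        { last = $1 }            # otherwise remember this position
        END { print last + 0 }   # 0 means even the first position failed
    '
}

# Made-up quality profile: the median drops below 28 at position 4.
printf '1\t34\n2\t33\n3\t30\n4\t27\n5\t29\n' | trim_position 28
# prints 3
```

The sketch stops at the first drop rather than skipping over it, so a later recovery in quality (position 5 above) does not extend the kept region.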

5. Train sequence model

This pipeline uses the GTDB database.

If you work with different primers, open training.sh and modify the FORWARD and REVERSE variables. If not, just run the script.

bash training.sh
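Before editing FORWARD and REVERSE, it can help to confirm that a primer contains only valid nucleotide codes. The sketch below is an assumption-laden illustration: is_valid_primer is a hypothetical helper, and the two sequences are the widely used 515F/806R 16S primers, shown only as placeholders and not necessarily the values in training.sh.

```shell
# Hypothetical helper: succeed only if the argument is a non-empty
# string of uppercase IUPAC nucleotide codes.
is_valid_primer() {
    case "$1" in
        ''|*[!ACGTRYSWKMBDHVN]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Placeholder primers (515F / 806R); replace with your own.
FORWARD="GTGCCAGCMGCCGCGGTAA"
REVERSE="GGACTACHVGGGTWTCTAAT"

for primer in "$FORWARD" "$REVERSE"; do
    if is_valid_primer "$primer"; then
        echo "ok: $primer"
    else
        echo "invalid characters in primer: $primer" >&2
    fi
done
```

Degenerate codes such as M, H, V and W are accepted on purpose, since many 16S primers contain them.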

6. Analyze your dataset

Go to QIIME 2 View (https://view.qiime2.org) and open your feature_table.qzv file. Write down the median frequency and the feature count of the sample with the lowest count. Then open analyze.sh and set the SAMPLING_DEPTH and MEDIAN variables accordingly.

bash analyze.sh
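If you prefer to compute the two numbers rather than read them off the page, the following sketch derives the lowest and the median per-sample count from a list of counts. It makes several assumptions: depth_stats is a hypothetical helper, the input is "sample_id,frequency" lines without a header, and the lowest count is taken to correspond to SAMPLING_DEPTH and the median to MEDIAN.

```shell
# Hypothetical helper: print the minimum and median per-sample frequency.
# Reads "sample_id,frequency" lines (no header) on stdin.
depth_stats() {
    sort -t, -k2,2n | awk -F, '
        { v[NR] = $2 }                     # values arrive in ascending order
        END {
            if (NR == 0) exit 1            # no input, nothing to report
            if (NR % 2) median = v[(NR + 1) / 2]
            else median = (v[NR / 2] + v[NR / 2 + 1]) / 2
            printf "min=%s median=%s\n", v[1], median
        }
    '
}

# Made-up counts for four samples.
printf 'a,1200\nb,800\nc,1500\nd,1000\n' | depth_stats
# prints min=800 median=1100
```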

Conclusion

To see the bacterial distribution, go to QIIME 2 View (https://view.qiime2.org) and open your tax_bar_plots.qzv file.

For further statistics and graphics, you can optionally consult sping_analysis.Rmd.
