ma-diroma / eager

This project forked from nf-core/eager

A fully reproducible and state of the art ancient DNA analysis pipeline

Home Page: https://nf-co.re/eager

License: Other

HTML 1.34% Python 14.77% Nextflow 83.70% Dockerfile 0.19%

nf-core/eager

A fully reproducible ancient and modern DNA pipeline in Nextflow with cloud support.

Introduction

nf-core/eager is a bioinformatics best-practice pipeline for the analysis of NGS-based ancient DNA (aDNA) data.

The pipeline uses Nextflow, a bioinformatics workflow tool. It pre-processes raw data from FASTQ inputs, or takes preprocessed BAM inputs. It can align reads and perform extensive general NGS and aDNA-specific quality control on the results. It comes with Docker, Singularity or Conda support, making installation trivial and results highly reproducible.

Pipeline steps

Default Steps

By default the pipeline currently performs the following:

  • Create reference genome indices for mapping (bwa, samtools, and picard)
  • Sequencing quality control (FastQC)
  • Sequencing adapter removal and, for paired-end data, read merging (AdapterRemoval)
  • Read mapping to the reference (bwa aln, bwa mem or CircularMapper)
  • Post-mapping processing, statistics and conversion to bam (samtools)
  • Ancient DNA C-to-T damage pattern visualisation (DamageProfiler)
  • PCR duplicate removal (DeDup or MarkDuplicates)
  • Post-mapping statistics and BAM quality control (Qualimap)
  • Library Complexity Estimation (preseq)
  • Overall pipeline statistics summaries (MultiQC)

Additional Steps

Additional functionality currently included in the pipeline:

Preprocessing

  • Illumina two-coloured sequencer poly-G tail removal (fastp)
  • Automatic conversion of unmapped reads to FASTQ (samtools)
  • Host DNA (mapped reads) stripping from input FASTQ files (for sensitive samples)

aDNA Damage manipulation

  • Damage removal/clipping for UDG+/UDG-half treatment protocols (BamUtil)
  • Damaged reads extraction and assessment (PMDTools)

Genotyping

  • Creation of VCF genotyping files (GATK UnifiedGenotyper, GATK HaplotypeCaller and FreeBayes)
  • Consensus sequence FASTA creation (VCF2Genome)
  • SNP Table generation (MultiVCFAnalyzer)

Biological Information

  • Mitochondrial to Nuclear read ratio calculation (MtNucRatioCalculator)
  • Statistical sex determination of human individuals (Sex.DetERRmine)

Metagenomic Screening

  • Taxonomic binner with alignment (MALT)
  • Taxonomic binner without alignment (Kraken2)
  • aDNA characteristic screening of taxonomically binned data from MALT (MaltExtract)

Quick Start

  1. Install Nextflow (>= v19.10.0)

  2. Install one of Docker, Singularity or Conda

  3. Download the EAGER pipeline

     nextflow pull nf-core/eager
    
  4. Test the pipeline using the provided test data

     nextflow run nf-core/eager -profile <docker/singularity/conda>,test --paired_end
    
  5. Start running your own ancient DNA analysis!

     nextflow run nf-core/eager -profile <docker/singularity/conda> --reads '*_R{1,2}.fastq.gz' --fasta '<your_reference>.fasta'
    
  6. Once your run has completed successfully, clean up the intermediate files.

     nextflow clean -f -k
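
Steps 1 and 5 above can be guarded with a small pre-flight check before launching a run. The following is a minimal sketch and not part of nf-core/eager: the helper names `version_ge` and `check_pairs` are hypothetical, and it assumes a shell with `local`, a `sort` that supports `-V` (GNU coreutils), and paired files named with the `_R1`/`_R2` suffixes used in the example command.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; not part of nf-core/eager.

# Succeeds if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort, available in GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Prints every *_R1.fastq.gz file in directory $1 that lacks a matching
# *_R2.fastq.gz mate; returns non-zero if any mate is missing.
check_pairs() {
    local dir="$1" r1 missing=0
    for r1 in "$dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue    # the glob matched nothing
        if [ ! -e "${r1%_R1.fastq.gz}_R2.fastq.gz" ]; then
            echo "missing mate for: $r1"
            missing=1
        fi
    done
    return "$missing"
}
```

The caller supplies the installed Nextflow version as a string, for example `version_ge "$installed" "19.10.0" && check_pairs .`, and only launches `nextflow run` when both checks succeed. How `$installed` is obtained is left open here, since it depends on the local installation.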
    

NB. You can see an overview of the run in the MultiQC report located at ./results/MultiQC/multiqc_report.html
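
Before cleaning up, it can be worth confirming that the report was actually written. A minimal sketch, assuming the default `./results` output directory named above; the function name `find_multiqc_report` is hypothetical and not part of the pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helper; not part of nf-core/eager.
# Prints the report path if it exists under the given results directory
# (default: ./results); otherwise prints a hint to stderr and fails.
find_multiqc_report() {
    local report="${1:-./results}/MultiQC/multiqc_report.html"
    if [ -f "$report" ]; then
        echo "$report"
    else
        echo "no MultiQC report at $report; did the run finish?" >&2
        return 1
    fi
}
```

Calling `find_multiqc_report` with no argument checks `./results`; passing a directory checks that location instead.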

Modifications to the default pipeline are easily made using various options as described in the documentation.

Documentation

The nf-core/eager pipeline comes with documentation about the pipeline, found in the docs/ directory or on the main homepage of the nf-core project:

  1. Nextflow Installation
  2. Pipeline configuration
  3. Running the pipeline
  4. Output and how to interpret the results
  5. EAGER2 Code Contribution Guidelines
  6. nf-core/nextflow Troubleshooting
  7. EAGER Troubleshooting

Credits

This pipeline was mostly written by Alexander Peltzer (apeltzer), with contributions from Stephen Clayton, James A. Fellows Yates, Thiseas C. Lamnidis, Maxime Borry, Zandra Fagernäs, Aida Andrades Valtueña and Maxime Garcia. If you want to contribute, please open an issue (or even better, a pull request!) and ask to be added to the project - everyone is welcome to contribute here!

Authors (alphabetical)

Additional Contributors (alphabetical)

Those who have provided conceptual guidance, suggestions, bug reports etc.

If you've contributed and you're missing in here, please let us know and we will add you in of course!

Tool References
