
treat's Introduction

License: GPL v3

Authors: Lasse Faber, Martin Hölzer

TREAT: TranscRiptome EvaluATion

Treat your assemblies well!

A pipeline that combines Docker containers or conda environments with Nextflow to evaluate (de novo) transcriptome assembly results.

Execution

To see all options:

nextflow run main.nf --help

Simple example execution:

nextflow run main.nf --assemblies 'test_data/*.fasta' --reads test_data/eco.fastq --reference test_data/eco_genome.fa --annotation test_data/eco_annotation.gff3 --threads 8 --busco bacteria_odb9 -profile standard

Example output for test command

heatmap

Profiles

The workflow can be deployed in different environments.

  • standard: uses Docker containers and can be run locally
  • conda: uses conda environments and can be run locally
  • googlegenomics: uses the Google Cloud and needs to be configured manually to run with your gcloud environment

Overview

dag

Motivation

This workflow implementation was motivated by our study comparing various de novo transcriptome assembly tools:


treat's Issues

Paired-end read support

Include the workflows for all evaluation tools for paired-end data.
Additionally, include TransRate, which only works with paired-end data.

Heatmap

The script comes from my Snakemake pipeline and contains a few Snakemake-specific commands that need to be deleted. We would also need to make some minor changes so that it works with this pipeline.

I created a branch for this: https://github.com/hoelzer-lab/treat/tree/heatmap

Dependencies for the script:

pandas
numpy
seaborn
matplotlib

csv as input

Instead of passing the assemblies via the command line, we want to accept a csv file as input. For example:

Assembly          R1               R2
trinity.fasta     single-end.fq
rna-spades.fasta  paired-end_1.fq  paired-end_2.fq
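
A minimal sketch of how such a sheet could be interpreted, assuming whitespace-separated columns with one header line as in the example (a real csv would use commas, so the field separator would need adjusting). The function name classify_samples and its tab-separated output are hypothetical and not part of treat.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a sample sheet (header + one assembly per line)
# from stdin and report whether each assembly has single-end or paired-end reads.
classify_samples() {
    local header assembly r1 r2 rest
    read -r header || return 0               # skip the header line
    # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
    while read -r assembly r1 r2 rest || [ -n "$assembly" ]; do
        [ -z "$assembly" ] && continue       # ignore blank lines
        if [ -n "$r2" ]; then
            printf '%s\tpaired-end\t%s\t%s\n' "$assembly" "$r1" "$r2"
        elif [ -n "$r1" ]; then
            printf '%s\tsingle-end\t%s\n' "$assembly" "$r1"
        else
            echo "no reads given for $assembly" >&2
        fi
    done
}
```

It could be called as `classify_samples < samples.txt`, where samples.txt is a placeholder file name. The second column of the output would then tell the workflow which branch (single-end or paired-end) to use for each assembly.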

Conda profile (Ex90N50): abundance_estimates_to_matrix.pl command not found

Executing the current workflow with the parameter -profile conda leads to the following error.
It seems that the script abundance_estimates_to_matrix.pl from the conda installation is not on the PATH.

Error executing process > 'EX90N50:ABUNDANCE_ESTIMATES_TO_MATRIX (1)'

Caused by:
  Process `EX90N50:ABUNDANCE_ESTIMATES_TO_MATRIX (1)` terminated with an error exit status (127)

Command executed:

  abundance_estimates_to_matrix.pl --est_method salmon --gene_trans_map none --out_prefix salmon_quant_trinity --name_sample_by_basedir salmon_quant_trinity/quant.sf

Command exit status:
  127

Command output:
  (empty)

Command error:
  .command.sh: line 2: abundance_estimates_to_matrix.pl: command not found

Work dir:
  /tmp/nextflow-work-be47waq/1b/3b0446e806583f095c419ae6c5ef5c

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
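
Exit status 127 is the shell's way of reporting that a command was not found. A small diagnostic along the following lines might help to tell a missing script from one that is installed but not exported to the PATH. The function name locate_script is hypothetical, and the idea that the conda package places the script somewhere below the environment directory is an assumption that would need checking.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic: report whether a script is on PATH and, if it is not,
# list copies of it found below a given root directory (e.g. a conda environment).
locate_script() {
    script="$1"
    root="$2"
    if command -v "$script" >/dev/null 2>&1; then
        echo "on PATH: $(command -v "$script")"
        return 0
    fi
    echo "not on PATH: $script"
    # Print every regular file with that name below the root directory.
    find "$root" -type f -name "$script" 2>/dev/null
}
```

Inside the activated environment this could be run as `locate_script abundance_estimates_to_matrix.pl "$CONDA_PREFIX"`. If a path is printed, the process could call the script by that path or add its directory to the PATH.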

rnaQUAST docker

I built a new rnaQUAST Docker image. Please test it when you implement the rnaQUAST module.

This is the current tag:

nanozoo/rnaquast:1.5.2--0ecea11

At least

docker run --rm -it nanozoo/rnaquast:latest rnaQUAST.py --help

works for me.

DETONATE docker

Tools that should be installed and work:

  • rsem-prepare-reference
  • rsem-calculate-expression
  • ref-eval-estimate-true-assembly
  • rsem-eval-calculate-score
  • ref-eval
  • blat
  • samtools
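
One way to verify the list above is a small check that could be run inside the container. This is only a sketch: the function name missing_tools is hypothetical, and it only tests that each command is found on the PATH, not that it actually works.

```shell
#!/usr/bin/env bash
# Print every command from the argument list that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example call with the tools from the list above:
# missing_tools rsem-prepare-reference rsem-calculate-expression \
#     ref-eval-estimate-true-assembly rsem-eval-calculate-score ref-eval blat samtools
```

An empty output would mean that all tools were found.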

Add conda support

The conda profile support as currently implemented would work in principle. However, some modules need more than one package: the HISAT2 module, for example, also needs samtools in its conda environment.

I don't know how to add samtools using the current syntax:

conda.config:

 withLabel: HISAT2 { cpus = params.threads ; conda = 'bioconda::hisat2=2.1.0' } 

It would be easy to create a YAML file for each module, e.g.:

hisat2.yaml

name: HISAT2
channels:
  - bioconda
dependencies:
  - hisat2=2.1.0
  - samtools=1.9

but then I do not know how to pass such an environment file to the withLabel syntax. @lmfaber
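
One possible way around this, sketched under the assumption that Nextflow's conda directive behaves as its documentation describes: the directive should accept either several space-separated package specifications or the path to a YAML environment file (ending in .yml or .yaml). The envs/ directory and the file name conda_hisat2.config are hypothetical. The label, package versions and YAML content are taken from the snippets above, and the enclosing process block is an assumption about how conda.config is structured.

```shell
#!/usr/bin/env bash
# Write the per-module environment file (the envs/ location is an assumption).
mkdir -p envs
cat > envs/hisat2.yaml <<'EOF'
name: HISAT2
channels:
  - bioconda
dependencies:
  - hisat2=2.1.0
  - samtools=1.9
EOF

# Write a config fragment that points the HISAT2 label at that file.
# The commented line shows the alternative of listing both packages inline.
cat > conda_hisat2.config <<'EOF'
process {
    withLabel: HISAT2 { cpus = params.threads ; conda = "$baseDir/envs/hisat2.yaml" }
    // withLabel: HISAT2 { cpus = params.threads ; conda = 'bioconda::hisat2=2.1.0 bioconda::samtools=1.9' }
}
EOF
```

The quoted heredoc delimiter keeps the shell from expanding $baseDir, so the variable is left for Nextflow to resolve. Whether the file-based or the inline variant is preferable probably depends on how many modules need more than one package.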

rnaQUAST --prokaryote

The parameter --prokaryote is currently hardcoded for rnaQUAST, which is OK for our current test dataset. However, we should consider passing this as an option in the CLI or config file at some point.

Detonate not working

Here is the path of the working DETONATE binaries on the server:
/data/prostlocal/programs/detonate/1.11
The server also uses samtools version 1.3, so maybe we should downgrade the version in the Docker container, too.
