arumugamlab / minto

Pipeline for Reproducible and Scalable Integration of Metagenomic and Metatranscriptomic Data

License: MIT License

Python 76.64% R 19.87% Shell 2.06% Perl 1.44%

minto's Introduction

MIntO

A Modular and Scalable Pipeline for Microbiome Metagenomics and Metatranscriptomics Data Integration

MIntO (Microbiome Integrated Meta-Omics) is a highly versatile pipeline that integrates metagenomic and metatranscriptomic data in a scalable way. Its distinctive feature is that it computes gene expression profiles by taking into account community turnover and gene expression variation as the underlying processes that shape community transcript levels over time and between conditions. The integrated pipeline provides biochemical insights into microbial ecology by linking functions to retrieved genomes and by examining gene expression variation.

MIntO can be downloaded here: https://github.com/arumugamlab/MIntO

What should I know about MIntO?

You do not need to be a programmer to use MIntO

MIntO aims to reduce the barrier of entry for metagenomic/metatranscriptomic data analysis. Experience with the Linux/Unix command line is required, but that's just about it. MIntO is designed to work out of the box, as long as you provide the right information in the configuration files; this is explained in the tutorials. We provide all the key steps and necessary explanations in the wiki section.

If you are a programmer, you can also use MIntO

If you are experienced with bioinformatic pipelines and have used Snakemake, then you could tweak MIntO for your own purposes. You can even add new functionality. You are welcome to contribute to the repository if you are interested.

How do I get started?

We suggest reading the MIntO in a nutshell section first to get an idea of how MIntO works. Then move on to Installation&Dependencies, and you are set to explore each step. You can also have a look at the tutorial.

Publication

If you use MIntO during your analysis, please cite:

MIntO: a Modular and Scalable Pipeline for Microbiome Meta-omics Data Integration
Carmen Saenz, Eleonora Nigro, Vithiagaran Gunalan & Manimozhiyan Arumugam
Frontiers in Bioinformatics (2022). doi: 10.3389/fbinf.2022.846922.

Contacts

Feel free to get in touch with the authors if you have any questions. You can post issues in this repository, or write to the corresponding author from the article link above.

minto's Issues

Missing taxa_env.yml file

Hello,

During 3.4. STEP4: Retrieving MAGs (ONLY FOR metaG), I found that MIntO lacks a yml configuration file required by one of the environments.

My command:
snakemake --snakefile /myPath/smk/mags_generation.smk --use-conda --use-singularity --restart-times 0 --keep-going --latency-wait 60 --configfile mags_generation.yaml --conda-prefix /myPath/mintoGit/MIntO/conda_env --jobs 10 --cores 40
"myPath" is of course changed to the appropriate path in the real command.

Error:

Building DAG of jobs...
WorkflowError:
Failed to open source file /myPath/MIntO/envs/taxa_env.yml
FileNotFoundError: [Errno 2] No such file or directory: '/myPath/mintoGit/MIntO/envs/taxa_env.yml'

I see this environment configuration file neither in the repository nor in my installation. Could you add it by any chance?

All the best,
Bogumil

Installation fails with ChildIOException

Hi there, I'm getting a snakemake error that I'm not sure how to resolve when installing MIntO.

My install command:

snakemake\
	--use-conda\
	--restart-times 1\
	--keep-going\
	--latency-wait 60\
	--jobs 2\
	--conda-prefix $MINTO_DIR/conda_env\
	--shadow-prefix /work/<usrname>/minto\
	--snakefile $MINTO_DIR/smk/dependencies.smk\
	--configfile $MINTO_DIR/configuration/dependencies.yaml

(I've changed server-specific details for privacy)

I get the following output:

*******************************
 Reading configuration yaml file: 
 *******************************
  
Building DAG of jobs...
ChildIOException:
File/directory is a child to another output:
$MINTO_DIR/data/kofam_db/profiles
$MINTO_DIR/data/kofam_db/profiles/prokaryote.hal

I have never written snakemake scripts, but Googling the ChildIOException error suggested that changing one of the nested entries from output to params can help. When I try that in dependencies.smk, it leads to the following error instead:

 *******************************
 Reading configuration yaml file: 
 *******************************
  
Building DAG of jobs...
ChildIOException:
File/directory is a child to another output:
$MINTO_DIR/data/rRNA_databases/idx
$MINTO_DIR/data/rRNA_databases/idx/rRNA_db_index.log

Changing either one to a param breaks the script in other ways. My snakemake version is 5.10.0

I am following instructions on the wiki and am not sure what to do now; any guidance?
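To illustrate what the two ChildIOException messages above are pointing at, here is a minimal plain-Python sketch. It is not Snakemake's own check; `find_nested_outputs` is a hypothetical helper, and the paths are shortened forms of the ones in the report. It simply flags any declared output that lies inside another declared output.

```python
from pathlib import PurePosixPath


def find_nested_outputs(outputs):
    """Return (parent, child) pairs where one declared output lies inside another.

    Hypothetical helper for illustration only; Snakemake performs its own
    check internally and this does not reproduce its implementation.
    """
    paths = [PurePosixPath(p) for p in outputs]
    pairs = []
    for parent in paths:
        for child in paths:
            # `parents` lists every ancestor directory of `child`.
            if parent != child and parent in child.parents:
                pairs.append((str(parent), str(child)))
    return pairs


# Shortened versions of the paths named in the two error messages.
outputs = [
    "data/kofam_db/ko_list",
    "data/kofam_db/profiles",
    "data/kofam_db/profiles/prokaryote.hal",
    "data/rRNA_databases/idx",
    "data/rRNA_databases/idx/rRNA_db_index.log",
]

for parent, child in find_nested_outputs(outputs):
    print(f"{child} is a child of output {parent}")
```

Both pairs from the report are flagged, which is consistent with the observation that changing one entry only moves the error to the next nested pair.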

Issues with compiling the databases

The step that fails is eggnog_db. According to the log, the py package created by the database creator fails because it cannot find its related files.

I also tried the manual instructions. The instructions for eggnog_db say:
"Create a conda environment that will be used to download the eggNOG database conda create env $MINTO_DIR/envs/py38_env.yml --name py38"
However, there is no py38_env.yml in the envs directory.

Thank you in advance.

MissingInputException in rule gene_annot_merge in file /MIntO/smk/gene_annotation.smk, line 183:

Reading configuration yaml file

WARNING in configuration yaml file : MIntO is using "/ssd1/MIntO/MIntO/mypath/IBD_tutorial/metaG/8-1-binning/mags_generation_pipeline/prokka" as PATH_reference variable
Building DAG of jobs...
MissingInputException in rule gene_annot_merge in file /ssd1/MIntO/MIntO/smk/gene_annotation.smk, line 183:
Missing input files for rule gene_annot_merge:
output: /ssd1/MIntO/MIntO/mypath/IBD_tutorial/DB/9-MAGs-prokka-post-analysis/CD_transl/MAGs_genes_translated_cds.faa, /ssd1/MIntO/MIntO/mypath/IBD_tutorial/DB/9-MAGs-prokka-post-analysis/genomes_list.txt, /ssd1/MIntO/MIntO/mypath/IBD_tutorial/DB/9-MAGs-prokka-post-analysis/GFF/MAGs_genes.bed, /ssd1/MIntO/MIntO/mypath/IBD_tutorial/DB/9-MAGs-prokka-post-analysis/GFF/MAGs_genes.header-modif.coord_correct.bed
wildcards: wd=/ssd1/MIntO/MIntO/mypath/IBD_tutorial
affected files:
/ssd1/MIntO/MIntO/mypath/IBD_tutorial/metaG/8-1-binning/mags_generation_pipeline/prokka

MGs normalized metagenomics and metatranscriptomics profiles

Thank you for the nice tool. Can we run only the MGs normalization step using MIntO?

Reference: the read count normalization step in the methodology of the manuscript (MGs normalization), and the fifth step of the Results section (5. Alignment and normalization).

I have already obtained the MAGs, and now I would like to check 1) their abundance in a metatranscriptomics dataset based on marker genes, and 2) the gene expression of the respective MAGs.

Thank you for your kind instructions.

Regards,
Jaspreet

MIntO with only Meta-transcriptomics data

Hello,

I am interested in using your pipeline for a meta-transcriptomics analysis that I'm preparing. However, I do not have any meta-genomic data. Could I still use your pipeline to process only meta-transcriptomic data? If so, could you please give me some pointers on which mode I should use and any considerations that I should take into account in this particular case?

Thanks in advance!
Best regards

Not all output files are specified as strings with metaT (SyntaxError)

Hi!
Thanks for creating and maintaining MIntO! I'm stepping in for a colleague who is using your workflow here (@NicoleTreichel).
Snakemake v7.32.4 complains about a SyntaxError when running the metatranscriptomics workflow, but not with the metagenomics one.

SyntaxError:
Input and output files have to be specified as strings or lists of strings.
  File "/data/MIntO/smk/QC_2.smk", line 292, in <module>

We ran the MIntO version 2.0.0-5-g4f6359 according to git describe --tags which would point to 4f63591.

I'm creating the issue for the record and for other users, but I'm going to submit a PR to fix the issue.
Best,

6-taxa_profile: metaphlan discordance between relative_abundance and estimated_number_of_reads

Dear MIntO Team,

First of all, thank you for providing this nice workflow.
I noticed a discrepancy between relative_abundance and estimated_number_of_reads in the output of metaphlan. For some clades the estimated_number_of_reads was 0, even though the relative abundance was positive.
A search of the MetaPhlAn forum suggested an update of the database. After updating mpa_vOct22_CHOCOPhlAnSGB_202212 to mpa_vJun23_CHOCOPhlAnSGB_202307 it now works fine for me.
So my suggestion would be to update the default DB in the MIntO workflow to the Jun23 version.

Best regards,
Nicole
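For anyone who wants to check their own profiles for the discrepancy described above, here is a minimal sketch in plain Python. The helper name `zero_read_clades` is hypothetical, the default column names follow the wording of this issue and may differ from the header in a real MetaPhlAn table (pass the actual names as arguments), and the toy profile is invented purely for illustration.

```python
import csv
import io


def zero_read_clades(profile_text,
                     abundance_col="relative_abundance",
                     reads_col="estimated_number_of_reads"):
    """Return clades with positive relative abundance but zero estimated reads.

    Assumes a tab-separated table whose header line starts with '#clade_name';
    other lines starting with '#' are treated as comments and skipped.
    """
    lines = [
        ln for ln in profile_text.splitlines()
        if ln.startswith("#clade_name") or (ln and not ln.startswith("#"))
    ]
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t")
    flagged = []
    for row in reader:
        if float(row[abundance_col]) > 0 and int(row[reads_col]) == 0:
            flagged.append(row["#clade_name"])
    return flagged


# Toy profile: all values are invented for illustration.
SAMPLE = "\n".join([
    "#toy profile, not real MetaPhlAn output",
    "#clade_name\trelative_abundance\testimated_number_of_reads",
    "clade_A\t60.0\t1200",
    "clade_B\t39.5\t0",
    "clade_C\t0.5\t10",
])

print(zero_read_clades(SAMPLE))
```

An empty list after switching databases would be consistent with the fix reported in this issue, although it does not by itself confirm the cause.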

No values given for wildcard 'metaphlan_version'.

Respected Doctor,
CMD:
snakemake --snakefile $MINTO_DIR/smk/dependencies.smk --use-conda --restart-times 0 --keep-going --latency-wait 60 --jobs 10 --configfile $MINTO_DIR/configuration/dependencies.yaml --conda-prefix $MINTO_DIR/conda_env --cores 10

Here:

Reading configuration yaml file:

4.0.6
WildcardError in file /home/zhaoz/soft/MIntO/smk/dependencies.smk, line 350:
No values given for wildcard 'metaphlan_version'.
File "/home/zhaoz/soft/MIntO/smk/dependencies.smk", line 350, in <module>

The dependencies.smk and dependencies.yaml were taken from here; I only adjusted the directories and cores. I can find the version on line 16, but the error still occurs.

Zhe

dependencies.smk :
`#!/usr/bin/env python

'''
Download and install dependencies

Authors: Carmen Saenz
'''

# configuration yaml file

#import sys
import os.path
from os import path
import glob

metaphlan_index = 'mpa_vOct22_CHOCOPhlAnSGB_202212'
metaphlan_version = '4.0.6'
motus_version = '3.0.3'

#args = sys.argv
#print(args)
#args_idx = sys.argv.index('--configfile')
#print(args_idx)
config_path = 'configuration yaml file' #args[args_idx+1]
print(" *******************************")
print(" Reading configuration yaml file: ") #, config_path)
print(" *******************************")
print(" ")

# Variables from configuration yaml file

if config['minto_dir'] is None:
    #print('ERROR in ')
    print('ERROR in ', config_path, ': minto_dir variable is empty. Please, complete ', config_path)
elif path.exists(config['minto_dir']) is False:
    #print('ERROR in ')
    print('ERROR in ', config_path, ': minto_dir variable path does not exit. Please, complete ', config_path)
else:
    minto_dir=config["minto_dir"]

if config['local_dir'] is None:
    #print('ERROR in ')
    print('ERROR in ', config_path, ': local_dir variable is empty. Please, complete ', config_path)
else:
    local_dir = config['local_dir']

if config['download_threads'] is None:
    print('ERROR in ', config_path, ': download_threads variable is empty. Please, complete ', config_path)
elif type(config['download_threads']) != int:
    print('ERROR in ', config_path, ': download_threads variable is not an integer. Please, complete ', config_path)
else:
    download_threads=config["download_threads"]

if config['download_memory'] is None:
    print('ERROR in ', config_path, ': download_memory variable is empty. Please, complete ', config_path)
elif type(config['download_memory']) != int:
    print('ERROR in ', config_path, ': download_memory variable is not an integer. Please, complete ', config_path)
else:
    download_memory=config["download_memory"]

if config['rRNA_index_threads'] is None:
    print('ERROR in ', config_path, ': rRNA_index_threads variable is empty. Please, complete ', config_path)
elif type(config['rRNA_index_threads']) != int:
    print('ERROR in ', config_path, ': rRNA_index_threads variable is not an integer. Please, complete ', config_path)
else:
    index_threads=config["rRNA_index_threads"]

if config['rRNA_index_memory'] is None:
    print('ERROR in ', config_path, ': rRNA_index_memory variable is empty. Please, complete ', config_path)
elif type(config['rRNA_index_memory']) != int:
    print('ERROR in ', config_path, ': rRNA_index_memory variable is not an integer. Please, complete ', config_path)
else:
    index_memory=config["rRNA_index_memory"]

def rRNA_db_out():
    files = ["rfam-5.8s-database-id98.fasta",
             "rfam-5s-database-id98.fasta",
             "silva-arc-16s-id95.fasta",
             "silva-arc-23s-id98.fasta",
             "silva-bac-16s-id90.fasta",
             "silva-bac-23s-id98.fasta",
             "silva-euk-18s-id95.fasta",
             "silva-euk-28s-id98.fasta",
             "idx/rRNA_db_index.log"]
    result = expand("{somewhere}/data/rRNA_databases/{file}",
                    somewhere = minto_dir,
                    file = files)
    return(result)

def eggnog_db_out():
    files = ["download_eggnog_data.py",
             "data/eggnog.db",
             "data/eggnog_proteins.dmnd",
             "data/eggnog.taxa.db",
             "data/eggnog.taxa.db.traverse.pkl",
             "data/mmseqs",
             "data/pfam"]
    result = expand("{somewhere}/data/eggnog_data/{file}",
                    somewhere = minto_dir,
                    file = files)
    return(result)

def Kofam_db_out():
    files = ["ko_list",
             "profiles",
             "README"]
    result = expand("{somewhere}/data/kofam_db/{file}",
                    somewhere = minto_dir,
                    file = files)
    return(result)

def dbCAN_db_out():
    files = ["CAZyDB.09242021.fa",
             "dbCAN.txt",
             "tcdb.fa",
             "tf-1.hmm",
             "tf-2.hmm",
             "stp.hmm"]
    result = expand("{somewhere}/data/dbCAN_db/{file}",
                    somewhere = minto_dir,
                    file = files)
    return(result)
print(metaphlan_version)
def metaphlan_db_out():
    result=expand("{minto_dir}/data/metaphlan/{metaphlan_version}/{metaphlan_index}_VINFO.csv",
                  minto_dir=minto_dir,
                  metaphlan_version=metaphlan_version,
                  metaphlan_index=metaphlan_index)
    return(result)

def motus_db_out():
    result=expand("{minto_dir}/logs/motus_download_db_checkpoint.log",
                  minto_dir=minto_dir)
    return(result)

def checkm2_db_out():
    result=expand("{minto_dir}/data/CheckM2_database/uniref100.KO.1.dmnd",
                  minto_dir=minto_dir)
    return(result)

def fetchMGs_out():
    result=expand("{minto_dir}/logs/fetchMGs_download.done",
                  minto_dir=minto_dir)
    return(result)

def all_env_out():
    files = ["vamb_env.log",
             "r_pkgs.log",
             "mags_env.log"]
    result = expand("{somewhere}/logs/{file}",
                    somewhere = minto_dir,
                    file = files)
    return(result)

# Define all the outputs needed by target 'all'

rule all:
    input:
        checkm2_db_out(),
        rRNA_db_out(),
        eggnog_db_out(),
        Kofam_db_out(),
        dbCAN_db_out(),
        metaphlan_db_out(),
        motus_db_out(),
        fetchMGs_out(),
        all_env_out()

###############################################################################################
# Download and index rRNA database - SortMeRNA
###############################################################################################
rule rRNA_db_download:
    output:
        "{somewhere}/rRNA_databases/{something}.fasta"
    resources: mem=index_memory
    threads: index_threads
    conda:
        config["minto_dir"]+"/envs/MIntO_base.yml" #sortmerna
    shell:
        """
        mkdir -p {wildcards.somewhere}/rRNA_databases
        cd {wildcards.somewhere}/rRNA_databases
        wget --quiet https://raw.githubusercontent.com/biocore/sortmerna/master/data/rRNA_databases/{wildcards.something}.fasta
        """

def get_rRNA_db_index_input(wildcards):
    files = ["rfam-5.8s-database-id98.fasta",
             "rfam-5s-database-id98.fasta",
             "silva-arc-16s-id95.fasta",
             "silva-arc-23s-id98.fasta",
             "silva-bac-16s-id90.fasta",
             "silva-bac-23s-id98.fasta",
             "silva-euk-18s-id95.fasta",
             "silva-euk-28s-id98.fasta"]
    result = expand("{somewhere}/data/rRNA_databases/{file}",
                    somewhere = wildcards.somewhere,
                    file = files)
    return(result)

rule rRNA_db_index:
    input:
        get_rRNA_db_index_input
    output:
        rRNA_db_index_file = "{somewhere}/data/rRNA_databases/idx/rRNA_db_index.log",
        rRNA_db_index = directory("{somewhere}/data/rRNA_databases/idx")
    params:
        tmp_sortmerna_index=lambda wildcards: "{local_dir}/MIntO.rRNA_index".format(local_dir=local_dir),
    resources: mem=index_memory
    threads: index_threads
    log:
        "{somewhere}/logs/rRNA_db_index.log"
    conda:
        config["minto_dir"]+"/envs/MIntO_base.yml" #sortmerna
    shell:
        """
        mkdir -p {params.tmp_sortmerna_index}/idx/
        dboption=$(echo {input} | sed "s/ / --ref /g")
        time (sortmerna --workdir {params.tmp_sortmerna_index} --idx-dir {params.tmp_sortmerna_index}/idx/ --index 1 --ref $dboption --threads {threads}
        rsync {params.tmp_sortmerna_index}/idx/* {output.rRNA_db_index}
        echo 'SortMeRNA indexed rRNA_databases done' > {output.rRNA_db_index_file}) >& {log}
        rm -rf {params.tmp_sortmerna_index}
        """

###############################################################################################
# Download Eggnog database - eggNOGmapper
###############################################################################################

rule eggnog_db:
    output:
        eggnog_py="{minto_dir}/data/eggnog_data/download_eggnog_data.py",
        eggnog_db1="{minto_dir}/data/eggnog_data/data/eggnog.db",
        eggnog_db2="{minto_dir}/data/eggnog_data/data/eggnog_proteins.dmnd",
        eggnog_db3="{minto_dir}/data/eggnog_data/data/eggnog.taxa.db",
        eggnog_db4="{minto_dir}/data/eggnog_data/data/eggnog.taxa.db.traverse.pkl",
        eggnog_db5=directory("{minto_dir}/data/eggnog_data/data/mmseqs"),
        eggnog_db6=directory("{minto_dir}/data/eggnog_data/data/pfam"),
    params:
        eggnog_db= lambda wildcards: "{minto_dir}/data/eggnog_data/".format(minto_dir = minto_dir) #config["EGGNOG_db"]
    resources: mem=download_memory
    threads: download_threads
    log:
        "{minto_dir}/logs/eggnog_db_download.log"
    conda:
        config["minto_dir"]+"/envs/gene_annotation.yml"
    shell:
        """
        mkdir -p {minto_dir}/data/eggnog_data/data
        time (cd {minto_dir}/data/eggnog_data/
        wget https://raw.githubusercontent.com/eggnogdb/eggnog-mapper/master/download_eggnog_data.py
        printf "y\ny\ny\ny\ny\n" |python3 {minto_dir}/data/eggnog_data/download_eggnog_data.py --data_dir {minto_dir}/data/eggnog_data/data -P -M -f
        echo 'eggNOG database downloaded') &> {log}
        """

###############################################################################################
# Download KEGG database - KOfamScan
###############################################################################################
rule Kofam_db:
    output:
        kofam_db1="{minto_dir}/data/kofam_db/ko_list",
        #kofam_db2="{minto_dir}/data/kofam_db/profiles.tar",
        kofam_db3=directory("{minto_dir}/data/kofam_db/profiles"), #DIRECTORY xxx
        kofam_db4="{minto_dir}/data/kofam_db/README"
    resources: mem=download_memory
    threads: download_threads
    log:
        "{minto_dir}/logs/kofam_db_download.log"
    conda:
        config["minto_dir"]+"/envs/gene_annotation.yml"
    shell:
        """
        mkdir -p {minto_dir}/data/kofam_db/
        time (cd {minto_dir}/data/kofam_db/
        wget ftp://ftp.genome.jp/pub/db/kofam/*
        gunzip {minto_dir}/data/kofam_db/ko_list.gz
        tar -zxvf {minto_dir}/data/kofam_db/profiles.tar.gz
        echo 'KEGG database downloaded') &> {log}
        """

###############################################################################################
# Download dbCAN database - run_dbcan
###############################################################################################

rule dbCAN_db:
    output:
        dbCAN_db1="{minto_dir}/data/dbCAN_db/CAZyDB.09242021.fa",
        dbCAN_db2="{minto_dir}/data/dbCAN_db/dbCAN.txt",
        dbCAN_db3="{minto_dir}/data/dbCAN_db/tcdb.fa",
        dbCAN_db4="{minto_dir}/data/dbCAN_db/tf-1.hmm",
        dbCAN_db5="{minto_dir}/data/dbCAN_db/tf-2.hmm",
        dbCAN_db6="{minto_dir}/data/dbCAN_db/stp.hmm",
    resources: mem=download_memory
    threads: download_threads
    log:
        "{minto_dir}/logs/dbCAN_db_download.log"
    conda:
        config["minto_dir"]+"/envs/gene_annotation.yml"
    shell:
        """
        mkdir -p {minto_dir}/data/dbCAN_db/
        cd {minto_dir}/data/dbCAN_db/
        time (wget https://bcb.unl.edu/dbCAN2/download/Databases/CAZyDB.09242021.fa
        diamond makedb --in CAZyDB.09242021.fa -d CAZy
        wget https://bcb.unl.edu/dbCAN2/download/Databases/V10/dbCAN-HMMdb-V10.txt
        mv dbCAN-HMMdb-V10.txt dbCAN.txt
        hmmpress dbCAN.txt
        wget https://bcb.unl.edu/dbCAN2/download/Databases/tcdb.fa
        diamond makedb --in tcdb.fa -d tcdb
        wget https://bcb.unl.edu/dbCAN2/download/Databases/tf-1.hmm
        hmmpress tf-1.hmm
        wget https://bcb.unl.edu/dbCAN2/download/Databases/tf-2.hmm
        hmmpress tf-2.hmm
        wget https://bcb.unl.edu/dbCAN2/download/Databases/stp.hmm
        hmmpress stp.hmm
        echo 'dbCAN database downloaded and installed') &> {log}
        """

# https://github.com/linnabrown/run_dbcan
# Database Installation.
# git clone https://github.com/linnabrown/run_dbcan.git
# cd run_dbcan
# test -d db || mkdir db
# cd db \
# && wget http://bcb.unl.edu/dbCAN2/download/CAZyDB.09242021.fa && diamond makedb --in CAZyDB.09242021.fa -d CAZy \
# && wget https://bcb.unl.edu/dbCAN2/download/Databases/V10/dbCAN-HMMdb-V10.txt && mv dbCAN-HMMdb-V10.txt dbCAN.txt && hmmpress dbCAN.txt \
# && wget http://bcb.unl.edu/dbCAN2/download/Databases/tcdb.fa && diamond makedb --in tcdb.fa -d tcdb \
# && wget http://bcb.unl.edu/dbCAN2/download/Databases/tf-1.hmm && hmmpress tf-1.hmm \
# && wget http://bcb.unl.edu/dbCAN2/download/Databases/tf-2.hmm && hmmpress tf-2.hmm \
# && wget http://bcb.unl.edu/dbCAN2/download/Databases/stp.hmm && hmmpress stp.hmm
# DATABASES Installation
# https://bcb.unl.edu/dbCAN2/download/Databases/ #Databse -- Database Folder
# https://bcb.unl.edu/dbCAN2/download/Databases/CAZyDB.09242021.fa #CAZy.fa--use diamond makedb --in CAZyDB.09242021.fa -d CAZy
# [CAZyme]:included in eCAMI.
# [EC]: included in eCAMI.
# https://bcb.unl.edu/dbCAN2/download/Databases/V10/dbCAN-HMMdb-V10.txt #dbCAN-HMMdb-V10.txt--First use mv dbCAN-HMMdb-V10.txt dbCAN.txt, then use hmmpress dbCAN.txt
# https://bcb.unl.edu/dbCAN2/download/Databases/tcdb.fa #tcdb.fa--use diamond makedb --in tcdb.fa -d tcdb
# https://bcb.unl.edu/dbCAN2/download/Databases/tf-1.hmm #tf-1.hmm--use hmmpress tf-1.hmm
# https://bcb.unl.edu/dbCAN2/download/Databases/tf-2.hmm #tf-2.hmm--use hmmpress tf-2.hmm
# https://bcb.unl.edu/dbCAN2/download/Databases/stp.hmm #stp.hmm--use hmmpress stp.hmm

###############################################################################################
# Download metaphlan database - MetaPhlAn4
###############################################################################################

rule metaphlan_db:
    output:
        metaphlan_db=expand("{minto_dir}/data/metaphlan/{metaphlan_version}/{metaphlan_index}_VINFO.csv",
                            minto_dir=minto_dir,
                            metaphlan_index=metaphlan_index),
    resources:
        mem=download_memory
    threads:
        download_threads
    log:
        expand("{minto_dir}/logs/metaphlan_{metaphlan_version}_{metaphlan_index}_download_db.log",
               minto_dir=minto_dir,
               metaphlan_version=metaphlan_version,
               metaphlan_index=metaphlan_index)
    conda:
        config["minto_dir"]+"/envs/metaphlan.yml"
    shell:
        """
        mkdir -p {minto_dir}/data/metaphlan/{metaphlan_version}
        time (
        metaphlan --version
        metaphlan --install --index {metaphlan_index} --bowtie2db {minto_dir}/data/metaphlan/{metaphlan_version}/
        if [ $? -eq 0 ]; then
            echo 'MetaPhlAn database download: OK'
            echo "{metaphlan_index}" > {minto_dir}/data/metaphlan/{metaphlan_version}/mpa_latest
        else
            echo 'MetaPhlAn database download: FAIL'
        fi) &> {log}
        """

###############################################################################################
# Download mOTUs database - mOTUs3
###############################################################################################

rule motus_db:
    output:
        "{minto_dir}/data/motus/db.{motus_version}.downloaded"
    resources:
        mem=download_memory
    threads:
        download_threads
    log:
        "{minto_dir}/logs/motus_download_db.log"
    conda:
        config["minto_dir"]+"/envs/motus_env.yml"
    shell:
        """
        time (
        motus downloadDB
        if [ $? -eq 0 ]; then
            echo 'mOTUs3 database download: OK'
            echo OK > {output}
        else
            echo 'mOTUs3 database download: FAIL'
        fi
        ) &> {log}
        """

###############################################################################################
# Download CheckM2 database
###############################################################################################

rule checkm2_db:
    output:
        "{minto_dir}/data/CheckM2_database/uniref100.KO.1.dmnd"
    resources:
        mem=download_memory
    threads:
        download_threads
    log:
        "{minto_dir}/logs/checkm2_download_db.log"
    conda:
        config["minto_dir"]+"/envs/checkm2.yml"
    shell:
        """
        time (
        checkm2 database --download --path {minto_dir}/data
        if [ $? -eq 0 ]; then
            echo 'CheckM2 database download: OK'
        else
            echo 'CheckM2 database download: FAIL'
        fi
        ) &> {log}
        """

###############################################################################################
# Download fetchMGs
###############################################################################################

rule download_fetchMGs:
    output:
        done="{minto_dir}/logs/fetchMGs_download.done",
        data=directory("{minto_dir}/data/fetchMGs-1.2")
    resources:
        mem=download_memory
    threads:
        download_threads
    log:
        "{minto_dir}/logs/fetchMGs_download.log"
    shell:
        """
        time (
        cd {minto_dir}/data/
        wget -O fetchMGs-1.2.tar.gz https://github.com/motu-tool/fetchMGs/archive/refs/tags/v1.2.tar.gz
        tar xfz fetchMGs-1.2.tar.gz
        if [ $? -eq 0 ]; then
            echo 'fetchMGs download: OK'
            echo OK > {output.done}
        else
            echo 'fetchMGs download: FAIL'
        fi
        rm fetchMGs-1.2.tar.gz
        ) &> {log}
        """

###############################################################################################
# Generate conda environments
###############################################################################################

rule r_pkgs:
    output:
        r_pkgs="{minto_dir}/logs/r_pkgs.log"
    resources:
        mem=download_memory
    threads:
        download_threads
    conda:
        config["minto_dir"]+"/envs/r_pkgs.yml"
    shell:
        """
        time (echo 'r_pkgs environment generated') &> {output}
        """

rule mags_gen_vamb:
    output:
        vamb_env="{minto_dir}/logs/vamb_env.log"
    resources:
        mem=download_memory
    threads:
        download_threads
    log:
        "{minto_dir}/logs/vamb_env.log"
    conda:
        config["minto_dir"]+"/envs/avamb.yml"
    shell:
        """
        time (
        echo 'VAMB environment generated') &> {log}
        """

rule mags_gen:
    output:
        mags_env="{minto_dir}/logs/mags_env.log"
    resources:
        mem=download_memory
    threads:
        download_threads
    conda:
        config["minto_dir"]+"/envs/mags.yml"
    shell:
        """
        time (echo 'mags environment generated') &> {output}
        """

rule mags_gen_py36:
    output:
        py36_env="{minto_dir}/logs/py36_env.log"
    resources:
        mem=download_memory
    threads:
        download_threads
    log:
        "{minto_dir}/logs/py36_env.log"
    conda:
        config["minto_dir"]+"/envs/py36_env.yml"
    shell:
        """
        time (
        echo 'Python 3.6 environment generated') &> {log}
        """
`
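One plausible reading of the WildcardError above, offered as a sketch rather than a confirmed diagnosis: in the pasted file, the `expand()` call in `rule metaphlan_db` uses the `{metaphlan_version}` placeholder but passes only `minto_dir` and `metaphlan_index`. The plain-Python helper below (`missing_placeholders` is hypothetical and not part of Snakemake or MIntO; `/opt/MIntO` is a placeholder path) lists the placeholders in a pattern that have no matching keyword.

```python
from string import Formatter


def missing_placeholders(pattern, **values):
    """Return placeholder names used in `pattern` that have no entry in `values`.

    Hypothetical helper for illustration; it mimics the kind of check that
    leads to "No values given for wildcard ..." but is not Snakemake code.
    """
    missing = []
    for _literal, field, _spec, _conv in Formatter().parse(pattern):
        # `field` is None for trailing literal text with no placeholder.
        if field and field not in values and field not in missing:
            missing.append(field)
    return missing


pattern = "{minto_dir}/data/metaphlan/{metaphlan_version}/{metaphlan_index}_VINFO.csv"

# Keywords as they appear in the pasted rule: metaphlan_version is not passed.
as_posted = missing_placeholders(
    pattern,
    minto_dir="/opt/MIntO",
    metaphlan_index="mpa_vOct22_CHOCOPhlAnSGB_202212",
)

# Same call with the version supplied as well.
with_version = missing_placeholders(
    pattern,
    minto_dir="/opt/MIntO",
    metaphlan_index="mpa_vOct22_CHOCOPhlAnSGB_202212",
    metaphlan_version="4.0.6",
)

print(as_posted)
print(with_version)
```

If this reading is right, passing `metaphlan_version=metaphlan_version` to that `expand()` call, as the `metaphlan_db_out()` function and the `log:` directive in the same file already do, would be the change to try.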

Installation failed

Hello, I would like to ask what is causing this error. I could not find a way to solve it. Thank you!

snakemake --use-conda --restart-times 1 --keep-going --latency-wait 60 --jobs 32 --cores 64 --conda-prefix /home/pc/software/miniconda3/envs/MIntO --snakefile $MINTO_DIR/smk/dependencies.smk --configfile $MINTO_DIR/configuration/dependencies.yaml

Traceback (most recent call last):
  File "/home/pc/software/miniconda3/envs/MIntO/lib/python3.12/site-packages/snakemake/cli.py", line 1887, in args_to_api
    dag_api = workflow_api.dag(
              ^^^^^^^^^^^^^^^^^
  File "/home/pc/software/miniconda3/envs/MIntO/lib/python3.12/site-packages/snakemake/api.py", line 326, in dag
    return DAGApi(
           ^^^^^^^
  File "<string>", line 6, in __init__
  File "/home/pc/software/miniconda3/envs/MIntO/lib/python3.12/site-packages/snakemake/api.py", line 421, in __post_init__
    self.workflow_api._workflow.dag_settings = self.dag_settings
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pc/software/miniconda3/envs/MIntO/lib/python3.12/site-packages/snakemake/api.py", line 372, in _workflow
    workflow.include(
  File "/home/pc/software/miniconda3/envs/MIntO/lib/python3.12/site-packages/snakemake/workflow.py", line 1369, in include
    exec(compile(code, snakefile.get_path_or_uri(), "exec"), self.globals)
  File "/root/data/APP_MIntO/MIntO/smk/dependencies.smk", line 15, in <module>
    include: 'include/cmdline_validator.smk'
  File "/home/pc/software/miniconda3/envs/MIntO/lib/python3.12/site-packages/snakemake/workflow.py", line 1369, in include
    exec(compile(code, snakefile.get_path_or_uri(), "exec"), self.globals)
  File "/root/data/APP_MIntO/MIntO/smk/include/cmdline_validator.smk", line 11, in <module>
    if workflow.shadow_prefix is None:
       ^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'Workflow' object has no attribute 'shadow_prefix'
