Update: I read the wiki and created a Snakemake profile. When I try to run the install on the HPC I get an error; see below.
Thanks!
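For reference, the profile is a minimal Slurm one along the lines the wiki describes (illustrative only; the file path, sbatch options, and resource names here are assumptions and will vary by cluster):

```yaml
# e.g. ~/.config/snakemake/slurm/config.yaml -- illustrative sketch only
jobs: 100
latency-wait: 60
cluster: >-
  sbatch
  --cpus-per-task={threads}
  --mem={resources.mem_mb}
  --time={resources.time}
  --output=logs/{rule}_%j.out
default-resources:
  - mem_mb=2000
  - time=1440
```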
(/lustre/project/taw/share/conda-envs/hecatomb) [kvigil@cypress2 slurm]$ hecatomb install --profile slurm
██╗ ██╗███████╗ ██████╗ █████╗ ████████╗ ██████╗ ███╗ ███╗██████╗
██║ ██║██╔════╝██╔════╝██╔══██╗╚══██╔══╝██╔═══██╗████╗ ████║██╔══██╗
███████║█████╗ ██║ ███████║ ██║ ██║ ██║██╔████╔██║██████╔╝
██╔══██║██╔══╝ ██║ ██╔══██║ ██║ ██║ ██║██║╚██╔╝██║██╔══██╗
██║ ██║███████╗╚██████╗██║ ██║ ██║ ╚██████╔╝██║ ╚═╝ ██║██████╔╝
╚═╝ ╚═╝╚══════╝ ╚═════╝╚═╝ ╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═╝╚═════╝
Hecatomb version v1.1.0
[2023:01:26 16:07:35] Config file hecatomb.config.yaml already exists. Using existing config file.
[2023:01:26 16:07:35] Writing runtime config file to hecatomb.config.yaml
[2023:01:26 16:07:35] ------------------
[2023:01:26 16:07:35] | Runtime config |
[2023:01:26 16:07:35] ------------------
BigJobCpu: 24
BigJobMem: 64000
BigJobTimeMin: 1440
COMPRESSION: 1
CONTIG_MINLENGTH: 1000
CUTTAIL_WINDOW: 25
DEDUP_ACCURACY: 4
Databases: null
ENTROPY: 0.5
ENTROPYWINDOW: 25
Host: human
MediumJobCpu: 16
MediumJobMem: 32000
MoreRamCpu: 2
MoreRamMem: 16000
Output: hecatomb.out
Preprocess: paired
QSCORE: 15
READ_MINLENGTH: 90
Reads: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data
Search: sensitive
SmallJobCpu: 8
SmallJobMem: 16000
canuSettings: correctedErrorRate=0.16 maxInputCoverage=10000 minInputCoverage=0 corOutCoverage=10000 corMhapSensitivity=high corMinCoverage=0 useGrid=False stopOnLowCoverage=False genomeSize=10M -nanopore
filtAAprimary: --min-length 30 -e 1e-3
filtAAsecondary: --min-length 30 -e 1e-5
filtNTprimary: --min-length 90 -e 1e-10
filtNTsecondary: --min-length 90 -e 1e-20
linclustParams: --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3
perfAA: --start-sens 1 --sens-steps 3 -s 7 --lca-mode 2 --shuffle 0
perfAAfast: -s 4.0 --lca-mode 2 --shuffle 0
perfNT: --start-sens 2 -s 7 --sens-steps 3
perfNTfast: -s 4.0
taxIdIgnore: 0 1 2 10239 131567 12429 2759
[2023:01:26 16:07:35] ---------------------
[2023:01:26 16:07:35] | Snakemake command |
[2023:01:26 16:07:35] ---------------------
snakemake -s /lustre/project/taw/share/conda-envs/hecatomb/bin/../snakemake/workflow/DownloadDB.smk --configfile hecatomb.config.yaml --rerun-incomplete --printshellcmds --nolock --show-failed-logs --profile slurm
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/config.yaml is extended by additional config specified via the command line.
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/dbFiles.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Using shell: /bin/bash
Provided cluster nodes: 100
Job stats:
job count min threads max threads
all 1 1 1
download_db_file 7 1 1
total 8 1 1
Select jobs to execute...
[Thu Jan 26 16:07:39 2023]
rule download_db_file:
output: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/nt/virus_primary_nt/sequenceDB
jobid: 45
reason: Forced execution
wildcards: file=nt/virus_primary_nt/sequenceDB
resources: mem_mb=2000, mem_mib=1908, disk_mb=1000, disk_mib=954, tmpdir=<TBD>, time=1440
Traceback (most recent call last):
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/exceptions.py", line 1079, in __call__
return func(*args, **kwargs)
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/cli/main.py", line 76, in _main
init_loggers(context)
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/cli/main.py", line 58, in init_loggers
if context and context.json:
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/common/configuration.py", line 1207, in __get__
matches = [self.type.load(self.name, match) for match in raw_matches]
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/common/configuration.py", line 1207, in <listcomp>
matches = [self.type.load(self.name, match) for match in raw_matches]
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/common/configuration.py", line 974, in load
match.value(self._element_type),
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/common/configuration.py", line 267, in value
return make_immutable(self._raw_value)
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/_vendor/auxlib/collection.py", line 24, in make_immutable
elif isiterable(value):
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/_vendor/auxlib/compat.py", line 37, in isiterable
return not isinstance(obj, string_types) and isinstance(obj, collections.Iterable)
AttributeError: module 'collections' has no attribute 'Iterable'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/lustre/project/taw/share/conda-envs/hecatomb/bin/conda", line 13, in <module>
sys.exit(main())
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/cli/main.py", line 150, in main
return conda_exception_handler(_main, *args, **kwargs)
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/exceptions.py", line 1371, in conda_exception_handler
return_value = exception_handler(func, *args, **kwargs)
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/exceptions.py", line 1082, in __call__
return self.handle_exception(exc_val, exc_tb)
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/exceptions.py", line 1126, in handle_exception
return self.handle_unexpected_exception(exc_val, exc_tb)
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/exceptions.py", line 1137, in handle_unexpected_exception
self.print_unexpected_error_report(error_report)
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/exceptions.py", line 1193, in print_unexpected_error_report
if context.json:
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/common/configuration.py", line 1207, in __get__
matches = [self.type.load(self.name, match) for match in raw_matches]
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/common/configuration.py", line 1207, in <listcomp>
matches = [self.type.load(self.name, match) for match in raw_matches]
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/common/configuration.py", line 974, in load
match.value(self._element_type),
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/common/configuration.py", line 267, in value
return make_immutable(self._raw_value)
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/_vendor/auxlib/collection.py", line 24, in make_immutable
elif isiterable(value):
File "/share/apps/anaconda/3/2020.07/lib/python3.8/site-packages/conda/_vendor/auxlib/compat.py", line 37, in isiterable
return not isinstance(obj, string_types) and isinstance(obj, collections.Iterable)
AttributeError: module 'collections' has no attribute 'Iterable'
Traceback (most recent call last):
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/__init__.py", line 757, in snakemake
success = workflow.execute(
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/workflow.py", line 1089, in execute
raise e
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/workflow.py", line 1085, in execute
success = self.scheduler.schedule()
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/scheduler.py", line 592, in schedule
self.run(runjobs)
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/scheduler.py", line 641, in run
executor.run_jobs(
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/executors/__init__.py", line 155, in run_jobs
self.run(
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/executors/__init__.py", line 1156, in run
self.write_jobscript(job, jobscript)
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/executors/__init__.py", line 884, in write_jobscript
exec_job = self.format_job_exec(job)
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/executors/__init__.py", line 442, in format_job_exec
self.general_args,
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/common/__init__.py", line 218, in __get__
value = self.method(instance)
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/executors/__init__.py", line 334, in general_args
w2a("conda_base_path", skip=not self.assume_shared_fs),
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/executors/__init__.py", line 294, in workflow_property_to_arg
value = getattr(self.workflow, property)
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/workflow.py", line 300, in conda_base_path
return Conda().prefix_path
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/deployment/conda.py", line 667, in __init__
shell.check_output(self._get_cmd(f"conda info --json"), text=True)
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/shell.py", line 63, in check_output
return sp.check_output(cmd, shell=True, executable=executable, **kwargs)
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/subprocess.py", line 421, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'conda info --json' returned non-zero exit status 1.
[2023:01:26 16:07:41] Error: Snakemake failed
from hecatomb.
This is a weird issue and I'm guessing there's something wrong with the environment. Did you load Anaconda as a module on your HPC?
collections.Iterable was removed in Python 3.10 (the ABCs now live only in collections.abc), and it looks like the workflow is mixing packages from both Python 3.8 and 3.10. I would suggest trying a fresh install of Miniconda in your home directory with Python 3.10, if that's possible.
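A quick demo of the mismatch (illustrative only, not Hecatomb or conda code):

```python
# Minimal demo of the failure mode above.
# The collections.Iterable alias was removed in Python 3.10;
# collections.abc.Iterable is the portable spelling.
import sys
import collections
import collections.abc

if sys.version_info >= (3, 10):
    try:
        collections.Iterable  # old alias: gone in 3.10+
    except AttributeError as err:
        print(err)  # module 'collections' has no attribute 'Iterable'

# Works on every supported Python version:
print(isinstance([1, 2, 3], collections.abc.Iterable))  # True
```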
Looks like it worked when I loaded the module right before running! Thanks.
(base) [kvigil@cypress1 ~]$ module load anaconda3/2020.07
(base) [kvigil@cypress1 ~]$ conda activate hecatomb
(/lustre/project/taw/share/conda-envs/hecatomb) [kvigil@cypress1 ~]$ hecatomb install --profile slurm
Hecatomb version v1.1.0
[2023:01:27 07:37:56] Config file hecatomb.config.yaml already exists. Using existing config file.
[2023:01:27 07:37:56] Writing runtime config file to hecatomb.config.yaml
[2023:01:27 07:37:56] ------------------
[2023:01:27 07:37:56] | Runtime config |
[2023:01:27 07:37:56] ------------------
[Runtime config identical to the first run above; omitted.]
[2023:01:27 07:37:56] ---------------------
[2023:01:27 07:37:56] | Snakemake command |
[2023:01:27 07:37:56] ---------------------
snakemake -s /lustre/project/taw/share/conda-envs/hecatomb/bin/../snakemake/workflow/DownloadDB.smk --configfile hecatomb.config.yaml --rerun-incomplete --printshellcmds --nolock --show-failed-logs --profile slurm
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/config.yaml is extended by additional config specified via the command line.
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/dbFiles.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Nothing to be done (all requested files are present and up to date).
Complete log: .snakemake/log/2023-01-27T073806.383518.snakemake.log
[2023:01:27 07:38:09] Snakemake finished successfully
So it looks like when I run the HPC test I get a Snakemake failed error:
(/lustre/project/taw/share/conda-envs/hecatomb) [kvigil@cypress1 ~]$ hecatomb test --profile slurm
Hecatomb version v1.1.0
[2023:01:27 07:40:59] Config file hecatomb.config.yaml already exists. Using existing config file.
[2023:01:27 07:40:59] Writing runtime config file to hecatomb.config.yaml
[2023:01:27 07:40:59] ------------------
[2023:01:27 07:40:59] | Runtime config |
[2023:01:27 07:40:59] ------------------
[Runtime config identical to the first run above; omitted.]
[2023:01:27 07:40:59] ---------------------
[2023:01:27 07:40:59] | Snakemake command |
[2023:01:27 07:40:59] ---------------------
snakemake -s /lustre/project/taw/share/conda-envs/hecatomb/bin/../snakemake/workflow/Hecatomb.smk --configfile hecatomb.config.yaml --use-conda --conda-frontend mamba --conda-prefix /lustre/project/taw/share/conda-envs/hecatomb/bin/../snakemake/conda --rerun-incomplete --printshellcmds --nolock --show-failed-logs --profile slurm
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/config.yaml is extended by additional config specified via the command line.
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/dbFiles.yaml is extended by additional config specified via the command line.
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/immutable.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
[Same pair of conda tracebacks as in the first comment, both ending in: AttributeError: module 'collections' has no attribute 'Iterable']
Traceback (most recent call last):
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/__init__.py", line 757, in snakemake
success = workflow.execute(
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/workflow.py", line 928, in execute
dag.create_conda_envs(
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/dag.py", line 338, in create_conda_envs
env.create(dryrun)
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/deployment/conda.py", line 405, in create
pin_file = self.pin_file
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/common/__init__.py", line 218, in __get__
value = self.method(instance)
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/deployment/conda.py", line 109, in pin_file
f".{self.conda.platform}.pin.txt"
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/common/__init__.py", line 218, in __get__
value = self.method(instance)
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/deployment/conda.py", line 102, in conda
return Conda(
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/deployment/conda.py", line 667, in __init__
shell.check_output(self._get_cmd(f"conda info --json"), text=True)
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/site-packages/snakemake/shell.py", line 63, in check_output
return sp.check_output(cmd, shell=True, executable=executable, **kwargs)
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/subprocess.py", line 421, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'conda info --json' returned non-zero exit status 1.
[2023:01:27 07:41:06] Error: Snakemake failed
2023-01-27T102534.391649.snakemake.log
2023-01-27T102517.189801.snakemake.log
2023-01-27T102344.153011.snakemake.log
2023-01-27T102208.438015.snakemake.log
2023-01-27T102146.386328.snakemake.log
2023-01-27T102120.713968.snakemake.log
2023-01-27T074100.928790.snakemake.log
2023-01-27T073806.383518.snakemake.log
Hi, did anybody check out these logs? Thanks, Katie
Hi,
Sorry, I meant to follow up on this. I'm not sure which of those logs is the latest, but based on what you copy-pasted it looks like the same issue as before, and I suspect the Anaconda module is too old. I would try installing Miniconda with Python 3.10; you should be able to install a local copy in your home directory even on an HPC.
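If it helps, a local install goes something like this (commands are illustrative; adjust the install prefix to taste):

```shell
# Illustrative only -- install a personal Miniconda (with a current Python)
# under $HOME so the old site-wide anaconda/2020.07 module is not needed.
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p "$HOME/miniconda3"
source "$HOME/miniconda3/etc/profile.d/conda.sh"
conda activate base
python --version   # should report 3.10 or newer
conda config --set channel_priority strict   # recommended by conda-forge
```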
@beardymcjohnface Hi, thank you! I will definitely try this out.
Hi, I am still getting a Snakemake error. Also, it doesn't look like this run was saved to a log file as indicated below. Thanks!
(/lustre/project/taw/share/conda-envs/hecatomb) [kvigil@cypress2 conda-envs]$ unset PYTHONPATH
(/lustre/project/taw/share/conda-envs/hecatomb) [kvigil@cypress2 conda-envs]$ hecatomb test --profile slurm
Hecatomb version v1.1.0
[2023:03:06 13:20:24] Config file hecatomb.config.yaml already exists. Using existing config file.
[2023:03:06 13:20:24] Writing runtime config file to hecatomb.config.yaml
[2023:03:06 13:20:24] ------------------
[2023:03:06 13:20:24] | Runtime config |
[2023:03:06 13:20:24] ------------------
[Runtime config identical to the first run above; omitted.]
[2023:03:06 13:20:24] ---------------------
[2023:03:06 13:20:24] | Snakemake command |
[2023:03:06 13:20:24] ---------------------
snakemake -s /lustre/project/taw/share/conda-envs/hecatomb/bin/../snakemake/workflow/Hecatomb.smk --configfile hecatomb.config.yaml --use-conda --conda-frontend mamba --conda-prefix /lustre/project/taw/share/conda-envs/hecatomb/bin/../snakemake/conda --rerun-incomplete --printshellcmds --nolock --show-failed-logs --profile slurm
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/config.yaml is extended by additional config specified via the command line.
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/dbFiles.yaml is extended by additional config specified via the command line.
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/immutable.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Using shell: /bin/bash
Provided cluster nodes: 100
Job stats:
job count min threads max threads
PRIMARY_AA_parsing 1 2 2
PRIMARY_AA_taxonomy_assignment 1 24 24
PRIMARY_NT_parsing 1 2 2
PRIMARY_NT_reformat 1 2 2
PRIMARY_NT_taxonomic_assignment 1 24 24
SECONDARY_AA_generate_output_table 1 2 2
SECONDARY_AA_parsing 1 2 2
SECONDARY_AA_refactor_finalize 1 2 2
SECONDARY_AA_taxonomy_assignment 1 24 24
SECONDARY_AA_tophit_lineage 1 2 2
SECONDARY_NT_convert 1 24 24
SECONDARY_NT_generate_output_table 1 2 2
SECONDARY_NT_summary 1 2 2
SECONDARY_NT_taxonomic_assignment 1 24 24
all 1 1 1
archive_for_assembly 10 1 1
assembly_kmer_normalization 10 8 8
bam_index 1 2 2
calculate_gc 2 2 2
calculate_tet_freq 2 2 2
cluster_similar_sequences 10 24 24
combine_AA_NT 1 1 1
concatenate_contigs 1 1 1
concatentate_contig_count_tables 1 1 1
contig_krona_plot 1 1 1
contig_krona_text_format 1 1 1
contig_read_taxonomy 1 2 2
coverage_calculations 10 8 8
create_contig_count_table 10 1 1
create_host_index 1 8 8
create_individual_seqtables 10 24 24
dumpSamplesTsv 1 1 1
fastp_preprocessing 4 16 16
host_removal_mapping 10 8 8
individual_sample_assembly 10 16 16
krona_plot 1 1 1
krona_text_format 1 1 1
link_assembly 1 1 1
mapSampleAssemblyPairedReads 10 8 8
mapSampleAssemblyUnpairedReads 10 8 8
map_seq_table 1 16 16
merge_seq_table 1 1 1
mmseqs_contig_annotation 1 24 24
mmseqs_contig_annotation_summary 1 24 24
nonhost_read_combine 10 1 1
nonhost_read_repair 10 8 8
poolR1Unmapped 1 1 1
poolR2Unmapped 1 1 1
poolUnpairedUnmapped 1 1 1
population_assembly 1 16 16
pullPairedUnmappedReads 10 8 8
pullPairedUnmappedReadsMateMapped 10 8 8
rescue_read_kmer_normalization 1 24 24
secondary_nt_calc_lca 1 24 24
secondary_nt_lca_table 1 1 1
seq_properties_table 2 2 2
tax_level_counts 1 2 2
unmapped_read_rescue_assembly 1 16 16
zip_fastq 60 1 1
total 250 1 24
Select jobs to execute...
[Mon Mar 6 13:20:34 2023]
rule fastp_preprocessing:
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.json, hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-258-124-06_CGTACG.log
jobid: 43
benchmark: hecatomb.out/benchmarks/fastp_preprocessing.A13-258-124-06_CGTACG.txt
reason: Missing output files: hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R2.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R1.s1.out.fastq
wildcards: sample=A13-258-124-06_CGTACG
threads: 16
resources: mem_mb=32000, mem_mib=30518, disk_mb=1000, disk_mib=954, tmpdir=<TBD>, time=1440
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-258-124-06_CGTACG.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-258-124-06_CGTACG.log
Submitted job 43 with external jobid 'Submitted batch job 2287003'.
[Mon Mar 6 13:20:34 2023]
rule dumpSamplesTsv:
output: hecatomb.out/results/hecatomb.samples.tsv
jobid: 252
reason: Missing output files: hecatomb.out/results/hecatomb.samples.tsv
resources: mem_mb=2000, mem_mib=1908, disk_mb=1000, disk_mib=954, tmpdir=<TBD>, time=1440
Submitted job 252 with external jobid 'Submitted batch job 2287004'.
[Mon Mar 6 13:20:34 2023]
rule fastp_preprocessing:
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.json, hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-256-117-06_ACTGAT.log
jobid: 49
benchmark: hecatomb.out/benchmarks/fastp_preprocessing.A13-256-117-06_ACTGAT.txt
reason: Missing output files: hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R2.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R1.s1.out.fastq
wildcards: sample=A13-256-117-06_ACTGAT
threads: 16
resources: mem_mb=32000, mem_mib=30518, disk_mb=1000, disk_mib=954, tmpdir=<TBD>, time=1440
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-256-117-06_ACTGAT.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-256-117-06_ACTGAT.log
Submitted job 49 with external jobid 'Submitted batch job 2287005'.
[Mon Mar 6 13:20:34 2023]
rule fastp_preprocessing:
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.json, hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-253-140-06_GTCCGC.log
jobid: 31
benchmark: hecatomb.out/benchmarks/fastp_preprocessing.A13-253-140-06_GTCCGC.txt
reason: Missing output files: hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R2.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R1.s1.out.fastq
wildcards: sample=A13-253-140-06_GTCCGC
threads: 16
resources: mem_mb=32000, mem_mib=30518, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-253-140-06_GTCCGC.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-253-140-06_GTCCGC.log
Submitted job 31 with external jobid 'Submitted batch job 2287006'.
[Mon Mar 6 13:20:34 2023]
rule create_host_index:
input: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz
output: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz.idx
log: hecatomb.out/stderr/create_host_index.log
jobid: 7
benchmark: hecatomb.out/benchmarks/create_host_index.txt
reason: Missing output files: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz.idx
threads: 8
resources: mem_mb=16000, mem_mib=15259, disk_mb=1939, disk_mib=1850, tmpdir=, time=1440
minimap2 -t 8 -d /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz.idx <(cat /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz) 2> hecatomb.out/stderr/create_host_index.log
rm hecatomb.out/stderr/create_host_index.log
Submitted job 7 with external jobid 'Submitted batch job 2287007'.
[Mon Mar 6 13:20:34 2023]
rule fastp_preprocessing:
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.json, hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-256-115-06_GTTTCG.log
jobid: 55
benchmark: hecatomb.out/benchmarks/fastp_preprocessing.A13-256-115-06_GTTTCG.txt
reason: Missing output files: hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R2.s1.out.fastq
wildcards: sample=A13-256-115-06_GTTTCG
threads: 16
resources: mem_mb=32000, mem_mib=30518, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-256-115-06_GTTTCG.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-256-115-06_GTTTCG.log
Submitted job 55 with external jobid 'Submitted batch job 2287008'.
Invalid jobs list: Submitted
[Mon Mar 6 13:20:44 2023]
Error in rule fastp_preprocessing:
jobid: 43
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.json, hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-258-124-06_CGTACG.log (check log file(s) for error details)
conda-env: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/conda/4fabc0d9baac0a21fd5dfed339929234_
shell:
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-258-124-06_CGTACG.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-258-124-06_CGTACG.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
cluster_jobid: Submitted batch job 2287003
Logfile hecatomb.out/stderr/fastp_preprocessing.A13-258-124-06_CGTACG.log not found.
Error executing rule fastp_preprocessing on cluster (jobid: 43, external: Submitted batch job 2287003, jobscript: /lustre/project/taw/share/conda-envs/.snakemake/tmp.skvkwg2w/snakejob.fastp_preprocessing.43.sh). For error details see the cluster log and the log files of the involved rule(s).
Trying to restart job 43.
Select jobs to execute...
[Mon Mar 6 13:20:44 2023]
rule fastp_preprocessing:
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.json, hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-258-124-06_CGTACG.log
jobid: 43
benchmark: hecatomb.out/benchmarks/fastp_preprocessing.A13-258-124-06_CGTACG.txt
reason: Missing output files: hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R2.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R1.s1.out.fastq
wildcards: sample=A13-258-124-06_CGTACG
threads: 16
resources: mem_mb=32000, mem_mib=30518, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-258-124-06_CGTACG.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-258-124-06_CGTACG.log
Submitted job 43 with external jobid 'Submitted batch job 2287009'.
Invalid jobs list: Submitted
[Mon Mar 6 13:20:44 2023]
Error in rule dumpSamplesTsv:
jobid: 252
output: hecatomb.out/results/hecatomb.samples.tsv
cluster_jobid: Submitted batch job 2287004
Error executing rule dumpSamplesTsv on cluster (jobid: 252, external: Submitted batch job 2287004, jobscript: /lustre/project/taw/share/conda-envs/.snakemake/tmp.skvkwg2w/snakejob.dumpSamplesTsv.252.sh). For error details see the cluster log and the log files of the involved rule(s).
Trying to restart job 252.
Select jobs to execute...
[Mon Mar 6 13:20:44 2023]
rule dumpSamplesTsv:
output: hecatomb.out/results/hecatomb.samples.tsv
jobid: 252
reason: Missing output files: hecatomb.out/results/hecatomb.samples.tsv
resources: mem_mb=2000, mem_mib=1908, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
Submitted job 252 with external jobid 'Submitted batch job 2287010'.
Invalid jobs list: Submitted
[Mon Mar 6 13:20:45 2023]
Error in rule fastp_preprocessing:
jobid: 49
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.json, hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-256-117-06_ACTGAT.log (check log file(s) for error details)
conda-env: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/conda/4fabc0d9baac0a21fd5dfed339929234_
shell:
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-256-117-06_ACTGAT.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-256-117-06_ACTGAT.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
cluster_jobid: Submitted batch job 2287005
Logfile hecatomb.out/stderr/fastp_preprocessing.A13-256-117-06_ACTGAT.log not found.
Error executing rule fastp_preprocessing on cluster (jobid: 49, external: Submitted batch job 2287005, jobscript: /lustre/project/taw/share/conda-envs/.snakemake/tmp.skvkwg2w/snakejob.fastp_preprocessing.49.sh). For error details see the cluster log and the log files of the involved rule(s).
Trying to restart job 49.
Select jobs to execute...
[Mon Mar 6 13:20:45 2023]
rule fastp_preprocessing:
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.json, hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-256-117-06_ACTGAT.log
jobid: 49
benchmark: hecatomb.out/benchmarks/fastp_preprocessing.A13-256-117-06_ACTGAT.txt
reason: Missing output files: hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R2.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R1.s1.out.fastq
wildcards: sample=A13-256-117-06_ACTGAT
threads: 16
resources: mem_mb=32000, mem_mib=30518, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-256-117-06_ACTGAT.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-256-117-06_ACTGAT.log
Submitted job 49 with external jobid 'Submitted batch job 2287011'.
Invalid jobs list: Submitted
[Mon Mar 6 13:20:45 2023]
Error in rule fastp_preprocessing:
jobid: 31
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.json, hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-253-140-06_GTCCGC.log (check log file(s) for error details)
conda-env: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/conda/4fabc0d9baac0a21fd5dfed339929234_
shell:
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-253-140-06_GTCCGC.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-253-140-06_GTCCGC.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
cluster_jobid: Submitted batch job 2287006
Logfile hecatomb.out/stderr/fastp_preprocessing.A13-253-140-06_GTCCGC.log not found.
Error executing rule fastp_preprocessing on cluster (jobid: 31, external: Submitted batch job 2287006, jobscript: /lustre/project/taw/share/conda-envs/.snakemake/tmp.skvkwg2w/snakejob.fastp_preprocessing.31.sh). For error details see the cluster log and the log files of the involved rule(s).
Trying to restart job 31.
Select jobs to execute...
[Mon Mar 6 13:20:45 2023]
rule fastp_preprocessing:
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.json, hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-253-140-06_GTCCGC.log
jobid: 31
benchmark: hecatomb.out/benchmarks/fastp_preprocessing.A13-253-140-06_GTCCGC.txt
reason: Missing output files: hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R2.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R1.s1.out.fastq
wildcards: sample=A13-253-140-06_GTCCGC
threads: 16
resources: mem_mb=32000, mem_mib=30518, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-253-140-06_GTCCGC.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-253-140-06_GTCCGC.log
Submitted job 31 with external jobid 'Submitted batch job 2287012'.
Invalid jobs list: Submitted
[Mon Mar 6 13:20:45 2023]
Error in rule create_host_index:
jobid: 7
input: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz
output: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz.idx
log: hecatomb.out/stderr/create_host_index.log (check log file(s) for error details)
conda-env: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/conda/4ea269956b660de957aed37e1acd2903_
shell:
minimap2 -t 8 -d /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz.idx <(cat /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz) 2> hecatomb.out/stderr/create_host_index.log
rm hecatomb.out/stderr/create_host_index.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
cluster_jobid: Submitted batch job 2287007
Logfile hecatomb.out/stderr/create_host_index.log not found.
Error executing rule create_host_index on cluster (jobid: 7, external: Submitted batch job 2287007, jobscript: /lustre/project/taw/share/conda-envs/.snakemake/tmp.skvkwg2w/snakejob.create_host_index.7.sh). For error details see the cluster log and the log files of the involved rule(s).
Trying to restart job 7.
Select jobs to execute...
[Mon Mar 6 13:20:45 2023]
rule create_host_index:
input: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz
output: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz.idx
log: hecatomb.out/stderr/create_host_index.log
jobid: 7
benchmark: hecatomb.out/benchmarks/create_host_index.txt
reason: Missing output files: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz.idx
threads: 8
resources: mem_mb=16000, mem_mib=15259, disk_mb=1939, disk_mib=1850, tmpdir=, time=1440
minimap2 -t 8 -d /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz.idx <(cat /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz) 2> hecatomb.out/stderr/create_host_index.log
rm hecatomb.out/stderr/create_host_index.log
Submitted job 7 with external jobid 'Submitted batch job 2287013'.
Invalid jobs list: Submitted
[Mon Mar 6 13:20:45 2023]
Error in rule fastp_preprocessing:
jobid: 55
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.json, hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-256-115-06_GTTTCG.log (check log file(s) for error details)
conda-env: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/conda/4fabc0d9baac0a21fd5dfed339929234_
shell:
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-256-115-06_GTTTCG.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-256-115-06_GTTTCG.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
cluster_jobid: Submitted batch job 2287008
Logfile hecatomb.out/stderr/fastp_preprocessing.A13-256-115-06_GTTTCG.log not found.
Error executing rule fastp_preprocessing on cluster (jobid: 55, external: Submitted batch job 2287008, jobscript: /lustre/project/taw/share/conda-envs/.snakemake/tmp.skvkwg2w/snakejob.fastp_preprocessing.55.sh). For error details see the cluster log and the log files of the involved rule(s).
Trying to restart job 55.
Select jobs to execute...
[Mon Mar 6 13:20:45 2023]
rule fastp_preprocessing:
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.json, hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-256-115-06_GTTTCG.log
jobid: 55
benchmark: hecatomb.out/benchmarks/fastp_preprocessing.A13-256-115-06_GTTTCG.txt
reason: Missing output files: hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R2.s1.out.fastq
wildcards: sample=A13-256-115-06_GTTTCG
threads: 16
resources: mem_mb=32000, mem_mib=30518, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-256-115-06_GTTTCG.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-256-115-06_GTTTCG.log
Submitted job 55 with external jobid 'Submitted batch job 2287014'.
Invalid jobs list: Submitted
[Mon Mar 6 13:20:55 2023]
Error in rule fastp_preprocessing:
jobid: 43
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.json, hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-258-124-06_CGTACG.log (check log file(s) for error details)
conda-env: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/conda/4fabc0d9baac0a21fd5dfed339929234_
shell:
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-258-124-06_CGTACG_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-258-124-06_CGTACG_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-258-124-06_CGTACG.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-258-124-06_CGTACG.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-258-124-06_CGTACG.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
cluster_jobid: Submitted batch job 2287009
Logfile hecatomb.out/stderr/fastp_preprocessing.A13-258-124-06_CGTACG.log not found.
Error executing rule fastp_preprocessing on cluster (jobid: 43, external: Submitted batch job 2287009, jobscript: /lustre/project/taw/share/conda-envs/.snakemake/tmp.skvkwg2w/snakejob.fastp_preprocessing.43.sh). For error details see the cluster log and the log files of the involved rule(s).
Invalid jobs list: Submitted
[Mon Mar 6 13:20:55 2023]
Error in rule dumpSamplesTsv:
jobid: 252
output: hecatomb.out/results/hecatomb.samples.tsv
cluster_jobid: Submitted batch job 2287010
Error executing rule dumpSamplesTsv on cluster (jobid: 252, external: Submitted batch job 2287010, jobscript: /lustre/project/taw/share/conda-envs/.snakemake/tmp.skvkwg2w/snakejob.dumpSamplesTsv.252.sh). For error details see the cluster log and the log files of the involved rule(s).
Invalid jobs list: Submitted
[Mon Mar 6 13:20:56 2023]
Error in rule fastp_preprocessing:
jobid: 49
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.json, hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-256-117-06_ACTGAT.log (check log file(s) for error details)
conda-env: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/conda/4fabc0d9baac0a21fd5dfed339929234_
shell:
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-117-06_ACTGAT_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-256-117-06_ACTGAT_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-256-117-06_ACTGAT.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-256-117-06_ACTGAT.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-256-117-06_ACTGAT.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
cluster_jobid: Submitted batch job 2287011
Logfile hecatomb.out/stderr/fastp_preprocessing.A13-256-117-06_ACTGAT.log not found.
Error executing rule fastp_preprocessing on cluster (jobid: 49, external: Submitted batch job 2287011, jobscript: /lustre/project/taw/share/conda-envs/.snakemake/tmp.skvkwg2w/snakejob.fastp_preprocessing.49.sh). For error details see the cluster log and the log files of the involved rule(s).
Invalid jobs list: Submitted
[Mon Mar 6 13:20:56 2023]
Error in rule fastp_preprocessing:
jobid: 31
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.json, hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-253-140-06_GTCCGC.log (check log file(s) for error details)
conda-env: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/conda/4fabc0d9baac0a21fd5dfed339929234_
shell:
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-253-140-06_GTCCGC_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-253-140-06_GTCCGC_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-253-140-06_GTCCGC.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-253-140-06_GTCCGC.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-253-140-06_GTCCGC.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
cluster_jobid: Submitted batch job 2287012
Logfile hecatomb.out/stderr/fastp_preprocessing.A13-253-140-06_GTCCGC.log not found.
Error executing rule fastp_preprocessing on cluster (jobid: 31, external: Submitted batch job 2287012, jobscript: /lustre/project/taw/share/conda-envs/.snakemake/tmp.skvkwg2w/snakejob.fastp_preprocessing.31.sh). For error details see the cluster log and the log files of the involved rule(s).
Invalid jobs list: Submitted
[Mon Mar 6 13:20:56 2023]
Error in rule create_host_index:
jobid: 7
input: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz
output: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz.idx
log: hecatomb.out/stderr/create_host_index.log (check log file(s) for error details)
conda-env: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/conda/4ea269956b660de957aed37e1acd2903_
shell:
minimap2 -t 8 -d /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz.idx <(cat /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz) 2> hecatomb.out/stderr/create_host_index.log
rm hecatomb.out/stderr/create_host_index.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
cluster_jobid: Submitted batch job 2287013
Logfile hecatomb.out/stderr/create_host_index.log not found.
Error executing rule create_host_index on cluster (jobid: 7, external: Submitted batch job 2287013, jobscript: /lustre/project/taw/share/conda-envs/.snakemake/tmp.skvkwg2w/snakejob.create_host_index.7.sh). For error details see the cluster log and the log files of the involved rule(s).
Invalid jobs list: Submitted
[Mon Mar 6 13:20:56 2023]
Error in rule fastp_preprocessing:
jobid: 55
input: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R1.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R2.fastq.gz, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/contaminants/vector_contaminants.fa
output: hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R1.s1.out.fastq, hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R2.s1.out.fastq, hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.json, hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.html
log: hecatomb.out/stderr/fastp_preprocessing.A13-256-115-06_GTTTCG.log (check log file(s) for error details)
conda-env: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/conda/4fabc0d9baac0a21fd5dfed339929234_
shell:
fastp -i /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R1.fastq.gz -I /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data/A13-256-115-06_GTTTCG_R2.fastq.gz -o hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R1.s1.out.fastq -O hecatomb.out/processing/temp/p01/A13-256-115-06_GTTTCG_R2.s1.out.fastq -z 1 -j hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.json -h hecatomb.out/processing/stats/p01/A13-256-115-06_GTTTCG.s1.stats.html --qualified_quality_phred 15 --length_required 90 --detect_adapter_for_pe --cut_tail --cut_tail_window_size 25 --cut_tail_mean_quality 15 --dedup --dup_calc_accuracy 4 --trim_poly_x --thread 16 2> hecatomb.out/stderr/fastp_preprocessing.A13-256-115-06_GTTTCG.log
rm hecatomb.out/stderr/fastp_preprocessing.A13-256-115-06_GTTTCG.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
cluster_jobid: Submitted batch job 2287014
Logfile hecatomb.out/stderr/fastp_preprocessing.A13-256-115-06_GTTTCG.log not found.
Error executing rule fastp_preprocessing on cluster (jobid: 55, external: Submitted batch job 2287014, jobscript: /lustre/project/taw/share/conda-envs/.snakemake/tmp.skvkwg2w/snakejob.fastp_preprocessing.55.sh). For error details see the cluster log and the log files of the involved rule(s).
Exiting because a job execution failed. Look above for error message
FATAL: Hecatomb encountered an error.
Dumping all error logs to "hecatomb.errorLogs.txt"
Complete log: .snakemake/log/2023-03-06T132026.448612.snakemake.log
[2023:03:06 13:21:06] Error: Snakemake failed
from hecatomb.
Progress! I think this is the same issue as #87. Can you check your profile config.yaml file to make sure you've got --parsable for sbatch?
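For anyone hitting the same "Invalid jobs list: Submitted" errors above: a minimal shell illustration (simulated output only, not Hecatomb or Slurm code) of what --parsable changes. Without it, sbatch prints a full sentence, and a profile that records that output hands Snakemake "Submitted" as the external job ID:

```shell
# Simulated sbatch outputs (illustration only; no jobs are submitted):
without_parsable="Submitted batch job 2287011"   # default sbatch output
with_parsable="2287011"                          # sbatch --parsable output

# A status script that takes the first word as the job ID gets "Submitted":
jobid_bad=${without_parsable%% *}
jobid_good=$with_parsable

echo "without --parsable: $jobid_bad"
echo "with --parsable:    $jobid_good"
```

This is why the external job IDs in the log read "Submitted batch job 2287011" instead of a bare number.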
2023-03-06T145158.281223.snakemake.log
2023-03-06T145440.340562.snakemake.log
2023-03-06T145809.129136.snakemake.log
2023-03-06T145820.167039.snakemake.log
2023-03-06T130924.336960.snakemake.log
2023-03-06T132026.448612.snakemake.log
Here are all my logs, in case this helps.
Here is what that file says:
#####################################################
  (hecatomb ASCII-art banner)
#####################################################
# For more information see: https://github.com/shandley/hecatomb and https://hecatomb.readthedocs.io/en/latest/
##################
# Run Parameters
##################
# Inputs and outputs (specify Reads here or on the command line, Output blank = use default)
Reads:
Output:
# Database installation location, leave blank = use hecatomb install location
Databases:
# STICK TO YOUR SYSTEM'S CPU:RAM RATIO FOR THESE
BigJobMem: 64000 # Memory for MMSeqs in megabytes (e.g 64GB = 64000, recommend >= 64000)
BigJobCpu: 24 # Threads for MMSeqs (recommend >= 16)
BigJobTimeMin: 1440 # Max runtime in minutes for MMSeqs (this is only enforced by the Snakemake profile)
MediumJobMem: 32000 # Memory for Megahit/Flye in megabytes (recommend >= 32000)
MediumJobCpu: 16 # CPUs for Megahit/Flye (recommend >= 16)
SmallJobMem: 16000 # Memory for BBTools etc. in megabytes (recommend >= 16000)
SmallJobCpu: 8 # CPUs for BBTools etc. (recommend >= 8)
# Some jobs need more RAM; go over your CPU:RAM ratio if needed
MoreRamMem: 16000 # Memory for slightly RAM-hungry jobs in megabytes (recommend >= 16000)
MoreRamCpu: 2 # CPUs for slightly RAM-hungry jobs (recommend >= 2)
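As a hypothetical sizing sketch (the 20-core / 64 GB node spec is an assumption, not from this system): the CPU values must fit on a single node, otherwise sbatch rejects the job with "Requested node configuration is not available":

```yaml
# Hypothetical values for 20-core / 64 GB nodes; adjust to your own hardware.
BigJobMem: 64000
BigJobCpu: 20      # cannot exceed the cores on one node, or sbatch fails
MediumJobMem: 32000
MediumJobCpu: 10
SmallJobMem: 16000
SmallJobCpu: 5
```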
############################
# Optional Rule Parameters
############################
# Preprocessing
QSCORE: 15 # Read quality trimming score (rule fastp_preprocessing in 01_preprocessing.smk)
READ_MINLENGTH: 90 # Minimum read length during QC steps (rule fastp_preprocessing in 01_preprocessing.smk)
CONTIG_MINLENGTH: 1000 # Contig minimum length (rule contig_reformating_and_stats in 01_preprocessing.smk)
CUTTAIL_WINDOW: 25 # Sliding window size for low qual read filter (rule fastp_preprocessing in 01_preprocessing.smk)
DEDUP_ACCURACY: 4 # Specify the level (1 ~ 6). The higher level means more memory usage and more running time, but lower risk of incorrect deduplication marking (rule fastp_preprocessing in 01_preprocessing.smk)
COMPRESSION: 1 # Compression level for gzip output (1 ~ 9). 1 is fastest, 9 is smallest. Default is 1, based on assumption of large scratch space (rule fastp_preprocessing in 01_preprocessing.smk)
ENTROPY: 0.5 # Read minimum entropy (rule remove_low_quality in 01_preprocessing.smk)
ENTROPYWINDOW: 25 # entropy window for low qual read filter
# CLUSTER READS TO SEQTABLE (MMSEQS EASY-LINCLUST)
# -c = req coverage of target seq
# --min-seq-id = req identity [0-1] of alignment
linclustParams:
--kmer-per-seq-scale 0.3
-c 0.8
--cov-mode 1
--min-seq-id 0.97
--alignment-mode 3
# If the following LCA-calculated TaxIDs are encountered, defer to the top hit TaxID
# 0 = unclassified
# 1 = root
# 2 = bacteria root
# 10239 = virus root
# 131567 = cell org root
# 12429 = unclass virus
# 2759 = eukaryota root
# 12333 = unclass phage
taxIdIgnore: 0 1 2 10239 131567 12429 2759
###########################################
# Canu settings (for longread assemblies)
###########################################
# recommended correctedErrorRate is 0.16 to 0.12 (depending on coverage) for nanopore and
# 0.105 to 0.040 (depending on coverage) for pacbio (non-HiFi) - below is low-coverage nanopore
# https://canu.readthedocs.io/en/latest/faq.html#what-parameters-can-i-tweak
canuSettings:
correctedErrorRate=0.16
maxInputCoverage=10000
minInputCoverage=0
corOutCoverage=10000
corMhapSensitivity=high
corMinCoverage=0
useGrid=False
stopOnLowCoverage=False
genomeSize=10M
-nanopore
# -pacbio
# -pacbio-hifi
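Following the correctedErrorRate guidance in the comments above, a hypothetical low-coverage PacBio (non-HiFi) variant of this block would swap the error rate and read-type flag (values picked from the quoted 0.105-0.040 range, not tested here):

```yaml
# Hypothetical PacBio (non-HiFi) variant; low coverage = upper end of range.
canuSettings:
  correctedErrorRate=0.105
  maxInputCoverage=10000
  minInputCoverage=0
  corOutCoverage=10000
  corMhapSensitivity=high
  corMinCoverage=0
  useGrid=False
  stopOnLowCoverage=False
  genomeSize=10M
  -pacbio
```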
###################
# MMSeqs settings
###################
# ALIGNMENT FILTERING CUTOFFS
# --min-length for AA should be equal or less than 1/3 of READ_MINLENGTH
# --min-length for NT should be equal or less than READ_MINLENGTH
filtAAprimary:
--min-length 30
-e 1e-3
filtAAsecondary:
--min-length 30
-e 1e-5
filtNTprimary:
--min-length 90
-e 1e-10
filtNTsecondary:
--min-length 90
-e 1e-20
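The two cutoff rules above can be checked mechanically; a small shell sketch (values copied from this config) verifying that the AA and NT --min-length settings respect READ_MINLENGTH:

```shell
# Consistency check for the alignment filtering cutoffs above.
READ_MINLENGTH=90
AA_MINLEN=30   # from filtAAprimary/filtAAsecondary
NT_MINLEN=90   # from filtNTprimary/filtNTsecondary

# AA min-length must be <= 1/3 of READ_MINLENGTH (i.e. 3x AA <= READ).
[ $((AA_MINLEN * 3)) -le "$READ_MINLENGTH" ] && aa_ok=yes || aa_ok=no
# NT min-length must be <= READ_MINLENGTH.
[ "$NT_MINLEN" -le "$READ_MINLENGTH" ] && nt_ok=yes || nt_ok=no

echo "AA cutoff ok: $aa_ok, NT cutoff ok: $nt_ok"
```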
# PERFORMANCE SETTINGS - SEE MMSEQS DOCUMENTATION FOR DETAILS
# sensitive AA search
perfAA:
--start-sens 1
--sens-steps 3
-s 7
--lca-mode 2
--shuffle 0
# fast AA search
perfAAfast:
-s 4.0
--lca-mode 2
--shuffle 0
# sensitive NT search
perfNT:
--start-sens 2
-s 7
--sens-steps 3
# fast NT search
perfNTfast:
-s 4.0
Oops, I found the correct yaml file: I added --parsable at the end of the sbatch command and it is running now!
cluster:
mkdir -p logs/{rule}/ &&
sbatch
--cpus-per-task={threads}
--mem={resources.mem_mb}
--time={resources.time}
--job-name=smk-{rule}
--output=logs/{rule}/{jobid}.out
--error=logs/{rule}/{jobid}.err
--parsable
default-resources:
- mem_mb=2000
- time=1440
jobs: 100
latency-wait: 60
local-cores: 8
restart-times: 1
max-jobs-per-second: 20
keep-going: True
rerun-incomplete: True
printshellcmds: True
scheduler: greedy
use-conda: True
conda-frontend: mamba
cluster-status: ~/.config/snakemake/slurm/slurm-status.py
max-status-checks-per-second: 10
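A quick way to catch the missing flag before a run is to grep the profile for --parsable. A self-contained sketch (it writes a throwaway file so it can run anywhere; in practice you would point `profile` at ~/.config/snakemake/slurm/config.yaml):

```shell
# Illustration: check a slurm profile's cluster command for --parsable.
profile=$(mktemp)
cat > "$profile" <<'EOF'
cluster:
  mkdir -p logs/{rule}/ &&
  sbatch
    --cpus-per-task={threads}
    --parsable
EOF

# "--" stops option parsing so the pattern itself can start with dashes.
if grep -q -- '--parsable' "$profile"; then
  status="ok"
else
  status="missing --parsable"
fi
echo "$status"
rm -f "$profile"
```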
Hi, I am still getting an error with the slurm test:
(base) [kvigil@cypress01-123 ~]$ hecatomb test --profile slurm
██╗ ██╗███████╗ ██████╗ █████╗ ████████╗ ██████╗ ███╗ ███╗██████╗
██║ ██║██╔════╝██╔════╝██╔══██╗╚══██╔══╝██╔═══██╗████╗ ████║██╔══██╗
███████║█████╗ ██║ ███████║ ██║ ██║ ██║██╔████╔██║██████╔╝
██╔══██║██╔══╝ ██║ ██╔══██║ ██║ ██║ ██║██║╚██╔╝██║██╔══██╗
██║ ██║███████╗╚██████╗██║ ██║ ██║ ╚██████╔╝██║ ╚═╝ ██║██████╔╝
╚═╝ ╚═╝╚══════╝ ╚═════╝╚═╝ ╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═╝╚═════╝
Hecatomb version v1.1.0
[2023:03:09 11:54:15] Config file hecatomb.config.yaml already exists. Using existing config file.
[2023:03:09 11:54:15] Writing runtime config file to hecatomb.config.yaml
[2023:03:09 11:54:15] ------------------
[2023:03:09 11:54:15] | Runtime config |
[2023:03:09 11:54:15] ------------------
BigJobCpu: 24
BigJobMem: 64000
BigJobTimeMin: 1440
COMPRESSION: 1
CONTIG_MINLENGTH: 1000
CUTTAIL_WINDOW: 25
DEDUP_ACCURACY: 4
Databases: null
ENTROPY: 0.5
ENTROPYWINDOW: 25
Host: human
MediumJobCpu: 16
MediumJobMem: 32000
MoreRamCpu: 2
MoreRamMem: 16000
Output: hecatomb.out
Preprocess: paired
QSCORE: 15
READ_MINLENGTH: 90
Reads: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data
Search: sensitive
SmallJobCpu: 8
SmallJobMem: 16000
canuSettings: correctedErrorRate=0.16 maxInputCoverage=10000 minInputCoverage=0 corOutCoverage=10000
corMhapSensitivity=high corMinCoverage=0 useGrid=False stopOnLowCoverage=False genomeSize=10M
-nanopore
filtAAprimary: --min-length 30 -e 1e-3
filtAAsecondary: --min-length 30 -e 1e-5
filtNTprimary: --min-length 90 -e 1e-10
filtNTsecondary: --min-length 90 -e 1e-20
linclustParams: --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode
3
perfAA: --start-sens 1 --sens-steps 3 -s 7 --lca-mode 2 --shuffle 0
perfAAfast: -s 4.0 --lca-mode 2 --shuffle 0
perfNT: --start-sens 2 -s 7 --sens-steps 3
perfNTfast: -s 4.0
taxIdIgnore: 0 1 2 10239 131567 12429 2759
[2023:03:09 11:54:15] ---------------------
[2023:03:09 11:54:15] | Snakemake command |
[2023:03:09 11:54:15] ---------------------
snakemake -s /lustre/project/taw/share/conda-envs/hecatomb/bin/../snakemake/workflow/Hecatomb.smk --configfile hecatomb.config.yaml --use-conda --conda-frontend mamba --conda-prefix /lustre/project/taw/share/conda-envs/hecatomb/bin/../snakemake/conda --rerun-incomplete --printshellcmds --nolock --show-failed-logs --profile slurm
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/config.yaml is extended by additional config specified via the command line.
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/dbFiles.yaml is extended by additional config specified via the command line.
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/immutable.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Using shell: /bin/bash
Provided cluster nodes: 100
Job stats:
job count min threads max threads
PRIMARY_AA_parsing 1 2 2
PRIMARY_AA_taxonomy_assignment 1 24 24
PRIMARY_NT_parsing 1 2 2
PRIMARY_NT_reformat 1 2 2
PRIMARY_NT_taxonomic_assignment 1 24 24
SECONDARY_AA_generate_output_table 1 2 2
SECONDARY_AA_parsing 1 2 2
SECONDARY_AA_refactor_finalize 1 2 2
SECONDARY_AA_taxonomy_assignment 1 24 24
SECONDARY_AA_tophit_lineage 1 2 2
SECONDARY_NT_convert 1 24 24
SECONDARY_NT_generate_output_table 1 2 2
SECONDARY_NT_summary 1 2 2
SECONDARY_NT_taxonomic_assignment 1 24 24
all 1 1 1
bam_index 1 2 2
calculate_gc 2 2 2
calculate_tet_freq 2 2 2
cluster_similar_sequences 10 24 24
combine_AA_NT 1 1 1
concatenate_contigs 1 1 1
concatentate_contig_count_tables 1 1 1
contig_krona_plot 1 1 1
contig_krona_text_format 1 1 1
contig_read_taxonomy 1 2 2
coverage_calculations 10 8 8
create_contig_count_table 10 1 1
create_individual_seqtables 10 24 24
krona_plot 1 1 1
krona_text_format 1 1 1
link_assembly 1 1 1
map_seq_table 1 16 16
merge_seq_table 1 1 1
mmseqs_contig_annotation 1 24 24
mmseqs_contig_annotation_summary 1 24 24
population_assembly 1 16 16
rescue_read_kmer_normalization 1 24 24
secondary_nt_calc_lca 1 24 24
secondary_nt_lca_table 1 1 1
seq_properties_table 2 2 2
tax_level_counts 1 2 2
unmapped_read_rescue_assembly 1 16 16
total 81 1 24
Select jobs to execute...
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-135-177-06_AGTTCC_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-135-177-06_AGTTCC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-135-177-06_AGTTCC_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-135-177-06_AGTTCC_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-135-177-06_AGTTCC.log
jobid: 10
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-135-177-06_AGTTCC.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-135-177-06_AGTTCC_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-135-177-06_AGTTCC_R1_rep_seq.fasta; Updated input files: hecatomb.out/processing/temp/p04/A13-135-177-06_AGTTCC_R1.all.fastq
wildcards: sample=A13-135-177-06_AGTTCC
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-135-177-06_AGTTCC_R1.all.fastq hecatomb.out/processing/temp/p05/A13-135-177-06_AGTTCC_R1 hecatomb.out/processing/temp/p05/A13-135-177-06_AGTTCC_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-135-177-06_AGTTCC.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-135-177-06_AGTTCC.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-252-114-06_CCGTCC_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-252-114-06_CCGTCC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-252-114-06_CCGTCC_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-252-114-06_CCGTCC_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-252-114-06_CCGTCC.log
jobid: 58
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-252-114-06_CCGTCC.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-252-114-06_CCGTCC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-252-114-06_CCGTCC_R1_cluster.tsv; Updated input files: hecatomb.out/processing/temp/p04/A13-252-114-06_CCGTCC_R1.all.fastq
wildcards: sample=A13-252-114-06_CCGTCC
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-252-114-06_CCGTCC_R1.all.fastq hecatomb.out/processing/temp/p05/A13-252-114-06_CCGTCC_R1 hecatomb.out/processing/temp/p05/A13-252-114-06_CCGTCC_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-252-114-06_CCGTCC.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-252-114-06_CCGTCC.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-12-250-06_GGCTAC_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-12-250-06_GGCTAC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-12-250-06_GGCTAC_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-12-250-06_GGCTAC_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-12-250-06_GGCTAC.log
jobid: 34
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-12-250-06_GGCTAC.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-12-250-06_GGCTAC_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-12-250-06_GGCTAC_R1_rep_seq.fasta; Updated input files: hecatomb.out/processing/temp/p04/A13-12-250-06_GGCTAC_R1.all.fastq
wildcards: sample=A13-12-250-06_GGCTAC
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-12-250-06_GGCTAC_R1.all.fastq hecatomb.out/processing/temp/p05/A13-12-250-06_GGCTAC_R1 hecatomb.out/processing/temp/p05/A13-12-250-06_GGCTAC_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-12-250-06_GGCTAC.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-12-250-06_GGCTAC.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule rescue_read_kmer_normalization:
input: hecatomb.out/processing/assembly/unmapRescue_R1.fastq, hecatomb.out/processing/assembly/unmapRescue_R2.fastq, hecatomb.out/processing/assembly/unmapRescue.s.fastq
output: hecatomb.out/processing/assembly/unmapRescueNorm_R1.fastq, hecatomb.out/processing/assembly/unmapRescueNorm_R2.fastq
log: hecatomb.out/stderr/rescue_read_kmer_normalization.log
jobid: 160
benchmark: hecatomb.out/benchmarks/rescue_read_kmer_normalization.txt
reason: Missing output files: hecatomb.out/processing/assembly/unmapRescueNorm_R1.fastq, hecatomb.out/processing/assembly/unmapRescueNorm_R2.fastq; Updated input files: hecatomb.out/processing/assembly/unmapRescue_R1.fastq, hecatomb.out/processing/assembly/unmapRescue.s.fastq, hecatomb.out/processing/assembly/unmapRescue_R2.fastq
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440, javaAlloc=57600
bbnorm.sh in=hecatomb.out/processing/assembly/unmapRescue_R1.fastq in2=hecatomb.out/processing/assembly/unmapRescue_R2.fastq extra=hecatomb.out/processing/assembly/unmapRescue.s.fastq out=hecatomb.out/processing/assembly/unmapRescueNorm_R1.fastq out2=hecatomb.out/processing/assembly/unmapRescueNorm_R2.fastq target=100 ow=t threads=24 -Xmx57600m 2> hecatomb.out/stderr/rescue_read_kmer_normalization.log
rm hecatomb.out/stderr/rescue_read_kmer_normalization.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-151-169-06_ATGTCA_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-151-169-06_ATGTCA_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-151-169-06_ATGTCA_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-151-169-06_ATGTCA_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-151-169-06_ATGTCA.log
jobid: 3
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-151-169-06_ATGTCA.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-151-169-06_ATGTCA_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-151-169-06_ATGTCA_R1_cluster.tsv; Updated input files: hecatomb.out/processing/temp/p04/A13-151-169-06_ATGTCA_R1.all.fastq
wildcards: sample=A13-151-169-06_ATGTCA
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-151-169-06_ATGTCA_R1.all.fastq hecatomb.out/processing/temp/p05/A13-151-169-06_ATGTCA_R1 hecatomb.out/processing/temp/p05/A13-151-169-06_ATGTCA_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-151-169-06_ATGTCA.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-151-169-06_ATGTCA.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-255-183-06_GTGGCC_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-255-183-06_GTGGCC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-255-183-06_GTGGCC_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-255-183-06_GTGGCC_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-255-183-06_GTGGCC.log
jobid: 16
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-255-183-06_GTGGCC.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-255-183-06_GTGGCC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-255-183-06_GTGGCC_R1_cluster.tsv; Updated input files: hecatomb.out/processing/temp/p04/A13-255-183-06_GTGGCC_R1.all.fastq
wildcards: sample=A13-255-183-06_GTGGCC
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-255-183-06_GTGGCC_R1.all.fastq hecatomb.out/processing/temp/p05/A13-255-183-06_GTGGCC_R1 hecatomb.out/processing/temp/p05/A13-255-183-06_GTGGCC_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-255-183-06_GTGGCC.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-255-183-06_GTGGCC.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-258-124-06_CGTACG_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-258-124-06_CGTACG_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-258-124-06_CGTACG_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-258-124-06_CGTACG_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-258-124-06_CGTACG.log
jobid: 40
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-258-124-06_CGTACG.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-258-124-06_CGTACG_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-258-124-06_CGTACG_R1_rep_seq.fasta; Updated input files: hecatomb.out/processing/temp/p04/A13-258-124-06_CGTACG_R1.all.fastq
wildcards: sample=A13-258-124-06_CGTACG
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-258-124-06_CGTACG_R1.all.fastq hecatomb.out/processing/temp/p05/A13-258-124-06_CGTACG_R1 hecatomb.out/processing/temp/p05/A13-258-124-06_CGTACG_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-258-124-06_CGTACG.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-258-124-06_CGTACG.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-04-182-06_TAGCTT_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-04-182-06_TAGCTT_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-04-182-06_TAGCTT_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-04-182-06_TAGCTT_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-04-182-06_TAGCTT.log
jobid: 22
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-04-182-06_TAGCTT.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-04-182-06_TAGCTT_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-04-182-06_TAGCTT_R1_cluster.tsv; Updated input files: hecatomb.out/processing/temp/p04/A13-04-182-06_TAGCTT_R1.all.fastq
wildcards: sample=A13-04-182-06_TAGCTT
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-04-182-06_TAGCTT_R1.all.fastq hecatomb.out/processing/temp/p05/A13-04-182-06_TAGCTT_R1 hecatomb.out/processing/temp/p05/A13-04-182-06_TAGCTT_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-04-182-06_TAGCTT.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-04-182-06_TAGCTT.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-256-117-06_ACTGAT_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-256-117-06_ACTGAT_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-256-117-06_ACTGAT_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-256-117-06_ACTGAT_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-256-117-06_ACTGAT.log
jobid: 46
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-256-117-06_ACTGAT.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-256-117-06_ACTGAT_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-256-117-06_ACTGAT_R1_rep_seq.fasta; Updated input files: hecatomb.out/processing/temp/p04/A13-256-117-06_ACTGAT_R1.all.fastq
wildcards: sample=A13-256-117-06_ACTGAT
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-256-117-06_ACTGAT_R1.all.fastq hecatomb.out/processing/temp/p05/A13-256-117-06_ACTGAT_R1 hecatomb.out/processing/temp/p05/A13-256-117-06_ACTGAT_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-256-117-06_ACTGAT.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-256-117-06_ACTGAT.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-253-140-06_GTCCGC_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-253-140-06_GTCCGC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-253-140-06_GTCCGC_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-253-140-06_GTCCGC_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-253-140-06_GTCCGC.log
jobid: 28
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-253-140-06_GTCCGC.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-253-140-06_GTCCGC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-253-140-06_GTCCGC_R1_cluster.tsv; Updated input files: hecatomb.out/processing/temp/p04/A13-253-140-06_GTCCGC_R1.all.fastq
wildcards: sample=A13-253-140-06_GTCCGC
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-253-140-06_GTCCGC_R1.all.fastq hecatomb.out/processing/temp/p05/A13-253-140-06_GTCCGC_R1 hecatomb.out/processing/temp/p05/A13-253-140-06_GTCCGC_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-253-140-06_GTCCGC.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-253-140-06_GTCCGC.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-256-115-06_GTTTCG_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-256-115-06_GTTTCG_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-256-115-06_GTTTCG_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-256-115-06_GTTTCG_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-256-115-06_GTTTCG.log
jobid: 52
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-256-115-06_GTTTCG.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-256-115-06_GTTTCG_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-256-115-06_GTTTCG_R1_rep_seq.fasta; Updated input files: hecatomb.out/processing/temp/p04/A13-256-115-06_GTTTCG_R1.all.fastq
wildcards: sample=A13-256-115-06_GTTTCG
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-256-115-06_GTTTCG_R1.all.fastq hecatomb.out/processing/temp/p05/A13-256-115-06_GTTTCG_R1 hecatomb.out/processing/temp/p05/A13-256-115-06_GTTTCG_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-256-115-06_GTTTCG.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-256-115-06_GTTTCG.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
Trying to restart job 10.
Trying to restart job 58.
Trying to restart job 34.
Trying to restart job 160.
Trying to restart job 3.
Trying to restart job 16.
Trying to restart job 40.
Trying to restart job 22.
Trying to restart job 46.
Trying to restart job 28.
Trying to restart job 52.
Select jobs to execute...
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-135-177-06_AGTTCC_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-135-177-06_AGTTCC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-135-177-06_AGTTCC_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-135-177-06_AGTTCC_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-135-177-06_AGTTCC.log
jobid: 10
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-135-177-06_AGTTCC.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-135-177-06_AGTTCC_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-135-177-06_AGTTCC_R1_rep_seq.fasta; Updated input files: hecatomb.out/processing/temp/p04/A13-135-177-06_AGTTCC_R1.all.fastq
wildcards: sample=A13-135-177-06_AGTTCC
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-135-177-06_AGTTCC_R1.all.fastq hecatomb.out/processing/temp/p05/A13-135-177-06_AGTTCC_R1 hecatomb.out/processing/temp/p05/A13-135-177-06_AGTTCC_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-135-177-06_AGTTCC.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-135-177-06_AGTTCC.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-12-250-06_GGCTAC_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-12-250-06_GGCTAC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-12-250-06_GGCTAC_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-12-250-06_GGCTAC_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-12-250-06_GGCTAC.log
jobid: 34
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-12-250-06_GGCTAC.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-12-250-06_GGCTAC_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-12-250-06_GGCTAC_R1_rep_seq.fasta; Updated input files: hecatomb.out/processing/temp/p04/A13-12-250-06_GGCTAC_R1.all.fastq
wildcards: sample=A13-12-250-06_GGCTAC
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-12-250-06_GGCTAC_R1.all.fastq hecatomb.out/processing/temp/p05/A13-12-250-06_GGCTAC_R1 hecatomb.out/processing/temp/p05/A13-12-250-06_GGCTAC_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-12-250-06_GGCTAC.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-12-250-06_GGCTAC.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-252-114-06_CCGTCC_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-252-114-06_CCGTCC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-252-114-06_CCGTCC_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-252-114-06_CCGTCC_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-252-114-06_CCGTCC.log
jobid: 58
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-252-114-06_CCGTCC.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-252-114-06_CCGTCC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-252-114-06_CCGTCC_R1_cluster.tsv; Updated input files: hecatomb.out/processing/temp/p04/A13-252-114-06_CCGTCC_R1.all.fastq
wildcards: sample=A13-252-114-06_CCGTCC
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-252-114-06_CCGTCC_R1.all.fastq hecatomb.out/processing/temp/p05/A13-252-114-06_CCGTCC_R1 hecatomb.out/processing/temp/p05/A13-252-114-06_CCGTCC_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-252-114-06_CCGTCC.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-252-114-06_CCGTCC.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule rescue_read_kmer_normalization:
input: hecatomb.out/processing/assembly/unmapRescue_R1.fastq, hecatomb.out/processing/assembly/unmapRescue_R2.fastq, hecatomb.out/processing/assembly/unmapRescue.s.fastq
output: hecatomb.out/processing/assembly/unmapRescueNorm_R1.fastq, hecatomb.out/processing/assembly/unmapRescueNorm_R2.fastq
log: hecatomb.out/stderr/rescue_read_kmer_normalization.log
jobid: 160
benchmark: hecatomb.out/benchmarks/rescue_read_kmer_normalization.txt
reason: Missing output files: hecatomb.out/processing/assembly/unmapRescueNorm_R1.fastq, hecatomb.out/processing/assembly/unmapRescueNorm_R2.fastq; Updated input files: hecatomb.out/processing/assembly/unmapRescue_R1.fastq, hecatomb.out/processing/assembly/unmapRescue.s.fastq, hecatomb.out/processing/assembly/unmapRescue_R2.fastq
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440, javaAlloc=57600
bbnorm.sh in=hecatomb.out/processing/assembly/unmapRescue_R1.fastq in2=hecatomb.out/processing/assembly/unmapRescue_R2.fastq extra=hecatomb.out/processing/assembly/unmapRescue.s.fastq out=hecatomb.out/processing/assembly/unmapRescueNorm_R1.fastq out2=hecatomb.out/processing/assembly/unmapRescueNorm_R2.fastq target=100 ow=t threads=24 -Xmx57600m 2> hecatomb.out/stderr/rescue_read_kmer_normalization.log
rm hecatomb.out/stderr/rescue_read_kmer_normalization.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-151-169-06_ATGTCA_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-151-169-06_ATGTCA_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-151-169-06_ATGTCA_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-151-169-06_ATGTCA_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-151-169-06_ATGTCA.log
jobid: 3
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-151-169-06_ATGTCA.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-151-169-06_ATGTCA_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-151-169-06_ATGTCA_R1_cluster.tsv; Updated input files: hecatomb.out/processing/temp/p04/A13-151-169-06_ATGTCA_R1.all.fastq
wildcards: sample=A13-151-169-06_ATGTCA
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-151-169-06_ATGTCA_R1.all.fastq hecatomb.out/processing/temp/p05/A13-151-169-06_ATGTCA_R1 hecatomb.out/processing/temp/p05/A13-151-169-06_ATGTCA_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-151-169-06_ATGTCA.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-151-169-06_ATGTCA.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-255-183-06_GTGGCC_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-255-183-06_GTGGCC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-255-183-06_GTGGCC_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-255-183-06_GTGGCC_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-255-183-06_GTGGCC.log
jobid: 16
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-255-183-06_GTGGCC.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-255-183-06_GTGGCC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-255-183-06_GTGGCC_R1_cluster.tsv; Updated input files: hecatomb.out/processing/temp/p04/A13-255-183-06_GTGGCC_R1.all.fastq
wildcards: sample=A13-255-183-06_GTGGCC
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-255-183-06_GTGGCC_R1.all.fastq hecatomb.out/processing/temp/p05/A13-255-183-06_GTGGCC_R1 hecatomb.out/processing/temp/p05/A13-255-183-06_GTGGCC_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-255-183-06_GTGGCC.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-255-183-06_GTGGCC.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-258-124-06_CGTACG_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-258-124-06_CGTACG_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-258-124-06_CGTACG_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-258-124-06_CGTACG_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-258-124-06_CGTACG.log
jobid: 40
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-258-124-06_CGTACG.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-258-124-06_CGTACG_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-258-124-06_CGTACG_R1_rep_seq.fasta; Updated input files: hecatomb.out/processing/temp/p04/A13-258-124-06_CGTACG_R1.all.fastq
wildcards: sample=A13-258-124-06_CGTACG
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-258-124-06_CGTACG_R1.all.fastq hecatomb.out/processing/temp/p05/A13-258-124-06_CGTACG_R1 hecatomb.out/processing/temp/p05/A13-258-124-06_CGTACG_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-258-124-06_CGTACG.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-258-124-06_CGTACG.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-04-182-06_TAGCTT_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-04-182-06_TAGCTT_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-04-182-06_TAGCTT_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-04-182-06_TAGCTT_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-04-182-06_TAGCTT.log
jobid: 22
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-04-182-06_TAGCTT.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-04-182-06_TAGCTT_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-04-182-06_TAGCTT_R1_cluster.tsv; Updated input files: hecatomb.out/processing/temp/p04/A13-04-182-06_TAGCTT_R1.all.fastq
wildcards: sample=A13-04-182-06_TAGCTT
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-04-182-06_TAGCTT_R1.all.fastq hecatomb.out/processing/temp/p05/A13-04-182-06_TAGCTT_R1 hecatomb.out/processing/temp/p05/A13-04-182-06_TAGCTT_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-04-182-06_TAGCTT.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-04-182-06_TAGCTT.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-256-117-06_ACTGAT_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-256-117-06_ACTGAT_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-256-117-06_ACTGAT_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-256-117-06_ACTGAT_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-256-117-06_ACTGAT.log
jobid: 46
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-256-117-06_ACTGAT.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-256-117-06_ACTGAT_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-256-117-06_ACTGAT_R1_rep_seq.fasta; Updated input files: hecatomb.out/processing/temp/p04/A13-256-117-06_ACTGAT_R1.all.fastq
wildcards: sample=A13-256-117-06_ACTGAT
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-256-117-06_ACTGAT_R1.all.fastq hecatomb.out/processing/temp/p05/A13-256-117-06_ACTGAT_R1 hecatomb.out/processing/temp/p05/A13-256-117-06_ACTGAT_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-256-117-06_ACTGAT.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-256-117-06_ACTGAT.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-253-140-06_GTCCGC_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-253-140-06_GTCCGC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-253-140-06_GTCCGC_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-253-140-06_GTCCGC_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-253-140-06_GTCCGC.log
jobid: 28
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-253-140-06_GTCCGC.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-253-140-06_GTCCGC_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-253-140-06_GTCCGC_R1_cluster.tsv; Updated input files: hecatomb.out/processing/temp/p04/A13-253-140-06_GTCCGC_R1.all.fastq
wildcards: sample=A13-253-140-06_GTCCGC
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-253-140-06_GTCCGC_R1.all.fastq hecatomb.out/processing/temp/p05/A13-253-140-06_GTCCGC_R1 hecatomb.out/processing/temp/p05/A13-253-140-06_GTCCGC_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-253-140-06_GTCCGC.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-253-140-06_GTCCGC.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
[Thu Mar 9 11:54:19 2023]
rule cluster_similar_sequences:
input: hecatomb.out/processing/temp/p04/A13-256-115-06_GTTTCG_R1.all.fastq
output: hecatomb.out/processing/temp/p05/A13-256-115-06_GTTTCG_R1_rep_seq.fasta, hecatomb.out/processing/temp/p05/A13-256-115-06_GTTTCG_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-256-115-06_GTTTCG_R1_all_seqs.fasta
log: hecatomb.out/stderr/cluster_similar_sequences.A13-256-115-06_GTTTCG.log
jobid: 52
benchmark: hecatomb.out/benchmarks/cluster_similar_sequences.A13-256-115-06_GTTTCG.txt
reason: Missing output files: hecatomb.out/processing/temp/p05/A13-256-115-06_GTTTCG_R1_cluster.tsv, hecatomb.out/processing/temp/p05/A13-256-115-06_GTTTCG_R1_rep_seq.fasta; Updated input files: hecatomb.out/processing/temp/p04/A13-256-115-06_GTTTCG_R1.all.fastq
wildcards: sample=A13-256-115-06_GTTTCG
threads: 24
resources: mem_mb=64000, mem_mib=61036, disk_mb=1000, disk_mib=954, tmpdir=, time=1440
mmseqs easy-linclust hecatomb.out/processing/temp/p04/A13-256-115-06_GTTTCG_R1.all.fastq hecatomb.out/processing/temp/p05/A13-256-115-06_GTTTCG_R1 hecatomb.out/processing/temp/p05/A13-256-115-06_GTTTCG_TMP --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode 3 --threads 24 &> hecatomb.out/stderr/cluster_similar_sequences.A13-256-115-06_GTTTCG.log
rm hecatomb.out/stderr/cluster_similar_sequences.A13-256-115-06_GTTTCG.log
sbatch: error: Batch job submission failed: Requested node configuration is not available
Error submitting jobscript (exit code 1):
Exiting because a job execution failed. Look above for error message
FATAL: Hecatomb encountered an error.
Dumping all error logs to "hecatomb.errorLogs.txt"
Complete log: .snakemake/log/2023-03-09T115415.686966.snakemake.log
[2023:03:09 11:54:29] Error: Snakemake failed
(base) [kvigil@cypress01-123 ~]$
from hecatomb.
Hi, I changed my config files to use 20 threads, but it looks like Hecatomb is still running on 24 threads. My HPC only allows 20 threads. What other files do I need to change? Here is my log:
I changed these config files under .config/snakemake/slurm:
hecatomb.config.yaml
config.yaml
And this file:
conda-envs/hecatomb/snakemake/config/config.yaml
2023-03-09T131458.367041.snakemake.log
Thanks!
If you're using a profile then --threads should be ignored. In your hecatomb.out/hecatomb.config.yaml, can you change BigJobCpu: 24 to BigJobCpu: 20? You should also make sure BigJobMem is balanced for the cpu:memory ratio of the nodes. You can run sinfo -Nl to see that info for your cluster.
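To make that edit concrete, here is a minimal Python sketch (not part of Hecatomb; the keys mirror the runtime config shown in this thread, and the node limits are example values to be replaced with whatever sinfo -Nl reports) that clamps BigJobCpu and BigJobMem so sbatch requests fit on a node:

```python
# Sketch: clamp the big-job resource keys in a Hecatomb runtime config so
# that submitted jobs fit the cluster's per-node limits. Hypothetical
# helper, not Hecatomb code; keys mirror hecatomb.config.yaml above.
import re

def cap_big_job(yaml_text: str, node_cpus: int, node_mem_mb: int) -> str:
    """Clamp BigJobCpu/BigJobMem to the per-node limits from sinfo -Nl."""
    def clamp(key: str, limit: int, text: str) -> str:
        pattern = rf"^({key}): (\d+)$"
        def repl(m: re.Match) -> str:
            return f"{m.group(1)}: {min(int(m.group(2)), limit)}"
        return re.sub(pattern, repl, text, flags=re.M)
    return clamp("BigJobMem", node_mem_mb, clamp("BigJobCpu", node_cpus, yaml_text))

config = "BigJobCpu: 24\nBigJobMem: 64000\nBigJobTimeMin: 1440\n"
print(cap_big_job(config, node_cpus=20, node_mem_mb=64000))
# BigJobCpu is lowered to 20; BigJobMem already fits and is unchanged.
```

The same clamp applied to the config file on disk would avoid the "Requested node configuration is not available" sbatch error seen in the log above.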
Hi @beardymcjohnface, I don't see the hecatomb.config.yaml in the hecatomb.out folder. This file is located in /home/kvigil/.config/snakemake/slurm
I changed the hecatomb.config.yaml file to reflect 20 threads:
BigJobCpu: 20
BigJobMem: 64000
BigJobTimeMin: 1440
COMPRESSION: 1
CONTIG_MINLENGTH: 1000
CUTTAIL_WINDOW: 25
DEDUP_ACCURACY: 4
Databases: null
ENTROPY: 0.5
ENTROPYWINDOW: 25
Host: human
MediumJobCpu: 16
MediumJobMem: 32000
MoreRamCpu: 2
MoreRamMem: 16000
Output: hecatomb.out
Preprocess: paired
QSCORE: 15
READ_MINLENGTH: 90
Reads: /lustre/project/taw/share/conda-envs/hecatomb/bin/../test_data
Search: sensitive
SmallJobCpu: 8
SmallJobMem: 16000
canuSettings: correctedErrorRate=0.16 maxInputCoverage=10000 minInputCoverage=0 corOutCoverage=10000
corMhapSensitivity=high corMinCoverage=0 useGrid=False stopOnLowCoverage=False genomeSize=10M
-nanopore
filtAAprimary: --min-length 30 -e 1e-3
filtAAsecondary: --min-length 30 -e 1e-5
filtNTprimary: --min-length 90 -e 1e-10
filtNTsecondary: --min-length 90 -e 1e-20
linclustParams: --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode
3
perfAA: --start-sens 1 --sens-steps 3 -s 7 --lca-mode 2 --shuffle 0
perfAAfast: -s 4.0 --lca-mode 2 --shuffle 0
perfNT: --start-sens 2 -s 7 --sens-steps 3
perfNTfast: -s 4.0
taxIdIgnore: 0 1 2 10239 131567 12429 2759
2023-03-10T141950.676347.snakemake.log
Here is my log. It still says 24 threads even though I changed it.
The /home/kvigil/.config/snakemake/slurm directory should contain config.yaml with the profile settings, which is different from the hecatomb.config.yaml, which should be either in your working directory or in the hecatomb.out directory depending on the Hecatomb version. Changing Hecatomb's default config (in /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/config.yaml) will only take effect once you've deleted your old hecatomb.config.yaml file. Do you have a hecatomb.config.yaml file in your working directory?
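Since three different config files are in play here, a quick hypothetical sketch (the paths mirror the ones discussed in this thread; nothing below is Hecatomb's own code) that lists which candidate files actually exist, highest precedence first:

```python
# Sketch: enumerate the config files a Hecatomb run could pick up, so it's
# obvious which copy is actually read. Hypothetical helper; the candidate
# paths mirror the locations discussed in this thread.
from pathlib import Path

def find_configs(workdir: str, home: str) -> list:
    """Return existing candidate config files, highest precedence first."""
    candidates = [
        Path(workdir) / "hecatomb.config.yaml",                  # runtime config (wins)
        Path(workdir) / "hecatomb.out" / "hecatomb.config.yaml", # older versions
        Path(home) / ".config" / "snakemake" / "slurm" / "config.yaml",  # profile
    ]
    return [p for p in candidates if p.is_file()]

print(find_configs(".", str(Path.home())))
```

If the first entry exists, edits to the installation's default config.yaml will be ignored until that runtime copy is deleted.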
Good news: it worked when I changed the file to 20 threads! It took about 6 hours to complete on the HPC; is that how long the test data usually takes?
255 of 255 steps (100%) done
Removing temporary output hecatomb.out/processing/mapping/assembly.seqtable.bam.
Complete log: .snakemake/log/2023-03-14T113116.169580.snakemake.log
Hecatomb finished successfully!
[2023:03:14 17:59:12] Snakemake finished successfully
That's great! Yes, unfortunately the current test dataset still takes quite a while. It's a small-ish dataset (10 samples with 25k paired reads), but it runs on the full-size databases, which takes a long time. We do need a much quicker test simulation, both for users and CI.
@beardymcjohnface I am now having a hard time adding my host genome:
hecatomb addHost --host oyster --hostfa H:\hecatomb\host_genomes\GCF_002022765.2_C_virginica-3.0_protein.faa.gz
Hecatomb version v1.1.0
Usage: hecatomb [OPTIONS] COMMAND [ARGS]...
Try 'hecatomb --help' for help.
Error: No such command 'addHost'.
How do you recommend adding the host genome from NCBI?
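Separately from the subcommand name, one thing worth checking in the command above: the --hostfa argument points at an NCBI *_protein.faa.gz file, but host read removal maps reads against a nucleotide genome assembly (the *_genomic.fna.gz download). A small heuristic sketch (hypothetical helper, not part of Hecatomb) to catch that mix-up before adding a host:

```python
# Sketch: guess whether a (possibly gzipped) FASTA holds nucleotide or
# protein sequences, to catch passing an NCBI *_protein.faa.gz where a
# *_genomic.fna.gz host genome is expected. Heuristic only.
import gzip

NUCLEOTIDES = set("ACGTUNacgtun")

def fasta_alphabet(path: str, sample_chars: int = 10000) -> str:
    """Return 'nucleotide' or 'protein' based on residue composition."""
    opener = gzip.open if path.endswith(".gz") else open
    seq = []
    with opener(path, "rt") as fh:
        for line in fh:
            if not line.startswith(">"):
                seq.append(line.strip())
            if sum(len(s) for s in seq) >= sample_chars:
                break
    chars = "".join(seq)
    if not chars:
        return "empty"
    frac_nt = sum(c in NUCLEOTIDES for c in chars) / len(chars)
    return "nucleotide" if frac_nt > 0.9 else "protein"
```

Running this on the file above would flag it as protein; downloading the matching GCF_002022765.2 genomic FASTA instead should give a nucleotide result.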
Hi,
Now I am getting an error when I run my sample:
(/lustre/project/taw/share/conda-envs/hecatomb) [kvigil@cypress2 ~]$ hecatomb run --preprocess longread --reads /lustre/project/taw/ONR030223/ONR030223/20230302_1344_MN18851_FAW54058_d4b97d63/fastq_pass/barcode07 --host oyster --threads 20
Hecatomb version v1.1.0
[2023:03:15 16:29:58] Config file hecatomb.config.yaml already exists. Using existing config file.
[2023:03:15 16:29:58] Writing runtime config file to hecatomb.config.yaml
[2023:03:15 16:29:58] ------------------
[2023:03:15 16:29:58] | Runtime config |
[2023:03:15 16:29:58] ------------------
BigJobCpu: 24
BigJobMem: 64000
BigJobTimeMin: 1440
COMPRESSION: 1
CONTIG_MINLENGTH: 1000
CUTTAIL_WINDOW: 25
DEDUP_ACCURACY: 4
Databases: null
ENTROPY: 0.5
ENTROPYWINDOW: 25
Host: oyster
MediumJobCpu: 16
MediumJobMem: 32000
MoreRamCpu: 2
MoreRamMem: 16000
Output: hecatomb.out
Preprocess: longread
QSCORE: 15
READ_MINLENGTH: 90
Reads: /lustre/project/taw/ONR030223/ONR030223/20230302_1344_MN18851_FAW54058_d4b97d63/fastq_pass/barcode07
Search: sensitive
SmallJobCpu: 8
SmallJobMem: 16000
canuSettings: correctedErrorRate=0.16 maxInputCoverage=10000 minInputCoverage=0 corOutCoverage=10000
corMhapSensitivity=high corMinCoverage=0 useGrid=False stopOnLowCoverage=False genomeSize=10M
-nanopore
filtAAprimary: --min-length 30 -e 1e-3
filtAAsecondary: --min-length 30 -e 1e-5
filtNTprimary: --min-length 90 -e 1e-10
filtNTsecondary: --min-length 90 -e 1e-20
linclustParams: --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode
3
perfAA: --start-sens 1 --sens-steps 3 -s 7 --lca-mode 2 --shuffle 0
perfAAfast: -s 4.0 --lca-mode 2 --shuffle 0
perfNT: --start-sens 2 -s 7 --sens-steps 3
perfNTfast: -s 4.0
taxIdIgnore: 0 1 2 10239 131567 12429 2759
[2023:03:15 16:29:58] ---------------------
[2023:03:15 16:29:58] | Snakemake command |
[2023:03:15 16:29:58] ---------------------
snakemake -s /lustre/project/taw/share/conda-envs/hecatomb/bin/../snakemake/workflow/Hecatomb.smk --configfile hecatomb.config.yaml --jobs 20 --use-conda --conda-frontend mamba --conda-prefix /lustre/project/taw/share/conda-envs/hecatomb/bin/../snakemake/conda --rerun-incomplete --printshellcmds --nolock --show-failed-logs
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/config.yaml is extended by additional config specified via the command line.
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/dbFiles.yaml is extended by additional config specified via the command line.
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/immutable.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Using shell: /bin/bash
Provided cores: 20
Rules claiming more threads will be scaled down.
Job stats:
job count min threads max threads
PRIMARY_AA_parsing 1 2 2
PRIMARY_AA_taxonomy_assignment 1 20 20
PRIMARY_NT_parsing 1 2 2
PRIMARY_NT_reformat 1 2 2
PRIMARY_NT_taxonomic_assignment 1 20 20
SECONDARY_AA_generate_output_table 1 2 2
SECONDARY_AA_parsing 1 2 2
SECONDARY_AA_refactor_finalize 1 2 2
SECONDARY_AA_taxonomy_assignment 1 20 20
SECONDARY_AA_tophit_lineage 1 2 2
SECONDARY_NT_convert 1 20 20
SECONDARY_NT_generate_output_table 1 2 2
SECONDARY_NT_summary 1 2 2
SECONDARY_NT_taxonomic_assignment 1 20 20
all 1 1 1
bam_index 1 2 2
calculate_gc 1 2 2
calculate_tet_freq 1 2 2
canu_sample_assembly 1 16 16
combine_AA_NT 1 1 1
combine_canu_contigs 1 1 1
concatentate_contig_count_tables 1 1 1
contig_krona_plot 1 1 1
contig_krona_text_format 1 1 1
contig_read_taxonomy 1 2 2
coverage_calculations 38 8 8
create_contig_count_table 38 1 1
krona_plot 1 1 1
krona_text_format 1 1 1
link_assembly 1 1 1
map_seq_table 1 16 16
mmseqs_contig_annotation 1 20 20
mmseqs_contig_annotation_summary 1 20 20
population_assembly 1 16 16
secondary_nt_calc_lca 1 20 20
secondary_nt_lca_table 1 1 1
seq_properties_table 2 2 2
tax_level_counts 1 2 2
total 113 1 20
Select jobs to execute...
[Wed Mar 15 16:30:09 2023]
rule PRIMARY_AA_taxonomy_assignment:
input: hecatomb.out/results/seqtable.fasta, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/aa/virus_primary_aa/sequenceDB
output: hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY_lca.tsv, hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY_report, hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY_tophit_report, hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY_tophit_aln, hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY_tophit_aln_sorted
log: hecatomb.out/stderr/PRIMARY_AA_taxonomy_assignment.log
jobid: 322
benchmark: hecatomb.out/benchmarks/PRIMARY_AA_taxonomy_assignment.txt
reason: Updated input files: hecatomb.out/results/seqtable.fasta
threads: 20
resources: tmpdir=/tmp, mem_mb=64000, mem_mib=61036, time=1440
{ # Run mmseqs taxonomy module
mmseqs easy-taxonomy hecatomb.out/results/seqtable.fasta /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/aa/virus_primary_aa/sequenceDB hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp --min-length 30 -e 1e-3 --start-sens 1 --sens-steps 3 -s 7 --lca-mode 2 --shuffle 0 -a --tax-output-mode 2 --search-type 2 --tax-lineage 1 --lca-ranks "superkingdom,phylum,class,order,family,genus,species" --format-output "query,target,evalue,pident,fident,nident,mismatch,qcov,tcov,qstart,qend,qlen,tstart,tend,tlen,alnlen,bits,qheader,theader,taxid,taxname,taxlineage" --threads 20 --split-memory-limit 48000M;
# Add headers
sort -k1 -n hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY_tophit_aln | sed '1i query target evalue pident fident nident mismatch qcov tcov qstart qend qlen tstart tend tlen alnlen bits qheader theader taxid taxname lineage' > hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY_tophit_aln_sorted;
} &> hecatomb.out/stderr/PRIMARY_AA_taxonomy_assignment.log
rm hecatomb.out/stderr/PRIMARY_AA_taxonomy_assignment.log
Activating conda environment: ../../lustre/project/taw/share/conda-envs/hecatomb/snakemake/conda/373dc37b9d1afb91f5ef6873a443b8ec_
[Wed Mar 15 16:53:00 2023]
Error in rule PRIMARY_AA_taxonomy_assignment:
jobid: 322
input: hecatomb.out/results/seqtable.fasta, /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/aa/virus_primary_aa/sequenceDB
output: hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY_lca.tsv, hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY_report, hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY_tophit_report, hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY_tophit_aln, hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY_tophit_aln_sorted
log: hecatomb.out/stderr/PRIMARY_AA_taxonomy_assignment.log (check log file(s) for error details)
conda-env: /lustre/project/taw/share/conda-envs/hecatomb/snakemake/conda/373dc37b9d1afb91f5ef6873a443b8ec_
shell:
{ # Run mmseqs taxonomy module
mmseqs easy-taxonomy hecatomb.out/results/seqtable.fasta /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/aa/virus_primary_aa/sequenceDB hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp --min-length 30 -e 1e-3 --start-sens 1 --sens-steps 3 -s 7 --lca-mode 2 --shuffle 0 -a --tax-output-mode 2 --search-type 2 --tax-lineage 1 --lca-ranks "superkingdom,phylum,class,order,family,genus,species" --format-output "query,target,evalue,pident,fident,nident,mismatch,qcov,tcov,qstart,qend,qlen,tstart,tend,tlen,alnlen,bits,qheader,theader,taxid,taxname,taxlineage" --threads 20 --split-memory-limit 48000M;
# Add headers
sort -k1 -n hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY_tophit_aln | sed '1i query target evalue pident fident nident mismatch qcov tcov qstart qend qlen tstart tend tlen alnlen bits qheader theader taxid taxname lineage' > hecatomb.out/processing/mmseqs_aa_primary/MMSEQS_AA_PRIMARY_tophit_aln_sorted;
} &> hecatomb.out/stderr/PRIMARY_AA_taxonomy_assignment.log
rm hecatomb.out/stderr/PRIMARY_AA_taxonomy_assignment.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Logfile hecatomb.out/stderr/PRIMARY_AA_taxonomy_assignment.log:
createdb hecatomb.out/results/seqtable.fasta hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/query --dbtype 0 --shuffle 0 --createdb-mode 1 --write-lookup 0 --id-offset 0 --compressed 0 -v 3
Converting sequences
[===============
Time for merging to query_h: 0h 0m 0s 2ms
Time for merging to query: 0h 0m 0s 1ms
Database type: Nucleotide
Time for processing: 0h 0m 0s 426ms
Tmp hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp folder does not exist or is not a directory.
extractorfs hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/query hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/q_orfs_aa --min-length 30 --max-length 32734 --max-gaps 2147483647 --contig-start-mode 2 --contig-end-mode 2 --orf-start-mode 1 --forward-frames 1,2,3 --reverse-frames 1,2,3 --translation-table 1 --translate 1 --use-all-table-starts 0 --id-offset 0 --create-lookup 0 --threads 20 --compressed 0 -v 3
[==================Invalid sequence with index 2003!
====Invalid sequence with index 55989!
=Invalid sequence with index 63657!
===Invalid sequence with index 79056!
======Invalid sequence with index 64748!
==============Invalid sequence with index 13929!
==================Invalid sequence with index 128311!
=] 151.76K 2s 965ms
Time for merging to q_orfs_aa_h: 0h 0m 1s 105ms
Time for merging to q_orfs_aa: 0h 0m 1s 838ms
Time for processing: 0h 0m 8s 793ms
prefilter hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/q_orfs_aa /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/aa/virus_primary_aa/sequenceDB hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/pref_0 --sub-mat nucl:nucleotide.out,aa:blosum62.out --seed-sub-mat nucl:nucleotide.out,aa:VTML80.out -k 0 --k-score 2147483647 --alph-size nucl:5,aa:21 --max-seq-len 65535 --max-seqs 300 --split 0 --split-mode 2 --split-memory-limit 48000M -c 0 --cov-mode 0 --comp-bias-corr 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-lower-case 0 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca 1 --pcb 1.5 --threads 20 --compressed 0 -v 3 -s 1.000
Query database size: 1773665 type: Aminoacid
Estimated memory consumption: 6G
Target database size: 1840337 type: Aminoacid
Index table k-mer threshold: 154 at k-mer size 6
Index table: counting k-mers
[=================================================================] 1.84M 37s 499ms
Index table: Masked residues: 6082841
Index table: fill
[=================================================================] 1.84M 12s 622ms
Index statistics
Entries: 281557312
DB size: 2099 MB
Avg k-mer size: 4.399333
Top 10 k-mers
TQFELG 248971
GWGPFQ 247784
DFENTQ 246686
NVPGGS 246673
RKTFPS 245401
YQNLGW 244983
SPIQMT 232436
GSLIHR 223178
IKKSWR 222897
HPGKKS 221025
Time for index table init: 0h 0m 52s 957ms
Process prefiltering step 1 of 1
k-mer similarity threshold: 154
Starting prefiltering scores calculation (step 1 of 1)
Query db start 1 to 1773665
Target db start 1 to 1840337
[=================================================================] 1.77M 16s 26ms
2.375376 k-mers per position
407 DB matches per sequence
0 overflows
0 queries produce too many hits (truncated result)
1 sequences passed prefiltering per query sequence
0 median result list length
1059700 sequences with 0 size result lists
Time for merging to pref_0: 0h 0m 1s 477ms
Time for processing: 0h 1m 15s 877ms
align hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/q_orfs_aa /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/aa/virus_primary_aa/sequenceDB hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/pref_0 hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/aln_0 --sub-mat nucl:nucleotide.out,aa:blosum62.out -a 1 --alignment-mode 2 --wrapped-scoring 0 -e 0.001 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --realign 0 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca 1 --pcb 1.5 --score-bias 0 --gap-open nucl:5,aa:11 --gap-extend nucl:2,aa:1 --zdrop 40 --threads 20 --compressed 0 -v 3
Compute score, coverage and sequence identity
Query database size: 1773665 type: Aminoacid
Target database size: 1840337 type: Aminoacid
Calculation of alignments
[=================================================================] 1.77M 4s 24ms
Time for merging to aln_0: 0h 0m 0s 734ms
2956412 alignments calculated.
151097 sequence pairs passed the thresholds (0.051108 of overall calculated).
0.085189 hits per query sequence.
Time for processing: 0h 0m 20s 80ms
createsubdb hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/order_0 hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/q_orfs_aa hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/input_0 -v 3 --subdb-mode 1
Time for merging to input_0: 0h 0m 0s 1ms
Time for processing: 0h 0m 0s 479ms
prefilter hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/input_0 /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/aa/virus_primary_aa/sequenceDB hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/pref_1 --sub-mat nucl:nucleotide.out,aa:blosum62.out --seed-sub-mat nucl:nucleotide.out,aa:VTML80.out -k 0 --k-score 2147483647 --alph-size nucl:5,aa:21 --max-seq-len 65535 --max-seqs 300 --split 0 --split-mode 2 --split-memory-limit 48000M -c 0 --cov-mode 0 --comp-bias-corr 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-lower-case 0 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca 1 --pcb 1.5 --threads 20 --compressed 0 -v 3 -s 4.0
Query database size: 1769959 type: Aminoacid
Estimated memory consumption: 6G
Target database size: 1840337 type: Aminoacid
Index table k-mer threshold: 127 at k-mer size 6
Index table: counting k-mers
[=================================================================] 1.84M 43s 425ms
Index table: Masked residues: 6082841
Index table: fill
[=================================================================] 1.84M 19s 793ms
Index statistics
Entries: 520624295
DB size: 3467 MB
Avg k-mer size: 8.134755
Top 10 k-mers
TQFELG 248971
GWGPFQ 247784
DFENTQ 246686
NVPGGS 246673
RKTFPS 245401
YQNLGW 244983
NNTGYQ 236377
VLVDFS 235418
NKTDEV 234993
VTLVAY 228872
Time for index table init: 0h 1m 13s 263ms
Process prefiltering step 1 of 1
k-mer similarity threshold: 127
Starting prefiltering scores calculation (step 1 of 1)
Query db start 1 to 1769959
Target db start 1 to 1840337
[=================================================================] 1.77M 58s 433ms
57.741859 k-mers per position
15904 DB matches per sequence
0 overflows
0 queries produce too many hits (truncated result)
65 sequences passed prefiltering per query sequence
41 median result list length
22847 sequences with 0 size result lists
Time for merging to pref_1: 0h 0m 3s 484ms
Time for processing: 0h 2m 34s 915ms
align hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/input_0 /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/aa/virus_primary_aa/sequenceDB hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/pref_1 hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/aln_1 --sub-mat nucl:nucleotide.out,aa:blosum62.out -a 1 --alignment-mode 2 --wrapped-scoring 0 -e 0.001 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --realign 0 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca 1 --pcb 1.5 --score-bias 0 --gap-open nucl:5,aa:11 --gap-extend nucl:2,aa:1 --zdrop 40 --threads 20 --compressed 0 -v 3
Compute score, coverage and sequence identity
Query database size: 1769959 type: Aminoacid
Target database size: 1840337 type: Aminoacid
Calculation of alignments
[=================================================================] 1.77M 2m 0s 140ms
Time for merging to aln_1: 0h 0m 5s 38ms
115062805 alignments calculated.
36106 sequence pairs passed the thresholds (0.000314 of overall calculated).
0.020399 hits per query sequence.
Time for processing: 0h 2m 12s 467ms
mergedbs hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/q_orfs_aa hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/aln_merge_new hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/aln_0 hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/aln_1 --compressed 0 -v 3
Merging the results to hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/aln_merge_new
Time for merging to aln_merge_new: 0h 0m 0s 298ms
Time for processing: 0h 0m 5s 423ms
rmdb hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/aln_merge -v 3
Time for processing: 0h 0m 0s 15ms
mvdb hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/aln_merge_new hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/aln_merge -v 3
Time for processing: 0h 0m 0s 8ms
createsubdb hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/order_1 hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/input_0 hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/input_1 -v 3 --subdb-mode 1
Time for merging to input_1: 0h 0m 0s 1ms
Time for processing: 0h 0m 0s 478ms
prefilter hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/input_1 /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../../databases/aa/virus_primary_aa/sequenceDB hecatomb.out/processing/mmseqs_aa_primary/mmseqs_aa_tmp/8991890088055356121/taxonomy_tmp/7817104469906507499/tmp_hsp1/7360049934769759246/search/pref_2 --sub-mat nucl:nucleotide.out,aa:blosum62.out --seed-sub-mat nucl:nucleotide.out,aa:VTML80.out -k 0 --k-score 2147483647 --alph-size nucl:5,aa:21 --max-seq-len 65535 --max-seqs 300 --split 0 --split-mode 2 --split-memory-limit 48000M -c 0 --cov-mode 0 --comp-bias-corr 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-lower-case 0 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca 1 --pcb 1.5 --threads 20 --compressed 0 -v 3 -s 7.0
Query database size: 1768631 type: Aminoacid
Estimated memory consumption: 6G
Target database size: 1840337 type: Aminoacid
Index table k-mer threshold: 100 at k-mer size 6
Index table: counting k-mers
[=================================================================] 1.84M 44s 187ms
Index table: Masked residues: 6082841
Index table: fill
[=================================================================] 1.84M 50s 41ms
Index statistics
Entries: 522772222
DB size: 3479 MB
Avg k-mer size: 8.168316
Top 10 k-mers
TQFELG 248971
GWGPFQ 247784
DFENTQ 246686
NVPGGS 246673
RKTFPS 245401
YQNLGW 244983
NNTGYQ 236377
VLVDFS 235418
NKTDEV 234993
VTLVAY 228872
Time for index table init: 0h 1m 49s 749ms
Process prefiltering step 1 of 1
k-mer similarity threshold: 100
Starting prefiltering scores calculation (step 1 of 1)
Query db start 1 to 1768631
Target db start 1 to 1840337
[====================================================Can not write to data file e
Error: Prefilter died
Error: Search step died
Error: First search died
Error: Search died
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
FATAL: Hecatomb encountered an error.
Dumping all error logs to "hecatomb.errorLogs.txt"
Complete log: .snakemake/log/2023-03-15T163002.627032.snakemake.log
[2023:03:15 16:53:00] Error: Snakemake failed
@beardymcjohnface I am now having a hard time adding my host genome:
hecatomb addHost --host oyster --hostfa H:\hecatomb \host_genomes\GCF_002022765.2_C_virginica-3.0_protein.faa.gz
Hecatomb version v1.1.0
Usage: hecatomb [OPTIONS] COMMAND [ARGS]...
Try 'hecatomb --help' for help.
Error: No such command 'addHost'.
How do you recommend adding the host genome from NCBI?
This was my bad; the subcommand changed with an update. It's now add-host:
hecatomb add-host ...
Make sure you download the nucleotide file, not the proteins.
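For example (the filename below is my assumption, derived from the accession in the failing command by swapping the protein file for NCBI's nucleotide assembly naming; use whatever `*_genomic.fna.gz` file you actually downloaded):

```shell
# the nucleotide assembly for this accession would be the *_genomic.fna.gz
# file (NCBI naming convention), not the *_protein.faa.gz file used above
hecatomb add-host --host oyster --hostfa GCF_002022765.2_C_virginica-3.0_genomic.fna.gz
```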
Re: the mmseqs error, I'm not sure why it died; there are only two similar issues that I could find on the MMseqs2 GitHub. Do you have disk-space or inode quotas? And were you running this with a profile or as a local job? If you ran it with the example profile, there might be slurm logs in logs/PRIMARY_AA_taxonomy_assignment which should tell you if the scheduler killed it for some reason.
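On the quota question, a quick way to check is from the directory holding `hecatomb.out` (these are generic commands, not hecatomb-specific; the Lustre-only `lfs quota` variant is commented out since it only exists on Lustre clients):

```shell
# free space and free inodes on the filesystem holding hecatomb.out
df -h .
df -i .
# on Lustre, per-user block/inode quotas are often enforced separately:
# lfs quota -u "$USER" /lustre
```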
@beardymcjohnface Here are all my logs.
For the 3/16/23 logs I used:
hecatomb run --reads /lustre/project/taw/ONR030223/ONR030223/20230302_1344_MN18851_FAW54058_d4b97d63/fastq_pass/barcode07 --threads 20 --preprocess longread --profile slurm --host oyster
2023-03-16T092821.856620.snakemake.log
2023-03-16T092747.012837.snakemake.log
2023-03-16T092055.486949.snakemake.log
2023-03-16T091947.073215.snakemake.log
Do I need to concatenate my fastq files for each barcode?
from hecatomb.
Scheduler logs are handled separately; you should be able to find them in logs/PRIMARY_AA_taxonomy_assignment/. If it died due to memory or time limits, it should say so in those logs.
EDIT: Yes, you should concatenate the fastq files for each barcode.
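A loop like the following could do the per-barcode concatenation, since gzip streams can simply be concatenated. The first few lines only fabricate a tiny stand-in barcode directory so the sketch is self-contained; with real data you would skip them and run the loop against the actual `fastq_pass` directory:

```shell
# demo setup: a fake barcode directory with two fastq.gz chunks (stand-ins)
mkdir -p fastq_pass/barcode01
printf '@r1\nACGT\n+\n!!!!\n' | gzip > fastq_pass/barcode01/a.fastq.gz
printf '@r2\nTTTT\n+\n!!!!\n' | gzip > fastq_pass/barcode01/b.fastq.gz

# one concatenated fastq.gz per barcode; cat-ing gzip files yields a valid gzip stream
mkdir -p fastq_pass/concatenate
for dir in fastq_pass/barcode*/; do
    bc=$(basename "$dir")
    cat "$dir"*.fastq.gz > "fastq_pass/concatenate/${bc}.fastq.gz"
done
```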
I concatenated all the fastq.gz files and now I get this error:
(/lustre/project/taw/share/conda-envs/hecatomb) [kvigil@cypress01-119 barcode12]$ hecatomb run --reads /lustre/project/taw/ONR030223/ONR030223/20230302_1344_MN18851_FAW54058_d4b97d63/fastq_pass/concatenate/barcode01.fastq.gz --preprocess longread --threads 20 --profile slurm
██╗ ██╗███████╗ ██████╗ █████╗ ████████╗ ██████╗ ███╗ ███╗██████╗
██║ ██║██╔════╝██╔════╝██╔══██╗╚══██╔══╝██╔═══██╗████╗ ████║██╔══██╗
███████║█████╗ ██║ ███████║ ██║ ██║ ██║██╔████╔██║██████╔╝
██╔══██║██╔══╝ ██║ ██╔══██║ ██║ ██║ ██║██║╚██╔╝██║██╔══██╗
██║ ██║███████╗╚██████╗██║ ██║ ██║ ╚██████╔╝██║ ╚═╝ ██║██████╔╝
╚═╝ ╚═╝╚══════╝ ╚═════╝╚═╝ ╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═╝╚═════╝
Hecatomb version v1.1.0
[2023:03:18 12:19:16] Config file hecatomb.config.yaml already exists. Using existing config file.
[2023:03:18 12:19:16] Writing runtime config file to hecatomb.config.yaml
[2023:03:18 12:19:16] ------------------
[2023:03:18 12:19:16] | Runtime config |
[2023:03:18 12:19:16] ------------------
BigJobCpu: 20
BigJobMem: 64000
BigJobTimeMin: 1440
COMPRESSION: 1
CONTIG_MINLENGTH: 1000
CUTTAIL_WINDOW: 25
DEDUP_ACCURACY: 4
Databases: null
ENTROPY: 0.5
ENTROPYWINDOW: 25
Host: human
MediumJobCpu: 16
MediumJobMem: 32000
MoreRamCpu: 2
MoreRamMem: 16000
Output: hecatomb.out
Preprocess: longread
QSCORE: 15
READ_MINLENGTH: 90
Reads: /lustre/project/taw/ONR030223/ONR030223/20230302_1344_MN18851_FAW54058_d4b97d63/fastq_pass/concatenate/barcode01.fastq.gz
Search: sensitive
SmallJobCpu: 8
SmallJobMem: 16000
canuSettings: correctedErrorRate=0.16 maxInputCoverage=10000 minInputCoverage=0 corOutCoverage=10000
corMhapSensitivity=high corMinCoverage=0 useGrid=False stopOnLowCoverage=False genomeSize=10M
-nanopore
filtAAprimary: --min-length 30 -e 1e-3
filtAAsecondary: --min-length 30 -e 1e-5
filtNTprimary: --min-length 90 -e 1e-10
filtNTsecondary: --min-length 90 -e 1e-20
linclustParams: --kmer-per-seq-scale 0.3 -c 0.8 --cov-mode 1 --min-seq-id 0.97 --alignment-mode
3
perfAA: --start-sens 1 --sens-steps 3 -s 7 --lca-mode 2 --shuffle 0
perfAAfast: -s 4.0 --lca-mode 2 --shuffle 0
perfNT: --start-sens 2 -s 7 --sens-steps 3
perfNTfast: -s 4.0
taxIdIgnore: 0 1 2 10239 131567 12429 2759
[2023:03:18 12:19:16] ---------------------
[2023:03:18 12:19:16] | Snakemake command |
[2023:03:18 12:19:16] ---------------------
snakemake -s /lustre/project/taw/share/conda-envs/hecatomb/bin/../snakemake/workflow/Hecatomb.smk --configfile hecatomb.config.yaml --use-conda --conda-frontend mamba --conda-prefix /lustre/project/taw/share/conda-envs/hecatomb/bin/../snakemake/conda --rerun-incomplete --printshellcmds --nolock --show-failed-logs --profile slurm
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/config.yaml is extended by additional config specified via the command line.
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/dbFiles.yaml is extended by additional config specified via the command line.
Config file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/../config/immutable.yaml is extended by additional config specified via the command line.
UnicodeDecodeError in file /lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/rules/00_samples_se.smk, line 34:
'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
File "/lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/Hecatomb.smk", line 71, in
File "/lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/rules/00_samples_se.smk", line 53, in parseSamples
File "/lustre/project/taw/share/conda-envs/hecatomb/snakemake/workflow/rules/00_samples_se.smk", line 34, in samplesFromTsv
File "/lustre/project/taw/share/conda-envs/hecatomb/lib/python3.10/codecs.py", line 322, in decode
[2023:03:18 12:19:19] Error: Snakemake failed
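For context: the 0x8b in that traceback is the second byte of the gzip magic number. Passing a single .fastq.gz path to --reads makes the workflow try to parse the gzipped file itself as a plain-text samples TSV, and UTF-8 decoding fails on exactly that byte. A quick check of the magic bytes (demo.gz is a throwaway file created here):

```shell
# any gzip stream begins with the magic bytes 1f 8b; the 0x8b at position 1
# is exactly the byte the UTF-8 decoder reports in the traceback above
printf 'x\n' | gzip > demo.gz
od -An -tx1 -N2 demo.gz   # -> 1f 8b
```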
Sorry for the late reply, we had a hackathon last week. With --reads, you need to pass either a folder containing the reads or a TSV file that specifies the sample names and file paths. You also don't have to run them one barcode at a time: I would put the concatenated fastqs for each barcode into a new directory and pass that directory to --reads.
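If you'd rather go the TSV route, a loop like this could generate one. The name-then-path two-column layout is my assumption about the expected format, so check the hecatomb docs before relying on it; the `: >` lines only create empty stand-in files so the sketch is self-contained:

```shell
# demo setup: empty stand-ins for two concatenated barcode files
mkdir -p concatenate
: > concatenate/barcode01.fastq.gz
: > concatenate/barcode02.fastq.gz

# one row per barcode: sample name, a tab, then the file path
for f in concatenate/barcode*.fastq.gz; do
    printf '%s\t%s\n' "$(basename "$f" .fastq.gz)" "$f"
done > samples.tsv
cat samples.tsv
```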
Thanks! I will try this out!