Hi, I'm currently trying to get hecatomb working on a VM, but I've r

Comments (43)

linsalrob commented on July 3, 2024

It appears to me that the JVM consumed all the system memory and was killed.

Can you please check the # Run Parameters # section of hecatomb.config.yaml and make sure that you are not requesting too much memory?

Let us know if that fixes the problem.
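As a rough illustration of the check suggested above, the following sketch pulls every memory request out of a config file's text and flags the ones larger than the RAM actually available. It is only a sketch: the key names (BigJobMem and so on) are taken from the config posted later in this thread, the helper names are hypothetical, and the 32000 MB figure is an example value rather than a measured one.

```python
import re


def memory_requests_mb(config_text):
    """Return {key: megabytes} for every line of the form 'SomethingMem: <int>'."""
    requests = {}
    for line in config_text.splitlines():
        match = re.match(r"^\s*(\w*Mem):\s*(\d+)", line)
        if match:
            requests[match.group(1)] = int(match.group(2))
    return requests


def over_budget(config_text, available_mb):
    """Return only the requests that exceed the available memory."""
    requests = memory_requests_mb(config_text)
    return {key: mb for key, mb in requests.items() if mb > available_mb}


# Example values only; substitute the contents of your own hecatomb.config.yaml.
example = """\
BigJobMem: 64000 # Memory for MMSeqs in megabytes
MediumJobMem: 32000
SmallJobMem: 16000
"""
print(over_budget(example, available_mb=32000))  # {'BigJobMem': 64000}
```

Any key this reports is asking for more memory than the machine (or container) has, regardless of how many jobs run at once.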

from hecatomb.

blusoldier001 commented on July 3, 2024

I'm not sure how much memory I should use. I've been running hecatomb from a Docker container, which requires a specific memory allocation; I've requested 16 GB in most cases. Should I request more memory for the container itself?

This is my Hecatomb.config.yaml file, by the way.

##################
# Run Parameters #
##################

# Database installation location, leave blank = use hecatomb install location
Databases:

# STICK TO YOUR SYSTEM'S CPU:RAM RATIO FOR THESE
BigJobMem: 64000 # Memory for MMSeqs in megabytes (e.g 64GB = 64000, recommend >= 64000)
BigJobCpu: 24 # Threads for MMSeqs (recommend >= 16)
BigJobTimeMin: 5760 # Max runtime in minutes for MMSeqs (this is only enforced by the Snakemake profile)
MediumJobMem: 32000 # Memory for Megahit/Flye in megabytes (recommend >= 32000)
MediumJobCpu: 16 # CPUs for Megahit/Flye in megabytes (recommend >= 16)
SmallJobMem: 16000 # Memory for BBTools etc. in megabytes (recommend >= 16000)
SmallJobCpu: 8 # CPUs for BBTools etc. (recommend >= 8)
# default CPUs = 1
defaultMem: 2000 # Default memory in megabytes (for use with --profile)
defaultTime: 1440 # Default time in minutes (for use with --profile)
defaultJobs: 100 # Default concurrent jobs (for use with --profile)

# Some jobs need more RAM; go over your CPU:RAM ratio if needed
MoreRamMem: 16000 # Memory for slightly RAM-hungry jobs in megabytes (recommend >= 16000)
MoreRamCpu: 2 # CPUs for slightly RAM-hungry jobs (recommend >= 2)

beardymcjohnface commented on July 3, 2024

According to the log, you have 64 cores and 32 GB of RAM. Is that correct? If so, by default Hecatomb will be spinning up 3 or 4 BBTools jobs, each reserving 16GB which will put you over your system's available memory.

Change this part of the config like so:

BigJobMem: 32000 # Memory for MMSeqs in megabytes (e.g 64GB = 64000, recommend >= 64000)
BigJobCpu: 64 # Threads for MMSeqs (recommend >= 16)
BigJobTimeMin: 5760 # Max runtime in minutes for MMSeqs (this is only enforced by the Snakemake profile)
MediumJobMem: 32000 # Memory for Megahit/Flye in megabytes (recommend >= 32000)
MediumJobCpu: 64 # CPUs for Megahit/Flye in megabytes (recommend >= 16)
SmallJobMem: 16000 # Memory for BBTools etc. in megabytes (recommend >= 16000)
SmallJobCpu: 32 # CPUs for BBTools etc. (recommend >= 8)
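The arithmetic behind this suggestion can be sketched as follows. This is a simplified model, not how Snakemake actually schedules: it assumes jobs are packed by CPU count alone, with no global memory cap, and it assumes 32 usable cores (the -j 32 that appears in a Snakemake command later in this thread). The function name is hypothetical.

```python
def concurrent_reservation(usable_cores, job_cpus, job_mem_mb):
    """Estimate how many jobs of one size run at once and the memory they reserve.

    Simplifying assumption: the scheduler fills the available cores and
    ignores memory, so concurrency is bounded by CPUs only.
    """
    jobs = max(1, usable_cores // job_cpus)
    return jobs, jobs * job_mem_mb


# SmallJobCpu: 8 with SmallJobMem: 16000 (the original settings)
before = concurrent_reservation(usable_cores=32, job_cpus=8, job_mem_mb=16000)
# SmallJobCpu: 32 with SmallJobMem: 16000 (the suggested settings)
after = concurrent_reservation(usable_cores=32, job_cpus=32, job_mem_mb=16000)

print("before:", before)  # 4 jobs reserving 64000 MB in total
print("after:", after)    # 1 job reserving 16000 MB in total
```

Under these assumptions, raising the per-job CPU count is what keeps the total reservation under the 32 GB the machine has: fewer BBTools jobs fit side by side, so less memory is reserved at any one time.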

blusoldier001 commented on July 3, 2024

Thank you!
This initially seemed to solve my problem, but after 64%, another process failed. I've attached the log here.
2022-01-25T213119.818853.snakemake.log

beardymcjohnface commented on July 3, 2024

It's progress at least. This is an MMSeqs error, and I can't see anything helpful related to bus errors on the mmseqs github issues page https://github.com/soedinglab/MMseqs2/issues
Try rerunning it and I'll see how the newest version of MMSeqs2 works with Hecatomb.

shandley commented on July 3, 2024

Isn't the specific version specified in the mmseqs.yaml though? It should be, as the newer versions of mmseqs (13 and above) changed almost everything about the mmseqs output, so we would want to make sure that no version other than the one specified in the env is used, or there will be loads of downstream parsing issues.

beardymcjohnface commented on July 3, 2024

Scott, the new mmseqsUpdate branch seems to be working for me on the test dataset and we should be good to migrate to the new version whenever we want. It includes a couple of bugfixes that I'll need to cherry-pick into dev and master for now. In the end I only needed to tweak the AA taxonomy steps. The NT tax steps and the assembly mmseqs step worked fine (though I still need to check the assembly contig annotations to make sure they're correct). The bigtable looks fine though.

Jason, let me know if you want to try this version and need help checking out the mmseqsUpdate branch and running it.

shandley commented on July 3, 2024

Hi @beardymcjohnface, we should really take a deeper look. When mmseqs2 updated to release 13-45111 they changed not only everything about how the algorithm works (it works primarily as a contig annotator and less well as a short read annotator) but also all of the output files. The columns are not the same, and I don't think I was ever able to sort out how to dissect the LCA results. It really wasn't an incremental version release as much as it was a release of an entirely new software package.

beardymcjohnface commented on July 3, 2024

I agree, I made it a separate branch so we could make a pull request, review it there and make any necessary changes before merging it with the main branch (assuming it works fine).

blusoldier001 commented on July 3, 2024

Hi,

I've tried running it again, and this time it hit a different error. I'm not sure if these two are related.
2022-02-01T211755.954089.snakemake.log

beardymcjohnface commented on July 3, 2024

Hi, sorry for the late reply. This is an MMSeqs issue, I think. You could try running the commands manually to see if they work, but I would also append a memory limit to the search command (which I've done below). I'll patch this into the next release of Hecatomb just to be safe. If it does work, rerun Hecatomb and it should continue after this step.

mmseqs createdb \
    hecatomb_out/PROCESSING/ASSEMBLY/CONTIG_DICTIONARY/FLYE/assembly.fasta \
    hecatomb_out/PROCESSING/ASSEMBLY/CONTIG_DICTIONARY/FLYE/queryDB \
    --dbtype 2

mmseqs search \
    hecatomb_out/PROCESSING/ASSEMBLY/CONTIG_DICTIONARY/FLYE/queryDB \
    /storage1/fs1/leyao.wang/Active/jason_test/hecatomb/snakemake/workflow/../../databases/nt/virus_primary_nt/sequenceDB \
    hecatomb_out/PROCESSING/ASSEMBLY/CONTIG_DICTIONARY/FLYE/results/result \
    hecatomb_out/PROCESSING/ASSEMBLY/CONTIG_DICTIONARY/FLYE/mmseqs_nt_tmp \
    --start-sens 2 -s 7 --sens-steps 3 --min-length 90 -e 1e-5 --search-type 3 \
    --split-memory-limit 24000

blusoldier001 commented on July 3, 2024

It says that the mmseqs command is not found. Should I install MMseqs? Could that be what's causing the issue?

beardymcjohnface commented on July 3, 2024

oh my bad. You could install it, or you could use the conda env that snakemake created. The easiest way is to just install mmseqs2:

# don't run from your base env, your hecatomb env should be fine
mamba install mmseqs2=12.113e3=h2d02072_2
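Before running commands by hand, it can help to confirm that the tools are actually on the PATH of the active environment. A minimal sketch, using only the Python standard library (the function name is hypothetical):

```python
import shutil


def missing_tools(names):
    """Return the command names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


# 'mmseqs' is the executable used in the manual commands earlier in the thread.
print(missing_tools(["mmseqs"]))
```

An empty list means the active environment can see the executable; a non-empty list corresponds to the "command not found" message from the shell.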

blusoldier001 commented on July 3, 2024

Hi Michael,

When I ran the aforementioned commands, I got another issue: there's a file in hecatomb that couldn't be opened for writing.
2-10-2022-output.txt

beardymcjohnface commented on July 3, 2024

I'm not sure why MMSeqs is failing here. You could try deleting the mmseqs directories hecatomb_out/PROCESSING/ASSEMBLY/CONTIG_DICTIONARY/FLYE/mmseqs_nt_tmp and hecatomb_out/PROCESSING/ASSEMBLY/CONTIG_DICTIONARY/FLYE/results and rerunning hecatomb. Otherwise we'll have to pester the MMSeqs developers for some ideas.

If you're not worried about the contig annotations you can rerun hecatomb and add the option --snake=-k. The pipeline will still "fail" but it should create everything except these files (the assembly, seqtable, bigtable, read-based contig annotations etc).
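The cleanup step can be sketched as below. It removes each directory that exists and reports the ones that were already absent, so a missing directory is not mistaken for a failure. The function name is hypothetical, and the paths in the commented usage line are the two named in this comment.

```python
import os
import shutil


def remove_stale_dirs(paths):
    """Delete each existing directory; return (removed, absent) path lists."""
    removed, absent = [], []
    for path in paths:
        if os.path.isdir(path):
            shutil.rmtree(path)
            removed.append(path)
        else:
            absent.append(path)
    return removed, absent


# Usage (run from the folder that contains hecatomb_out):
# base = "hecatomb_out/PROCESSING/ASSEMBLY/CONTIG_DICTIONARY/FLYE"
# print(remove_stale_dirs([base + "/mmseqs_nt_tmp", base + "/results"]))
```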

blusoldier001 commented on July 3, 2024

I was able to delete mmseqs_nt_tmp but it looks like results didn't exist in the first place.

When I ran hecatomb again, it exited almost instantly. This was the error log:
2022-02-11T202054.110585.snakemake.log

blusoldier001 commented on July 3, 2024

Also, we do need contig annotations.

beardymcjohnface commented on July 3, 2024

I would suggest deleting the hecatomb_out/PROCESSING/ASSEMBLY/CONTIG_DICTIONARY/ directory and making the pipeline regenerate those files; something has been corrupted at some point I think. You should also include the snakemake 'keep going' flag by adding --snake=-k to the end of your hecatomb command. That should hopefully make the pipeline finish the read annotations if nothing else.

blusoldier001 commented on July 3, 2024

Ok! Would the pipeline regenerate them if I simply ran `hecatomb run --test --snake=-k`?

beardymcjohnface commented on July 3, 2024

Yes, any files that are missing should be regenerated, as well as any subsequent files that depend on them. I'm just looking back through the thread; is this the test dataset that is failing?

blusoldier001 commented on July 3, 2024

Yes.

blusoldier001 commented on July 3, 2024

After trying to regenerate them, I ran it again, but it still keeps hitting errors. When I tried to regenerate them again after deleting CONTIG_DICTIONARY, I was greeted with this message:

(/storage1/fs1/leyao.wang/Active/jason_test/hecatomb) j.m.li@compute1-exec-132:~$ hecatomb run --test --snake=-k
Config file hecatomb.config.yaml already exists.
Running Hecatomb
Running snakemake command:
snakemake -j 32 --use-conda --conda-frontend mamba --rerun-incomplete --printshellcmds --nolock --show-failed-logs --conda-prefix /storage1/fs1/leyao.wang/Active/jason_test/hecatomb/snakemake/workflow/conda --configfile hecatomb.config.yaml -k -s /storage1/fs1/leyao.wang/Active/jason_test/hecatomb/snakemake/workflow/Hecatomb.smk -C Reads=/storage1/fs1/leyao.wang/Active/jason_test/hecatomb/test_data Host=human Output=hecatomb_out SkipAssembly=False Fast=False Report=False
Building DAG of jobs...
WorkflowError:
Unable to obtain modification time of file hecatomb_out/RESULTS/assembly.fasta although it existed before. It could be that a concurrent process has deleted it while Snakemake was running.
File "/storage1/fs1/leyao.wang/Active/jason_test/hecatomb/lib/python3.10/asyncio/runners.py", line 44, in run
File "/storage1/fs1/leyao.wang/Active/jason_test/hecatomb/lib/python3.10/asyncio/base_events.py", line 641, in run_until_complete

I've also attached my error file from the normal run.
2022-02-17T210726.314687.snakemake.log

beardymcjohnface commented on July 3, 2024

That modification time error can occur during reruns of failed/killed snakemake pipelines. I think you can just touch the file and it should be ok. Alternatively, you should be able to delete the .snakemake/ directory.

The normal run error is back to mmseqs running out of memory. I don't think we ever got the manual mmseqs commands to run, did we?

The fix for the memory issue is here: 14625b0
You just need to add --split-memory-limit {MMSeqsMemSplit} to the mmseqs command in 03_contig_annotation.smk rules file for the mmseqs_contig_annotation rule. Your file should be in /storage1/fs1/leyao.wang/Active/jason_test/hecatomb/snakemake/workflow/rules/03_contig_annotation.smk. Otherwise, we could try and install the github version and checkout the dev branch, or wait for the next release.

blusoldier001 commented on July 3, 2024

Regarding the modification time error, do you mean I should open and close the 2 python files mentioned?

For mmseqs, there's multiple categories for the annotation rule. Which one should I put the command under?
For reference, this is what the file lists for rule mmseqs_contig_annotation:

rule mmseqs_contig_annotation:
    """Contig annotation step 01: Assign taxonomy to contigs in contig_dictionary using mmseqs
    
    Database: NCBI virus assembly with taxID added
    """
    input:
        contigs=os.path.join(ASSEMBLY,"CONTIG_DICTIONARY","FLYE","assembly.fasta"),
        db=os.path.join(NCBIVIRDB, "sequenceDB")
    output:
        queryDB=os.path.join(ASSEMBLY,"CONTIG_DICTIONARY","FLYE","queryDB"),
        result=os.path.join(ASSEMBLY,"CONTIG_DICTIONARY","FLYE","results","result.index")
    params:
        respath=os.path.join(ASSEMBLY,"CONTIG_DICTIONARY","FLYE","results","result"),
        tmppath=os.path.join(ASSEMBLY,"CONTIG_DICTIONARY","FLYE","mmseqs_nt_tmp")
    benchmark:
        os.path.join(BENCH, "mmseqs_contig_annotation.txt")
    log:
        os.path.join(STDERR, "mmseqs_contig_annotation.log")
    resources:
        mem_mb=MMSeqsMem
    threads:
        MMSeqsCPU
    conda:
        os.path.join("../", "envs", "mmseqs2.yaml")
    shell:
        """
        {{
        mmseqs createdb {input.contigs} {output.queryDB} --dbtype 2;
        mmseqs search {output.queryDB} {input.db} {params.respath} {params.tmppath} \
            {MMSeqsSensNT} {config[filtNTsecondary]} \
            --search-type 3 ; }} &> {log}
        rm {log}
        """

beardymcjohnface commented on July 3, 2024

You can use the touch command to update the timestamps of the files, which is what Snakemake uses to keep track of what it does and does not need to do. If you open the commit link -> 14625b0 you can make the same changes in your file.

blusoldier001 commented on July 3, 2024

That's strange. When I try to touch "hecatomb_out/RESULTS/assembly.fasta" it claims the file/directory does not exist, even when I cd into RESULTS. In the same folder, if I type the ls command, assembly.fasta shows up. But touching it does not work.

blusoldier001 commented on July 3, 2024

I've been able to update the modification time of the symlink, but even then, I'm still encountering this error.
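One plausible explanation for a file that ls lists but touch cannot find is a symlink whose target has been deleted (for example, if the results file links into a directory that was removed earlier). This is an assumption about the situation, not something the logs confirm. A small diagnostic sketch, with a hypothetical function name:

```python
import os


def describe_path(path):
    """Classify a path, distinguishing a live symlink from a dangling one."""
    if os.path.islink(path):
        target = os.readlink(path)
        # os.path.exists follows the link, so it is False when the target is gone.
        if os.path.exists(path):
            return f"symlink -> {target} (target exists)"
        return f"dangling symlink -> {target} (target missing)"
    if os.path.exists(path):
        return "regular file or directory"
    return "missing"


# Usage:
# print(describe_path("hecatomb_out/RESULTS/assembly.fasta"))
```

If the result is a dangling symlink, updating the link's own timestamp will not help, because anything that follows the link still finds nothing at the other end.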

blusoldier001 commented on July 3, 2024

Ok, that's weird. I ran it after changing the name of the file I was supposed to touch so that hecatomb wouldn't be able to find it and it completed a test run successfully.

blusoldier001 commented on July 3, 2024

Now that it successfully completed, do I have to run it again with any other modifications, or is it good to use?

blusoldier001 commented on July 3, 2024

There was a problem when I was running actual datasets, but I don't know if these are issues with hecatomb itself or with the data sets. I've attached all 3 error logs.

The main issue referred to something as "Invalid header line: must start with @HD/@SQ/@RG/@PG/@CO".

Did my version of hecatomb become corrupted due to numerous failed runs?
2022-03-08T220712.552501.snakemake.log
2022-03-08T222137.634342.snakemake.log
hecatomb.crashreport.log

beardymcjohnface commented on July 3, 2024

The error here is with samtools view in host_removal_mapping. minimap maps the reads to the host genome, samtools view will filter the mapped reads, and samtools fastq will convert the bam format back to fastq. I'm not sure why the header isn't being passed by minimap.

Can you run ls -lh hecatomb_out/PROCESSING/TMP/p06/ to make sure the input fastq files aren't empty?

Then you could run

minimap2 -ax sr -t 8 --secondary=no \
/storage1/fs1/leyao.wang/Active/jason_test/hecatomb/snakemake/workflow/../../databases/host/human/masked_ref.fa.gz.idx \
hecatomb_out/PROCESSING/TMP/p06/M667_I8470_32876_Wang_Asthma_A07_NEBNext_Index_5_ACAGTGATCT_S5_L001_R1.s6.out.fastq \
hecatomb_out/PROCESSING/TMP/p06/M667_I8470_32876_Wang_Asthma_A07_NEBNext_Index_5_ACAGTGATCT_S5_L001_R2.s6.out.fastq \
> A07.minimap.test.sam

to see if minimap is outputting any alignments.

blusoldier001 commented on July 3, 2024

The first command returns an error: "ls: cannot access 'hecatomb_out/PROCESSING/TMP/p06/': Operation not permitted"
The second command returns the error "bash: minimap2: command not found"

beardymcjohnface commented on July 3, 2024

You might need to create and activate an environment with minimap2: conda create -n minimap2 -c bioconda minimap2 && conda activate minimap2
Does the hecatomb_out/PROCESSING/TMP/p06/ directory exist?

blusoldier001 commented on July 3, 2024

The directory exists. I'll get back to you on the minimap issue.

blusoldier001 commented on July 3, 2024

Also, should I be running the program while inside of the hecatomb folder or would it be ok if I just cd-ed to the folder of the inputs and then ran it?

beardymcjohnface commented on July 3, 2024

Run it in a clean folder. When I'm running it, I'll create a new folder someAnalysis and a subfolder for the reads, someAnalysis/reads. I would copy or link the reads to the reads folder, then cd to someAnalysis and run hecatomb from there. Don't run it from the hecatomb installation folder.
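The layout described here can be set up with a few lines of Python. This is only a sketch: someAnalysis and reads are the folder names from the comment, the function name is hypothetical, and symlinking (rather than copying) is one of the two options mentioned.

```python
import os


def prepare_analysis_dir(analysis_dir, read_files):
    """Create <analysis_dir>/reads and symlink each read file into it."""
    reads_dir = os.path.join(analysis_dir, "reads")
    os.makedirs(reads_dir, exist_ok=True)
    for source in read_files:
        link = os.path.join(reads_dir, os.path.basename(source))
        # Skip names that are already present so the function can be rerun safely.
        if not os.path.lexists(link):
            os.symlink(os.path.abspath(source), link)
    return reads_dir


# Usage: prepare_analysis_dir("someAnalysis", ["/path/to/sample_R1.fastq"])
# Then cd into someAnalysis and run hecatomb from there.
```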

blusoldier001 commented on July 3, 2024

I moved to a clean folder, ran it, got an error, then proceeded to run the minimap command and then run it again. Unfortunately, it looks like I'm still hitting errors. Here's what I got:
2022-03-10T232525.690659.snakemake.log

Was I supposed to cd into hecatomb_out/PROCESSING/TMP/p06/ and then run the command?
Because if I do that I get an error: "ERROR: failed to open file 'hecatomb_out/PROCESSING/TMP/p06/M667_I8470_32876_Wang_Asthma_A07_NEBNext_Index_5_ACAGTGATCT_S5_L001_R1.s6.out.fastq'"

Thanks!

blusoldier001 commented on July 3, 2024

Here's the file that's unable to be opened.
M667_I8470_32876_Wang_Asthma_A07_NEBNext_Index_5_ACAGTGATCT_S5_L001.s6.stats.zip

beardymcjohnface commented on July 3, 2024

That looks like the same error as before. I'm guessing that sample doesn't have any reads following QC and host removal. I'll have to add an update to check for this.
Is this work urgent; do you want me to try and run your samples for you?
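A check of the kind mentioned here might look like the sketch below. It is not Hecatomb's implementation; it simply counts reads in each uncompressed .fastq file in a directory, assuming the common four-lines-per-record layout, and lists the files with no reads. The function names are hypothetical.

```python
import os


def count_fastq_reads(path):
    """Count reads in an uncompressed FASTQ file, assuming 4 lines per record."""
    with open(path) as handle:
        return sum(1 for _ in handle) // 4


def find_empty_fastqs(directory):
    """Return the names of .fastq files in the directory that contain no reads."""
    empty = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(".fastq"):
            if count_fastq_reads(os.path.join(directory, name)) == 0:
                empty.append(name)
    return empty


# Usage:
# print(find_empty_fastqs("hecatomb_out/PROCESSING/TMP/p06"))
```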

blusoldier001 commented on July 3, 2024

That would be great, thanks! However, the samples, even after being zipped, are still about 4 GB. How should I send them to you?

beardymcjohnface commented on July 3, 2024

Thanks for the email. The dataset ran fine on our system using the current conda version of hecatomb. I wish I knew why it was causing so much grief, but we'll probably have to test Hecatomb in some cloud VMs at some point.

blusoldier001 commented on July 3, 2024

Ok, thank you.

Do you know what the error message

Logfile hecatomb_out/STDERR/host_removal_mapping.M667_I8470_32876_Wang_Asthma_A07_NEBNext_Index_5_ACAGTGATCT_S5_L001.samtoolsView.log:
[E::sam_hdr_create] Invalid header line: must start with @HD/@SQ/@RG/@PG/@CO
[main_samview] fail to read the header from "-".

Logfile hecatomb_out/STDERR/host_removal_mapping.M667_I8470_32876_Wang_Asthma_A07_NEBNext_Index_5_ACAGTGATCT_S5_L001.samtoolsFastq.log:
Failed to read header for "-"

is referring to? This seems to be a local problem.
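For context on the message itself: in the SAM format, header lines begin with "@" followed by one of the record types HD, SQ, RG, PG or CO, and the error indicates that the stream samtools received began with an "@" line that is none of these. The sketch below scans the top of a SAM text stream for the first such line. It is a simplified stand-in for what samtools checks, with a hypothetical function name, and it can be pointed at a saved copy of the minimap2 output to see what the offending line looks like.

```python
VALID_HEADER_TYPES = ("@HD", "@SQ", "@RG", "@PG", "@CO")


def first_invalid_header(lines):
    """Return (line_number, text) of the first bad header line, or None.

    Scanning stops at the first line that does not start with '@',
    which is where the header section ends.
    """
    for number, line in enumerate(lines, start=1):
        if not line.startswith("@"):
            return None
        if line[:3] not in VALID_HEADER_TYPES:
            return number, line.rstrip("\n")
    return None


# Usage with a saved SAM file:
# with open("A07.minimap.test.sam") as handle:
#     print(first_invalid_header(handle))
```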

beardymcjohnface commented on July 3, 2024

cont'd via email.

Java Runtime Error when running test databases on VM about hecatomb HOT 43 CLOSED
