Comments (7)

jfy133 commented on June 16, 2024

Thanks for the report!

An EOFError suggests to me that there is an empty input file or a corrupted database somewhere...

If you go into the reported work directory, can you inspect the input files to see if they do have something in them?
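
As a rough way to do that check in one go, here is a minimal Python sketch that lists zero-byte files under a task work directory. The function name `find_empty_files` is made up for illustration and is not part of nf-core/mag or CheckM. It follows symlinks, on the assumption that Nextflow stages inputs as symlinks, so it could loop if the directory contains a symlink cycle.

```python
import os


def find_empty_files(work_dir):
    """Return a sorted list of zero-byte files found under work_dir.

    Hypothetical helper for illustration only. Symlinked directories are
    followed, because staged inputs are assumed to be symlinks.
    """
    empty = []
    for dirpath, _dirnames, filenames in os.walk(work_dir, followlinks=True):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # isfile() follows symlinks and is False for broken links,
            # so getsize() is only called on files that really exist.
            if os.path.isfile(path) and os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)


if __name__ == "__main__":
    import sys

    # Usage: python find_empty.py /path/to/work/ab/cdef...
    for path in find_empty_files(sys.argv[1]):
        print(path)
```

Some files in a work directory can legitimately be empty (for example an error log of a task that printed nothing), so a hit is a pointer for closer inspection rather than proof of a problem.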

from mag.

carleton-envbiotech commented on June 16, 2024

Working through the work directory, I see the following:

  • A directory called checkm_data_2015_01_16 that has multiple subdirectories within it
  • An input_bins directory that contains a single .fa file, as would be expected
  • An additional directory called MEGAHIT-DASTool-unclassified-dastool_refined_unbinned-NitrifyingPelletDNA_Week4_Sulphatereduction_DNARNAkit_rep3_S14_wf, named after the sample that initiated this process.
  • Notably, the 'bin' subdirectory within the above-mentioned directory is empty, as is the 'storage' directory.
  • Using 'cat' to open the checkm.log only returns this, which looks like where the process encountered an error:

[2024-02-22 09:26:07] INFO: CheckM v1.2.1
[2024-02-22 09:26:07] INFO: checkm lineage_wf -t 10 -f MEGAHIT-DASTool-unclassified-dastool_refined_unbinned-NitrifyingPelletDNA_Week4_Sulphatereduction_DNARNAkit_rep3_S14_wf.tsv --tab_table --pplacer_threads 10 -x fa input_bins/ MEGAHIT-DASTool-unclassified-dastool_refined_unbinned-NitrifyingPelletDNA_Week4_Sulphatereduction_DNARNAkit_rep3_S14_wf
[2024-02-22 09:26:07] INFO: CheckM data: checkm_data_2015_01_16
[2024-02-22 09:26:07] INFO: [CheckM - tree] Placing bins in reference genome tree.
[2024-02-22 09:26:08] INFO: Identifying marker genes in 1 bins with 10 threads:
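
To see at a glance which step was the last one logged before the process died, something like the following Python sketch could be used. It assumes only the `[YYYY-MM-DD HH:MM:SS] LEVEL: message` layout visible in the excerpt above; `split_log_entries` is a hypothetical helper name, and the pattern may need adjusting for other CheckM versions. It works whether the entries are on separate lines or run together on one line.

```python
import re

# Start of an entry as seen in the excerpt: "[2024-02-22 09:26:07] INFO: "
ENTRY_START = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] ([A-Z]+): ")


def split_log_entries(text):
    """Split log text into (timestamp, level, message) tuples.

    Hypothetical helper for illustration; the log format is an assumption
    taken from the excerpt, not from the CheckM documentation.
    """
    matches = list(ENTRY_START.finditer(text))
    entries = []
    for i, match in enumerate(matches):
        # A message runs until the next entry starts, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entries.append((match.group(1), match.group(2), text[match.end():end].strip()))
    return entries


if __name__ == "__main__":
    import sys

    # Usage: python last_step.py checkm.log
    with open(sys.argv[1]) as handle:
        entries = split_log_entries(handle.read())
    if entries:
        timestamp, level, message = entries[-1]
        print(f"Last logged step at {timestamp} ({level}): {message}")
```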

jfy133 commented on June 16, 2024

Looking at the checkm issues, I think you may have run out of memory for the checkm process.

You should increase the memory for that errored process in your custom config file too, as it seems you've already done for others.

carleton-envbiotech commented on June 16, 2024

I obtained this error message even after adjusting the configuration to look like the following excerpt:

process { withName: GTDBTK_CLASSIFYWF { cpus = 32 memory = 256.GB } withName: CHECKM_QC { cpus = 32 memory = 256.GB } }

jfy133 commented on June 16, 2024

Gah. Could you try running the command manually (.command.sh) with a local copy of CheckM? That way we can isolate whether the error comes from the pipeline doing something wrong or from the tool...
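
A minimal sketch of that manual re-run, written in Python so the exit code and the tail of stderr are captured in one place. The names `run_command_script` and `looks_like_oom_kill` are invented for this example. It assumes `bash` is available and that `checkm` is on the PATH of the environment it is run from. An exit code of 137 (128 + 9, i.e. SIGKILL) is often, though not always, a sign that the process was killed for using too much memory.

```python
import subprocess


def run_command_script(work_dir, script=".command.sh"):
    """Run the task script with bash inside work_dir.

    Hypothetical helper for illustration. Returns (exit_code, stderr_tail),
    where stderr_tail is the last 20 lines of stderr.
    """
    result = subprocess.run(
        ["bash", script],
        cwd=work_dir,
        capture_output=True,
        text=True,
    )
    stderr_tail = "\n".join(result.stderr.splitlines()[-20:])
    return result.returncode, stderr_tail


def looks_like_oom_kill(exit_code):
    """Heuristic only: 137 is how a shell reports a child killed by SIGKILL;
    -9 is how subprocess reports bash itself being killed by SIGKILL."""
    return exit_code in (137, -9)


if __name__ == "__main__":
    import sys

    # Usage: python rerun_task.py /path/to/work/ab/cdef...
    code, tail = run_command_script(sys.argv[1])
    print(f"exit code: {code}")
    print(tail)
    if looks_like_oom_kill(code):
        print("Process was killed with SIGKILL; running out of memory is one possible cause.")
```

If the script fails in the same way outside the pipeline, that points at the tool or its inputs rather than at the pipeline.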

jfy133 commented on June 16, 2024

@carleton-envbiotech my feeling is that this is still a memory problem; it seems to be a REALLY common issue with checkm, and it results in very similar errors.

I note that your configuration in the excerpt wouldn't work without new lines - was that just a quick type out?

process { 
        withName: GTDBTK_CLASSIFYWF { 
                cpus = 32
                memory = 256.GB 
        } 
        withName: CHECKM_QC { 
                cpus = 32 
                memory = 256.GB 
        }
}

This works for me, for example.

Otherwise, maybe it's the wrong database file being passed to it... the nf-core/mag docs for --checkm_db say it should be the one below, but it looks like you have a different name in the command above (it might be the same contents, IDK)

default: https://data.ace.uq.edu.au/public/gtdb/data/releases/release214/214.1/auxillary_files/gtdbtk_r214_data.tar.gz

jfy133 commented on June 16, 2024

Going to close for now, as I think it's a memory issue rather than a pipeline error.
