cactus's Introduction

Cactus

Cactus is a reference-free whole-genome alignment program, as well as a pangenome graph construction toolkit.

Getting Cactus

  • Use the precompiled binaries (Linux X86) or Docker image from the latest release
  • See below for details on building from source.

Getting help

Please subscribe to the cactus-announce low-volume mailing list so we may reach out about releases and other announcements.

To ask questions or request help, please use the Cactus GitHub Discussions.

To file a bug report or enhancement request against the code or documentation, create a GitHub Issue.

Align Genomes from Different Species

Align Genomes from the Same Species and Build Pangenome Graphs

Acknowledgements

Cactus uses many different algorithms and individual code contributions, principally from Joel Armstrong, Glenn Hickey, Mark Diekhans and Benedict Paten. We are particularly grateful to:

  • Yung H. Tsin and Nima Norouzi for contributing their 3-edge connected components program code, which is crucial in constructing the cactus graph structure, see: Tsin, Y.H., "A simple 3-edge-connected component algorithm," Theory of Computing Systems, vol. 40, no. 2, 2007, pp. 125-142.
  • Bob Harris for providing endless support for his LastZ pairwise, blast-like genome alignment tool.
  • Melissa Jane Hubisz and Adam Siepel for halPhyloP and Phast.
  • Sneha Goenka and Yatish Turakhia for SegAlign, the GPU-accelerated version of LastZ.
  • Yan Gao et al. for abPOA
  • Heng Li for minigraph, minimap2, gfatools and dna-brnn
  • Dany Doerr for GFAffix, used to optionally clean pangenome graphs.
  • The vg team for vg, used to process pangenome graphs.
  • The authors of Mash
  • Andrea Guarracino, Erik Garrison and co-authors for odgi. Make sure to cite odgi when using it or its visualizations.
  • Hani Z. Girgis for RED
  • Erik Garrison and co-authors for vcfwave. vcflib citation

Installing Manually From Source

Cactus requires Python >= 3.8, along with the Python development headers and libraries.
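
For example, on Ubuntu the interpreter, development headers, pip and virtualenv support can typically be installed with something like the following (the package names are an assumption here, not from this README; adjust for your distribution):

sudo apt-get install python3 python3-dev python3-pip python3-virtualenv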

Clone cactus and submodules

git clone https://github.com/ComparativeGenomicsToolkit/cactus.git --recursive

Create the Python virtual environment. Install virtualenv first if needed with python3 -m pip install virtualenv.

cd cactus
virtualenv -p python3 cactus_env
echo "export PATH=$(pwd)/bin:\$PATH" >> cactus_env/bin/activate
echo "export PYTHONPATH=$(pwd)/lib:\$PYTHONPATH" >> cactus_env/bin/activate
source cactus_env/bin/activate
python3 -m pip install -U setuptools pip wheel
python3 -m pip install -U .
python3 -m pip install -U -r ./toil-requirement.txt

If you have Docker installed, you can now run Cactus. All binaries, such as lastz and cactus-consolidated, will be run via Docker. Singularity binaries can be used in place of Docker binaries with the --binariesMode singularity flag. Note that you must use Singularity 2.3 - 2.6 or Singularity 3.1.0+; Singularity 3 versions below 3.1.0 are incompatible with Cactus (see issue #55 and issue #60).
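
As an illustrative sketch (the job store, seqFile, and output names below are placeholders, not part of this README), a run that uses Singularity rather than Docker for the binaries might look like:

cactus ./jobstore ./evolverMammals.txt ./evolverMammals.hal --binariesMode singularity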

By default, cactus will use the image corresponding to the latest release when running docker binaries. This is usually okay, but can be overridden with the CACTUS_DOCKER_ORG and CACTUS_DOCKER_TAG environment variables. For example, to use GPU release 2.4.4, run export CACTUS_DOCKER_TAG=v2.4.4-gpu before running cactus.

Compiling Binaries Locally

In order to compile the binaries locally and not use a Docker image, you need some dependencies installed. On Ubuntu (we've tested on 20.04 and 22.04), you can look at the Cactus Dockerfile for guidance. To obtain the apt-get command:

grep apt-get Dockerfile | head -1 | sed -e 's/RUN //g' -e 's/apt-get/sudo apt-get/g'

Progressive Cactus can be built on ARM CPUs, including on Mac (with packages installed via Homebrew), but Minigraph-Cactus is currently X86-only.

To build Cactus, run

make -j 8

In order to run the Minigraph-Cactus pipeline, you must also run

build-tools/downloadPangenomeTools

In order to run cactus-pangenome --vcfwave, you may then need to run

export LD_LIBRARY_PATH=$(pwd)/lib:$LD_LIBRARY_PATH

If you want to work with MAF, including running cactus-hal2maf, you must also run

build-tools/downloadMafTools

In order to toggle between local and Docker binaries, use the --binariesMode command line option. If --binariesMode is not specified, local binaries will be used if found in PATH, otherwise a Docker image will be used.
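
For example (the file names below are placeholders, not part of this README), the same run can be pinned to either mode explicitly:

cactus ./jobstore ./seqFile.txt ./out.hal --binariesMode local
cactus ./jobstore ./seqFile.txt ./out.hal --binariesMode docker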

cactus's People

Contributors

adamnovak, adderan, benedictpaten, dailydreaming, dentearl, diekhans, epaull, esrice, glennhickey, gsneha26, jasonbuechler, jmonlong, joelarmstrong, lparsons, markfilus, mobinasri, muffato, ngannguyen, ofirr, pehgp, robin-rounthwaite, thiagogenez

cactus's Issues

can't run cactus

Hi,
I am trying to run cactus because it's required by another program called "Ragout" to assemble genomes based on reference genomes. However, after installing cactus, I can't find the script progressiveCactus anywhere. Do you have any idea why?
Thanks,
Wei Wei

-lpthread apparently absent from cactus/caf/Makefile

In my hands, -lpthread must be added to the gcc build of cactus_caf and stCafTests, or else the build fails:

...
cd caf && make all
make[1]: Entering directory `/home/mec/local/src/cactus/caf'
gcc -std=c99 -O3 -g -Wall --pedantic -funroll-loops -DNDEBUG  -I /home/mec/local/src/cactus/submodules/sonLib/lib -I inc -I ../lib// -c impl/*.c
impl/giantComponent.c: In function 'stCaf_breakupComponentGreedily':
impl/giantComponent.c:38:13: warning: variable 'edgeScore' set but not used [-Wunused-but-set-variable]
     int64_t edgeScore = INT64_MAX;
             ^
ar rc stCaf.a *.o
ranlib stCaf.a 
rm *.o
mv stCaf.a ../lib//
cp inc/*.h ../lib//
gcc -std=c99 -O3 -g -Wall --pedantic -funroll-loops -DNDEBUG  -I /home/mec/local/src/cactus/submodules/sonLib/lib -I inc -I impl -I../lib/ -o ../bin//stCafTests tests/*.c impl/*.c ../lib//stCaf.a ../lib//cactusBlastAlignment.a /home/mec/local/src/cactus/submodules/sonLib/lib/stPinchesAndCacti.a /home/mec/local/src/cactus/submodules/sonLib/lib/3EdgeConnected.a ../lib//cactusLib.a /home/mec/local/src/cactus/submodules/sonLib/lib/sonLib.a /home/mec/local/src/cactus/submodules/sonLib/lib/cuTest.a    -lz -lm -lpthread
impl/giantComponent.c: In function 'stCaf_breakupComponentGreedily':
impl/giantComponent.c:38:13: warning: variable 'edgeScore' set but not used [-Wunused-but-set-variable]
     int64_t edgeScore = INT64_MAX;
             ^
gcc -std=c99 -O3 -g -Wall --pedantic -funroll-loops -DNDEBUG  -I /home/mec/local/src/cactus/submodules/sonLib/lib -I inc -I impl -I../lib/ -o ../bin//cactus_caf cactus_caf.c impl/*.c ../lib//stCaf.a ../lib//cactusBlastAlignment.a /home/mec/local/src/cactus/submodules/sonLib/lib/stPinchesAndCacti.a /home/mec/local/src/cactus/submodules/sonLib/lib/3EdgeConnected.a ../lib//cactusLib.a /home/mec/local/src/cactus/submodules/sonLib/lib/sonLib.a /home/mec/local/src/cactus/submodules/sonLib/lib/cuTest.a    -lz -lm
impl/giantComponent.c: In function 'stCaf_breakupComponentGreedily':
impl/giantComponent.c:38:13: warning: variable 'edgeScore' set but not used [-Wunused-but-set-variable]
     int64_t edgeScore = INT64_MAX;
             ^
/home/mec/local/src/cactus/submodules/sonLib/lib/sonLib.a(stThreadPool.o): In function `stThreadPool_construct':
/home/mec/local/src/cactus/submodules/sonLib/C/impl/stThreadPool.c:144: undefined reference to `pthread_create'
/home/mec/local/src/cactus/submodules/sonLib/C/impl/stThreadPool.c:144: undefined reference to `pthread_create'
/home/mec/local/src/cactus/submodules/sonLib/C/impl/stThreadPool.c:144: undefined reference to `pthread_create'
/home/mec/local/src/cactus/submodules/sonLib/C/impl/stThreadPool.c:144: undefined reference to `pthread_create'
/home/mec/local/src/cactus/submodules/sonLib/C/impl/stThreadPool.c:144: undefined reference to `pthread_create'
/home/mec/local/src/cactus/submodules/sonLib/lib/sonLib.a(stThreadPool.o):/home/mec/local/src/cactus/submodules/sonLib/C/impl/stThreadPool.c:144: more undefined references to `pthread_create' follow
/home/mec/local/src/cactus/submodules/sonLib/lib/sonLib.a(stThreadPool.o): In function `stThreadPool_destruct':
/home/mec/local/src/cactus/submodules/sonLib/C/impl/stThreadPool.c:195: undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status
make[1]: *** [../bin//cactus_caf] Error 1
make[1]: Leaving directory `/home/mec/local/src/cactus/caf'
make: *** [all.caf] Error 2

run cactus

I just installed cactus and want to run it, but I couldn't find an executable file named cactus. In the bin directory there are only cactus_*** commands, such as cactus_bar, cactus_barTests, and cactus_caf. Does anyone know why?

trouble installing cactus

Hi there, if one does not have any of the various dependencies already installed, which versions of the following do you suggest?
kyoto
lua
lzo
zlib
gcc
toil
I am having trouble installing cactus correctly, and my group very much wants to use it. I even got it installed at one point, but there was a discrepancy with the versions that I couldn't trace down, so I thought I'd just ask here beforehand.
Thanks,
Shwetha

Syntax error: "(" unexpected in CactusSetupPhase singularity script

When running with Singularity on our Slurm cluster (with Singularity 3.0.1), I get a syntax error in the CactusSetupPhase:

=========> Failed job 'CactusSetupPhase' A/y/jobh8kc9l
INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
INFO:toil:Running Toil version 3.14.0-b91dbf9bf6116879952f0a70f9a2fbbcae7e51b6.
WARNING:toil.resource:'JTRES_57583f1446e225c9fde5b17d58ae8b9c' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
INFO:toil.lib.bioio:Sequences in cactus setup: ['simHuman_chr6', 'simMouse_chr6', 'simRat_chr6', 'simCow_chr6', 'simDog_chr6']
INFO:toil.lib.bioio:Sequences in cactus setup filenames: ['>id=1|simHuman.chr6|0\n', '>id=0|simMouse.chr6\n', '>id=2|simRat.chr6\n', '>id=4|simCow.chr6|0\n', '>id=3|simDog.chr6|0\n']
INFO:cactus.shared.common:Work dirs: set([u'/tmp/toil-e9feab2f-d6c3-451d-b8e3-3a340eaa4356-56d34a5a-a657-4dd7-a772-c841bc6d87d4/tmpn17gUC/537867c6-3575-4d7d-b2dd-9f8098975e0c'])
INFO:cactus.shared.common:Docker work dir: /tmp/toil-e9feab2f-d6c3-451d-b8e3-3a340eaa4356-56d34a5a-a657-4dd7-a772-c841bc6d87d4/tmpn17gUC/537867c6-3575-4d7d-b2dd-9f8098975e0c
INFO:cactus.shared.common:Running the command ['singularity', '--silent', 'run', u'/tigress/lparsons/kocher_lab/bee_genome_alignment/cactus_evolverMammals_jobStorePath/cactus.img', u'cactus_setup', u'--speciesTree', u'((simHuman_chr6:0.144018,(simMouse_chr6:0.084509,simRat_chr6:0.091589)mr:0.271974)Anc1:0.020593,(simCow_chr6:0.18908,simDog_chr6:0.16303)Anc2:0.032898)Anc0;', u'--cactusDisk', u'<st_kv_database_conf type="kyoto_tycoon">\n\t\t\t<kyoto_tycoon database_dir="fakepath" host="127.0.0.1" port="6328" />\n\t\t</st_kv_database_conf>\n\t', u'--logLevel', u'INFO', u'--outgroupEvents', u'simHuman_chr6 simDog_chr6 simCow_chr6', u'tmpIHe4kJ.tmp', u'tmpKPNO0E.tmp', u'tmpVEzmZ8.tmp', u'tmpDwKNI0.tmp', u'tmpGS_J80.tmp']
/.singularity.d/runscript: 1: eval: Syntax error: "(" unexpected
Traceback (most recent call last):
  File "/tigress/lparsons/miniconda3/envs/bee_genome_alignment/lib/python2.7/site-packages/toil/worker.py", line 309, in workerScript
    job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
  File "/tigress/lparsons/kocher_lab/bee_genome_alignment/cactus/src/cactus/shared/common.py", line 1094, in _runner
    super(RoundedJob, self)._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
  File "/tigress/lparsons/miniconda3/envs/bee_genome_alignment/lib/python2.7/site-packages/toil/job.py", line 1328, in _runner
    returnValues = self._run(jobGraph, fileStore)
  File "/tigress/lparsons/miniconda3/envs/bee_genome_alignment/lib/python2.7/site-packages/toil/job.py", line 1273, in _run
    return self.run(fileStore)
  File "/tigress/lparsons/kocher_lab/bee_genome_alignment/cactus/src/cactus/pipeline/cactus_workflow.py", line 641, in run
    makeEventHeadersAlphaNumeric=self.getOptionalPhaseAttrib("makeEventHeadersAlphaNumeric", bool, False))
  File "/tigress/lparsons/kocher_lab/bee_genome_alignment/cactus/src/cactus/shared/common.py", line 220, in runCactusSetup
    parameters=["cactus_setup"] + args + sequences)
  File "/tigress/lparsons/kocher_lab/bee_genome_alignment/cactus/src/cactus/shared/common.py", line 1038, in cactus_call
    raise RuntimeError("Command %s failed with output: %s" % (call, output))
RuntimeError: Command ['singularity', '--silent', 'run', u'/tigress/lparsons/kocher_lab/bee_genome_alignment/cactus_evolverMammals_jobStorePath/cactus.img', u'cactus_setup', u'--speciesTree', u'((simHuman_chr6:0.144018,(simMouse_chr6:0.084509,simRat_chr6:0.091589)mr:0.271974)Anc1:0.020593,(simCow_chr6:0.18908,simDog_chr6:0.16303)Anc2:0.032898)Anc0;', u'--cactusDisk', u'<st_kv_database_conf type="kyoto_tycoon">\n\t\t\t<kyoto_tycoon database_dir="fakepath" host="127.0.0.1" port="6328" />\n\t\t</st_kv_database_conf>\n\t', u'--logLevel', u'INFO', u'--outgroupEvents', u'simHuman_chr6 simDog_chr6 simCow_chr6', u'tmpIHe4kJ.tmp', u'tmpKPNO0E.tmp', u'tmpVEzmZ8.tmp', u'tmpDwKNI0.tmp', u'tmpGS_J80.tmp'] failed with output:
ERROR:toil.worker:Exiting the worker because of a failed job on host della-r4c2n8
WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'CactusSetupPhase' A/y/jobh8kc9l with ID A/y/jobh8kc9l to 0
<=========

Complete logfile attached:
logfile.log.gz

[Dependency] KyotoCabinet not creating a snapshot

Hello,
I am having problems installing cactus from source. The issue is that the compiled KyotoCabinet is not capable of creating a snapshot when instructed to; it aborts the attempt, complaining that it is hanging.
I have tried to make it work with the default Ubuntu version, as well as with the versions provided by CloudFlare and AlticeLabs. The program works when using the provided Docker image, but I cannot understand how or why.

Is there any particular "recipe" one has to follow to install kyototycoon? This is essential for my team: we plan to develop Cactus a bit and, most importantly, we do not have a proper Singularity/Docker installation on our cluster.

too little default memory for RunBlast

Hi,
I am using Cactus on a linux cluster using Slurm.

I have noticed that a large number of jobs run out of memory. They get resubmitted with increased memory, but this IMO creates unnecessary burden on the scheduler (plus negatively impacts my fairshare).

As you can see in the example below, the memory appears to be set to 100M, which is different from --defaultMemory, which I thought was set to 2 G. I don't know if it is a mere coincidence, but 100M is the memory allocated to jobs by default on our cluster when none is specified at submission time.

@joelarmstrong , any idea what is happening here?

Issued job 'RunBlast' Q/O/jobExNWJB with job batch system ID: 10974 and cores: 1, disk: 100.0 M, and memory: 100.0 M
Job ended successfully: 'RunBlast' Q/O/jobExNWJB
The job seems to have left a log file, indicating failure: 'RunBlast' Q/O/jobExNWJB
Q/O/jobExNWJB    INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
Q/O/jobExNWJB    INFO:toil:Running Toil version 3.18.0-84239d802248a5f4a220e762b3b8ce5cc92af0be.
Q/O/jobExNWJB    WARNING:toil.resource:'JTRES_b2dc083614dd582f2772f871e1b81eef' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
Q/O/jobExNWJB    INFO:cactus.shared.common:Docker work dir: /tmp/toil-68ba9d40-d272-4d3a-82d7-37579919f5d7-1e4ee729-0a33-477c-9099-d1a437c00af4/tmpbRhvn2/bbb7f143-3e34-46b2-b48c-df6fea1565f2
Q/O/jobExNWJB    INFO:cactus.shared.common:Running the command ['singularity', '--silent', 'run', u'/n/regal/hoekstra_lab/lassance/ProgressiveCactus/Pman1_Pman2/jobstore/cactus.img', u'cPecanLastz', u'--format=cigar', u'--notrivial', u'--step=1', u'--ambiguous=iupac,100,100', u'--ydrop=3000', u'tmpDACbhQ.tmp[multiple][nameparse=darkspace]', u'tmpidyx89.tmp[nameparse=darkspace]']
Q/O/jobExNWJB    Running command catchsegv 'cPecanLastz' '--format=cigar' '--notrivial' '--step=1' '--ambiguous=iupac,100,100' '--ydrop=3000' 'tmpDACbhQ.tmp[multiple][nameparse=darkspace]' 'tmpidyx89.tmp[nameparse=darkspace]'
Q/O/jobExNWJB    Killed
Q/O/jobExNWJB    Traceback (most recent call last):
Q/O/jobExNWJB      File "/n/home01/lassance/.conda/envs/ENV_CACTUS/lib/python2.7/site-packages/toil/worker.py", line 314, in workerScript
Q/O/jobExNWJB        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
Q/O/jobExNWJB      File "/n/home01/lassance/.conda/envs/ENV_CACTUS/lib/python2.7/site-packages/cactus/shared/common.py", line 1094, in _runner
Q/O/jobExNWJB        super(RoundedJob, self)._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
Q/O/jobExNWJB      File "/n/home01/lassance/.conda/envs/ENV_CACTUS/lib/python2.7/site-packages/toil/job.py", line 1351, in _runner
Q/O/jobExNWJB        returnValues = self._run(jobGraph, fileStore)
Q/O/jobExNWJB      File "/n/home01/lassance/.conda/envs/ENV_CACTUS/lib/python2.7/site-packages/toil/job.py", line 1296, in _run
Q/O/jobExNWJB        return self.run(fileStore)
Q/O/jobExNWJB      File "/n/home01/lassance/.conda/envs/ENV_CACTUS/lib/python2.7/site-packages/cactus/blast/blast.py", line 443, in run
Q/O/jobExNWJB        runLastz(seqFile1, seqFile2, blastResultsFile, lastzArguments = self.blastOptions.lastzArguments)
Q/O/jobExNWJB      File "/n/home01/lassance/.conda/envs/ENV_CACTUS/lib/python2.7/site-packages/cactus/shared/common.py", line 758, in runLastz
Q/O/jobExNWJB        soft_timeout=5400)
Q/O/jobExNWJB      File "/n/home01/lassance/.conda/envs/ENV_CACTUS/lib/python2.7/site-packages/cactus/shared/common.py", line 1038, in cactus_call
Q/O/jobExNWJB        raise RuntimeError("Command %s failed with output: %s" % (call, output))
Q/O/jobExNWJB    RuntimeError: Command ['singularity', '--silent', 'run', u'/n/regal/hoekstra_lab/lassance/ProgressiveCactus/Pman1_Pman2/jobstore/cactus.img', u'cPecanLastz', u'--format=cigar', u'--notrivial', u'--step=1', u'--ambiguous=iupac,100,100', u'--ydrop=3000', u'tmpDACbhQ.tmp[multiple][nameparse=darkspace]', u'tmpidyx89.tmp[nameparse=darkspace]'] failed with output: None
Q/O/jobExNWJB    ERROR:toil.worker:Exiting the worker because of a failed job on host holy7b01214.rc.fas.harvard.edu
Q/O/jobExNWJB    WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'RunBlast' Q/O/jobExNWJB with ID Q/O/jobExNWJB to 5
Q/O/jobExNWJB    WARNING:toil.jobGraph:We have increased the default memory of the failed job 'RunBlast' Q/O/jobExNWJB to 2147483648 bytes
Issued job 'RunBlast' Q/O/jobExNWJB with job batch system ID: 32089 and cores: 1, disk: 100.0 M, and memory: 2.0 G
Job ended successfully: 'RunBlast' Q/O/jobExNWJB

20+ hour realign jobs when self-aligning primates

Maybe an issue with heavily repetitive regions? Just using this as a space for notes:

Problematic jobs:

/bin/sh -c cactus_lastz --format=cigar --step=2 --ambiguous=iupac,100,100 --ydrop=3000 --notransition --identity=85 /hive/users/jcarmstr/cactusStuff/100way-forBakeOff/primates/work/jobTree/jobs/t1/t0/t0/gTD0/tmp_eYgf24uzun/chunks/119[multiple][nameparse=darkspace] /hive/users/jcarmstr/cactusStuff/100way-forBakeOff/primates/work/jobTree/jobs/t1/t0/t0/gTD0/tmp_eYgf24uzun/chunks/119[nameparse=darkspace] --notrivial | cactus_realign --gapGamma 0.9 --diagonalExpansion 4 --splitMatrixBiggerThanThis 10 --constraintDiagonalTrim 0 --alignAmbiguityCharacters --splitIndelsLongerThanThis 99 /hive/users/jcarmstr/cactusStuff/100way-forBakeOff/primates/work/jobTree/jobs/t1/t0/t0/gTD0/tmp_eYgf24uzun/chunks/119 > /scratch/tmp/tmpJ0UXU7/localTempDir/tempResults.cig

/bin/sh -c cactus_lastz --format=cigar --step=2 --ambiguous=iupac,100,100 --ydrop=3000 --notransition --identity=85 /hive/users/jcarmstr/cactusStuff/100way-forBakeOff/primates/work/jobTree/jobs/t1/t0/t0/gTD0/tmp_eYgf24uzun/chunks/118[multiple][nameparse=darkspace] /hive/users/jcarmstr/cactusStuff/100way-forBakeOff/primates/work/jobTree/jobs/t1/t0/t0/gTD0/tmp_eYgf24uzun/chunks/118[nameparse=darkspace] --notrivial | cactus_realign --gapGamma 0.9 --diagonalExpansion 4 --splitMatrixBiggerThanThis 10 --constraintDiagonalTrim 0 --alignAmbiguityCharacters --splitIndelsLongerThanThis 99 /hive/users/jcarmstr/cactusStuff/100way-forBakeOff/primates/work/jobTree/jobs/t1/t0/t0/gTD0/tmp_eYgf24uzun/chunks/118 > /scratch/tmp/tmp1wAwDi/localTempDir/tempResults.cig

/bin/sh -c cactus_lastz --format=cigar --step=2 --ambiguous=iupac,100,100 --ydrop=3000 --notransition --identity=85 /hive/users/jcarmstr/cactusStuff/100way-forBakeOff/primates/work/jobTree/jobs/t1/t0/t0/gTD0/tmp_eYgf24uzun/chunks/117[multiple][nameparse=darkspace] /hive/users/jcarmstr/cactusStuff/100way-forBakeOff/primates/work/jobTree/jobs/t1/t0/t0/gTD0/tmp_eYgf24uzun/chunks/117[nameparse=darkspace] --notrivial | cactus_realign --gapGamma 0.9 --diagonalExpansion 4 --splitMatrixBiggerThanThis 10 --constraintDiagonalTrim 0 --alignAmbiguityCharacters --splitIndelsLongerThanThis 99 /hive/users/jcarmstr/cactusStuff/100way-forBakeOff/primates/work/jobTree/jobs/t1/t0/t0/gTD0/tmp_eYgf24uzun/chunks/117 > /scratch/tmp/tmpYQ0UGZ/localTempDir/tempResults.cig

LD_RUN_PATH, LDFLAGS ignored by Makefile

Is there a way to pass -Wl,--enable-new-dtags,-rpath,/some/path to the linker when using the included Makefile? I have GCC installed in a non-standard location and would like the ability to embed a RUNPATH in the libraries and binaries, but exporting LD_RUN_PATH and/or LDFLAGS has no effect. I could just patchelf everything after running make, but presumably there's a better way. I don't consider setting LD_LIBRARY_PATH as a real solution, just a great way to mess up other software's functionality. Is there another flag that I can use, or a modification to the Makefile that I should try?

DeadlockException Error when Running Cactus

Hi,

I'm trying to run cactus with Docker on a machine, but whenever I do, a DeadlockException gets raised, and I cannot figure out how to fix it. I have tried increasing the disk space, the memory, and the number of cores used via --defaultDisk, --defaultMemory, and --defaultCores. I have also tried lowering the number of service jobs run at once using --maxServiceJobs. The program also uses only 2G of memory when 64G is available on the machine, even when I specify that more memory can be allocated. I would greatly appreciate any guidance on this issue. Thank you.

compile error

Hi Benedict,

This is an error I got in the compilation process.

cd api && make all
make[1]: Entering directory `/nfs/yunheo1/software/assemblathon1/cactus/api'
cp inc/cactus*.h ../lib//
gcc -std=c99 -O3 -g -Wall -Werror --pedantic -funroll-loops -lm  -I ..//../sonLib/lib -I/nfs/yunheo1/software/assemblathon1/local/include -DHAVE_TOKYO_CABINET=1 -I/nfs/yunheo1/software/assemblathon1/local/include -DHAVE_KYOTO_TYCOON=1  -I/usr/include/mysql -DHAVE_MYSQL=1  -I inc -I ../lib// -c impl/cactus*.c
impl/cactusChain.c: In function 'chain_getAverageInstanceBaseLength':
impl/cactusChain.c:98:19: error: variable 'l' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors

impl/cactusDisk.c: In function 'cactusDisk_getBlockOfUniqueIDs':
impl/cactusDisk.c:554:13: error: assuming signed overflow does not occur when assuming that (X + c) < X is always false [-Werror=strict-overflow]
cc1: all warnings being treated as errors

make[1]: *** [../lib//cactusLib.a] Error 1
make[1]: Leaving directory `/nfs/yunheo1/software/assemblathon1/cactus/api'
make: *** [all.api] Error 2

Could you check why it happened?
Thank you.

Yun

high throughput or high performance cluster?

Thank you for developing this extremely useful software. I'm eager to use it on my data, which comprises nine highly fragmented draft genome assemblies (~100K contigs) and one well annotated, but also fragmented, reference assembly (~20K scaffolds); all ten assemblies are from one individual representing ten species of the same genus. I'm wondering if you could suggest a way to map these assemblies to each other with the goals of 1) using the well annotated reference and mapped contigs to create an official gene set for the other draft assemblies with the Comparative Annotation Toolkit (CAT), and 2) identifying novel genic features such as large-scale inversions, deletions, etc. I've created a relatedness tree from approximately 350 single-copy orthologs identified from BUSCO, and I'm curious whether I should map all of these assemblies to the reference at once, including the nexus tree, or map them individually to the reference to create 9 HAL files. Also, I have access to two computing clusters, one high-performance (MPI-supported) and one high-throughput, and I'm a bit confused by the cluster and batching section on the user page, so I was hoping you could make a suggestion. Thanks again, and I hope to hear back from you soon.

Error "TypeError: 'InEdgeDataView' object does not support indexing"

I'm trying to run the evolverMammals.txt cactus example on the interactive node of my cluster. The virtual environment loads correctly, and since docker isn't installed I've been using "--binariesMode local".

Whenever I run the cactus command with these settings, I get the error "TypeError: 'InEdgeDataView' object does not support indexing". The most recent traceback is to:

../anaconda2/envs/cactus/lib/python2.7/site-packages/sonLib/nxtree.py", line 71, in getParent
return edges[0][0]

I am not well versed in python and was wondering if you had any guidance on how to proceed.

toil.leader.FailedJobsException when running on aws

Run the command:

nohup cactus --stats --nodeTypes c4.8xlarge:0.6,r4.8xlarge --minNodes 0,0 --maxNodes 70,4 --provisioner aws --batchSystem mesos --metrics aws:us-east-1:primate-jobstore-2 seqFile.txt s3://primate-output/cactus.hal >cactus.log 2>&1 &

Then cactus reported this error:

INFO:toil.leader:Finished toil run with 35792 failed jobs.
INFO:toil.leader:Failed jobs at end of the run: 'CactusHalGeneratorUpWrapper' 727e28d1-3b2f-459d-8ff2-4404ec7478b3 'CactusHalGeneratorRecursion' 739fe488-6525-4565-b1b0-e76f8a0e3331 'CactusHalGeneratorRecursion' 7fc3698c-6b38-4f95-b7f2-a4a15d9bdbbf 'CactusHalGeneratorRecursion' 5cd8dd55-4487-4cde-8fa2-e522c076e457 'CactusHalGeneratorRecursion' 1ee0e6a6-1d5c-44b2-877e-2368965863e2 'CactusHalGeneratorRecursion' 53353941-2b80-42d6-be97-7541a90bfce5 'CactusHalGeneratorUpWrapper' 8fab38a1-7b43-4dd4-ba2b-1899dc3ba6a1 'CactusHalGeneratorRecursion' 2d871cb0-119f-4724-b278-a7ad84a5747c 'CactusHalGeneratorRecursion' dbce77b8-7e9c-4f6e-97a0-cd7ee7c2ae74 'CactusHalGeneratorRecursion' 8632ee77-2a5e-47b8-98cf-edd47bc4ff90 'CactusHalGeneratorRecursion' 49ac25b7-985d-4c24-ac88-3407e2b2fe3c 'CactusHalGeneratorRecursion' 9b8d29e5-1b77-4edb-b86d-d6c29af1f452 'CactusHalGeneratorRecursion' 9d547595-d4e5-4a7d-8a92-53debe0d521a  ......
I0216 10:12:55.940577   205 sched.cpp:1987] Asked to stop the driver
I0216 10:12:55.940704   367 sched.cpp:1187] Stopping framework 'a2b45cb2-c6f2-4068-a612-f44fd197efd9-0000'
Importing 7 sequences
Traceback (most recent call last):
  File "/venv/bin/cactus", line 10, in <module>
    sys.exit(main())
  File "/venv/local/lib/python2.7/site-packages/cactus/progressive/cactus_progressive.py", line 520, in main
    halID = toil.start(RunCactusPreprocessorThenProgressiveDown(options, project, memory=configWrapper.getDefaultMemory()))
  File "/usr/local/lib/python2.7/dist-packages/toil/common.py", line 784, in start
    return self._runMainLoop(rootJobGraph)
  File "/usr/local/lib/python2.7/dist-packages/toil/common.py", line 1059, in _runMainLoop
    jobCache=self._jobCache).run()
  File "/usr/local/lib/python2.7/dist-packages/toil/leader.py", line 237, in run
    raise FailedJobsException(self.config.jobStore, self.toilState.totalFailedJobs, self.jobStore)
toil.leader.FailedJobsException

Then I grepped the log for the failed job "727e28d1-3b2f-459d-8ff2-4404ec7478b3":

INFO:toil.leader:Issued job 'CactusHalGeneratorUpWrapper' 727e28d1-3b2f-459d-8ff2-4404ec7478b3 with job batch system ID: 1230677 and cores: 1, disk: 2.0 G, and memory: 3.8 G
INFO:toil.leader:Job ended successfully: 'CactusHalGeneratorUpWrapper' 727e28d1-3b2f-459d-8ff2-4404ec7478b3
WARNING:toil.leader:The job seems to have left a log file, indicating failure: 'CactusHalGeneratorUpWrapper' 727e28d1-3b2f-459d-8ff2-4404ec7478b3
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3    ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3    INFO:toil.fileStore:LOG-TO-MASTER: Max memory used for job CactusHalGeneratorUpWrapper (tool cactus_halGenerator) on JSON features {"maxFlowerSize": 52, "flowerGroupSize": 56, "numFlowers": 2}: 0
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3    Traceback (most recent call last):
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3      File "/usr/local/lib/python2.7/dist-packages/toil/worker.py", line 314, in workerScript
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3      File "/tmp/tmppkVkNn/d144cae90014c416ea76c8a4e7ed49e0/cactus/shared/common.py", line 1096, in _runner
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3        super(RoundedJob, self)._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3      File "/usr/local/lib/python2.7/dist-packages/toil/job.py", line 1351, in _runner
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3        returnValues = self._run(jobGraph, fileStore)
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3      File "/usr/local/lib/python2.7/dist-packages/toil/job.py", line 1296, in _run
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3        return self.run(fileStore)
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3      File "/tmp/tmppkVkNn/d144cae90014c416ea76c8a4e7ed49e0/cactus/pipeline/cactus_workflow.py", line 1324, in run
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3        self.getOptionalPhaseAttrib("showOnlySubstitutionsWithRespectToReference", bool))
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3      File "/tmp/tmppkVkNn/d144cae90014c416ea76c8a4e7ed49e0/cactus/shared/common.py", line 720, in runCactusHalGenerator
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3        job_name=jobName, features=features, fileStore=fileStore)
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3      File "/tmp/tmppkVkNn/d144cae90014c416ea76c8a4e7ed49e0/cactus/shared/common.py", line 1040, in cactus_call
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3        raise RuntimeError("Command %s failed with output: %s" % (call, output))
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3    RuntimeError: Command ['docker', 'run', '--interactive', '--net=host', '--log-driver=none', '-u', '0:0', '-v', '/var/lib/toil/toil-d67eb928-2ecc-4c0b-9897-384cf81e972b-ef9fa6f5-a671-4b01-8c21-edee399e9844/tmpp4Wpj4/3d99139b-b609-483b-9076-37f0bce03061:/data', '--name', '94fb766f-68af-4dff-9b05-029e448c7df8', '--rm', 'quay.io/comparative-genomics-toolkit/cactus:157ed0cca83ff56b42fd216d9f95011620253df2', 'cactus_halGenerator', '--logLevel', 'INFO', '--cactusDisk', '<st_kv_database_conf type="kyoto_tycoon">\n\t\t\t<kyoto_tycoon database_dir="fakepath" host="10.0.45.31" port="32354" />\n\t\t</st_kv_database_conf>\n\t', '--secondaryDisk', '<st_kv_database_conf type="kyoto_tycoon">\n   <kyoto_tycoon database_dir="fakepath" host="10.0.45.31" port="18579" />\n  </st_kv_database_conf>\n ', '--referenceEventString', 'Anc2'] failed with output: None
WARNING:toil.leader:727e28d1-3b2f-459d-8ff2-4404ec7478b3    ERROR:toil.worker:Exiting the worker because of a failed job on host ip-10-0-45-31.ec2.internal

Thank you!

root outgroups are not preprocessed

Specifying a root outgroup in cactus_createMultiProject does not result in the root outgroup being preprocessed. If it is repetitive this could result in extra long runtimes.

Suggest that this sequence is passed into the experiment file as an extra sequence to preprocess.

pkg-config for kyoto-*

Hi again,

Sorry I've been spamming you with so many questions! My next one is regarding kyoto-* and having to update sonLib/include.mk.
Is there a reason why it has to be edited by hand rather than automatically detected with pkg-config (like hiredis)? Here I'm using Linuxbrew to install kyoto-*, so they're installed in a location that's not hardcoded in the makefile, but pkg-config is configured correctly and I can use it to find all the paths that are needed.

If you don't have any objections, I can amend the makefile (with another else after the hard-coded paths, to make sure they still work) and make a pull request.

AssertionError when running on aws

Run the command:

nohup cactus --stats --nodeTypes c4.8xlarge:0.9,r3.8xlarge --minNodes 0,0 --maxNodes 70,4 --provisioner aws --batchSystem mesos --metrics aws:us-east-1:primate-jobstore seqFile.txt s3://primate-output/cactus.hal > cactus.log 2>&1

And then cactus reported this error:

nohup: ignoring input
/usr/local/lib/python2.7/dist-packages/cryptography/hazmat/primitives/constant_time.py:26: CryptographyDeprecationWarning: Support for your Python version is deprecated. The next version of cryptography will remove support. Please upgrade to a 2.7.x release that supports hmac.compare_digest as soon as possible.
  utils.DeprecatedIn23,
WARNING:toil.jobStores.aws.jobStore:Exception during panic
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/toil/jobStores/aws/jobStore.py", line 135, in initialize
    self.destroy()
  File "/usr/local/lib/python2.7/dist-packages/toil/jobStores/aws/jobStore.py", line 1255, in destroy
    self._bind(create=False, block=False)
  File "/usr/local/lib/python2.7/dist-packages/toil/jobStores/aws/jobStore.py", line 167, in _bind
    versioning=True)
  File "/usr/local/lib/python2.7/dist-packages/toil/jobStores/aws/jobStore.py", line 736, in _bindBucket
    assert False, 'Cannot modify versioning on existing bucket'
AssertionError: Cannot modify versioning on existing bucket
Traceback (most recent call last):
  File "/venv/bin/cactus", line 10, in <module>
    sys.exit(main())
  File "/venv/local/lib/python2.7/site-packages/cactus/progressive/cactus_progressive.py", line 454, in main
    with Toil(options) as toil:
  File "/usr/local/lib/python2.7/dist-packages/toil/common.py", line 712, in __enter__
    jobStore.initialize(config)
  File "/usr/local/lib/python2.7/dist-packages/toil/jobStores/aws/jobStore.py", line 135, in initialize
    self.destroy()
  File "/usr/local/lib/python2.7/dist-packages/toil/lib/exceptions.py", line 55, in __exit__
    raise_(exc_type, exc_value, traceback)
  File "/usr/local/lib/python2.7/dist-packages/toil/jobStores/aws/jobStore.py", line 132, in initialize
    self._bind(create=True)
  File "/usr/local/lib/python2.7/dist-packages/toil/jobStores/aws/jobStore.py", line 167, in _bind
    versioning=True)
  File "/usr/local/lib/python2.7/dist-packages/toil/jobStores/aws/jobStore.py", line 736, in _bindBucket
    assert False, 'Cannot modify versioning on existing bucket'
AssertionError: Cannot modify versioning on existing bucket

I have set versioning on the bucket (s3://primate-output/) and it still doesn't work.

Running with singularity on cluster

Hi,

I am trying to get cactus running on our cluster but having a few issues.

We have a Slurm-based submission system and can run Singularity. However, the cluster can't see the outside world. I have installed cactus in an Anaconda environment, and it seems to run with the bundled test data set on our software node, which has access to the internet. When running on the main cluster with the --singularity flag, it fails trying to pull the Singularity image.

I have tried manually downloading the image like this:

singularity pull --size 2000 --name cactus.img docker://quay.io/comparative-genomics-toolkit/cactus:dec921fce1a8a7d3dd71a4466ed4565e755d4cdc

And then exporting the location:

export CACTUS_SINGULARITY_IMG=/hpc-home/tmathers/cactus_docker/cactus/cactus.img

However, the cactus script still tries to download the image and fails. Is it possible to disable this behaviour and specify a pre-downloaded image?

Thanks,

Tom.

How to restart incomplete runs?

Hello,

I am currently trying to run Cactus on 12 mammalian genomes on a cluster. The cluster has a built-in wall time of a few days, which means the alignment does not finish, since I have so many genomes to align. Is there a way to restart Cactus using the incomplete alignment information, so it picks up where the last run left off instead of starting from scratch?

Thanks,
Nikki

AssertionError in checkForDeadlocks: assert len(runningServiceJobs) <= totalRunningJobs

When running cactus using Slurm, I ran into an AssertionError thrown by Toil:

File "/Genomics/grid/users/lparsons/miniconda3/envs/cactus/lib/python2.7/site-packages/toil/leader.py", line 524, in checkForDeadlocks
    assert len(runningServiceJobs) <= totalRunningJobs

I'm not sure if this is actually a problem with Toil and Slurm or if there is something going on with how cactus is using Toil. I'm adding this issue here to track it in case anyone else has a similar problem. Will update if I find new details.

Traceback (most recent call last):
  File "/Genomics/grid/users/lparsons/miniconda3/envs/cactus/bin/cactus", line 10, in <module>
    sys.exit(main())
  File "/Genomics/grid/users/lparsons/miniconda3/envs/cactus/lib/python2.7/site-packages/cactus/progressive/cactus_progressive.py", line 477, in main                                                                                                                             
    halID = toil.restart()
  File "/Genomics/grid/users/lparsons/miniconda3/envs/cactus/lib/python2.7/site-packages/toil/common.py", line 773, in restart
    return self._runMainLoop(rootJobGraph)
  File "/Genomics/grid/users/lparsons/miniconda3/envs/cactus/lib/python2.7/site-packages/toil/common.py", line 1018, in _runMainLoop
    jobCache=self._jobCache).run()
  File "/Genomics/grid/users/lparsons/miniconda3/envs/cactus/lib/python2.7/site-packages/toil/leader.py", line 197, in run
    self.innerLoop()
  File "/Genomics/grid/users/lparsons/miniconda3/envs/cactus/lib/python2.7/site-packages/toil/leader.py", line 501, in innerLoop
    self.checkForDeadlocks()
  File "/Genomics/grid/users/lparsons/miniconda3/envs/cactus/lib/python2.7/site-packages/toil/leader.py", line 524, in checkForDeadlocks
    assert len(runningServiceJobs) <= totalRunningJobs
AssertionError

SLURM `sacct` calls swamping slurms job database

I'm running into an issue trying to run Cactus on our SLURM cluster. The cactus process running on the submit node seems to start about one thousand (1000) sacct processes, which crashed our SLURM db. I'm not sure if this is part of the Toil configuration or specific to how cactus is using Toil. There appear to be many sacct processes for the same job number, which seems odd.

I seem to be able to reduce this by setting --parasolMaxBatches. However, I still seem to run into the issue.

Cannot build hdf5 with gcc5

Hi there,

I can't compile cactus with gcc5 because it fails on the hdf5 library. This is because the version of hdf5 used (1.8.9, 2012-05-09) needs the C99 standard instead of the default C90, e.g.

warning: ISO C90 does not support 'long long'
warning: anonymous variadic macros were introduced in C99

but more importantly

error: C++ style comments are not allowed in ISO C90

I've noticed that you've tried to solve this by adding CFLAGS=-std=c99 and the -e make option to the hdf5 section of the main Makefile. -e works to a certain extent but doesn't percolate down to the actual gcc invocation. See this process tree:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
muffato  14402  0.0  0.0 108184   996 pts/90   S+   00:58   0:00  |   \_ make hdf5Rule
muffato  14404  0.0  0.0 113128  1200 pts/90   S+   00:58   0:00  |       \_ /bin/sh -c cd /homes/muffato/src/local/brewenv/tok/cactus/submodules/hdf5 && (output=`(./configure --prefix=/homes/muffato/src/local/brew
muffato  14405  0.1  0.0 116564  4008 pts/90   S+   00:58   0:00  |           \_ /bin/sh -c cd /homes/muffato/src/local/brewenv/tok/cactus/submodules/hdf5 && (output=`(./configure --prefix=/homes/muffato/src/local/
muffato  14406  0.0  0.0 113128   652 pts/90   S+   00:58   0:00  |               \_ /bin/sh -c cd /homes/muffato/src/local/brewenv/tok/cactus/submodules/hdf5 && (output=`(./configure --prefix=/homes/muffato/src/lo
muffato  14407  0.0  0.0 113128   644 pts/90   S+   00:58   0:00  |                   \_ /bin/sh -c cd /homes/muffato/src/local/brewenv/tok/cactus/submodules/hdf5 && (output=`(./configure --prefix=/homes/muffato/sr
muffato  23438  0.0  0.0 108320  1160 pts/90   S+   00:59   0:00  |                       \_ make -j4 -e
muffato  23439  0.0  0.0 113132  1452 pts/90   S+   00:59   0:00  |                           \_ /bin/sh -c fail= failcom='exit 1'; \ for f in x $MAKEFLAGS; do \   case $f in \     *=* | --[!k]*);; \     *k*) failc
muffato  23443  0.0  0.0 113132   684 pts/90   S+   00:59   0:00  |                               \_ /bin/sh -c fail= failcom='exit 1'; \ for f in x $MAKEFLAGS; do \   case $f in \     *=* | --[!k]*);; \     *k*) f
muffato  23444  0.3  0.0 110352  3272 pts/90   S+   00:59   0:00  |                                   \_ make all
muffato  23445  0.4  0.0 112296  5180 pts/90   S+   00:59   0:00  |                                       \_ make all-am
muffato  27693  0.0  0.0 113132  1420 pts/90   S+   01:00   0:00  |                                           \_ /bin/sh -c echo "  CC    " H5FS.lo;/bin/sh ../libtool --silent --tag=CC   --mode=compile gcc -DHAVE_C
muffato  27697  0.0  0.0 113524  1948 pts/90   S+   01:00   0:00  |                                               \_ /bin/sh ../libtool --silent --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -D_LARGEFILE_SOURCE -
muffato  28110  0.0  0.0   8160   940 pts/90   S+   01:00   0:00  |                                                   \_ gcc -DHAVE_CONFIG_H -I. -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_BSD_SOURCE -D_POSIX_C_SO
muffato  28116  0.0  0.0  54660 20540 pts/90   R+   01:00   0:00  |                                                       \_ /nfs/production/panda/ensembl/compara/muffato/src/local/brewenv/tok/Cellar/gcc/5.5.0/bin/

The processes that still have CFLAGS=-std=c99 are 23438, 23439, 23443 and 23444. It makes sense: the top-level make is the one run from the Makefile, both shells then copy their environment, and finally make all, lacking -e, stops propagating the right CFLAGS to its children. Below it, CFLAGS is empty, and the final gcc command line doesn't have any -std flag, e.g.

gcc -DHAVE_CONFIG_H -I. -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_BSD_SOURCE -D_POSIX_C_SOURCE=199506L -DNDEBUG -UH5_DEBUG_API -ansi -pedantic -Wall -W -Wundef -Wshadow -Wpointer-arith -Wbad-function-cast -Wcast-qual -Wcast-align -Wwrite-strings -Wconversion -Waggregate-return -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wredundant-decls -Wnested-externs -Winline -O -fomit-frame-pointer -finline-functions -MT H5Adense.lo -MD -MP -MF .deps/H5Adense.Tpo -c H5Adense.c -o H5Adense.o

hence the error.

Do you have any idea how to fix that?

Thank you
Matthieu

Trailing "|"s in headers are stripped

Not a huge priority, since the "|" character isn't allowed by default right now, but that's a restriction based on what assembly hubs can work with. Cactus itself should handle these characters just fine.

$ grep '>' CAT.fa 
>gi|545778205|gb|U00096.3|
>CAT2.1
$ halStats --sequences CAT original.hal 
gi|545778205|gb|U00096.3,CAT2.1

I'm almost positive this is due to my multiple-outgroups code. I keep a lot of information (unique id per genome, starting position of trimmed outgroup sequences) stored in "|"-separated tokens.

something up in cactus_createMultiProject.py

Running cactus_createMultiCactusProject.py (using the latest development branch of cactus) I got the following error:

[benedict@kolossus progCactusError]$ cactus_createMultiCactusProject.py /cluster/home/benedict/progCactusError/experiment.xml ./progressiveCactusAlignment --fixNames=False --rootOutgroupPath /hive/users/benedict/datasets/realMammals/hg19.fa --rootOutgroupDist 1.0
Traceback (most recent call last):
File "/hive/users/benedict/cactus/bin/cactus_createMultiCactusProject.py", line 234, in
main()
File "/hive/users/benedict/cactus/bin/cactus_createMultiCactusProject.py", line 228, in main
createFileStructure(mcProj, expTemplate, confTemplate, options)
File "/hive/users/benedict/cactus/bin/cactus_createMultiCactusProject.py", line 127, in createFileStructure
exp.updateTree(subtree, seqMap)
File "/hive/users/benedict/cactus/shared/experimentWrapper.py", line 416, in updateTree
sequences += seqMap[nodeName]
KeyError: 'fishAnc00'

I've put the example in /cluster/home/benedict/progCactusError/

I think this relates to Joel's latest changes, so I've assigned to him. I will look again when I get a chance.

providing --unmaskInput to preprocessor string doesn't work

Minor, but I just want to remember to fix this: the way the preprocessor currently works doesn't play nicely with lastzRepeatMask.py. It would be nice to be able to use the --unmaskInput flag in the config.xml tag, so the input genomes don't need to be manually unmasked to use our new repeat-masking strategy.

cactus_fastaGenerator losing/rearranging seq names

In /hive/users/jcarmstr/cactusStuff/100way/pigToShrew/work/progressiveAlignment/pigToShrewAnc07/pigToShrewAnc07_hal.fa, KB112597's sequence has been prepended with a copy of KB112596's sequence (which is still in the file in the right place, and looks fine).

Diagrammatically, the result looks like:

>KB112597
(KB112596 seq)>(KB112597 seq)

Bug with 0 size flowers

Reporting file: /Users/benedictpaten/sync/eclipse/git/cactus/tmp_uBW8JrJzX2/jobTree/logFileDir/tmp_0pQlPGGCfM/tmp_LQQETnDRsg/tmp_tYFKCG1i50/tmp_brO7c92aGz.log
tmp_brO7c92aGz.log: Assertion failed: (flower_builtBlocks(flower)), function checkFlowerIsNotRedundant, file cactus_check.c, line 104.
tmp_brO7c92aGz.log: Traceback (most recent call last):
tmp_brO7c92aGz.log: File "/Users/benedictpaten/sync/eclipse/git/jobTree/bin/jobTreeSlave", line 118, in processJob
tmp_brO7c92aGz.log: jobTree.scriptTree.scriptTree.run(tempJob, l[1], l[2:])
tmp_brO7c92aGz.log: File "/Users/benedictpaten/sync/eclipse/git/jobTree/scriptTree/scriptTree.py", line 42, in run
tmp_brO7c92aGz.log: target.execute(job)
tmp_brO7c92aGz.log: File "/Users/benedictpaten/sync/eclipse/git/jobTree/scriptTree/stack.py", line 218, in execute
tmp_brO7c92aGz.log: target.run()
tmp_brO7c92aGz.log: File "/Users/benedictpaten/sync/eclipse/git/cactus/pipeline/cactus_workflow.py", line 603, in run
tmp_brO7c92aGz.log: runCactusCheck(self.options.cactusDiskDatabaseString, self.flowerNames)
tmp_brO7c92aGz.log: File "/Users/benedictpaten/sync/eclipse/git/cactus/shared/common.py", line 261, in runCactusCheck
tmp_brO7c92aGz.log: system("cactus_check --cactusDisk '%s' %s --logLevel %s %s" % (cactusDiskDatabaseString, " ".join(flowerNames), logLevel, recursive))
tmp_brO7c92aGz.log: File "/Users/benedictpaten/sync/eclipse/git/sonLib/bioio.py", line 160, in system
tmp_brO7c92aGz.log: raise RuntimeError("Command: %s exited with non-zero status %i" % (command, i))
tmp_brO7c92aGz.log: RuntimeError: Command: cactus_check --cactusDisk '<st_kv_database_conf type="tokyo_cabinet"><tokyo_cabinet database_dir="/Users/benedictpaten/sync/eclipse/git/cactus/tmp_uBW8JrJzX2/cactusDisk_ltNzJmnR3v_8615" /></st_kv_database_conf>' 0 --logLevel CRITICAL exited with non-zero status 6
The log file of the slave for the job
Reporting file: /Users/benedictpaten/sync/eclipse/git/cactus/tmp_uBW8JrJzX2/jobTree/slaveLogFileDir/tmp_SQooQj0iGB/tmp_FwDfKsADvx/tmp_Z5faJmOY4Z/tmp_8B18H1AQSB.log
tmp_8B18H1AQSB.log: Caught an exception in the target being run
tmp_8B18H1AQSB.log: Failed the job
tmp_8B18H1AQSB.log: Exiting the slave because of a failed job
Job: /Users/benedictpaten/sync/eclipse/git/cactus/tmp_uBW8JrJzX2/jobTree/jobs/tmp_53iUaRtZmQ/tmp_kp0Tuc0CNf/tmp_vf95XG4pjT/tmp_HouIcP5qFh.xml is completely failed
There are 1 jobs currently in job tree: /Users/benedictpaten/sync/eclipse/git/cactus/tmp_uBW8JrJzX2/jobTree
Colour: red, number of jobs: 1
Reporting file: /Users/benedictpaten/sync/eclipse/git/cactus/tmp_uBW8JrJzX2/jobTree/logFileDir/tmp_0pQlPGGCfM/tmp_LQQETnDRsg/tmp_tYFKCG1i50/tmp_brO7c92aGz.log
tmp_brO7c92aGz.log: Assertion failed: (flower_builtBlocks(flower)), function checkFlowerIsNotRedundant, file cactus_check.c, line 104.
tmp_brO7c92aGz.log: Traceback (most recent call last):
tmp_brO7c92aGz.log: File "/Users/benedictpaten/sync/eclipse/git/jobTree/bin/jobTreeSlave", line 118, in processJob
tmp_brO7c92aGz.log: jobTree.scriptTree.scriptTree.run(tempJob, l[1], l[2:])
tmp_brO7c92aGz.log: File "/Users/benedictpaten/sync/eclipse/git/jobTree/scriptTree/scriptTree.py", line 42, in run
tmp_brO7c92aGz.log: target.execute(job)
tmp_brO7c92aGz.log: File "/Users/benedictpaten/sync/eclipse/git/jobTree/scriptTree/stack.py", line 218, in execute
tmp_brO7c92aGz.log: target.run()
tmp_brO7c92aGz.log: File "/Users/benedictpaten/sync/eclipse/git/cactus/pipeline/cactus_workflow.py", line 603, in run
tmp_brO7c92aGz.log: runCactusCheck(self.options.cactusDiskDatabaseString, self.flowerNames)
tmp_brO7c92aGz.log: File "/Users/benedictpaten/sync/eclipse/git/cactus/shared/common.py", line 261, in runCactusCheck
tmp_brO7c92aGz.log: system("cactus_check --cactusDisk '%s' %s --logLevel %s %s" % (cactusDiskDatabaseString, " ".join(flowerNames), logLevel, recursive))
tmp_brO7c92aGz.log: File "/Users/benedictpaten/sync/eclipse/git/sonLib/bioio.py", line 160, in system
tmp_brO7c92aGz.log: raise RuntimeError("Command: %s exited with non-zero status %i" % (command, i))
tmp_brO7c92aGz.log: RuntimeError: Command: cactus_check --cactusDisk '<st_kv_database_conf type="tokyo_cabinet"><tokyo_cabinet database_dir="/Users/benedictpaten/sync/eclipse/git/cactus/tmp_uBW8JrJzX2/cactusDisk_ltNzJmnR3v_8615" /></st_kv_database_conf>' 0 --logLevel CRITICAL exited with non-zero status 6
E...There are 0 jobs currently in job tree: /Users/benedictpaten/sync/eclipse/git/cactus/tmp_KBCyn7wXID/jobTree

ST_KV_DATABASE_EXCEPTION

I've been able to get the example data to run to completion on my workstation, but not on our cluster yet. I'm using Singularity 3.0.3 with the patch described in #55. We are using SLURM as the batchSystem. I'm running into two (possibly related) problems:

  1. The workflow hangs on the SavePrimaryDB job. There seem to be two jobs running, one for the KtServerService and the other for the SavePrimaryDB job. Neither is using any CPU or appears to be doing anything. Is it perhaps hung on writing to some filesystem? I've created a separate issue #60 for this.

  2. Exception: ST_KV_DATABASE_EXCEPTION: Opening connection to host: 127.0.0.1 with error: network error
    I suspect this is because the KtServerService job isn't running properly, or isn't running on the same node as the job trying to communicate with it. I'm not sure, but cactus retries the jobs and they seem to eventually succeed, except for SavePrimaryDB, which simply never completes.

WARNING:toil.leader:The job seems to have left a log file, indicating failure: 'CactusSetupPhase' y/x/jobBBlOvT                          
WARNING:toil.leader:y/x/jobBBlOvT    INFO:toil.worker:---TOIL WORKER OUTPUT LOG---                                                       
WARNING:toil.leader:y/x/jobBBlOvT    INFO:toil:Running Toil version 3.18.0-84239d802248a5f4a220e762b3b8ce5cc92af0be.                     
WARNING:toil.leader:y/x/jobBBlOvT    WARNING:toil.resource:'JTRES_3ef5b3ea0b822be113fb2928b0c49342' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:y/x/jobBBlOvT    INFO:toil.lib.bioio:Sequences in cactus setup: ['simHuman_chr6', 'simMouse_chr6', 'simRat_chr6', 'simCow_chr6', 'simDog_chr6']
WARNING:toil.leader:y/x/jobBBlOvT    INFO:toil.lib.bioio:Sequences in cactus setup filenames: ['>id=1|simHuman.chr6|0\n', '>id=0|simMouse.chr6\n', '>id=2|simRat.chr6\n', '>id=4|simCow.chr6|0\n', '>id=3|simDog.chr6|0\n']
WARNING:toil.leader:y/x/jobBBlOvT    INFO:cactus.shared.common:Work dirs: set([u'/tmp/toil-36105e42-53df-4919-84a4-5a8802fd18bd-4c0800dd-a2a1-4d1f-84f4-956467d8a8ed/tmpPahA7T/db9e39ee-495d-440c-86dd-74fff75bac68'])
WARNING:toil.leader:y/x/jobBBlOvT    INFO:cactus.shared.common:Docker work dir: /tmp/toil-36105e42-53df-4919-84a4-5a8802fd18bd-4c0800dd-a2a1-4d1f-84f4-956467d8a8ed/tmpPahA7T/db9e39ee-495d-440c-86dd-74fff75bac68                                                                 
WARNING:toil.leader:y/x/jobBBlOvT    INFO:cactus.shared.common:Running the command ['singularity', '--silent', 'run', u'/tigress/lparsons/kocher_lab/cactus/jobStore/cactus.img', u'cactus_setup', u'--speciesTree', u'((simHuman_chr6:0.144018,(simMouse_chr6:0.084509,simRat_chr6:0.091589)mr:0.271974)Anc1:0.020593,(simCow_chr6:0.18908,simDog_chr6:0.16303)Anc2:0.032898)Anc0;', u'--cactusDisk', u'<st_kv_database_conf type="kyoto_tycoon">\n\t\t\t<kyoto_tycoon database_dir="fakepath" host="127.0.0.1" port="17863" />\n\t\t</st_kv_database_conf>\n\t', u'--logLevel', u'INFO', u'--outgroupEvents', u'simHuman_chr6 simDog_chr6 simCow_chr6', u'tmpjdxHjy.tmp', u'tmpwWt5NL.tmp', u'tmpa3RrMG.tmp', u'tmpzPWIax.tmp', u'tmpLU024t.tmp']                                                                                                      
WARNING:toil.leader:y/x/jobBBlOvT    Running command catchsegv 'cactus_setup' '--speciesTree' '((simHuman_chr6:0.144018,(simMouse_chr6:0.084509,simRat_chr6:0.091589)mr:0.271974)Anc1:0.020593,(simCow_chr6:0.18908,simDog_chr6:0.16303)Anc2:0.032898)Anc0;' '--cactusDisk' '<st_kv_database_conf type=kyoto_tycoon> <kyoto_tycoon database_dir=fakepath host=127.0.0.1 port=17863 /> </st_kv_database_conf> ' '--logLevel' 'INFO' '--outgroupEvents' 'simHuman_chr6 simDog_chr6 simCow_chr6' 'tmpjdxHjy.tmp' 'tmpwWt5NL.tmp' 'tmpa3RrMG.tmp' 'tmpzPWIax.tmp' 'tmpLU024t.tmp'                                                             
WARNING:toil.leader:y/x/jobBBlOvT    Set log level to INFO          
WARNING:toil.leader:y/x/jobBBlOvT    Flower disk name : <st_kv_database_conf type=kyoto_tycoon> <kyoto_tycoon database_dir=fakepath host=127.0.0.1 port=17863 /> </st_kv_database_conf>                                                                                            
WARNING:toil.leader:y/x/jobBBlOvT    Sequence file/directory tmpjdxHjy.tmp                                                               
WARNING:toil.leader:y/x/jobBBlOvT    Sequence file/directory tmpwWt5NL.tmp                                                               
WARNING:toil.leader:y/x/jobBBlOvT    Sequence file/directory tmpa3RrMG.tmp                                                               
WARNING:toil.leader:y/x/jobBBlOvT    Sequence file/directory tmpzPWIax.tmp                                                               
WARNING:toil.leader:y/x/jobBBlOvT    Sequence file/directory tmpLU024t.tmp                                                               
WARNING:toil.leader:y/x/jobBBlOvT    Exception: ST_KV_DATABASE_EXCEPTION: Opening connection to host: 127.0.0.1 with error: network error                                                                                                                                          
WARNING:toil.leader:y/x/jobBBlOvT    Uncaught exception             
WARNING:toil.leader:y/x/jobBBlOvT    Traceback (most recent call last):                                                                  
WARNING:toil.leader:y/x/jobBBlOvT      File "/home/lparsons/miniconda3/envs/cactus/lib/python2.7/site-packages/toil/worker.py", line 314, in workerScript                                                                                                                          
WARNING:toil.leader:y/x/jobBBlOvT        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)                          
WARNING:toil.leader:y/x/jobBBlOvT      File "/home/lparsons/miniconda3/envs/cactus/lib/python2.7/site-packages/cactus/shared/common.py", line 1094, in _runner                                                                                                                     
WARNING:toil.leader:y/x/jobBBlOvT        super(RoundedJob, self)._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)      
WARNING:toil.leader:y/x/jobBBlOvT      File "/home/lparsons/miniconda3/envs/cactus/lib/python2.7/site-packages/toil/job.py", line 1351, in _runner                                                                                                                                 
WARNING:toil.leader:y/x/jobBBlOvT        returnValues = self._run(jobGraph, fileStore)                                                   
WARNING:toil.leader:y/x/jobBBlOvT      File "/home/lparsons/miniconda3/envs/cactus/lib/python2.7/site-packages/toil/job.py", line 1296, in _run                                                                                                                                    
WARNING:toil.leader:y/x/jobBBlOvT        return self.run(fileStore) 
WARNING:toil.leader:y/x/jobBBlOvT      File "/home/lparsons/miniconda3/envs/cactus/lib/python2.7/site-packages/cactus/pipeline/cactus_workflow.py", line 641, in run
WARNING:toil.leader:y/x/jobBBlOvT        makeEventHeadersAlphaNumeric=self.getOptionalPhaseAttrib("makeEventHeadersAlphaNumeric", bool, False))
WARNING:toil.leader:y/x/jobBBlOvT      File "/home/lparsons/miniconda3/envs/cactus/lib/python2.7/site-packages/cactus/shared/common.py", line 220, in runCactusSetup                                                                                                               
WARNING:toil.leader:y/x/jobBBlOvT        parameters=["cactus_setup"] + args + sequences)                                                 
WARNING:toil.leader:y/x/jobBBlOvT      File "/home/lparsons/miniconda3/envs/cactus/lib/python2.7/site-packages/cactus/shared/common.py", line 1038, in cactus_call                                                                                                                 
WARNING:toil.leader:y/x/jobBBlOvT        raise RuntimeError("Command %s failed with output: %s" % (call, output))                        
WARNING:toil.leader:y/x/jobBBlOvT    RuntimeError: Command ['singularity', '--silent', 'run', u'/tigress/lparsons/kocher_lab/cactus/jobStore/cactus.img', u'cactus_setup', u'--speciesTree', u'((simHuman_chr6:0.144018,(simMouse_chr6:0.084509,simRat_chr6:0.091589)mr:0.271974)Anc1:0.020593,(simCow_chr6:0.18908,simDog_chr6:0.16303)Anc2:0.032898)Anc0;', u'--cactusDisk', u'<st_kv_database_conf type="kyoto_tycoon">\n\t\t\t<kyoto_tycoon database_dir="fakepath" host="127.0.0.1" port="17863" />\n\t\t</st_kv_database_conf>\n\t', u'--logLevel', u'INFO', u'--outgroupEvents', u'simHuman_chr6 simDog_chr6 simCow_chr6', u'tmpjdxHjy.tmp', u'tmpwWt5NL.tmp', u'tmpa3RrMG.tmp', u'tmpzPWIax.tmp', u'tmpLU024t.tmp'] failed with output:                                                                                                          
WARNING:toil.leader:y/x/jobBBlOvT    ERROR:toil.worker:Exiting the worker because of a failed job on host della-r4c4n14                  
WARNING:toil.leader:y/x/jobBBlOvT    WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'CactusSetupPhase' y/x/jobBBlOvT with ID y/x/jobBBlOvT to 5                                                                                             

error on gridEngine

I haven't managed to get cactus to run. I have tried three different job management systems (SLURM, LSF and now gridEngine), so for various external reasons I'm now focusing only on gridEngine, which seems the easiest to get working.

We are using toil 3.18.0 (released on October 2nd)

I do

     export TOIL_GRIDENGINE_PE='smp'  
     export TOIL_GRIDENGINE_ARGS='-V -q all.q'  
     cactus --batchSystem gridEngine /projects/annotation/cactus/cactus_intermediate_files /sw/bioinfo/cactus/latest/cactus/examples/evolverMammals.txt outputHal --logFile LOGFILE

and I end up with that message:

INFO:toil.leader:Job ended successfully: 'RunCactusPreprocessorThenProgressiveDown' w/A/job7Y0wzz
INFO:toil.leader:Finished toil run successfully.
Traceback (most recent call last):
  File "/sw/bioinfo/cactus/latest/bin/cactus", line 11, in <module>
    sys.exit(main())
  File "/sw/bioinfo/cactus/d2cae96/lib/python2.7/site-packages/cactus/progressive/cactus_progressive.py", line 500, in main
    halID = toil.start(RunCactusPreprocessorThenProgressiveDown(options, project, memory=configWrapper.getDefaultMemory()))
  File "/sw/bioinfo/cactus/d2cae96/lib/python2.7/site-packages/toil/common.py", line 784, in start
    return self._runMainLoop(rootJobGraph)
  File "/sw/bioinfo/cactus/d2cae96/lib/python2.7/site-packages/toil/common.py", line 1059, in _runMainLoop
    jobCache=self._jobCache).run()
  File "/sw/bioinfo/cactus/d2cae96/lib/python2.7/site-packages/toil/leader.py", line 239, in run
    return self.jobStore.getRootJobReturnValue()
  File "/sw/bioinfo/cactus/d2cae96/lib/python2.7/site-packages/toil/jobStores/abstractJobStore.py", line 218, in getRootJobReturnValue
    return safeUnpickleFromStream(fH)
  File "/sw/bioinfo/cactus/d2cae96/lib/python2.7/site-packages/toil/common.py", line 1360, in safeUnpickleFromStream
    return pickle.loads(string)
  File "/sw/bioinfo/cactus/d2cae96/lib/python2.7/site-packages/toil/job.py", line 1763, in __new__
    return cls._resolve(*args)
  File "/sw/bioinfo/cactus/d2cae96/lib/python2.7/site-packages/toil/job.py", line 1775, in _resolve
    value = safeUnpickleFromStream(fileHandle)
  File "/sw/bioinfo/cactus/d2cae96/lib/python2.7/site-packages/toil/common.py", line 1360, in safeUnpickleFromStream
    return pickle.loads(string)
  File "/sw/bioinfo/cactus/d2cae96/lib/python2.7/site-packages/toil/job.py", line 1763, in __new__
    return cls._resolve(*args)
  File "/sw/bioinfo/cactus/d2cae96/lib/python2.7/site-packages/toil/job.py", line 1775, in _resolve
    value = safeUnpickleFromStream(fileHandle)
  File "/sw/bioinfo/cactus/d2cae96/lib/python2.7/site-packages/toil/common.py", line 1360, in safeUnpickleFromStream
    return pickle.loads(string)
  File "/sw/bioinfo/cactus/d2cae96/lib/python2.7/site-packages/toil/job.py", line 1853, in __setstate__
    "this job as a child/follow-on of {jobName}.".format(jobName=jobName))
RuntimeError: This job was passed a promise that wasn't yet resolved when it ran. The job 'RunCactusPreprocessorThenProgressiveDown2' that fulfills this promise hasn't yet finished. This means that there aren't enough constraints to ensure the current job always runs after 'RunCactusPreprocessorThenProgressiveDown2'. Consider adding a follow-on indirection between this job and its parent, or adding this job as a child/follow-on of 'RunCactusPreprocessorThenProgressiveDown2'.

Any help would be much appreciated.

make error

Hi Benedict,

I got an error during the compilation process.

make[1]: Entering directory `/gpfs2/projects/gec/tool/assemblathon1/cactus/api'
cp inc/cactus*.h ../lib//
gcc -std=c99 -O2 -g -Wall -Werror --pedantic -funroll-loops -lm  -I ..//../sonLib/lib -I/projects/gec/tool/assemblathon1/local/include -DHAVE_TOKYO_CABINET=1 -I/projects/gec/tool/assemblathon1/local/include -DHAVE_KYOTO_TYCOON=1    -I inc -I ../lib// -c impl/cactus*.c
cc1: warnings being treated as errors
impl/cactusDisk.c:369: error: ‘getNextStub’ defined but not used
make[1]: *** [../lib//cactusLib.a] Error 1
make[1]: Leaving directory `/gpfs2/projects/gec/tool/assemblathon1/cactus/api'
make: *** [all.api] Error 2

The error could presumably be fixed by removing (or actually using) the definition of getNextStub.
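A possible ad-hoc workaround (just a sketch, and it assumes the -Werror flag is set in the top-level include.mk, which may not hold for every checkout) would be to stop treating warnings as errors and rebuild:

     # local hack only: drop -Werror so the unused-function warning no longer aborts the build
     sed -i 's/ -Werror//' include.mk
     make clean && make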
Thank you.

Yun

-lpthread missing with gcc5

Hi again

I think I'm slowly getting there :)

The next issue I hit was when compiling caf:

cd caf && make all
make[1]: Entering directory `/nfs/production/panda/ensembl/compara/muffato/src/local/brewenv/tok/cactus/caf'
gcc -std=c99 -O3 -g -Wall --pedantic -funroll-loops -DNDEBUG  -I /homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/lib -I inc -I ../lib// -c impl/*.c
impl/giantComponent.c: In function 'stCaf_breakupComponentGreedily':
impl/giantComponent.c:38:13: warning: variable 'edgeScore' set but not used [-Wunused-but-set-variable]
     int64_t edgeScore = INT64_MAX;
             ^
ar rc stCaf.a *.o
ranlib stCaf.a 
rm *.o
mv stCaf.a ../lib//
cp inc/*.h ../lib//
gcc -std=c99 -O3 -g -Wall --pedantic -funroll-loops -DNDEBUG  -I /homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/lib -I inc -I impl -I../lib/ -o ../bin//stCafTests tests/*.c impl/*.c ../lib//stCaf.a ../lib//cactusBlastAlignment.a /homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/lib/stPinchesAndCacti.a /homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/lib/3EdgeConnected.a ../lib//cactusLib.a /homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/lib/sonLib.a /homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/lib/cuTest.a    -lz -lm
impl/giantComponent.c: In function 'stCaf_breakupComponentGreedily':
impl/giantComponent.c:38:13: warning: variable 'edgeScore' set but not used [-Wunused-but-set-variable]
     int64_t edgeScore = INT64_MAX;
             ^
/homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/lib/sonLib.a(stThreadPool.o): In function `stThreadPool_construct':
/homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/C/impl/stThreadPool.c:144: undefined reference to `pthread_create'
/homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/C/impl/stThreadPool.c:144: undefined reference to `pthread_create'
/homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/C/impl/stThreadPool.c:144: undefined reference to `pthread_create'
/homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/C/impl/stThreadPool.c:144: undefined reference to `pthread_create'
/homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/C/impl/stThreadPool.c:144: undefined reference to `pthread_create'
/homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/lib/sonLib.a(stThreadPool.o):/homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/C/impl/stThreadPool.c:144: more undefined references to `pthread_create' follow
/homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/lib/sonLib.a(stThreadPool.o): In function `stThreadPool_destruct':
/homes/muffato/src/local/brewenv/tok/cactus/submodules/sonLib/C/impl/stThreadPool.c:195: undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status
make[1]: *** [../bin//stCafTests] Error 1
make[1]: Leaving directory `/nfs/production/panda/ensembl/compara/muffato/src/local/brewenv/tok/cactus/caf'
make: *** [all.caf] Error 2

I fixed that by adding -lpthread to dblibs in submodules/sonLib/include.mk, but I don't know whether that's advisable on earlier versions of gcc.
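Concretely, the change amounts to something like this (a sketch of my local edit; the exact spelling of the dblibs line may differ between sonLib versions):

     # append -lpthread to the libraries sonLib links against
     sed -i 's/^dblibs =/dblibs = -lpthread/' submodules/sonLib/include.mk

After that, cd caf && make all linked cleanly for me.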

Matthieu

Please use increasing version numbers for your release tags

Hello,

on behalf of the Debian Med team I intend to package Cactus for main Debian. I noticed that you use non-numeric release "versions". Unfortunately this makes it hard for Debian's tooling to detect new versions automatically. It would be great if your release tags (perhaps in addition to the current named tags) contained a numeric component, e.g. v1.0, v1.1, ..., so that the sequence of releases can be inferred from the tag names.

Thanks for considering

     Andreas.

Cactus Output Files

Hello there,

May I ask a question about the output HAL files, please? I am trying to use h5diff to compare two output files that should be identical (produced with the same configFile) but were generated by different runs. The command I used is:

docker run --rm -ti -v /tmp:/mnt --entrypoint=h5diff hdfgroup/hdf5-json /mnt/pestis_output1.hal /mnt/pestis_new_output1.hal

However, so many differences found, the result is as follows:

[h5diff output screenshot omitted: it reports many differences]

In addition to this one, I have tried many other combinations, e.g.:

  • pestis_output3.hal vs. pestis_new_output3.hal (--configFile blockTrim3.xml),
  • pestis_output.hal vs. pestis_new_output.hal (no --configFile flag used),
  • pestis_output_new_1-0.hal vs. pestis_output_new_1-2.hal (--configFile blockTrim1.xml --minNodes 0 OR --minNodes 2),
  • pestis_output_new_3-0.hal vs. pestis_output_new_3-2.hal (--configFile blockTrim3.xml --minNodes 0 OR --minNodes 2),
  • pestis_output_new_1-0.hal vs. pestis_new_output1.hal , and
  • some others.

However, the results are all similar: many differences are found in every comparison. Do you happen to know why? Thank you!

Sincerely,

bettie

P.S. I would like to attach some output HAL files and configuration files (e.g. blockTrim1.xml, blockTrim3.xml), but the issue tracker does not allow me to.
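P.P.S. One thing I'm considering trying next is comparing the alignments at a higher level than raw HDF5 bytes, in case the byte-level differences are benign. A rough sketch, assuming halStats from the HAL toolkit is available on the PATH (file names as above):

     # compare per-genome summary statistics instead of the raw HDF5 structure
     halStats pestis_output1.hal > stats1.txt
     halStats pestis_new_output1.hal > stats2.txt
     diff stats1.txt stats2.txt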

Failed to run on test data

Hi, I finally installed Cactus, and when I try running it on the test input, it errors out and I don't know where to begin to debug. Help!

Here's the last part of my traceback:

cactus /testing/bla ../cactus/examples/evolverMammals.txt output.hal --binariesMode local

WARNING:toil.leader:r/1/jobi224_I        return self.run(fileStore)
2018-09-19 11:22:01,225 - toil.leader - WARNING - r/1/jobi224_I        return self.run(fileStore)
WARNING:toil.leader:r/1/jobi224_I      File "/net/eichler/vol27/projects/assemblies/nobackups/CAT/Progressive_cactus/cactus_env/lib/python2.7/site-packages/cactus/preprocessor/cactus_preprocessor.py", line 68, in run
2018-09-19 11:22:01,225 - toil.leader - WARNING - r/1/jobi224_I      File "/net/eichler/vol27/projects/assemblies/nobackups/CAT/Progressive_cactus/cactus_env/lib/python2.7/site-packages/cactus/preprocessor/cactus_preprocessor.py", line 68, in run
WARNING:toil.leader:r/1/jobi224_I        parameters=["cactus_checkUniqueHeaders.py"] + args)
2018-09-19 11:22:01,225 - toil.leader - WARNING - r/1/jobi224_I        parameters=["cactus_checkUniqueHeaders.py"] + args)
WARNING:toil.leader:r/1/jobi224_I      File "/net/eichler/vol27/projects/assemblies/nobackups/CAT/Progressive_cactus/cactus_env/lib/python2.7/site-packages/cactus/shared/common.py", line 1026, in cactus_call
2018-09-19 11:22:01,225 - toil.leader - WARNING - r/1/jobi224_I      File "/net/eichler/vol27/projects/assemblies/nobackups/CAT/Progressive_cactus/cactus_env/lib/python2.7/site-packages/cactus/shared/common.py", line 1026, in cactus_call
WARNING:toil.leader:r/1/jobi224_I        raise RuntimeError("Command %s failed with output: %s" % (call, output))
2018-09-19 11:22:01,225 - toil.leader - WARNING - r/1/jobi224_I        raise RuntimeError("Command %s failed with output: %s" % (call, output))
WARNING:toil.leader:r/1/jobi224_I    RuntimeError: Command ['cactus_checkUniqueHeaders.py', u'/tmp/toil-0ef9deff-9b9a-46f3-9865-b86519d512a8-98be900264573415441321ad00000024/tmpqYpVGL/55d34483-063c-4196-86b2-17ccf7fd6efb/tmpyrgjR9.tmp', '--checkAssemblyHub'] failed with output: None
2018-09-19 11:22:01,226 - toil.leader - WARNING - r/1/jobi224_I    RuntimeError: Command ['cactus_checkUniqueHeaders.py', u'/tmp/toil-0ef9deff-9b9a-46f3-9865-b86519d512a8-98be900264573415441321ad00000024/tmpqYpVGL/55d34483-063c-4196-86b2-17ccf7fd6efb/tmpyrgjR9.tmp', '--checkAssemblyHub'] failed with output: None
WARNING:toil.leader:r/1/jobi224_I    ERROR:toil.worker:Exiting the worker because of a failed job on host lynx.grid.gs.washington.edu
2018-09-19 11:22:01,226 - toil.leader - WARNING - r/1/jobi224_I    ERROR:toil.worker:Exiting the worker because of a failed job on host lynx.grid.gs.washington.edu
WARNING:toil.leader:r/1/jobi224_I    WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'PreprocessChunk' r/1/jobi224_I with ID r/1/jobi224_I to 0
2018-09-19 11:22:01,226 - toil.leader - WARNING - r/1/jobi224_I    WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'PreprocessChunk' r/1/jobi224_I with ID r/1/jobi224_I to 0
WARNING:toil.leader:Job 'PreprocessChunk' r/1/jobi224_I with ID r/1/jobi224_I is completely failed
2018-09-19 11:22:01,227 - toil.leader - WARNING - Job 'PreprocessChunk' r/1/jobi224_I with ID r/1/jobi224_I is completely failed
INFO:toil.leader:Finished toil run with 17 failed jobs.
2018-09-19 11:22:12,347 - toil.leader - INFO - Finished toil run with 17 failed jobs.
INFO:toil.leader:Failed jobs at end of the run: 'PreprocessChunk' B/W/joboqM8h7 'CactusPreprocessor2' c/o/jobMSAhhh 'BatchPreprocessor' u/l/jobUcfIcj 'BatchPreprocessor' E/D/jobQFfn5U 'BatchPreprocessor' I/F/job8fkCjn 'PreprocessChunk' 1/A/jobbFBkPX 'PreprocessChunk' r/1/jobi224_I 'CactusPreprocessor2' t/a/jobjOIm32 'CactusPreprocessor2' E/u/jobH2l429 'CactusPreprocessor2' q/8/jobMariZY 'PreprocessChunk' M/f/job_OneN8 'CactusPreprocessor2' 8/e/job6aEHdY 'PreprocessChunk' k/R/jobVJ72dn 'BatchPreprocessor' G/d/jobrw1xtx 'BatchPreprocessor' B/T/jobsSib1Q 'RunCactusPreprocessorThenProgressiveDown' 1/2/job8arKHI 'CactusPreprocessor' v/H/job5h0phF
2018-09-19 11:22:12,348 - toil.leader - INFO - Failed jobs at end of the run: 'PreprocessChunk' B/W/joboqM8h7 'CactusPreprocessor2' c/o/jobMSAhhh 'BatchPreprocessor' u/l/jobUcfIcj 'BatchPreprocessor' E/D/jobQFfn5U 'BatchPreprocessor' I/F/job8fkCjn 'PreprocessChunk' 1/A/jobbFBkPX 'PreprocessChunk' r/1/jobi224_I 'CactusPreprocessor2' t/a/jobjOIm32 'CactusPreprocessor2' E/u/jobH2l429 'CactusPreprocessor2' q/8/jobMariZY 'PreprocessChunk' M/f/job_OneN8 'CactusPreprocessor2' 8/e/job6aEHdY 'PreprocessChunk' k/R/jobVJ72dn 'BatchPreprocessor' G/d/jobrw1xtx 'BatchPreprocessor' B/T/jobsSib1Q 'RunCactusPreprocessorThenProgressiveDown' 1/2/job8arKHI 'CactusPreprocessor' v/H/job5h0phF
Traceback (most recent call last):
  File "/net/eichler/vol27/projects/assemblies/nobackups/CAT/Progressive_cactus/cactus_env/bin/cactus", line 11, in <module>
    load_entry_point('progressiveCactus==1.0', 'console_scripts', 'cactus')()
  File "/net/eichler/vol27/projects/assemblies/nobackups/CAT/Progressive_cactus/cactus_env/lib/python2.7/site-packages/cactus/progressive/cactus_progressive.py", line 493, in main
    halID = toil.start(RunCactusPreprocessorThenProgressiveDown(options, project, memory=configWrapper.getDefaultMemory()))
  File "/net/eichler/vol27/projects/assemblies/nobackups/CAT/Progressive_cactus/cactus_env/lib/python2.7/site-packages/toil/common.py", line 783, in start
    return self._runMainLoop(rootJobGraph)
  File "/net/eichler/vol27/projects/assemblies/nobackups/CAT/Progressive_cactus/cactus_env/lib/python2.7/site-packages/toil/common.py", line 1057, in _runMainLoop
    jobCache=self._jobCache).run()
  File "/net/eichler/vol27/projects/assemblies/nobackups/CAT/Progressive_cactus/cactus_env/lib/python2.7/site-packages/toil/leader.py", line 242, in run
    raise FailedJobsException(self.config.jobStore, self.toilState.totalFailedJobs, self.jobStore)
toil.leader.FailedJobsException
 

AlignFastaFragments failed jobs

Hi,

I'm testing Cactus. I already tried the example you provide and it works. Now I'm trying to use it on a very small dataset: 3 very closely related genomes, 2 of them from the same species. Unfortunately Cactus is not working properly and returns several warnings and errors. This is one of many.

WARNING:toil.leader:The job seems to have left a log file, indicating failure: 'AlignFastaFragments' N/C/jobdFq6BM
2018-09-26 12:31:07,637 - toil.leader - WARNING - The job seems to have left a log file, indicating failure: 'AlignFastaFragments' N/C/jobdFq6BM
WARNING:toil.leader:N/C/jobdFq6BM    INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
2018-09-26 12:31:07,638 - toil.leader - WARNING - N/C/jobdFq6BM    INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
WARNING:toil.leader:N/C/jobdFq6BM    INFO:toil:Running Toil version 3.17.0-585383ed5c4453b556269818571d3c7419c613b0.
2018-09-26 12:31:07,638 - toil.leader - WARNING - N/C/jobdFq6BM    INFO:toil:Running Toil version 3.17.0-585383ed5c4453b556269818571d3c7419c613b0.
WARNING:toil.leader:N/C/jobdFq6BM    WARNING:toil.resource:'JTRES_89337c6db946d31e57196f911b74762f' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
2018-09-26 12:31:07,639 - toil.leader - WARNING - N/C/jobdFq6BM    WARNING:toil.resource:'JTRES_89337c6db946d31e57196f911b74762f' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:N/C/jobdFq6BM    INFO:cactus.shared.common:Work dirs: set([])
2018-09-26 12:31:07,639 - toil.leader - WARNING - N/C/jobdFq6BM    INFO:cactus.shared.common:Work dirs: set([])
WARNING:toil.leader:N/C/jobdFq6BM    INFO:cactus.shared.common:Docker work dir: .
2018-09-26 12:31:07,639 - toil.leader - WARNING - N/C/jobdFq6BM    INFO:cactus.shared.common:Docker work dir: .
WARNING:toil.leader:N/C/jobdFq6BM    INFO:cactus.shared.common:Running the command ['singularity', '--silent', 'run', u'/rds/user/fc464/hpc-work/TetramoriumProject/jobStore.Cactus.Talpestre3Genomes.Cactus/cactus.img', 'cPecanLastz', u'tmpnfN6c2.tmp[multiple][nameparse=darkspace]', u'tmpEguE1j.tmp[nameparse=darkspace]', '--querydepth=keep,nowarn:13', '--format=general:name1,zstart1,end1,name2,zstart2+,end2+', '--markend']
2018-09-26 12:31:07,639 - toil.leader - WARNING - N/C/jobdFq6BM    INFO:cactus.shared.common:Running the command ['singularity', '--silent', 'run', u'/rds/user/fc464/hpc-work/TetramoriumProject/jobStore.Cactus.Talpestre3Genomes.Cactus/cactus.img', 'cPecanLastz', u'tmpnfN6c2.tmp[multiple][nameparse=darkspace]', u'tmpEguE1j.tmp[nameparse=darkspace]', '--querydepth=keep,nowarn:13', '--format=general:name1,zstart1,end1,name2,zstart2+,end2+', '--markend']
WARNING:toil.leader:N/C/jobdFq6BM    Running command catchsegv 'cPecanLastz' 'tmpnfN6c2.tmp[multiple][nameparse=darkspace]' 'tmpEguE1j.tmp[nameparse=darkspace]' '--querydepth=keep,nowarn:13' '--format=general:name1,zstart1,end1,name2,zstart2+,end2+' '--markend'
2018-09-26 12:31:07,639 - toil.leader - WARNING - N/C/jobdFq6BM    Running command catchsegv 'cPecanLastz' 'tmpnfN6c2.tmp[multiple][nameparse=darkspace]' 'tmpEguE1j.tmp[nameparse=darkspace]' '--querydepth=keep,nowarn:13' '--format=general:name1,zstart1,end1,name2,zstart2+,end2+' '--markend'
WARNING:toil.leader:N/C/jobdFq6BM    FAILURE: bad fasta character in tmpnfN6c2.tmp, >scaffold_16|848048 (uppercase R)
2018-09-26 12:31:07,639 - toil.leader - WARNING - N/C/jobdFq6BM    FAILURE: bad fasta character in tmpnfN6c2.tmp, >scaffold_16|848048 (uppercase R)
WARNING:toil.leader:N/C/jobdFq6BM    remove or replace non-ACGTN characters or consider using --ambiguous=iupac
2018-09-26 12:31:07,640 - toil.leader - WARNING - N/C/jobdFq6BM    remove or replace non-ACGTN characters or consider using --ambiguous=iupac
WARNING:toil.leader:N/C/jobdFq6BM    Traceback (most recent call last):
2018-09-26 12:31:07,640 - toil.leader - WARNING - N/C/jobdFq6BM    Traceback (most recent call last):
WARNING:toil.leader:N/C/jobdFq6BM      File "/home/fc464/software/cactus/cactus_env/lib/python2.7/site-packages/toil/worker.py", line 314, in workerScript
2018-09-26 12:31:07,640 - toil.leader - WARNING - N/C/jobdFq6BM      File "/home/fc464/software/cactus/cactus_env/lib/python2.7/site-packages/toil/worker.py", line 314, in workerScript
WARNING:toil.leader:N/C/jobdFq6BM        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
2018-09-26 12:31:07,640 - toil.leader - WARNING - N/C/jobdFq6BM        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:N/C/jobdFq6BM      File "/home/fc464/software/cactus/cactus_env/lib/python2.7/site-packages/cactus/shared/common.py", line 1082, in _runner
2018-09-26 12:31:07,640 - toil.leader - WARNING - N/C/jobdFq6BM      File "/home/fc464/software/cactus/cactus_env/lib/python2.7/site-packages/cactus/shared/common.py", line 1082, in _runner
WARNING:toil.leader:N/C/jobdFq6BM        super(RoundedJob, self)._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
2018-09-26 12:31:07,640 - toil.leader - WARNING - N/C/jobdFq6BM        super(RoundedJob, self)._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:N/C/jobdFq6BM      File "/home/fc464/software/cactus/cactus_env/lib/python2.7/site-packages/toil/job.py", line 1350, in _runner
2018-09-26 12:31:07,640 - toil.leader - WARNING - N/C/jobdFq6BM      File "/home/fc464/software/cactus/cactus_env/lib/python2.7/site-packages/toil/job.py", line 1350, in _runner
WARNING:toil.leader:N/C/jobdFq6BM        returnValues = self._run(jobGraph, fileStore)
2018-09-26 12:31:07,640 - toil.leader - WARNING - N/C/jobdFq6BM        returnValues = self._run(jobGraph, fileStore)
WARNING:toil.leader:N/C/jobdFq6BM      File "/home/fc464/software/cactus/cactus_env/lib/python2.7/site-packages/toil/job.py", line 1295, in _run
2018-09-26 12:31:07,640 - toil.leader - WARNING - N/C/jobdFq6BM      File "/home/fc464/software/cactus/cactus_env/lib/python2.7/site-packages/toil/job.py", line 1295, in _run
WARNING:toil.leader:N/C/jobdFq6BM        return self.run(fileStore)
2018-09-26 12:31:07,640 - toil.leader - WARNING - N/C/jobdFq6BM        return self.run(fileStore)
WARNING:toil.leader:N/C/jobdFq6BM      File "/home/fc464/software/cactus/cactus_env/lib/python2.7/site-packages/cactus/preprocessor/lastzRepeatMasking/cactus_lastzRepeatMask.py", line 80, in run
2018-09-26 12:31:07,640 - toil.leader - WARNING - N/C/jobdFq6BM      File "/home/fc464/software/cactus/cactus_env/lib/python2.7/site-packages/cactus/preprocessor/lastzRepeatMasking/cactus_lastzRepeatMask.py", line 80, in run
WARNING:toil.leader:N/C/jobdFq6BM        "--markend"])
2018-09-26 12:31:07,640 - toil.leader - WARNING - N/C/jobdFq6BM        "--markend"])
WARNING:toil.leader:N/C/jobdFq6BM      File "/home/fc464/software/cactus/cactus_env/lib/python2.7/site-packages/cactus/shared/common.py", line 1026, in cactus_call
2018-09-26 12:31:07,641 - toil.leader - WARNING - N/C/jobdFq6BM      File "/home/fc464/software/cactus/cactus_env/lib/python2.7/site-packages/cactus/shared/common.py", line 1026, in cactus_call
WARNING:toil.leader:N/C/jobdFq6BM        raise RuntimeError("Command %s failed with output: %s" % (call, output))
2018-09-26 12:31:07,641 - toil.leader - WARNING - N/C/jobdFq6BM        raise RuntimeError("Command %s failed with output: %s" % (call, output))
WARNING:toil.leader:N/C/jobdFq6BM    RuntimeError: Command ['singularity', '--silent', 'run', u'/rds/user/fc464/hpc-work/TetramoriumProject/jobStore.Cactus.Talpestre3Genomes.Cactus/cactus.img', 'cPecanLastz', u'tmpnfN6c2.tmp[multiple][nameparse=darkspace]', u'tmpEguE1j.tmp[nameparse=darkspace]', '--querydepth=keep,nowarn:13', '--format=general:name1,zstart1,end1,name2,zstart2+,end2+', '--markend'] failed with output: None
2018-09-26 12:31:07,641 - toil.leader - WARNING - N/C/jobdFq6BM    RuntimeError: Command ['singularity', '--silent', 'run', u'/rds/user/fc464/hpc-work/TetramoriumProject/jobStore.Cactus.Talpestre3Genomes.Cactus/cactus.img', 'cPecanLastz', u'tmpnfN6c2.tmp[multiple][nameparse=darkspace]', u'tmpEguE1j.tmp[nameparse=darkspace]', '--querydepth=keep,nowarn:13', '--format=general:name1,zstart1,end1,name2,zstart2+,end2+', '--markend'] failed with output: None
WARNING:toil.leader:N/C/jobdFq6BM    ERROR:toil.worker:Exiting the worker because of a failed job on host cpu-e-470
2018-09-26 12:31:07,641 - toil.leader - WARNING - N/C/jobdFq6BM    ERROR:toil.worker:Exiting the worker because of a failed job on host cpu-e-470
WARNING:toil.leader:N/C/jobdFq6BM    WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'AlignFastaFragments' N/C/jobdFq6BM with ID N/C/jobdFq6BM to 5
2018-09-26 12:31:07,641 - toil.leader - WARNING - N/C/jobdFq6BM    WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'AlignFastaFragments' N/C/jobdFq6BM with ID N/C/jobdFq6BM to 5

The whole log is attached.
F

Cactus.slurm.log
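Following the hint in the lastz message above ("remove or replace non-ACGTN characters or consider using --ambiguous=iupac"), a cleanup I'm considering before rerunning (only a sketch; genome.fa stands in for each of my input FASTA files):

     # list any non-ACGTN characters present in the sequences
     grep -v '^>' genome.fa | grep -o '[^ACGTNacgtn]' | sort | uniq -c
     # replace them with N before giving the file to Cactus
     sed '/^>/!s/[^ACGTNacgtn]/N/g' genome.fa > genome.clean.fa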

Toil workers get stuck in a message spree when running Cactus distributed on an SGE queue

I'm opening a separate thread for an issue, as suggested by @adamnovak here: #63 (comment).

I've been testing Cactus using the evolverMammals example, which uses 5 small FASTA files (simCow.chr6, simDog.chr6 etc.). Each file takes around 600 KB of space. I'm trying to distribute the analysis on an SGE queue.

Sometimes, the Toil workers get stuck in some sort of a message loop. This doesn't happen on every run, but I've had it happen a few times before, ever since I started testing Cactus.

Basically, at some stages during the analysis where messages are being passed around (INFO:toil.statsAndLogging:Got message from job at time [...]), some Toil worker appears to get stuck in a message spree, where it keeps on sending around 10 messages per second for the remainder of the run.

I always cancel the job, the moment I notice the message spree. Here's part of the output of a job where this happened (to keep it under 512 KB for pastebin): https://pastebin.com/Dn7ZZwPk.

I also saw some other messages scattered between the neverending ones, which suggests that other jobs may keep on going. However, I wouldn't want to leave the job running after those messages start appearing over and over.

[...]
INFO:toil.leader:Job ended successfully: 'logAssemblyStats' C/h/jobixfkF9
INFO:toil.statsAndLogging:Got message from job at time 03-21-2019 15:00:49: After preprocessing, got assembly stats for genome simMouse_chr6: Input-sample: /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-59304c1f-da00-4705-8393-061318b6effa-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmpJR1ln1/26a8498d-ad6f-404f-925a-83d8589e5868/tmpuVeBAB.tmp Total-sequences: 1 Total-length: 636262 Proportion-repeat-masked: 0.056477 ProportionNs: 0.000000 Total-Ns: 0 N50: 636262 Median-sequence-length: 636262 Max-sequence-length: 636262 Min-sequence-length: 636262

INFO:toil.statsAndLogging:Got message from job at time 03-21-2019 15:00:49: After preprocessing, got assembly stats for genome simMouse_chr6: Input-sample: /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-59304c1f-da00-4705-8393-061318b6effa-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmpJR1ln1/26a8498d-ad6f-404f-925a-83d8589e5868/tmpuVeBAB.tmp Total-sequences: 1 Total-length: 636262 Proportion-repeat-masked: 0.056477 ProportionNs: 0.000000 Total-Ns: 0 N50: 636262 Median-sequence-length: 636262 Max-sequence-length: 636262 Min-sequence-length: 636262

INFO:toil.statsAndLogging:Got message from job at time 03-21-2019 15:00:50: After preprocessing, got assembly stats for genome simMouse_chr6: Input-sample: /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-59304c1f-da00-4705-8393-061318b6effa-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmpJR1ln1/26a8498d-ad6f-404f-925a-83d8589e5868/tmpuVeBAB.tmp Total-sequences: 1 Total-length: 636262 Proportion-repeat-masked: 0.056477 ProportionNs: 0.000000 Total-Ns: 0 N50: 636262 Median-sequence-length: 636262 Max-sequence-length: 636262 Min-sequence-length: 636262

INFO:toil.statsAndLogging:Got message from job at time 03-21-2019 15:00:50: After preprocessing, got assembly stats for genome simMouse_chr6: Input-sample: /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-59304c1f-da00-4705-8393-061318b6effa-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmpJR1ln1/26a8498d-ad6f-404f-925a-83d8589e5868/tmpuVeBAB.tmp Total-sequences: 1 Total-length: 636262 Proportion-repeat-masked: 0.056477 ProportionNs: 0.000000 Total-Ns: 0 N50: 636262 Median-sequence-length: 636262 Max-sequence-length: 636262 Min-sequence-length: 636262

INFO:toil.statsAndLogging:Got message from job at time 03-21-2019 15:00:50: After preprocessing, got assembly stats for genome simMouse_chr6: Input-sample: /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-59304c1f-da00-4705-8393-061318b6effa-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmpJR1ln1/26a8498d-ad6f-404f-925a-83d8589e5868/tmpuVeBAB.tmp Total-sequences: 1 Total-length: 636262 Proportion-repeat-masked: 0.056477 ProportionNs: 0.000000 Total-Ns: 0 N50: 636262 Median-sequence-length: 636262 Max-sequence-length: 636262 Min-sequence-length: 636262

INFO:toil.leader:Job ended successfully: 'logAssemblyStats' d/u/job0MEcOB
INFO:toil.statsAndLogging:Got message from job at time 03-21-2019 15:00:50: After preprocessing, got assembly stats for genome simMouse_chr6: Input-sample: /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-59304c1f-da00-4705-8393-061318b6effa-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmpJR1ln1/26a8498d-ad6f-404f-925a-83d8589e5868/tmpuVeBAB.tmp Total-sequences: 1 Total-length: 636262 Proportion-repeat-masked: 0.056477 ProportionNs: 0.000000 Total-Ns: 0 N50: 636262 Median-sequence-length: 636262 Max-sequence-length: 636262 Min-sequence-length: 636262

INFO:toil.statsAndLogging:Got message from job at time 03-21-2019 15:00:50: After preprocessing, got assembly stats for genome simMouse_chr6: Input-sample: /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-59304c1f-da00-4705-8393-061318b6effa-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmpJR1ln1/26a8498d-ad6f-404f-925a-83d8589e5868/tmpuVeBAB.tmp Total-sequences: 1 Total-length: 636262 Proportion-repeat-masked: 0.056477 ProportionNs: 0.000000 Total-Ns: 0 N50: 636262 Median-sequence-length: 636262 Max-sequence-length: 636262 Min-sequence-length: 636262

INFO:toil.statsAndLogging:Got message from job at time 03-21-2019 15:00:50: After preprocessing, got assembly stats for genome simMouse_chr6: Input-sample: /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-59304c1f-da00-4705-8393-061318b6effa-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmpJR1ln1/26a8498d-ad6f-404f-925a-83d8589e5868/tmpuVeBAB.tmp Total-sequences: 1 Total-length: 636262 Proportion-repeat-masked: 0.056477 ProportionNs: 0.000000 Total-Ns: 0 N50: 636262 Median-sequence-length: 636262 Max-sequence-length: 636262 Min-sequence-length: 636262

INFO:toil.statsAndLogging:Got message from job at time 03-21-2019 15:00:50: After preprocessing, got assembly stats for genome simMouse_chr6: Input-sample: /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-59304c1f-da00-4705-8393-061318b6effa-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmpJR1ln1/26a8498d-ad6f-404f-925a-83d8589e5868/tmpuVeBAB.tmp Total-sequences: 1 Total-length: 636262 Proportion-repeat-masked: 0.056477 ProportionNs: 0.000000 Total-Ns: 0 N50: 636262 Median-sequence-length: 636262 Max-sequence-length: 636262 Min-sequence-length: 636262

INFO:toil.statsAndLogging:Got message from job at time 03-21-2019 15:00:50: After preprocessing, got assembly stats for genome simMouse_chr6: Input-sample: /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-59304c1f-da00-4705-8393-061318b6effa-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmpJR1ln1/26a8498d-ad6f-404f-925a-83d8589e5868/tmpuVeBAB.tmp Total-sequences: 1 Total-length: 636262 Proportion-repeat-masked: 0.056477 ProportionNs: 0.000000 Total-Ns: 0 N50: 636262 Median-sequence-length: 636262 Max-sequence-length: 636262 Min-sequence-length: 636262

INFO:toil.statsAndLogging:Got message from job at time 03-21-2019 15:00:50: After preprocessing, got assembly stats for genome simMouse_chr6: Input-sample: /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-59304c1f-da00-4705-8393-061318b6effa-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmpJR1ln1/26a8498d-ad6f-404f-925a-83d8589e5868/tmpuVeBAB.tmp Total-sequences: 1 Total-length: 636262 Proportion-repeat-masked: 0.056477 ProportionNs: 0.000000 Total-Ns: 0 N50: 636262 Median-sequence-length: 636262 Max-sequence-length: 636262 Min-sequence-length: 636262

INFO:toil.leader:Job ended successfully: 'ProgressiveDown' a/7/jobxNaBa2
INFO:toil.statsAndLogging:Got message from job at time 03-21-2019 15:00:51: After preprocessing, got assembly stats for genome simMouse_chr6: Input-sample: /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-59304c1f-da00-4705-8393-061318b6effa-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmpJR1ln1/26a8498d-ad6f-404f-925a-83d8589e5868/tmpuVeBAB.tmp Total-sequences: 1 Total-length: 636262 Proportion-repeat-masked: 0.056477 ProportionNs: 0.000000 Total-Ns: 0 N50: 636262 Median-sequence-length: 636262 Max-sequence-length: 636262 Min-sequence-length: 636262

INFO:toil.leader:Issued job 'ProgressiveNext' D/L/jobtHeuaH with job batch system ID: 90 and cores: 1, disk: 2.0 G, and memory: 3.3 G
INFO:toil.statsAndLogging:Got message from job at time 03-21-2019 15:00:51: After preprocessing, got assembly stats for genome simMouse_chr6: Input-sample: /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-59304c1f-da00-4705-8393-061318b6effa-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmpJR1ln1/26a8498d-ad6f-404f-925a-83d8589e5868/tmpuVeBAB.tmp Total-sequences: 1 Total-length: 636262 Proportion-repeat-masked: 0.056477 ProportionNs: 0.000000 Total-Ns: 0 N50: 636262 Median-sequence-length: 636262 Max-sequence-length: 636262 Min-sequence-length: 636262

I resubmitted the job, and the message loop happened again at the same step, the only difference being that this time the message was referring to a different input file (simRat_chr6):

INFO:toil.statsAndLogging:Got message from job at time 03-21-2019 16:03:58: Before preprocessing, got assembly stats for genome simRat_chr6: Input-sample: /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-80892356-ece5-40db-aab7-a884cafc14d1-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmpyfsLD3/9c8f168d-8d75-495c-b4d5-5c77f3f4896b/tmpIMAbGv.tmp Total-sequences: 1 Total-length: 647215 Proportion-repeat-masked: 0.075276 ProportionNs: 0.000000 Total-Ns: 0 N50: 647215 Median-sequence-length: 647215 Max-sequence-length: 647215 Min-sequence-length: 647215

On a third attempt, the analysis progressed after this stage, without getting stuck in the message loop. Overall, I think this happens around 1 in 3 or 1 in 4 attempts of running Cactus.

KtServerService error

Hi,

I'm having an error with KtServerService when running cactus on a dataset with 7 genomes. I was able to run cactus on 2 genomes without encountering this issue. Here are some parts of the logs I thought would be relevant:

INFO:toil.leader:Issued job 'StartPrimaryDB' n/C/job05D_Ff with job batch system ID: 0 and cores: 1, disk: 2.0 G, and memory: 3.3 G
INFO:toil.leader:Job ended successfully: 'StartPrimaryDB' n/C/job05D_Ff
INFO:toil.leader:Issued job 'KtServerService' g/f/job6l7svt with job batch system ID: 1 and cores: 0, disk: 2.0 G, and memory: 4.3 G
INFO:toil.leader:Job ended successfully: 'KtServerService' g/f/job6l7svt
WARNING:toil.leader:The job seems to have left a log file, indicating failure: 'KtServerService' g/f/job6l7svt
WARNING:toil.leader:g/f/job6l7svt INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
WARNING:toil.leader:g/f/job6l7svt INFO:toil:Running Toil version 3.18.0-84239d802248a5f4a220e762b3b8ce5cc92af0be.
WARNING:toil.leader:g/f/job6l7svt WARNING:toil.resource:'JTRES_01944ae18e37adfa915b306be00f52e1' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:g/f/job6l7svt WARNING:toil.resource:'JTRES_01944ae18e37adfa915b306be00f52e1' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:g/f/job6l7svt INFO:cactus.shared.common:Running the command ['netstat', '-tuplen']
WARNING:toil.leader:g/f/job6l7svt (No info could be read for "-p": geteuid()=24224 but you should be root.)
WARNING:toil.leader:g/f/job6l7svt INFO:cactus.shared.common:Running the command ['ktserver', '-port', '30496', '-ls', '-tout', '200000', '-th', '64', '-bgs', u'/tmp/toil-d112ee36-cbf0-4c80-ad4f-0366e64760c3-d94d914f-1b6f-4938-8699-7357c3f3f800/tmpCFu2R3/21cf6aff-a6b2-4ef3-b0a1-d78f892741db/tbfTb5i/snapshot', '-bgsc', 'lzo', '-bgsi', '1000000', '-log', u'/tmp/toil-d112ee36-cbf0-4c80-ad4f-0366e64760c3-d94d914f-1b6f-4938-8699-7357c3f3f800/tmpCFu2R3/21cf6aff-a6b2-4ef3-b0a1-d78f892741db/tmpPyS64o.tmp', ':#opts=ls#bnum=30m#msiz=50g#ktopts=p']
WARNING:toil.leader:g/f/job6l7svt CRITICAL:toil.lib.bioio:Error starting ktserver.
WARNING:toil.leader:g/f/job6l7svt INFO:cactus.shared.common:Running the command ['ktremotemgr', 'remove', '-port', '30496', '-host', '10.0.20.57', 'TERMINATE']
WARNING:toil.leader:g/f/job6l7svt ktremotemgr: DB::remove failed: 10.0.20.57:30496: 3: logical inconsistency: DB: 7: no record: no record
WARNING:toil.leader:g/f/job6l7svt Process ServerProcess-1:
WARNING:toil.leader:g/f/job6l7svt Traceback (most recent call last):
WARNING:toil.leader:g/f/job6l7svt File "/software/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
WARNING:toil.leader:g/f/job6l7svt self.run()
WARNING:toil.leader:g/f/job6l7svt File "/Home/rferon/tools/cactus/cactus_env/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 82, in run
WARNING:toil.leader:g/f/job6l7svt self.tryRun(*self.args, **self.kwargs)
WARNING:toil.leader:g/f/job6l7svt File "/Home/rferon/tools/cactus/cactus_env/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 102, in tryRun
WARNING:toil.leader:g/f/job6l7svt cactus_call(parameters=["ktremotemgr", "remove"] + getRemoteParams(dbElem) + ["TERMINATE"])
WARNING:toil.leader:g/f/job6l7svt File "/Home/rferon/tools/cactus/cactus_env/lib/python2.7/site-packages/cactus/shared/common.py", line 1038, in cactus_call
WARNING:toil.leader:g/f/job6l7svt raise RuntimeError("Command %s failed with output: %s" % (call, output))
WARNING:toil.leader:g/f/job6l7svt RuntimeError: Command ['ktremotemgr', 'remove', '-port', '30496', '-host', '10.0.20.57', 'TERMINATE'] failed with output: None
WARNING:toil.leader:g/f/job6l7svt CRITICAL:toil.lib.bioio:Error starting ktserver.
WARNING:toil.leader:g/f/job6l7svt Traceback (most recent call last):
WARNING:toil.leader:g/f/job6l7svt File "/Home/rferon/tools/cactus/cactus_env/lib/python2.7/site-packages/toil/worker.py", line 314, in workerScript
WARNING:toil.leader:g/f/job6l7svt job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:g/f/job6l7svt File "/Home/rferon/tools/cactus/cactus_env/lib/python2.7/site-packages/toil/job.py", line 1351, in _runner
WARNING:toil.leader:g/f/job6l7svt returnValues = self._run(jobGraph, fileStore)
WARNING:toil.leader:g/f/job6l7svt File "/Home/rferon/tools/cactus/cactus_env/lib/python2.7/site-packages/toil/job.py", line 1694, in _run
WARNING:toil.leader:g/f/job6l7svt returnValues = self.run(fileStore)
WARNING:toil.leader:g/f/job6l7svt File "/Home/rferon/tools/cactus/cactus_env/lib/python2.7/site-packages/toil/job.py", line 1644, in run
WARNING:toil.leader:g/f/job6l7svt startCredentials = service.start(self)
WARNING:toil.leader:g/f/job6l7svt File "/Home/rferon/tools/cactus/cactus_env/lib/python2.7/site-packages/cactus/pipeline/ktserverToil.py", line 33, in start
WARNING:toil.leader:g/f/job6l7svt snapshotExportID=snapshotExportID)
WARNING:toil.leader:g/f/job6l7svt File "/Home/rferon/tools/cactus/cactus_env/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 62, in runKtserver
WARNING:toil.leader:g/f/job6l7svt raise RuntimeError("Unable to launch ktserver in time. Log: %s" % log)
WARNING:toil.leader:g/f/job6l7svt RuntimeError: Unable to launch ktserver in time. Log: 2018-11-23T11:23:32.793822+01:00: [SYSTEM]: ================ [START]: pid=5130
WARNING:toil.leader:g/f/job6l7svt 2018-11-23T11:23:32.794104+01:00: [SYSTEM]: opening a database: path=:#opts=ls#bnum=30m#msiz=50g#ktopts=p
WARNING:toil.leader:g/f/job6l7svt 2018-11-23T11:23:32.794554+01:00: [SYSTEM]: applying a snapshot file: db=0 ts=1542963976113000000 count=6467072 size=4381380962
WARNING:toil.leader:g/f/job6l7svt 2018-11-23T11:23:33.493808+01:00: [ERROR]: [DB]: :: 9: system error: too short region
WARNING:toil.leader:g/f/job6l7svt 2018-11-23T11:23:33.493932+01:00: [ERROR]: could not apply a snapshot: system error: too short region
WARNING:toil.leader:g/f/job6l7svt 2018-11-23T11:23:33.494107+01:00: [SYSTEM]: starting the server: expr=:30496
WARNING:toil.leader:g/f/job6l7svt 2018-11-23T11:23:33.494221+01:00: [SYSTEM]: server socket opened: expr=:30496 timeout=200000.0
WARNING:toil.leader:g/f/job6l7svt 2018-11-23T11:23:33.494251+01:00: [SYSTEM]: listening server socket started: fd=4
WARNING:toil.leader:g/f/job6l7svt
WARNING:toil.leader:g/f/job6l7svt ERROR:toil.worker:Exiting the worker because of a failed job on host dee-serv07.vital-it.ch
WARNING:toil.leader:g/f/job6l7svt WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'KtServerService' g/f/job6l7svt with ID g/f/job6l7svt to 5

This error message is repeated many times in the log. After all these messages, the log ends with this message:

INFO:toil.leader:Finished toil run with 13 failed jobs.
INFO:toil.leader:Failed jobs at end of the run: 'ProgressiveUp' A/r/job5PkiIS 'ProgressiveDown' k/l/jobdOQ_ns 'StartPrimaryDB' n/C/job05D_Ff 'CactusReferenceCheckpoint' C/D/jobBHqaG0 'ProgressiveDown' O/u/jobfg6RgM 'CactusBarCheckpoint' L/D/jobdgkHCw 'CactusTrimmingBlastPhase' t/b/job2IlD62 'ProgressiveDown' W/R/jobbZsCWY 'ProgressiveNext' R/k/job3m5wZm 'RunCactusPreprocessorThenProgressiveDown' T/P/jobzfQQNe 'KtServerService' d/C/jobbFkXe9 'CactusSetupCheckpoint' e/e/job7mK2be 'RunCactusPreprocessorThenProgressiveDown2' x/U/jobJAPrbP
Traceback (most recent call last):
File "/Home/rferon/tools/cactus/cactus_env/bin/cactus", line 11, in
sys.exit(main())
File "/Home/rferon/tools/cactus/cactus_env/lib/python2.7/site-packages/cactus/progressive/cactus_progressive.py", line 458, in main
halID = toil.restart()
File "/Home/rferon/tools/cactus/cactus_env/lib/python2.7/site-packages/toil/common.py", line 816, in restart
return self._runMainLoop(rootJobGraph)
File "/Home/rferon/tools/cactus/cactus_env/lib/python2.7/site-packages/toil/common.py", line 1059, in _runMainLoop
jobCache=self._jobCache).run()
File "/Home/rferon/tools/cactus/cactus_env/lib/python2.7/site-packages/toil/leader.py", line 237, in run
raise FailedJobsException(self.config.jobStore, self.toilState.totalFailedJobs, self.jobStore)
toil.leader.FailedJobsException

I'm running cactus on a cluster in local dependency mode (the dependencies were installed on the cluster). Here is the command :

cactus --restart --binariesMode local --maxCores 32 temp cactus_test_7_genomes.txt cactus_test_7_genomes.hal

Does this error come from the database, or is it another problem (as suggested by this line: [DB]: :: 9: system error: too short region)?
Thanks !

Runtime estimation

Under the subheader System Requirements, it says;

For primate-sized genomes (3 gigabases each), you should expect Cactus to use approximately 120 CPU-days of compute per genome, with about 120 GB of RAM used at peak. The requirements scale roughly quadratically, so aligning two 1-megabase bacterial genomes takes only 1.5 CPU-hours and 14 GB RAM

If I were to align 3, 4 or 5 genomes, each ~300-400 MB in size, could you please explain how you would arrive at best guesstimates for RAM and run time in CPU-hours? The quadratic scaling explanation is not quite clear to me for this example, hence the request.
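For concreteness, here is my own naive reading of the quadratic claim (an assumption on my part, not something taken from the documentation): scaling the quoted 120 CPU-days for a 3 Gb genome down to a 0.35 Gb genome would give roughly

     (0.35 Gb / 3 Gb)^2 x 120 CPU-days ≈ 1.6 CPU-days per genome

Is that the intended way to extrapolate, or does the quadratic term refer to something else, such as the number of pairwise comparisons between genomes?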

Also, is there a sense of how repeat content % in genome inputs may influence these calculations?

Thanks!

Running (progressive) cactus via Docker

Hi there. I'm new to cactus and have been trying to get it up and running via the Docker image on quay.io:

docker pull quay.io/comparative-genomics-toolkit/cactus    

Unfortunately, the program does not progress beyond parsing the seqFile. Specifically, when I run cactus as follows:

docker run -v /cactus:/cactus:Z quay.io/comparative-genomics-toolkit/cactus \  
    /cactus/examples/evolverMammals.txt \    
    /cactus/alignment.hal \    
    --maxCpus 4    

where /cactus/examples/evolverMammals.txt is the example from the repository, I get the error:

/cactus/examples/evolverMammals.txt: 1: /cactus/examples/evolverMammals.txt: Syntax error: word unexpected (expecting ")")

So for whatever reason it's failing to parse the Newick tree. If I delete the line containing the Newick tree, then I get the error:

/cactus/examples/evolverMammals.txt: 1: /cactus/examples/evolverMammals.txt: simCow_chr6: not found
/cactus/examples/evolverMammals.txt: 2: /cactus/examples/evolverMammals.txt: simDog_chr6: not found
/cactus/examples/evolverMammals.txt: 3: /cactus/examples/evolverMammals.txt: simHuman_chr6: not found
/cactus/examples/evolverMammals.txt: 4: /cactus/examples/evolverMammals.txt: simMouse_chr6: not found
/cactus/examples/evolverMammals.txt: 5: /cactus/examples/evolverMammals.txt: simRat_chr6: not found

I get the same errors if I try to load my own seqFile where the sequence paths are in a subdirectory of /cactus.

I presume I'm doing something wrong, but I haven't been able to find any documentation on running progressive cactus via the Docker image. Is my docker run ... command even running progressive cactus? Any insight would be greatly appreciated!
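For reference, my best guess at what might be missing (purely an assumption on my part, since I haven't found this documented) is that the cactus executable and a jobStore argument have to be named explicitly inside the container, something like:

    docker run -v /cactus:/cactus:Z quay.io/comparative-genomics-toolkit/cactus \
        cactus /cactus/jobStore \
        /cactus/examples/evolverMammals.txt \
        /cactus/alignment.hal \
        --maxCpus 4

but I haven't been able to confirm whether that is the intended invocation.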

The host OS is Arch Linux with kernel version 4.18.1.

KTServer connection fails (ST_KV_DATABASE_EXCEPTION) while running Cactus on an SGE cluster

Hello,

I'm trying to run the evolverMammals example on an SGE cluster (where Docker and Singularity aren't supported) using the latest version of Cactus installed through git. My problem seems similar to another recent issue report: #57.

In case it's relevant, here are a few notes about how I installed Cactus. I first compiled the older version (progressiveCactus) from GitHub, because this automatically downloads and compiles the needed dependencies, including Kyoto Tycoon (the newest version of Cactus doesn't include this). I then sourced the environment from progressiveCactus, compiled the newer version of Cactus and installed it via Pip into a freshly created Conda environment.

The evolverMammals test works fine for me on a single node, i.e. when running the following through qsub:

cactus --binariesMode local cactusWork evolverMammals-offline.txt evolverMammals.hal --root mr

However, things are failing when running distributed across multiple nodes of an SGE queue, as follows:

cactus --binariesMode local cactusWork evolverMammals-offline.txt evolverMammals.hal --root mr --batchSystem gridEngine --workDir /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp --logInfo --logFile cactus.log --maxCores 32 --disableCaching

Below are some of the errors I get. There are multiple retries, but the job never manages to continue successfully. The cluster nodes should be able to communicate with each other, so I'm not sure what could cause the ST_KV_DATABASE_EXCEPTION messages.

How can I get Cactus and KTServer to work properly when running with --batchSystem gridEngine?

INFO:toil.leader:Issued job 'StartPrimaryDB' D/F/jobwjfXJl with job batch system ID: 150 and cores: 1, disk: 2.0 G, and memory: 3.3 G
INFO:toil.leader:Job ended successfully: 'StartPrimaryDB' D/F/jobwjfXJl
INFO:toil.leader:Issued job 'KtServerService' B/T/jobNX_Wvk with job batch system ID: 151 and cores: 0, disk: 2.0 G, and memory: 2.3 G
INFO:toil.leader:Issued job 'CactusSetupPhase' G/Y/jobmegNP3 with job batch system ID: 152 and cores: 1, disk: 2.0 G, and memory: 3.3 G
INFO:toil.leader:Job ended successfully: 'KtServerService' B/T/jobNX_Wvk
WARNING:toil.leader:The job seems to have left a log file, indicating failure: 'KtServerService' B/T/jobNX_Wvk
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:toil:Running Toil version 3.18.0-84239d802248a5f4a220e762b3b8ce5cc92af0be.
WARNING:toil.leader:B/T/jobNX_Wvk    WARNING:toil.resource:'JTRES_5d2f846cd67858267ed5af4717d96bda' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:B/T/jobNX_Wvk    WARNING:toil.resource:'JTRES_5d2f846cd67858267ed5af4717d96bda' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:cactus.shared.common:Running the command ['netstat', '-tuplen']
WARNING:toil.leader:B/T/jobNX_Wvk    (No info could be read for "-p": geteuid()=98354 but you should be root.)
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:cactus.shared.common:Running the command ['ktserver', '-port', '29439', '-ls', '-tout', '200000', '-th', '64', '-bgs', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-6386a5c9-5d92-486a-9720-412b1ca610f6/tmpxersIz/e6b71d4f-cc17-405b-9945-bf74e2503b84/t7jtQpA/snapshot', '-bgsc', 'lzo', '-bgsi', '1000000', '-log', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-6386a5c9-5d92-486a-9720-412b1ca610f6/tmpxersIz/e6b71d4f-cc17-405b-9945-bf74e2503b84/tmpdU7AZM.tmp', ':#opts=ls#bnum=30m#msiz=50g#ktopts=p']
WARNING:toil.leader:B/T/jobNX_Wvk    terminate called after throwing an instance of 'std::runtime_error'
WARNING:toil.leader:B/T/jobNX_Wvk      what():  pthread_create
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:toil.lib.bioio:Ktserver running.
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:toil.lib.bioio:Ktserver running.
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:toil.lib.bioio:Ktserver running.
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:cactus.shared.common:Running the command ['ktremotemgr', 'get', '-port', '29439', '-host', '172.16.13.37', 'TERMINATE']
WARNING:toil.leader:B/T/jobNX_Wvk    Process ServerProcess-1:
WARNING:toil.leader:B/T/jobNX_Wvk    Traceback (most recent call last):
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
WARNING:toil.leader:B/T/jobNX_Wvk        self.run()
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 82, in run
WARNING:toil.leader:B/T/jobNX_Wvk        self.tryRun(*self.args, **self.kwargs)
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 118, in tryRun
WARNING:toil.leader:B/T/jobNX_Wvk        raise RuntimeError("KTServer failed. Log: %s" % f.read())
WARNING:toil.leader:B/T/jobNX_Wvk    RuntimeError: KTServer failed. Log: 2019-03-08T10:07:26.636823+02:00: [SYSTEM]: ================ [START]: pid=20742
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:07:26.637007+02:00: [SYSTEM]: opening a database: path=:#opts=ls#bnum=30m#msiz=50g#ktopts=p
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:07:26.638447+02:00: [SYSTEM]: starting the server: expr=:29439
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:07:26.638549+02:00: [SYSTEM]: server socket opened: expr=:29439 timeout=200000.0
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:07:26.638575+02:00: [SYSTEM]: listening server socket started: fd=4
WARNING:toil.leader:B/T/jobNX_Wvk    
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:cactus.shared.common:Running the command ['ktremotemgr', 'set', '-port', '29439', '-host', '172.16.13.37', 'TERMINATE', '1']
WARNING:toil.leader:B/T/jobNX_Wvk    ktremotemgr: DB::open failed: : 6: network error: connection failed
WARNING:toil.leader:B/T/jobNX_Wvk    Traceback (most recent call last):
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/worker.py", line 314, in workerScript
WARNING:toil.leader:B/T/jobNX_Wvk        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1351, in _runner
WARNING:toil.leader:B/T/jobNX_Wvk        returnValues = self._run(jobGraph, fileStore)
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1694, in _run
WARNING:toil.leader:B/T/jobNX_Wvk        returnValues = self.run(fileStore)
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1673, in run
WARNING:toil.leader:B/T/jobNX_Wvk        if not service.check():
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverToil.py", line 55, in check
WARNING:toil.leader:B/T/jobNX_Wvk        raise RuntimeError(msg)
WARNING:toil.leader:B/T/jobNX_Wvk    RuntimeError: Traceback (most recent call last):
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 82, in run
WARNING:toil.leader:B/T/jobNX_Wvk        self.tryRun(*self.args, **self.kwargs)
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 118, in tryRun
WARNING:toil.leader:B/T/jobNX_Wvk        raise RuntimeError("KTServer failed. Log: %s" % f.read())
WARNING:toil.leader:B/T/jobNX_Wvk    RuntimeError: KTServer failed. Log: 2019-03-08T10:07:26.636823+02:00: [SYSTEM]: ================ [START]: pid=20742
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:07:26.637007+02:00: [SYSTEM]: opening a database: path=:#opts=ls#bnum=30m#msiz=50g#ktopts=p
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:07:26.638447+02:00: [SYSTEM]: starting the server: expr=:29439
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:07:26.638549+02:00: [SYSTEM]: server socket opened: expr=:29439 timeout=200000.0
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:07:26.638575+02:00: [SYSTEM]: listening server socket started: fd=4
WARNING:toil.leader:B/T/jobNX_Wvk    
WARNING:toil.leader:B/T/jobNX_Wvk    
WARNING:toil.leader:B/T/jobNX_Wvk    ERROR:toil.worker:Exiting the worker because of a failed job on host haswell-wn37.grid.pub.ro
WARNING:toil.leader:B/T/jobNX_Wvk    WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'KtServerService' B/T/jobNX_Wvk with ID B/T/jobNX_Wvk to 5
INFO:toil.leader:Issued job 'KtServerService' B/T/jobNX_Wvk with job batch system ID: 153 and cores: 0, disk: 2.0 G, and memory: 2.3 G
INFO:toil.leader:Job ended successfully: 'CactusSetupPhase' G/Y/jobmegNP3
WARNING:toil.leader:The job seems to have left a log file, indicating failure: 'CactusSetupPhase' G/Y/jobmegNP3
WARNING:toil.leader:G/Y/jobmegNP3    INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
WARNING:toil.leader:G/Y/jobmegNP3    INFO:toil:Running Toil version 3.18.0-84239d802248a5f4a220e762b3b8ce5cc92af0be.
WARNING:toil.leader:G/Y/jobmegNP3    WARNING:toil.resource:'JTRES_5d2f846cd67858267ed5af4717d96bda' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:G/Y/jobmegNP3    INFO:toil.lib.bioio:Sequences in cactus setup: ['simHuman_chr6', 'simMouse_chr6', 'simRat_chr6', 'simCow_chr6', 'simDog_chr6']
WARNING:toil.leader:G/Y/jobmegNP3    INFO:toil.lib.bioio:Sequences in cactus setup filenames: ['>id=1|simHuman.chr6|0\n', '>id=0|simMouse.chr6\n', '>id=2|simRat.chr6\n', '>id=4|simCow.chr6|0\n', '>id=3|simDog.chr6|0\n']
WARNING:toil.leader:G/Y/jobmegNP3    INFO:cactus.shared.common:Running the command ['cactus_setup', '--speciesTree', '((simHuman_chr6:0.144018,(simMouse_chr6:0.084509,simRat_chr6:0.091589)mr:0.271974)Anc1:0.020593,(simCow_chr6:0.18908,simDog_chr6:0.16303)Anc2:0.032898)Anc0;', '--cactusDisk', '<st_kv_database_conf type="kyoto_tycoon">\n\t\t\t<kyoto_tycoon database_dir="fakepath" host="172.16.13.37" port="29439" />\n\t\t</st_kv_database_conf>\n\t', '--logLevel', 'INFO', '--outgroupEvents', 'simHuman_chr6 simDog_chr6 simCow_chr6', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmp3wGM7F.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpqkriEI.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpo20GAf.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpL6ca4z.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmp5PbyOe.tmp']
WARNING:toil.leader:G/Y/jobmegNP3    Set log level to INFO
WARNING:toil.leader:G/Y/jobmegNP3    Flower disk name : <st_kv_database_conf type="kyoto_tycoon">
WARNING:toil.leader:G/Y/jobmegNP3    			<kyoto_tycoon database_dir="fakepath" host="172.16.13.37" port="29439" />
WARNING:toil.leader:G/Y/jobmegNP3    		</st_kv_database_conf>
WARNING:toil.leader:G/Y/jobmegNP3    	
WARNING:toil.leader:G/Y/jobmegNP3    Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmp3wGM7F.tmp
WARNING:toil.leader:G/Y/jobmegNP3    Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpqkriEI.tmp
WARNING:toil.leader:G/Y/jobmegNP3    Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpo20GAf.tmp
WARNING:toil.leader:G/Y/jobmegNP3    Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpL6ca4z.tmp
WARNING:toil.leader:G/Y/jobmegNP3    Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmp5PbyOe.tmp
WARNING:toil.leader:G/Y/jobmegNP3    Exception: ST_KV_DATABASE_EXCEPTION: Opening connection to host: 172.16.13.37 with error: network error
WARNING:toil.leader:G/Y/jobmegNP3    Uncaught exception
WARNING:toil.leader:G/Y/jobmegNP3    Traceback (most recent call last):
WARNING:toil.leader:G/Y/jobmegNP3      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/worker.py", line 314, in workerScript
WARNING:toil.leader:G/Y/jobmegNP3        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:G/Y/jobmegNP3      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/shared/common.py", line 1096, in _runner
WARNING:toil.leader:G/Y/jobmegNP3        super(RoundedJob, self)._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:G/Y/jobmegNP3      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1351, in _runner
WARNING:toil.leader:G/Y/jobmegNP3        returnValues = self._run(jobGraph, fileStore)
WARNING:toil.leader:G/Y/jobmegNP3      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1296, in _run
WARNING:toil.leader:G/Y/jobmegNP3        return self.run(fileStore)
WARNING:toil.leader:G/Y/jobmegNP3      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/cactus_workflow.py", line 641, in run
WARNING:toil.leader:G/Y/jobmegNP3        makeEventHeadersAlphaNumeric=self.getOptionalPhaseAttrib("makeEventHeadersAlphaNumeric", bool, False))
WARNING:toil.leader:G/Y/jobmegNP3      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/shared/common.py", line 220, in runCactusSetup
WARNING:toil.leader:G/Y/jobmegNP3        parameters=["cactus_setup"] + args + sequences)
WARNING:toil.leader:G/Y/jobmegNP3      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/shared/common.py", line 1040, in cactus_call
WARNING:toil.leader:G/Y/jobmegNP3        raise RuntimeError("Command %s failed with output: %s" % (call, output))
WARNING:toil.leader:G/Y/jobmegNP3    RuntimeError: Command ['cactus_setup', '--speciesTree', '((simHuman_chr6:0.144018,(simMouse_chr6:0.084509,simRat_chr6:0.091589)mr:0.271974)Anc1:0.020593,(simCow_chr6:0.18908,simDog_chr6:0.16303)Anc2:0.032898)Anc0;', '--cactusDisk', '<st_kv_database_conf type="kyoto_tycoon">\n\t\t\t<kyoto_tycoon database_dir="fakepath" host="172.16.13.37" port="29439" />\n\t\t</st_kv_database_conf>\n\t', '--logLevel', 'INFO', '--outgroupEvents', 'simHuman_chr6 simDog_chr6 simCow_chr6', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmp3wGM7F.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpqkriEI.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpo20GAf.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpL6ca4z.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmp5PbyOe.tmp'] failed with output: 
WARNING:toil.leader:G/Y/jobmegNP3    ERROR:toil.worker:Exiting the worker because of a failed job on host haswell-wn41.grid.pub.ro
WARNING:toil.leader:G/Y/jobmegNP3    WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'CactusSetupPhase' G/Y/jobmegNP3 with ID G/Y/jobmegNP3 to 5
INFO:toil.leader:Issued job 'CactusSetupPhase' G/Y/jobmegNP3 with job batch system ID: 154 and cores: 1, disk: 2.0 G, and memory: 3.3 G
INFO:toil.leader:Job ended successfully: 'CactusSetupPhase' G/Y/jobmegNP3
WARNING:toil.leader:The job seems to have left a log file, indicating failure: 'CactusSetupPhase' G/Y/jobmegNP3
WARNING:toil.leader:G/Y/jobmegNP3    INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
WARNING:toil.leader:G/Y/jobmegNP3    INFO:toil:Running Toil version 3.18.0-84239d802248a5f4a220e762b3b8ce5cc92af0be.
WARNING:toil.leader:G/Y/jobmegNP3    WARNING:toil.resource:'JTRES_5d2f846cd67858267ed5af4717d96bda' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:G/Y/jobmegNP3    INFO:toil.lib.bioio:Sequences in cactus setup: ['simHuman_chr6', 'simMouse_chr6', 'simRat_chr6', 'simCow_chr6', 'simDog_chr6']
WARNING:toil.leader:G/Y/jobmegNP3    INFO:toil.lib.bioio:Sequences in cactus setup filenames: ['>id=1|simHuman.chr6|0\n', '>id=0|simMouse.chr6\n', '>id=2|simRat.chr6\n', '>id=4|simCow.chr6|0\n', '>id=3|simDog.chr6|0\n']
WARNING:toil.leader:G/Y/jobmegNP3    INFO:cactus.shared.common:Running the command ['cactus_setup', '--speciesTree', '((simHuman_chr6:0.144018,(simMouse_chr6:0.084509,simRat_chr6:0.091589)mr:0.271974)Anc1:0.020593,(simCow_chr6:0.18908,simDog_chr6:0.16303)Anc2:0.032898)Anc0;', '--cactusDisk', '<st_kv_database_conf type="kyoto_tycoon">\n\t\t\t<kyoto_tycoon database_dir="fakepath" host="172.16.13.37" port="29439" />\n\t\t</st_kv_database_conf>\n\t', '--logLevel', 'INFO', '--outgroupEvents', 'simHuman_chr6 simDog_chr6 simCow_chr6', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp6DYzJV.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp4U7wPE.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp2oM2za.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmpUgFVai.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmpWG2t9T.tmp']
WARNING:toil.leader:G/Y/jobmegNP3    Set log level to INFO
WARNING:toil.leader:G/Y/jobmegNP3    Flower disk name : <st_kv_database_conf type="kyoto_tycoon">
WARNING:toil.leader:G/Y/jobmegNP3    			<kyoto_tycoon database_dir="fakepath" host="172.16.13.37" port="29439" />
WARNING:toil.leader:G/Y/jobmegNP3    		</st_kv_database_conf>
WARNING:toil.leader:G/Y/jobmegNP3    	
WARNING:toil.leader:G/Y/jobmegNP3    Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp6DYzJV.tmp
WARNING:toil.leader:G/Y/jobmegNP3    Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp4U7wPE.tmp
WARNING:toil.leader:G/Y/jobmegNP3    Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp2oM2za.tmp
WARNING:toil.leader:G/Y/jobmegNP3    Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmpUgFVai.tmp
WARNING:toil.leader:G/Y/jobmegNP3    Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmpWG2t9T.tmp
WARNING:toil.leader:G/Y/jobmegNP3    Exception: ST_KV_DATABASE_EXCEPTION: Opening connection to host: 172.16.13.37 with error: network error
WARNING:toil.leader:G/Y/jobmegNP3    Uncaught exception
WARNING:toil.leader:G/Y/jobmegNP3    Traceback (most recent call last):
WARNING:toil.leader:G/Y/jobmegNP3      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/worker.py", line 314, in workerScript
WARNING:toil.leader:G/Y/jobmegNP3        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:G/Y/jobmegNP3      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/shared/common.py", line 1096, in _runner
WARNING:toil.leader:G/Y/jobmegNP3        super(RoundedJob, self)._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:G/Y/jobmegNP3      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1351, in _runner
WARNING:toil.leader:G/Y/jobmegNP3        returnValues = self._run(jobGraph, fileStore)
WARNING:toil.leader:G/Y/jobmegNP3      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1296, in _run
WARNING:toil.leader:G/Y/jobmegNP3        return self.run(fileStore)
WARNING:toil.leader:G/Y/jobmegNP3      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/cactus_workflow.py", line 641, in run
WARNING:toil.leader:G/Y/jobmegNP3        makeEventHeadersAlphaNumeric=self.getOptionalPhaseAttrib("makeEventHeadersAlphaNumeric", bool, False))
WARNING:toil.leader:G/Y/jobmegNP3      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/shared/common.py", line 220, in runCactusSetup
WARNING:toil.leader:G/Y/jobmegNP3        parameters=["cactus_setup"] + args + sequences)
WARNING:toil.leader:G/Y/jobmegNP3      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/shared/common.py", line 1040, in cactus_call
WARNING:toil.leader:G/Y/jobmegNP3        raise RuntimeError("Command %s failed with output: %s" % (call, output))
WARNING:toil.leader:G/Y/jobmegNP3    RuntimeError: Command ['cactus_setup', '--speciesTree', '((simHuman_chr6:0.144018,(simMouse_chr6:0.084509,simRat_chr6:0.091589)mr:0.271974)Anc1:0.020593,(simCow_chr6:0.18908,simDog_chr6:0.16303)Anc2:0.032898)Anc0;', '--cactusDisk', '<st_kv_database_conf type="kyoto_tycoon">\n\t\t\t<kyoto_tycoon database_dir="fakepath" host="172.16.13.37" port="29439" />\n\t\t</st_kv_database_conf>\n\t', '--logLevel', 'INFO', '--outgroupEvents', 'simHuman_chr6 simDog_chr6 simCow_chr6', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp6DYzJV.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp4U7wPE.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp2oM2za.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmpUgFVai.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmpWG2t9T.tmp'] failed with output: 
WARNING:toil.leader:G/Y/jobmegNP3    ERROR:toil.worker:Exiting the worker because of a failed job on host haswell-wn35.grid.pub.ro
WARNING:toil.leader:G/Y/jobmegNP3    WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'CactusSetupPhase' G/Y/jobmegNP3 with ID G/Y/jobmegNP3 to 4
INFO:toil.leader:Issued job 'CactusSetupPhase' G/Y/jobmegNP3 with job batch system ID: 155 and cores: 1, disk: 2.0 G, and memory: 3.3 G
INFO:toil.leader:Job ended successfully: 'KtServerService' B/T/jobNX_Wvk
WARNING:toil.leader:The job seems to have left a log file, indicating failure: 'KtServerService' B/T/jobNX_Wvk
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:toil:Running Toil version 3.18.0-84239d802248a5f4a220e762b3b8ce5cc92af0be.
WARNING:toil.leader:B/T/jobNX_Wvk    WARNING:toil.resource:'JTRES_5d2f846cd67858267ed5af4717d96bda' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:B/T/jobNX_Wvk    WARNING:toil.resource:'JTRES_5d2f846cd67858267ed5af4717d96bda' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:cactus.shared.common:Running the command ['netstat', '-tuplen']
WARNING:toil.leader:B/T/jobNX_Wvk    (No info could be read for "-p": geteuid()=98354 but you should be root.)
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:cactus.shared.common:Running the command ['ktserver', '-port', '26666', '-ls', '-tout', '200000', '-th', '64', '-bgs', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-b02a3811-2b63-4208-851d-7815af46a62d/tmp2i3iEe/bbd7502d-1905-454c-8a42-2a91f1f28f96/tTaU1Kr/snapshot', '-bgsc', 'lzo', '-bgsi', '1000000', '-log', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-b02a3811-2b63-4208-851d-7815af46a62d/tmp2i3iEe/bbd7502d-1905-454c-8a42-2a91f1f28f96/tmpFZwVOe.tmp', ':#opts=ls#bnum=30m#msiz=50g#ktopts=p']
WARNING:toil.leader:B/T/jobNX_Wvk    terminate called after throwing an instance of 'std::runtime_error'
WARNING:toil.leader:B/T/jobNX_Wvk      what():  pthread_create
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:toil.lib.bioio:Ktserver running.
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:toil.lib.bioio:Ktserver running.
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:toil.lib.bioio:Ktserver running.
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:cactus.shared.common:Running the command ['ktremotemgr', 'get', '-port', '26666', '-host', '172.16.13.39', 'TERMINATE']
WARNING:toil.leader:B/T/jobNX_Wvk    Process ServerProcess-1:
WARNING:toil.leader:B/T/jobNX_Wvk    Traceback (most recent call last):
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
WARNING:toil.leader:B/T/jobNX_Wvk        self.run()
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 82, in run
WARNING:toil.leader:B/T/jobNX_Wvk        self.tryRun(*self.args, **self.kwargs)
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 118, in tryRun
WARNING:toil.leader:B/T/jobNX_Wvk        raise RuntimeError("KTServer failed. Log: %s" % f.read())
WARNING:toil.leader:B/T/jobNX_Wvk    RuntimeError: KTServer failed. Log: 2019-03-08T10:11:46.189587+02:00: [SYSTEM]: ================ [START]: pid=9125
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:11:46.189773+02:00: [SYSTEM]: opening a database: path=:#opts=ls#bnum=30m#msiz=50g#ktopts=p
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:11:46.191313+02:00: [SYSTEM]: starting the server: expr=:26666
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:11:46.191411+02:00: [SYSTEM]: server socket opened: expr=:26666 timeout=200000.0
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:11:46.191438+02:00: [SYSTEM]: listening server socket started: fd=4
WARNING:toil.leader:B/T/jobNX_Wvk    
WARNING:toil.leader:B/T/jobNX_Wvk    INFO:cactus.shared.common:Running the command ['ktremotemgr', 'set', '-port', '26666', '-host', '172.16.13.39', 'TERMINATE', '1']
WARNING:toil.leader:B/T/jobNX_Wvk    ktremotemgr: DB::open failed: : 6: network error: connection failed
WARNING:toil.leader:B/T/jobNX_Wvk    Traceback (most recent call last):
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/worker.py", line 314, in workerScript
WARNING:toil.leader:B/T/jobNX_Wvk        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1351, in _runner
WARNING:toil.leader:B/T/jobNX_Wvk        returnValues = self._run(jobGraph, fileStore)
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1694, in _run
WARNING:toil.leader:B/T/jobNX_Wvk        returnValues = self.run(fileStore)
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1673, in run
WARNING:toil.leader:B/T/jobNX_Wvk        if not service.check():
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverToil.py", line 55, in check
WARNING:toil.leader:B/T/jobNX_Wvk        raise RuntimeError(msg)
WARNING:toil.leader:B/T/jobNX_Wvk    RuntimeError: Traceback (most recent call last):
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 82, in run
WARNING:toil.leader:B/T/jobNX_Wvk        self.tryRun(*self.args, **self.kwargs)
WARNING:toil.leader:B/T/jobNX_Wvk      File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 118, in tryRun
WARNING:toil.leader:B/T/jobNX_Wvk        raise RuntimeError("KTServer failed. Log: %s" % f.read())
WARNING:toil.leader:B/T/jobNX_Wvk    RuntimeError: KTServer failed. Log: 2019-03-08T10:11:46.189587+02:00: [SYSTEM]: ================ [START]: pid=9125
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:11:46.189773+02:00: [SYSTEM]: opening a database: path=:#opts=ls#bnum=30m#msiz=50g#ktopts=p
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:11:46.191313+02:00: [SYSTEM]: starting the server: expr=:26666
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:11:46.191411+02:00: [SYSTEM]: server socket opened: expr=:26666 timeout=200000.0
WARNING:toil.leader:B/T/jobNX_Wvk    2019-03-08T10:11:46.191438+02:00: [SYSTEM]: listening server socket started: fd=4

Best settings

Hi,

I'm trying to align the genomes of these 6 species:

[screenshot of the 6 species (2018-09-20) omitted]

For Talp I have 12 resequenced genomes plus the reference, for a total of 18 genomes to align.
This is how I set up the control file (seqfile):

((((((((((bkp55bd,bkp67ad),bkp73bd),bkp88bd),(((wk44ad,wk59bd),wk77bd),wk86ad)),(((wp45ad,wp45bd),wp87bd),wp96bd)),Talp),Timm),(Tbic,Tsim)),Tpar),Veme);
bkp55bd 433_bkp55bd.fasta
bkp67ad 433_bkp67ad.fasta
bkp73bd 433_bkp73bd.fasta
bkp88bd 433_bkp88bd.fasta
wk44ad 433_wk44ad.fasta
wk59bd 433_wk59bd.fasta
wk77bd 433_wk77bd.fasta
wk86ad 433_wk86ad.fasta
wp45ad 433_wp45ad.fasta
wp45bd 433_wp45bd.fasta
wp87bd 433_wp87bd.fasta
wp96bd 433_wp96bd.fasta
*Talp Talp.v2.0.assembly.fasta
Tbic Tbic.v1.0.assembly.fasta
Timm Timm.v1.0.assembly.fasta
Tpar Tpar.v1.0.assembly.fasta
Tsim Tsim.v1.0.assembly.fasta
*Veme V.emery_V1.0.fasta

I'm now running the job using the default settings. Which settings would be worth playing with?

Thanks
F

Cactus pipeline hangs when attempting to send `TERMINATE` signal to ktserver process under Singularity 3

When running the example data on our CentOS 6 system interactively (no job manager), I run into a situation where the SavePrimaryDB step (among others) hangs. The problem is that the DB service is not being terminated as it should be, because the SIGINT signal isn't passed through the default Singularity 3 runscript. Under Singularity 2, the runscript used an exec command, but under Singularity 3 the runscript uses an eval command. This has the effect that signals are not passed down to the /opt/cactus/wrapper.sh script.
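
For illustration only, here is a minimal, hypothetical Python sketch (not part of Cactus or Singularity) of the behaviour that goes missing with an eval-based runscript: a wrapper that explicitly relays SIGINT/SIGTERM to the command it launches, so that a shutdown signal sent to the outer process still reaches the server inside.

#!/usr/bin/env python3
# relay_signals.py -- hypothetical wrapper that forwards SIGINT/SIGTERM to its
# child, illustrating the signal propagation the eval-based runscript lacks.
import signal
import subprocess
import sys

if len(sys.argv) < 2:
    sys.exit("usage: relay_signals.py COMMAND [ARGS...]")

child = subprocess.Popen(sys.argv[1:])   # start the wrapped command (e.g. a server)

def relay(signum, _frame):
    child.send_signal(signum)            # pass the signal straight through

signal.signal(signal.SIGINT, relay)
signal.signal(signal.SIGTERM, relay)

sys.exit(child.wait())                   # propagate the child's exit status

With an exec-style runscript none of this relaying is needed, because the wrapped command simply replaces the shell and receives the signal directly; with eval, the intermediate shell absorbs it.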

INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
INFO:toil:Running Toil version 3.14.0-b91dbf9bf6116879952f0a70f9a2fbbcae7e51b6.
WARNING:toil.resource:'JTRES_729ca706c404b9f57dd4bfd347bf0bd2' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
INFO:cactus.shared.common:Work dirs: set([])
INFO:cactus.shared.common:Docker work dir: .
INFO:cactus.shared.common:Running the command ['singularity', '--silent', 'run', '/Genomics/grid/users/lparsons/kocher_lab/cactus/cactus.img', 'cactus_workflow_flowerStats', 'INFO', '<st_kv_database_conf type="kyoto_tycoon">\n\t\t\t<kyoto_tycoon database_dir="fakepath" host="128.112.116.245" port="9429" />\n\t\t</st_kv_database_conf>\n\t', '0']
Running command catchsegv 'cactus_workflow_flowerStats' 'INFO' '<st_kv_database_conf type=kyoto_tycoon> <kyoto_tycoon database_dir=fakepath host=128.112.116.245 port=9429 /> </st_kv_database_conf> ' '0'
Set log level to INFO
INFO:toil.fileStore:LOG-TO-MASTER: At end of caf phase, got stats {
  "flowerName": 0,
  "totalBases": 2254067,
  "totalEnds": 640,
  "totalCaps": 1642,
  "maxEndDegree": 4,
  "maxAdjacencyLength": 627671,
  "totalBlocks": 242,
  "totalGroups": 247,
  "totalEdges": 652,
  "totalFreeEnds": 146,
  "totalAttachedEnds": 10,
  "totalChains": 5,
  "totalLinkGroups": 247
}
flower name: 0 total bases: 2254067 total-ends: 640 total-caps: 1642 max-end-degree: 4 max-adjacency-length: 627671 total-blocks: 242 total-groups: 247 total-edges: 326 total-free-ends: 146 total-attached-ends: 10 total-chains: 5 total-link groups: 247

INFO:cactus.shared.common:Work dirs: set([])
INFO:cactus.shared.common:Docker work dir: .
INFO:cactus.shared.common:Running the command ['singularity', '--silent', 'run', '/Genomics/grid/users/lparsons/kocher_lab/cactus/cactus.img', 'ktremotemgr', 'set', '-port', '9429', '-host', '128.112.116.245', 'TERMINATE', '1']
Running command catchsegv 'ktremotemgr' 'set' '-port' '9429' '-host' '128.112.116.245' 'TERMINATE' '1'

cannot find cPickle

Hi there, I installed Cactus, but when I run "cactus -h" from the command line, I get a module-not-found error. After doing some digging, I found that cPickle is not available for Python 3.
Is the virtual environment in cactus_env supposed to be running Python 3 or Python 2.7? I am confused.
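
For context, cPickle is a Python 2-only module; in Python 3 its C implementation was folded into the standard pickle module, so nothing named cPickle exists there. A common compatibility shim (generic Python, not Cactus code) is:

try:
    import cPickle as pickle   # Python 2: separate C-accelerated module
except ImportError:
    import pickle              # Python 3: the C implementation is built into pickle

# The API is the same either way:
blob = pickle.dumps({"example": 1})
print(pickle.loads(blob))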

People find the cactus header requirements + error messages confusing

Low priority, but: the requirements that Cactus has for FASTA headers are somewhat confusing to users, and it's especially bad since many NCBI/GenBank identifiers break them. We should look into making the error messages more straightforward and providing a suggested fix (probably replacing dots and/or spaces with underscores; see the sketch below).

We can't do much about the browser's requirements for the "." syntax, but we could look into using [nameparse=full] instead of [nameparse=darkspace] for lastz to attempt to get rid of the "first word of a header must be unique" rule. This would make parsing the cigars difficult, though, so it may be more trouble than it's worth.
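
As an interim workaround for users, a hedged sketch of the clean-up suggested above (replacing dots, spaces and other awkward characters in FASTA headers with underscores before running Cactus; file names here are hypothetical) could look like this:

import re

def sanitize_fasta_headers(in_path, out_path):
    # Replace every character that is not alphanumeric or '_' in each header
    # with '_', leaving the sequence lines untouched.
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            if line.startswith(">"):
                name = line[1:].strip()
                line = ">" + re.sub(r"[^A-Za-z0-9_]", "_", name) + "\n"
            fout.write(line)

sanitize_fasta_headers("genome.fa", "genome.clean.fa")   # hypothetical file names

After rewriting, it is worth checking that the headers are still unique, since two names that differed only in punctuation would now collide.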

Error: "config.pickle" file does not exist

Hi,
Lately I have been stuck on the following error while testing the example data:

INFO:toil.lib.bioio:Root logger is at level 'INFO', 'toil' logger at level 'INFO'.
INFO:toil.lib.bioio:Logging to file 'try.log'.
INFO:toil.lib.bioio:Root logger is at level 'INFO', 'toil' logger at level 'INFO'.
INFO:toil.lib.bioio:Logging to file 'try.log'.
Traceback (most recent call last):
  File "/home/apps/software/cactus/20180705-IGB-gcc-4.9.4-Python-2.7.13/cactus_env/bin/cactus", line 11, in <module>
    sys.exit(main())
  File "/home/apps/software/cactus/20180705-IGB-gcc-4.9.4-Python-2.7.13/cactus_env/lib/python2.7/site-packages/cactus/progressive/cactus_progressive.py", line 447, in main
    with Toil(options) as toil:
  File "/home/apps/software/cactus/20180705-IGB-gcc-4.9.4-Python-2.7.13/cactus_env/lib/python2.7/site-packages/toil/common.py", line 714, in __enter__
    jobStore.resume()
  File "/home/apps/software/cactus/20180705-IGB-gcc-4.9.4-Python-2.7.13/cactus_env/lib/python2.7/site-packages/toil/jobStores/fileJobStore.py", line 90, in resume
    super(FileJobStore, self).resume()
  File "/home/apps/software/cactus/20180705-IGB-gcc-4.9.4-Python-2.7.13/cactus_env/lib/python2.7/site-packages/toil/jobStores/abstractJobStore.py", line 156, in resume
    with self.readSharedFileStream('config.pickle') as fileHandle:
  File "/home/apps/software/Python/2.7.13-IGB-gcc-4.9.4/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/apps/software/cactus/20180705-IGB-gcc-4.9.4-Python-2.7.13/cactus_env/lib/python2.7/site-packages/toil/jobStores/fileJobStore.py", line 400, in readSharedFileStream
    raise NoSuchFileException(sharedFileName,sharedFileName)
toil.jobStores.abstractJobStore.NoSuchFileException: File 'config.pickle' (config.pickle) does not exist

I really don't know where this config.pickle file could come from.
The command I ran is
cactus --logFile try.log --binariesMode local --restart output/ seqfile try.hal
Does anyone have any ideas? Thanks.

Wei Wei
