dragonflye's Issues

mismatch between model names valid for dragonflye 1.1.0 and medaka 1.8.0

Hi,

I have made a conda install of dragonflye (within a docker image), forcing the dependencies for flye and medaka to be the latest versions:

micromamba install -n base -y -c conda-forge -c bioconda \
    flye=2.9.2 \
    medaka=1.8.0 \
    dragonflye=1.1.0

This works, but if I want to specify the latest model, r1041_e82_400bps_sup_v420, I get an error at the medaka stage:

[...]
[dragonflye] Running: medaka_consensus -i READS.fq.gz -d flye/polish/racon/1/consensus.fasta -o flye/polish/medaka/1 -m r1041_e82_400bps_sup_v420 -t 4  2>&1 | sed 's/^/[polishing - medaka (1 of 1)] /' | tee -a dragonflye.log
[polishing - medaka (1 of 1)] Traceback (most recent call last):
[polishing - medaka (1 of 1)]   File "/opt/conda/lib/python3.10/site-packages/medaka/medaka.py", line 35, in __call__
[polishing - medaka (1 of 1)]     model_fp = medaka.models.resolve_model(val)
[polishing - medaka (1 of 1)]   File "/opt/conda/lib/python3.10/site-packages/medaka/models.py", line 31, in resolve_model
[polishing - medaka (1 of 1)]     raise ValueError(
[polishing - medaka (1 of 1)] ValueError: Model r1041_e82_400bps_sup_v420 is not a known model or existant file.
[dragonflye] Error running command: medaka_consensus -i READS.fq.gz -d flye/polish/racon/1/consensus.fasta -o flye/polish/medaka/1 -m r1041_e82_400bps_sup_v420 -t 4  2>&1 | sed 's/^/[polishing - medaka (1 of 1)] /' | tee -a dragonflye.log

Indeed medaka wants something like this: r1041_e82_400bps_sup_v4.2.0, with dots in the version name.

docker run -v $HOME:$HOME -w $HOME/test gitlab-registry.internal.sanger.ac.uk/sanger-pathogens/docker-images-test/dragonflye:1.1.0 medaka tools list_models
Available: r103_fast_g507, r103_fast_snp_g507, r103_fast_variant_g507, r103_hac_g507, r103_hac_snp_g507, r103_hac_variant_g507, r103_min_high_g345, r103_min_high_g360, r103_prom_high_g360, r103_prom_snp_g3210, r103_prom_variant_g3210, r103_sup_g507, r103_sup_snp_g507, r103_sup_variant_g507, r1041_e82_260bps_fast_g632, r1041_e82_260bps_fast_variant_g632, r1041_e82_260bps_hac_g632, r1041_e82_260bps_hac_v4.0.0, r1041_e82_260bps_hac_v4.1.0, r1041_e82_260bps_hac_variant_g632, r1041_e82_260bps_hac_variant_v4.1.0, r1041_e82_260bps_sup_g632, r1041_e82_260bps_sup_v4.0.0, r1041_e82_260bps_sup_v4.1.0, r1041_e82_260bps_sup_variant_g632, r1041_e82_260bps_sup_variant_v4.1.0, r1041_e82_400bps_fast_g615, r1041_e82_400bps_fast_g632, r1041_e82_400bps_fast_variant_g615, r1041_e82_400bps_fast_variant_g632, r1041_e82_400bps_hac_g615, r1041_e82_400bps_hac_g632, r1041_e82_400bps_hac_v4.0.0, r1041_e82_400bps_hac_v4.1.0, r1041_e82_400bps_hac_v4.2.0, r1041_e82_400bps_hac_variant_g615, r1041_e82_400bps_hac_variant_g632, r1041_e82_400bps_hac_variant_v4.1.0, r1041_e82_400bps_hac_variant_v4.2.0, r1041_e82_400bps_sup_g615, r1041_e82_400bps_sup_v4.0.0, r1041_e82_400bps_sup_v4.1.0, r1041_e82_400bps_sup_v4.2.0, r1041_e82_400bps_sup_variant_g615, r1041_e82_400bps_sup_variant_v4.1.0, r1041_e82_400bps_sup_variant_v4.2.0, r104_e81_fast_g5015, r104_e81_fast_variant_g5015, r104_e81_hac_g5015, r104_e81_hac_variant_g5015, r104_e81_sup_g5015, r104_e81_sup_g610, r104_e81_sup_variant_g610, r10_min_high_g303, r10_min_high_g340, r941_e81_fast_g514, r941_e81_fast_variant_g514, r941_e81_hac_g514, r941_e81_hac_variant_g514, r941_e81_sup_g514, r941_e81_sup_variant_g514, r941_min_fast_g303, r941_min_fast_g507, r941_min_fast_snp_g507, r941_min_fast_variant_g507, r941_min_hac_g507, r941_min_hac_snp_g507, r941_min_hac_variant_g507, r941_min_high_g303, r941_min_high_g330, r941_min_high_g340_rle, r941_min_high_g344, r941_min_high_g351, r941_min_high_g360, r941_min_sup_g507, r941_min_sup_snp_g507, 
r941_min_sup_variant_g507, r941_prom_fast_g303, r941_prom_fast_g507, r941_prom_fast_snp_g507, r941_prom_fast_variant_g507, r941_prom_hac_g507, r941_prom_hac_snp_g507, r941_prom_hac_variant_g507, r941_prom_high_g303, r941_prom_high_g330, r941_prom_high_g344, r941_prom_high_g360, r941_prom_high_g4011, r941_prom_snp_g303, r941_prom_snp_g322, r941_prom_snp_g360, r941_prom_sup_g507, r941_prom_sup_snp_g507, r941_prom_sup_variant_g507, r941_prom_variant_g303, r941_prom_variant_g322, r941_prom_variant_g360, r941_sup_plant_g610, r941_sup_plant_variant_g610
Default consensus:  r1041_e82_400bps_sup_v4.2.0
Default variant:  r1041_e82_400bps_sup_variant_v4.2.0

If I pass that medaka-valid name to dragonflye:

dragonflye \
--reads dragonflye/barcode07.fastq.gz \
--R1 4075_2#2_1.fastq.gz \
--R2 4075_2#2_2.fastq.gz \
--gsize 4.5M --medaka 1 --model r1041_e82_400bps_sup_v4.2.0 \
--cpus 4 --ram 6 --outdir dragonflye/test_IP6794-89

then dragonflye fails at the argument validation step:

[dragonflye] You ran: /opt/conda/bin/dragonflye --reads dragonflye/barcode07.fastq.gz --R1 dragonflye/4075_2#2_1.fastq.gz --R2 dragonflye/4075_2#2_2.fastq.gz --gsize 4.5M --medaka 1 --model r1041_e82_400bps_sup_v4.2.0 --cpus 4 --ram 6 --outdir dragonflye/test_IP6794-89
[dragonflye] This is dragonflye 1.1.0
[dragonflye] Written by Robert A Petit III
[dragonflye] Homepage is https://github.com/rpetit3/dragonflye
[dragonflye] Operating system is linux
[dragonflye] Perl version is v5.32.1
[dragonflye] Machine has 256 CPU cores and 2015.34 GB RAM
[dragonflye] Verifying input model (--model): r1041_e82_400bps_sup_v4.2.0
[dragonflye] Unable to verify model 'r1041_e82_400bps_sup_v4.2.0', please check spelling and try again.
[dragonflye] Available Medaka models include:
[dragonflye]    r103_fast_g507
[dragonflye]    r103_hac_g507
[dragonflye]    r103_min_high_g345
[dragonflye]    r103_min_high_g360
[...]

Could you please change your validation scheme so that it matches that of medaka?

Best wishes,
Florent
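
One way a wrapper could avoid a hard-coded model list is to ask the installed medaka which models it knows and validate against that. The sketch below parses the "Available:" output of medaka tools list_models in the form shown above; the function names are hypothetical, and this is not a description of how dragonflye validates models.

```python
import subprocess


def parse_available_models(listing: str) -> set[str]:
    """Extract model names from the text printed by `medaka tools list_models`."""
    models: set[str] = set()
    in_available = False
    for line in listing.splitlines():
        if line.startswith("Available:"):
            in_available = True
            line = line[len("Available:"):]
        elif line.startswith("Default"):
            # The "Default consensus/variant" lines end the model list.
            in_available = False
        if in_available:
            # The list may wrap over several lines, so keep collecting.
            models.update(name.strip() for name in line.split(",") if name.strip())
    return models


def installed_medaka_models() -> set[str]:
    """Hypothetical helper: query whichever medaka is found on PATH."""
    result = subprocess.run(
        ["medaka", "tools", "list_models"],
        capture_output=True, text=True, check=True,
    )
    return parse_available_models(result.stdout)
```

With this approach, a name such as r1041_e82_400bps_sup_v4.2.0 would be accepted or rejected according to the medaka version that will actually run the polishing step.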

Citation

Hello, is there a specific citation format available for Dragonflye?

Filtering reads quality

Hi

Dragonflye uses nanoq to filter reads by length. nanoq can also filter reads by quality; could you add that feature to dragonflye?

Thanks
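
Independently of dragonflye, reads can be pre-filtered with nanoq before assembly. This is a minimal sketch that uses only the nanoq flags visible in dragonflye's own log output elsewhere on this page (--min-len, --min-qual, --input, with filtered reads written to stdout); the function names and thresholds are illustrative assumptions.

```python
import subprocess


def nanoq_filter_command(reads: str, min_len: int = 1000, min_qual: float = 0) -> list[str]:
    """Build a nanoq call that filters by read length and mean read quality."""
    return [
        "nanoq",
        "--min-len", str(min_len),
        "--min-qual", str(min_qual),
        "--input", reads,
    ]


def filter_reads(reads: str, filtered: str, min_len: int = 1000, min_qual: float = 0) -> None:
    """Run nanoq and write the filtered FASTQ (uncompressed) to `filtered`."""
    with open(filtered, "wb") as out:
        subprocess.run(
            nanoq_filter_command(reads, min_len, min_qual),
            stdout=out, check=True,
        )
```

The filtered file can then be passed to dragonflye via --reads.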

Niche QoL inquiry: polypolish acceptance of nonstandard file affixes?

Hi Robert,

Is there any way for dragonflye to accept nonstandard file inputs for polypolish?

e.g. get some version of this (fq.gz for R1/R2) working:

dragonflye\
 --cpus 12\
 --ram 12\
 --reads $RUN/gpy646sup/${SAMPLE}_merged_barcode*.fastq.gz\
 --R1 $RUN/d40_JING_out/output/${SAMPLE}*/${SAMPLE}*_val_1.fq.gz\
 --R2 $RUN/d40_JING_out/output/${SAMPLE}*/${SAMPLE}*_val_2.fq.gz\
 --depth 0\
 --nanohq\
 --medaka 2\
 --model r1041_e82_400bps_sup_g615\
 --polypolish 1\
 --outdir $OUTDIR/${SAMPLE}_dflye_m180p_out\
 --force  || echo "dflye error in i=$i"

instead of this (standard fastq.gz):

dragonflye\
 --cpus 12\
 --ram 12\
 --reads $RUN/gpy646sup/${SAMPLE}_merged_barcode*.fastq.gz\
 --R1 $RUN/d40_JING_out/output/${SAMPLE}*/${SAMPLE}*_val_1.fastq.gz\
 --R2 $RUN/d40_JING_out/output/${SAMPLE}*/${SAMPLE}*_val_2.fastq.gz\
 --depth 0\
 --nanohq\
 --medaka 2\
 --model r1041_e82_400bps_sup_g615\
 --polypolish 1\
 --outdir $OUTDIR/${SAMPLE}_dflye_m180p_out\
 --force  || echo "dflye error in i=$i"

I copy the .fastq.gz as .fq.gz and use the version immediately above for now, but I imagine there must be some less-bad way to just use the erstwhile-usable poorly-named files from another pipeline. (I still haven't been able to get bactopia-dev to spin up singularity containers with our SLURM nodes.)

Continued thanks for your amazing work either way!
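
As a lighter-weight workaround than copying, symbolic links with the expected suffix can point at the original files. The sketch below assumes the inputs end in .fq.gz and that the tool wants .fastq.gz; the helper name is hypothetical.

```python
from pathlib import Path


def link_with_standard_suffix(src: Path, link_dir: Path) -> Path:
    """Create link_dir/<name>.fastq.gz as a symlink to a file named <name>.fq.gz."""
    if not src.name.endswith(".fq.gz"):
        raise ValueError(f"expected a .fq.gz file, got {src.name}")
    link_dir.mkdir(parents=True, exist_ok=True)
    link = link_dir / (src.name[: -len(".fq.gz")] + ".fastq.gz")
    # Replace a stale link from an earlier run, if any.
    if link.is_symlink() or link.exists():
        link.unlink()
    link.symlink_to(src.resolve())
    return link
```

The returned paths can be given to --R1 and --R2 without duplicating the read data on disk.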

Execution error with Rasusa v1.0.0 in conda environment

Hi,

Thanks for developing Dragonflye—it's been highly useful! I've encountered an issue with Rasusa during installation on a new Ubuntu 22.04 machine using Conda. Dragonflye v1.2.0 seems to pull Rasusa v1.0.0 instead of v0.8.0, leading to a syntax error in the rasusa command.

[...]
[dragonflye] Using rasusa - /opt/conda/envs/dragonflye_1.2.0/bin/rasusa | rasusa 1.0.0
[...]
[dragonflye] Running: rasusa -i READS\.filt\.fq\.gz -c 100 -g 50000 -s 42  2>&1 1> READS.sub.fq | sed 's/^/[rasusa] /' | tee -a dragonflye.log
[rasusa] error: unexpected argument '-i' found
[...]

Correct Rasusa v1.0.0 syntax:

rasusa reads -c 100 -g 50000 -s 42 READS\.filt\.fq\.gz 2>&1 1> READS.sub.fq | sed 's/^/[rasusa] /' | tee -a dragonflye.log

Cheers,
Nouri
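
A wrapper could pick the call syntax from the detected rasusa version. The following is a sketch only: it assumes the change happened at 1.0.0, as the two command forms quoted above suggest, and the function names are hypothetical.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '1.0.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def rasusa_command(version: str, reads: str, coverage: int,
                   genome_size: int, seed: int) -> list[str]:
    """Build the rasusa argument list for the installed version."""
    common = ["-c", str(coverage), "-g", str(genome_size), "-s", str(seed)]
    if parse_version(version) >= (1, 0, 0):
        # Newer syntax: `reads` subcommand, input file as a positional argument.
        return ["rasusa", "reads", *common, reads]
    # Older syntax: input file passed with -i.
    return ["rasusa", "-i", reads, *common]
```

Passing the arguments as a list also avoids the need to backslash-escape the dots in the file name.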

Failed to run medaka consensus. - ModelStoreTF exception <class 'NotImplementedError'>

Hello @rpetit3 ,

Thank you for developing dragonflye.

We are trying to use dragonflye to perform assembly on the E. faecium isolates with the following command:
dragonflye --reads 02_fastq/220818_VRE1.fastq.gz --gsize 2.8M --outdir 04_dragonflye/220818_VRE1 --cpus 20 --nanohq --model r941_min_sup_g507

However, we encountered an error in the polishing step when using medaka. Please find the error logs below:

[dragonflye] Hello gilmansiu3
[dragonflye] You ran: /home/gilmansiu3/miniconda3/envs/dragonflye/bin/dragonflye --reads 02_fastq/220818_VRE1.fastq.gz --gsize 2.8M --outdir 04_dragonflye/220818_VRE1 --cpus 20 --nanohq --model r941_min_sup_g507
[dragonflye] This is dragonflye 1.0.13
[dragonflye] Written by Robert A Petit III
[dragonflye] Homepage is https://github.com/rpetit3/dragonflye
[dragonflye] Operating system is linux
[dragonflye] Perl version is v5.32.1
[dragonflye] Machine has 20 CPU cores and 125.72 GB RAM
[dragonflye] Verifying input model (--model): r941_min_sup_g507
[dragonflye] Model r941_min_sup_g507 verified!
[dragonflye] Valid model provided, but number of Medaka rounds (--medaka) not given, assuming 1 round
[dragonflye] Using any2fasta - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/any2fasta | any2fasta 0.4.2
[dragonflye] Using assembly-scan - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/assembly-scan | assembly-scan 0.4.1
[dragonflye] Using bwa - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/bwa | Version: 0.7.17-r1188
[dragonflye] Using fastp - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/fastp | fastp 0.23.2
[dragonflye] Using flye - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/flye | 2.9-b1768
[dragonflye] Using kmc - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/kmc | K-Mer Counter (KMC) ver. 3.2.1 (2022-01-04)
[dragonflye] Using medaka - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/medaka | medaka 1.6.1
[dragonflye] Using miniasm - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/miniasm | 0.3-r179
[dragonflye] Using minimap2 - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/minimap2 | 2.24-r1122
[dragonflye] Using nanoq - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/nanoq | nanoq 0.9.0
[dragonflye] Using pigz - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/pigz | pigz 2.6
[dragonflye] Using pilon - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/pilon | Pilon version 1.24 Thu Jan 28 13:00:45 2021 -0500
[dragonflye] Using polypolish - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/polypolish | Polypolish v0.5.0
[dragonflye] Using porechop - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/porechop | 0.2.4
[dragonflye] Using racon - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/racon | 1.5.0
[dragonflye] Using rasusa - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/rasusa | rasusa 0.7.0
[dragonflye] Using raven - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/raven | 1.8.1
[dragonflye] Using samclip - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/samclip | samclip 0.4.0
[dragonflye] Using samtools - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/samtools | Version: 1.15.1 (using htslib 1.15.1)
[dragonflye] Using seqtk - /home/gilmansiu3/miniconda3/envs/dragonflye/bin/seqtk | Version: 1.3-r106
[dragonflye] Using tempdir: /tmp/tXf0FpvLDL
[dragonflye] Changing into folder: /mnt/data/Species-specific/CAUR/04_dragonflye/220818_VRE1
[dragonflye] Collecting raw read statistics with 'seqtk'
[dragonflye] Running: seqtk fqchk -q3 /mnt/data/Species-specific/CAUR/02_fastq/220818_VRE1.fastq.gz 2>&1 1>/tmp/deywkg47eV | sed 's/^/[seqtk] /' | tee -a dragonflye.log
[dragonflye] Read stats: avg_len = 4267
[dragonflye] Read stats: max_len = 45857
[dragonflye] Read stats: min_len = 1000
[dragonflye] Read stats: total_bp = 432559422
[dragonflye] Using genome size 2800000 bp
[dragonflye] Estimated sequencing depth: 154x
[dragonflye] Filter reads based on length and/or quality
[dragonflye] Running: nanoq --min-len 1000 --input /mnt/data/Species-specific/CAUR/02_fastq/220818_VRE1.fastq.gz --min-qual 0 2>&1 1> READS.filt.fq | sed 's/^/[nanoq] /' | tee -a dragonflye.log
[dragonflye] Running: pigz -f -p 20 --fast READS.filt.fq 2>&1 | sed 's/^/[pigz] /' | tee -a dragonflye.log
[dragonflye] No read depth reduction requested or necessary.
[dragonflye] No read adapter trimming requested.
[dragonflye] Running: ln -sf READS.filt.fq.gz READS.fq.gz 2>&1 | sed 's/^/[ln] /' | tee -a dragonflye.log
[dragonflye] Collecting qc'd read statistics with 'seqtk'
[dragonflye] Running: seqtk fqchk -q3 READS.fq.gz 2>&1 1>/tmp/l7Duy6iujE | sed 's/^/[seqtk] /' | tee -a dragonflye.log
[dragonflye] Final Read stats: min_len = 1000
[dragonflye] Final Read stats: max_len = 45857
[dragonflye] Final Read stats: avg_len = 4267
[dragonflye] Final Read stats: total_bp = 432559422
[dragonflye] Average read length looks like 4267 bp
[dragonflye] Assembling reads with 'flye'
[dragonflye] Running: flye --nano-hq READS.fq.gz -g 2800000 -i 0 --threads 20 -o flye 2>&1 | sed 's/^/[flye] /' | tee -a dragonflye.log
[flye] [2022-11-25 15:10:17] INFO: Starting Flye 2.9-b1768
[flye] [2022-11-25 15:10:17] INFO: >>>STAGE: configure
[flye] [2022-11-25 15:10:17] INFO: Configuring run
[flye] [2022-11-25 15:10:22] INFO: Total read length: 432559422
[flye] [2022-11-25 15:10:22] INFO: Input genome size: 2800000
[flye] [2022-11-25 15:10:22] INFO: Estimated coverage: 154
[flye] [2022-11-25 15:10:22] INFO: Reads N50/N90: 5835 / 2000
[flye] [2022-11-25 15:10:22] INFO: Minimum overlap set to 2000
[flye] [2022-11-25 15:10:22] INFO: >>>STAGE: assembly
[flye] [2022-11-25 15:10:22] INFO: Assembling disjointigs
[flye] [2022-11-25 15:10:22] INFO: Reading sequences
[flye] [2022-11-25 15:10:27] INFO: Building minimizer index
[flye] [2022-11-25 15:10:27] INFO: Pre-calculating index storage
[flye] 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[flye] [2022-11-25 15:10:30] INFO: Filling index
[flye] 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[flye] [2022-11-25 15:10:39] INFO: Extending reads
[flye] [2022-11-25 15:11:21] INFO: Overlap-based coverage: 111
[flye] [2022-11-25 15:11:21] INFO: Median overlap divergence: 0.0539394
[flye] 0% 10% 90% 100%
[flye] [2022-11-25 15:12:13] INFO: Assembled 6 disjointigs
[flye] [2022-11-25 15:12:13] INFO: Generating sequence
[flye] 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[flye] [2022-11-25 15:12:14] INFO: Filtering contained disjointigs
[flye] 0% 10% 30% 50% 60% 80% 100%
[flye] [2022-11-25 15:12:14] INFO: Contained seqs: 0
[flye] [2022-11-25 15:12:14] INFO: >>>STAGE: consensus
[flye] [2022-11-25 15:12:14] INFO: Running Minimap2
[flye] [2022-11-25 15:12:43] INFO: Computing consensus
[flye] [2022-11-25 15:14:22] INFO: Alignment error rate: 0.067847
[flye] [2022-11-25 15:14:22] INFO: >>>STAGE: repeat
[flye] [2022-11-25 15:14:22] INFO: Building and resolving repeat graph
[flye] [2022-11-25 15:14:22] INFO: Parsing disjointigs
[flye] [2022-11-25 15:14:22] INFO: Building repeat graph
[flye] 0% 10% 30% 50% 60% 80% 100%
[flye] [2022-11-25 15:14:23] INFO: Median overlap divergence: 0.00335946
[flye] [2022-11-25 15:14:23] INFO: Parsing reads
[flye] [2022-11-25 15:14:27] INFO: Aligning reads to the graph
[flye] 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[flye] [2022-11-25 15:14:36] INFO: Aligned read sequence: 383411070 / 389252928 (0.984992)
[flye] [2022-11-25 15:14:36] INFO: Median overlap divergence: 0.0258973
[flye] [2022-11-25 15:14:36] INFO: Mean edge coverage: 122
[flye] [2022-11-25 15:14:36] INFO: Simplifying the graph
[flye] [2022-11-25 15:14:36] INFO: >>>STAGE: contigger
[flye] [2022-11-25 15:14:36] INFO: Generating contigs
[flye] [2022-11-25 15:14:36] INFO: Reading sequences
[flye] [2022-11-25 15:14:41] INFO: Generated 7 contigs
[flye] [2022-11-25 15:14:41] INFO: Added 0 scaffold connections
[flye] [2022-11-25 15:14:41] INFO: >>>STAGE: finalize
[flye] [2022-11-25 15:14:41] INFO: Assembly statistics:
[flye]
[flye] Total length: 3161882
[flye] Fragments: 7
[flye] Fragments N50: 2791738
[flye] Largest frg: 2791738
[flye] Scaffolds: 0
[flye] Mean coverage: 121
[flye]
[flye] [2022-11-25 15:14:41] INFO: Final assembly: /mnt/data/Species-specific/CAUR/04_dragonflye/220818_VRE1/flye/assembly.fasta
[dragonflye] Polishing with Racon (1 rounds)
[dragonflye] Running: minimap2 -t 19 -x map-ont flye.fasta READS.fq.gz 2>&1 1> flye/polish/racon/1/aligments.paf | sed 's/^/[polishing - racon (1 of 1)] /' | tee -a dragonflye.log
[polishing - racon (1 of 1)] [M::mm_idx_gen::0.055*1.01] collected minimizers
[polishing - racon (1 of 1)] [M::mm_idx_gen::0.061*2.58] sorted minimizers
[polishing - racon (1 of 1)] [M::main::0.061*2.58] loaded/built the index for 7 target sequence(s)
[polishing - racon (1 of 1)] [M::mm_mapopt_update::0.067*2.45] mid_occ = 26
[polishing - racon (1 of 1)] [M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 7
[polishing - racon (1 of 1)] [M::mm_idx_stat::0.071*2.36] distinct minimizers: 534042 (95.78% are singletons); average occurrences: 1.109; average spacing: 5.339; total length: 3161882
[polishing - racon (1 of 1)] [M::worker_pipeline::11.029*11.97] mapped 101377 sequences
[polishing - racon (1 of 1)] [M::main] Version: 2.24-r1122
[polishing - racon (1 of 1)] [M::main] CMD: minimap2 -t 19 -x map-ont flye.fasta READS.fq.gz
[polishing - racon (1 of 1)] [M::main] Real time: 11.035 sec; CPU: 132.074 sec; Peak RSS: 0.491 GB
[dragonflye] Running: racon -t 20 READS.fq.gz flye/polish/racon/1/aligments.paf flye.fasta 2>&1 1> flye/polish/racon/1/consensus.fasta | sed 's/^/[polishing - racon (1 of 1)] /' | tee -a dragonflye.log
[polishing - racon (1 of 1)] [racon::Polisher::initialize] loaded target sequences 0.012280 s
[polishing - racon (1 of 1)] [racon::Polisher::initialize] loaded sequences 4.787064 s
[polishing - racon (1 of 1)] [racon::Polisher::initialize] loaded overlaps 0.085765 s
[racon::Polisher::initialize] aligning overlaps [====================] 8.314751 s ] 0.522070 s
[polishing - racon (1 of 1)] [racon::Polisher::initialize] transformed data into windows 0.466438 s
[racon::Polisher::polish] generating consensus [====================] 44.492387 s ] 3.252132 s
[polishing - racon (1 of 1)] [racon::Polisher::] total = 58.222555 s
[dragonflye] Polishing with Medaka (1 rounds)
[dragonflye] Running: medaka_consensus -i READS.fq.gz -d flye/polish/racon/1/consensus.fasta -o flye/polish/medaka/1 -m r941_min_sup_g507 -t 20 2>&1 | sed 's/^/[polishing - medaka (1 of 1)] /' | tee -a dragonflye.log
[polishing - medaka (1 of 1)] Checking program versions
[polishing - medaka (1 of 1)] This is medaka 1.6.1
[polishing - medaka (1 of 1)] Program Version Required Pass
[polishing - medaka (1 of 1)] bcftools 1.15.1 1.11 True
[polishing - medaka (1 of 1)] bgzip 1.15.1 1.11 True
[polishing - medaka (1 of 1)] minimap2 2.24 2.11 True
[polishing - medaka (1 of 1)] samtools 1.15.1 1.11 True
[polishing - medaka (1 of 1)] tabix 1.15.1 1.11 True
[polishing - medaka (1 of 1)] Aligning basecalls to draft
[polishing - medaka (1 of 1)] Creating fai index file /mnt/data/Species-specific/CAUR/04_dragonflye/220818_VRE1/flye/polish/racon/1/consensus.fasta.fai
[polishing - medaka (1 of 1)] Creating mmi index file /mnt/data/Species-specific/CAUR/04_dragonflye/220818_VRE1/flye/polish/racon/1/consensus.fasta.map-ont.mmi
[polishing - medaka (1 of 1)] [M::mm_idx_gen::0.103*1.02] collected minimizers
[polishing - medaka (1 of 1)] [M::mm_idx_gen::0.113*1.19] sorted minimizers
[polishing - medaka (1 of 1)] [M::main::0.129*1.17] loaded/built the index for 7 target sequence(s)
[polishing - medaka (1 of 1)] [M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 7
[polishing - medaka (1 of 1)] [M::mm_idx_stat::0.133*1.16] distinct minimizers: 529319 (95.23% are singletons); average occurrences: 1.119; average spacing: 5.339; total length: 3163172
[polishing - medaka (1 of 1)] [M::main] Version: 2.24-r1122
[polishing - medaka (1 of 1)] [M::main] CMD: minimap2 -I 16G -x map-ont -d /mnt/data/Species-specific/CAUR/04_dragonflye/220818_VRE1/flye/polish/racon/1/consensus.fasta.map-ont.mmi /mnt/data/Species-specific/CAUR/04_dragonflye/220818_VRE1/flye/polish/racon/1/consensus.fasta
[polishing - medaka (1 of 1)] [M::main] Real time: 0.135 sec; CPU: 0.157 sec; Peak RSS: 0.033 GB
[polishing - medaka (1 of 1)] [M::main::0.019*1.03] loaded/built the index for 7 target sequence(s)
[polishing - medaka (1 of 1)] [M::mm_mapopt_update::0.024*1.02] mid_occ = 27
[polishing - medaka (1 of 1)] [M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 7
[polishing - medaka (1 of 1)] [M::mm_idx_stat::0.028*1.02] distinct minimizers: 529319 (95.23% are singletons); average occurrences: 1.119; average spacing: 5.339; total length: 3163172
[polishing - medaka (1 of 1)] [M::worker_pipeline::21.050*13.97] mapped 101377 sequences
[polishing - medaka (1 of 1)] [M::main] Version: 2.24-r1122
[polishing - medaka (1 of 1)] [M::main] CMD: minimap2 -x map-ont --secondary=no -L --MD -A 2 -B 4 -O 4,24 -E 2,1 -t 20 -a /mnt/data/Species-specific/CAUR/04_dragonflye/220818_VRE1/flye/polish/racon/1/consensus.fasta.map-ont.mmi /mnt/data/Species-specific/CAUR/04_dragonflye/220818_VRE1/READS.filt.fq.gz
[polishing - medaka (1 of 1)] [M::main] Real time: 21.053 sec; CPU: 294.060 sec; Peak RSS: 1.828 GB
[polishing - medaka (1 of 1)] [bam_sort_core] merging from 0 files and 20 in-memory blocks...
[polishing - medaka (1 of 1)] Running medaka consensus
[polishing - medaka (1 of 1)] [15:16:26 - Predict] Reducing threads to 2, anymore is a waste.
[polishing - medaka (1 of 1)] [15:16:27 - Predict] Setting tensorflow inter/intra-op threads to 2/1.
[polishing - medaka (1 of 1)] [15:16:27 - Predict] Processing region(s): contig_1:0-235990 contig_2:0-2793735 contig_3:0-45936 contig_4:0-8983 contig_5:0-32435 contig_6:0-34261 contig_7:0-11832
[polishing - medaka (1 of 1)] [15:16:27 - Predict] Using model: /home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/medaka/data/r941_min_sup_g507_model.tar.gz.
[polishing - medaka (1 of 1)] [15:16:27 - Predict] Found a GPU.
[polishing - medaka (1 of 1)] [15:16:27 - Predict] If cuDNN errors are observed, try setting the environment variable TF_FORCE_GPU_ALLOW_GROWTH=true. To explicitely disable use of cuDNN use the commandline option `--disable_cudnn. If OOM (out of memory) errors are found please reduce batch size.
[polishing - medaka (1 of 1)] [15:16:27 - Predict] Processing 9 long region(s) with batching.
[polishing - medaka (1 of 1)] [15:16:27 - ModelLoad] GPU available: building model with cudnn optimization
[polishing - medaka (1 of 1)] [15:16:27 - MdlStrTF] ModelStoreTF exception <class 'NotImplementedError'>
[polishing - medaka (1 of 1)] Traceback (most recent call last):
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/bin/medaka", line 11, in <module>
[polishing - medaka (1 of 1)] sys.exit(main())
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/medaka/medaka.py", line 720, in main
[polishing - medaka (1 of 1)] args.func(args)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/medaka/prediction.py", line 160, in predict
[polishing - medaka (1 of 1)] model = model_store.load_model(time_steps=args.chunk_len)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/medaka/datastore.py", line 159, in load_model
[polishing - medaka (1 of 1)] self.model = model_partial_function(time_steps=time_steps)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/medaka/models.py", line 147, in build_model
[polishing - medaka (1 of 1)] model.add(Bidirectional(gru, input_shape=input_shape))
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 456, in _method_wrapper
[polishing - medaka (1 of 1)] result = method(self, *args, **kwargs)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/keras/engine/sequential.py", line 198, in add
[polishing - medaka (1 of 1)] layer(x)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/keras/layers/wrappers.py", line 531, in call
[polishing - medaka (1 of 1)] return super(Bidirectional, self).call(inputs, **kwargs)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 922, in call
[polishing - medaka (1 of 1)] outputs = call_fn(cast_inputs, *args, **kwargs)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/keras/layers/wrappers.py", line 644, in call
[polishing - medaka (1 of 1)] y = self.forward_layer(forward_inputs,
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/keras/layers/recurrent.py", line 654, in call
[polishing - medaka (1 of 1)] return super(RNN, self).call(inputs, **kwargs)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 922, in call
[polishing - medaka (1 of 1)] outputs = call_fn(cast_inputs, *args, **kwargs)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/keras/layers/recurrent_v2.py", line 408, in call
[polishing - medaka (1 of 1)] inputs, initial_state, _ = self._process_inputs(inputs, initial_state, None)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/keras/layers/recurrent.py", line 848, in _process_inputs
[polishing - medaka (1 of 1)] initial_state = self.get_initial_state(inputs)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/keras/layers/recurrent.py", line 636, in get_initial_state
[polishing - medaka (1 of 1)] init_state = get_initial_state_fn(
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/keras/layers/recurrent.py", line 1910, in get_initial_state
[polishing - medaka (1 of 1)] return _generate_zero_filled_state_for_cell(self, inputs, batch_size, dtype)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/keras/layers/recurrent.py", line 2926, in _generate_zero_filled_state_for_cell
[polishing - medaka (1 of 1)] return _generate_zero_filled_state(batch_size, cell.state_size, dtype)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/keras/layers/recurrent.py", line 2944, in _generate_zero_filled_state
[polishing - medaka (1 of 1)] return create_zeros(state_size)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/keras/layers/recurrent.py", line 2939, in create_zeros
[polishing - medaka (1 of 1)] return array_ops.zeros(init_state_size, dtype=dtype)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py", line 2677, in wrapped
[polishing - medaka (1 of 1)] tensor = fun(*args, **kwargs)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py", line 2721, in zeros
[polishing - medaka (1 of 1)] output = _constant_if_small(zero, shape, dtype, name)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py", line 2662, in _constant_if_small
[polishing - medaka (1 of 1)] if np.prod(shape) < 1000:
[polishing - medaka (1 of 1)] File "<array_function internals>", line 180, in prod
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 3045, in prod
[polishing - medaka (1 of 1)] return _wrapreduction(a, np.multiply, 'prod', axis, dtype, out,
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
[polishing - medaka (1 of 1)] return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
[polishing - medaka (1 of 1)] File "/home/gilmansiu3/miniconda3/envs/dragonflye/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 748, in array
[polishing - medaka (1 of 1)] raise NotImplementedError("Cannot convert a symbolic Tensor ({}) to a numpy"
[polishing - medaka (1 of 1)] NotImplementedError: Cannot convert a symbolic Tensor (bidirectional/forward_gru1/strided_slice:0) to a numpy array.
[polishing - medaka (1 of 1)] Failed to run medaka consensus.
[dragonflye] Error running command: medaka_consensus -i READS.fq.gz -d flye/polish/racon/1/consensus.fasta -o flye/polish/medaka/1 -m r941_min_sup_g507 -t 20 2>&1 | sed 's/^/[polishing - medaka (1 of 1)] /' | tee -a dragonflye.log

Best regards,
Eddie
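
In the traceback above, the numpy frames come from /home/gilmansiu3/.local/lib/python3.8/site-packages, while medaka and tensorflow are loaded from the conda environment, so a user-level numpy may be shadowing the one the environment was solved with. That is a guess from the paths, not a confirmed diagnosis. A small check, with hypothetical helper names, for whether a regular module resolves outside the active environment:

```python
import importlib.util
import os
import sys
from typing import Optional


def is_inside_prefix(module_file: str, prefix: str) -> bool:
    """True if module_file lives somewhere under the directory `prefix`."""
    module_file = os.path.abspath(module_file)
    prefix = os.path.abspath(prefix)
    return os.path.commonpath([module_file, prefix]) == prefix


def resolves_outside_env(module_name: str) -> Optional[bool]:
    """True if the module would be imported from outside sys.prefix.

    Returns None when the module cannot be found or has no file origin.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return None
    return not is_inside_prefix(spec.origin, sys.prefix)
```

Run inside the dragonflye environment, resolves_outside_env("numpy") returning True would support this suspicion. Setting the environment variable PYTHONNOUSERSITE=1 makes Python ignore the user site-packages directory.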

Missed plasmid in reoriented.fa

Hello,

Thank you for updating dragonflye! I tried the latest version and found a difference in the number of plasmids between "contigs.fa" and "reoriented.fa". Maybe I misunderstood the output description, because I thought that all plasmids, both reoriented and not, would be in the "reoriented" file.
Also, two small plasmids (ColRNA and Col440I) were missing from all versions of the assemblies, but I think that is a "bug" in the Flye assembly. The Unicycler long-read-only assembly had these plasmids.
Update:
I found the plasmid in the "reoriented" file. It was missed because after reorientation its coverage became 62%, whereas it was 100% in the non-reoriented file.

Best regards,
Valery

Memory calculation

I'm wondering whether there is a way to estimate the memory requirements up front so that a run doesn't end in an error. Is that even possible? I'm thinking of pilon, but possibly another step needs more memory. If memory is insufficient, I'd suggest dragonflye produce an error up front.
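
Estimating what each step will need is hard to do in general, but a fail-fast check that the requested memory does not exceed what the machine has is straightforward. A minimal sketch, assuming a Linux-like system where os.sysconf exposes page counts; the function names are hypothetical and no per-tool memory estimate is attempted.

```python
import os


def total_ram_gb() -> float:
    """Physical memory in GB (relies on os.sysconf; Linux and similar systems)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3


def check_ram_request(requested_gb: float, available_gb: float) -> None:
    """Exit before any work starts if more RAM is requested than exists."""
    if requested_gb > available_gb:
        raise SystemExit(
            f"Requested {requested_gb:.1f} GB RAM but only "
            f"{available_gb:.1f} GB is available; aborting before assembly."
        )
```

A call such as check_ram_request(6, total_ram_gb()) at startup would catch an impossible request early.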

Feature request: replicon rotation

Would you be open to adding a replicon rotation feature similar to what unicycler does? There's an existing issue on the flye repo that states that it doesn't directly support rotation, and suggests using circlator for that purpose.

The unicycler paper describes its approach:

A circular sequence can be shifted to any starting position without changing the biological information. Unicycler therefore uses TBLASTN to search for dnaA or repA alleles in each completed replicon[20]. If one is found, the sequence is rotated and/or flipped so that it begins with that gene encoded on the forward strand. This provides consistently oriented assemblies and reduces the risk that a gene will be split across the start and end of the sequence.

...or do you see that as out-of-scope for dragonflye?

Thanks for making such a useful tool. It really simplifies the process of creating high-quality hybrid assemblies.
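
The rotation step described in the quoted passage can be sketched as follows. This covers only the rotate/flip operation once a start gene has been located; the search itself (TBLASTN in Unicycler's case) is out of scope here, and the function names and 0-based half-open coordinates are assumptions made for illustration.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string; other characters (e.g. N) are kept as-is."""
    return seq.translate(_COMPLEMENT)[::-1]


def rotate_to_gene(seq: str, gene_start: int, gene_end: int, forward: bool) -> str:
    """Rotate a circular sequence so the gene starts at position 0, forward strand.

    gene_start and gene_end are 0-based, half-open coordinates on `seq`.
    """
    if forward:
        return seq[gene_start:] + seq[:gene_start]
    # Gene is on the reverse strand: flip the replicon, then rotate to the
    # position where the gene's first base ends up after the flip.
    flipped = reverse_complement(seq)
    new_start = len(seq) - gene_end
    return flipped[new_start:] + flipped[:new_start]
```

Because the sequence is circular, the rotated output has the same length and base content as the input; only the start position and, if needed, the strand change.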

Polypolish error

Hi!

I'm trying to run dragonflye (v1.1.2) with default parameters using --reads, --R1 and --R2 inputs. The issue arises after assembling with flye, when dragonflye executes polypolish (v0.6.0):
Running: polypolish flye/polish/racon/1/consensus.fasta flye/polish/short_reads/polypolish/1/polypolish_R1-1.sam flye/polish/short_reads/polypolish/1/polypolish-R2-1.sam > flye/polish/short_reads/polypolish/1/polypolish-1.fasta | sed 's/^/[short read polishing - polypolish (1 of 1)] /' | tee -a dragonflye.log

Error running command: polypolish flye/polish/racon/1/consensus.fasta flye/polish/short_reads/polypolish/1/polypolish_R1-1.sam flye/polish/short_reads/polypolish/1/polypolish-R2-1.sam > flye/polish/short_reads/polypolish/1/polypolish-1.fasta | sed 's/^/[short read polishing - polypolish (1 of 1)] /' | tee -a dragonflye.log

error: unrecognized subcommand 'flye/polish/racon/1/consensus.fasta'

I believe the issue is that the polish command is missing when calling polypolish.

Thanks for your hard work!

Andrea

mamba installation problem

Hi,

Thank you for the wonderful assembly pipeline!

I have successfully installed dragonflye via mamba, but unfortunately it installed v1.0.7.

So I decided to force mamba to install the newest version:

mamba create -n dragonflye -c conda-forge -c bioconda dragonflye=1.0.13

    mamba (0.22.1) supported by @QuantStack

    GitHub:  https://github.com/mamba-org/mamba
    Twitter: https://twitter.com/QuantStack

WARNING: A conda environment already exists at '/home/jang/anaconda3/envs/mamba/envs/dragonflye'
Remove existing environment (y/[n])? y

Looking for: ['dragonflye=1.0.13']

conda-forge/linux-64 Using cache
conda-forge/noarch Using cache
bioconda/linux-64 Using cache
bioconda/noarch Using cache
r/linux-64 Using cache
r/noarch Using cache
pkgs/main/noarch No change
pkgs/r/noarch No change
pkgs/r/linux-64 No change
pkgs/main/linux-64 No change
cruizperez/linux-64 No change
cruizperez/noarch No change
Encountered problems while solving:

  • nothing provides cudatoolkit 8.0.* needed by tensorflow-gpu-base-1.4.1-py27h01caf0a_0

Any hints?

I would like to run dragonflye with medaka on GPU or CPU and then polish the assembly with polypolish.

Can you also add a --prefix option to dragonflye to set a custom file name for the final assembly?

Bests,
Jan

Typo

[dragonflye] Dragonfly larva eat just about anything: tadpoles, mosquitoes, fish, other insect larvae and even each other!
Either 'larvae eat' or 'larva eats'.
Everything should be just perfect :)
Thanks for this nice piece of software!

Problem with conda installing

Hello, Robert,

Thank you for a great tool!
I had dragonflye 1.0.10 and decided to create a new env with the new version. I now have the same problem I had with bactopia ( bactopia/bactopia#334 ):

conda install -c bioconda dragonflye
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: \ 

and the process ends.
Any ideas how I could fix it?

Many thanks,
Valery

trimming and medaka

Hi @rpetit3,

Firstly, thank you for this useful tool, great work and great name:)

Secondly, would it be possible (if the tool does not already take care of it without my realizing) to add an adapter trimming (and in some cases demultiplexing) step, like the --trim option in shovill? We use porechop (https://github.com/rrwick/Porechop), but any other option would be good too.

Thirdly, is it possible to use medaka in gpu mode?

thanks again!
Yair

Processing multiple samples

Hi Robert,
Can dragonflye process multiple samples? How do I indicate that in the output directory ? Is there a flag for sample?
Thanks,
TJ
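Whether dragonflye has a built-in multi-sample mode is not stated in this thread. One common workaround is to run it once per sample, giving each run its own output directory. The sketch below only builds the command lines. It uses the `--reads`, `--prefix` and `--outdir` flags that appear in other commands on this page. The helper name and the sample layout are hypothetical.

```python
import os


def dragonflye_commands(samples, out_root):
    """Build one dragonflye command per sample.

    samples maps a sample name to its long-read FASTQ path; each sample
    gets its own output directory under out_root.
    """
    commands = []
    for name, reads in sorted(samples.items()):
        commands.append([
            "dragonflye",
            "--reads", reads,
            "--prefix", name,
            "--outdir", os.path.join(out_root, name),
        ])
    return commands
```

Each command list can then be passed to `subprocess.run(cmd, check=True)` in a loop, or handed to a workflow manager.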

medaka fails to open model file for r1041_e82_400bps_sup_g615

Hi Robert,

I have been testing your beautiful new version using the biocontainers docker image for v1.1.1.

Unfortunately I ran into an issue with medaka again, actually the same one that I was experiencing with my custom docker image (as mentioned in issue #19).

My run with model r1041_e82_400bps_sup_v4.2.0 went fine and completed successfully.

Another run with model r1041_e82_400bps_sup_g615 failed; see the excerpt of the dragonflye log below (full log attached):

[dragonflye] Polishing with Medaka (1 rounds)
[dragonflye] Running: medaka_consensus -i READS.fq.gz -d flye/polish/racon/1/consensus.fasta -o flye/polish/medaka/1 -m r1041_e82_400bps_sup_g615 -t 4  2>&1 | sed 's/^/[polishing - medaka (1 of 1)] /' | tee -a dragonflye.log
[polishing - medaka (1 of 1)] Checking program versions
[polishing - medaka (1 of 1)] This is medaka 1.8.0
[polishing - medaka (1 of 1)] Program    Version    Required   Pass
[polishing - medaka (1 of 1)] bcftools   1.17       1.11       True
[polishing - medaka (1 of 1)] bgzip      1.17       1.11       True
[polishing - medaka (1 of 1)] minimap2   2.26       2.11       True
[polishing - medaka (1 of 1)] samtools   1.17       1.11       True
[polishing - medaka (1 of 1)] tabix      1.17       1.11       True
[polishing - medaka (1 of 1)] Traceback (most recent call last):
[polishing - medaka (1 of 1)]   File "/usr/local/bin/medaka", line 11, in <module>
[polishing - medaka (1 of 1)]     sys.exit(main())
[polishing - medaka (1 of 1)]   File "/usr/local/lib/python3.10/site-packages/medaka/medaka.py", line 724, in main
[polishing - medaka (1 of 1)]     args.func(args)
[polishing - medaka (1 of 1)]   File "/usr/local/lib/python3.10/site-packages/medaka/medaka.py", line 267, in is_rle_model
[polishing - medaka (1 of 1)]     print(is_rle_encoder(args.model))
[polishing - medaka (1 of 1)]   File "/usr/local/lib/python3.10/site-packages/medaka/medaka.py", line 274, in is_rle_encoder
[polishing - medaka (1 of 1)]     encoder = modelstore.get_meta('feature_encoder')
[polishing - medaka (1 of 1)]   File "/usr/local/lib/python3.10/site-packages/medaka/datastore.py", line 193, in get_meta
[polishing - medaka (1 of 1)]     self.unpack()
[polishing - medaka (1 of 1)]   File "/usr/local/lib/python3.10/site-packages/medaka/datastore.py", line 118, in unpack
[polishing - medaka (1 of 1)]     with tarfile.open(self.filepath) as tar:
[polishing - medaka (1 of 1)]   File "/usr/local/lib/python3.10/tarfile.py", line 1639, in open
[polishing - medaka (1 of 1)]     raise ReadError(f"file could not be opened successfully:\n{error_msgs_summary}")
[polishing - medaka (1 of 1)] tarfile.ReadError: file could not be opened successfully:
[polishing - medaka (1 of 1)] - method gz: ReadError('empty file')
[polishing - medaka (1 of 1)] - method bz2: ReadError('not a bzip2 file')
[polishing - medaka (1 of 1)] - method xz: ReadError('not an lzma file')
[polishing - medaka (1 of 1)] - method tar: ReadError('empty file')

[dragonflye] Error running command: medaka_consensus -i READS.fq.gz -d flye/polish/racon/1/consensus.fasta -o flye/polish/medaka/1 -m r1041_e82_400bps_sup_g615 -t 4  2>&1 | sed 's/^/[polishing - medaka (1 of 1)] /' | tee -a dragonflye.log

I know this sounds like a medaka issue, but do you have a clue how to fix this before I escalate?
Unfortunately this is the main model my users are looking to use...

dragonflye.log
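The `ReadError('empty file')` lines in the traceback suggest that the model archive on disk may be zero bytes, for instance after a failed download. A small sketch for checking a model file before running medaka is shown below. Where medaka stores its model files is not stated here, so the path is left as an argument, and the helper name is hypothetical.

```python
import os
import tarfile


def model_archive_problem(path):
    """Return a short description of what is wrong with a model archive,
    or None if it looks like a readable tar file."""
    if not os.path.exists(path):
        return "file does not exist"
    if os.path.getsize(path) == 0:
        return "file is empty (0 bytes), possibly a failed download"
    if not tarfile.is_tarfile(path):
        return "file is not a readable tar archive"
    return None
```

If the function reports an empty file, deleting it so that the model can be fetched again is a reasonable first thing to try, though this thread does not confirm that it resolves the error.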

Feature request: dragonflye 1.1.N default to flye 2.9.2?

Hi @rpetit3 - flye 2.9.2 is on bioconda now :)

Do you think any major changes will be required for flye 2.9.2? Wondering if that would be fairly safe to pin &/or if you've done anything with it yet!

P.S. totally unrelated, but having the CPU/GPU version separated on bioconda was a great idea. I've only tried CPU medaka so far on 1.1.0. Is dragonflye-gpu on your dev channel?

Expose nano-hq option

Might be useful for data generated with the latest ONT chemistry (Q20+) and basecallers to have the Flye --nano-hq mode available. I found that this can't be added with the --opts flag as --nano-raw is hardcoded. Great tool!

Polypolish error

Hi @rpetit3,

Thanks for your excellent tool. I ran dragonflye (v1.1.2) with the intent of assembling long reads and polishing with short reads. However, I encountered an error when the pipeline got to the polypolish step.

[dragonflye] Polishing with Polypolish
[dragonflye] Running: polypolish flye/polish/racon/1/consensus.fasta flye/polish/short_reads/polypolish/1/polypolish_R1-1.sam flye/polish/short_reads/polypolish/1/polypolish-R2-1.sam > flye/polish/short_reads/polypolish/1/polypolish-1.fasta | sed 's/^/[short read polishing - polypolish (1 of 1)] /' | tee -a dragonflye.log
error: unrecognized subcommand 'flye/polish/racon/1/consensus.fasta'
Usage: polypolish <COMMAND>

For more information, try '--help'.

The command I ran is

dragonflye --reads /MIGE/01_DATA/01_FASTQ/15059.fastq.gz --R1 /MIGE/01_DATA/01_FASTQ/15059_1.fastq.gz --R2 /MIGE/01_DATA/01_FASTQ/15059_2.fastq.gz --prefix 15059 --outdir dragonflye_direct --force

I thought this was due to the dragonflye version that I used. However, the current version (v1.2.0) isn't installable using conda (I tried, but I encountered an error).

The error message gives the impression that polypolish requires a sub-command (e.g., polypolish filter or polypolish polish), which is currently missing from the dragonflye pipeline.

I look forward to hearing back from you.
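To illustrate the diagnosis above, here is a sketch of a command builder that inserts the subcommand before the positional arguments. It assumes that `polypolish polish` accepts the draft FASTA and SAM files in the same order as the older invocation, which this thread does not confirm. The helper name is hypothetical and this is not dragonflye's code.

```python
def polypolish_command(draft_fasta, sam_files, subcommand="polish"):
    """Build a polypolish argument list.

    Pass subcommand=None to reproduce the older call style that the
    log above shows failing with 'unrecognized subcommand'.
    """
    cmd = ["polypolish"]
    if subcommand:
        cmd.append(subcommand)
    cmd.append(draft_fasta)
    cmd.extend(sam_files)
    return cmd
```

With the default arguments the list starts with `polypolish polish`, followed by the draft and the SAM files.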

homologous polishing

Hi Robert,
First of all, thank you for this useful tool. I'd like to suggest adding homopolish as a further polishing step (step 6.5?) in the pipeline.

flye log

Hello,
Thanks for this great work.
It would be nice not to remove the flye log, as its information is not included in the dragonflye log file.
Best regards
Mostafa

Batch option for Medaka

Hi!

Running into issues with medaka polishing step. Runs out of GPU memory. Medaka manual states that passing a batch option (-b) to medaka_consensus helps limit the GPU memory usage.

Tried by editing the bin file and it works. Could there be a way to dynamically pass a batch size option to medaka when calling dragonflye?

Thanks!

MV
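A sketch of how a batch size could be threaded through to the medaka call. The `-i`, `-d`, `-o`, `-m` and `-t` flags are copied from the `medaka_consensus` commands logged elsewhere on this page, and `-b` is the option named in this post. The function and its `batch_size` parameter are hypothetical, not an existing dragonflye option.

```python
def medaka_consensus_command(reads, draft, outdir, model,
                             threads=4, batch_size=None):
    """Build a medaka_consensus argument list, adding -b only when a
    batch size is requested."""
    cmd = [
        "medaka_consensus",
        "-i", reads,
        "-d", draft,
        "-o", outdir,
        "-m", model,
        "-t", str(threads),
    ]
    if batch_size is not None:
        cmd += ["-b", str(batch_size)]
    return cmd
```

Leaving `batch_size` as `None` keeps the command identical to the current one, so the default behaviour would not change.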

Medaka v1.7.3

Dear @rpetit3, is dragonFlye still being maintained?

Selfish request for medaka to be updated if so!

(I can never get medaka working independently polishing my Flye assemblies, so when I'm doing LR-only assemblies like to use dragonFlye. A bit funny since I originally used dragonFlye for quick&easy polypolish! Such a nice pipeline:)

Unable to verify model 'r941_min_sup_g507'

Hi @rpetit3 ,

We tried to polish the contigs with model r941_min_sup_g507, but we get this error message:

[dragonflye] Verifying input model (--model): r941_min_sup_g507
[dragonflye] Unable to verify model 'r941_min_sup_g507', please check spelling and try again.

Is this model included in dragonflye?
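One way to see whether a model name is known to the installed medaka is to compare it against the output of `medaka tools list_models`, the command shown earlier on this page. The sketch below assumes that output contains a line starting with `Available:` followed by a comma-separated list, as in that earlier paste. How dragonflye itself verifies `--model` is not stated here, and the function names are hypothetical.

```python
import difflib


def parse_available_models(list_models_output):
    """Extract model names from the 'Available:' line of the output."""
    for line in list_models_output.splitlines():
        if line.startswith("Available:"):
            names = line[len("Available:"):].split(",")
            return [n.strip() for n in names if n.strip()]
    return []


def check_model(name, available):
    """Return (known, suggestions); suggestions are close spellings
    offered only when the name is not an exact match."""
    if name in available:
        return True, []
    return False, difflib.get_close_matches(name, available, n=3)
```

A name such as `r1041_e82_400bps_sup_v420` would not match exactly, but the dotted spelling `r1041_e82_400bps_sup_v4.2.0` would be among the suggestions if it is in the list.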

Memory default

I'm wondering if you want to increase the default memory up to something more solid since this is nanopore. 64G? The Java error that results is really confusing to people who don't know java. I have one more separate but related idea and will make a separate ticket for that.

Coverage of "0"?

What does a coverage of 0 (for a contig) in the "flye-info.txt" mean? I don't think these contigs end up in the final assembly file anyway, but I am curious as to how and why it is being reported as such, since it obviously doesn't make much sense.

how to check software in dragonflye?

Hi,

I don't know how dragonflye checks software versions. Dragonflye still uses the software on the system bin path after the conda env is activated.
source /Bio/User/kxie/software/miniforge3/bin/activate dragonflye
image

Why doesn't dragonflye use the versions in the conda environment? Some of the software on my system is very old.
For example, early versions of fastp have no --unpaired1 and --unpaired2 options, so the pipeline stops with an error like the following:
image

Best,
Kun
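A quick way to see which copy of each tool would be picked up is to resolve it on `PATH` and compare its location with the environment's `bin` directory. The sketch below does this with `shutil.which`. The helper name is hypothetical and the tool list in the usage note is only an example. It does not explain why dragonflye picks the system copies in this case.

```python
import os
import shutil


def resolve_tools(tools, env_prefix, search_path=None):
    """Map each tool name to (resolved path or None, inside_env).

    inside_env is True only when the resolved executable lives directly
    in env_prefix/bin. search_path defaults to the current PATH.
    """
    env_bin = os.path.join(os.path.abspath(env_prefix), "bin")
    report = {}
    for tool in tools:
        path = shutil.which(tool, path=search_path)
        inside = (
            path is not None
            and os.path.dirname(os.path.abspath(path)) == env_bin
        )
        report[tool] = (path, inside)
    return report
```

After activating the environment, `resolve_tools(["fastp", "flye"], os.environ["CONDA_PREFIX"])` would show whether those two tools resolve inside the environment. Conda sets `CONDA_PREFIX` on activation.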

Stuck in kmc part

Hi @rpetit3 ,

Thank you for developing this tool! It is amazing.

I have just tried to run this tool, but it sometimes gets stuck at the kmc step and doesn't continue running (I checked the CPU usage in htop). When I quit the terminal and rerun with the --force option, the run completes successfully. How can I solve this problem?

Thank you very much!
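Until the cause of the hang is known, the manual "quit and rerun" workaround described above can be automated with a timeout and a retry. The sketch below is generic and does not address why kmc stalls. The timeout value and the number of attempts are assumptions the caller must choose, and the function name is hypothetical.

```python
import subprocess


def run_with_retry(cmd, timeout_s, attempts=2):
    """Run cmd, killing and retrying it if it exceeds timeout_s seconds.

    Returns the attempt number that succeeded. Raises
    subprocess.TimeoutExpired if every attempt times out, and
    subprocess.CalledProcessError if the command exits non-zero.
    """
    for attempt in range(1, attempts + 1):
        try:
            subprocess.run(cmd, check=True, timeout=timeout_s)
            return attempt
        except subprocess.TimeoutExpired:
            if attempt == attempts:
                raise
```

The wrapped command would be the full dragonflye invocation, including `--force` so that a retry can overwrite the partial output of the stalled attempt.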

Can't locate FindBin.pm in @INC

I don't know if this will be an issue to most of your users, but I installed dragonflye via micromamba in a gitpod environment and ran into the following error:

$ dragonflye --help
Can't locate FindBin.pm in @INC (you may need to install the FindBin module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.36.0 /usr/local/share/perl/5.36.0 /usr/lib/x86_64-linux-gnu/perl5/5.36 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl-base /usr/lib/x86_64-linux-gnu/perl/5.36 /usr/share/perl/5.36 /usr/local/lib/site_perl) at /opt/conda/bin/dragonflye line 58.
BEGIN failed--compilation aborted at /opt/conda/bin/dragonflye line 58.
