pacificbiosciences / hiphase

Small variant, structural variant, and short tandem repeat phasing tool for PacBio HiFi reads

License: Other

Rust 100.00%
hifi phasing structural-variants variants pacbio-data short-tandem-repeats

hiphase's Introduction

HiPhase

A tool for jointly phasing small, structural, and tandem repeat variants for PacBio sequencing data


HiPhase will phase variant calls made from PacBio HiFi datasets. Key features relative to other phasing tools include:

  • Joint phasing of small variants and structural variants
  • Support for multi-allelic variation
  • Creates longer, correct phase blocks relative to the current best practice
  • No downsampling of the data
  • Novel algorithms: dual-mode allele assignment and core A* phasing algorithm
  • Quality of life additions: innate multi-threading, simultaneous haplotagging and statistics generation

Authors: Matt Holt, Chris Saunders

Availability

Documentation

Citation

If you use HiPhase, please cite our publication:

Holt, J. M., Saunders, C. T., Rowell, W. J., Kronenberg, Z., Wenger, A. M., & Eberle, M. (2024). HiPhase: Jointly phasing small, structural, and tandem repeat variants from HiFi sequencing. Bioinformatics, btae042.

Original BioRxiv pre-print:

Holt, J. M., Saunders, C. T., Rowell, W. J., Kronenberg, Z., Wenger, A. M., & Eberle, M. (2023). HiPhase: Jointly phasing small and structural variants from HiFi sequencing. bioRxiv, 2023-05.

Need help?

If you notice any missing features, bugs, or need assistance with analyzing the output of HiPhase, please don't hesitate to open a GitHub issue.

Support information

HiPhase is a pre-release software intended for research use only and not for use in diagnostic procedures. While efforts have been made to ensure that HiPhase lives up to the quality that PacBio strives for, we make no warranty regarding this software.

As HiPhase is not covered by any service level agreement or the like, please do not contact a PacBio Field Applications Scientist or PacBio Customer Service for assistance with any HiPhase release. Please report all issues through GitHub instead. We make no warranty that any such issue will be addressed, to any extent or within any time frame.

DISCLAIMER

THIS WEBSITE AND CONTENT AND ALL SITE-RELATED SERVICES, INCLUDING ANY DATA, ARE PROVIDED "AS IS," WITH ALL FAULTS, WITH NO REPRESENTATIONS OR WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTIES OF MERCHANTABILITY, SATISFACTORY QUALITY, NON-INFRINGEMENT OR FITNESS FOR A PARTICULAR PURPOSE. YOU ASSUME TOTAL RESPONSIBILITY AND RISK FOR YOUR USE OF THIS SITE, ALL SITE-RELATED SERVICES, AND ANY THIRD PARTY WEBSITES OR APPLICATIONS. NO ORAL OR WRITTEN INFORMATION OR ADVICE SHALL CREATE A WARRANTY OF ANY KIND. ANY REFERENCES TO SPECIFIC PRODUCTS OR SERVICES ON THE WEBSITES DO NOT CONSTITUTE OR IMPLY A RECOMMENDATION OR ENDORSEMENT BY PACIFIC BIOSCIENCES.

hiphase's People

Contributors

holtjma

hiphase's Issues

Encountered max_edit_distance check

Hi,

I am trying HiPhase with another long-read technology, using a VCF from Clair3. I realize that is probably not strictly supported :-)

The tool ran happily for a couple of hours, with some warnings ([2023-06-14T23:47:13.781Z WARN hiphase::phaser] B#1718 (chr20:59163013-64334012) detected excessive runtime in read parsing, reverting to local re-alignment.) before running into this panic:

edit_distance => 100001
thread '<unnamed>' panicked at 'encountered max_edit_distance check, probably remove when ready but this prevents infinite looping bugs', src/wfa_graph.rs:643:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
[2023-06-14T23:50:05.584Z ERROR hiphase] Panic detected in ThreadPool, check above for details.

Do you have a suggestion for fixing this?

Wouter

Question on tagging of supplemental alignments

Hi Matt,

Thanks again for the great tool. I am wondering how supplemental alignments are tagged. Are they treated as their own reads, or do they "inherit" the HP and PS of the primary alignment?

Thanks!
Mitchell

Error “thread '<unnamed>' panicked at 'assertion failed: `(left == right)`” occurred while running HiPhase

Dear developer, I have encountered an error similar to the one in issue #7 while using HiPhase. The details are as follows.
Environment:

# hiphase version:0.10.0
# calling snv/indel:deepvariant  version:1.4.0
# calling sv:pbsv version:pbsv 2.8.1 (commit SL-release-10.2.1-31-ge3fa446)
hiphase \
--reference hg38/Homo_sapiens_assembly38.onlychr.fasta \
--global-realignment-cputime 300 --vcf sample1.MergeVcfs.vcf.gz --output-vcf sample1.deepvariant.phased.vcf.gz \
--vcf sample1.PASS.PRECISE.vcf.gz --output-vcf sample1.pbsv.phased.vcf.gz \
--bam sample1.sort.bam --threads 10 --summary-file sample1.summary.tsv --blocks-file sample1.blocks.tsv \
--stats-file sample1.stats.csv --haplotag-file sample1.haplotag.csv

The error is shown below. I hope the developer can check it and advise me on how to handle such data in the future:

[2023-05-29T07:46:01.381Z INFO  hiphase::data_types::reference_genome] Loading "hg38/Homo_sapiens_assembly38.onlychr.fasta"...
[2023-05-29T07:46:09.707Z INFO  hiphase::data_types::reference_genome] Finished loading 25 contigs.
[2023-05-29T07:46:09.707Z INFO  hiphase] Starting job pool with 10 threads...
thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `[71]`,
 right: `[67]`', src/phaser.rs:221:21
stack backtrace:
thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `[71]`,
 right: `[67]`', src/phaser.rs:221:21
thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `[84]`,
 right: `[65]`', src/phaser.rs:221:21
   0:           0x59e5f4 - std::backtrace_rs::backtrace::libunwind::trace::ha9053a9a07ca49cb
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:           0x59e5f4 - std::backtrace_rs::backtrace::trace_unsynchronized::h9c2852a457ad564e
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:           0x59e5f4 - std::sys_common::backtrace::_print_fmt::h457936fbfaa0070f
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/sys_common/backtrace.rs:65:5
   3:           0x59e5f4 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h5779d7bf7f70cb0c
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/sys_common/backtrace.rs:44:22
   4:           0x4957fe - core::fmt::write::h5a4baaff1bcd3eb5
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/fmt/mod.rs:1232:17
   5:           0x57a992 - std::io::Write::write_fmt::h4bc1f301cb9e9cce
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/io/mod.rs:1684:15
   6:           0x59fc39 - std::sys_common::backtrace::_print::h5fcdc36060f177e8
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/sys_common/backtrace.rs:47:5
   7:           0x59fc39 - std::sys_common::backtrace::print::h54ca9458b876c8bf
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/sys_common/backtrace.rs:34:9
   8:           0x59f868 - std::panicking::default_hook::{{closure}}::hbe471161c7664ed6
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:271:22
   9:           0x5a0881 - std::panicking::default_hook::ha3500da57aa4ac4f
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:290:9
  10:           0x5a0881 - std::panicking::rust_panic_with_hook::h50c09d000dc561d2
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:692:13
  11:           0x5a036e - std::panicking::begin_panic_handler::{{closure}}::h9e2b2176e00e0d9c
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:583:13
  12:           0x5a02d6 - std::sys_common::backtrace::__rust_end_short_backtrace::h5739b8e512c09d02
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/sys_common/backtrace.rs:150:18
  13:           0x5a02cd - rust_begin_unwind
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:579:5
  14:           0x40258c - core::panicking::panic_fmt::hf33a1475b4dc5c3e
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/panicking.rs:64:14
  15:           0x4027a6 - core::panicking::assert_failed_inner::haf9816227b20b6f2
  16:           0x4067f1 - core::panicking::assert_failed::h816680c3b2244efb
  17:           0x4e636e - hiphase::phaser::solve_block::h99faa68405ef9ccc
  18:           0x41331c - <F as threadpool::FnBox>::call_box::h07a333169af0e835
  19:           0x5a5d70 - std::sys_common::backtrace::__rust_begin_short_backtrace::h2244e6eb820fac9f
  20:           0x5a49e4 - core::ops::function::FnOnce::call_once{{vtable.shim}}::h2f8049bc016327da
  21:           0x5a1395 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h39990b24eedef2ab
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/alloc/src/boxed.rs:1987:9
  22:           0x5a1395 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h01a027258444143b
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/alloc/src/boxed.rs:1987:9
  23:           0x5a1395 - std::sys::unix::thread::Thread::new::thread_start::ha4f1cdd9c25884ba
                               at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/sys/unix/thread.rs:108:17
  24:           0x6658f7 - start_thread
                               at ./nptl/./nptl/pthread_create.c:477:8
  25:           0x6e979f - __clone
  26:                0x0 - <unknown>

Expected memory usage

Hi @holtjma,

I was wondering if you could share some approximate memory usage estimates for hiphase? (and sorry if I missed it in the docs)

We're running into some surprising out-of-mem errors and wondering if we need to change resources or if there is something that could be addressed/improved.

Thanks,
Mitchell

Poor utilization of threads (maybe user error?)

Hi @holtjma,

When I give hiphase 32 threads, it only uses 150-300% CPU (see the screenshot below with top and the run log). Is this expected? If not, do you have any recommendations? This is the command I am using:

hiphase -t 32 \
    --bam {input.bam} \
    --vcf {input.vcf} \
    --reference {input.ref} \
    --output-bam {output.bam} \
    --output-vcf {output.vcf} \
    --summary-file {output.summary} \
    --stats-file {output.stats} \
    --blocks-file {output.blocks}

Thanks in advance!
Mitchell

[Screenshot: top output and run log]

Error while parsing VCF file: FORMAT columns

Hi Matt,

I was running HiPhase on a VCF and got an error for a pbsv call. I manually checked this record and didn't see anything obviously wrong.
[E::vcf_parse_format] FORMAT column with no sample columns starting at chr1:80764910
[2024-02-09T19:59:04.480Z ERROR hiphase] Error while parsing VCF file: invalid record in BCF/VCF file

The FORMAT columns of this call are:
chr1 80764910 pbsv.INS.2090 GT:AD:DP:SUPP ./.:.:.:.

In addition, I have seen this error before with other pbsv call sets. Does HiPhase require all records to have the same FORMAT tags? Could you explain what this error means and suggest how to resolve it?
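One way to narrow down an error like this is to look for data lines whose column count disagrees with the `#CHROM` header line, since the htslib message refers to a FORMAT column that is not followed by sample columns. The sketch below is a hypothetical diagnostic (the name `find_column_mismatches` is not part of HiPhase or htslib) and assumes an uncompressed, tab-delimited VCF supplied as lines of text.

```python
def find_column_mismatches(lines):
    """Return (line_number, column_count) for records that disagree with the header.

    Hypothetical diagnostic helper, not part of HiPhase or htslib.
    """
    expected = None
    mismatches = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##") or not line:
            continue  # meta-information lines and blank lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            expected = len(fields)  # 8 fixed columns, plus FORMAT and samples if present
            continue
        if expected is not None and len(fields) != expected:
            mismatches.append((number, len(fields)))
    return mismatches


# Invented records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1",
    "chr1\t200\t.\tC\tT\t.\tPASS\t.\tGT",  # FORMAT present, sample column missing
]
print(find_column_mismatches(example))
```

A record that was delimited with spaces instead of tabs would also be reported by this check, which can be a useful hint in its own right.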

Best regards,

Hang Su

A question about HP tag

Hi, it is good to hear that you have developed a useful tool for analysing HiFi data.
I used this tool to analyse whole-genome sequencing data from the Sequel IIe platform. I have subreads for one child and NGS data for his parents, and I want to phase methylation loci and compare the two haplotypes.

According to the user guide, HiPhase follows the same haplotagging convention as WhatsHap, and each mapping is tagged with both a phase set ID (PS) and a haplotype ID (HP). So how do HP values 1 and 2 relate to the parents? Does HP tag "1" represent the paternal haplotype and HP tag "2" the maternal one?

The second question concerns this statement: "Within a single phase block, all mappings with the same read name will have the same HP tag. Mappings of the same read to different phase blocks (e.g. to different chromosomes) are not guaranteed to have matching HP tags." Could you explain this sentence more clearly? Is it possible that HP tag "1" or HP tag "2" has different meanings on different chromosomes?
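The statement about phase blocks in the question above can be read as: an HP value is only a label within its phase set, so any comparison should key on the (PS, HP) pair rather than on HP alone. A minimal sketch with invented read names and tag values (none of them come from real HiPhase output):

```python
from collections import defaultdict

# (read_name, PS, HP) triples; all values are invented for illustration.
mappings = [
    ("read_a", 1000, 1),
    ("read_b", 1000, 2),
    ("read_a", 52000, 2),  # same read, different phase block: HP may differ
    ("read_c", 52000, 1),
]


def group_by_phase_set(records):
    """Group read names by (PS, HP) so haplotypes are never mixed across blocks."""
    groups = defaultdict(set)
    for read_name, phase_set, haplotype in records:
        groups[(phase_set, haplotype)].add(read_name)
    return dict(groups)


groups = group_by_phase_set(mappings)
for key in sorted(groups):
    print(key, sorted(groups[key]))
```

Read-based phasing on its own assigns these labels arbitrarily within each block, so tying HP 1 or 2 to a particular parent would need additional information such as parental genotypes.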

Looking forward to your reply.
THANKS.

segmentation fault (core dumped)

Hi!

Thank you for developing HiPhase. I get this error with every version of HiPhase I have tried:

hiphase --bam hifi2ref.sorted.bam --vcf bcftools.vcf --output-vcf bcftools.phased.vcf --reference ref.fasta --threads 40

[2023-06-12T21:23:56.142Z INFO  hiphase::cli] Alignment file: "hifi2ref.sorted.bam"
[2023-06-12T21:23:56.142Z INFO  hiphase::cli] Variant file: "bcftools.vcf"
[2023-06-12T21:23:56.142Z INFO  hiphase::cli] Reference file: "ref.fasta"
[2023-06-12T21:23:56.142Z INFO  hiphase::cli] Minimum call quality: 0
[2023-06-12T21:23:56.142Z INFO  hiphase::cli] Minimum mapping quality: 5
[2023-06-12T21:23:56.142Z INFO  hiphase::cli] Minimum matched alleles: 2
[2023-06-12T21:23:56.142Z INFO  hiphase::cli] Minimum spanning reads: 1
[2023-06-12T21:23:56.142Z INFO  hiphase::cli] Supplemental mapping block joins: ENABLED
[2023-06-12T21:23:56.142Z INFO  hiphase::cli] Phase singleton blocks: DISABLED
[2023-06-12T21:23:56.142Z INFO  hiphase::cli] Local re-alignment maximum reference buffer: +-15 bp
[2023-06-12T21:23:56.142Z INFO  hiphase::cli] Global re-alignment: DISABLED
[2023-06-12T21:23:56.142Z INFO  hiphase::cli] Processing threads: 40
[2023-06-12T21:23:56.142Z INFO  hiphase::cli] I/O threads: 40
[1]    38095 segmentation fault (core dumped)  hiphase --bam hifi2ref.sorted.bam --vcf bcftools.vcf --output-vcf  --referenc

Here's my system info:

uname -a
Linux machado-kremer 5.19.0-43-generic #44~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon May 22 13:39:36 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

Thank you in advance for your help.

RUST error when phasing with SV VCF file

I am facing a RUST problem when providing HiPhase with a Structural Variants (SV) VCF file to improve the phasing. Phasing works fine when only a Single Nucleotide Variants (SNV) VCF file is provided. This is the command I am using:

/path/hiphase-v0.8.0-x86_64-unknown-linux-gnu/hiphase -t 16 -r /path/ref.fa -b /path/sample.bam -p sample.haplotagged.bam -c /path/SNV.vcf.gz -o sample.SNV.phased.vcf.gz -c /path/SV.vcf.gz -o sample.SV.phased.vcf.gz -s sample

And this is the error:

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `[78]`,
 right: `[71]`', src/phaser.rs:222:25
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
[2023-05-11T09:18:04.236Z ERROR hiphase] Panic detected in ThreadPool, check above for details.

The SV VCF file was generated with Sniffles2 (https://github.com/fritzsedlazeck/Sniffles). We thought it might be related to breakend (BND) variants, but the error also appears when these are filtered out.
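If the bracketed numbers in the panic message are ASCII byte values (an assumption; the message itself does not say so), they decode to nucleotide letters, which may help when searching for the offending record. A minimal sketch:

```python
def decode_ascii(values):
    """Decode a list of byte values, such as [78], into text."""
    return bytes(values).decode("ascii")


# Values copied from the panic message above; the ASCII reading is an assumption.
left, right = decode_ascii([78]), decode_ascii([71])
print(left, right)  # N G
```

Under the same assumption, the values in the 0.10.0 report earlier on this page (71 vs 67, 84 vs 65) would read as G vs C and T vs A.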

Thanks in advance for looking into this :)

Phase vcf with pre phased reads

Hi @holtjma,

It's on my todo list to write a tool that takes a bam with phased reads and then phases a vcf with that preexisting information. This often comes up for us when we phase reads with k-mers or an assembly and we want to create a phased vcf to match.

I was going to make a standalone tool, but then I thought that, depending on how things are set up in hiphase, much of the work might already be done.

Do you think adding a --use-pre-phased-bam option would be something you'd allow? If so, could you point me to a place in the code where I might start working on this functionality?

Cheers,
Mitchell

Normalization of INDELs: required or should be avoided

I was wondering whether INDEL normalization is required for better phasing results or should be avoided as it might also destroy phasing. I was unable to find any discussion on the INDEL normalization in the documentation or the closed issues. Can you please shed some light?
Thank you in advance!

Feature request: haplotag in phased VCF files

Thank you for developing such a useful tool! Would it be possible to include the haplotag information (stored in the --haplotag-file file) in the phased VCF files? HiPhase currently only outputs PS (phase set identifier) in the phased VCF files.

Recommendations for input vcf

Hi @holtjma,

Would you be able to share the DeepVariant parameters and filtering steps you use at PacBio before applying hiphase? I'd like to be as consistent as I can with your process! Sorry if this is documented and I missed it.

Cheers,
Mitchell

[Suggestion] reducing messages to STDOUT to speed up the utility

HiPhase works great. I'd like to suggest moving warning messages such as the one below to the debug level. This would greatly reduce the number of messages written to STDOUT, so the log would be less cluttered and execution could be faster without major changes.

[2024-04-15T18:17:31.161Z WARN hiphase::writers::ordered_vcf_writer] Received 'error seeking to "chrUn_KI270512v1":0 in indexed file', while seeking to chrUn_KI270512v1:0-18446744073709551615 in vcf #0, likely no variants present

Feature request: CRAM compatible

Thanks for developing HiPhase. Would it be possible to accept CRAM files as input and write haplotagged CRAM files as output? That would suit our data architecture perfectly.

reference letter case issue

Hi,

My VCF calls are from sniffles2 and somewhere in the pipeline the reference sequence letter case has been changed to lower-case, leading to the 'fatal' error below.

[2023-10-12T14:21:23.350Z ERROR hiphase]   Reference mismatch error: variant at chr1:10354 has REF allele = "ccctaaccctaaccctaaccctaaccctaaccctaacccctaacccctaaccctaaccctaaccctaaccct", but reference genome has "CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCCTAACCCTAACCCTAACCCTAACCCT".

Is there a hidden parameter that would allow ignoring case and running the tool?

Note: the pbsv calls from the same BAM are processed well so the issue has probably been introduced by sniffles2 when writing the VCF.
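A possible workaround, independent of any HiPhase option, is to uppercase the sequence alleles in the VCF before phasing. The sketch below is a hypothetical pre-processing helper, not a HiPhase or Sniffles2 feature; it assumes tab-delimited VCF text and leaves symbolic alleles such as `<DEL>` and breakend notation untouched.

```python
NUCLEOTIDES = set("ACGTNacgtn")


def uppercase_alleles(line):
    """Uppercase REF and plain-sequence ALT alleles of one VCF data line."""
    if line.startswith("#"):
        return line  # header lines pass through unchanged
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        return line  # not a complete record; leave untouched
    fields[3] = fields[3].upper()  # REF
    alts = [
        alt.upper() if set(alt) <= NUCLEOTIDES else alt  # skip <DEL>, breakends, '*', '.'
        for alt in fields[4].split(",")
    ]
    fields[4] = ",".join(alts)
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(fields) + newline


# Invented record for illustration only.
record = "chr1\t10354\t.\tccctaa\tc,<DEL>\t.\tPASS\t.\n"
print(uppercase_alleles(record), end="")
```

Only the REF and ALT columns are changed, so IDs, INFO fields, and genotype columns keep their original case.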

Thanks for your help

PS: I have also posted this on the Sniffles2 tracker with a cross-reference to maximize my chance of finding a fix; no offence taken, I hope.

Running HiPhase with tumor-normal pair

Hello,
Thank you for developing HiPhase! I liked how HiPhase handles haplotagging supplementary alignments. Is this by default, or should I pass a parameter?

Is there a suggested workflow for tumor-normal pairs? We usually run the Clair3 (or DV) + whatshap phase with normal and then use whatshap to haplotag both tumor and normal using the phased vcf. Is there such functionality?

Thank you,
Ayse
