0xtcg / aldy
Allelic decomposition and exact genotyping of highly polymorphic and structurally variant genes
Home Page: http://aldy.csail.mit.edu
License: Other
Is there a way to run multiple BAM files in Aldy and have the results from each BAM file contained in a single output file? When I try running each BAM file at a time to the same output file, it overwrites the data from the previous one.
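One possible workaround, sketched here under assumptions, is to run Aldy once per BAM file with its own output file and then concatenate the per-sample outputs. The flags `-p`, `-g` and `-o` are the ones that appear in commands elsewhere on this page; the helper names, the output naming scheme and the `#Sample:` separator line are hypothetical, and the sketch has not been checked against any particular Aldy version.

```python
import subprocess
from pathlib import Path


def aldy_command(bam, gene, profile, out_path):
    """Build one `aldy genotype` call (flags as used elsewhere on this page)."""
    return ["aldy", "genotype", "-p", profile, "-g", gene, str(bam), "-o", str(out_path)]


def merge_outputs(per_sample_outputs, merged_path):
    """Concatenate per-sample result files, prefixing each with a separator line."""
    with open(merged_path, "w") as merged:
        for sample, out_path in per_sample_outputs:
            # Hypothetical separator line; it is not part of Aldy's own output.
            merged.write(f"#Sample: {sample}\n")
            text = Path(out_path).read_text()
            merged.write(text if text.endswith("\n") else text + "\n")


def genotype_all(bams, gene, profile, merged_path):
    """Run Aldy once per BAM into its own file, then merge the results."""
    outputs = []
    for bam in bams:
        # e.g. sample1.bam -> sample1.cyp2d6.aldy (naming scheme is an assumption)
        out_path = Path(bam).with_suffix(f".{gene}.aldy")
        subprocess.run(aldy_command(bam, gene, profile, out_path), check=True)
        outputs.append((Path(bam).stem, out_path))
    merge_outputs(outputs, merged_path)
```

Writing each sample to its own file first avoids the overwrite described above, and the merge step only touches files that Aldy has already finished writing.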
Hi!
When running Aldy, the tool does not recognize [23936, A>G, rs7088784], a variant characteristic for *3.002. I checked my input file manually and found 5 variants also present in PharmVar. However, Aldy only picks up 4 of those 5 and does not return [23936, A>G, rs7088784].
I checked the cyp2c19.yml file of the Aldy resources;
The variant is there, so that is not the problem.
Did someone maybe encounter the same?
Hi there,
I've tried genotyping my sample using Aldy, but I ran into the issue below.
$ aldy genotype -p wgs -g cftr 1_HB00001.cram -o test_1_HB00001-cftr.aldy
🐿 Aldy v4.2.1 (Python 3.9.13 on Linux 3.10.0-1062.4.1.el7.x86_64-x86_64-with-glibc2.17)
(c) 2016-2023 Aldy Authors. All rights reserved.
Free for non-commercial/academic use only.
Genotyping sample 1_HB00001.cram...
ERROR: truncated file cannot be accessed
I cannot find a solution to this error.
Could you help me to handle this error?
Thank you.
Hi, I am trying to use Aldy with CRAM files.
When I pass a different CN-neutral region, Aldy still seems to fall back to the CYP2D8 region.
aldy genotype --genome hg38 -r GRCh38_full_analysis_set_plus_decoy_hla.fa -p wes -o example.wes_cram.out -l example.wes_cram.log example.cram -n chr7:55019016-55211628
First part of the log looks like this:
🐿 Aldy v4.4 (Python 3.6.5 on Linux 5.15.0-1031-aws-x86_64-with-debian-buster-sid)
(c) 2016-2023 Aldy Authors. All rights reserved.
Free for non-commercial/academic use only.
Genotyping sample example.cram...
Gene CFTR
Failed gene CFTR
Message: CN-neutral region 22:42151472-42152258 has no reads. Double check your input file for CYP2D8 (are you using hg19?), or pass an alternative CN-neutral region via -n parameter.
Any help with this would be appreciated.
Hello,
I am trying to run Aldy to call CYP2D6 on a set of samples that were sequenced using PacBio long-read whole genome sequencing at 8x coverage. The profiles currently available for Aldy are based on targeted long-read sequencing, which does not work for these samples because coverage is low. I tried creating my own profile using one of the samples, but I get the error below:
/home/jupyter/workspaces/piii03variantfrequencyprojectpgxcontrolled/bin/python/bin/python3.9 -m aldy profile --param sam_long_reads=true --genome hg38 123456.bam > long_read.profile
🐿 Aldy v4.4 (Python 3.9.15 on Linux 5.10.0-0.deb10.16-amd64-x86_64-with-glibc2.31)
(c) 2016-2023 Aldy Authors. All rights reserved.
Free for non-commercial/academic use only.
Ignoring utr3 in 5.001
Ignoring utr3 in 7.001
Ignoring utr3 in 8.001
Ignoring utr3 in 10.001
Ignoring utr3 in 19.001
Ignoring utr3 in 24.001
Ignoring utr3 in 24.002
Ignoring utr3 in 28.001
Ignoring utr3 in 28.002
Ignoring utr3 in 35.001
Ignoring utr3 in 35.002
Ignoring utr3 in 36.001
Ignoring utr3 in 37.001
Scanning chr1:59890307-109696745...
Scanning chr2:233583743-233776299...
Scanning chr4:69050374-69114287...
Scanning chr6:18125310-18161143...
Scanning chr7:977198-117718971...
/bin/bash: line 1: 16913 Killed /home/jupyter/workspaces/piii03variantfrequencyprojectpgxcontrolled/bin/python/bin/python3.9 -m aldy profile --param sam_long_reads=true --genome hg38 123456.bam > long_read.profile
Thanks,
Andrew
Hi,
thanks for this useful tool!
In case you would consider also providing this as a docker container, please see my fork: https://github.com/ikmb/aldy
All you need to do is change the Docker Hub repo in the respective workflow files (.github/workflows) and add the relevant repo secrets (DOCKERHUB_USERNAME and DOCKERHUB_PASS), and you are basically good to go. It should then auto-build on commits to master and on any release you do.
If you do not allow others to build public Docker containers of your code, please let me know. I actually need that for my compute environment...
For testing, you can simply do:
docker pull ikmb/aldy:latest
or
singularity pull docker://ikmb/aldy:latest
Cheers,
Marc
Hello, hope all is well. I have noticed that the -n and --cn-neutral-region options do not update my neutral region. Is there a way I can provide an updated neutral region to the algorithm?
Example commands;
[ceisenhart@localhost foo]$ aldy genotype --cn-neutral-region chr1:10000-20000 -p illumina -g final.bam
🐿 Aldy v4.4 (Python 3.9.16 on Linux 5.14.0-284.25.1.el9_2.x86_64-x86_64-with-glibc2.34)
(c) 2016-2023 Aldy Authors. All rights reserved.
Free for non-commercial/academic use only.
Genotyping sample POC_LAB_1_S19_L00N_R1_001.true.final.bam...
ERROR: gene= cyp2d6, profile= illumina, file= POC_LAB_1_S19_L00N_R1_001.true.final.bam
CN-neutral region 22:42151472-42152258 has no reads. Double check your input file for CYP2D8 (are you using hg19?), or pass an alternative CN-neutral region via -n parameter.
[ceisenhart@localhost foo]$ aldy genotype -n chr1:10000-20000 -p illumina -g final.bam
🐿 Aldy v4.4 (Python 3.9.16 on Linux 5.14.0-284.25.1.el9_2.x86_64-x86_64-with-glibc2.34)
(c) 2016-2023 Aldy Authors. All rights reserved.
Free for non-commercial/academic use only.
Genotyping sample POC_LAB_1_S19_L00N_R1_001.true.final.bam...
ERROR: gene= cyp2d6, profile= illumina, file= POC_LAB_1_S19_L00N_R1_001.true.final.bam
CN-neutral region 22:42151472-42152258 has no reads. Double check your input file for CYP2D8 (are you using hg19?), or pass an alternative CN-neutral region via -n parameter.
I provide an updated copy number region (chr1:10000-20000) and the algorithm still looks for the internally defined region (22:42151472-42152258) then fails when it is not found.
PS: I've noticed similar behavior when trying to specify hg38 as the genome using --genome hg38. The algorithm still runs with hg19 as if I did not specify.
Hi,
I would like to add the CES1 and CES2 genes for variant and CNV detection. Is that possible? Do I just need to create a new YAML file?
Is it possible to merge BAM files so that Aldy reads them all at once, or do BAM files have to be read one by one? Thank you in advance.
When attempting to use a non-default value for multiple-warn-level, the following error is generated:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/aldy/__main__.py", line 120, in main
    _genotype(gene, output, args)
  File "/usr/local/lib/python3.6/dist-packages/aldy/__main__.py", line 450, in _genotype
    run(None)
  File "/usr/local/lib/python3.6/dist-packages/aldy/__main__.py", line 412, in run
    min_cov=args.min_coverage,
  File "/usr/local/lib/python3.6/dist-packages/aldy/genotype.py", line 211, in genotype
    if multiple_warn_level >= 3 and len(cn_sols) > 1:
TypeError: '>=' not supported between instances of 'str' and 'int'
It looks like an int cast is needed in the genotype call in __main__.py.
I'd be happy to add this in a PR if you're open to external entities contributing to the project.
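A minimal sketch of the suggested fix, assuming the option reaches the comparison as a string because the command-line parser does not convert it. The parser below is a stand-in written for illustration, not Aldy's actual argument handling, and its default value is an assumption; only the option name and the `>= 3` comparison come from the report above.

```python
import argparse


def build_parser():
    """Hypothetical stand-in parser; not Aldy's real argument definitions."""
    parser = argparse.ArgumentParser()
    # type=int makes argparse convert the value, so later numeric
    # comparisons never see a str. The default of 1 is an assumption.
    parser.add_argument("--multiple-warn-level", type=int, default=1)
    return parser


def should_warn(multiple_warn_level, cn_sols):
    """Same comparison as in the traceback, with a defensive int cast."""
    return int(multiple_warn_level) >= 3 and len(cn_sols) > 1


args = build_parser().parse_args(["--multiple-warn-level", "3"])
print(should_warn(args.multiple_warn_level, ["sol_a", "sol_b"]))
```

Either change alone (converting at parse time or casting at the call site) would avoid the `TypeError`; converting at parse time also rejects non-numeric values early.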
Hi,
May I ask how to fix the "ImportError while loading conftest" issue? I have tried installing all of Aldy's components in a separate Anaconda environment, but it keeps giving me this error.
Hello,
Do you think it would be a good idea to subset the human reference genome file so that it includes only PGx regions plus some regions for normalization? Is this feasible, and will Aldy work with an approach like that?
I want to speed up the process using WGS data.
Dear
Thanks for the great tool. We know that the combination of Aldy with Nanopore data is not fully supported yet, however, could you help me out with this error?
EDIT: I added the debug .tar file for CYP2A6.
DEBUG.tar.gz
I created a specific profile for our nanopore data by using
aldy profile /data/projects/pass_sorted.bam --genome hg38 > nanopore.profile
Next, I used the code to get the diplotypes for my data:
aldy genotype -p nanopore.profile --genome hg38 pass_sorted.bam
For some genes (e.g. COMT), Aldy runs successfully. For others, it gives an error ('IndexError: string index out of range') as shown below. Sometimes a result is provided despite the error (e.g. CFTR), sometimes not (e.g. CYP2A6). Examples are given below. Any advice is much appreciated!
🐿 Aldy v4.4 (Python 3.7.12 on Linux 3.10.0-1062.12.1.el7.x86_64-x86_64-with-centos-7.7.1908-Core)
(c) 2016-2023 Aldy Authors. All rights reserved.
Free for non-commercial/academic use only.
Genotyping sample pass_sorted.bam...
Gene CFTR
IndexError: string index out of range
Exception ignored in: 'aldy.indelpost.utilities.count_lowqual_non_ref_bases'
IndexError: string index out of range
[... the three lines above repeat 60 more times ...]
Potential CFTR gene structures for pass_sorted:
1: 2x1 (confidence: 100%)
Potential major CFTR star-alleles for pass_sorted:
1: 2xWT (confidence: 100%)
Best CFTR star-alleles for pass_sorted:
1: *WT / *WT (confidence=100%)
Minor alleles: *WT, *WT
CFTR results:
Gene COMT
Potential COMT gene structures for pass_sorted:
1: 2x1 (confidence: 100%)
Potential major COMT star-alleles for pass_sorted:
1: 1xMet, 1x*ValA (confidence: 100%)
Best COMT star-alleles for pass_sorted:
1: *Met / *ValA (confidence=100%)
Minor alleles: *(Met +rs174699 +rs165599 +rs165728), *(ValA +rs2020917 +rs13306278 +rs737866 +rs737865 +rs737864 +rs5746849 +rs740603 +rs4646312 +rs2239393 +rs174699 +rs9332377 +rs165728)
COMT results:
Gene CYP2A6
Ignoring utr3 in 5.001
Ignoring utr3 in 7.001
Ignoring utr3 in 8.001
Ignoring utr3 in 10.001
Ignoring utr3 in 19.001
Ignoring utr3 in 24.001
Ignoring utr3 in 24.002
Ignoring utr3 in 28.001
Ignoring utr3 in 28.002
Ignoring utr3 in 35.001
Ignoring utr3 in 35.002
Ignoring utr3 in 36.001
Ignoring utr3 in 37.001
IndexError: string index out of range
Exception ignored in: 'aldy.indelpost.utilities.count_lowqual_non_ref_bases'
IndexError: string index out of range
[... the three lines above repeat 53 more times ...]
ERROR: gene= all, file= pass_sorted.bam
IndexError('string index out of range')
Traceback (most recent call last):
  File "/home/kdserran/miniconda3/envs/PGx/lib/python3.7/site-packages/aldy/__main__.py", line 122, in main
    _genotype(args.gene, output, args)
  File "/home/kdserran/miniconda3/envs/PGx/lib/python3.7/site-packages/aldy/__main__.py", line 443, in _genotype
    run(None)
  File "/home/kdserran/miniconda3/envs/PGx/lib/python3.7/site-packages/aldy/__main__.py", line 403, in run
    **{k: v for k, v in params.items() if v is not None},
  File "/home/kdserran/miniconda3/envs/PGx/lib/python3.7/site-packages/aldy/genotype.py", line 140, in genotype
    **params,
  File "/home/kdserran/miniconda3/envs/PGx/lib/python3.7/site-packages/aldy/genotype.py", line 186, in genotype
    sample = sam.Sample(gene, profile, sam_path, reference, debug)
  File "/home/kdserran/miniconda3/envs/PGx/lib/python3.7/site-packages/aldy/sam.py", line 111, in __init__
    norm, muts = self._load_sam(path, reference, debug)
  File "/home/kdserran/miniconda3/envs/PGx/lib/python3.7/site-packages/aldy/sam.py", line 162, in _load_sam
    self._realign_indels(tmp, sam, reference)
  File "/home/kdserran/miniconda3/envs/PGx/lib/python3.7/site-packages/aldy/sam.py", line 395, in _realign_indels
    exact_match_for_shiftable=exact_match_for_shiftable,
  File "aldy/indelpost/varaln.pyx", line 169, in aldy.indelpost.varaln.VariantAlignment.__cinit__
  File "aldy/indelpost/varaln.pyx", line 205, in aldy.indelpost.varaln.VariantAlignment.__parse_pileup
  File "aldy/indelpost/gappedaln.pyx", line 34, in aldy.indelpost.gappedaln.find_by_normalization
  File "aldy/indelpost/gappedaln.pyx", line 130, in aldy.indelpost.gappedaln.is_target_by_normalization
  File "aldy/indelpost/localn.pyx", line 123, in aldy.indelpost.localn.findall_mismatches
IndexError: string index out of range
The following things are to be implemented for Aldy 2.0:
Improvements:
Integration:
Misc:
Bug fixes:
Hello,
I am currently using ALDY for genotyping my samples, and I have come across an issue regarding the detection of insertion and deletion variants in the samples. It seems that ALDY is unable to identify variants with insertions or deletions, which leads to incorrect star allele assignments.
Examples of the problem are as follows:
(1) In the case of CYP2C9*6, the variant rs9332131 is a deletion of 'A'. However, ALDY fails to detect this deletion and wrongly assigns *6 as *1.
(rs9332131 recorded in cyp2c9.yml as [16126, delA, rs9332131, K273fs])
(2) For CYP3A5*7.001, one of its variants (rs41303343) contains an insertion 'T' that ALDY cannot detect, resulting in misidentification of *7 as *1.002.
(3) UGT1A1 *28, UGT1A1 *36, and UGT1A1 *37 all have rs3064744 with insertions or deletions of 'TA', and ALDY is unable to detect these, failing to identify the correct alleles.
(4) Additionally, in the case of CYP2C19 *39, one of the ten variants is 4193delT, causing ALDY to produce a result of (CYP2C19 *39.001 - rs17880036).
Is there any specific configuration or setting that needs to be adjusted in ALDY to enable the correct detection of insertion and deletion variants? Your guidance and support would be highly appreciated.
aldy genotype -p illumina --gene cyp2c9 --genome hg38 my.vcf -o my.cyp2c9.aldy
Thank you for your attention and support.
I have a working version of Aldy 1.2 but wanted to try 3.0. When I tested version 3.0 with aldy test, I received 16 failed tests because of an assertion error in the test script test_full.py.
Changing the date in test_full.py from 2016-2020 to 2016-2021 resolved those errors.
As the title says, I think that Aldy needs chromosome names in the BAM header without the 'chr' prefix. At least for my files.
aldy profile out.bam > my_profile
*** Aldy v2.2.6 (Python 3.7.6, linux) ***
*** (c) 2016-2020 Aldy Authors & Indiana University Bloomington. All rights reserved.
*** Free for non-commercial/academic use only.
Generating profile for DPYD (1:97541297-98388616)
Cannot fetch gene DPYD (1:97541297-98388616)
Generating profile for CYP2C19 (10:96444999-96615001)
Cannot fetch gene CYP2C19 (10:96444999-96615001)
Generating profile for CYP2C9 (10:96690999-96754001)
Cannot fetch gene CYP2C9 (10:96690999-96754001)
Generating profile for CYP2C8 (10:96795999-96830001)
Cannot fetch gene CYP2C8 (10:96795999-96830001)
Generating profile for CYP4F2 (19:15618999-16009501)
Cannot fetch gene CYP4F2 (19:15618999-16009501)
Generating profile for CYP2A6 (19:41347499-41400001)
Cannot fetch gene CYP2A6 (19:41347499-41400001)
Generating profile for CYP2D6 (22:42518899-42553001)
Cannot fetch gene CYP2D6 (22:42518899-42553001)
Generating profile for TPMT (6:18126540-18157375)
Cannot fetch gene TPMT (6:18126540-18157375)
Generating profile for CYP3A5 (7:99244999-99278001)
Cannot fetch gene CYP3A5 (7:99244999-99278001)
Generating profile for CYP3A4 (7:99353999-99465001)
Cannot fetch gene CYP3A4 (7:99353999-99465001)
Thus, I transformed the header using something like
samtools view -H input.bam > header.sam
sed "s/chr//" header.sam > header_corrected.sam
samtools reheader header_corrected.sam input.bam > out.bam
Then I was able to run
samtools index out.bam
aldy profile out.bam > my_profile
*** Aldy v2.2.6 (Python 3.7.6, linux) ***
*** (c) 2016-2020 Aldy Authors & Indiana University Bloomington. All rights reserved.
*** Free for non-commercial/academic use only.
Generating profile for DPYD (1:97541297-98388616)
Generating profile for CYP2C19 (10:96444999-96615001)
Generating profile for CYP2C9 (10:96690999-96754001)
Generating profile for CYP2C8 (10:96795999-96830001)
Generating profile for CYP4F2 (19:15618999-16009501)
Generating profile for CYP2A6 (19:41347499-41400001)
Generating profile for CYP2D6 (22:42518899-42553001)
Generating profile for TPMT (6:18126540-18157375)
Generating profile for CYP3A5 (7:99244999-99278001)
Generating profile for CYP3A4 (7:99353999-99465001)
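One caveat with the sed command above is that it removes the first "chr" on every header line, including @PG or @CO lines that merely mention a path or file name containing "chr". A more conservative variant, sketched below under the assumption that only the SN: field of @SQ lines needs changing, rewrites just those fields. The function name `strip_chr_from_sq` is hypothetical, and note that this maps chrM to M rather than MT.

```python
def strip_chr_from_sq(header_text: str) -> str:
    """Drop the 'chr' prefix from SN: fields of @SQ lines only."""
    out = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            fields = line.split("\t")
            # "SN:chr" is six characters long; keep everything after it.
            fields = [
                "SN:" + f[6:] if f.startswith("SN:chr") else f
                for f in fields
            ]
            line = "\t".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


example = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:248956422\n"
    "@SQ\tSN:22\tLN:50818468\n"
    "@PG\tID:bwa\tCL:bwa mem chr_ref.fa\n"
)
fixed = strip_chr_from_sq(example)
print(fixed, end="")
```

The result can be written to header_corrected.sam and passed to samtools reheader exactly as in the commands above.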
I was looking through the allele definitions for CYP2E1 and I believe the label for one of the suballeles is incorrect. The label for CYP2E1*7.005 is the same as for CYP2E1*7.004, but the first SNP is not the same as for the *7A allele:
CYP2E1*7.005:
label: CYP2E1*7A_1B
mutations:
- [4963, G>T, rs6413420, 5'UTR]
- [15271, G>C, rs2070676]
Can someone double check this or explain it to me?
Hello!
Looks like the change log was last updated for v5.2 in Sept 2023: https://github.com/0xTCG/aldy/blob/master/README.rst?plain=1#L690
Could we get an updated changelog now that v4.5 is released?
Thanks
I want to run Aldy on targeted sequencing data, so I am planning to create a profile from a BAM file I have (the same BAM file will be used for genotyping). However, I don't know the copy-number-neutral region. Is there any way to find a copy-number-neutral region from a BAM file? Also, will Aldy make wrong calls if I run it with a profile created by aldy profile using the default (CYP2D8) copy-number-neutral region?
Hello there,
while testing Aldy v4.4 for CYP2D6 calling on samples from GeT-RM I encountered something unexpected. In a publication that used Aldy v2.2.6, sample NA21781 is correctly genotyped as *2x2/*68+*4 (supplementary materials); however, the Aldy v4.4 that I used calls this sample as *4/*63+*65.
The BAM file for this sample was downloaded from the ENA website and used as is.
Aldy was used as :
aldy genotype -p wgs -g cyp2d6 path_to_file.bam
and the output was:
🐿 Aldy v4.4 (Python 3.10.6 on Linux 5.15.0-58-generic-x86_64-with-glibc2.35)
(c) 2016-2023 Aldy Authors. All rights reserved.
Free for non-commercial/academic use only.
Genotyping sample NA21781.bam...
Potential CYP2D6 gene structures for NA21781:
1: 2x*1,1x*141.1001 (confidence: 100%)
2: 2x*1,1x*61 (confidence: 100%)
3: 2x*1,1x*63 (confidence: 100%)
Potential major CYP2D6 star-alleles for NA21781:
1: 1x*2, 1x*4.021.ALDY, 1x*65 (confidence: 100%)
2: 1x*4.021, 1x*63, 1x*65 (confidence: 100%)
Best CYP2D6 star-alleles for NA21781:
1: *4.021 / *63 + *65 (confidence=100%)
Minor alleles: *4.021, *63.001, *(65.001 +rs28371701 +rs28735595)
CYP2D6 results:
- *4.021 / *63 + *65
Minor: [*4.021] / [*63.001] + [*65.001 +rs28371701 +rs28735595]
Legacy notation: [*4.021] / [*63] + [*65 +rs28371701 +rs28735595]
Estimated activity for *65: uncertain function (evidence: L); see https://www.pharmvar.org/haplotype/187 for details
Estimated activity for *4.021: no function (evidence: D); see https://www.pharmvar.org/haplotype/652 for details
Estimated activity for *63: unknown
Could this be due to an error on my end?
Thank you.
I am trying to genotype the CYP2D6 gene from Nanopore data (targeted sequencing), but Aldy keeps giving me this error:
Traceback (most recent call last): ... /aldy/sam.py", line 389, in _realign_indels
valn = VariantAlignment( # type: ignore
IndexError: string index out of range
IndexError: string index out of range
I tried everything I could think of but couldn't fix it.
Could you please help me?
Thank you
Hi there,
Aldy has been working fine for all the other genes; only when I try UGT1A1 do I get this error.
Aldy v4.4 (Python 3.8.10 on Linux 5.15.0-1042-azure-x86_64-with-glibc2.29)
(c) 2016-2023 Aldy Authors. All rights reserved.
Free for non-commercial/academic use only.
Genotyping sample P000205.hc.vqsr.hs38.vcf.gz...
WARNING: Cannot detect genome, defaulting to hg19.
WARNING: Using VCF file. Copy-number calling is not available.
Using VCF sample P000205
ERROR: gene= UGT1A1, file= /dbfs/mnt/s3/s3/vcf/LKCGP-P000205-251422-02-04-07-G1/hs38/haplotypecaller/v4_2_5_0/P000205.hc.vqsr.hs38.vcf.gz
AttributeError("'NoneType' object has no attribute 'startswith'")
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/aldy/__main__.py", line 122, in main
_genotype(args.gene, output, args)
File "/usr/local/lib/python3.8/dist-packages/aldy/__main__.py", line 443, in _genotype
run(None)
File "/usr/local/lib/python3.8/dist-packages/aldy/__main__.py", line 393, in run
_ = genotype(
File "/usr/local/lib/python3.8/dist-packages/aldy/genotype.py", line 176, in genotype
sample = sam.Sample(gene, profile, sam_path, debug=debug)
File "/usr/local/lib/python3.8/dist-packages/aldy/sam.py", line 116, in __init__
self._make_coverage(norm, muts)
File "/usr/local/lib/python3.8/dist-packages/aldy/sam.py", line 482, in _make_coverage
self.coverage = Coverage(
File "/usr/local/lib/python3.8/dist-packages/aldy/coverage.py", line 50, in __init__
if not (indel_coverage and op.startswith("ins")):
AttributeError: 'NoneType' object has no attribute 'startswith'
Wondering if anyone else has run into this issue?
Allowing Aldy to process files that contain multiple samples would reduce the manual work of splitting large multi-sample files into one file per sample.
Hi, I'm trying Aldy for genotyping PGx genes. For some runs, I encounter this error:
maximum recursion depth exceeded in comparison
Traceback (most recent call last):
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/aldy/__main__.py", line 117, in main
_genotype(gene, output, args)
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/aldy/__main__.py", line 433, in _genotype
run(None)
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/aldy/__main__.py", line 392, in run
debug=debug,
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/aldy/genotype.py", line 185, in genotype
debug=debug,
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/aldy/major.py", line 75, in estimate_major
gene, alleles, coverage, cn_solution, solver, gap, identifier, debug
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/aldy/major.py", line 259, in solve_major_model
for status, opt, sol in model.solutions(gap):
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/aldy/lpinterface.py", line 268, in solutions
yield from self.solutions(gap, best_obj, limit, iteration + 1, init)
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/aldy/lpinterface.py", line 268, in solutions
yield from self.solutions(gap, best_obj, limit, iteration + 1, init)
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/aldy/lpinterface.py", line 268, in solutions
yield from self.solutions(gap, best_obj, limit, iteration + 1, init)
[Previous line repeated 972 more times]
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/aldy/lpinterface.py", line 267, in solutions
self.addConstr(self.quicksum(vv.values()) <= len(vv) - 1)
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/aldy/lpinterface.py", line 397, in quicksum
return self.model.Sum(expr)
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/ortools/linear_solver/pywraplp.py", line 468, in Sum
result = SumArray(expr_array)
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/ortools/linear_solver/linear_solver_natural_api.py", line 209, in __init__
self.__array = [CastToLinExp(elem) for elem in array]
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/ortools/linear_solver/linear_solver_natural_api.py", line 209, in <listcomp>
self.__array = [CastToLinExp(elem) for elem in array]
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/site-packages/ortools/linear_solver/linear_solver_natural_api.py", line 53, in CastToLinExp
if isinstance(v, numbers.Number):
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/abc.py", line 190, in __instancecheck__
subclass in cls._abc_negative_cache):
File "/home/nguyen/anaconda3/envs/aldy/lib/python3.6/_weakrefset.py", line 75, in __contains__
return wr in self.data
RecursionError: maximum recursion depth exceeded in comparison
I found that the error comes from reaching the maximum recursion depth, and that the limit can be raised by using sys.setrecursionlimit(1500), but I have also heard that doing so is not safe. Can this be improved?
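The traceback shows `solutions` in lpinterface.py re-entering itself through `yield from self.solutions(..., iteration + 1, ...)`, so each additional solution costs one more stack frame. A general way to avoid that, independent of the recursion limit, is to enumerate solutions in a loop. The sketch below is a generic illustration of that pattern with a toy solver. It is not Aldy's code, and whether it maps cleanly onto Aldy's solver interface is an assumption; `enumerate_solutions` and `toy_solver` are hypothetical names.

```python
from typing import Callable, Iterator, List, Optional, Set, Tuple

Solution = Tuple[str, ...]


def enumerate_solutions(
    solve: Callable[[Set[Solution]], Optional[Solution]],
    limit: int = 10_000,
) -> Iterator[Solution]:
    """Yield solutions one by one using a loop instead of recursion."""
    excluded: Set[Solution] = set()
    for _ in range(limit):
        sol = solve(excluded)  # next solution not yet excluded, or None
        if sol is None:
            return
        yield sol
        excluded.add(sol)  # plays the role of the added exclusion constraint


def toy_solver(
    candidates: List[Solution],
) -> Callable[[Set[Solution]], Optional[Solution]]:
    """Stand-in for an ILP solver: return the first non-excluded candidate."""
    def solve(excluded: Set[Solution]) -> Optional[Solution]:
        for cand in candidates:
            if cand not in excluded:
                return cand
        return None
    return solve


# 2000 solutions is well past Python's default recursion limit of 1000.
candidates = [(f"*{i}",) for i in range(2000)]
found = list(enumerate_solutions(toy_solver(candidates)))
print(len(found))
```

Because the loop keeps a constant stack depth, the number of enumerated solutions is bounded only by `limit`, not by the interpreter's recursion limit.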
I am also wondering about genotyping multiple genes at once. Is it possible to do this?
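Regardless of whether the installed Aldy version offers a built-in way to do this, one simple workaround is to call aldy genotype once per gene from a small wrapper. The sketch below only uses the -p, -g and -o options that appear in the commands quoted in these reports. The output naming scheme and the helper names `build_commands` and `run_all` are hypothetical.

```python
import subprocess
from typing import List


def build_commands(
    bam: str, genes: List[str], profile: str = "wgs"
) -> List[List[str]]:
    """Build one `aldy genotype` command per gene."""
    return [
        # Output name "<bam>.<gene>.aldy" is an arbitrary illustrative choice.
        ["aldy", "genotype", "-p", profile, "-g", gene, bam,
         "-o", f"{bam}.{gene}.aldy"]
        for gene in genes
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run the commands one after another, stopping on the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = build_commands("sample.bam", ["cyp2d6", "cyp2c19"])
for cmd in commands:
    print(" ".join(cmd))
# run_all(commands)  # uncomment where aldy is installed and sample.bam exists
```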
When I run the following command :
python -m aldy genotype -p 1813PG.bam -g CYP2D6 1856PG.bam
I get the following error
'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
Traceback (most recent call last):
File "/mydata/miniconda/envs/gatk/lib/python3.6/site-packages/aldy/__main__.py", line 116, in main
_genotype(gene, output, args)
File "/mydata/miniconda/envs/gatk/lib/python3.6/site-packages/aldy/__main__.py", line 434, in _genotype
run(None)
File "/mydata/miniconda/envs/gatk/lib/python3.6/site-packages/aldy/__main__.py", line 393, in run
debug=debug,
File "/mydata/miniconda/envs/gatk/lib/python3.6/site-packages/aldy/genotype.py", line 125, in genotype
debug=debug,
File "/mydata/miniconda/envs/gatk/lib/python3.6/site-packages/aldy/sam.py", line 98, in __init__
self.detect_cn(gene, profile, cn_region)
File "/mydata/miniconda/envs/gatk/lib/python3.6/site-packages/aldy/sam.py", line 512, in detect_cn
prof = self._load_profile(profile)
File "/mydata/miniconda/envs/gatk/lib/python3.6/site-packages/aldy/sam.py", line 571, in _load_profile
for line in f:
File "/mydata/miniconda/envs/gatk/lib/python3.6/codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
I am investigating...
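One reading of this traceback: the command passes a BAM file (1813PG.bam) to -p, and `_load_profile` then tries to read it line by line as text. BAM files are BGZF-compressed, which is gzip-compatible, so they begin with the bytes 0x1f 0x8b, and 0x8b at position 1 is exactly the byte the decoder rejects. A check like the following could turn the decode error into a clearer message. It is a sketch only; `looks_gzip_compressed` and `check_profile_argument` are hypothetical helpers, not part of Aldy.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip/BGZF (BAM) file


def looks_gzip_compressed(path: str) -> bool:
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def check_profile_argument(path: str) -> None:
    """Raise a readable error if a compressed file is passed as a text profile."""
    if looks_gzip_compressed(path):
        raise ValueError(
            f"{path} looks gzip/BGZF-compressed (for example a BAM file); "
            "a plain-text profile was expected"
        )
```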
Hi there,
I have tried genotyping a CRAM file using Aldy, but I have an issue.
$ aldy genotype -p wgs -g ifnl3 1_HB00001.cram --reference GRCh38.p12.genome.fa -o test_1_HB00001-ifnl3.aldy
🐿 Aldy v4.2.1 (Python 3.9.13 on Linux 3.10.0-1062.4.1.el7.x86_64-x86_64-with-glibc2.17)
(c) 2016-2023 Aldy Authors. All rights reserved.
Free for non-commercial/academic use only.
Genotyping sample 1_HB00001.cram...
ERROR: gene= ifnl3, profile= wgs, file= 1_HB00001.cram
The average coverage of the sample is too low (1.3).
Is there any way to ignore the average coverage of the input file and just get the output when genotyping with Aldy?
Please help me.
Thank you
Hello,
I'm running Aldy on some samples I have and almost all of them worked flawlessly for all the gene profiles that are available. However, I have two samples that are both failing to run DPYD and they're receiving an identical error message:
Gurobi not found. Please install Gurobi and gurobipy Python package.
No module named 'gurobipy'
*** Aldy v1.2 (Python 3.7.0) ***
(c) 2017 SFU, MIT & IUB. All rights reserved.
Arguments:
Gene: DPYD
Profile: illumina
Threshold: 50%
Input: /data_mount/sample.bam
Output: /data_output/DPYD.aldy
Log: /data_output/DPYD.aldy.log
Phasing: False
Gurobi not found. Please install Gurobi and gurobipy Python package.
No module named 'gurobipy'
Gurobi not found. Please install Gurobi and gurobipy Python package.
No module named 'gurobipy'
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/aldy/genotype.py", line 59, in genotype
score, init_sol = protein.get_initial_solution(gene, sample, cn_sol, solver)
File "/usr/local/lib/python3.7/site-packages/aldy/protein.py", line 44, in get_initial_solution
structure
File "/usr/local/lib/python3.7/site-packages/aldy/protein.py", line 235, in solve_ilp
status, opt, solutions = c.solveAll(objective, dict(list(A.items()) + [((a, m), M[a][m]) for a in M for m in M[a]]))
File "/usr/local/lib/python3.7/site-packages/aldy/lpinterface.py", line 157, in solveAll
solutions = [tuple(sorted(y for y in x)) for x in solutions]
File "/usr/local/lib/python3.7/site-packages/aldy/lpinterface.py", line 157, in <listcomp>
solutions = [tuple(sorted(y for y in x)) for x in solutions]
TypeError: '<' not supported between instances of 'tuple' and 'str'
For setup, I'm running inside Docker using SCIP (I can send more if necessary). Thoughts?
May I ask the following, I've read the Nature and BioRxiv paper and have the following questions:
Hello,
I have several questions about Aldy genotypes and a possible bug report. I am using the latest version of Aldy with the default ILP solver. I have WGS samples that I am using to call CYP2D6.
I have a few genotype results that I need help interpreting.
I have a few samples where the stdout call from Aldy is *5/*5. However, the output .tsv file is blank. Why would this be the case? Example is shown below.
🐿 Aldy v4.3.1 (Python 3.9.15 on Linux 5.15.65+-x86_64-with-glibc2.31)
(c) 2016-2022 Aldy Authors. All rights reserved.
Free for non-commercial/academic use only.
Genotyping sample 1004207.bam...
Potential CYP2D6 gene structures for 1004207:
1: (confidence: 100%)
Potential major CYP2D6 star-alleles for 1004207:
1: (confidence: 100%)
Best CYP2D6 star-alleles for 1004207:
1: *5 / *5 (confidence=100%)
Minor alleles:
CYP2D6 results:
- *5 / *5
Minor: [*5] / [*5]
Legacy notation: [*5] / [*5]
Preparing debug archive...
I can provide a debug report for anything shown here if needed.
Hi,
I tried Aldy using 'vdr' gene region as -n instead of CYP2D8 default region. Below is the command I ran.
$aldy genotype -p illumina -g cyp2d6 -n 12:47811535-47945000 -o $out $bam
My BAM files are GRCh38 aligned. Aldy works well when I use the default CYP2D8 as the CN-neutral region. But for the VDR gene (I also tried EGFR; both are used as CN-neutral regions in the Stargazer tool) I get the error 'coverage for CYP2D6 is too low for copy number calling', even though my BAM files have ~30X coverage. Any comments on why I am getting this error?
Thanks
Best
Sumudu
Add support for VCF output
Hello,
We have a number of short read whole genome sequencing samples (~200) where Aldy is unable to assign a genotype due to low coverage. I was able to successfully run Cyrius and PyPGx on most of these samples and most have some kind of structural variant as part of a tandem. I.e *1/*68+*4 or *68/*68+4, etc. Coverage across the full dataset is roughly 35x, however it is possible that these samples have lower coverage than the rest. Is there a way to force Aldy to try to call these samples or would that just result in too much uncertainty in the call? Attached is debug file from one of the samples.
🐿 Aldy v4.5 (Python 3.9.15 on Linux 5.15.133+-x86_64-with-glibc2.31)
(c) 2016-2023 Aldy Authors. All rights reserved.
Free for non-commercial/academic use only.
Genotyping sample 1020075.bam...
Potential CYP2D6 gene structures for 1020075:
1: 2x*68 (confidence: 100%)
Potential major CYP2D6 star-alleles for 1020075:
1: 2x*68 & rs1135840 (confidence: 100%)
ERROR: gene= CYP2D6, profile= wgs, file= filtered_bams/1020075.bam
Aldy could not phase any major solution.
Possible solutions:
- Check the coverage. Extremely low coverage prevents Aldy from calling star-alleles.
- Run with --debug parameter and notify the authors of Aldy.
Preparing debug archive...
Thanks,
Andrew
Hi there,
I've generated my own profile using:
aldy profile S000029_S4842Nr1.bam > panel.profile
But when I attempt to run this with:
aldy genotype -p panel.profile -g cyp2d6 S000029_S4842Nr1.bam
It returns:
🐿 Aldy v4.4 (Python 3.10.6 on Linux 5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.35)
(c) 2016-2023 Aldy Authors. All rights reserved.
Free for non-commercial/academic use only.
Genotyping sample S000029_S4842Nr1.bam...
ERROR: gene= cyp2d6, file= S000029_S4842Nr1.bam
KeyError('hg38')
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/aldy/__main__.py", line 122, in main
_genotype(args.gene, output, args)
File "/usr/local/lib/python3.10/dist-packages/aldy/__main__.py", line 443, in _genotype
run(None)
File "/usr/local/lib/python3.10/dist-packages/aldy/__main__.py", line 393, in run
_ = genotype(
File "/usr/local/lib/python3.10/dist-packages/aldy/genotype.py", line 183, in genotype
profile = Profile.load(gene, profile_name, cn_region, **params)
File "/usr/local/lib/python3.10/dist-packages/aldy/profile.py", line 262, in load
GRange(*prof["neutral"][gene.genome]),
KeyError: 'hg38'
Just wondering if this is an error on my side and if it can be rectified?
Thank you.
Dear sir,
I tried aldy genotype -p illumina -g UGT1A1 my.bam
and got the following information:
🐿 Aldy v3.1 (Python 3.8.10 on Linux 5.4.0-121-generic-x86_64-with-glibc2.29)
(c) 2016-2022 Aldy Authors. All rights reserved.
Free for non-commercial/academic use only.
Genotyping sample my.bam...
ERROR: ugt1a1 cannot be accessed
The same message appears for the genes IFNL3, NAT2, UGT1A1, and VKORC1.
Hi,
I have exome samples prepared with an Illumina exome kit, and I ran into a problem generating a profile with several different samples. The bundled test cases run fine.
-bash-4.2$ aldy profile /home/ramsar1971/project/data/RAP028/RAP028-FAT.sorted.markdup.realign.recal.bam > aldy.asd.profile
*** Aldy v2.2.3 (Python 3.6.4, linux) ***
*** (c) 2016-2019 Aldy Authors & Indiana University Bloomington. All rights reserved.
*** Free for non-commercial/academic use only.
Generating profile for DPYD (1:97541297-98388616)
Cannot fetch gene DPYD (1:97541297-98388616)
Generating profile for CYP2C19 (10:96444999-96615001)
Cannot fetch gene CYP2C19 (10:96444999-96615001)
Generating profile for CYP2C9 (10:96690999-96754001)
Cannot fetch gene CYP2C9 (10:96690999-96754001)
Generating profile for CYP2C8 (10:96795999-96830001)
Cannot fetch gene CYP2C8 (10:96795999-96830001)
Generating profile for CYP4F2 (19:15618999-16009501)
Cannot fetch gene CYP4F2 (19:15618999-16009501)
Generating profile for CYP2A6 (19:41347499-41400001)
Cannot fetch gene CYP2A6 (19:41347499-41400001)
Generating profile for CYP2D6 (22:42518899-42553001)
Cannot fetch gene CYP2D6 (22:42518899-42553001)
Generating profile for TPMT (6:18126540-18157375)
Cannot fetch gene TPMT (6:18126540-18157375)
Generating profile for CYP3A5 (7:99244999-99278001)
Cannot fetch gene CYP3A5 (7:99244999-99278001)
Generating profile for CYP3A4 (7:99353999-99465001)
Cannot fetch gene CYP3A4 (7:99353999-99465001)
Do you have any advice?
Thanks!
Regards,
Mullin
Hi,
We've been running Aldy on our local cohort and are seeing that some alleles have the suffix .ALDY, e.g.:
cat example.aldy
# #Sample Gene SolutionID Major Minor Copy Allele Location Type Coverage Effect dbSNP Code Status
# #Solution 1: *10.002 -rs28371738, *10.004 -rs28371738 -rs28735595, *36.1001 -rs28735595
# example CYP2D6 1 *10/*36.ALDY+*10 10.002;10.004;36.1001 0 10.002 42126610 C>G -1 S486T rs1135840
# ...
How should we interpret those outputs? I.e., what's the difference between *10/*36.ALDY+*10 and *10/*36+*10?
Thanks
Hi,
I would like to report an error in the genotyping of G6PD. If I am not mistaken, G6PD is located on the X chromosome, and therefore we can have homozygous or heterozygous females, but males will only be hemizygous. I think Aldy interprets this gene as diploid in all cases, because when running it on patient exomes it does not distinguish between those with XX or XY.
Thanks
As discussed previously, the Coverage field in the output file contains only "-1" rather than the actual coverage.
Hello,
We have been evaluating Aldy 4.4 alongside other PGx callers and noticed that rs2740574 is handled a little strangely. We've seen two buckets of similar errors:
rs2740574 is present
Aldy command used in both examples:
aldy genotype \
--profile illumina \
--gene cyp3a4 \
--output test.default.cyp3a4.aldy \
--genome hg19 \
SAMPLE_1.bam \
-v T \
-l test.default.cyp3a4.aldylog
*1B/*1A, should be *1A/*1A
Full debug log: *1B/*1 test.default.cyp3a4.aldylog.txt
Aldy correctly infers the copy number solution
[cn] result= CNSol[0.00; sol=(2x*1); cn=22222222222222222222222222222] (provided)
Potential CYP3A4 gene structures for SAMPLE_1:
1: 2x*1 (confidence: 100%)
And then correctly detects a copy number close to 2 for rs2740574:
rs2740574 99382096.C>T -390G>A (cov= 344, cn= 1.9;
But in the final solution, instead of two *1A alleles, one *1A and one *1B are called:
Minor: [*1.001] / [*1.002]
Legacy notation: [*1B] / [*1A]
and both copies of rs2740574 are assigned to allele 1 (*1.002) instead of being split evenly.
*1.002/*(36.001 +rs2740574)
should be *1.002, *36.002
Full debug log: sample_2.default.cyp3a4.aldylog.txt
Aldy correctly infers the cn solution:
[cn] result= CNSol[0.00; sol=(2x*1); cn=22222222222222222222222222222] (provided)
Potential CYP3A4 gene structures for SAMPLE_2:
1: 2x*1 (confidence: 100%)
And correctly detects the rs2740574 cn:
rs2740574 99382096.C>T -390G>A (cov= 401, cn= 2.0;
But ultimately calls
Minor: [*1.002] / [*36.001 +rs2740574]
Legacy notation: [*1A] / [*36.001 +rs2740574]
*36.002 is in fact *36.001 + rs2740574?
I'm happy to provide subsetted and anonymized BAMs privately if it will help.
I have a set of 4 exomes and running aldy 4.4 on them as so:
aldy genotype -g cyp2e1 -p wxs ${bam_location}/*/${sample_name}.bam -o ${output}/${sample_name}.cyp2d6.aldy -l ${output}/${sample_name}.cyp2e1.aldylog
For 3 out of the 4 exomes I am getting the error "The average coverage of the sample is too low":
[main] arguments= subparser=genotype verbosity=INFO file=/home/ryan/NGS_Data/Exome_9-14-23/dragen_analysis/CYP-028/CYP-028.bam gene=cyp2e1 profile=wxs reference=None genome=None cn_neutral_region=None output=/home/ryan/NGS_Data/Exome_9-14-23/vcf/haplotype_aldy4_results_June_2023/CYP-028.CYP2E1.aldy solver=any debug=None cn=None log=/home/ryan/NGS_Data/Exome_9-14-23/vcf/haplotype_aldy4_results_June_2023/CYP-028.CYP2E1.aldylog multiple_warn_level=1 simple=False param=None
Genotyping sample CYP-028.bam...
WARNING: Copy-number calling is not available for exome data.
WARNING: Aldy will NOT be able to detect gene duplications, deletions and fusions.
WARNING: Calling of alleles that are defined by non-exonic mutations is not available. Results might not be biologically relevant!
[genotype] gene=cyp2e1; start=2023-09-27 18:52:17.572901
[lp] solver= cbc
[genotype] reference= hg19
[params] neutral_value=786.0; cn_parsimony=1.0; min_coverage=5.0
[sam] path= /home/ryan/NGS_Data/Exome_9-14-23/dragen_analysis/CYP-028/CYP-028.bam
[sam] Read SAM took 1.22s
[coverage] scale_ratio: 1.5
[sam] avg_coverage= 43.8x
ERROR: gene= cyp2e1, profile= wxs, file= /home/ryan/NGS_Data/Exome_9-14-23/dragen_analysis/CYP-028/CYP-028.bam
The average coverage of the sample is too low (1.9).
I have looked at the metrics for coverage for this particular sample and dragen reports 113X coverage over the target bed for the IDT exome capture probes. How should I resolve this? Thanks!
Edit: adding the debug info for one of these samples:
debuginfo.tar.gz
Hi, I am working with Aldy and need to provide the --phase parameter for more accurate calling of novel star alleles.
I used HapCUT2, following the related documentation, and obtained both "haplotype_output_file" and "haplotype_output_file.phased.VCF".
But when I use either of them as the phase parameter, I get these errors, respectively:
ERROR: gene= CYP2D6, file= A2_3_11_s53_S48.recal.bam
ValueError("invalid literal for int() with base 10: 'POS'")
and
ERROR: gene= cyp2d6, profile= illumina, file= A2_3_11_s53_S48.recal.bam
Invalid phasing line 1 in haplotype_output_file.phased.VCF (less than 7 columns)
Can anybody tell me why this is happening and what I should do about it?
Could you update Aldy's gene databases that are based on PharmVar? The current PharmVar database is version 5.1.6, while the ones in the current Aldy v3.3 are from 4.1.7. Also, a new gene has been added to PharmVar, SLCO1B1. In Aldy v3.3, the SLCO1B1 allele definitions are based on PharmGKB. Could you also update the SLCO1B1 allele definitions to be based on PharmVar instead of PharmGKB?
-Best,
Reynold
Update databases to hg38 and PharmVar
In the future, would it be possible to amend the Aldy output to use the conventional minor allele names (i.e., those from PharmVar)?
Thanks,
Tyler