wshuai294 commented on August 16, 2024

Hi, I suspect the installation may have failed. Have you run SpecHLA on the test data (in the example folder)? Also, could you please send me the log info (i.e., the information shown on the screen)?

from spechla.

Carovanandel commented on August 16, 2024

Hi, when I run SpecHLA on the test data I get a full result, so the installation seems to be okay. The log info is as follows:

bash script/whole/SpecHLA.sh -u 1 -j 12 -n sample1 -1 ../testdata/sample1.R1.fastq.gz -2 ../testdata/sample1.R2.fastq.gz -o ../testdata/output
Start profiling HLA for sample1.
use 12 threads.
map the reads to database to assign reads to corresponding genes.
Can't detect novoalign license, use bowtie2.
218851 reads; of these:
  218851 (100.00%) were paired; of these:
    119060 (54.40%) aligned concordantly 0 times
    5334 (2.44%) aligned concordantly exactly 1 time
    94457 (43.16%) aligned concordantly >1 times
    ----
    119060 pairs aligned concordantly 0 times; of these:
      0 (0.00%) aligned discordantly 1 time
    ----
    119060 pairs aligned 0 times concordantly or discordantly; of these:
      238120 mates make up the pairs; of these:
        121130 (50.87%) aligned 0 times
        70 (0.03%) aligned exactly 1 time
        116920 (49.10%) aligned >1 times
72.33% overall alignment rate
[bam_sort_core] merging from 4 files...
start assigning reads...
read bam cost 60.60901975631714
read assigment cost 76.42484998703003
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 104676 sequences (10000112 bp)...
[M::process] read 105080 sequences (10000099 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (18, 32307, 1, 13)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (80, 121, 298)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 734)
[M::mem_pestat] mean and std.dev: (117.00, 68.16)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 952)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (128, 190, 341)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 767)
[M::mem_pestat] mean and std.dev: (219.80, 128.67)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 980)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (102, 159, 901)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 2499)
[M::mem_pestat] mean and std.dev: (336.85, 391.07)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 3298)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 104676 reads in 7.196 CPU sec, 7.212 real sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:sample1\tSM:sample1 /exports/SpecHLA/script/whole/../../db/HLA/HLA_A/HLA_A.fa ../testdata/sample1/A.R1.fq.gz ../testdata/sample1/A.R2.fq.gz
[main] Real time: 0.036 sec; CPU: 0.002 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:sample1\tSM:sample1 /exports/SpecHLA/script/whole/../../db/HLA/HLA_B/HLA_B.fa ../testdata/sample1/B.R1.fq.gz ../testdata/sample1/B.R2.fq.gz
[main] Real time: 0.033 sec; CPU: 0.002 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:sample1\tSM:sample1 /exports/SpecHLA/script/whole/../../db/HLA/HLA_C/HLA_C.fa ../testdata/sample1/C.R1.fq.gz ../testdata/sample1/C.R2.fq.gz
[main] Real time: 0.034 sec; CPU: 0.002 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:sample1\tSM:sample1 /exports/SpecHLA/script/whole/../../db/HLA/HLA_DPA1/HLA_DPA1.fa ../testdata/sample1/DPA1.R1.fq.gz ../testdata/sample1/DPA1.R2.fq.gz
[main] Real time: 0.035 sec; CPU: 0.002 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:sample1\tSM:sample1 /exports/SpecHLA/script/whole/../../db/HLA/HLA_DPB1/HLA_DPB1.fa ../testdata/sample1/DPB1.R1.fq.gz ../testdata/sample1/DPB1.R2.fq.gz
[main] Real time: 0.033 sec; CPU: 0.002 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:sample1\tSM:sample1 /exports/SpecHLA/script/whole/../../db/HLA/HLA_DQA1/HLA_DQA1.fa ../testdata/sample1/DQA1.R1.fq.gz ../testdata/sample1/DQA1.R2.fq.gz
[main] Real time: 0.036 sec; CPU: 0.002 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:sample1\tSM:sample1 /exports/SpecHLA/script/whole/../../db/HLA/HLA_DQB1/HLA_DQB1.fa ../testdata/sample1/DQB1.R1.fq.gz ../testdata/sample1/DQB1.R2.fq.gz
[main] Real time: 0.034 sec; CPU: 0.002 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:sample1\tSM:sample1 /exports/SpecHLA/script/whole/../../db/HLA/HLA_DRB1/HLA_DRB1.fa ../testdata/sample1/DRB1.R1.fq.gz ../testdata/sample1/DRB1.R2.fq.gz
[main] Real time: 0.035 sec; CPU: 0.002 sec
start realignment...

Attention: please ensure the platform can run gzip -l automatically, otherwise, it may not continue.
ERROR(freebayes): Could not get first alignment from target
BAM and VCF are ready.
Mean depth {'HLA_A': 0, 'HLA_B': 0, 'HLA_C': 0, 'HLA_DPA1': 0, 'HLA_DPB1': 0, 'HLA_DQA1': 0, 'HLA_DQB1': 0, 'HLA_DRB1': 0}
Minimum Minor Allele Frequency is 0.1.
Num of hete small variant is 0 in HLA_A.
No heterozygous locus, no need to phase.
Phasing of HLA_A is done!


Num of hete small variant is 0 in HLA_B.
No heterozygous locus, no need to phase.
Phasing of HLA_B is done!


Num of hete small variant is 0 in HLA_C.
No heterozygous locus, no need to phase.
Phasing of HLA_C is done!


Num of hete small variant is 0 in HLA_DPA1.
No heterozygous locus, no need to phase.
Phasing of HLA_DPA1 is done!


Num of hete small variant is 0 in HLA_DPB1.
No heterozygous locus, no need to phase.
Phasing of HLA_DPB1 is done!


Num of hete small variant is 0 in HLA_DQA1.
No heterozygous locus, no need to phase.
Phasing of HLA_DQA1 is done!


Num of hete small variant is 0 in HLA_DQB1.
No heterozygous locus, no need to phase.
Phasing of HLA_DQB1 is done!


Num of hete small variant is 0 in HLA_DRB1.
No heterozygous locus, no need to phase.
Phasing of HLA_DRB1 is done!


start annotation...
parameter:      sample:sample1       dir:../testdata/sample1       pop:Unknown     wxs:exon        G_nom:0
Traceback (most recent call last):
  File "/exports/SpecHLA/script/whole/g_group_annotation.py", line 7, in <module>
    from Bio import SeqIO
ModuleNotFoundError: No module named 'Bio'
Clean output dir.
# version: IPD-IMGT/HLA 3.38.0
Sample  HLA_A_1 HLA_A_2 HLA_B_1 HLA_B_2 HLA_C_1 HLA_C_2 HLA_DPA1_1      HLA_DPA1_2      HLA_DPB1_1      HLA_DPB1_2      HLA_DQA1_1      HLA_DQA1_2      HLA_DQB1_1      HLA_DQB1_2      HLA_DRB1_1       HLA_DRB1_2
sample1      -       -       -       -       -       -       -       -       -       -       -       -       -       -       -       -
sample1 is done.

wshuai294 commented on August 16, 2024

Hi,
Have you activated the conda environment? It reports No module named 'Bio', but this module is included in the conda env. Moreover, if the env were not activated, the run should also fail on the test data. Could you show me the log from running the test data (exon)?

Carovanandel commented on August 16, 2024

Hi, I have re-installed the conda environment (from the environment.yml file) just to be sure, but it still reports No module named 'Bio'. However, this is also reported when I run the exon test, where I do get a result (log below). Biopython is indeed installed in the environment (version 1.79, located in SpecHLA/spechla_env/lib/python3.8/site-packages), so I don't understand why it does not recognize this module.

The log running the test data:

(/exports/SpecHLA/spechla_env) [user1@res-hpc-exe008 exon]$ bash test_exon.sh
Start profiling HLA for NA06985.
use 12 threads.
map the reads to database to assign reads to corresponding genes.
Can't detect novoalign license, use bowtie2.
52771 reads; of these:
  52771 (100.00%) were paired; of these:
    27282 (51.70%) aligned concordantly 0 times
    2563 (4.86%) aligned concordantly exactly 1 time
    22926 (43.44%) aligned concordantly >1 times
    ----
    27282 pairs aligned concordantly 0 times; of these:
      14 (0.05%) aligned discordantly 1 time
    ----
    27268 pairs aligned 0 times concordantly or discordantly; of these:
      54536 mates make up the pairs; of these:
        53910 (98.85%) aligned 0 times
        21 (0.04%) aligned exactly 1 time
        605 (1.11%) aligned >1 times
48.92% overall alignment rate
start assigning reads...
read bam cost 11.641726732254028
read assigment cost 17.37773323059082
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 105542 sequences (9494975 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 12450, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (121, 144, 178)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (7, 292)
[M::mem_pestat] mean and std.dev: (150.15, 42.25)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 349)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 105542 reads in 5.260 CPU sec, 5.277 real sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 2108 sequences (198595 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 1022, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (125, 150, 185)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (5, 305)
[M::mem_pestat] mean and std.dev: (156.61, 42.93)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 365)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 2108 reads in 0.097 CPU sec, 0.097 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:NA06985\tSM:NA06985 /exports/SpecHLA/script/whole/../../db/HLA/HLA_A/HLA_A.fa /exports/SpecHLA/example/exon/output/NA06985/A.R1.fq.gz /exports/SpecHLA/example/exon/output/NA06985/A.R2.fq.gz
[main] Real time: 0.180 sec; CPU: 0.104 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 1604 sequences (152029 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 798, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (124, 150, 180)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (12, 292)
[M::mem_pestat] mean and std.dev: (154.54, 39.93)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 348)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 1604 reads in 0.049 CPU sec, 0.050 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:NA06985\tSM:NA06985 /exports/SpecHLA/script/whole/../../db/HLA/HLA_B/HLA_B.fa /exports/SpecHLA/example/exon/output/NA06985/B.R1.fq.gz /exports/SpecHLA/example/exon/output/NA06985/B.R2.fq.gz
[main] Real time: 0.115 sec; CPU: 0.055 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 2162 sequences (207653 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 1070, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (128, 149, 182)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (20, 290)
[M::mem_pestat] mean and std.dev: (155.95, 40.26)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 344)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 2162 reads in 0.100 CPU sec, 0.100 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:NA06985\tSM:NA06985 /exports/SpecHLA/script/whole/../../db/HLA/HLA_C/HLA_C.fa /exports/SpecHLA/example/exon/output/NA06985/C.R1.fq.gz /exports/SpecHLA/example/exon/output/NA06985/C.R2.fq.gz
[main] Real time: 0.174 sec; CPU: 0.107 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 1236 sequences (119293 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 618, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (126, 150, 183)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (12, 297)
[M::mem_pestat] mean and std.dev: (155.54, 42.07)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 354)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 1236 reads in 0.028 CPU sec, 0.028 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:NA06985\tSM:NA06985 /exports/SpecHLA/script/whole/../../db/HLA/HLA_DPA1/HLA_DPA1.fa /exports/SpecHLA/example/exon/output/NA06985/DPA1.R1.fq.gz /exports/SpecHLA/example/exon/output/NA06985/DPA1.R2.fq.gz
[main] Real time: 0.091 sec; CPU: 0.033 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 4786 sequences (465517 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 1940, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (123, 146, 180)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (9, 294)
[M::mem_pestat] mean and std.dev: (152.83, 42.16)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 351)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 4786 reads in 0.297 CPU sec, 0.299 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:NA06985\tSM:NA06985 /exports/SpecHLA/script/whole/../../db/HLA/HLA_DPB1/HLA_DPB1.fa /exports/SpecHLA/example/exon/output/NA06985/DPB1.R1.fq.gz /exports/SpecHLA/example/exon/output/NA06985/DPB1.R2.fq.gz
[main] Real time: 0.420 sec; CPU: 0.310 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 2552 sequences (247535 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 1276, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (131, 154, 190)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (13, 308)
[M::mem_pestat] mean and std.dev: (161.30, 42.31)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 367)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 2552 reads in 0.073 CPU sec, 0.074 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:NA06985\tSM:NA06985 /exports/SpecHLA/script/whole/../../db/HLA/HLA_DQA1/HLA_DQA1.fa /exports/SpecHLA/example/exon/output/NA06985/DQA1.R1.fq.gz /exports/SpecHLA/example/exon/output/NA06985/DQA1.R2.fq.gz
[main] Real time: 0.153 sec; CPU: 0.081 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 1668 sequences (160844 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 724, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (125, 149, 178)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (19, 284)
[M::mem_pestat] mean and std.dev: (153.48, 38.44)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 337)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 1668 reads in 0.105 CPU sec, 0.106 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:NA06985\tSM:NA06985 /exports/SpecHLA/script/whole/../../db/HLA/HLA_DQB1/HLA_DQB1.fa /exports/SpecHLA/example/exon/output/NA06985/DQB1.R1.fq.gz /exports/SpecHLA/example/exon/output/NA06985/DQB1.R2.fq.gz
[main] Real time: 0.176 sec; CPU: 0.111 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 2386 sequences (228235 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 1174, 1, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (124, 146, 176)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (20, 280)
[M::mem_pestat] mean and std.dev: (151.71, 40.31)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 332)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 2386 reads in 0.137 CPU sec, 0.138 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 12 -U 10000 -L 10000,10000 -R @RG\tID:NA06985\tSM:NA06985 /exports/SpecHLA/script/whole/../../db/HLA/HLA_DRB1/HLA_DRB1.fa /exports/SpecHLA/example/exon/output/NA06985/DRB1.R1.fq.gz /exports/SpecHLA/example/exon/output/NA06985/DRB1.R2.fq.gz
[main] Real time: 0.225 sec; CPU: 0.144 sec
start realignment...

Attention: please ensure the platform can run gzip -l automatically, otherwise, it may not continue.
BAM and VCF are ready.
Mean depth {'HLA_A': 56, 'HLA_B': 37, 'HLA_C': 48, 'HLA_DPA1': 12, 'HLA_DPB1': 39, 'HLA_DQA1': 38, 'HLA_DQB1': 21, 'HLA_DRB1': 20}
Minimum Minor Allele Frequency is 0.1.
Num of hete small variant is 30 in HLA_A.
[SpecHap 2024:03:27 10:25:20]phasing haplotype for HLA_A
3 blocks after phasing.
Start link blocks with database...
Num of all possible haps is 4.
parameter:      sample:NA06985  dir:/exports/SpecHLA/example/exon/output/NA06985     pop:Unknown     gene:HLA_A
HLA_A.0 100
HLA_A.1 99.2588932806324
HLA_A.2 49.407114624506
HLA_A.3 98.7154150197628
Select_id       HLA_A.0
Selected combination is  0
Small variant-phasing of HLA_A is done! Haplotype ratio is 0.527:0.473
Phasing of HLA_A is done!


Num of hete small variant is 26 in HLA_B.
[SpecHap 2024:03:27 10:25:24]phasing haplotype for HLA_B
3 blocks after phasing.
Start link blocks with database...
Num of all possible haps is 4.
parameter:      sample:NA06985  dir:/exports/SpecHLA/example/exon/output/NA06985     pop:Unknown     gene:HLA_B
HLA_B.0 99.2094861660079
HLA_B.1 98.5665583717448
HLA_B.2 49.0108803165183
HLA_B.3 98.7154150197628
Select_id       HLA_B.0
Selected combination is  0
Small variant-phasing of HLA_B is done! Haplotype ratio is 0.538:0.462
Phasing of HLA_B is done!


Num of hete small variant is 27 in HLA_C.
[SpecHap 2024:03:27 10:25:27]phasing haplotype for HLA_C
4 blocks after phasing.
Start link blocks with database...
Num of all possible haps is 8.
parameter:      sample:NA06985  dir:/exports/SpecHLA/example/exon/output/NA06985     pop:Unknown     gene:HLA_C
HLA_C.0 97.7339901477833
HLA_C.1 97.6847290640395
HLA_C.2 98.0295566502463
HLA_C.3 97.9802955665025
HLA_C.4 48.5714285714285
HLA_C.5 48.5714285714285
HLA_C.6 97.3891625615763
HLA_C.7 97.1428571428571
Select_id       HLA_C.2
Selected combination is  2
Small variant-phasing of HLA_C is done! Haplotype ratio is 0.481:0.519
Phasing of HLA_C is done!


Num of hete small variant is 0 in HLA_DPA1.
No heterozygous locus, no need to phase.
Phasing of HLA_DPA1 is done!


Num of hete small variant is 0 in HLA_DPB1.
No heterozygous locus, no need to phase.
Phasing of HLA_DPB1 is done!


Num of hete small variant is 0 in HLA_DQA1.
No heterozygous locus, no need to phase.
Phasing of HLA_DQA1 is done!


Num of hete small variant is 2 in HLA_DQB1.
[SpecHap 2024:03:27 10:25:35]phasing haplotype for HLA_DQB1
1 blocks after phasing.
Start link blocks with database...
Num of all possible haps is 1.
parameter:      sample:NA06985  dir:/exports/SpecHLA/example/exon/output/NA06985     pop:Unknown     gene:HLA_DQB1
HLA_DQB1.0      99.8704663212436
Select_id       HLA_DQB1.0
Selected combination is  0
Small variant-phasing of HLA_DQB1 is done! Haplotype ratio is 0.331:0.669
Phasing of HLA_DQB1 is done!


Num of hete small variant is 4 in HLA_DRB1.
[SpecHap 2024:03:27 10:25:36]phasing haplotype for HLA_DRB1
1 blocks after phasing.
Start link blocks with database...
Num of all possible haps is 1.
parameter:      sample:NA06985  dir:/exports/SpecHLA/example/exon/output/NA06985     pop:Unknown     gene:HLA_DRB1
HLA_DRB1.0      2.80260521042084
Select_id       HLA_DRB1.0
Selected combination is  0
Small variant-phasing of HLA_DRB1 is done! Haplotype ratio is 0.208:0.792
Phasing of HLA_DRB1 is done!


start annotation...
parameter:      sample:NA06985  dir:/exports/SpecHLA/example/exon/output/NA06985     pop:Unknown     wxs:exon        G_nom:0
Traceback (most recent call last):
  File "/exports/SpecHLA/script/whole/g_group_annotation.py", line 7, in <module>
    from Bio import SeqIO
ModuleNotFoundError: No module named 'Bio'
Clean output dir.
# version: IPD-IMGT/HLA 3.38.0
Sample  HLA_A_1 HLA_A_2 HLA_B_1 HLA_B_2 HLA_C_1 HLA_C_2 HLA_DPA1_1      HLA_DPA1_2      HLA_DPB1_1      HLA_DPB1_2      HLA_DQA1_1      HLA_DQA1_2      HLA_DQB1_1      HLA_DQB1_2      HLA_DRB1_1    HLA_DRB1_2
NA06985 A*03:01:01:01   A*02:01:01:01   B*07:02:01:01   B*57:01:01:01   C*06:02:23      C*07:02:25      DPA1*01:03:01:01        DPA1*01:03:01:01        DPB1*04:01:01:01        DPB1*04:01:01:01      DQA1*01:02:01:01        DQA1*01:02:01:01        DQB1*06:02:01:01        DQB1*06:02:01:01        DRB1*15:01:01:01        DRB1*15:01:01:01
NA06985 is done.

wshuai294 commented on August 16, 2024

Hi,
No module named 'Bio' is caused by the script using the system python instead of the python in the conda env. I have fixed it in the latest commit. The error in your sample seems to come from the local assembly step. In the output folder, there is a file named sample1.local_assem.log. Could you please send me its content?
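
For readers hitting the same error, one way to check which interpreter a script really runs under is a small diagnostic like the sketch below. It is not part of SpecHLA; the function name describe_interpreter is made up for illustration. It reports the interpreter path and whether a module such as Bio can be found from that interpreter.

```python
import importlib.util
import sys


def describe_interpreter(module_name="Bio"):
    """Report the running interpreter and whether module_name is importable from it."""
    # find_spec returns None for a missing top-level module instead of raising.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,  # path of the python binary in use
        "prefix": sys.prefix,          # points into the conda env when that python is used
        "module_found": spec is not None,
        "module_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the printed executable lies outside the conda env directory, the script was started with a different python than the one that has Biopython installed.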

Carovanandel commented on August 16, 2024

Thanks for the fix! The local assembly step indeed seems to have failed. Here is the log:

bash -e -o pipefail -c '/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -s 1k  -k 11 -t 12 <(cat ../output/sample1/extract.fa) <(cat ../output/sample1/extract.fa) 2> ../output/sample1/prefix2.ec.fq.gz.log | gzip -1 > ../output/sample1/prefix2.ec1.fq.gz'; \
bash -e -o pipefail -c '/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -s 1k -Rk 17 -t 12 <(cat ../output/sample1/extract.fa) ../output/sample1/prefix2.ec1.fq.gz 2>> ../output/sample1/prefix2.ec.fq.gz.log | gzip -1 > ../output/sample1/prefix2.ec.fq.gz'; \
rm -f ../output/sample1/prefix2.ec1.fq.gz
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -1s 1k -k 10 -t 12 ../output/sample1/prefix2.ec.fq.gz 2> ../output/sample1/prefix2.flt.fq.gz.log | gzip -1 > ../output/sample1/prefix2.flt.fq.gz
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/ropebwt2 -dNCr ../output/sample1/prefix2.flt.fq.gz > ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.flt.fmd.log
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/fermi2 assemble -l 40 -m 53 -t 12 ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.pre.gz.log | gzip -1 > ../output/sample1/prefix2.pre.gz
/bin/bash: line 1: 2043628 Aborted                 (core dumped) /exports/SpecHLA/script/../bin/fermikit/fermi.kit/fermi2 assemble -l 40 -m 53 -t 12 ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.pre.gz.log
     2043629 Done                    | gzip -1 > ../output/sample1/prefix2.pre.gz
make: *** [../output/sample1/prefix2.mak:34: ../output/sample1/prefix2.pre.gz] Error 134
HLA_B:1500-1800 assemble fail
bash -e -o pipefail -c '/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -s 1k  -k 11 -t 12 <(cat ../output/sample1/extract.fa) <(cat ../output/sample1/extract.fa) 2> ../output/sample1/prefix2.ec.fq.gz.log | gzip -1 > ../output/sample1/prefix2.ec1.fq.gz'; \
bash -e -o pipefail -c '/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -s 1k -Rk 17 -t 12 <(cat ../output/sample1/extract.fa) ../output/sample1/prefix2.ec1.fq.gz 2>> ../output/sample1/prefix2.ec.fq.gz.log | gzip -1 > ../output/sample1/prefix2.ec.fq.gz'; \
rm -f ../output/sample1/prefix2.ec1.fq.gz
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -1s 1k -k 10 -t 12 ../output/sample1/prefix2.ec.fq.gz 2> ../output/sample1/prefix2.flt.fq.gz.log | gzip -1 > ../output/sample1/prefix2.flt.fq.gz
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/ropebwt2 -dNCr ../output/sample1/prefix2.flt.fq.gz > ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.flt.fmd.log
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/fermi2 assemble -l 40 -m 53 -t 12 ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.pre.gz.log | gzip -1 > ../output/sample1/prefix2.pre.gz
/bin/bash: line 1: 2043688 Aborted                 (core dumped) /exports/SpecHLA/script/../bin/fermikit/fermi.kit/fermi2 assemble -l 40 -m 53 -t 12 ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.pre.gz.log
     2043689 Done                    | gzip -1 > ../output/sample1/prefix2.pre.gz
make: *** [../output/sample1/prefix2.mak:34: ../output/sample1/prefix2.pre.gz] Error 134
HLA_DPB1:10000-10500 assemble fail
bash -e -o pipefail -c '/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -s 1k  -k 11 -t 12 <(cat ../output/sample1/extract.fa) <(cat ../output/sample1/extract.fa) 2> ../output/sample1/prefix2.ec.fq.gz.log | gzip -1 > ../output/sample1/prefix2.ec1.fq.gz'; \
bash -e -o pipefail -c '/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -s 1k -Rk 17 -t 12 <(cat ../output/sample1/extract.fa) ../output/sample1/prefix2.ec1.fq.gz 2>> ../output/sample1/prefix2.ec.fq.gz.log | gzip -1 > ../output/sample1/prefix2.ec.fq.gz'; \
rm -f ../output/sample1/prefix2.ec1.fq.gz
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -1s 1k -k 10 -t 12 ../output/sample1/prefix2.ec.fq.gz 2> ../output/sample1/prefix2.flt.fq.gz.log | gzip -1 > ../output/sample1/prefix2.flt.fq.gz
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/ropebwt2 -dNCr ../output/sample1/prefix2.flt.fq.gz > ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.flt.fmd.log
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/fermi2 assemble -l 40 -m 53 -t 12 ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.pre.gz.log | gzip -1 > ../output/sample1/prefix2.pre.gz
/bin/bash: line 1: 2043745 Aborted                 (core dumped) /exports/SpecHLA/script/../bin/fermikit/fermi.kit/fermi2 assemble -l 40 -m 53 -t 12 ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.pre.gz.log
     2043746 Done                    | gzip -1 > ../output/sample1/prefix2.pre.gz
make: *** [../output/sample1/prefix2.mak:34: ../output/sample1/prefix2.pre.gz] Error 134
HLA_DQA1:5600-5850 assemble fail
bash -e -o pipefail -c '/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -s 1k  -k 11 -t 12 <(cat ../output/sample1/extract.fa) <(cat ../output/sample1/extract.fa) 2> ../output/sample1/prefix2.ec.fq.gz.log | gzip -1 > ../output/sample1/prefix2.ec1.fq.gz'; \
bash -e -o pipefail -c '/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -s 1k -Rk 17 -t 12 <(cat ../output/sample1/extract.fa) ../output/sample1/prefix2.ec1.fq.gz 2>> ../output/sample1/prefix2.ec.fq.gz.log | gzip -1 > ../output/sample1/prefix2.ec.fq.gz'; \
rm -f ../output/sample1/prefix2.ec1.fq.gz
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -1s 1k -k 10 -t 12 ../output/sample1/prefix2.ec.fq.gz 2> ../output/sample1/prefix2.flt.fq.gz.log | gzip -1 > ../output/sample1/prefix2.flt.fq.gz
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/ropebwt2 -dNCr ../output/sample1/prefix2.flt.fq.gz > ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.flt.fmd.log
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/fermi2 assemble -l 40 -m 53 -t 12 ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.pre.gz.log | gzip -1 > ../output/sample1/prefix2.pre.gz
/bin/bash: line 1: 2043803 Aborted                 (core dumped) /exports/SpecHLA/script/../bin/fermikit/fermi.kit/fermi2 assemble -l 40 -m 53 -t 12 ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.pre.gz.log
     2043804 Done                    | gzip -1 > ../output/sample1/prefix2.pre.gz
make: *** [../output/sample1/prefix2.mak:34: ../output/sample1/prefix2.pre.gz] Error 134
HLA_DQB1:3205-4115 assemble fail
bash -e -o pipefail -c '/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -s 1k  -k 11 -t 12 <(cat ../output/sample1/extract.fa) <(cat ../output/sample1/extract.fa) 2> ../output/sample1/prefix2.ec.fq.gz.log | gzip -1 > ../output/sample1/prefix2.ec1.fq.gz'; \
bash -e -o pipefail -c '/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -s 1k -Rk 17 -t 12 <(cat ../output/sample1/extract.fa) ../output/sample1/prefix2.ec1.fq.gz 2>> ../output/sample1/prefix2.ec.fq.gz.log | gzip -1 > ../output/sample1/prefix2.ec.fq.gz'; \
rm -f ../output/sample1/prefix2.ec1.fq.gz
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/bfc -1s 1k -k 10 -t 12 ../output/sample1/prefix2.ec.fq.gz 2> ../output/sample1/prefix2.flt.fq.gz.log | gzip -1 > ../output/sample1/prefix2.flt.fq.gz
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/ropebwt2 -dNCr ../output/sample1/prefix2.flt.fq.gz > ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.flt.fmd.log
/exports/SpecHLA/script/../bin/fermikit/fermi.kit/fermi2 assemble -l 40 -m 53 -t 12 ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.pre.gz.log | gzip -1 > ../output/sample1/prefix2.pre.gz
/bin/bash: line 1: 2043861 Aborted                 (core dumped) /exports/SpecHLA/script/../bin/fermikit/fermi.kit/fermi2 assemble -l 40 -m 53 -t 12 ../output/sample1/prefix2.flt.fmd 2> ../output/sample1/prefix2.pre.gz.log
     2043862 Done                    | gzip -1 > ../output/sample1/prefix2.pre.gz
make: *** [../output/sample1/prefix2.mak:34: ../output/sample1/prefix2.pre.gz] Error 134
HLA_DRB1:6700-7100 assemble fail
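
A note on reading the log above: shells report a process killed by a signal as 128 plus the signal number, so make's "Error 134" is consistent with the "Aborted (core dumped)" lines. The sketch below decodes such a status; the helper name describe_exit_status is hypothetical and the mapping relies on the platform's signal numbering.

```python
import signal


def describe_exit_status(status):
    """Map a shell-style exit status above 128 to a signal name, else None."""
    if status > 128:
        try:
            return signal.Signals(status - 128).name
        except ValueError:
            return None  # not a known signal number on this platform
    return None


print(describe_exit_status(134))
```

On Linux, 134 decodes to SIGABRT (128 + 6), meaning fermi2 aborted itself rather than returning an ordinary error code.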

wshuai294 commented on August 16, 2024

Could you please delete the line bash $dir/../clear_output.sh $outdir/ from SpecHLA.sh and rerun it? The read assignment step seems to have a problem. Please look at the fastq files in the outdir and check whether they contain reads. Alternatively, could you send me your test data so that I can test it ([email protected])?
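
A minimal sketch of the check suggested above, assuming the per-gene files are gzipped FASTQ with four lines per record and names ending in .fq.gz (as in the bwa commands in the log). The function names are illustrative and not part of SpecHLA.

```python
import gzip
from pathlib import Path


def count_fastq_reads(path):
    """Count records in a gzipped FASTQ file, assuming four lines per record."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in handle) // 4


def report_read_counts(outdir, pattern="*.fq.gz"):
    """Return {file name: read count} for every matching file in outdir."""
    return {p.name: count_fastq_reads(p) for p in sorted(Path(outdir).glob(pattern))}
```

Calling report_read_counts on the sample's output directory gives one count per file; a count of 0 for every gene would point at the read assignment step rather than at the later assembly.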

Carovanandel commented on August 16, 2024

The fastq files A.R1.fq.gz, B.R1.fq.gz, etc. are indeed empty. The files hla.allele.1.HLA_A.fasta etc. contain sequences of mostly N's, like this:

>HLA_A:1301-1373
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNGG
>HLA_A:1504-1773
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNCG
>HLA_A:2015-2290
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGG
>HLA_A:2870-3145
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGG
>HLA_A:3248-3364
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAG

I have sent you an email with my test data. Thanks in advance!
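
To put a number on "mostly N's", a sketch like the following computes the fraction of N bases per FASTA record. It parses plain FASTA text by hand; the function name n_fraction_per_record is hypothetical and nothing here comes from SpecHLA itself.

```python
def n_fraction_per_record(fasta_text):
    """Return {record name: fraction of bases that are N} for FASTA text."""
    sequences = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]          # header line starts a new record
            sequences[name] = []
        elif name is not None:
            sequences[name].append(line.upper())
    fractions = {}
    for record, parts in sequences.items():
        seq = "".join(parts)
        fractions[record] = seq.count("N") / len(seq) if seq else 0.0
    return fractions
```

Records with a fraction close to 1.0, like the ones quoted above, carry almost no called bases, which fits the zero mean depth reported in the first log.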

wshuai294 commented on August 16, 2024

Hi, this bug is caused by an uncommon read name format, such as @SRR8615409.205003 205003/2. I have fixed it. Please update to the latest commit and let me know if it works.
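
The thread does not show how the fix was implemented. Purely as an illustration of the problem, the sketch below parses a FASTQ header in which the /1 or /2 mate marker sits on a later whitespace-separated field instead of on the read id. The function parse_read_header and its behavior are assumptions, not SpecHLA code.

```python
import re

# Matches a trailing /1 or /2 mate marker at the end of a token.
_MATE_SUFFIX = re.compile(r"/([12])$")


def parse_read_header(header):
    """Split a FASTQ header into (read id, mate number or None)."""
    tokens = header.lstrip("@").split()
    if not tokens:
        return "", None
    mate = None
    for token in tokens:
        match = _MATE_SUFFIX.search(token)
        if match:
            mate = int(match.group(1))  # the marker may sit on any field
    read_id = _MATE_SUFFIX.sub("", tokens[0])
    return read_id, mate
```

With this approach, both "@read7/1" and a header shaped like the one quoted above yield a clean read id plus a mate number, so the two mates of a pair can be matched on the id alone.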

Carovanandel commented on August 16, 2024

Fantastic, it works now! Thanks
