gtc2vcf's Issues

GTC file format identifier is bad

I'm using bcftools Version: 1.10.2 (using htslib 1.10.2)

bcftools +gtc2vcf -c HumanOmniExpressExome-8-v1-0-B.csv -f human_g1k_v37.fasta test.gtc -o test.vcf
================================================================================
Reading CSV file HumanOmniExpressExome-8-v1-0-B.csv
BPM manifest file version = 0
Name of manifest = HumanOmniExpressExome-8v1_B.bpm
Number of loci = 951117
================================================================================
Reading GTC files
GTC file test.gtc format identifier is bad

First couple of lines in my gtc file

[Header]
Autocall Version        1.6.2.2
Processing Date 8/24/2012 9:10 PM
Content HumanOmniExpressExome-8v1_B.bpm
Cluster File    StCtrCEPH_OMXEX_B.egt
Gender  F
Num SNPs        951117
Total SNPs      951117
Num Samples     1
Total Samples   1
[Data]
SNP Name        Chromosome      Position        GC Score        Allele1 - Top   Allele2 - Top   Allele1 - AB    Allele2 - AB  X       Y       Raw X   Raw Y   R Illumina      Theta Illumina  bAllele Freq    Log R Ratio Illumina
200610-104      MT      212     0.4097353       A       A       A       A       3.1761038       0.037173282     23245.0       433.0   3.213277        0.0074506905    0.0020603272    0.27923167
200610-106      MT      246     0.3716166       A       A       A       A       3.1220326       0.1416725       22856.0       1400.0  3.2637053       0.028868914     0.0     0.3078592

my manifest file

HumanOmniExpressExome-8-v1-0-B.csv
Illumina, Inc.
[Heading]
Descriptor File Name,HumanOmniExpressExome-8v1_B.bpm
Assay Format,Infinium HD Super
Date Manufactured,4/21/2014
Loci Count ,951117
[Assay]
IlmnID,Name,IlmnStrand,SNP,AddressA_ID,AlleleA_ProbeSeq,AddressB_ID,AlleleB_ProbeSeq,GenomeBuild,Chr,MapInfo,Ploidy,Species,Source,SourceVersion,SourceStrand,SourceSeq,TopGenomicSeq,BeadSetID,RefStrand,Exp_Clusters
200610-104-0_B_F_1867864664,200610-104,BOT,[T/C],0095685332,CGCACCTACGTTCAATATTACAGGCGAACATACTTACTAAAGTGTGTTAA,,,37,MT,212,diploid,Homo sapiens,BGI,0,BOT,TTATTTATCGCACCTACGTTCAATATTACAGGCGAACATACTTACTAAAGTGTGTTAA[T/C]TAATTAATGCTTGTAGGACATAATAATAACAATTGAATGTCTGCACAGCCACTTTCCACACAGACATCATAACAA,TTGTTATGATGTCTGTGTGGAAAGTGGCTGTGCAGACATTCAATTGTTATTATTATGTCCTACAAGCATTAATTA[A/G]TTAACACACTTTAGTAAGTATGTTCGCCTGTAATATTGAACGTAGGTGCGATAAATAA,485,+,2
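
A note on the listing above: the file pasted as the GTC file is tab-delimited text that begins with `[Header]`, which looks like a GenomeStudio-style text report rather than a binary GTC file. Binary GTC files, as far as I know, begin with the three ASCII bytes `gtc`, and the "format identifier is bad" message is consistent with that signature being absent. A minimal shell sketch of such a check follows; the function name `is_gtc` is hypothetical and is not part of gtc2vcf.

```shell
# Hypothetical helper: succeed only if the file starts with the bytes "gtc".
# Assumption: binary GTC files begin with this three-byte identifier.
is_gtc() {
    [ "$(head -c 3 "$1" 2>/dev/null)" = "gtc" ]
}

# Usage: report on every file passed on the command line.
for f in "$@"; do
    if is_gtc "$f"; then
        echo "$f: looks like a binary GTC file"
    else
        echo "$f: not a binary GTC file (text report?)"
    fi
done
```

If the check fails for a text report like the one shown, the `--genome-studio` input option mentioned in other issues on this page may be the more suitable route, depending on the report layout.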

IDAT to GTC not working

Hello, I tried to obtain GTC files from IDAT files using the command line from the tutorial:

mono $HOME/bin/autoconvert/AutoConvert.exe $path_to_idat_folder $path_to_output_folder $manifest_file $egt_file

Unfortunately the process gives me, as you said, the normalization error. I used a custom cluster file and a custom manifest file with a ".csv" extension; could that be the cause of the error? For me it is mandatory to use custom EGT and CSV or BPM files because of some added SNPs. Is there a solution to this issue?

Thank you

IDAT not found

Hi freeseek,
Thank you for your help yesterday; I installed gtc2vcf successfully. But when I convert IDAT to GTC, the IDAT files are never found at the expected location. I tried many times but could not solve it. I checked the BPM and EGT files and I am sure they are right:
Chip Prefix (Guess),InfiniumPsychArray-24v1-1
I don't know why the IDAT files are not found. My IDAT files are named like this:
GSM3096512_200687150051_R01C01_Grn.idat
GSM3096512_200687150051_R01C01_Red.idat

This is the log:
ArrayAnalysis.NormToGenCall.CLI.App[0]
[10:25:21 2352]: Crawling /media/EXTend2018/Wanghe2019/GEO/GSE113093/GSE113093_RAW for samples ...
info: ArrayAnalysis.NormToGenCall.CLI.App[0]
[10:25:21 3578]: Number of samples to process: 103
info: ArrayAnalysis.NormToGenCall.Services.NormToGenCallSvc[0]
[10:25:21 3714]:
Starting processing...
Manifest file: /media/EXTend2018/Wanghe2019/GEO/GSE113093/InfiniumPsychArray-24v1-1_A1.bpm
Cluster file: /media/EXTend2018/Wanghe2019/GEO/GSE113093/InfiniumPsychArray-24v1-1_A1_ClusterFile.egt
Include file:
Output directory: /media/EXTend2018/Wanghe2019/GEO/GSE113093
GenCall score cutoff: 0.15
GenTrain ID: 3
Gender Estimate Settings:
Version: 2
MinAutosomalLoci : 100
MaxAutosomalLoci : 10000
MinXLoci : 20
MinYLoci : 20
AutosomalCallRateThreshold : 0.97
YIntensityThreshold : 0.3
XIntensityThreshold : 0.9
XHetRateThreshold : 0.1
Output Settings:
Output GTC: True
Output PED: False
PED tab delmited: False
PED use customer strand: False
Number of threads: 1
Buffer size: 131072

info: ArrayAnalysis.NormToGenCall.Services.NormToGenCallSvc[0]
ArrayAnalysis.NormToGenCall.Services.NormToGenCallSvc[0]
[12:33:32 8929]: Failed to normalize or gencall - GSM3096512_200687150051_R01C01: IDAT not found at location: /media/EXTend2018/Wanghe2019/GEO/GSE113093/GSE113093_RAW/GSM3096512_200687150051_Red.idat
at ArrayAnalysis.NormToGenCall.Services.SampleNormToGenCallSvc.LoadIdat(String idatPath, Manifest manifest) in /src/ArrayAnalysis.NormToGenCall.Services/Services/SampleNormToGenCallSvc.cs:line 63
at ArrayAnalysis.NormToGenCall.Services.SampleNormToGenCallSvc.Normalize(NormalizationBase normAlg, Manifest manifest, Byte[] transformLookups, Boolean needGreen, Boolean needRed, SampleData sample, String[] includeLociNames) in /src/ArrayAnalysis.NormToGenCall.Services/Services/SampleNormToGenCallSvc.cs:line 106
at ArrayAnalysis.NormToGenCall.Services.NormToGenCallSvc.<>c__DisplayClass7_0.b__2(SampleData sample) in /src/ArrayAnalysis.NormToGenCall.Services/Services/NormToGenCallSvc.cs:line 113
... Many IDAT files fail like this.

Best wishes,
Crane
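
One thing that stands out in the log: the files on disk carry a GEO accession prefix (`GSM3096512_`), while the converter looked for `GSM3096512_200687150051_Red.idat`, as if it had taken the first two underscore-separated fields as barcode and position. Assuming the converter expects the usual `<barcode>_<position>_<Grn|Red>.idat` layout, stripping the prefix may help. A minimal sketch; `strip_geo_prefix` is a hypothetical helper, and whether renaming resolves this particular failure is untested.

```shell
# Hypothetical helper: drop a leading "GSM<digits>_" accession prefix, if any.
strip_geo_prefix() {
    name=$1
    case $name in
        GSM[0-9]*_*) printf '%s\n' "${name#GSM*_}" ;;  # remove shortest "GSM..._" prefix
        *)           printf '%s\n' "$name" ;;          # leave other names unchanged
    esac
}

# Dry run: print the renames that would be performed in the current directory.
for f in GSM*_*.idat; do
    [ -e "$f" ] || continue   # skip when the glob matches nothing
    echo mv -- "$f" "$(strip_geo_prefix "$f")"
done
```

Dropping the `echo` would perform the renames; working on a copy of the folder first keeps the originals intact.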

affy2vcf: How to make the IDs of generated vcf files be rsid from annotation files given by Affymetrix but not probeset id?

Hi,

I succeeded in converting CEL files to VCF files, but I found that the ID column of the VCF files still contains probeset IDs. I have tried

bcftools annotate -a 00-All.vcf.gz -c ID xxx.vcf.gz

to annotate the ID column with rsIDs, but some SNPs still fail to be annotated because 00-All.vcf.gz does not contain all the SNPs from GenomeWideSNP_6.na35.annot.csv. Is there any way to annotate the IDs in the step

bcftools +affy2vcf \
  --no-version -Ou \
  --csv $csv_manifest_file \
  --fasta-ref $ref \
  --chps $path_to_chp_folder \
  --snp $path_to_txt_folder/AxiomGT1.snp-posteriors.txt \
  --extra $out_prefix.tsv | \
  bcftools sort -Ou -T ./bcftools-sort.XXXXXX | \
  bcftools norm --no-version -Ob -o $out_prefix.bcf -c x -f $ref && \
  bcftools index -f $out_prefix.bcf

or to transform GenomeWideSNP_6.na35.annot.csv into a VCF annotation file? Thank you!

Failed to open file "Ou" : No such file or directory

Hello, freeseek,

I cannot seem to convert my .gtc files to a vcf file using the following code:

bpm_manifest_file="InfiniumOmni2-5-8v1-5_A1.bpm"
csv_manifest_file="InfiniumOmni2-5-8v1-5_A1.csv"
egt_cluster_file="InfiniumOmni2-5-8v1-5_A1_ClusterFile.egt"
ref="$HOME/GRCh37/human_g1k_v37.fasta"
out_prefix="batch1_vcf"
bcftools +gtc2vcf --no-version -Ou --bpm $bpm_manifest_file --csv $csv_manifest_file --egt $egt_cluster_file --gtcs $path_to_gtc_folder --fasta-ref $ref --extra $out_prefix.tsv | bcftools sort -Ou -T ./bcftools-sort.XXXXXX | bcftools norm --no-version -Ob -c x -f $ref | tee $out_prefix.bcf | bcftools index --force --output $out_prefix.bcf.csi

The error that I receive is [E::hts_open_format] Failed to open file "Ou" : No such file or directory
Reading BPM file InfiniumOmni2-5-8v1-5_A1.bpm
Could not read Ou
Failed to read from standard input: unknown file type
index: "-" is in a format that cannot be usefully indexed

I've tried adapting the command by reading through the other issues that have come up, but have had no luck creating a BCF file larger than 0 bytes. May I ask for assistance in resolving this issue? I should mention that the manifest and cluster files provided by Illumina are in the same directory in which I am running this command.

Thank you,
Chris
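
In the snippet as posted, `path_to_gtc_folder` is used but never assigned, so `--gtcs` would receive no value unless the variable was set elsewhere in the session. Whether that explains the `"Ou"` message is not established here; still, a guard like the one below makes such gaps visible before the pipeline runs. This is only a sketch, and `require_set` is a hypothetical helper.

```shell
# Hypothetical helper: fail if any of the named shell variables is empty or unset.
require_set() {
    for name in "$@"; do
        eval "value=\${$name:-}"   # look up the variable whose name is in $name
        if [ -z "$value" ]; then
            echo "variable $name is empty or unset" >&2
            return 1
        fi
    done
}

# Usage (variable names taken from the command above):
# require_set bpm_manifest_file csv_manifest_file egt_cluster_file \
#             path_to_gtc_folder ref out_prefix || exit 1
```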

compressed VCF

Hi,

When converting CHP files to BCF, this is the command I use:

  1. bcftools +affy2vcf \
  2. --no-version -Ou \
  3. --csv "GenomeWideSNP_6.na35.annot.csv" \
  4. --fasta-ref "human_g1k_v37.fasta" \
  5. --chps /home/user/project/cc-chp/NAME \
  6. --snp /home/user/project/AxiomGT1.snp-posteriors.txt \
  7. --extra NAME.tsv | \
  8. bcftools sort -Ou -T ./bcftools-sort.XXXXXX | \
  9. bcftools norm --no-version -Ob -o NAME.vcf -c x -f "human_g1k_v37.fasta" && \
  10. bcftools index -f NAME.vcf

I was wondering: if I want the output in VCF format instead, do I need to change lines 2, 8 and 9 to "-Ov", "-Ov" and "-Oz", respectively? As I understand it, "-Ov" and "-Oz" are for VCF, whereas "-Ou" and "-Ob" are for BCF.

If this is correct, it would look like this:

  1. bcftools +affy2vcf \
  2. --no-version -Ov \
  3. --csv "GenomeWideSNP_6.na35.annot.csv" \
  4. --fasta-ref "human_g1k_v37.fasta" \
  5. --chps /home/user/project/cc-chp/NAME \
  6. --snp /home/user/project/AxiomGT1.snp-posteriors.txt \
  7. --extra NAME.tsv | \
  8. bcftools sort -Ov -T ./bcftools-sort.XXXXXX | \
  9. bcftools norm --no-version -Oz -o NAME.vcf -c x -f "human_g1k_v37.fasta" && \
  10. bcftools index -f NAME.vcf

When I run it this way, I get the VCF file in the end, but I also get this message:

index: "NAME.vcf" is in a format that cannot be usefully indexed

I just want to know whether the change is correct and, if it is, whether there is any way to index the file usefully.
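
On the question above: in bcftools, `-Ou` and `-Ob` select uncompressed and compressed BCF, while `-Ov` and `-Oz` select uncompressed and compressed VCF. Only the last command in the pipeline determines what is written to disk, so the intermediate `-Ou` steps can stay as they are. Compressed VCF is conventionally named `.vcf.gz`. The sketch below maps an output-type letter to the conventional suffix; `suffix_for` is a hypothetical helper, not a bcftools feature.

```shell
# Hypothetical helper: conventional file suffix for a bcftools -O letter.
suffix_for() {
    case $1 in
        b|u) echo bcf ;;      # compressed / uncompressed BCF
        z)   echo vcf.gz ;;   # bgzip-compressed VCF
        v)   echo vcf ;;      # plain-text VCF
        *)   echo "unknown output type: $1" >&2; return 1 ;;
    esac
}

out_type=z
out_file="NAME.$(suffix_for "$out_type")"
echo "$out_file"   # prints NAME.vcf.gz
```

With these values the last two steps would read `bcftools norm --no-version -O$out_type -o $out_file -c x -f "human_g1k_v37.fasta"` followed by `bcftools index -f $out_file`. As far as I know, `bcftools index` handles compressed VCF and BCF but not plain-text VCF.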

How to get Call_Freq, AAfreq, BBfreq, and AB Freq

Hi, thanks for the great tool. I have a question: I want to remove some poor-quality SNPs from the VCF file. The filtering should be based on the following thresholds:

  1. "Call Freq" < 0.97
  2. "AA Freq" = 1 AND "AA T Mean" > 0.3
  3. "BB Freq" = 1 AND "BB T Dev" > 0.06

I can see that AA T Mean and BB T Dev are present in the VCF file, but I am unable to find Call Freq, AA Freq, BB Freq and AB Freq.
Please let me know how I can get these values.
Awaiting your reply.
Thanks
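
Call Freq is a per-marker cohort statistic, so it can be derived from the genotypes in the VCF even when it is not stored as a tag. A minimal sketch, assuming "Call Freq" means the fraction of samples with a non-missing genotype (GenomeStudio's exact definition may differ) and assuming the input comes from something like `bcftools query -f '%ID[\t%GT]\n'`. The function name `call_freq` is hypothetical. AA, AB and BB frequencies would additionally need the mapping from A/B alleles to REF/ALT, which this sketch does not attempt.

```shell
# Hypothetical helper: read "ID<TAB>GT<TAB>GT..." lines and print the
# fraction of genotypes that contain no missing allele (".").
call_freq() {
    awk -F'\t' '{
        called = 0
        for (i = 2; i <= NF; i++)
            if ($i !~ /\./) called++      # "./." and "./1" both count as missing
        n = NF - 1                        # number of samples on this line
        if (n > 0) printf "%s\t%.4f\n", $1, called / n
    }'
}

# Example with four samples, one of them missing:
printf 'rs1\t0/0\t0/1\t./.\t1/1\n' | call_freq   # prints rs1<TAB>0.7500
```

Markers below the 0.97 threshold could then be collected into an exclusion list by comparing the second column.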

RUN bcftools +affy2vcf --models get some error . How do i fix it?

I'm sorry to bother you.
I got the error "Probe Set AX-82929059 not found in models file" when I ran bcftools +affy2vcf.
How do I fix it? Thanks!

Can I omit the --models xxxxxx.snp-posteriors.txt option when running bcftools +affy2vcf? Does it make a difference? If I don't use the --models option, I do get a VCF file.


bcftools +affy2vcf \
  --csv ../APT-library/biobank/Axiom_BioBank1.na35.annot.csv \
  --fasta-ref ../resource-humanv37/human_g1k_v37.fasta \
  --calls ./GPS-step7-output/AxiomGT1.calls.txt \
  --confidences ./GPS-step7-output/AxiomGT1.confidences.txt \
  --summary ./GPS-step7-output/AxiomGT1.summary.txt \
  --models ./GPS-step7-output/AxiomGT1.snp-posteriors.txt \
  --output ./bcf-output/AxiomGT1.vcf

--- RUNNING LOG ---
Reading CSV file ../APT-library/biobank/Axiom_BioBank1.na35.annot.csv
Reading SNP file ./GPS-step7-output/AxiomGT1.snp-posteriors.txt
Writing VCF file
Probe Set AX-82929059 not found in models file

bcftools +affy2vcf \
  --csv ../APT-library/biobank/Axiom_BioBank1.na35.annot.csv \
  --fasta-ref ../resource-humanv37/human_g1k_v37.fasta \
  --chps ./GPS-step7-output/cc-chp/ \
  --models ./GPS-step7-output/AxiomGT1.snp-posteriors.txt \
  --output bcf0517chp.vcf

--- RUNNING LOG ---
Reading CSV file ../APT-library/biobank/Axiom_BioBank1.na35.annot.csv
Reading CHP file ./GPS-step7-output/cc-chp//xxxxxxxxxxx.chp
...
Reading SNP file ./GPS-step7-output/AxiomGT1.snp-posteriors.txt
Writing VCF file
Probe Set AX-82929059 not found in models file
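
To see how many probesets are affected, the IDs present in the calls can be compared with the IDs present in the models file. The sketch below rests on assumptions about the file layout that are not confirmed on this page: lines starting with `#` are comments, the first tab-separated column holds the probeset ID, and a suffix such as `:1` may follow the ID in the models file. `missing_probesets` is a hypothetical helper.

```shell
# Hypothetical helper.
# $1 = models (snp-posteriors) file, $2 = file with one probeset ID per line.
# Prints the IDs from $2 that have no entry in $1.
missing_probesets() {
    awk -F'\t' '
        NR == FNR {                       # first file: collect model IDs
            if ($0 !~ /^#/) { sub(/:.*/, "", $1); seen[$1] = 1 }
            next
        }
        !($1 in seen) { print $1 }        # second file: report unseen IDs
    ' "$1" "$2"
}

# Usage (paths from the command above; the ID extraction is an assumption):
# grep -v '^#' ./GPS-step7-output/AxiomGT1.calls.txt | cut -f1 | tail -n +2 > ids.txt
# missing_probesets ./GPS-step7-output/AxiomGT1.snp-posteriors.txt ids.txt
```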

Genomestudio file to vcf

Dear Freeseek,

The conversion from a GenomeStudio file to a VCF file works fine, but a lot of SNPs are missing after the conversion. I looked into this and it seems that only SNPs without any missing genotypes end up in the VCF file, but I am not sure about this yet, so I have some questions.

Is it true that gtc2vcf only keeps SNPs that have no missing genotypes? Or is there another way to handle them in this tool? And is it right to use -- for missing genotypes in the GenomeStudio file?

Thanks in advance!

More GS report queries...

Hi,

I have attempted the thankless task of using a GenomeStudio .txt file. I don't have other options.

This is my genomestudio header:

Index   Name    Address Chr     Position        GenTrain Score  59_1.GType      59_1.Score      59_1.Theta      59_1.R  59_1.X Raw      59_1.Y Raw      59_1.X  59_1.Y  59_1.B Allele Freq      59_1.Log R Ratio        59_1.Top Alleles        59_1.Import Calls       59_1.Concordance        59_1.Orig Call  59_1.CNV Value  59_1.CNV Confidence     59_1.Plus/Minus Alleles
1       rs1000000       95775890        12      126890980       0.7825049       AB      0.7878883       0.4333902       2.230212        14921   7256    1.232208        0.9980044       0.5075449       0.008405539     AG              -1                              AG
2       rs1000002       20798118        3       183635768       0.8463691       AB      0.879837        0.4056498       1.041987        7707    3384    0.5987776       0.4432094       0.4658202       -0.06460849     AG              -1                              TC

This is what I get after running with the --genome-studio option.
As you can see, the VCF almost exclusively has A/N as reference and G/N as alternative.
Counts REF: A=400K+, N=270K+, C=817. No G or T
Counts ALT: G=380K+, C=80K+, N=230K+, T=600. No A

I assume something went wrong there. If there is a fix, I would be grateful for advice.
Jakub

> ##contig=<ID=chrUn_GL000218v1,length=161147>
> ##contig=<ID=chrEBV,length=171823>
> ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
> ##FORMAT=<ID=IGC,Number=1,Type=Float,Description="Illumina GenCall Confidence Score">
> ##FORMAT=<ID=BAF,Number=1,Type=Float,Description="B Allele Frequency">
> ##FORMAT=<ID=LRR,Number=1,Type=Float,Description="Log R Ratio">
> ##bcftools_+gtc2vcfVersion=1.9+htslib-1.9
> ##bcftools_+gtc2vcfCommand=gtc2vcf -f GCA_000001405.15_GRCh38_no_alt_analysis_set.fna --genome-studio P150645.txt -o P150645.vcf; Date=Sun Apr 19 16:38:13 2020
> #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  59_1
> chr12   126890980       rs1000000       A       G       .       .       .       GT:IGC:BAF:LRR  0/1:0.787888:0.507545:0.00840554
> chr3    183635768       rs1000002       A       G       .       .       .       GT:IGC:BAF:LRR  0/1:0.879837:0.46582:-0.0646085
> chr4    95733906        rs10000023      A       N       .       .       .       GT:IGC:BAF:LRR  0/0:0.755057:0.0114496:0.0626198
> chr3    98342907        rs1000003       A       N       .       .       .       GT:IGC:BAF:LRR  0/0:0.790033:0.0316309:0.157232
> chr4    103374154       rs10000030      N       G       .       .       .       GT:IGC:BAF:LRR  1/1:0.7819:0.989484:0.0112141
> chr4    38924330        rs10000037      A       G       .       .       .       GT:IGC:BAF:LRR  0/1:0.899376:0.512505:-0.0105628
> chr4    165621955       rs10000041      A       N       .       .       .       GT:IGC:BAF:LRR  0/0:0.923617:0.0272044:0.23714
> chr4    5237152 rs10000042      N       G       .       .       .       GT:IGC:BAF:LRR  1/1:0.784419:1:0.0955165
> chr4    118948220       rs10000049      A       N       .       .       .       GT:IGC:BAF:LRR  0/0:0.432164:0.00305451:0.0203382
> chr2    237752054       rs1000007       A       N       .       .       .       GT:IGC:BAF:LRR  0/0:0.908865:0.0369862:-0.0206654
> chr4    43022222        rs10000073      A       N       .       .       .       GT:IGC:BAF:LRR  0/0:0.925892:0.0150725:0.0507056
> chr4    17348363        rs10000081      A       N       .       .       .       GT:IGC:BAF:LRR  0/0:0.905235:0:0.0410943
> chr4    21895517        rs10000092      A       G       .       .       .       GT:IGC:BAF:LRR  0/1:0.839045:0.558548:-0.333746
> chr4    53623677        rs10000105      N       G       .       .       .       GT:IGC:BAF:LRR  1/1:0.864878:0.981329:0.114554
> chr4    37796830        rs10000119      N       G       .       .       .       GT:IGC:BAF:LRR  1/1:0.907763:1:-0.069378
> chr4    109106451       rs10000124      N       C       .       .       .       GT:IGC:BAF:LRR  1/1:0.810363:0.995153:-0.108182
> chr4    80666077        rs10000154      A       N       .       .       .       GT:IGC:BAF:LRR  0/0:0.926977:0:-0.175949
> chr2    235690982       rs1000016       A       N       .       .       .       GT:IGC:BAF:LRR  0/0:0.870474:0:-0.068737
> chr4    69033099        rs10000160      N       G       .       .       .       GT:IGC:BAF:LRR  1/1:0.901467:1:0.174004

Error in names(object) <- nm gtc2vcf_plot.R

Dear freeseek,

I have some issues running the R script gtc2vcf_plot.R to generate plots. My input was first a .vcf file, but I got an error about the file format, so I compressed it to a .vcf.gz file (as suggested in the message) with the command: bgzip file.vcf. After that, I got the error below.

gtc2vcf_plot.R 2020-09-01 https://github.com/freeseek/gtc2vcf
Command: bcftools query --format [%CHROM\t%POS\t%ID\t%INFO/meanR_AA\t%INFO/meanR_AB\t%INFO/meanR_BB\t%INFO/meanTHETA_AA\t%INFO/meanTHETA_AB\t%INFO/meanTHETA_BB\t%INFO/devR_AA\t%INFO/devR_AB\t%INFO/devR_BB\t%INFO/devTHETA_AA\t%INFO/devTHETA_AB\t%INFO/devTHETA_BB\t%GT\t%X\t%Y\t%NORMX\t%NORMY\t%R\t%THETA\t%BAF\t%LRR\n]" all_qc.unphased_extra.vcf.gz -r 11:66328095-66328095
Error in names(object) <- nm :
  'names' attribute [24] must be the same length as the vector [0]
Calls: setNames
In addition: Warning message:
In fread(cmd = cmd, sep = "\t", header = FALSE, na.strings = ".",  :
  File '/tmp/RtmpaEu4mo/file13974573efd5' has size 0. Returning a NULL data.frame.
Execution halted

Thanks in advance!
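
The zero-size temporary file in the warning means the `bcftools query` call returned no records for `11:66328095-66328095`, so the script had nothing to assign names to. Possible causes include a missing index on the `.vcf.gz` (region queries with `-r` need one), a contig called `chr11` rather than `11`, or simply no record at that position; which one applies here is not known. The sketch below covers only the contig-name case. `has_contig` is a hypothetical helper that reads VCF header text on standard input, and the example header line is taken from another issue on this page.

```shell
# Hypothetical helper: succeed if the header on stdin declares contig $1.
# Note: the name is used inside a regular expression, so "." matches any character.
has_contig() {
    grep -q "^##contig=<ID=$1[,>]"
}

# Real use would be: bcftools view -h file.vcf.gz | has_contig 11
printf '##contig=<ID=chrEBV,length=171823>\n' | has_contig EBV \
    || echo "no contig named EBV (the header uses a chr prefix)"
```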

Some installation issues

Hi Giulio,

I ran into some issues when installing the tools. I'm using Ubuntu 16.04 and I'm not experienced with installing software on Ubuntu. Could you help me with them?

  1. Cannot install libicu66
sudo apt install libicu66
Reading package lists... Done
Building dependency tree       
Reading state information... Done
E: Unable to locate package libicu66

I searched on Google but did not find a package named libicu66. If I just want to convert .idat files into .vcf files (I do not have .bpm files), do I need to install this package?

  2. Cannot install gtc2vcf correctly
    I tried to use the first method to install gtc2vcf:
git clone --branch=develop --recurse-submodules git://github.com/samtools/htslib.git
git clone --branch=develop git://github.com/samtools/bcftools.git
/bin/rm -f bcftools/plugins/{gtc2vcf.{c,h},affy2vcf.c}
wget -P bcftools/plugins https://raw.githubusercontent.com/freeseek/gtc2vcf/master/{gtc2vcf.{c,h},affy2vcf.c}
cd htslib && autoheader && (autoconf || autoconf) && ./configure --disable-bz2 --disable-gcs --disable-lzma && make && cd ..
cd bcftools && make && cd ..
/bin/cp bcftools/{bcftools,plugins/{gtc,affy}2vcf.so} $HOME/bin/
export PATH="$HOME/bin:$PATH"
export BCFTOOLS_PLUGINS="$HOME/bin"

These commands all run correctly but when I tried to use

gtc2vcf

I got

gtc2vcf: command not found

When I tried

gtc2vcf.so

I got

Segmentation fault (core dumped)

My system has 16GB RAM and 8 cores. Do you think this is due to a lack of RAM?

When I tried to use the alternative method to install gtc2vcf, I got :

sudo apt install ./{libhts3_1.11-4,bcftools_1.11-1,gtc2vcf_1.11-dev}_amd64.deb
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Note, selecting 'libhts3' instead of './libhts3_1.11-4_amd64.deb'
Note, selecting 'bcftools' instead of './bcftools_1.11-1_amd64.deb'
Note, selecting 'gtc2vcf' instead of './gtc2vcf_1.11-dev_amd64.deb'
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 bcftools : Depends: libc6 (>= 2.29) but 2.23-0ubuntu11.2 is to be installed
 gtc2vcf : Depends: libc6 (>= 2.29) but 2.23-0ubuntu11.2 is to be installed
 libhts3 : Depends: libc6 (>= 2.29) but 2.23-0ubuntu11.2 is to be installed
           Depends: libdeflate0 (>= 1.0) but it is not installable
           Depends: libssl1.1 (>= 1.1.0) but it is not installable
E: Unable to correct problems, you have held broken packages.
  3. Consequently, when I tried to find BPM manifest information, I got nothing.
bcftools + gtc2vcf -i -g ~/Desktop/test/
Could not initialize , neither run or init found

Any suggestions would be greatly appreciated!

Thank you!
Xiaotong
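
A note on invocation: gtc2vcf is a bcftools plugin, so it is called as `bcftools +gtc2vcf` with no space after the `+`, as in the other commands on this page. `gtc2vcf.so` is a shared object loaded by bcftools, not a standalone program, which would explain why running it directly fails. The sketch below checks whether the plugin file is present in one of the directories listed in `BCFTOOLS_PLUGINS`; `find_plugin` is a hypothetical helper.

```shell
# Hypothetical helper: look for <name>.so in the colon-separated
# directories of BCFTOOLS_PLUGINS and print the first match.
find_plugin() {
    old_ifs=$IFS
    IFS=:
    for dir in ${BCFTOOLS_PLUGINS:-}; do
        if [ -f "$dir/$1.so" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$1.so"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

find_plugin gtc2vcf || echo "gtc2vcf.so not found in BCFTOOLS_PLUGINS" >&2
```

If the file is found but bcftools still cannot load it, the plugin and the `bcftools` binary on the `PATH` may come from different builds.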

NORMX/NORMY/R/THETA missing from GenomeStudio text output

Thanks for an excellent tool! I have been trying to use it to generate input for a CNV calling pipeline, and was pleased to discover the -Ot option for GenomeStudio text format export, which looked close enough to the format I needed. However, it seems some fields that make it to the VCF output are not exported to the text format.

Specifically, the ones I miss are NORMX/NORMY/R/THETA. I checked the code of gtcs_to_gs, and all the missing fields seem to depend on BPM_LOOKUPS being set. I couldn't see a reason why it wouldn't be though, so maybe this is the wrong track.

Exporting the same collection of GTCs to VCF had the proper format tags included.

This call:

bcftools +${GTC2VCF} \
        -Ot \
        --bpm ${BATCH1_MFT_BPM} \
        --csv ${BATCH1_MFT_CSV} \
        --egt ${BATCH1_EGT} \
        --gtcs ${GTCDIR}/${BATCH1_NAME} \
        --fasta-ref ${REF} > ${OUT_PREFIX}.FDT.tsv

Produces output with these columns (truncated):

Index
Name
Address
Chr
Position
GenTrain Score
Frac A
Frac C
Frac G
Frac T
204379800081_R02C02.GType
204379800081_R02C02.Score
204379800081_R02C02.B Allele Freq
204379800081_R02C02.Log R Ratio
204379800081_R02C02.X Raw
204379800081_R02C02.Y Raw
204379800081_R02C02.Top Alleles
204379800081_R02C02.Plus/Minus Alleles
204379800081_R02C01.GType
204379800081_R02C01.Score
204379800081_R02C01.B Allele Freq
204379800081_R02C01.Log R Ratio
204379800081_R02C01.X Raw
204379800081_R02C01.Y Raw
...

While an equivalent call requesting vcf output:

bcftools +${GTC2VCF} \
        -Ou \
        --bpm ${BATCH1_MFT_BPM} \
        --csv ${BATCH1_MFT_CSV} \
        --egt ${BATCH1_EGT} \
        --gtcs ${GTCDIR}/${BATCH1_NAME} \
        --fasta-ref ${REF} \
        --extra ${OUT_PREFIX}.tsv | \
        bcftools sort -Ou -T $TMPDIR/bcftools-sort.XXXXXX | \
        bcftools norm -Oz -o ${OUT_PREFIX}.vcf.gz -c x -f $REF

produces a VCF with the expected format tags:

GT:GQ:IGC:BAF:LRR:NORMX:NORMY:R:THETA:X:Y

Tested on the stable version from http://software.broadinstitute.org/software/gtc2vcf/ and the current github version getting the same results.

I can query the VCF to get the data I need, but thought I should report this since the behavior was unexpected.
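
For the workaround mentioned above (querying the VCF), a format string for `bcftools query` can be assembled from the FORMAT tags of interest. A minimal sketch; `query_format` is a hypothetical helper, and the tag names come from the FORMAT line shown above.

```shell
# Hypothetical helper: build a bcftools query format string that prints
# CHROM, POS, ID and then the requested FORMAT tags for every sample.
query_format() {
    fmt='%CHROM\t%POS\t%ID['
    for tag in "$@"; do
        fmt="$fmt\\t%$tag"        # appends a literal backslash-t and the tag
    done
    printf '%s\n' "$fmt]\\n"      # closes the per-sample loop, ends the line
}

query_format NORMX NORMY R THETA
# Usage: bcftools query -f "$(query_format NORMX NORMY R THETA)" "${OUT_PREFIX}.vcf.gz"
```

The backslash sequences are left for bcftools to interpret, which is why the helper prints them literally.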

idat to vcf without bpm file

Hi @freeseek ,

I don't have the BPM manifest file, but I do have the CSV manifest file. Is there any option to convert the IDAT files to GTC and VCF?

Regards,
Karthick

GTC files cannot be listed through both command interface and file list when only submitting a .txt file

Hi,

I am getting the error message "GTC files cannot be listed through both command interface and file list" even though I am only submitting a single .txt file with a list of the gtc file names. I have tried this where the actual gtc files are in the directory where I am running the script, and also where they are in their own directory. I am running on a google cloud instance and using a singularity container. Here is the code, and I have attached the gtc_list file.

`bpm_manifest_file="./GDA_PGx-8v1-0_20042614_A2.bpm"
csv_manifest_file="./ProjectDetailReport ILMN GDA 07-11-22 AMS1.csv"
egt_cluster_file="./GDA FINAL 3 plate validation reclustered 06302022.egt"
path_to_gtc_folder="./gtc_file_list.csv"
ref="./GRCh38_full_analysis_set_plus_decoy_hla.fa" # or ref="$HOME/GRCh37/human_g1k_v37.fasta"
out_prefix="206486390022"

singularity exec gtc2vcf_072922.sif bcftools +gtc2vcf \
  --no-version -Ou \
  --bpm $bpm_manifest_file \
  --csv $csv_manifest_file \
  --egt $egt_cluster_file \
  --gtcs ./gtc_list_file.txt \
  --fasta-ref $ref \
  --output $out_prefix.vcf \
  --output-type v \
  --extra $out_prefix.tsv \
  --verbose
`

Thank you
Harry
gtc_list_file.txt
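
A possible explanation, assuming the command was run as shown: the manifest and cluster file names contain spaces and the variables are expanded unquoted, so the shell splits each name into several words. The extra words would then reach the plugin as positional arguments, which it may interpret as GTC files given on the command line in addition to the `--gtcs` list. The sketch below shows only the splitting itself; `count_args` is a hypothetical helper.

```shell
# Hypothetical helper: print how many arguments the shell actually passed.
count_args() { echo "$#"; }

csv_manifest_file="./ProjectDetailReport ILMN GDA 07-11-22 AMS1.csv"

count_args --csv $csv_manifest_file     # prints 6: the name is split into 5 words
count_args --csv "$csv_manifest_file"   # prints 2: the name stays one argument
```

Quoting every expansion (`--csv "$csv_manifest_file"`, `--egt "$egt_cluster_file"`, and so on) keeps each path as a single argument.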

Feature request: alternative genome reference for --genome-studio input

Hello, and thanks for a great tool!

I am working on some older genotype data (on the PsychChip) where the IDAT files have unfortunately been lost to time, but where we do have a reasonably rich GenomeStudio text format export, and the original csv manifest file used when generating the export. I want to combine this with newer genotyping waves where we do have the IDATs, and would like to remap the markers using gtc2vcf to hopefully be done with strand and allele issues once and for all. But currently gtc2vcf does not permit --genome-studio to be used with --csv and/or --sam-flank.

Would it be possible to extend gtc2vcf to this use case, or is there some vital information I am missing that makes it a bad idea or impossible?

The GS export has columns (followed by 6-15 repeated for each sample):

1: Index
2: Name
3: Address
4: Chr
5: Position
6: S1.GType
7: S1.Score
8: S1.Theta
9: S1.R
10: S1.X Raw
11: S1.Y Raw
12: S1.X
13: S1.Y
14: S1.B Allele Freq
15: S1.Log R Ratio
16: ...

My csv manifest has columns:

1: IlmnID
2: Name
3: IlmnStrand
4: SNP
5: AddressA_ID
6: AlleleA_ProbeSeq
7: AddressB_ID
8: AlleleB_ProbeSeq
9: GenomeBuild
10: Chr
11: MapInfo
12: Ploidy
13: Species
14: Source
15: SourceVersion
16: SourceStrand
17: SourceSeq
18: TopGenomicSeq
19: BeadSetID

Problem compile affy2vcf.c

Hello,

I just tried to compile bcftools with your new plugin and got these errors:

gcc -fPIC -shared -g -Wall -O2 -I. -I../htslib    -o plugins/affy2vcf.so version.c plugins/affy2vcf.c 
In file included from plugins/affy2vcf.c:39:0:
plugins/gtc2vcf.h: In function ‘flank_reverse_complement’:
plugins/gtc2vcf.h:186:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (size_t i = 0; i < len / 2; i++) {
  ^
plugins/gtc2vcf.h:186:2: note: use option -std=c99 or -std=gnu99 to compile your code
plugins/gtc2vcf.h: In function ‘flank_left_shift’:
plugins/gtc2vcf.h:215:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (const char *ptr = middle + 2; ptr < right; ptr++)
  ^
plugins/gtc2vcf.h: In function ‘get_position’:
plugins/gtc2vcf.h:306:4: error: ‘for’ loop initial declarations are only allowed in C99 mode
    for (int k = 0; k < n_cigar && qlen > 1; k++) {
    ^
plugins/affy2vcf.c: In function ‘read_bytes’:
plugins/affy2vcf.c:79:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
   for (int i = 0; i < nbytes; i++)
   ^
plugins/affy2vcf.c: In function ‘read_string16’:
plugins/affy2vcf.c:132:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
   for (int i = 0; i < len; i++) {
   ^
plugins/affy2vcf.c: In function ‘xda_cel_print’:
plugins/affy2vcf.c:298:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
   for (int i = 0; i < xda_cel->num_cells; i++)
   ^
plugins/affy2vcf.c:308:12: error: redefinition of ‘i’
   for (int i = 0; i < xda_cel->num_masked_cells; i++)
            ^
plugins/affy2vcf.c:298:12: note: previous definition of ‘i’ was here
   for (int i = 0; i < xda_cel->num_cells; i++)
            ^
plugins/affy2vcf.c:308:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
   for (int i = 0; i < xda_cel->num_masked_cells; i++)
   ^
plugins/affy2vcf.c:317:12: error: redefinition of ‘i’
   for (int i = 0; i < xda_cel->num_outlier_cells; i++)
            ^
plugins/affy2vcf.c:308:12: note: previous definition of ‘i’ was here
   for (int i = 0; i < xda_cel->num_masked_cells; i++)
            ^
plugins/affy2vcf.c:317:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
   for (int i = 0; i < xda_cel->num_outlier_cells; i++)
   ^
plugins/affy2vcf.c: In function ‘agcc_read_data_header’:
plugins/affy2vcf.c:459:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < data_header->n_parameters; i++)
  ^
plugins/affy2vcf.c:465:11: error: redefinition of ‘i’
  for (int i = 0; i < data_header->n_parents; i++)
           ^
plugins/affy2vcf.c:459:11: note: previous definition of ‘i’ was here
  for (int i = 0; i < data_header->n_parameters; i++)
           ^
plugins/affy2vcf.c:465:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < data_header->n_parents; i++)
  ^
plugins/affy2vcf.c: In function ‘agcc_read_data_set’:
plugins/affy2vcf.c:477:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < data_set->n_parameters; i++)
  ^
plugins/affy2vcf.c:482:11: error: redefinition of ‘i’
  for (int i = 0; i < data_set->n_cols; i++) {
           ^
plugins/affy2vcf.c:477:11: note: previous definition of ‘i’ was here
  for (int i = 0; i < data_set->n_parameters; i++)
           ^
plugins/affy2vcf.c:482:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < data_set->n_cols; i++) {
  ^
plugins/affy2vcf.c:492:11: error: redefinition of ‘i’
  for (int i = 0; i < data_set->n_cols; i++) {
           ^
plugins/affy2vcf.c:482:11: note: previous definition of ‘i’ was here
  for (int i = 0; i < data_set->n_cols; i++) {
           ^
plugins/affy2vcf.c:492:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < data_set->n_cols; i++) {
  ^
plugins/affy2vcf.c: In function ‘agcc_read_data_group’:
plugins/affy2vcf.c:514:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < data_group->num_data_sets; i++)
  ^
plugins/affy2vcf.c: In function ‘agcc_init’:
plugins/affy2vcf.c:548:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < agcc->num_data_groups; i++)
  ^
plugins/affy2vcf.c: In function ‘agcc_destroy_parameters’:
plugins/affy2vcf.c:576:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < n_parameters; i++) {
  ^
plugins/affy2vcf.c: In function ‘agcc_destroy_data_header’:
plugins/affy2vcf.c:591:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < data_header->n_parents; i++)
  ^
plugins/affy2vcf.c: In function ‘agcc_destroy_data_set’:
plugins/affy2vcf.c:600:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < data_set->n_cols; i++)
  ^
plugins/affy2vcf.c: In function ‘agcc_destroy_data_group’:
plugins/affy2vcf.c:610:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < data_group->num_data_sets; i++)
  ^
plugins/affy2vcf.c: In function ‘agcc_destroy’:
plugins/affy2vcf.c:623:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < agcc->num_data_groups; i++)
  ^
plugins/affy2vcf.c: In function ‘agcc_print_parameters’:
plugins/affy2vcf.c:639:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < n_parameters; i++) {
  ^
plugins/affy2vcf.c:674:4: error: ‘for’ loop initial declarations are only allowed in C99 mode
    for (int j = 0; j < parameters[i].n_value / 2; j++)
    ^
plugins/affy2vcf.c: In function ‘agcc_print_data_header’:
plugins/affy2vcf.c:694:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < data_header->n_parents; i++)
  ^
plugins/affy2vcf.c: In function ‘agcc_print_data_set’:
plugins/affy2vcf.c:731:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < data_set->n_cols; i++)
  ^
plugins/affy2vcf.c:749:11: error: redefinition of ‘i’
  for (int i = 0; i < data_set->n_cols; i++) {
           ^
plugins/affy2vcf.c:731:11: note: previous definition of ‘i’ was here
  for (int i = 0; i < data_set->n_cols; i++)
           ^
plugins/affy2vcf.c:749:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < data_set->n_cols; i++) {
  ^
plugins/affy2vcf.c:774:11: error: redefinition of ‘i’
  for (int i = 0; i < data_set->n_rows; i++) {
           ^
plugins/affy2vcf.c:749:11: note: previous definition of ‘i’ was here
  for (int i = 0; i < data_set->n_cols; i++) {
           ^
plugins/affy2vcf.c:774:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < data_set->n_rows; i++) {
  ^
plugins/affy2vcf.c:776:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
   for (int j = 0; j < data_set->n_cols; j++) {
   ^
plugins/affy2vcf.c: In function ‘agcc_print_data_group’:
plugins/affy2vcf.c:788:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < data_group->num_data_sets; i++)
  ^
plugins/affy2vcf.c: In function ‘agcc_print’:
plugins/affy2vcf.c:799:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < agcc->num_data_groups; i++)
  ^
plugins/affy2vcf.c: In function ‘agccs_to_tsv’:
plugins/affy2vcf.c:826:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int j = 0; j < 20; j++)
  ^
plugins/affy2vcf.c:829:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < n; i++) {
  ^
plugins/affy2vcf.c:833:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
   for (int j = 0, k = 0; j < 20; j++) {
   ^
plugins/affy2vcf.c: In function ‘cels_to_tsv’:
plugins/affy2vcf.c:976:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < n; i++) {
  ^
plugins/affy2vcf.c:1004:4: error: ‘for’ loop initial declarations are only allowed in C99 mode
    for (int k = 0; k < data_header->parameters[j].n_value / 2; k++)
    ^
plugins/affy2vcf.c: In function ‘models_init’:
plugins/affy2vcf.c:1119:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < 2; i++) {
  ^
plugins/affy2vcf.c: In function ‘models_destroy’:
plugins/affy2vcf.c:1225:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < 2; i++) {
  ^
plugins/affy2vcf.c:1227:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
   for (int j = 0; j < models->n_snps[i]; j++)
   ^
plugins/affy2vcf.c: In function ‘annot_init’:
plugins/affy2vcf.c:1316:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < ncols; i++) {
  ^
plugins/affy2vcf.c:1421:5: error: ‘for’ loop initial declarations are only allowed in C99 mode
     for (int i = 1; i < ncols; i++) {
     ^
plugins/affy2vcf.c: In function ‘annot_destroy’:
plugins/affy2vcf.c:1538:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < annot->n_records; i++) {
  ^
plugins/affy2vcf.c: In function ‘report_destroy’:
plugins/affy2vcf.c:1594:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < report->n_samples; i++)
  ^
plugins/affy2vcf.c: In function ‘varitr_init_cc’:
plugins/affy2vcf.c:1645:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < n; i++) {
  ^
plugins/affy2vcf.c: In function ‘varitr_init_txt’:
plugins/affy2vcf.c:1700:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
   for (int i = 1; i < ncols; i++) {
   ^
plugins/affy2vcf.c:1716:4: error: ‘for’ loop initial declarations are only allowed in C99 mode
    for (int i = 1; i < ncols; i++) {
    ^
plugins/affy2vcf.c:1733:4: error: ‘for’ loop initial declarations are only allowed in C99 mode
    for (int i = 1; i < ncols; i++) {
    ^
plugins/affy2vcf.c: In function ‘varitr_loop’:
plugins/affy2vcf.c:1782:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
   for (int i = 0; i < varitr->nsmpl; i++) {
   ^
plugins/affy2vcf.c:1839:4: error: ‘for’ loop initial declarations are only allowed in C99 mode
    for (int i = 1; i < 1 + varitr->nsmpl; i++)
    ^
plugins/affy2vcf.c:1852:4: error: ‘for’ loop initial declarations are only allowed in C99 mode
    for (int i = 1; i < 1 + varitr->nsmpl; i++)
    ^
plugins/affy2vcf.c:1885:4: error: ‘for’ loop initial declarations are only allowed in C99 mode
    for (int i = 1; i < 1 + varitr->nsmpl; i++)
    ^
plugins/affy2vcf.c:1895:13: error: redefinition of ‘i’
    for (int i = 1; i < 1 + varitr->nsmpl; i++) {
             ^
plugins/affy2vcf.c:1885:13: note: previous definition of ‘i’ was here
    for (int i = 1; i < 1 + varitr->nsmpl; i++)
             ^
plugins/affy2vcf.c:1895:4: error: ‘for’ loop initial declarations are only allowed in C99 mode
    for (int i = 1; i < 1 + varitr->nsmpl; i++) {
    ^
plugins/affy2vcf.c: In function ‘hdr_init’:
plugins/affy2vcf.c:1949:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < n; i++) {
  ^
plugins/affy2vcf.c: In function ‘adjust_clusters’:
plugins/affy2vcf.c:2139:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < n; i++) {
  ^
plugins/affy2vcf.c: In function ‘compute_baf_lrr’:
plugins/affy2vcf.c:2228:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < n; i++) {
  ^
plugins/affy2vcf.c: In function ‘process’:
plugins/affy2vcf.c:2340:5: error: ‘for’ loop initial declarations are only allowed in C99 mode
     for (int i = 0; i < nsmpl; i++) {
     ^
plugins/affy2vcf.c:2389:4: error: ‘for’ loop initial declarations are only allowed in C99 mode
    for (int i = 0; i < 2; i++) {
    ^
plugins/affy2vcf.c: In function ‘run’:
plugins/affy2vcf.c:2708:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
   for (int i = 0; i < report->n_samples; i++) {
   ^
plugins/affy2vcf.c:2729:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < nfiles; i++) {
  ^
plugins/affy2vcf.c:2825:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
   for (int i = 0; i < nfiles; i++)
   ^
plugins/affy2vcf.c:2829:11: error: redefinition of ‘i’
  for (int i = 0; i < nfiles; i++) {
           ^
plugins/affy2vcf.c:2729:11: note: previous definition of ‘i’ was here
  for (int i = 0; i < nfiles; i++) {
           ^
plugins/affy2vcf.c:2829:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
  for (int i = 0; i < nfiles; i++) {

Any suggestions for solving this compilation problem? I have CentOS 7, and compilation of all the other plugins works perfectly fine.

Best,

Petr.

idat to vcf conversion

Hello, I have a list of idat files. I can read them in R using https://github.com/HenrikBengtsson/illuminaio
But how can I convert them into a vcf file? If I use +gtc2vcf plugin as follows:
bcftools +gtc2vcf -c /shire/databases/InfiniumOmni2-5-8v1-5_A1.csv -f /shire/databases/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna -i EGAF00000868323.idat -o test.vcf
I am getting this:
IDAT file only allowed when converting to CSV
Any help? Best, Zillur

gtc2vcf cannot open gtc files

I used the GenCall algorithm to generate GTC files. My generated GTC files are not in a human-readable format; is this supposed to be the case? (I have set the LANG variable as instructed.) My EGT and BPM files are also passed in correctly, and GenCall seems to run fine.
However, the gtc2vcf plugin is also unable to read in these gtc files.

This is the command I have used to generate gtc files:
LANG="en_US.UTF-8" $HOME/bin/iaap-cli/iaap-cli gencall /path/to/manifest/file.bpm /path/to/cluster/file.egt /path/to/output/folder --idat-folder /path/to/idat/folder/--output-gtc --gender-estimate-call-rate-threshold -0.1

Am I generating gtc files incorrectly?
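
A note on the "not readable" observation: GTC is a binary format, so it is expected that a text editor shows gibberish. A quick sanity check is to look at the first bytes of the file. The sketch below assumes that a binary GTC file starts with the ASCII bytes `gtc` (which appears to be what the "format identifier is bad" message elsewhere on this page refers to) and that a GenomeStudio text report starts with `[Header]`. The function name `sniff_genotype_file` is hypothetical and not part of gtc2vcf.

```python
def sniff_genotype_file(path):
    """Guess the file type from its first bytes.

    Assumptions (not taken from the gtc2vcf sources): a binary GTC file
    begins with b"gtc"; a GenomeStudio text report begins with b"[Header]".
    """
    with open(path, "rb") as handle:
        head = handle.read(8)
    if head.startswith(b"gtc"):
        return "gtc"
    if head.startswith(b"[Header]"):
        return "text-report"
    return "unknown"
```

If this returns "text-report" for a file passed as a GTC file, the input is a text export rather than a GTC file, which would need a different option or conversion path.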

Some packages install incorrectly on CentOS

Dear freeseek,
I am having trouble installing gtc2vcf. When I tried to install htslib, something went wrong.
$./configure
checking for gcc... /usr/local/anaconda3/bin/x86_64-conda_cos6-linux-gnu-cc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether /usr/local/anaconda3/bin/x86_64-conda_cos6-linux-gnu-cc accepts -g... yes
checking for /usr/local/anaconda3/bin/x86_64-conda_cos6-linux-gnu-cc option to accept ISO C89... none needed
checking for ranlib... /usr/local/anaconda3/bin/x86_64-conda_cos6-linux-gnu-ranlib
checking for grep that handles long lines and -e... /usr/bin/grep
checking for C compiler warning flags... -Wall
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for special C compiler options needed for large files... no
checking for _FILE_OFFSET_BITS value needed for large files... no
checking shared library type for unknown-Linux... plain .so
checking whether the compiler accepts -fvisibility=hidden... yes
checking how to run the C preprocessor... /usr/local/anaconda3/bin/x86_64-conda_cos6-linux-gnu-cpp
checking for egrep... /usr/bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for stdlib.h... (cached) yes
checking for unistd.h... (cached) yes
checking for sys/param.h... yes
checking for getpagesize... yes
checking for working mmap... yes
checking for gmtime_r... yes
checking for fsync... yes
checking for drand48... yes
checking for srand48_deterministic... no
checking whether fdatasync is declared... yes
checking for fdatasync... yes
checking for library containing log... -lm
checking for zlib.h... no
checking for inflate in -lz... no
configure: error: zlib development files not found

HTSlib uses compression routines from the zlib library http://zlib.net.
Building HTSlib requires zlib development files to be installed on the build
machine; you may need to ensure a package such as zlib1g-dev (on Debian or
Ubuntu Linux) or zlib-devel (on RPM-based Linux distributions or Cygwin)
is installed.
FAILED. This error must be resolved in order to build HTSlib successfully.
But zlib-devel is already installed; the version is zlib-devel-1.2.7-18.el7.x86_64.

I hope you can give me some help. Thank you.

Best wishes,
Crane

can't find file to patch at input line 3

Hi!
When using your fantastic tool following the README file, I get to this step and I do not know how to proceed. In fact, I skipped to the next step (compile htslib and bcftools...). In the end I can use the IDAT to GTC converter for Illumina, but I want to run the whole tool.
Could you please help me with this?

I paste the error and some additional information below.

Add patch (to allow the fixref plugin to flip BAF values) and code for plugins

/bin/rm -f bcftools/plugins/{gtc2vcf.c,affy2vcf.c,fixref.patch}
wget -P bcftools/plugins https://raw.githubusercontent.com/freeseek/gtc2vcf/master/{gtc2vcf.c,affy2vcf.c,fixref.patch}
cd bcftools/plugins && patch < fixref.patch && cd ../..

the error is:
can't find file to patch at input line 3
Perhaps you should have used the -p or --strip option?
The text leading up to this was:

|--- fixref.c 2018-09-05 12:00:00.000000000 -0500
|+++ fixref.c 2018-09-05 12:00:00.000000000 -0500

File to patch:

Possible to extract SNP table metrics?

Thank you for developing this tool! The one single Windows dependency we have is in running GenomeStudio, and getting rid of this is a huge help.

I am wondering if it would be possible to extract SNP table metrics using this tool. For instance, we often need to extract e.g. the log R ratio and B allele frequencies when using PennCNV (http://penncnv.openbioinformatics.org/en/latest/user-guide/input/), among other minor interactions with GenomeStudio. Would it be possible to extract these starting from IDAT files without ever having to interact with GenomeStudio?

Thanks again for your work!!
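
VCF records quoted elsewhere on this page carry BAF and LRR as per-sample FORMAT fields (`GQ:IGC:BAF:LRR:...`). Assuming the converted VCF has that layout, a minimal sketch for pulling those two values out of one record could look like this. The function name `baf_lrr_from_record` is hypothetical, and this is plain text parsing, not a gtc2vcf or PennCNV API.

```python
def baf_lrr_from_record(vcf_line, keys=("BAF", "LRR")):
    """Return (chrom, pos, id, [per-sample tuples]) for one VCF data line.

    Assumes a tab-separated VCF record whose FORMAT column (9th field)
    names the requested keys; missing keys are reported as ".".
    """
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, variant_id = fields[0], int(fields[1]), fields[2]
    format_keys = fields[8].split(":")
    per_sample = []
    for sample_column in fields[9:]:
        values = dict(zip(format_keys, sample_column.split(":")))
        per_sample.append(tuple(values.get(key, ".") for key in keys))
    return chrom, pos, variant_id, per_sample
```

Each returned tuple holds the values as strings, so they can be written straight into a tab-separated table of the kind intensity-based CNV callers usually expect.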

ERROR in converting CHP to VCF

Hi, developer. After installing bcftools-1.11 and gtc2vcf, I ran the following code:
/data_6t/lizhan/02.software/bcftools-1.11/bcftools +affy2vcf \
  --no-version -Ou \
  --csv $csv_manifest_file \
  --fasta-ref $ref \
  --chps $path_to_chp_folder \
  --snp $path_to_txt_folder/AxiomGT1.snp-posteriors.txt \
  --extra $out_prefix.tsv

but I got the following error messages.

Writing to ./bcftools-sort.ribgu4
/data_6t/lizhan/02.software/bcftools-1.11/plugins/affy2vcf.so:
dlopen .. /data_6t/lizhan/02.software/bcftools-1.11/plugins/affy2vcf.so: undefined symbol: set_wmode
affy2vcf:
dlopen .. affy2vcf: cannot open shared object file: No such file or directory

The bcftools plugin "affy2vcf" was not found or is not functional in
BCFTOOLS_PLUGINS="/data_6t/lizhan/02.software/bcftools-1.11/plugins".

  • Is the plugin path correct?

  • Run "bcftools plugin -l" or "bcftools plugin -lvv" for a list of available plugins.

Could not load "affy2vcf".

patch not working; vcf version is out of date

Hello, we could not apply the patch for the +gtc2vcf plugin using bcftools/1.9 on CentOS 6.

In another install attempt (our first), "MODE_SWAP" was reported as undefined in the C code.

bcftools-1.9/plugins]$ patch < fixref.patch
patching file fixref.c
Hunk #1 FAILED at 91.
Hunk #2 succeeded at 104 (offset -1 lines).
Hunk #3 FAILED at 134.
Hunk #4 FAILED at 155.
Hunk #5 succeeded at 180 (offset -5 lines).
Hunk #6 succeeded at 193 with fuzz 2 (offset -6 lines).
Hunk #7 succeeded at 236 (offset -6 lines).
Hunk #8 succeeded at 428 with fuzz 2 (offset -14 lines).
Hunk #9 succeeded at 586 (offset -14 lines).
3 out of 9 hunks FAILED -- saving rejects to file fixref.c.rej

This is with bcftools-1.9 etc.
Somehow we got something working without the patch, and it gave VCF version 3-ish, not 4.2?
Are there any plans to do more with this plugin, maybe covering indels and updating for the VCF spec?

I like the concept of making a bcftools plugin - that's kinda nifty :-)

Issue "Reading EGT file: Data block version 5 in cluster file not supported"

Hello,

After I obtained GTC files from IDAT files using the HumanCNV370 manifest (.egt and .bpm files), I ran this code to get a VCF file:

source /software/bcftools/1.9/start_bcftools.sh
bpm_manifest_file="humancnv370v1_c.bpm"
egt_cluster_file="HumanCNV370v1_C.egt"
gtc_list_file="gtc_370.txt"
ref="human_g1k_v37.fasta"
out_prefix="X"
bcftools +gtc2vcf \
  --no-version -Ov \
  -b $bpm_manifest_file \
  -e $egt_cluster_file \
  -g $gtc_list_file \
  -f $ref \
  -x $out_prefix.sex |
  bcftools sort -Ov -T ./bcftools-sort.XXXXXX |
  bcftools norm --no-version -Ov -o $out_prefix.vcf -c x -f $ref &&
  bcftools index -f $out_prefix.vcf

I get the following error:
Reading EGT file HumanCNV370v1_C.egt
Data block version 5 in cluster file not supported
[E::bcf_hdr_read] Input is not detected as bcf or vcf format
Could not read VCF/BCF headers from -
Cleaning
Failed to read from standard input: unknown file type

Can you please help me with this?

Thank you.

Error in running Affymetrix

I already have the calls, intensities, and confidences files. I am running gtc2vcf on my Affymetrix genotype calls and intensities with the code provided, but it returns this error message:
[W::bcf_record_check] Bad BCF record: Invalid CONTIG id -1

VCF lines lacking GT tag

Dear freeseek,

I installed the gtc2vcf plugin yesterday in docker:
https://gitlab.com/intelliseq/workflows/-/blob/BIOINFO-998-genotype-source/src/main/docker/task/task_gtc-to-vcf/Dockerfile
(the reference is added later).
The plugin works without raising any error, but some vcf lines don't have the GT tag:

chr1    30345446        22:24375752_CNV_GSTT1   A       C       .       .       GC=0.4625;ALLELE_A=0;ALLELE_B=1;FRAC_A=0.360656;FRAC_C=0.262295;FRAC_G=0.229508;FRAC_T=0.147541;NORM_ID=1;BEADSET_ID=1705;INTENSITY_ONLY;ASSAY_TYPE=0;GenTrain_Score=0;Orig_Score=0.68275;Cluster_Sep=0.948275;N_AA=1236;N_AB=0;N_BB=0;devR_AA=0.30742;devR_AB=0.39422;devR_BB=0.20131;devTHETA_AA=0.0121041;devTHETA_AB=0.0223607;devTHETA_BB=0.0223607;meanR_AA=2.73401;meanR_AB=3.4935;meanR_BB=2.30665;meanTHETA_AA=0.130089;meanTHETA_AB=0.554171;meanTHETA_BB=0.978252;Intensity_Threshold=0.05     GQ:IGC:BAF:LRR:NORMX:NORMY:R:THETA:X:Y   0:0:0.0246714:-0.31685:1.76758:0.427339:2.19492:0.151014:32616:2228     0:0:0.0246714:-0.31685:1.76758:0.427339:2.19492:0.151014:32616:2228
chr1    109685814       1:110228436_CNV_GSTM1   T       C       .       .       GC=0.385;ALLELE_A=0;ALLELE_B=1;FRAC_A=0.147541;FRAC_C=0;FRAC_G=0.180328;FRAC_T=0.672131;NORM_ID=0;BEADSET_ID=1625;INTENSITY_ONLY;ASSAY_TYPE=0;GenTrain_Score=0;Orig_Score=0.376871;Cluster_Sep=0.173743;N_AA=0;N_AB=0;N_BB=1239;devR_AA=0.1;devR_AB=0.1;devR_BB=0.1;devTHETA_AA=0.0223607;devTHETA_AB=0.0223607;devTHETA_BB=0.140788;meanR_AA=0.17845;meanR_AB=0.194985;meanR_BB=0.207459;meanTHETA_AA=0.0145364;meanTHETA_AB=0.297995;meanTHETA_BB=0.581454;Intensity_Threshold=0.05     GQ:IGC:BAF:LRR:NORMX:NORMY:R:THETA:X:Y  0:0:1.28708:-0.170678:0.0549625:0.129349:0.184312:0.744207:1017:614      0:0:1.28708:-0.170678:0.0549625:0.129349:0.184312:0.744207:1017:614

The program is run with this wdl:
https://gitlab.com/intelliseq/workflows/-/blob/dev/src/main/wdl/tasks/gtc-to-vcf/gtc-to-vcf.wdl

Is it intentional? This has not happened with the previous installation (bcftools11-54-gaf54707, htslib1.11-74-gb8dcbd1
and gtc2vcf cloned on 2021-01-20).
Best,
Kasia

Sample_ID from samples file not saved to VCF file

First of all, thank You very much for this excellent pipeline!

I have been able to convert IDAT files successfully to GTC, and during the conversion iaap-cli recognises the sample ID from the samples file. However, when converting from GTC to VCF, the ID is set back to "SentrixBarcode_A_SentrixPosition_A".

Samples CSV file is structured as follows:

[Data]
Sample_ID,SentrixBarcode_A,SentrixPosition_A,Path

During the iaap-cli conversion I get the message:
info: ArrayAnalysis.NormToGenCall.Services.NormToGenCallSvc[0]
[07:09:03 1893]: Writing [Sample_ID_Obfuscated] to gtc...

When I query the IDs from the converted VCF file with bcftools query -l,
I get:
[SentrixBarcode_A][SentrixPosition_A]
[SentrixBarcode_A]
[SentrixPosition_A]
[SentrixBarcode_A]_[SentrixPosition_A]
.....

I know I can annotate the VCF IDs again, but I would rather build a pipeline where this is not necessary.
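
For the renaming workaround mentioned above, the mapping can be derived from the same samples CSV. The sketch below assumes the VCF sample names have the form `SentrixBarcode_A_SentrixPosition_A`, as reported in this post, and that the CSV has a `[Data]` marker followed by the header row shown above. It produces "old_name new_name" lines, which is one of the layouts accepted by `bcftools reheader --samples`. The function names are hypothetical.

```python
import csv

def rename_pairs(sample_sheet_text):
    """Return [(old_name, new_name)] from a samples CSV with a [Data] section.

    Assumes columns Sample_ID, SentrixBarcode_A and SentrixPosition_A, and
    that the old VCF sample name is "<barcode>_<position>".
    """
    lines = [line.strip() for line in sample_sheet_text.splitlines()]
    start = lines.index("[Data]") + 1  # raises ValueError if the marker is absent
    pairs = []
    for row in csv.DictReader(lines[start:]):
        old_name = f"{row['SentrixBarcode_A']}_{row['SentrixPosition_A']}"
        pairs.append((old_name, row["Sample_ID"]))
    return pairs

def to_reheader_text(pairs):
    """Format the pairs as whitespace-separated 'old new' lines."""
    return "".join(f"{old} {new}\n" for old, new in pairs)
```

The resulting text can be written to a file and passed to the reheader step, so the renaming becomes a fixed stage of the pipeline rather than a manual fix.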

Could not initialize gtc2vcf.so, neither run or init found

Hello, I am trying to use the "Convert Illumina GTC files to VCF" example shown in the README, but I am getting this error:

Writing to .
Could not initialize gtc2vcf.so, neither run or init found
[E::bcf_hdr_read] Input is not detected as bcf or vcf format
Failed to open -: unknown file type
Failed to open -: unknown file type

Looking in the file, there is a run function defined, but no init function, and bcftools vcfplugin.c appears to be checking for both.

I am using bcftools version 1.9. Any idea what could be causing this?

Include other metrics in the vcf output

Hi there,

I'm looking for a way to include the "cluster separation" [0-1] metric in the output VCF produced using the gtc2vcf method. Could someone please tell me if this would be possible and how I could change the code to achieve this goal?

Thank you!

[E::bcf_hdr_read] Input is not detected as bcf or vcf format

Hello,

when I try to convert .gtc files to .vcf I get the error "[E::bcf_hdr_read] Input is not detected as bcf or vcf format". It seems like the .gtc header size is bigger than expected. Can you please help me to fix this error?

Thank you.

build in macOS

Dear gtc2vcf team

I was wondering whether a prebuilt binary for macOS exists.
If not, are there any instructions for building the package from source on macOS?

Thank you in advance.

Regards,
Sina

How to install wine64 without "sudo"?

Hi,

Can you tell me if there is a way to install wine64 without using the sudo command? I don't have the right to use sudo on the cluster that I use.

Thanks.

idat or gtc in command line

I would like to use IDAT files rather than GTC files. Do you have an example command line?
bcftools +gtc2vcf -Ou --bpm .bpm --egt egt --idat filelink --fasta-ref fasta --extra gtc2vcf_idat".tsv" --output gtc2vcf_idat".vcf.gz" --threads 35 --output-type z
where filelink lists each IDAT file.
The error that I obtained:
The --idat option can only be used alone or with option --gtcs
Could you explain in more detail how to use IDAT files with gtc2vcf? Which algorithms are used? What is the benefit?
Thank you

No output files generated for Illumina reports

Hello,

I have tried to use the following command to convert Illumina reports to VCF.

bcftools +gtc2vcf --genome-studio FinalReport24.txt -o GenotypeReport24.vcf

Output from the run in the terminal is only one line:

gtc2vcf 2021-06-01 https://github.com/freeseek/gtc2vcf

And the GenotypeReport24.vcf file is created but with no contents in it.

An extract from the Illumina report:

[Header]
GSGT Version	2.0.4
Processing Date	3/29/2021 4:13 PM
Content		GSA-24v3-0_A2.bpm
Num SNPs	654027
Total SNPs	654027
Num Samples	24
Total Samples	24
File 	24 of 24
[Data]
Sample Index	Sample ID	Sample Name	SNP Index	SNP Name	Chr	Position	GT Score	GC Score	Allele1 - AB	Allele2 - AB	Allele1 - Top	Allele2 - Top	Allele1 - Forward	Allele2 - Forward	Allele1 - Design	Allele2 - Design	Theta	R	X Raw	Y Raw	X	Y	B Allele Freq	Log R Ratio	SNP Aux	SNP	ILMN Strand	Top Genomic Sequence	Customer Strand
24	03-031		1	1:103380393	1	102914837	0.7987	0.8136	B	B	G	G	G	G	C	C	0.963	0.722	1101	3453	0.040	0.682	1.0000	0.3609	0	[T/C]	BOT		TOP
24	03-031		2	1:109439680	1	108897058	0.8792	0.4803	A	A	A	A	A	A	A	A	0.039	0.895	11409	497	0.843	0.052	0.0000	0.4173	0	[A/G]	TOP		TOP

I spent some hours trying to figure out what I might be doing wrong but couldn't figure it out.
Any tips on what might be going wrong with my steps are appreciated.

Thanks,
Rashindrie

Update

Tried with below command

ref="/tmp/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna"
bcftools +gtc2vcf   --no-version -Ov -o $out_prefix  --genome-studio "FinalReport24.txt" -f $ref

Output on terminal

gtc2vcf 2021-06-01 https://github.com/freeseek/gtc2vcf
Writing VCF file
Could not recognize INFO field: [Header]
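
To inspect a report like the one quoted above independently of bcftools, the `[Header]` and `[Data]` sections can be split with a few lines of code. This is a minimal sketch that assumes a tab-separated layout exactly as in the extract (key and value rows under `[Header]`, a column row followed by data rows under `[Data]`). It says nothing about which report layouts gtc2vcf itself accepts, and the function name is hypothetical.

```python
def split_final_report(text):
    """Split a GenomeStudio-style text report into (header dict, data rows).

    Header values are taken from the last non-empty tab-separated cell, so
    rows such as "Content<TAB><TAB>GSA-24v3-0_A2.bpm" are handled.
    """
    header, columns, rows = {}, [], []
    section = None
    for line in text.splitlines():
        if not line.strip():
            continue
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            section = stripped[1:-1]
            continue
        cells = line.split("\t")
        if section == "Header":
            values = [cell.strip() for cell in cells[1:] if cell.strip()]
            header[cells[0].strip()] = values[-1] if values else ""
        elif section == "Data":
            if not columns:
                columns = cells  # first line after [Data] names the columns
            else:
                rows.append(dict(zip(columns, cells)))
    return header, rows
```

Comparing `header["Num SNPs"]` with the number of parsed rows per sample is a cheap way to check that the export is complete before attempting a conversion.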

plink matrix format

Hi, I have data from dbGaP that is in 'plink matrix format'. Can I use this tool? If not, what is the best way to prepare this data for MoCHa?

Error: Too many open files

Hi,

I am currently using gtc2vcf to convert ~1800 GTC files into a single BCF file, and I got the error shown below. However, when I tried a small set (about 20 samples) that included the sample named in the error, 9479477122_R04C01.gtc, the pipeline worked and produced a correct BCF file. Is this a memory problem? Could you please help me figure this out? Thank you very much for your help!

"Could not open 9479477122_R04C01.gtc: Too many open files
[E::bcf_hdr_read] Input is not detected as bcf or vcf format
Could not read VCF/BCF headers from -
Cleaning
Failed to read from standard input: unknown file type"

Best regards,
Qidi
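
"Too many open files" is the operating system's message for a process hitting its file descriptor limit, which is separate from memory. That would be consistent with 20 files working and ~1800 failing if all inputs are held open at once, although the post does not confirm this. The sketch below uses Python's standard `resource` module (Unix only) to check the current limit. The `reserve` of 16 descriptors for standard streams and auxiliary files is an arbitrary assumption.

```python
import resource

def fits_in_limit(n_files, soft_limit, reserve=16):
    """True if n_files plus a small reserve fits under the soft limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return n_files + reserve <= soft_limit

def raise_soft_limit(wanted):
    """Raise the soft open-files limit towards `wanted`, capped by the hard limit.

    Returns the soft limit in effect afterwards. Child processes started
    later (for example via subprocess) inherit the new value.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    if target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

With a soft limit of 1024, which is a common default on Linux, `fits_in_limit(1800, 1024)` is false while `fits_in_limit(20, 1024)` is true. The shell counterpart of inspecting or raising this limit is `ulimit -n`.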

how to convert CEL to CHP?

After installing bcftools, I followed the README and ran the following code, but I got an error message.
path_to_output_folder="..."
cel_list_file="..."
apt-probeset-genotype \
  --analysis-files-path . \
  --xml-file GenomeWideSNP_6.apt-probeset-genotype.AxiomGT1.xml \
  --out-dir $path_to_output_folder \
  --cel-files $cel_list_file \
  --special-snps GenomeWideSNP_6.specialSNPs \
  --chip-type GenomeWideEx_6 \
  --chip-type GenomeWideSNP_6 \
  --table-output false \
  --cc-chp-output \
  --write-models \
  --read-models-brlmmp GenomeWideSNP_6.generic_prior.txt

My question is: what software needs to be installed in order to use [apt-probeset-genotype]?

Can't install wine32 on my machine

I have wine64 installed, but the commands I ran from the README also needed wine32. So I switched to a 32-bit VM, and then it needed wine64. I am pretty frustrated with installing this pipeline, as I would prefer using it to running Beeline 100x to get from IDAT to GTC.

cannot open more than 4096 files at once while 30546 is required

Hi!
I get this error when using your nice tool to convert GTCs to VCFs:

$HOME/bin/bcftools +$HOME/bin/gtc2vcf.so --no-version -Ou -b $manifest_file -e $egt_file -g $gtc_list -f $ref -x $out.sex
...cannot open more than 4096 files at once while 30546 is required

Do I need another machine with more than 4 GB of RAM, or can I do something about the RAM capacity?

Thank you in advance Dr. Genovese!!
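
The message refers to a cap on simultaneously open files rather than to RAM. One possible workaround, offered here only as an assumption and not as the author's recommendation, is to convert the GTC list in batches that each stay under the cap and merge the per-batch outputs afterwards. A minimal sketch of the batching step, with a hypothetical function name:

```python
def split_into_batches(paths, max_open):
    """Split a list of input paths into consecutive batches of at most max_open."""
    if max_open < 1:
        raise ValueError("max_open must be at least 1")
    return [paths[i:i + max_open] for i in range(0, len(paths), max_open)]
```

For 30546 inputs and a cap of 4096 this gives 8 batches. In practice the batch size should sit somewhat below the cap, since the process also needs descriptors for the manifest, reference and output files.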

No output VCF files

Hello

I see that two steps are necessary to convert from .CEL to .VCF. In the first step, xxxxx.AxiomGT1.chp files are generated (where xxxxx is the name of the original file). Is this correct?

Now, I'm having a problem with the second step. When I run that part of the program I get no errors, but I also can't find the VCF files. This is the code I'm running:

bcftools +affy2vcf \
  --no-version -Ou \
  --csv "GenomeWideSNP_6.na35.annot.csv" \
  --fasta-ref "human_g1k_v37.fasta" \
  --chps /home/adrianib/Proyecto/cc-chp \
  --snp /home/adrianib/Proyecto/AxiomGT1.snp-posteriors.txt \
  --extra result.tsv |
  bcftools sort -Ou -T ./bcftools-sort.XXXXXX |
  bcftools norm --no-version -Ob -o result.bcf -c x -f "human_g1k_v37.fasta" &&
  bcftools index -f result.bcf

I see there is no option to indicate the output folder as in the first step. Could this be the reason I don't have output VCF files?

In summary, I have this:
Original file: xxxxx.CEL
1st step (CEL to CHP): xxxxx.AxiomGT1.chp
2nd step (CHP to VCF): ?

And my question is: Should I have a xxxxx.VCF file at the end of the second step?

Thanks for your help
Adrian

No releases

Commit messages contain phrases like new release or new version, but there are no versioned releases/tags for this repo. That makes it hard to create a reproducible deployment for reproducible science...

Problem converting Illumina Genome reports to vcf

Hi,
Thank you for the wonderful set of tools for converting Illumina reports to VCF files.

I am getting an error while using the matrix-format Illumina reports.

Error is as follows:

./bcftools +gtc2vcf.so --no-version -o --genome-studio /Users/vikrants/Desktop/testvcf/ILHC24-12806_FinalReport.txt -f /Users/vikrants/res/hg38.fa

Reading GTC file /Users/vikrants/Desktop/testvcf/ILHC24-12806_FinalReport.txt
GTC file /Users/vikrants/Desktop/testvcf/ILHC24-12806_FinalReport.txt format identifier is bad

Can you please have a look and let me know why I am getting this error?

P.S. - I have generated the matrix format report from the genome studio.

Thanks in advance,
Vikrant

vcf files not being saved

I have approximately 4000 gtcs that I am trying to convert to vcf files using the gtc2vcf plugin but even though the script reads gtcs correctly and writes the vcf file - no output is produced. I have tried to run it by reducing the number of gtcs to 8 and get the same result.
I get this output:
Writing to ./bcftools-sort.XXXXXXMMTHoa
gtc2vcf 2022-01-12 https://github.com/freeseek/gtc2vcf
Reading BPM file /bochica/shared/numom/raw_babies/GUER_20211019_MEGA_1001_1002/Multi-EthnicGlobal_D2.bpm
Reading EGT file /bochica/shared/numom/raw_babies/GUER_20211019_MEGA_1001_1002/Multi-EthnicGlobal_D1_ClusterFile.egt
Reading GTC file /home5/maamir/mfgitry/somegtc/206043240081_R02C01.gtc
Reading GTC file /home5/maamir/mfgitry/somegtc/206043240081_R07C01.gtc
Reading GTC file /home5/maamir/mfgitry/somegtc/206043240081_R06C01.gtc
Reading GTC file /home5/maamir/mfgitry/somegtc/206043240081_R01C01.gtc
Reading GTC file /home5/maamir/mfgitry/somegtc/206043240081_R08C01.gtc
Reading GTC file /home5/maamir/mfgitry/somegtc/206043240081_R03C01.gtc
Reading GTC file /home5/maamir/mfgitry/somegtc/206043240081_R05C01.gtc
Reading GTC file /home5/maamir/mfgitry/somegtc/206043240081_R04C01.gtc
Writing VCF file
Lines total/missing-reference/skipped: 1748250/23814/14885
Merging 2 temporary files
Cleaning
Lines total/split/realigned/skipped: 1733365/0/0/23817

But no sub directory of bcftools-sort.XXXXXXMMTHoa is present in my directory when the programme has stopped running.

Below is the code I am using -

ref="/home5/maamir/GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna"
bcftools +gtc2vcf \
  --no-version -Ou \
  --bpm /bochica/shared/numom/raw_babies/GUER_20211019_MEGA_1001_1002/Multi-EthnicGlobal_D2.bpm \
  --egt /bochica/shared/numom/raw_babies/GUER_20211019_MEGA_1001_1002/Multi-EthnicGlobal_D1_ClusterFile.egt \
  --gtcs /home5/maamir/mfgi \
  --fasta-ref $ref \
  --extra $out_prefix.tsv |
  bcftools sort -Ou -T ./bcftools-sort.XXXXXX |
  bcftools norm --no-version -Ob -c x -f $ref && \
  bcftools index --f $out_prefix.bcf

Docker

Dear Giulio,

Thanks a lot for such a nice workflow for converting GTC files to VCF.

Because of some limitations, I wasn't able to install everything. I tried to package the whole thing as a Docker image, but I failed there too.

Do you have any plans to make a Docker container that does the whole process?

Really appreciate it.

Regards

Can't open .xcl.bcf file

Hello,

I got the .xcl.bcf file after this step:

/bcftools annotate --no-version -Ob -o $pfx.unphased.bcf -x ID,QUAL,INFO,^FMT/GT,^FMT/BAF,^FMT/LRR $pfx.vcf &&
/bcftools index -f $pfx.unphased.bcf

n=$(/bcftools query -l $pfx.unphased.bcf|wc -l);
ns=$((n*98/100));
echo '##INFO=<ID=JK,Number=1,Type=Float,Description="Jukes Cantor">' |
/bcftools annotate --no-version -Ou -a $dup -c CHROM,FROM,TO,JK -h /dev/stdin $pfx.unphased.bcf |
/bcftools +/fill-tags.so --no-version -Ou -- -t NS,ExcHet |
bcftools +mochatools.so --no-version -Ou -- -x $sex -G |
bcftools annotate --no-version -Ob -o $pfx.xcl.bcf \
  -i 'FILTER!="." && FILTER!="PASS" || JK<.02 || NS<'$ns' || ExcHet<1e-6 || AC_Sex_Test>6' \
  -x FILTER,^INFO/JK,^INFO/NS,^INFO/ExcHet,^INFO/AC_Sex_Test &&
bcftools index -f $pfx.xcl.bcf

Then, when I ran eagle:

for chr in {1..22} X; do
eagle \
  --geneticMapFile $map \
  --chrom $chr \
  --outPrefix $pfx.chr$chr \
  --numThreads 4 \
  --vcfRef $kgp_pfx${chr}$kgp_sfx.bcf \
  --vcfTarget $pfx.unphased.bcf \
  --vcfOutFormat b \
  --noImpMissing \
  --outputUnphased \
  --vcfExclude $dir/$pfx.xcl.bcf && bcftools index -f $pfx.chr$chr.bcf
done

I get the following: ERROR: Could not open X.xcl.bcf for reading: unknown file type.

I have full permissions on the file. I am not sure whether this is a problem with Eagle or with how the file was generated.

Can you please help me with this?

Thank you.

Failed to read 1359180426 bytes when convert gtc files

Dear Giulio,
Thank you for developing such a good tool to deal with IDAT files. I have successfully converted IDAT files to GTC files; thank you for your suggestion. When I ran the code as in the guide, an error occurred. I saw that someone had a similar issue (#13), but the solution does not apply to me. I used --gtcs; the folder has 103 GTC files, and using fewer files still gives the same error.

$bcftools +gtc2vcf \
--no-version -Ou \
--bpm $bpm_manifest_file \
--csv $csv_manifest_file \
--egt $egt_cluster_file \
--gtcs $path_to_gtc_folder \
--fasta-ref $ref \
--extra $out_prefix.tsv

gtc2vcf 2020-08-26 https://github.com/freeseek/gtc2vcf
Reading BPM file /media/EXTend2018/Wanghe2019/GEO/GSE113093/InfiniumPsychArray-24v1-1_A1.bpm
Reading CSV file /media/EXTend2018/Wanghe2019/GEO/GSE113093/InfiniumPsychArray-24v1-1_A1.csv
Reading EGT file /media/EXTend2018/Wanghe2019/GEO/GSE113093/InfiniumPsychArray-24v1-1_A1_ClusterFile.egt
Reading GTC file /media/EXTend2018/Wanghe2019/GEO/GSE113093/GSE113093_GTC/GSM3096512_200687150051.gtc
Failed to read 1359180426 bytes from stream

Best wishes,
Crane

Will this generate the same output as AutoConvert via Beeline?

Hello,

Thank you for the handy tool! I'm able to generate GTC files from IDAT files using your software. However, I'd like to know whether the results are the same as those from Beeline's AutoConvert function. I don't have a Windows OS with Illumina software, so I can't compare them myself. I would really appreciate any input.

Thanks
Fan

Citation

Hi Giulio,

I'm trying to cite this tool in my manuscripts but I did not find a related paper. Could you please share a citation format?

Many thanks!

Xiaotong

Question about affy2vcf

Hi Giulio,

Quick question: I see that affy2vcf can convert CEL to CHP and CHP to VCF. I am wondering whether two steps are required to get from CEL to VCF. I don't see this requirement in the description, and I know PennCNV goes from CEL to VCF but requires multiple steps. Please let me know whether we can go straight from CEL to VCF. Thanks.
Brian
