
giab_data_indexes

This repository contains data indexes from NIST's Genome in a Bottle (GIAB) project. The indexes for sequences and alignments are also available at https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data_indexes.


AshkenazimTrio

Son:HG002     https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG002_NA24385_son/
Father:HG003    https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG003_NA24149_father/
Mother:HG004     https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG004_NA24143_mother/

Sequencing Platform | Sequence | Alignment
Illumina WGS 2x150bp 300X per individual | All, HG002, HG003, HG004 | novoalign: All, HG002, HG003, HG004
Illumina 6KB Matepair | All, HG002, HG003, HG004 | bwamem:hg19 All, HG002, HG003, HG004
Illumina WGS 2x250bp | All, HG002, HG003, HG004 | isaac:hg19 All, HG002, HG003, HG004; novoalign: All, HG002, HG003, HG004
Moleculo | All, HG002, HG003, HG004 |
Illumina Whole Exome | - | bwamem:hg19 All, HG002, HG003, HG004
SOLiD 60x for son | All, HG002 | LifeScope:hg19 All, HG002
CompleteGenomics | - | CGAtools:hg19 All, HG002, HG003, HG004
Ion Proton 1000x Exome | - | TMAP:hg19 All, HG002, HG003, HG004
10X Genomics | - | bwamem:hg19 All, HG002, HG003, HG004
10X Genomics ChromiumGenome | All, HG002 | LongRanger2.0:hg19 All, HG002, HG003, HG004
BioNano | All:bnx, HG002:bnx, HG003:bnx, HG004:bnx | All:cmap, HG002, HG003, HG004
PacBio 70x/30x/30x | All, HG002, HG003, HG004; All:hdf5, HG002, HG003, HG004 | NGMLR:hg19 All, HG002, HG003, HG004; minimap2: All, HG002, HG003, HG004
PacBio CCS 10kb | All, HG002 | pbmm2:hg19 All, HG002
PacBio CCS 11kb | All, HG002 | pbmm2:hg19 All, HG002
PacBio CCS 15kb | All, HG002 | pbmm2:hg19 All, HG002
PacBio CCS 15kb_20kb chemistry2 | All, HG002 | pbmm2: All, HG002, HG003, HG004
Oxford Nanopore 2D | All, HG002 | -
Oxford Nanopore ultralong (guppy-V3.2.4_2020-01-22) | All, HG002 | minimap2:whatshap:hg19 All, HG002
Oxford Nanopore ultralong Promethion | All, HG002, HG003, HG004 | -
BGI BGISEQ500 | All, HG002 | -
BGI MGISEQ PCR-free | All, HG002 | -
BGI stLFR | All, HG002, HG003, HG004 | All:bwamem:hg19, HG002, HG003, HG004
Strand-Seq HG002 by BCCRC | All, HG002 | -

* CompleteGenomics LFR raw or alignment data are not available, but analysis results are available under https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/analysis/CompleteGenomics_newLFR_CGAtools_06122015/


ChineseTrio

Son:HG005     https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/ChineseTrio/HG005_NA24631_son/
Father:HG006     https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/ChineseTrio/HG006_NA24694-huCA017E_father/
Mother:HG007     https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/ChineseTrio/HG007_NA24695-hu38168_mother/

Sequencing Platform | Sequence | Alignment
Illumina WGS 2x250bp 300X for son; 2x150bp 100x for parents | All, HG005, HG006, HG007 | novoalign: All:hg19-hg38, HG005:hg19-hg38, HG006:hg19-hg38, HG007:hg19-hg38
Illumina 6KB Matepair | All, HG005, HG006, HG007 |
Moleculo | All, HG005, HG006, HG007 |
SOLiD 60x for son | All:xsq, HG005:xsq | LifeScope: All:hg19, HG005:hg19
CompleteGenomics | | CGAtools: All:hg19 (RMDNA), HG005:hg19, HG006:hg19, HG007:hg19; CGAtools: All:hg19 (cellsDNA), HG005:hg19
Illumina Whole Exome | | bwamem: All:hg19, HG005:hg19
Ion Proton 1000x Exome | | TMAP: All:hg19, HG005:hg19
BioNano for son | All:bnx, HG005:bnx | All:hg19 (cmap), HG005:hg19 (cmap)
PacBio Sequel for the trio | All, HG005, HG006, HG007 |
PacBio SequelII CCS 11kb | |
BGI BGISEQ500, MGISEQ, stLFR | |


NA12878

NA12878:HG001     https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/NA12878/

Sequencing Platform | Sequence | Alignment
Illumina WGS 2x150bp 300X | HG001 | bwamem: HG001:hg19 (downsampled30x); novoalign: HG001
Illumina HiSeq Exome | HG001; HG001:trimmed_fastq | bwamem: HG001:hg19
Illumina TruSeq Exome | | bwamem: HG001:hg19
10X Genomics | | bwamem: HG001:hg19; bwamem: HG001:hg19 (size_selected)
10X Genomics ChromiumGenome | | LongRanger2.0: HG001:hg19-hg38; LongRanger2.1: HG001:hg19-hg38
CompleteGenomics | | CGAtools: HG001:hg19
Ion Proton 1000x Exome | | TMAP: HG001:hg19
NA12878 SOLiD5500W | | LifeScope: HG001:hg19
BGI BGISEQ500, MGISEQ, stLFR | |
PacBio 40x | HG001:hdf5 |
PacBio SequelII CCS 11kb | |
Ultralong_OxfordNanopore | - | minimap2: HG001



Please note:
1. If you want to use raw sequencing data (fastq, fasta, hdf5, xsq, bnx, etc.) for your analysis, use the sequence.index.* files to locate and download the data.
2. If you want to use aligned data (bam, xmap/cmap, etc.) for your analysis, use the alignment.index.* files to locate and download the data.
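For example, one way to fetch everything listed in an index file is to pull out the URL fields and pass them to wget. This is a minimal sketch, not an official tool; it assumes the index files are tab-delimited and simply keeps any field that looks like an FTP/HTTP(S) URL, so adjust it if the index you use is laid out differently.

#!/usr/bin/env bash
# Download every file referenced in a GIAB index file (sketch).
set -euo pipefail

INDEX=sequence.index.AJtrio_Illumina300X_wgs_07292015.HG002   # example index file
OUTDIR=giab_downloads
mkdir -p "$OUTDIR"

# Skip comment/header lines, then keep any whitespace-separated field
# that looks like an FTP or HTTP(S) URL and download it.
grep -v '^#' "$INDEX" \
  | tr '\t' '\n' \
  | grep -E '^(ftp|https?)://' \
  | while read -r url; do
      wget -c -P "$OUTDIR" "$url"    # -c resumes interrupted downloads
    done

The MD5 values listed on the same index lines can be used to verify the downloads afterwards (see the md5sum sketch under the checksum issue further down).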

giab_data_indexes's People

Contributors

chunlinxiao, jzook


giab_data_indexes's Issues

Data descriptions for newer data

Hi,

Data descriptions for some of the raw data would be greatly appreciated. Specifically, I am looking for information regarding

  1. Alignment method/any error correction used over raw subreads for alignments in ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/HG002_NA24385_son/PacBio_MtSinai_NIST/Baylor_NGMLR_bam_GRCh37/
  2. Description of sequencing method used for ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/NA12878/NA12878_PacBio_MtSinai/. Is it the same as that for HG002 (which is described in the original publication - https://www.nature.com/articles/sdata201625)?

Thanks!

trying to download chr22 subset with samtools

I am trying to get only chr22 reads from the NIST_NA12878_HG001_HiSeq_300x data to build material for a training course (the file has a bai index next to it and the command below runs). The 30x downsampled data present there is a bit too small for my purposes.

samtools view -b -h ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/NA12878/NIST_NA12878_HG001_HiSeq_300x/NHGRI_Illumina300X_novoalign_bams/HG001.hs37d5.300x.bam 22:0-50818468 > HG001.hs37d5.300x_chr22ss.bam

Each attempt gives a different file of 5.6 to 5.9 GB, which is too small to be the whole 300x chr22 subset (2% of 550 GB should be more like 11 GB). The records are OK and all from '22', but I fear they are only the first part of the real data. I suspect some timeout occurs here. I tried curl piped into samtools, but that fails because the bai index cannot be accessed.

Can someone confirm whether there is a problem with samtools on that link, or whether this is expected?
Is there any alternative to downloading the full 550 GB and subsetting locally?

Thanks
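One hedged workaround (not an official recipe) that avoids pulling the full 550 GB: download the .bai once, pass it to samtools explicitly with -X (available in recent samtools versions), and sanity-check the extracted region. Streaming over FTP can truncate silently, so repeating the extraction and comparing read counts is a cheap way to detect an incomplete transfer.

BAM_URL=ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/NA12878/NIST_NA12878_HG001_HiSeq_300x/NHGRI_Illumina300X_novoalign_bams/HG001.hs37d5.300x.bam

# local, resumable copy of the index
wget -c "${BAM_URL}.bai"

# extract chr22 using the remote BAM but the local index
samtools view -b -X -o HG001.hs37d5.300x_chr22.bam \
  "$BAM_URL" HG001.hs37d5.300x.bam.bai 22

samtools quickcheck HG001.hs37d5.300x_chr22.bam   # checks the BAM EOF block is present
samtools flagstat HG001.hs37d5.300x_chr22.bam     # compare counts across repeated runs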

Locations of Oxford Nanopore fast5 data?

I see you now have PromethION data for several of the individuals, available in fastq format. However, I don't see a link to the raw fast5. Is this available somewhere?

MD5 checksum value error

I just downloaded the Illumina WES data from the Ashkenazi trio:
https://github.com/genome-in-a-bottle/giab_data_indexes/blob/master/AshkenazimTrio/alignment.index.AJtrio_OsloUniversityHospital_IlluminaExome_bwamem_GRCh37_11252015

The file above contains some errors in the MD5 checksums shown in the GIAB GitHub table. Minor, but I thought it worth drawing to your attention in case you want to verify (and correct?) them.

On the GIAB github page:

  1. the bam & bai file for HG002 currently both show the same MD5 hash (= c80f0cab24bfaa504393457b8f7191fa).
    • In my download, that hash matches the .bam file, but the .bai file comes up as d4fea426c3e2e9a71bb92e6526b4df6f
  2. the bai file for HG004 shows MD5 hash = 8914bfb6fa6bd304192f2c9e13e903f.
    • In my download, the hash comes up as 8914bfb6fa6bd304192f2c9e13e903f4. Close enough to guess that the website text was probably inadvertently truncated by a digit, rather than an actual mismatch.

De novo mutations

Hi,
Where can I find the de novo mutations identified for both the Ashkenazi and Chinese trios?
thank you in advance :)

EDIT: I mean the 2,502 variants for the Ashkenazi trio and the 821 variants indicated in the paragraph "High Mendelian consistency in trios" of Wagner et al.

Truth_set information for benchmarking

Hi,
I'm currently benchmarking VCF files generated from HG002 data (a test run with just one sample) for SV calling (Manta, Lumpy, GRIDSS, nf-core/sarek) against a truth set. I aligned the BAM files to GRCh38. Which truth set should I benchmark my results against? In particular, can I use the truth sets from SV_0.6/ to benchmark the VCF files (aligned on GRCh38) generated by the SV callers? I am using Truvari and SVanalyzer for benchmarking.
Thank you.
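For what it's worth, a minimal Truvari invocation looks like the sketch below (Truvari 4.x assumed; the file names are placeholders, not the actual benchmark file names). The key constraint is that the calls and the benchmark must be on the same reference build; as far as I know the SV_0.6 integration is distributed on GRCh37, so GRCh38 calls would need a GRCh38 benchmark or a liftover.

# bgzip + index the caller VCF first
bgzip -f my_sv_calls.GRCh38.vcf
tabix -p vcf my_sv_calls.GRCh38.vcf.gz

# -b: truth VCF (bgzipped + indexed), -c: caller VCF,
# --includebed: confident regions from the same benchmark release
truvari bench \
  -b sv_benchmark.vcf.gz \
  -c my_sv_calls.GRCh38.vcf.gz \
  --includebed sv_benchmark_regions.bed \
  -o truvari_HG002_out/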

paired reads have different names

Hi, I am trying to run the alignment with bwa mem for the two files "U0a_CGATGT_L001_R1_001.fastq.gz" and "U0a_CGATGT_L001_R2_001.fastq.gz", which I downloaded from the FTP site, against the reference "GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.gz". The command I am using is
bwa mem -t 16 -R '@RG\tID:H814YADXX.5.CGATGT.1101\tSM:HG001\tPL:illumina' GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.gz U0a_CGATGT_L001_R1_001.fastq.gz U0a_CGATGT_L001_R2_001.fastq.gz | samtools view -b - >HG001.GRCh38_no_alt_analysis_set.bam

but I am getting an error with the sequence headers:

[mem_sam_pe] paired reads have different names: "HWACAGATTTTGT", "HWI-D00360:5:H814YADXX:1:1102:11719:83283"
[mem_sam_pe] paired reads have different names: "HWACTATTDDD", "HWI-D00360:5:H814YADXX:1:1102:11293:83492"
[mem_sam_pe] paired reads have different names: "@@faaa(+:A0&AA", "HWI-D00360:5:H814YADXX:1:1102:11730:83321"
[mem_sam_pe] paired reads have different names: "HWI-A@HWI-D00360:5:H814YX:1:1102:10399:83348", "HWI-00360:5:H814YADXX:1:1102:11699:83300"
[mem_sam_pe] paired reads have different names: "ACD00360TJJJJC@AGCCCTGCACCACCTAATAAGAACTGGAAAGTCEEDDDDDDDD", "HWI-D00360:5:H814YADXX:1:1102:11719:83361"
[mem_sam_pe] paired reads have different names: "HWCTAAAATC:BDDDDFDDDDDDCEDDDHJJEHIIIJJJHHH>HFFEEEEET:83ACDDDDTAAATTEDDDDDDEDDDDJJFHJJJJJJJJJJJJJJJJJJJJJJIJJJJJJ@T4BJJJTTATCTTG>FGGCAGGCTJJIJJJJJEDEECDDFAAGTAAADDDDDDDCTCTTCTTGTTTTCCCC>AGCC60:5:HC814YJDDDCCDDIGCCCTTC1IIIIHIEDDD@FFFCTTC1IIIIHIEDCCC;>CC60:5:H:0:CGADXX:1:1ATGTTTA:N:0:CGAC>CGAC>CG3AGGCTGAGGYADXX:JJJJJJJJIJJA0360GAIAGEEDEEEEC:GJIIJJJC:0:CGATGIFFFHHHHHJJJJJDEDDDDDGDEDDDDGTTTTTAT@HHJJJTGT", "HWI-D00360:5:H814YADXX:1:1102:11549:83491"
[mem_sam_pe] paired reads have different names: "HWCATCCTCCCAAGACTAADD@FFFC99:833C99:833CGCTTTGFHH@FFFFDDDCCCDCFB:>CA8>A??CC:A:ACTTACTCAAAAAACTATH814CAAATGCAGDDD:TTAAGTTCACAGCGA8DEDDDDDGJJJJJJDDDDDBDDDDDDDDDDDDDTGGACTTTJJHHHF60:5:HHH@FFFGTGGCAGGCTCCTGTAACGDDDDDDDDATGAACTCIACTAGDDDBBDDG9ATGGAATTTGACTTGADXX:1CACCTGCCAAACATACCCGTCTTTACC(G36CAGACCACCTGGACTTCCAGGEECDCDCDGAGGCCTGGCCATGTTATATGAAGTGIDXX:1CACCTGCCAAACATACCCGT", "HWI-D00360:5:H814YADXX:1:1102:11746:83407"
[mem_sam_pe] paired reads have different names: "HWACTATTDEFFFHHHHCCTTGTGTE:@DDDD49?IJJIGIG83407", "HWI-D00360:5:H814YADXX:1:1102:11545:83354"

I have tried sorting the two files using fastq-sort but still get the same error. Can anyone help?
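The garbled names in the error output (quality strings appearing where read IDs should be) usually point to corrupted or truncated local files rather than a problem with bwa itself. A minimal sanity-check sketch for the pair (file names taken from the command above; for older /1- and /2-suffixed read names, strip the suffix before comparing):

R1=U0a_CGATGT_L001_R1_001.fastq.gz
R2=U0a_CGATGT_L001_R2_001.fastq.gz

gzip -t "$R1" && gzip -t "$R2"      # any error here means a truncated/corrupted download
md5sum "$R1" "$R2"                  # compare against the values in the sequence.index file

# Read counts must be identical for a proper pair
echo "$(zcat "$R1" | wc -l) $(zcat "$R2" | wc -l)"

# Compare read names record by record (field 1 drops the mate comment)
paste <(zcat "$R1" | awk 'NR % 4 == 1 {print $1}') \
      <(zcat "$R2" | awk 'NR % 4 == 1 {print $1}') \
  | awk '$1 != $2 {print; bad=1} END {exit bad}' \
  && echo "read names match" || echo "read name mismatch detected"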

Strong coverage deviation for 1 of 13 subdirectories of NA12878 Illumina 300x WGS

Dear GIAB team,
while doing some k-mer counting using the files indexed at sequence.index.NA12878_Illumina300X_wgs_09252015, I noticed that the total number of 25-mers in all *.fastq.gz files in 140115_D00360_0010_BH894YADXX/ (hereinafter referred to as subdirectory 0010) differs significantly from all other subdirectories at NIST_NA12878_HG001_HiSeq_300x/ (005 to 009, 0011 to 0017).

Subdirectory 0010 contains only 4,185,958,248 (N-free) 25-mers, whereas all other 12 subdirectories (005 to 009, 0011 to 0017) contain between 56,877,996,538 and 69,240,304,680 25-mers each. This difference can also be seen in the number of files and the sum of the file sizes.
KMC3 outputs the same numbers of total k-mers per subfolder.

How does the low number of 25-mers in subdirectory 0010 fit with the quote "The other folders each contain ~20-30x sequencing total (a single flow cell)" in the README file?

Are you aware of this clear deviation for subdirectory 0010?
Have you discussed the possible causes of this outlier subdirectory in any of your publications, which I may have missed?
Can you rule out that this strong deviation for 0010 could possibly have negative effects on the whole data set?

Thanks in advance,
Jens
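As a quick cross-check that does not depend on any particular k-mer counter, one can sum the sequenced bases per flow-cell subdirectory of a local mirror; a rough sketch (run from the directory holding the subdirectories, using ~3.1 Gbp as the haploid genome size; adjust the find pattern if your layout differs):

for d in */ ; do
  bases=$(find "$d" -name '*.fastq.gz' -print0 \
            | xargs -0 -r zcat \
            | awk 'NR % 4 == 2 {n += length($0)} END {print n+0}')
  echo -e "${d%/}\t${bases} bases (~$((bases / 3100000000))x depth)"
done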

SRX accessions

Dear GIAB team

Thanks for gathering all the links here.
It would be super helpful if the SRX accessions were also provided here.

Thank you in advance.
Sina

HG001 - Incorrect directory name for one of HiSeq 300X libraries

It seems the directories are named based on the sample libraries.
For example, all FASTQs in dir 140127_D00360_0011_AHGV6ADXX are from library H8GV6ADXX.

https://ftp.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/NA12878/NIST_NA12878_HG001_HiSeq_300x/140127_D00360_0011_AHGV6ADXX/

Minor thing but it looks like this dir has been incorrectly named, perhaps due to a typo!
It should be 140127_D00360_0011_AH8GV6ADXX not 140127_D00360_0011_AHGV6ADXX. Rest of the library directories for HG001 are all named consistently based on the library.

PromethION data de novo assembly?

Hi, in the README for the ONT-PromethION datasets (https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG002_NA24385_son/UCSC_Ultralong_OxfordNanopore_Promethion/) under 'Data Processing Methods', alignment of the called reads is mentioned. But in the linked paper (https://doi.org/10.1101/715722) it's explained that the dataset was assembled de novo, only doing alignment afterward for benchmarking (unless I'm misunderstanding).

Also in the README under the 'Data Processing Methods'-header, a newer version of Guppy is mentioned than the one used in the paper, which suggests to me that that part was added more recently, and it contains no information on assembly at all. Does that mean that newer versions of the data are no longer generated de novo?

filedate header in recent benchmark .vcf files is from 2016

Hi!
I'm trying to use the HG002 benchmark .vcf as a truth set for my variant calling work (after seeing it mentioned in https://doi.org/10.1038/s41587-020-0538-8). When I downloaded the latest version from https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/AshkenazimTrio/HG002_NA24385_son/NISTv4.2.1/GRCh38/ I noticed the fileDate in the header was set to '20160824', despite reading about versions from 2020/2021. Am I missing something obvious here or is the header incorrect?

Passage of GIAB samples

Hi,

I'm currently working with FASTQ files of the HG002 and HG005 cell lines, sequenced at 300x on the Illumina platform and downloaded from the GIAB FTP site.

I would like to know the passage number of these cell lines. My research involves analyzing mitochondrial variants in these samples, so it is crucial to know whether they are from early or late passages. If there is any resource or information available regarding the passage number of these cell lines, could you please share it?

Thank you!

multiple primary records for same read group RMNISTHS_30xdownsample.bam

Hello,

I'm trying to test out a variant calling pipeline using the GIAB BAM file downloaded from the ftp server (/ReferenceSamples/giab/data/NA12878/NIST_NA12878_HG001_HiSeq_300x/RMNISTHS_30xdownsample.bam), but am getting the following error:

Fatal error: Assertion failed in ../src/host/dragen_api/bam2dbam_transformer.cpp line 445 -- false -- There are multiple input primary records for read HWI-D00360:5:H814YADXX:2:2215:17273:66909, in the same read group. This is a violation of the BAM standard, which indicates that if two records have matching QNAME, they should be construed as deriving from the same template. Perhaps there was an error in setting up the read groups during BAM creation.

I saw a previous issue for a separate file where bams were merged improperly. Could that be happening here? Thanks!

Nate
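A quick way to test for this locally (a sketch): among primary alignments (-F 0x900 excludes secondary and supplementary records), each read name of a proper pair should occur at most twice, so any name seen more often indicates duplicated primary records. The whole-genome sort is slow, so restricting to one chromosome first is advisable (this assumes the accompanying .bai index is present; drop the region to scan the whole file).

samtools view -F 0x900 RMNISTHS_30xdownsample.bam 20 \
  | cut -f1 | sort | uniq -c \
  | awk '$1 > 2 {print; found=1} END {exit found}' \
  && echo "no read name occurs more than twice among primary records" \
  || echo "duplicated primary records present"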

HG001_bam files information

Hello, I'm attempting to convert the HG001 BAM files to FASTQ format for benchmarking purposes. However, I'm facing challenges aligning the generated FASTQ files to my reference genome. Could you provide information on the reference genome used to create these BAM files?

ftp://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/NA12878/NIST_NA12878_HG001_HiSeq_300x/RMNISTHS_30xdownsample.bam
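In the meantime, the BAM header itself usually identifies the reference: the @SQ lines list the contig names and lengths, and the @PG lines often record the exact aligner command line, including the FASTA path. A quick sketch:

samtools view -H RMNISTHS_30xdownsample.bam | grep -E '^@(SQ|PG)' | head -n 30
# hs37d5/GRCh37-style headers list contigs 1..22, X, Y, MT plus decoys such as hs37d5;
# GRCh38-style headers use chr-prefixed names (chr1, chr2, ...).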

Incorrect link to PacBio CLR data for HG004

This is a minor issue. It looks like the link listed in the main table for the PacBio CLR data of HG004 points to the wrong location.

The link points to: https://github.com/genome-in-a-bottle/giab_data_indexes/blob/master/AshkenazimTrio/sequence.index.AJtrio_PacBio_MtSinai_NIST_subreads_fasta_10082018.HG004

and this file is not present. There is a link in the manifest that appears to be correct for HG004 at the following location:

https://github.com/genome-in-a-bottle/giab_data_indexes/blob/master/AshkenazimTrio/sequence.index.AJtrio_PacBio_MtSinai_NIST_subreads_fasta_10052018.HG004

It may be that this link just needs to be updated.

MD5 checksums don't match for `HG002 Illumina 2x150bp`

I've tried downloading a few FASTQ files listed in sequence.index.AJtrio_Illumina300X_wgs_07292015.HG002 and found that the MD5 checksums listed there don't match the downloaded files.

  • No download errors
  • Tried multiple times
  • Verified random files
  • Checksums still don't match

commands used:

$ wget https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG002_NA24385_son/NIST_HiSeq_HG002_Homogeneity-10953946/HG002_HiSeq300x_fastq/140528_D00360_0018_AH8VC6ADXX/Project_RM8391_RM8392/Sample_2A1/2A1_CGATGT_L001_R1_001.fastq.gz

$ md5sum 2A1_CGATGT_L001_R1_001.fastq.gz
c2ae5e412fb211974f9a9a46a5392428  2A1_CGATGT_L001_R1_001.fastq.gz

MD5 checksum listed for the same file from the same library is 48e52acfce7548bddad2b3f89e8e0348

ftp://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG002_NA24385_son/NIST_HiSeq_HG002_Homogeneity-10953946/HG002_HiSeq300x_fastq/140528_D00360_0018_AH8VC6ADXX/Project_RM8391_RM8392/Sample_2A1/2A1_CGATGT_L001_R1_001.fastq.gz 48e52acfce7548bddad2b3f89e8e0348 ftp://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG002_NA24385_son/NIST_HiSeq_HG002_Homogeneity-10953946/HG002_HiSeq300x_fastq/140528_D00360_0018_AH8VC6ADXX/Project_RM8391_RM8392/Sample_2A1/2A1_CGATGT_L001_R2_001.fastq.gz bd37bc5dedb31845361f531803ee03b5 HG002

Can you please verify this?

Best,
Faizal
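For reference, a sketch for checking local downloads against an index file. The column layout (R1 URL, R1 MD5, R2 URL, R2 MD5, sample) is inferred from the example line above, so adjust the awk if the index you use differs; run it in the directory holding the downloaded files.

INDEX=sequence.index.AJtrio_Illumina300X_wgs_07292015.HG002

grep -v '^#' "$INDEX" \
  | awk -v OFS='  ' '
      NF >= 2 { n = split($1, a, "/"); print $2, a[n] }   # R1: md5  filename
      NF >= 4 { n = split($3, b, "/"); print $4, b[n] }   # R2: md5  filename
    ' > expected.md5

md5sum -c expected.md5    # reports OK / FAILED per file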

HG002 2x250 BAMs are double-covered by identical reads

The following BAM file for HG002:

ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/HG002_NA24385_son/NIST_Illumina_2x250bps/novoalign_bams/HG002.hs37d5.2x250.bam

...seems to erroneously contain two copies of every read pair. For instance, a simple view of the BAM shows:

D00360:97:H2YVMBCXX:2:1107:18923:87587  163     1       10114   6       42M2S   =       10407   337     TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCTT    DDDDDIIIIIIIIIIIIIHIIIIIHHIHIIIIIIII=<G?HI11    PG:Z:novoalign  AS:i:21 UQ:i:21 NM:i:0  MD:Z:42 PQ:i:22 SM:i:0  AM:i:0
D00360:97:H2YVMBCXX:2:1107:18923:87587  163     1       10114   6       42M2S   =       10407   337     TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCTT    DDDDDIIIIIIIIIIIIIHIIIIIHHIHIIIIIIII=<G?HI11    PG:Z:novoalign  AS:i:21 UQ:i:21 NM:i:0  MD:Z:42 PQ:i:22 SM:i:0  AM:i:0
D00

...and so on for every read.

PacBio data download failed

Hello,
I want to download the PacBio BAM files, for example:
wget ftp://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG002_NA24385_son/PacBio_MtSinai_NIST/CSHL_bwamem_bam_GRCh37/BWA-MEM_Chr10_HG002_merged_11_12.sort.bam
but the download failed.
Could you tell me the correct download link?

Add ID column to SV BED file

Hi,

I constantly make use of the GIAB SV callset and really appreciate the effort of curating all of these.

I do have one feature request:

The SV BED file currently contains only the coordinates, not the type of variant each interval is associated with or the originating variant ID from the VCF (on hg19).
An IGV trick I constantly use is to pack information from the source VCF that I want to see at a glance into the ID (4th) column of the BED file, which IGV then displays; this way one doesn't need to click through to a VCF record just for a quick look.

I'd appreciate it if the VCF ID records could be copied into the BED file.

Thank you!
Steve
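In the meantime, a version of this can be generated locally with bcftools query. A sketch, assuming a recent bcftools with %POS0 support and that the release VCF carries INFO/END and INFO/SVTYPE annotations; the file names follow the v0.6 release naming but should be verified:

bcftools query \
  -f '%CHROM\t%POS0\t%INFO/END\t%ID;%INFO/SVTYPE\n' \
  HG002_SVs_Tier1_v0.6.vcf.gz \
  > HG002_SVs_Tier1_v0.6.withIDs.bed
# column 4 (ID;SVTYPE) is what IGV displays for each interval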

Bam index files for HG002 not working?

Hello. I'm trying to download a subset of data from HG002 and parents. I'm using the command samtools view -bh -o HG002_20.bam ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/HG002_NA24385_son/NIST_Illumina_2x250bps/novoalign_bams/HG002.hs37d5.2x250.bam 20, which should save chr 20 in a bam file for me. However, I get the following error:
[E::idx_test_and_fetch] Error reading "ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/HG002_NA24385_son/NIST_Illumina_2x250bps/novoalign_bams/HG002.hs37d5.2x250.bam.bai" [1] 6864 segmentation fault (core dumped) samtools view -bh -o HG002_20.bam 20

I have the same issue when I try to download the reads aligned to GRCh38.

Note that the above command works fine for the mother and father's reads.

Could it be that when the BAM files were re-uploaded in 2019, they were not re-indexed?

I've tried this with samtools 1.10 and samtools 1.9 and both give errors.
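One way to narrow this down (a sketch): fetch the .bai directly and check that it is a valid BAI file before blaming samtools; if the index downloads cleanly and looks valid, it can be passed to samtools explicitly with -X, as in the sketch under the HG001 300x chr22 issue above.

wget -O HG002.hs37d5.2x250.bam.bai \
  ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/HG002_NA24385_son/NIST_Illumina_2x250bps/novoalign_bams/HG002.hs37d5.2x250.bam.bai

# the first four bytes of a valid BAI index are 42 41 49 01 ("BAI\1")
xxd -l 4 HG002.hs37d5.2x250.bam.bai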

raw data between 30X and 40X and their truth VCF

Dear GIAB team,

I just discovered these wonderful data (singleton and trio) to evaluate my variant call pipeline.

This may be a naive request, but would it be possible to point me to a raw data link (Illumina FASTQ) with coverage between 30x and 40x, together with the corresponding truth VCF on hg38? My human WGS samples have coverage between 30x and 40x, which is why I am asking for raw data at this coverage.
I also wanted to know whether using the hg38.fasta reference from UCSC would affect the comparison against the truth VCF.

I would be very grateful. Thanks in advance!

NA12878 TruSeq references

Hello!
I tried to compute HsMetrics for the NA12878 TruSeq BAM and could not find any information about the FASTA reference used for the alignment.
I searched the GIAB website, but it seems the information was updated or deleted. The FAQ as well as the FTP README mention newer versions of GRCh37/hg19 that are not suitable for making an interval_list (in my case).
The BAM file is NIST-hg001-7001-ready.bam and the targets are TruSeq_exome_targeted_regions.hg19.bed; it seems they use different chromosome namings?

Raw PacBio subreads data

Dear GIAB team,

Thanks for the wonderful data collection.
This may be an inappropriate question, but would it be possible to access the raw PacBio SequelII CCS 11kb data of NA12878 (or other samples), i.e., the subreads.bam with IPD and other information in it?

Thank you in advance!

Best,
Peng

Request for Information on Original VCF for Lifted Over VCF

Hi GIAB folks!

I am currently working on a project and I have been using the NIST_SVs_Integration_v0.6 truth set (https://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/analysis/NIST_SVs_Integration_v0.6/).

In order to ensure that I am using the correct GRCh38 version of this truth set, I have been searching for information on the original VCF used to produce the lifted-over VCF file available at https://ftp.ncbi.nlm.nih.gov/pub/dbVar/data/Homo_sapiens/by_study/vcf/nstd175.GRCh38. Unfortunately, I was not able to find enough information on this in the README.md.

I would greatly appreciate it if you could provide any information you may have on the original VCF used to produce the lifted-over VCF file.

Thanks,
Gao
