D3b AutoPVS1

An automatic classification tool for PVS1 interpretation of null variants. This is a modified port of the original https://github.com/JiguangPeng/autopvs1, branched from commit 7fb1be97667e5ef576f81bf2fabbddcf9a4c7594. Major modifications include:

Removed the built-in VEP run; VEP is assumed to have been run ahead of time
Removed the CNV functions, as they are not currently used
Added parsing of an already VEP-annotated VCF file using pysam
The config file should be in the working directory (pwd); data file paths in the config are best given as full paths
Dropped hg19 references and made imports absolute instead of relative

We have noted in some sections where things stayed the same and where others changed.

A web version for AutoPVS1 is also provided: http://autopvs1.genetics.bgi.com

PREREQUISITE

1. Variant Effect Predictor (VEP)

VEP is no longer built in and should be run ahead of time. We recommend the KFDRC Germline Annotation Workflow; its CWL source code can be run on Cavatica or with any CWL runner.
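Since the tool expects VEP annotations to be present already, it can help to see what they look like in the VCF. The sketch below is not this repo's parsing code (which uses pysam); it is a standard-library illustration of how the CSQ INFO field written by VEP is laid out. The header line and record are made-up toy values, and the function names `csq_fields` and `parse_csq` are hypothetical.

```python
import re

def csq_fields(header_lines):
    """Return the CSQ subfield names declared in the VCF header."""
    for line in header_lines:
        if line.startswith("##INFO=<ID=CSQ,"):
            match = re.search(r'Format: ([^">]+)', line)
            if match:
                return match.group(1).strip().split("|")
    raise ValueError("no CSQ INFO header found; was VEP run on this file?")

def parse_csq(info, fields):
    """Yield one dict per transcript annotation found in an INFO string."""
    for entry in info.split(";"):
        if entry.startswith("CSQ="):
            # Annotations are comma-separated; subfields are pipe-separated.
            for annotation in entry[len("CSQ="):].split(","):
                yield dict(zip(fields, annotation.split("|")))

# Toy header and INFO column, invented for illustration only.
header = [
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence '
    'annotations from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|Feature">'
]
info = "AC=1;CSQ=T|stop_gained|HIGH|GENE1|TX1,T|intron_variant|MODIFIER|GENE1|TX2"

fields = csq_fields(header)
records = list(parse_csq(info, fields))
for record in records:
    print(record["Feature"], record["Consequence"])
```

A file without the CSQ header line raises an error here, which is a cheap way to catch input that was never annotated.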

2. pyfaidx (original author recommendation)

Samtools provides a function “faidx” (FAsta InDeX), which creates a small flat index file “.fai” allowing fast random access to any subsequence in the indexed FASTA file while loading a minimal amount of the file into memory.

pyfaidx module implements pure Python classes for indexing, retrieval, and in-place modification of FASTA files using a samtools compatible index.
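To make the idea of an index concrete, here is a minimal standard-library sketch of what a samtools-style index records (sequence length, byte offset, bases per line, bytes per line) and how those four numbers allow a subsequence to be fetched with a single seek. It is not pyfaidx and does not use its API; `build_fai`, `fetch`, and the toy FASTA are hypothetical names and data for illustration, and the sketch assumes uniform line lengths within each sequence.

```python
import os
import tempfile

def build_fai(fasta_path):
    """Map name -> (length, offset, linebases, linewidth), as in a .fai file."""
    index = {}
    name, pos = None, 0
    with open(fasta_path, "rb") as handle:
        for line in handle:
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                # Sequence data starts right after the header line.
                index[name] = [0, pos + len(line), 0, 0]
            elif name is not None:
                entry = index[name]
                bases = len(line.rstrip(b"\r\n"))
                if entry[2] == 0:  # first sequence line sets the line geometry
                    entry[2], entry[3] = bases, len(line)
                entry[0] += bases
            pos += len(line)
    return {key: tuple(value) for key, value in index.items()}

def fetch(fasta_path, index, name, start, end):
    """Return bases [start, end) (0-based) by seeking, not by reading the file."""
    length, offset, linebases, linewidth = index[name]
    end = min(end, length)
    if start >= end:
        return ""
    first = offset + (start // linebases) * linewidth + start % linebases
    last = offset + ((end - 1) // linebases) * linewidth + (end - 1) % linebases
    with open(fasta_path, "rb") as handle:
        handle.seek(first)
        raw = handle.read(last - first + 1)
    return raw.replace(b"\n", b"").replace(b"\r", b"").decode()

# Toy FASTA, invented for this example.
TOY = ">chr1 toy\nACGTACGTAC\nGTTTGACCAA\nGG\n>chr2\nTTTT\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.fa")
    with open(path, "w", newline="\n") as out:
        out.write(TOY)
    index = build_fai(path)
    subseq = fetch(path, index, "chr1", 8, 13)

print(index["chr1"])  # (22, 10, 10, 11)
print(subseq)         # ACGTT
```

In practice pyfaidx handles all of this, including edge cases the sketch ignores.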

3. maxentpy (original author recommendation)

maxentpy is a python wrapper for MaxEntScan to calculate splice site strength. It contains two functions. score5 is adapted from MaxEntScan::score5ss to score 5' splice sites. score3 is adapted from MaxEntScan::score3ss to score 3' splice sites. maxentpy is already included in the autopvs1 Dockerfile.

4. pyhgvs (original author recommendation)

pyhgvs provides a simple Python API for parsing, formatting, and normalizing HGVS names. The upstream package only supports Python 2; the original author modified it to support Python 3 and added some other features. pyhgvs is also included in the autopvs1 Dockerfile.

5. Configuration

pwd/config.ini

[DEFAULT]
pvs1levels = data/PVS1.level
gene_alias = data/hgnc.symbol.previous.tsv
gene_trans = data/clinvar_trans_stats.tsv

[HG38]
genome = data/hg38.fa
transcript = data/ncbiRefSeq_hg38.gpe
domain = data/functional_domains_hg38.bed
hotspot = data/mutational_hotspots_hg38.bed
curated_region = data/expert_curated_domains_hg38.bed
exon_lof_popmax = data/exon_lof_popmax_hg38.bed
pathogenic_site = data/clinvar_pathogenic_GRCh38.vcf
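As a quick sanity check before running the tool, a config of this shape can be read with Python's standard `configparser`. The sketch below is an assumption about how one might validate the file, not the repo's own loader; `load_paths` is a hypothetical helper and the config text is an abbreviated copy of the example above. It also shows that keys in [DEFAULT] are visible from the [HG38] section, and that relative paths can be resolved to full paths.

```python
import configparser
from pathlib import Path

# Abbreviated copy of the example config; in practice use parser.read("config.ini").
CONFIG_TEXT = """
[DEFAULT]
pvs1levels = data/PVS1.level
gene_alias = data/hgnc.symbol.previous.tsv
gene_trans = data/clinvar_trans_stats.tsv

[HG38]
genome = data/hg38.fa
transcript = data/ncbiRefSeq_hg38.gpe
"""

def load_paths(config_text, section, base_dir):
    """Return {key: absolute Path} for a section, including [DEFAULT] keys."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # An absolute value replaces base_dir when joined, so full paths are kept.
    return {
        key: (Path(base_dir) / value).resolve()
        for key, value in parser[section].items()
    }

paths = load_paths(CONFIG_TEXT, "HG38", Path.cwd())
missing = sorted(key for key, path in paths.items() if not path.exists())

for key, path in sorted(paths.items()):
    print(f"{key}\t{path}")
if missing:
    print("missing files:", ", ".join(missing))
```

Listing missing files up front tends to give a clearer error than a failure partway through a run.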

All reference files except the hg38 FASTA were obtained from the data directory of the original git repo and now live here. The user should provide the hg38 FASTA as part of the input.

Note: chromosome names in the FASTA file should have the chr prefix.
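The chr-prefix requirement can be checked before a run. The following is a small standard-library sketch, not part of the tool; `contigs_missing_chr` is a hypothetical helper and the in-memory FASTA is toy data. It scans header lines and reports any sequence name that lacks the prefix.

```python
import io

def contigs_missing_chr(fasta_lines):
    """Return sequence names from FASTA header lines that lack a 'chr' prefix."""
    missing = []
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            if not name.startswith("chr"):
                missing.append(name)
    return missing

# Toy FASTA content, invented for this example; pass an open file in practice.
example = io.StringIO(">chr1\nACGT\n>2 alt naming\nACGT\n>chrM\nAC\n")
bad = contigs_missing_chr(example)
print(bad)  # ['2']
```

For a full genome, reading the first column of the .fai index instead of the FASTA itself would avoid scanning the whole file.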

USAGE

python3 pathogenicity-assessment/autopvs1/autoPVS1_from_VEP_vcf.py --genome_version hg38 --vep_vcf ~/volume/VEP_TEST/AUTOPVS1_TEST/input_VEP_annotated.vcf.gz > output.autopvs1.tsv

FAQ

Please see https://autopvs1.genetics.bgi.com/faq/

TERMS OF USE

Users may freely use AutoPVS1 for non-commercial purposes as long as they properly cite it.

This resource is intended for research purposes only. For clinical or medical use, please consult professionals.

📝citation: Jiale Xiang, Jiguang Peng, Samantha Baxter, Zhiyu Peng. (2020). AutoPVS1: An automatic classification tool for PVS1 interpretation of null variants. Hum Mutat 41, 1488-1498. (Editor's choice and cover article)
