d3b-center / d3b-autopvs1

This is a modified port of the original https://github.com/JiguangPeng/autopvs1, branched from commit `7fb1be97667e5ef576f81bf2fabbddcf9a4c7594`. It was reworked to work with Kids First germline annotation outputs.

License: GNU General Public License v3.0

Python 99.65% Cython 0.35%


D3b AutoPVS1


An automatic classification tool for PVS1 interpretation of null variants. This is a modified port of the original https://github.com/JiguangPeng/autopvs1, branched from commit 7fb1be97667e5ef576f81bf2fabbddcf9a4c7594. Major modifications include:

  • Removed the built-in VEP run, since VEP is assumed to have been run ahead of time
  • Removed the CNV functions, as they are not currently used
  • Parsing of an already VEP-annotated VCF file using pysam
  • The config file should be in the working directory (pwd); data file entries in the config work best as full paths
  • Dropped hg19 references and made imports absolute instead of relative

We have noted in some sections where things stayed the same and where others changed.

AutoPVS1

A web version of AutoPVS1 is also provided: http://autopvs1.genetics.bgi.com

PREREQUISITE

1. Variant Effect Predictor (VEP)

VEP must be run ahead of time; it is no longer built in. We recommend the KFDRC Germline Annotation Workflow, whose CWL source code can be run on Cavatica or with any CWL runner.
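Because the tool expects an already annotated VCF, it can be useful to confirm that VEP's CSQ annotation is present before running it. The following is a minimal standard-library sketch of that check, not the code used in this repository (which parses the VCF with pysam). The function names `open_text`, `csq_fields` and `parse_csq` are hypothetical, and the sketch assumes VEP's usual header convention of listing the CSQ sub-fields after `Format:` in the INFO description.

```python
import gzip
import re

# VEP declares the CSQ layout in a header line such as:
# ##INFO=<ID=CSQ,...,Description="... Format: Allele|Consequence|SYMBOL|...">
CSQ_FORMAT = re.compile(r'##INFO=<ID=CSQ,.*Format: ([^">]+)')


def open_text(path):
    """Open a plain or gzip/bgzip-compressed VCF as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def csq_fields(header_lines):
    """Return the CSQ sub-field names declared in the header, or [] if absent."""
    for line in header_lines:
        if not line.startswith("##"):
            break  # meta-information lines are over
        match = CSQ_FORMAT.match(line)
        if match:
            return match.group(1).strip().split("|")
    return []


def parse_csq(info, fields):
    """Turn the CSQ entry of one INFO column into one dict per annotation."""
    for entry in info.split(";"):
        if entry.startswith("CSQ="):
            return [dict(zip(fields, ann.split("|")))
                    for ann in entry[4:].split(",")]
    return []
```

A typical use would be `with open_text(vcf_path) as fh: fields = csq_fields(fh)`, stopping with an error message if `fields` is empty, since that suggests the VCF has not been through VEP.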

2. pyfaidx (original author recommendation)

Samtools provides a function “faidx” (FAsta InDeX), which creates a small flat index file “.fai” allowing for fast random access to any subsequence in the indexed FASTA file, while loading a minimal amount of the file into memory.

The pyfaidx module implements pure Python classes for indexing, retrieval, and in-place modification of FASTA files using a samtools-compatible index.
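To illustrate why a “.fai” index gives fast random access, here is a small standard-library sketch of the idea. It is not pyfaidx's implementation; `build_fai` and `fetch` are hypothetical names. It assumes every sequence line in a record has the same length except the last, which is what the samtools index format requires.

```python
import os
import tempfile


def build_fai(fasta_path):
    """Index a FASTA: name -> (length, offset, linebases, linewidth)."""
    index = {}
    name = None
    with open(fasta_path, "rb") as fh:
        while True:
            line = fh.readline()
            if not line:
                break
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                # offset is the byte position of the record's first base
                index[name] = [0, fh.tell(), 0, 0]
            elif name is not None:
                bases = len(line.rstrip(b"\r\n"))
                entry = index[name]
                if entry[2] == 0:  # first sequence line defines the layout
                    entry[2], entry[3] = bases, len(line)
                entry[0] += bases
    return {key: tuple(value) for key, value in index.items()}


def fetch(fasta_path, index, name, start, end):
    """Return bases [start, end) (0-based, half-open) by seeking directly."""
    length, offset, linebases, linewidth = index[name]
    end = min(end, length)
    if start >= end:
        return ""
    # Convert base coordinates to byte positions, skipping line terminators.
    first = offset + (start // linebases) * linewidth + start % linebases
    last = offset + (end // linebases) * linewidth + end % linebases
    with open(fasta_path, "rb") as fh:
        fh.seek(first)
        raw = fh.read(last - first)
    return raw.replace(b"\n", b"").replace(b"\r", b"").decode()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy.fa")
        with open(path, "w", newline="\n") as fh:
            fh.write(">chr1 toy\nACGT\nTTGA\nCC\n>chr2\nGGGG\n")
        idx = build_fai(path)
        print(fetch(path, idx, "chr1", 2, 9))  # GTTTGAC
```

Only the requested bytes are read, so the cost of a lookup does not depend on the size of the genome. With pyfaidx itself the equivalent lookup is usually written as `Fasta(path)[name][start:end]`.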

3. maxentpy (original author recommendation)

maxentpy is a python wrapper for MaxEntScan to calculate splice site strength. It contains two functions. score5 is adapted from MaxEntScan::score5ss to score 5' splice sites. score3 is adapted from MaxEntScan::score3ss to score 3' splice sites. maxentpy is already included in the autopvs1 Dockerfile.

4. pyhgvs (original author recommendation)

pyhgvs provides a simple Python API for parsing, formatting, and normalizing HGVS names. The upstream package supports only Python 2; the original AutoPVS1 author modified it to support Python 3 and added some other features. pyhgvs is also included in the autopvs1 Dockerfile.

5. Configuration

pwd/config.ini

[DEFAULT]
pvs1levels = data/PVS1.level
gene_alias = data/hgnc.symbol.previous.tsv
gene_trans = data/clinvar_trans_stats.tsv

[HG38]
genome = data/hg38.fa
transcript = data/ncbiRefSeq_hg38.gpe
domain = data/functional_domains_hg38.bed
hotspot = data/mutational_hotspots_hg38.bed
curated_region = data/expert_curated_domains_hg38.bed
exon_lof_popmax = data/exon_lof_popmax_hg38.bed
pathogenic_site = data/clinvar_pathogenic_GRCh38.vcf

All reference files were obtained from the data directory of the original git repo, except for the hg38 FASTA. They now live here. The user should provide the hg38 FASTA as part of the input.

Note: chromosome names in the FASTA file should have the chr prefix.
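The configuration above can be sanity-checked before a run. The sketch below uses only the standard library and is an illustration, not the repository's own loader: `load_config`, `relative_paths` and `names_without_chr` are hypothetical helpers, and mapping the `--genome_version` value to the section name by upper-casing it is an assumption. It relies on the documented `configparser` behaviour that keys in `[DEFAULT]` are visible in every other section.

```python
import configparser
import os


def load_config(path="config.ini", genome_version="hg38"):
    """Read config.ini and return the merged [DEFAULT] + genome section."""
    parser = configparser.ConfigParser()
    if not parser.read(path):
        raise FileNotFoundError(path)
    # Assumption: the section name is the upper-cased genome version.
    return dict(parser[genome_version.upper()])


def relative_paths(cfg):
    """Return the keys whose values are not absolute paths."""
    return sorted(key for key, value in cfg.items()
                  if not os.path.isabs(value))


def names_without_chr(fasta_path):
    """Return FASTA record names that lack the 'chr' prefix."""
    missing = []
    with open(fasta_path) as fh:
        for line in fh:
            if line.startswith(">"):
                name = line[1:].split()[0]
                if not name.startswith("chr"):
                    missing.append(name)
    return missing
```

Calling `relative_paths(load_config())` lists the entries that would resolve against the current directory, and `names_without_chr(cfg["genome"])` should return an empty list for a suitable FASTA. Scanning a full genome this way reads the whole file once, so on large references it may be quicker to check the first column of an existing “.fai” index instead.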

USAGE

python3 pathogenicity-assessment/autopvs1/autoPVS1_from_VEP_vcf.py --genome_version hg38 --vep_vcf ~/volume/VEP_TEST/AUTOPVS1_TEST/input_VEP_annotated.vcf.gz > output.autopvs1.tsv

FAQ

Please see https://autopvs1.genetics.bgi.com/faq/

TERMS OF USE

Users may freely use AutoPVS1 for non-commercial purposes as long as they properly cite it.

This resource is intended for research purposes only. For clinical or medical use, please consult professionals.

📝citation: Jiale Xiang, Jiguang Peng, Samantha Baxter, Zhiyu Peng. (2020). AutoPVS1: An automatic classification tool for PVS1 interpretation of null variants. Hum Mutat 41, 1488-1498. (Editor's choice and cover article)
