ptuneos's Introduction

pTuneos: prioritizing Tumor neoantigen from next-generation sequencing data

pTuneos is a state-of-the-art computational pipeline for identifying personalized tumor neoantigens from next-generation sequencing data. Given raw whole-exome sequencing data and/or RNA-seq data, pTuneos calculates five important immunogenicity features to construct a machine learning-based classifier (Pre&RecNeo) that predicts and prioritizes neoantigens recognized by T cells, followed by an efficient scoring scheme (RefinedNeo) that evaluates the probability that a predicted neoepitope is naturally processed, MHC-presented and T cell-recognized.

Authors:

Chi Zhou and Qi Liu

Citation:

Zhou, C., Wei, Z., Zhang, Z. et al. pTuneos: prioritizing tumor neoantigens from next-generation sequencing data. Genome Med 11, 67 (2019) doi:10.1186/s13073-019-0679-x

Web server:

TBD

Dependencies

Hardware:

pTuneos is currently tested on x86_64 on Ubuntu 16.04.

Required software:

Required Python package:

Required R package:

Installation

Install via Docker

The Docker image of pTuneos is at https://cloud.docker.com/u/bm2lab/repository/docker/bm2lab/ptuneos. See the user manual for a detailed description of usage.

Install from source

  1. Install all software listed above.

  2. Download or clone the pTuneos repository to your local system:

     git clone https://github.com/bm2-lab/pTuneos.git
    
  3. Obtain the reference files for GRCh38. These include cDNA and peptide files; please refer to the user manual for a detailed description.

Usage

pTuneos has two modes, WES mode and VCF mode.

WES mode accepts WES and RNA-seq sequencing data as input; it conducts sequencing quality control, mutation calling, HLA typing, expression profiling, and neoantigen prediction, filtering and annotation.

VCF mode accepts a mutation VCF file, expression profile, copy number profile and tumor cellularity as input; it performs neoantigen prediction, filtering and annotation directly on the input files.

You can run these two modes with:

    python pTuneos.py WES -i config_WES.yaml

or

    python pTuneos.py VCF -i config_VCF.yaml
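
For scripted runs, the two invocations above can be wrapped in a small helper. This is only a sketch written for Python 3: `build_command` and `run_ptuneos` are hypothetical names that are not part of pTuneos, and it assumes `pTuneos.py` sits in the current working directory, as in the commands above.

```python
import subprocess

# The two modes named in this README.
VALID_MODES = ("WES", "VCF")


def build_command(mode, config_path, script="pTuneos.py"):
    """Return the argument list for one pTuneos run (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError("mode must be one of %s, got %r" % (VALID_MODES, mode))
    return ["python", script, mode, "-i", config_path]


def run_ptuneos(mode, config_path):
    """Run pTuneos and raise CalledProcessError on a non-zero exit status."""
    subprocess.check_call(build_command(mode, config_path))
```

Calling `run_ptuneos("VCF", "config_VCF.yaml")` would then be equivalent to the second command above, with a mistyped mode rejected before anything is launched.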

User Manual

For detailed information about usage, input and output files, test examples and data preparation, please refer to the pTuneos User Manual.

Contact

[email protected] or [email protected] Tongji University, Shanghai, China

ptuneos's People

Contributors

chizhou-siti, michaelchuai

ptuneos's Issues

Requirement for training data

Hi~Chi Zhou!
I was wondering whether it would be possible to provide the training data. It would help a lot!
Thanks for the amazing work!
Best,
Liang Yi

Dockerfile?

Would it be possible to provide the Dockerfile used to build the image?

HLA Typing Issue

Hi,

I am running into an issue with the HLA typing that I cannot figure out. The program fails with the message:

    Traceback (most recent call last):
      File "bin/optitype_ext.py", line 31, in <module>
        with open(input_optitype_result) as f:
    IOError: [Errno 2] No such file or directory: '/root/data/hlatyping/2021_01_08_05_28_37/2021_01_08_05_28_37_result.tsv'
    hla type process done.

And when I look at the log file for the hla typing it says:
/root/source/seqan-seqan-v2.4.0/include/seqan/basic/basic_exception.h:363 FAILED! (Uncaught exception of type std::bad_alloc: std::bad_alloc)

From my research it looks like it is possibly a memory problem but I have expanded the volume I am working on twice with no change in the error. Do you have any estimates of how much space running the entire docker image requires?

Thank you!
-Elizabeth

Stuck on the step of "snv2fasta.py"

Hi, I am stuck on the "snv2fasta.py" step. Can you help me fix it?

    $ python2 bin/snv2fasta.py -i VCF_test/somatic_mutation/test_snv_vep_ann.txt -o VCF_test/netmhc -s test -p database/Protein/human.pep.all.fa
    Traceback (most recent call last):
      File "bin/snv2fasta.py", line 90, in <module>
        gene_symbol.append(g_s)
    NameError: name 'g_s' is not defined

So I made a small change from

            if sub_ele[0]=="SYMBOL":
                    g_s= sub_ele[1]
    gene_symbol.append(g_s)

to

            if sub_ele[0]=="SYMBOL":
                    g_s= sub_ele[1]
                    gene_symbol.append(g_s)

But then I get a new error:

    $ python2 bin/snv2fasta.py -i VCF_test/somatic_mutation/test_snv_vep_ann.txt -o VCF_test/netmhc -s test -p database/Protein/human.pep.all.fa
    Traceback (most recent call last):
      File "bin/snv2fasta.py", line 109, in <module>
        wt_head='>WT_'+gene_symbol[i]+''+ ref_animo_acid[i]+str(pro_change_pos)+ alt_animo_acid[i]+''+ref_nucleotide[i]+str(cdna_position[i])+alt_nucleotide[i]+''+chrom_pos[i]+''+trans_name[i]
    IndexError: list index out of range
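
One plausible reading of the two errors: with the original indentation, `g_s` is undefined when a record has no `SYMBOL` entry, and with the append moved inside the `if`, such records are skipped, so `gene_symbol` ends up shorter than the other per-record lists and indexing by `i` fails. A minimal sketch of appending exactly one value per record follows. It assumes the annotation column holds `KEY=VALUE` pairs separated by `;`, and `extract_symbol` and the `"NA"` placeholder are hypothetical, not taken from `snv2fasta.py`.

```python
def extract_symbol(extra_field, default="NA"):
    """Return the SYMBOL value from a ';'-separated KEY=VALUE string.

    Falls back to a placeholder so that every record yields exactly one value.
    """
    for element in extra_field.split(";"):
        sub_ele = element.split("=", 1)
        if len(sub_ele) == 2 and sub_ele[0] == "SYMBOL":
            return sub_ele[1]
    return default


# Toy records for illustration only; the second one has no SYMBOL key.
extra_column = [
    "IMPACT=MODERATE;SYMBOL=TP53;CANONICAL=YES",
    "IMPACT=MODERATE;CANONICAL=YES",
    "IMPACT=MODERATE;SYMBOL=KRAS",
]

# One entry per record keeps gene_symbol aligned with the other lists.
gene_symbol = [extract_symbol(extra) for extra in extra_column]
print(gene_symbol)  # ['TP53', 'NA', 'KRAS']
```

Whether a placeholder or skipping the whole record is the right behaviour depends on how the downstream steps use the gene symbol.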

A bug in neo_pyclone_annotation.py

Some positive pMHCs are missing from my pTuneos results. I think this is caused by a bug on line 119 of the neo_pyclone_annotation.py script, which should be changed

from

    data_drop=data_fill_na.drop_duplicates(subset=["Gene","MT_pep","WT_pep"])

to

    data_drop=data_fill_na.drop_duplicates(subset=["HLA_type","Gene","MT_pep","WT_pep"])
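
The effect of the proposed change can be illustrated without pandas. The sketch below uses a small stand-in that keeps the first row per key, which is how `drop_duplicates` behaves by default; the rows are made-up toy data, not pTuneos output.

```python
def drop_duplicates(rows, subset):
    """Keep the first row for each distinct combination of the subset columns."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[col] for col in subset)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Toy data: the same mutant/wild-type peptide pair predicted for two HLA alleles.
rows = [
    {"HLA_type": "HLA-A02:01", "Gene": "GENE1", "MT_pep": "AAAAAAAAL", "WT_pep": "AAAAAAAAV"},
    {"HLA_type": "HLA-B07:02", "Gene": "GENE1", "MT_pep": "AAAAAAAAL", "WT_pep": "AAAAAAAAV"},
]

without_hla = drop_duplicates(rows, ["Gene", "MT_pep", "WT_pep"])
with_hla = drop_duplicates(rows, ["HLA_type", "Gene", "MT_pep", "WT_pep"])
print(len(without_hla), len(with_hla))  # 1 2
```

Without `HLA_type` in the key, the second peptide-HLA pair is discarded as a duplicate of the first, which matches the missing pMHCs described above.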

INDELs VEP annotation Consequence is missense_variant

cmd_vep_indel=vep_path + " -i " + somatic_out_fold + '/' + prefix + '_'+ 'INDELs_only.recode.vcf' + " --cache --dir " + vep_cache_path + " --dir_cache " + vep_cache_path + " --force_overwrite --canonical --symbol -o STDOUT --offline | filter_vep --ontology --filter \"Consequence is missense_variant\" -o " + somatic_out_fold + '/' + prefix + '_'+ 'mutect_indel_vep_ann.txt' + " --force_overwrite"

For INDELs, it seems that no variant has the Consequence missense_variant, so this filter would remove all of them?
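
For context, VEP typically annotates coding indels with Sequence Ontology terms such as `frameshift_variant`, `inframe_insertion` and `inframe_deletion` rather than `missense_variant`. A hedged sketch of building a matching `filter_vep` expression is shown below; `build_consequence_filter` is a hypothetical helper, and the choice of terms is an assumption, not the authors' fix.

```python
# Assumed set of indel consequences; adjust to the variants of interest.
INDEL_CONSEQUENCES = ("frameshift_variant", "inframe_insertion", "inframe_deletion")


def build_consequence_filter(consequences):
    """Join several 'Consequence is X' clauses with the filter_vep 'or' operator."""
    return " or ".join("Consequence is %s" % term for term in consequences)


indel_filter = build_consequence_filter(INDEL_CONSEQUENCES)
print(indel_filter)
```

The resulting string could take the place of `Consequence is missense_variant` in the `--filter` argument of the command above.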

Typo in docker pull

Hi,
you have a typo in your README: the T in ptuneos should be lowercase. The README currently reads:
"Call docker pull bm2lab/pTuneos which will download the Docker image."

Cheers

human.fasta.sa' : No such file or directory

Hello,
I'm using the docker version of pTuneos and I have a problem in stage 1: hla typing, sequence mapping and expression profiling!
I followed all the steps of the manual including the download of the reference data with bash data_download.sh.

The error is:
[bwt_restore_sa] fail to open file '/home/bioworker/project/pTuneos/database/Fasta/human.fasta.sa' : No such file or directory

Thanks
Mattia
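
The `.sa` file named in the error is one of the index files that `bwa index` writes next to a FASTA file (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`). A quick way to see which of them are absent is sketched below for Python 3; `missing_bwa_index_files` is a hypothetical helper, not part of pTuneos.

```python
from pathlib import Path

# Suffixes that `bwa index` appends to the FASTA file name.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_index_files(fasta_path):
    """Return the names of the BWA index files that do not exist for this FASTA."""
    fasta = Path(fasta_path)
    return [
        fasta.name + suffix
        for suffix in BWA_INDEX_SUFFIXES
        if not Path(str(fasta) + suffix).exists()
    ]
```

For example, `missing_bwa_index_files("/home/bioworker/project/pTuneos/database/Fasta/human.fasta")` lists whatever is missing; if the list is not empty, the index was probably never built or the build was interrupted, and running `bwa index` on that FASTA file would regenerate it.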

Error in stage5

In the Docker image, I managed to run most of the earlier stages, but there was an error in stage 5:

    Start stage 5: neoantigen filtering using Pre&RecNeo model and refined immunogenicity score scheme.
    Process Process-11:
    Traceback (most recent call last):
      File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
        self._target(*self._args, **self._kwargs)
      File "/root/pTuneos/src/core/pairendMDNA/pairendMDNAprocessor.py", line 471, in InVivoModelAndScore
        homolog_s=cal_similarity_per(M_P,H_P)
      File "/root/pTuneos/src/core/pairendMDNA/pairendMDNAprocessor.py", line 356, in cal_similarity_per
        score_pair=aligner(mut_seq,normal_seq)[0][2]
      File "/root/pTuneos/src/core/pairendMDNA/pairendMDNAprocessor.py", line 329, in aligner
        aln = pairwise2.align.localds(seq1.upper(), seq2.upper(), matrix,gap_open,gap_extend)
      File "/usr/local/lib/python2.7/dist-packages/Bio/pairwise2.py", line 408, in __call__
        return _align(**keywds)
      File "/usr/local/lib/python2.7/dist-packages/Bio/pairwise2.py", line 462, in _align
        align_globally, score_only)
      File "/usr/local/lib/python2.7/dist-packages/Bio/pairwise2.py", line 1031, in __call__
        return self.score_dict[(charA, charB)]
    KeyError: ('-', 'E')
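
The `KeyError: ('-', 'E')` suggests that one of the sequences passed to `pairwise2.align.localds` contains a `-` character, for which the substitution matrix has no entry. One possible workaround, sketched below, is to validate or clean the peptides before alignment. `clean_peptide` and `is_alignable` are hypothetical helpers, and whether such peptides should be cleaned or skipped is an assumption that depends on where the `-` comes from.

```python
# The 20 standard amino acids, as one-letter codes.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def clean_peptide(seq):
    """Upper-case a peptide and drop every character that is not a standard residue."""
    return "".join(ch for ch in seq.upper() if ch in STANDARD_AMINO_ACIDS)


def is_alignable(seq):
    """Return True if the peptide is non-empty and contains only standard residues."""
    return bool(seq) and all(ch in STANDARD_AMINO_ACIDS for ch in seq.upper())


print(clean_peptide("kl-vEa*"))  # KLVEA
print(is_alignable("KL-VEA"))    # False
```

Peptides for which `is_alignable` returns False could be logged together with their source mutation, which would also help locate the upstream step that introduces the gap character.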