This project forked from brandonsie/epitopefindr

R package to BLAST peptide sequences against each other and identify the minimal overlap of aligning regions.

Home Page: https://brandonsie.github.io/epitopefindr/

License: GNU General Public License v3.0

epitopefindr

The purpose of this package is to describe the BLAST alignments among a set of peptide sequences by reporting the overlaps of each peptide's alignments to other peptides in the set. A typical use is to input a list of peptides enriched by immunoprecipitation (e.g. by PhIP-seq) in order to identify the corresponding epitopes.

epitopefindr takes a .fasta file listing peptide sequences of interest and calls BLASTp from within R to identify alignments among these peptides. Each peptide's alignments to other peptides are then simplified to the minimal number of "non-overlapping" intervals* of the index peptide that represent all alignments to other peptides reported by BLAST. (*By default, each interval must be at least 7 amino acids long, and two intervals are considered NOT overlapping if they share 6 or fewer amino acids.) After the minimal overlaps are identified for each peptide, they are gathered into aligning groups based on the initial BLAST alignments. For each group, a multiple sequence alignment logo (motif) is generated to represent the collective sequence. Additionally, a spreadsheet is written that lists the final trimmed amino acid sequences and some metadata.
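To make the default interval rule concrete, here is a minimal base-R sketch. It is an illustration of the rule as described above, not the package's own code: the function names (shared_residues, intervals_overlap, long_enough) and the example coordinates are hypothetical.

```r
# Illustrative sketch only; these helpers are hypothetical and are not
# part of epitopefindr. Intervals are closed ranges c(start, end) on the
# index peptide, in amino acid positions.

# Number of positions shared by two intervals.
shared_residues <- function(a, b) {
  max(0L, min(a[2], b[2]) - max(a[1], b[1]) + 1L)
}

# Default rule: two intervals count as overlapping only if they share
# 7 or more residues (sharing 6 or fewer means NOT overlapping).
intervals_overlap <- function(a, b, min_shared = 7L) {
  shared_residues(a, b) >= min_shared
}

# Default rule: an interval must span at least 7 residues to be kept.
long_enough <- function(x, min_len = 7L) {
  (x[2] - x[1] + 1L) >= min_len
}

aln1 <- c(1L, 20L)
aln2 <- c(15L, 30L)  # shares positions 15-20 (6 residues) with aln1
aln3 <- c(10L, 25L)  # shares positions 10-20 (11 residues) with aln1

intervals_overlap(aln1, aln2)  # FALSE: only 6 shared residues
intervals_overlap(aln1, aln3)  # TRUE: 11 shared residues
long_enough(c(3L, 8L))         # FALSE: the interval spans 6 residues
```

Here aln1 and aln2 would be treated as two separate intervals, while aln1 and aln3 overlap.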

[Figure: workflow]

Installation:

  1. Install R (version 3.5+).
  2. Install BLAST+ (version 2.7.1). (Note: we have observed some issues with more recent versions of BLAST+ and will monitor for bugfixes.)
  3. In the R console, execute:
if (!requireNamespace("devtools")) install.packages("devtools")
devtools::install_github("brandonsie/epitopefindr")
library(epitopefindr)

Optional (Suggested) Additional Setup:

(These are not essential to epitopefindr, but are used to generate alignment logo PDFs from the alignment data, which can be valuable visualizations.)

  1. Install a TeX distribution with pdflatex (e.g. MiKTeX, TeX Live). (Optional; used to convert multiple sequence alignment TeX files to PDF.)
  2. Install pdftk (version 2.02+). (Optional; used to merge individual PDFs into a single file.) If you are unable to install pdftk, but your system has the pdfunite command line utility, you can install the R package pdfuniter, which performs a similar function. With pdfuniter, run epfind with pdftk = FALSE, pdfunite = TRUE.
    • As of epitopefindr version 1.1.30 (2020-09-20), pdftk = FALSE, pdfunite = TRUE is the default behavior. If your machine does not have the underlying pdfunite utility (e.g. macOS), try brew install poppler and then gem install pdfunite.
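Before running the pipeline, it can help to check which of the external tools mentioned above are visible from R. The sketch below uses only base R. It assumes that the executables are named blastp, pdflatex, pdftk and pdfunite, and that they are located through the system PATH; both points are assumptions and may not match your installation.

```r
# Sketch: report the R version and which external tools are on the PATH.
# The executable names are assumptions, not values read from epitopefindr.
r_ok <- getRversion() >= "3.5.0"

tools <- c("blastp", "pdflatex", "pdftk", "pdfunite")
found <- Sys.which(tools)  # returns "" for tools that are not found

status <- data.frame(
  tool      = tools,
  path      = unname(found),
  available = unname(nzchar(found)),
  stringsAsFactors = FALSE
)

cat("R version is 3.5 or newer:", r_ok, "\n")
print(status)
```

If pdflatex is reported as unavailable, calling epfind with pdflatex = FALSE skips the steps that depend on it.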

Debugging

  • epitopefindr 1.1.29 (2020-05-30) updates the DESCRIPTION file to specify sources of Bioconductor and GitHub packages. If the above installation produces issues during certain package installations, try the following:
# Install Bioconductor packages
if (!requireNamespace("BiocManager")) install.packages("BiocManager")
BiocManager::install(c("Biostrings", "IRanges", "msa", "S4Vectors"))

# Install GitHub packages
if (!requireNamespace("devtools")) install.packages("devtools")
devtools::install_github("mhahsler/rBLAST")
devtools::install_github("brandonsie/pdfuniter")
devtools::install_github("brandonsie/epitopefindr")
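To see which of these packages still need attention, a short check along the following lines may help. It is a sketch using base R only; the package list is copied from the commands above, and the variable names are arbitrary.

```r
# Sketch: report which of the packages named above can be loaded.
pkgs <- c("Biostrings", "IRanges", "msa", "S4Vectors",
          "rBLAST", "pdfuniter", "epitopefindr")

# requireNamespace() returns TRUE if the package can be loaded, else FALSE.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- names(installed)[!installed]
if (length(missing_pkgs) > 0) {
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All listed packages are installed.")
}
```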

Guide

  1. Prepare your peptides of interest. The first input parameter to epfind accepts either of two forms; the examples in this guide show a path to a .fasta file and an R object such as the bundled pairwise_viral_hits example data.
  2. To run a typical epitopefindr pipeline, try calling epfind:
# Basic call (replace the placeholder paths with your own)
epfind(data = "path/to/peptides.fasta", output.dir = "path/to/output_dir/")

# Without pdflatex or pdftk
epfind(data = "path/to/peptides.fasta", output.dir = "path/to/output_dir/",
       pdflatex = FALSE, pdftk = FALSE)

# More stringent e-value threshold
epfind(data = "path/to/peptides.fasta", output.dir = "path/to/output_dir/",
       e.thresh = 0.0001)

You can try running epfind() with some provided example data:

my_peptides <- epitopefindr::pairwise_viral_hits
epitopefindr::epfind(data = my_peptides, output.dir = "my_epf_1/")
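If you would rather start from your own .fasta file, the sketch below writes a small one using base R. The peptide names and sequences are made up for illustration, and it has not been verified that an input this small yields meaningful alignments. The epfind call is guarded by a switch so that the block also runs on a machine without epitopefindr or BLAST+.

```r
# Made-up peptide sequences for illustration only (not real data).
peptides <- c(
  pep1 = "MKTAYIAKQRQISFVKSHFSRQ",
  pep2 = "AYIAKQRQISFVKSHFSRQLEE",
  pep3 = "GSHMLEDPVDAWKRGELTTQNP"
)

# Write a minimal FASTA file: a ">" header line followed by the sequence.
fasta_path <- file.path(tempdir(), "my_peptides.fasta")
fasta_lines <- as.vector(rbind(paste0(">", names(peptides)), unname(peptides)))
writeLines(fasta_lines, fasta_path)

# Read the file back to confirm the layout.
readLines(fasta_path)

# Set to TRUE once epitopefindr and BLAST+ are installed.
run_pipeline <- FALSE
if (run_pipeline) {
  epitopefindr::epfind(data = fasta_path,
                       output.dir = file.path(tempdir(), "my_epf_2/"))
}
```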

A brief summary of the functions called by epfind:

  • pbCycleBLAST cycles through each input peptide to find the overlap of its alignments with other peptides from the input. Nested within a call to pbCycleBLAST are calls to epitopeBLAST and indexEpitopes.
  • trimEpitopes performs a second pass through the identified sequences to tidy alignments.
  • indexGroups collects trimmed sequences into aligning groups.
  • groupMSA creates a multiple sequence alignment motif logo for each group.
  • outputTable creates a spreadsheet summarizing identified sequences and epitope groups.
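As an illustration of what collecting sequences into aligning groups can mean, the base-R sketch below places sequences in the same group whenever they are linked by a chain of pairwise alignments. It is a conceptual sketch under that assumption, not the implementation of indexGroups; the function name group_by_alignment, the column names and the peptide IDs are all hypothetical.

```r
# Hypothetical pairwise alignments between trimmed sequences (made-up IDs).
pairs <- data.frame(
  query   = c("pepA", "pepB", "pepD"),
  subject = c("pepB", "pepC", "pepE"),
  stringsAsFactors = FALSE
)

# Group IDs that are connected, directly or indirectly, by an alignment.
group_by_alignment <- function(pairs) {
  ids <- unique(c(pairs$query, pairs$subject))
  group <- setNames(seq_along(ids), ids)  # each ID starts in its own group
  for (i in seq_len(nrow(pairs))) {
    g_q <- group[[pairs$query[i]]]
    g_s <- group[[pairs$subject[i]]]
    if (g_q != g_s) {
      # Merge the two groups by relabelling one with the other's label.
      group[group == g_s] <- g_q
    }
  }
  split(names(group), unname(group))
}

groups <- group_by_alignment(pairs)
print(groups)
```

With these inputs, pepA, pepB and pepC end up in one group (pepA and pepC are linked through pepB), while pepD and pepE form a second group.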

For more information, please visit the project home page: https://brandonsie.github.io/epitopefindr/
