
This project forked from rdk/p2rank


P2Rank: Protein-ligand binding site prediction tool based on machine learning. Stand-alone command line program / Java library for predicting ligand binding pockets from protein structure.

Home Page: http://siret.ms.mff.cuni.cz/p2rank

License: MIT License



P2Rank

Ligand-binding site prediction based on machine learning.

Version 2.0, License: MIT

Description

P2Rank is a stand-alone command line program that predicts ligand-binding pockets from a protein structure. It achieves high prediction success rates without relying on external software for the computation of complex features or on a database of known protein-ligand templates.

Requirements

  • JRE 8 (Java 1.8) or JRE 11 (Java 11)
  • PyMOL 1.7.x for viewing visualizations (optional)

Setup

P2Rank requires no installation. Binary packages can be downloaded from the project website.

Usage

prank predict -f test_data/1fbl.pdb         # predict pockets on a single pdb file 

See more usage examples below...

Compilation

To compile P2Rank you need Gradle (https://gradle.org/). Build with ./make.sh or gradle assemble.

Algorithm

P2Rank makes predictions by scoring and clustering points on the protein's solvent accessible surface. The ligandability score of each point is determined by a machine learning model trained on a dataset of known protein-ligand complexes. For more details, see the slides and publications.

Slides: http://bit.ly/p2rank_slides
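
The score-then-cluster idea described above can be illustrated with a small Python sketch. This is not P2Rank's actual implementation: the function name, the score threshold, the linking distance, the single-linkage clustering, and the sum-of-squares pocket score are all assumptions chosen to keep the example short. It takes surface points that already carry a ligandability score, keeps the high-scoring ones, groups nearby points into pockets, and ranks the pockets.

```python
import math
from collections import defaultdict


def cluster_points(points, score_threshold=0.5, link_distance=3.0):
    """Group high-scoring points into pockets by single-linkage clustering.

    points: list of (x, y, z, score) tuples.
    Returns pockets as lists of point indices, best-scoring pocket first.
    All parameter values here are illustrative assumptions.
    """
    # Keep only points whose score passes the (hypothetical) threshold.
    kept = [i for i, p in enumerate(points) if p[3] >= score_threshold]

    # Union-find over the kept points.
    parent = {i: i for i in kept}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Link any two kept points that lie within link_distance of each other.
    for pos, a in enumerate(kept):
        for b in kept[pos + 1:]:
            if math.dist(points[a][:3], points[b][:3]) <= link_distance:
                parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for i in kept:
        clusters[find(i)].append(i)

    # Rank pockets by the sum of squared point scores (an assumed aggregate).
    def pocket_score(members):
        return sum(points[i][3] ** 2 for i in members)

    return sorted(clusters.values(), key=pocket_score, reverse=True)
```

In the real tool the point scores come from the trained model; here they are simply given as input.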

Publications

If you use P2Rank, please cite relevant papers:

  • Software article in JChem about the P2Rank pocket prediction tool
    Krivák R, Hoksza D. P2Rank: machine learning based tool for rapid and accurate prediction of ligand binding sites from protein structure. Journal of cheminformatics. 2018 Aug.
  • Conference paper introducing the P2Rank prediction algorithm
    Krivák R, Hoksza D. P2RANK: Knowledge-Based Ligand Binding Site Prediction Using Aggregated Local Features. In International Conference on Algorithms for Computational Biology 2015 Aug 4 (pp. 41-52). Springer, Cham.
  • Research article in JChem about the PRANK rescoring algorithm
    Krivák R, Hoksza D. Improving protein-ligand binding site prediction accuracy by classification of inner pocket points using local features. Journal of cheminformatics. 2015 Dec;7(1):12.

Usage Examples

The following commands can be executed in the installation directory.

Print help

prank help

Predict ligand binding sites (P2Rank algorithm)

prank predict test.ds                             # run on whole dataset (containing list of pdb files)

prank predict -f test_data/1fbl.pdb               # run on single pdb file
prank predict -f test_data/1fbl.pdb.gz            # run on single gzipped pdb file

prank predict -threads 8          test.ds         # specify no. of working threads for parallel processing
prank predict -o output_here      test.ds         # explicitly specify output directory
prank predict -c predict2.groovy  test.ds         # specify configuration file (predict2.groovy uses 
                                                    different prediction model and combination of parameters)

Evaluate prediction model

...on a file or a dataset with known ligands.

prank eval-predict -f test_data/1fbl.pdb
prank eval-predict test.ds

Prediction output

For each file in the dataset, the program produces a CSV file in the output directory named <pdb_file_name>_predictions.csv. It contains an ordered list of predicted pockets, their scores, the coordinates of their centroids, and lists of PDBSerials of adjacent amino acids and solvent exposed atoms.
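
As a rough illustration of consuming such a file, the Python sketch below reads CSV text into a list of dictionaries keyed by whatever header names the file contains. It deliberately assumes nothing about the actual column names, and it strips surrounding whitespace on the assumption that cells may be padded for alignment. The function name read_predictions is hypothetical.

```python
import csv
import io


def read_predictions(csv_text):
    """Parse predictions CSV text into a list of dicts keyed by header name.

    Whitespace around headers and values is stripped, since cells may be
    padded for alignment (an assumption about the file layout).
    """
    reader = csv.reader(io.StringIO(csv_text))
    # Skip blank lines and trim every cell.
    rows = [[cell.strip() for cell in row] for row in reader if row]
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]
```

A typical call would read the file first, for example `read_predictions(open(path).read())`, and then convert the fields of interest to numbers as needed.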

If the coordinates of the SAS points that belong to predicted pockets are needed, they can be found in visualizations/data/<pdb_file_name>_points.pdb. There, the "Residue sequence number" field (columns 23-26) of each HETATM record holds the rank of the corresponding pocket (points with value 0 do not belong to any pocket).
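
A minimal Python sketch of reading that points file is shown below. It relies on the standard fixed-width PDB layout, where columns 23-26 hold the residue sequence number and columns 31-54 hold the x, y, z coordinates. The function name points_by_pocket is hypothetical.

```python
from collections import defaultdict


def points_by_pocket(pdb_lines):
    """Map pocket rank -> list of (x, y, z) SAS point coordinates."""
    pockets = defaultdict(list)
    for line in pdb_lines:
        if not line.startswith("HETATM"):
            continue
        # Columns 23-26 (1-based): residue sequence number = pocket rank.
        rank = int(line[22:26])
        if rank == 0:
            continue  # value 0: the point belongs to no pocket
        # Columns 31-38, 39-46, 47-54: x, y, z coordinates.
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        pockets[rank].append(xyz)
    return dict(pockets)
```

The function accepts any iterable of lines, so an open file handle can be passed directly.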

Configuration

You can override the default params with a custom config file:

prank predict -c config/example.groovy  test.ds
prank predict -c example.groovy         test.ds

It is also possible to override the default params on the command line using their full names. To see the complete list of params, look into config/default.groovy.

prank predict                   -seed 151 -threads 8  test.ds
prank predict -c example.groovy -seed 151 -threads 8  test.ds
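
When P2Rank is driven from a script, the same overrides can be assembled programmatically. The Python sketch below only builds the argument list. The helper name build_predict_command is hypothetical, and it uses only the flags shown in this README (-c, -f, and "-<param> <value>" overrides). Placing the overrides before the dataset mirrors the examples above; whether other orderings are accepted is not something this sketch assumes.

```python
def build_predict_command(target, single_file=False, config=None, params=None):
    """Return the argument list for a `prank predict` invocation."""
    cmd = ["prank", "predict"]
    if config is not None:
        cmd += ["-c", config]
    # Each override becomes "-<name> <value>", as in the examples above.
    for name, value in (params or {}).items():
        cmd += ["-" + name, str(value)]
    # A single structure file is passed with -f; a dataset file is positional.
    cmd += ["-f", target] if single_file else [target]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the installation directory.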

Rescoring (PRANK algorithm)

In addition to predicting new ligand binding sites, P2Rank is also able to rescore pockets predicted by other methods (Fpocket, ConCavity, SiteHound, MetaPocket2, LISE and DeepSite are supported at the moment).

prank rescore test_data/fpocket.ds
prank rescore fpocket.ds                 # test_data/ is default 'dataset_base_dir'
prank rescore fpocket.ds -o output_dir   # test_output/ is default 'output_base_dir'

Evaluate rescoring model

prank eval-rescore fpocket.ds

Comparison with Fpocket

Fpocket is a widely used open source ligand binding site prediction program. It is fast, easy to use and well documented. As such, it was a great inspiration for this project. Fpocket is written in C and is based on a different, geometric algorithm.

Some practical differences:

  • Fpocket
    • has a much smaller memory footprint
    • runs faster when executed on a single protein
    • produces a high number of less relevant pockets (and since the default scoring function isn't very effective, the most relevant pockets often don't get to the top)
    • contains the MDpocket algorithm for pocket predictions from molecular trajectories
    • is still better documented
  • P2Rank
    • achieves significantly better identification success rates when considering top-ranked pockets
    • produces a smaller number of more relevant pockets
    • speed:
      • slower when running on a single protein (due to JVM startup cost)
      • approximately as fast on average when running on a big dataset on a single core
      • potentially much faster on multi-core machines thanks to its parallel implementation
    • has a higher memory footprint (~1 GB, but it doesn't grow much with more parallel threads)

Both Fpocket and P2Rank have many configurable parameters that influence behaviour of the algorithm and can be tweaked to achieve better results for particular requirements.

Thanks

This program builds upon software written by other people, either through library dependencies or through code included in its source tree (where no library builds were available). Notably:

Contributing

We welcome any bug reports, enhancement requests, and other contributions. To submit a bug report or enhancement request, please use the GitHub issues tracker. For more substantial contributions, please fork this repo, push your changes to your fork, and submit a pull request with a good commit message.
