pharmai / plip

Protein-Ligand Interaction Profiler - Analyze and visualize non-covalent protein-ligand interactions in PDB files according to Adasme et al. (2021), https://doi.org/10.1093/nar/gkab294

Home Page: http://plip.biotec.tu-dresden.de

License: GNU General Public License v2.0

Python 99.12% Shell 0.15% Dockerfile 0.73%

bioinformatics docker openbabel pdb plip protein-structure python-bindings scientific-computing singularity

plip's Introduction

PharmAI

Analyze medication profiles using machine learning.

This project was started to provide clean and adaptable code building on medorder_prediction.

Motivation

Health-system pharmacists review almost all medication orders for hospitalized patients. Considering that most orders contain no errors, especially in the era of CPOE with CDS,¹ pharmacists have called for technology to enable the triage of routine orders to less extensive review, in order to focus pharmacist attention on unusual orders.²,³,⁴

Description

We propose a machine learning model that can operate in three modes:

In prospective mode, it learns medication order patterns from historical data, and then predicts the next medication order given a patient's previous order sequence, and currently active drugs. The predictions would be compared to actual orders. Our hypothesis is that orders ranking low in predictions would be unusual, while orders ranking high would be more likely to be routine or unremarkable.
In retrospective mode, the order patterns are learned from "stable" medication profiles, that is when no orders have been placed in a given period of time. We assume that when no orders have been entered in a certain period, the prescribed medications represent a stable and correct medication profile. We then try to predict each drug in the stable profile from what happened before this drug (the pre sequence), what happened after this drug (the post sequence) and what is currently active.
In retrospective autoencoder mode, the order patterns are also learned from "stable" medication profiles. This mode uses an autoencoder to encode the order sequence up to the point the stable profile was sampled, as well as what is currently active, to a latent space. From the latent space, the model reconstructs the active medication profile.
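The prospective-mode hypothesis (orders ranking low in the predictions are unusual) can be illustrated with a minimal sketch. The drug names, the scores and the `threshold` parameter are made up for illustration and are not taken from the repository.

```python
def rank_of_actual(scores, actual):
    """Return the 1-based rank of `actual` when drugs are sorted by descending score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(actual) + 1


def is_unusual(scores, actual, threshold=30):
    """Flag an order whose rank falls outside the top `threshold` predictions (assumed cut-off)."""
    return rank_of_actual(scores, actual) > threshold


# Hypothetical model output for one patient: drug -> predicted probability.
scores = {
    "acetaminophen": 0.41,
    "ondansetron": 0.27,
    "heparin": 0.19,
    "vancomycin": 0.09,
    "amiodarone": 0.04,
}

print(rank_of_actual(scores, "ondansetron"))         # 2
print(is_unusual(scores, "amiodarone", threshold=3))  # True
```

Where to put the cut-off between "routine" and "unusual" is a design choice that this sketch leaves open.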

This repository presents the code used to preprocess the data from our institution's dataset or from the MIMIC dataset, and to train and evaluate a machine learning model performing either of the tasks described above. Because we are unable to share our dataset, we provide a working implementation on MIMIC-III as a demonstration (see caveats below). The model uses three inputs:

The sequence of previous drug orders (the pre sequence), represented as word2vec embeddings.
The sequence of orders that happened after the prescribed drug, and before the time when orders stopped being entered for a given period of time (the post sequence), represented as word2vec embeddings. This sequence is not used in prospective mode or in retrospective autoencoder mode.
The currently active drugs and pharmacological classes, as well as the ordering department, represented as a bag-of-words, transformed into a multi-hot vector through binary count vectorization.
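As an illustration of the third input, here is a minimal pure-Python sketch of turning a bag-of-words into a multi-hot vector. The token names are hypothetical, and the repository may rely on a library vectorizer instead (scikit-learn's `CountVectorizer(binary=True)` produces the same kind of encoding).

```python
def build_vocabulary(profiles):
    """Map every token seen in any profile to a fixed column index."""
    tokens = sorted({tok for profile in profiles for tok in profile})
    return {tok: i for i, tok in enumerate(tokens)}


def multi_hot(profile, vocabulary):
    """Encode one bag of tokens as a 0/1 vector; repeated tokens still count once."""
    vector = [0] * len(vocabulary)
    for tok in profile:
        if tok in vocabulary:  # tokens unseen when building the vocabulary are ignored
            vector[vocabulary[tok]] = 1
    return vector


# Hypothetical active drugs, classes and departments for two orders.
profiles = [
    ["drug_a", "drug_b", "class_x", "dept_icu"],
    ["drug_b", "drug_b", "class_y", "dept_surgery"],
]
vocab = build_vocabulary(profiles)
print(multi_hot(profiles[1], vocab))  # [0, 1, 0, 1, 0, 1]
```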

The output is (the drug only, not the dose, route and frequency):

In prospective mode, the prediction of the next medication order
In retrospective mode, the prediction of the drug which was prescribed at the given point in the sequence and which fits given what is currently prescribed to the patient.
In retrospective autoencoder mode, the active medications.

Caveats about using the MIMIC dataset with our approach

There are important limitations to using the MIMIC dataset with this approach as compared to real, unprocessed data. We do not believe the results of the model on the MIMIC dataset would be reliable enough for use in research or in practice. The MIMIC files should only be used for demonstration purposes.

There are four main characteristics of the MIMIC dataset which make it less reliable for this model:

The MIMIC data only comes from ICU patients. As shown in our paper, ICU patients were one of the populations where our model showed the worst performance, probably because of the variability and the complexity of these patients. It would be interesting to explore how to increase the performance on this subset of patients, possibly by including more clinical data as features. Also, the same drugs may be used in patients in and out of ICU, and the information about the usage trends outside of ICU may be useful to the model, especially for the word2vec embeddings. This data is not present in MIMIC.
The MIMIC data was time-shifted inconsistently between patients. Medication use patterns follow trends over time, influenced by drug availability (new drugs on the market, drugs withdrawn from markets, drug shortages) and by clinical practices changing as new evidence is published. The inconsistent time-shifting destroys these trends; the dataset becomes effectively pre-shuffled. In our original implementation, we were careful to use only the scikit-learn TimeSeriesSplit in cross-validation to preserve these patterns, and our test set was chronologically after our training-validation set and was preprocessed separately. In our MIMIC demonstration, we changed these to regular shuffle splits, and our preprocessor generates the training-validation and test sets from the raw MIMIC files using a random sampling strategy.
The MIMIC prescription data does not include the times of the orders, only the dates. This destroys the sequence of orders within a single day. In NLP, this would be equivalent to shuffling each sentence within a text and then trying to extract sequence information. This makes the creation of word2vec embeddings much more difficult because the exact context of the orders (i.e. those that occurred immediately before and after) is destroyed. This shows in analogy accuracy, which does not go above 20-25% on MIMIC while we achieved close to 80% on our data. The sequence of orders, an important input to the model, is therefore inconsistent. Also, it becomes impossible to reliably know whether orders within the same day as the target were discontinued or not when the target was prescribed, decreasing the reliability of our multi-hot vector input.
The MIMIC dataset does not include the pharmacological classes of the drugs. The GSN (generic sequence number) allows linkage of the drug data to First Databank, which could allow extraction of classes, but this database is proprietary. One of our model inputs is therefore eliminated.
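The difference between a chronological split and a shuffle split, discussed in the second caveat, can be sketched as follows. This is a simplified pure-Python illustration with made-up orders and dates, not the splitting code used in the repository.

```python
import random


def chronological_split(orders, test_fraction=0.2):
    """Train on the earliest orders and test on the most recent ones."""
    ordered = sorted(orders, key=lambda o: o["date"])
    cut = len(ordered) - int(len(ordered) * test_fraction)
    return ordered[:cut], ordered[cut:]


def shuffled_split(orders, test_fraction=0.2, seed=0):
    """Ignore time entirely: any order can land in either set."""
    shuffled = list(orders)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - int(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical orders with ISO dates, so string comparison follows time.
orders = [{"id": i, "date": f"2015-01-{i + 1:02d}"} for i in range(10)]

train, test = chronological_split(orders)
# Every training order precedes every test order, so temporal trends are preserved.
print(max(o["date"] for o in train) < min(o["date"] for o in test))  # True
```

When the dates themselves have been shifted inconsistently, as in MIMIC, the chronological split loses its meaning, which is why the demonstration falls back to shuffle splits.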

Files

We present the files in the order they should be run to go from the original data files to a trained and evaluated model. The python script files are provided as commented files that can be used to generate Jupyter Notebooks or can be run as-is in the terminal. Some files are also provided directly as Jupyter Notebooks.

We provide all the code to work from our local dataset or from the MIMIC dataset. Although we cannot share our dataset, this code could be adapted and used by other researchers to replicate our approach on data from other hospitals.

preprocessor.py / mimic_preprocessor.py

This file is not formatted to generate a Jupyter notebook; it can only be run as a script.

This script transforms the source files, which should essentially be lists of orders and associated data, into several pickle files containing dictionaries where the keys are encounter ids and the values are the features for this encounter, in chronological order.

enc_list.pkl is a simple list of encounter ids, to allow for easy splitting into sets. profiles_list.pkl is the list of the raw order sequences in each encounter, to train the word2vec embeddings.

After being loaded and processed by the data loader in components.py, each order gets considered as a label (targets.pkl). The features associated with this label are:

The sequence of orders preceding it within the encounter (pre_seq_list.pkl). Orders happening at the exact same time are kept in the sequence. In MIMIC, because order times are precise to the day, this means each order that happened in the same day is present in the sequence (except the label).
The sequence of orders after the target within the encounter until orders stopped being entered for a given period of time (post_seq_list.pkl). Orders happening at the exact same time are kept in the sequence. In MIMIC, because order times are precise to the day, this means each order that happened in the same day is present in the sequence (except the label).
The active drugs at the time the label was ordered (active_meds_list.pkl). Orders happening at the same time as the label are considered active.
The active pharmacological classes at the time the label was ordered (active_classes_list.pkl). This is not created by the MIMIC preprocessor.
The department where the order happened (depa_list.pkl).
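The feature construction described above can be sketched for a single encounter as follows. The field names (`drug`, `start`, `stop`) and the integer timestamps are hypothetical. For brevity the post sequence runs to the end of the encounter rather than stopping at a quiet period, and the label is assumed not to count among its own active drugs.

```python
def features_for(orders, index):
    """Build the inputs for the order at `index`, which is treated as the label."""
    label = orders[index]
    t = label["start"]
    pre_seq = [o["drug"] for o in orders[:index]]
    post_seq = [o["drug"] for o in orders[index + 1:]]
    # Active: started at or before the label time and not yet stopped.
    # Orders placed at the same time as the label count as active.
    active = [
        o["drug"]
        for i, o in enumerate(orders)
        if i != index and o["start"] <= t and (o["stop"] is None or o["stop"] > t)
    ]
    return {"target": label["drug"], "pre_seq": pre_seq,
            "post_seq": post_seq, "active_meds": active}


# Hypothetical encounter, already in chronological order.
encounter = [
    {"drug": "A", "start": 1, "stop": None},
    {"drug": "B", "start": 2, "stop": 3},
    {"drug": "C", "start": 3, "stop": None},
    {"drug": "D", "start": 3, "stop": None},
    {"drug": "E", "start": 5, "stop": None},
]
print(features_for(encounter, 2))
```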

original version

Arguments:

--mode	'retrospective' or 'prospective' depending on how the preprocessed data will be used.
--sourcefile	indicates where the original data, in csv format, is located.
--definitionsfile	indicates a separate file linking medication numbers to full medication names and pharmacological classes.
--numyears	indicates how many years of data to process from the file (starting from the most recent). Defaults to 5.

mimic version

Arguments:

--mode	'retrospective' or 'prospective' depending on how the preprocessed data will be used. Retrospective autoencoder mode is not yet implemented for the mimic preprocessor.

w2v_embeddings.py

Find the best word2vec training hyperparameters to maximize the accuracy on a list of analogies. We provide a list of pairs for the mimic dataset where the semantic relationship is going from a drug in tablet form to a drug in oral solution form (mimic/data/pairs.txt), as described in our paper. The file utils/w2v_analogies.py transforms these pairs into an analogy file (mimic/data/eval_analogy.txt) matching specifications for the gensim accuracy evaluation method that is used for scoring.
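The transformation from pairs to an analogy file can be sketched as below. This is not necessarily how utils/w2v_analogies.py does it; the drug tokens and the section name are made up. The output follows the layout read by gensim's word-analogy evaluation: a `: section` header, then four tokens per line.

```python
from itertools import permutations


def build_analogies(pairs, section="tablet-to-oral-solution"):
    """Combine every ordered couple of distinct pairs into an 'a b c d' analogy line."""
    lines = [f": {section}"]
    for (a, b), (c, d) in permutations(pairs, 2):
        lines.append(f"{a} {b} {c} {d}")
    return lines


# Hypothetical (tablet, oral solution) token pairs.
pairs = [
    ("drug1_tab", "drug1_sol"),
    ("drug2_tab", "drug2_sol"),
    ("drug3_tab", "drug3_sol"),
]
analogies = build_analogies(pairs)
print("\n".join(analogies))
```

With n pairs this yields n × (n − 1) analogies, so three pairs give six lines plus the section header.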

The script performs grid search with 3-fold cross-validation to explore the hyperparameter space, then refits on the whole data with the best hyperparameters and returns the analogy accuracy on the entire dataset. Clustering on 3D UMAP-projected word2vec embeddings is explored to qualitatively evaluate whether clusters correlate with clinical concepts, and a 3D plot is returned showing the projected embeddings with color-coded clusters. The clustering part is not used in the subsequent neural network.

We provide a Jupyter Notebook showing our summary exploration of the hyperparameter space on the MIMIC dataset. Performance is poor on this dataset, see caveats above.

Once the word2vec hyperparameters are found, they should be adjusted in the train.py file.

train.py

This file is used to begin training and to resume training a partially trained model. The file contains a parameter dictionary used to adjust the training mode and parameters as well as training hyperparameters. First, this script should be used to explore the hyperparameter space and configuration of the neural network by enabling the CROSS_VALIDATE flag. Experiments with a validation set but without cross-validation can be performed by setting the CROSS_VALIDATE flag to False and the VALIDATE flag to True. A final model can be trained on the entire dataset by setting both flags to False.

This parameter dictionary contains a RESTRICT_DATA flag that can be enabled to use only a sample of orders instead of the whole dataset. The sample size can be adjusted. This can be useful to try different things and to debug faster.
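The flag combinations described for train.py can be summarized in a small sketch. The flag names come from the description above; the function, its return strings and the precedence when both flags are True are assumptions made for illustration.

```python
def training_plan(params):
    """Translate the two flags into the kind of run they select."""
    if params["CROSS_VALIDATE"]:
        # Assumed precedence: cross-validation wins if both flags are set.
        return "cross-validation"
    if params["VALIDATE"]:
        return "single validation split"
    return "final model on the entire dataset"


print(training_plan({"CROSS_VALIDATE": True, "VALIDATE": False}))
print(training_plan({"CROSS_VALIDATE": False, "VALIDATE": True}))
print(training_plan({"CROSS_VALIDATE": False, "VALIDATE": False}))
```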

evaluate.py

This script uses the test subset of the data to determine the final performance metrics of the model. It computes global metrics and metrics by patient category, using a dictionary mapping departments to patient categories, specified in a department file that must be encoded manually.

In autoencoder mode, it shows samples of reconstructed patient profiles.

components.py

This file is not meant to be run, but is a collection of classes and functions used in the previous scripts. The neural network architecture can be adjusted within this file.

utils/dummy_class.py

This script acts as a dummy classifier, calculating accuracy metrics if the top1, top10 and top30 most popular drugs in the dataset were always predicted. It also returns the number of classes within that set and a frequency histogram of the 50 most popular classes.
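The idea of this baseline can be sketched in a few lines. This is an illustration with made-up targets, not the code of utils/dummy_class.py.

```python
from collections import Counter


def topk_baseline_accuracy(targets, k):
    """Accuracy obtained if the k most frequent drugs were always predicted."""
    top = {drug for drug, _ in Counter(targets).most_common(k)}
    hits = sum(1 for t in targets if t in top)
    return hits / len(targets)


# Hypothetical label distribution: 5 x "a", 3 x "b", 2 x "c".
targets = ["a"] * 5 + ["b"] * 3 + ["c"] * 2

print(topk_baseline_accuracy(targets, 1))  # 0.5
print(topk_baseline_accuracy(targets, 2))  # 0.8
print(len(set(targets)))                   # 3 classes
```

A trained model is only informative to the extent that it beats these frequency-based numbers.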

utils/extract_druginfo_mimic.py

This script extracts the drug information from the MIMIC PRESCRIPTIONS table to a definitions.csv file providing the formulary drug code in relation to a computed string including the drug name, product strength and pharmaceutical form. This can be useful to match formulary drug codes to human-readable strings. Be careful, some formulary codes match to multiple strings.

utils/w2v_analogies.py

This script takes a list of pairs of drugs and computes analogies, see w2v_embeddings above.

Prerequisites

Developed using Python 3.7

Requires:

Joblib
Numpy
Pandas
Scikit-learn
Scikit-plot
UMAP
Matplotlib
Pydot
Graphviz
TQDM
Seaborn
Gensim
Tensorflow 2.0 or later
Jupyter

Contributors

Maxime Thibault.

References

Paper currently under peer review for publication.
Abstract presented at the Machine Learning for Healthcare 2019 conference
Abstract (spotlight session 6 abstract 2)
Spotlight presentation (from 52:15 to 54:15)
Poster

License

GNU GPL v3

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.


plip's Issues

Segmentation fault with PDB ID 3g71

I am getting a new segmentation fault on PDB ID 3g71. Latest PLIP version v1.4.2.

zohixe92@worker02:~/plip/excluded_ligands$ plipcmd -i 3g71 --verbose -x

**********************************************
* Protein-Ligand Interaction Profiler v1.4.2 *
**********************************************


Checking status of PDB ID 3g71 ... entry is up to date.
Downloading file from PDB ... file downloaded as ./3g71.pdb


Starting analysis of 3g71.pdb
===============================
75 lines automatically fixed in PDB input file.
PDB structure successfully read.
Segmentation fault

Difference between `atoms` and `orig_atoms` attributes

Hello, first of all thanks for the great package.

I'm playing around with some of the interaction objects returned by the Python package, and I came across the pication and pistack ones:

Out[24]: pication(ring=aromatic_ring(atoms=[<pybel.Atom object at 0x7f88b75ee110>,
<pybel.Atom object at 0x7f88b75ee190>, <pybel.Atom object at 0x7f88b75ee210>, <pybel.Atom
object at 0x7f88b75ee310>, <pybel.Atom object at 0x7f88b75ee410>], orig_atoms=[<pybel.Atom
 object at 0x7f88b80f06d0>, <pybel.Atom object at 0x7f88b80f09d0>, <pybel.Atom object at 
0x7f88b80f0990>, <pybel.Atom object at 0x7f88b80f0a90>, <pybel.Atom object at 
0x7f88b80f0b10>], atoms_orig_idx=[224, 225, 226, 228, 230], normal=array([-0.34151512, 
-0.93419256,  0.10320694]), obj=<openbabel.OBRing; proxy of <Swig Object of type 
'OpenBabel::OBRing *' at 0x7f88b80ec0c0> >, center=[5.3118, 22.6008, 35.391999999999996], 
type='5-membered'), charge=lcharge(atoms=[<pybel.Atom object at 0x7f88b6499fd0>], 
orig_atoms=[<pybel.Atom object at 0x7f88b64ad350>], atoms_orig_idx=[5556], type='positive',
 center=[3.28, 19.829, 36.006], fgroup='tertamine'), distance=3.4911434344638432, 
offset=0.9939751145873313, type='regular', restype='HIS', resnr=17, reschain='A', 
restype_l='MOL', resnr_l=1, reschain_l='Z', protcharged=False)

I was wondering, what is the difference between the list of atoms in the atoms attribute and the one in orig_atoms?

tempfile not imported in modules/supplemental.py

needed in l.39: return tempfile.mktemp(prefix=prefix, suffix='.pdb', dir=direc)
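A minimal sketch of the fix, assuming the quoted line sits in a small helper; the function name `tmpfile` and its signature are assumptions, and only the `return` line is taken from the report.

```python
import tempfile  # the import reported as missing


def tmpfile(prefix, direc):
    """Return a path for a not-yet-existing .pdb file inside `direc`."""
    return tempfile.mktemp(prefix=prefix, suffix='.pdb', dir=direc)


path = tmpfile("plip_", tempfile.gettempdir())
print(path)
```

Note that `tempfile.mktemp` only generates a name and does not create the file.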

Include API in dockerized PLIP

Is your feature request related to a problem? Please describe.
To try out or to deploy PLIP within a microservice architecture, it would be helpful to have an existing API already running as a service inside the dockerized version of PLIP.

Describe the solution you'd like
Offer all command line functionality, including batch processing, inside an API shipped with dockerized PLIP. A modern, self-documenting API framework such as FastAPI would be nice.

Describe alternatives you've considered
Offer a limited set of basic functionality first and/or choose another API framework.

Additional context
Having the results in JSON format (see #76) should be solved first.

Can't process large structures with more than 99999 atoms

An example is 4V9O:

$ plipcmd -i 4V9O
Error:  No file in PDB format available from wwPDB for the given PDB ID.

I currently have the structure stored in an "invalid" PDB format where atoms > 99999 are replaced by *****, but PLIP can't read the file:

/scicore/home/schwede/zohixe92/SMNG/stage/bin/sm run-plip -f /scicore/home/schwede/zohixe92/SMTL_ribosome/entries/4v/9o/biounit.1.pdb -o /scicore/home/schwede/zohixe92/SMTL_ribosome/entries/4v/9o --name plip_report.1 -x
Traceback (most recent call last):
  File "/opt/pliptool/plip/plipcmd", line 291, in <module>
    main(expanded_path, arguments.pdbid)  # Start main script
  File "/opt/pliptool/plip/plipcmd", line 152, in main
    process_pdb(inputstruct, config.OUTPATH, as_string=read_from_stdin, outputprefix=outputprefix)
  File "/opt/pliptool/plip/plipcmd", line 57, in process_pdb
    mol.load_pdb(pdbfile, as_string=as_string)
  File "/opt/pliptool/plip/modules/preparation.py", line 1286, in load_pdb
    pdbparser = PDBParser(pdbpath, as_string=as_string)  # Parse PDB file to find errors and get additonal data
  File "/opt/pliptool/plip/modules/preparation.py", line 28, in __init__
    self.proteinmap, self.modres, self.covalent, self.altconformations, self.corrected_pdb = self.parse_pdb()
  File "/opt/pliptool/plip/modules/preparation.py", line 61, in parse_pdb
    corrected_line, newnum = self.fix_pdbline(line, lastnum)
  File "/opt/pliptool/plip/modules/preparation.py", line 128, in fix_pdbline
    currentnum = int(pdbline[6:11])
ValueError: invalid literal for int() with base 10: '*****'

An alternative would be to read files in the mmCIF format, but that doesn't seem possible at the moment either.

I believe there are currently 482 structures with > 99999 atoms on PDB that I'd like to analyze.
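One possible workaround for the `ValueError` in the traceback, shown as a sketch only: fall back to a running counter when the serial field is not numeric. The helper name `parse_serial` is hypothetical and this is not PLIP's actual `fix_pdbline`. Only the slice `pdbline[6:11]` comes from the traceback.

```python
def parse_serial(pdbline, lastnum):
    """Read the atom serial from columns 7-11, or continue counting if it overflowed."""
    field = pdbline[6:11]
    try:
        return int(field)
    except ValueError:
        # e.g. '*****' written by tools once the serial exceeds 99999
        return lastnum + 1


print(parse_serial("ATOM  99999  CA  ALA A   1", 99998))  # 99999
print(parse_serial("ATOM  *****  CA  ALA A   1", 99999))  # 100000
```

The recovered number no longer fits the five-column field, so it can serve as an internal index but cannot be written back into a standard PDB line unchanged.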

Would it be possible to add a {-c|--chimera} option?

So that users can load PLIP results into UCSF chimera?
I know about the -y|--pymol option, but I am a chimera user...
To create graphical annotations for UCSF chimera, this file format
is quite useful:
https://www.cgl.ucsf.edu/chimera/docs/UsersGuide/bild.html
Regards,
Francois.

Is 1.3.5 ready for python 3?

If so, which Python 3 version is 1.3.5 ready for: 3.5, 3.6, or 3.7?

Thanks

Note: I tried this version (1.3.5) using Python 2.7.13 and PyMOL 1.7.6 for Windows and got an error for 1bju.pdb.

Segmentation fault with some files

I have 3 files in PDB format that cause a segmentation fault when I run plipcmd.

4lck.biounit.2.pdb.gz (a small file)
1yj9.biounit.1.pdb.gz (a large file) but this filtered one works fine: 1yj9.biounit.1.filtered.pdb.gz
1ijs.biounit.6.filtered.pdb.gz (this one only contains a single ligand chain of DNA so it is maybe not surprising to see it crash, though not with a segfault)

All the files I use are generated by OpenStructure, and only these 3 appear to cause a crash. I don't actually know if they are due to a single or multiple bugs. The PDB files from RCSB work fine.

$ gunzip 4lck.biounit.2.pdb.gz
$ plipcmd -f 4lck.biounit.2.pdb
Segmentation fault (core dumped)

I can see it crash with plip 1.4.0 and 1.4.1, both running Python 2.7.11 and OpenBabel 2.4.1.

I tried to run it on your web server and the job didn't get through either:

Sorry, your job has been failed for some reason. Please try again or contact us, if the problem still persists

So it doesn't seem to be specific to our system.

TypeError: unhashable type: 'OBAtom'

platform: macOS Catalina
python: 3.7.7

With reference to #46 I have commented out the three lines from supplemental.py

Error:

$ python3 plip/plipcmd.py -I 6lu7

/usr/local/lib/python3.7/site-packages/openbabel/__init__.py:14: UserWarning: "import openbabel" is deprecated, instead use "from openbabel import openbabel"
  warnings.warn('"import openbabel" is deprecated, instead use "from openbabel import openbabel"')
Traceback (most recent call last):
  File "plip/plipcmd.py", line 311, in <module>
    main_init()
  File "plip/plipcmd.py", line 307, in main_init
    main(expanded_path, arguments.pdbid)  # Start main script
  File "plip/plipcmd.py", line 171, in main
    process_pdb(pdbpath, config.OUTPATH, outputprefix=outputprefix)
  File "plip/plipcmd.py", line 63, in process_pdb
    mol.load_pdb(pdbfile, as_string=as_string)
  File "/Users/navanchauhan/Desktop/nCOV-19/scripts/plip/plip/modules/preparation.py", line 1354, in load_pdb
    ligandfinder = LigandFinder(self.protcomplex, self.altconf, self.modres, self.covalent, self.Mapper)
  File "/Users/navanchauhan/Desktop/nCOV-19/scripts/plip/plip/modules/preparation.py", line 229, in __init__
    self.ligands = self.getligs()
  File "/Users/navanchauhan/Desktop/nCOV-19/scripts/plip/plip/modules/preparation.py", line 274, in getligs
    ligands.append(self.extract_ligand(kmer))
  File "/Users/navanchauhan/Desktop/nCOV-19/scripts/plip/plip/modules/preparation.py", line 311, in extract_ligand
    hetatoms_res = set([(obatom.GetIdx(), obatom) for obatom in pybel.ob.OBResidueAtomIter(obresidue)
TypeError: unhashable type: 'OBAtom'
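The error arises because a tuple can only be placed in a set if every element is hashable. The sketch below reproduces it with a stand-in class (not the real OBAtom) and shows one possible workaround: deduplicate by atom index. Whether this matches the fix adopted in PLIP is not known here.

```python
class FakeAtom:
    """Stand-in for an unhashable atom: defining __eq__ without __hash__ disables hashing."""

    def __init__(self, idx):
        self.idx = idx

    def __eq__(self, other):
        return isinstance(other, FakeAtom) and self.idx == other.idx


atoms = [FakeAtom(1), FakeAtom(2), FakeAtom(1)]

try:
    set((a.idx, a) for a in atoms)  # same pattern as in the traceback
    raised = False
except TypeError:
    raised = True
print(raised)  # True

# Workaround: key on the hashable index only, then sort by that index.
unique_by_idx = {a.idx: a for a in atoms}
hetatoms = sorted(unique_by_idx.items(), key=lambda item: item[0])
print([idx for idx, _ in hetatoms])  # [1, 2]
```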

OSX fails with resource.getrlimit (supplemental.py)

if os.name != 'nt':  # Resource module not available for Windows
        maxsize = resource.getrlimit(resource.RLIMIT_STACK)[-1]
        resource.setrlimit(resource.RLIMIT_STACK, (min(2 ** 28, maxsize), maxsize))

Causes the following error on OSX with python 3.6:

ValueError: current limit exceeds maximum limit

This has been encountered in a few other places, e.g.:
cea-hpc/clustershell#285

This is easily fixed by skipping the call on macOS, as is already done for Windows.

if os.name != 'nt' and platform.system() != 'Darwin':  
        maxsize = resource.getrlimit(resource.RLIMIT_STACK)[-1]
        resource.setrlimit(resource.RLIMIT_STACK, (min(2 ** 28, maxsize), maxsize))
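An alternative that avoids a platform check is to attempt the call and tolerate failure. This is a sketch, not the change made in PLIP; the function name `raise_stack_limit` and the handling of unlimited limits are assumptions.

```python
def raise_stack_limit(target=2 ** 28):
    """Try to raise the soft stack limit to `target` bytes; return True on success."""
    try:
        import resource  # not available on Windows
    except ImportError:
        return False
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    # Never ask for more than the hard limit allows.
    wanted = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if soft == resource.RLIM_INFINITY or soft >= wanted:
        return True  # already large enough, nothing to change
    try:
        resource.setrlimit(resource.RLIMIT_STACK, (wanted, hard))
    except (ValueError, OSError):
        # e.g. "current limit exceeds maximum limit" as reported on macOS
        return False
    return True


print(raise_stack_limit())
```

Callers can ignore the return value, or log it to make clear that deep recursion may be less safe on that platform.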

plipcmd returns IndexError with PDBID 3q45

Hi,

plipcmd raises an IndexError when PDB ID 3q45 is used as an input. Apparently it has to do with the metal complex detection step. Here is the sample output:

$ plipcmd -v -i 3q45

**********************************************
* Protein-Ligand Interaction Profiler v1.3.1 *
**********************************************


Checking status of PDB ID 3q45 ... entry is up to date.
Downloading file from PDB ... file downloaded as ./3q45.pdb


Starting analysis of 3q45.pdb
===============================
PDB structure successfully read.
Analyzing 9 ligands...

MG-DAL-VAL [SMALLMOLECULE+ION] -- MG:D:1604 + DAL:D:24...
----------------------------------------------------------
  Binding site atoms in vicinity (7.5 A max. dist: 216).
  Reduced number of hydrophobic contacts from 15 to 6.
  Metal ion Mg complexed with square.planar geometry (coo. number 4/ 4 observed).
  Ligand interacts with 14 binding site residue(s) in chain(s) D.
  Complex uses 3 salt bridge(s), 5 hydrogen bond(s).

MG-DAL-VAL [SMALLMOLECULE+ION] -- MG:E:1605 + DAL:E:24...
----------------------------------------------------------
  Binding site atoms in vicinity (7.5 A max. dist: 214).
  Reduced number of hydrophobic contacts from 14 to 8.
  Metal ion Mg complexed with square.planar geometry (coo. number 4/ 4 observed).
  Ligand interacts with 17 binding site residue(s) in chain(s) E.
  Complex uses 3 salt bridge(s), 6 hydrogen bond(s).

MG-DAL-VAL [SMALLMOLECULE+ION] -- MG:H:1608 + DAL:H:24...
----------------------------------------------------------
  Binding site atoms in vicinity (7.5 A max. dist: 219).
  Reduced number of hydrophobic contacts from 15 to 8.
  Metal ion Mg complexed with trigonal.pyramidal geometry (coo. number 3/ 6 observed).
  Ligand interacts with 17 binding site residue(s) in chain(s) H.
  Complex uses 3 salt bridge(s), 7 hydrogen bond(s).

MG-DAL-VAL [SMALLMOLECULE+ION] -- MG:I:1609 + DAL:I:24...
----------------------------------------------------------
  Binding site atoms in vicinity (7.5 A max. dist: 217).
  Reduced number of hydrophobic contacts from 19 to 10.
  Metal ion Mg complexed with square.planar geometry (coo. number 4/ 6 observed).
  Ligand interacts with 17 binding site residue(s) in chain(s) I.
  Complex uses 3 salt bridge(s), 5 hydrogen bond(s).

MG-DAL-VAL [SMALLMOLECULE+ION] -- MG:B:1602 + DAL:B:24...
----------------------------------------------------------
  Binding site atoms in vicinity (7.5 A max. dist: 213).
  Reduced number of hydrophobic contacts from 15 to 8.
Traceback (most recent call last):
  File "/.../plip/plipcmd", line 321, in <module>
    main(expanded_path, arguments.pdbid)  # Start main script
  File "/.../plip/plipcmd", line 224, in main
    process_pdb(pdbpath, config.OUTPATH)
  File "/.../plip/plipcmd", line 80, in process_pdb
    mol.characterize_complex(ligand)
  File "/.../plip/modules/preparation.py", line 1281, in characterize_complex
    pli_obj = PLInteraction(lig_obj, bs_obj, self)
  File "/.../plip/modules/preparation.py", line 512, in __init__
    self.bindingsite.metal_binding)
  File "/.../plip/modules/detection.py", line 365, in metal_complexation
    next_total = all_total[i + 1]
IndexError: list index out of range

System information

PLIP 1.3.1
Ubuntu 14.04.5 LTS
Open Babel 2.3.2+dfsg-1.1
PyMOL 1.7.0.0
ImageMagick 6.7.7-10 2016-06-01 Q16

How does PLIP handle multiple models in a PDB structure?

Hi, sorry to bother you. I am confused about how PLIP handles multiple models in a PDB structure.

Take 1lxf.pdb (downloaded from RCSB at https://files.rcsb.org/download/1lxf.pdb), which has 30 MODELs.

Which MODEL does PLIP use? Is it the first one, or is one chosen according to some rule?

Thanks for taking the time to read this message! Looking forward to your reply.

restype_lig and other text fields converted to float

I just ran into an issue with 1HDQ, where the ligand, INF, is wrongly converted to a floating point number representation of positive infinity, which causes issues downstream.

plipcmd -i 1hdq -x

And then:

>>> import plip.modules.plipxml as plip
>>> report = plip.PLIPXML("report.xml")
>>> print(report.bsites['INF:A:1308'].hydrophobics[0].restype_lig)
inf
>>> print(type(report.bsites['INF:A:1308'].hydrophobics[0].restype_lig))
<type 'float'>

When dumped into json the ligand name becomes Infinity (without quotes) instead of 'INF' (with quotes) so I cannot get the ligand back. When dumped with ujson an OverflowError is raised.

There are other ligand names which can be interpreted as numbers, for instance 1ONY has a ligand named 588, and a ligand named NAN exists, although it isn't used in any PDB entry so far.
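The behaviour can be reproduced without PLIP. The sketch below contrasts a "try float first" conversion with a conversion driven by the field name; how plipxml actually decides is not shown in this report. The function names and the set of numeric fields (`dist`, `resnr`) are assumptions.

```python
import json


def naive_cast(text):
    """Convert anything that parses as a float, which includes 'INF', 'NAN' and '588'."""
    try:
        return float(text)
    except ValueError:
        return text


def typed_cast(field, text, numeric_fields=("dist", "resnr")):
    """Convert only fields known to be numeric; names and identifiers stay strings."""
    return float(text) if field in numeric_fields else text


print(json.dumps(naive_cast("INF")))                 # Infinity
print(json.dumps(typed_cast("restype_lig", "INF")))  # "INF"
print(typed_cast("dist", "3.49"))                    # 3.49
```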

PLIP installation problem

I’m trying to install PLIP on my Linux machine and got an error during setting.
Pymol 1.8.0.2 from Schrodinger is installed on this linux machine but I don’t understand the error (see below).

Does it need a PyMOL Python module, and if so, where should this module be located?

Cheers, FR.

ERROR:

Processing dependencies for plip==1.2.0
Searching for pymol
Reading https://pypi.python.org/simple/pymol/
No local packages or download links found for pymol
error: Could not find suitable distribution for Requirement.parse('pymol')

Not a recognised Open Babel descriptor type

Dear developers, when I run this command

plipcmd -i 1ATP -v

I get the following error:

ValueError: �Ɵ� is not a recognised Open Babel descriptor type

Can you help me?

interesting related project

Hello,

At RCSB, they have a ligand viewer nowadays:
https://www.rcsb.org/3d-view/1OXR
then choose Ligand View and select AIN (aspirin).

They can detect all those interactions:
Hydrogen Bonds (blue)
Halogen Bonds (turquoise)
Hydrophobic Contacts (grey)
Pi Interactions (orange, green)
Metal Interactions (purple)

Does plip have all of them?

The molecular viewer is open source:
https://github.com/arose/ngl

Plip is even more useful because it is a standalone program one can run
from the command line.

Thanks,
F.

Example command "plip -i 1vsn -yv" is not working

$ plip -i 1vsn -yv
...
Checking status of PDB ID 1vsn ... entry is up to date.
Downloading file from PDB ... file downloaded as ./1vsn.pdb
...
PDB structure successfully read.
Analyzing one ligand...
...
Contains 2 aromatic ring(s).
Traceback (most recent call last):
  File "/home/user/bin/source/pliptool/plip/plipcmd", line 306, in <module>
    main(expanded_path, arguments.pdbid) # Start main script
  File "/home/user/bin/source/pliptool/plip/plipcmd", line 172, in main
    process_pdb(pdbpath, config.OUTPATH, outputprefix=outputprefix)
  File "/home/user/bin/source/pliptool/plip/plipcmd", line 68, in process_pdb
    mol.characterize_complex(ligand)
  File "/home/user/bin/source/pliptool/plip/modules/preparation.py", line 1390, in characterize_complex
    lig_obj = Ligand(self, ligand)
  File "/home/user/bin/source/pliptool/plip/modules/preparation.py", line 1001, in __init__
    descvalues = self.molecule.calcdesc()
  File "/usr/lib/python2.7/dist-packages/pybel.py", line 361, in calcdesc
    raise ValueError("%s is not a recognised Open Babel descriptor type" % descname)
ValueError: @w7� is not a recognised Open Babel descriptor type

#System set-up:
Python 2.7.2
Open Babel version: 2.3.2
Ubuntu 16.04 LTS

--name OUTPUTFILENAME not available in 1.4.1

I'm sorry I'm realizing this only now, as I'm rebuilding our singularity container to get it into an lmod module. It appears that the --name option didn't make it to v1.4.1 but is only available in the master branch which wasn't merged before the release.

Is it possible to have a new release with the --name option available?

missing dependency

Hi. When I install the command line version, the builtins module is missing. Fixed by "sudo pip2 install future". I'm using Python 2.7.12 on Ubuntu 16.04 LTS.

the webserver is down

http://plip.biotec.tu-dresden.de/

It would be nice to see it up and running again...

ImportError: cannot import name pyparsing_common

Hi,

I have been trying to download "plip" on my Mac laptop to use in a drug docking pipeline. When I try to run "plip" from the bash command line "$ plip" (after creating the alias) and from python, I get this same error:

Traceback (most recent call last):
  File "/Users/gaby/pliptool/plip/plipcmd", line 33, in <module>
    from modules.preparation import *
  File "/Users/gaby/pliptool/plip/modules/preparation.py", line 24, in <module>
    from detection import *
  File "/Users/gaby/pliptool/plip/modules/detection.py", line 24, in <module>
    from supplemental import *
  File "/Users/gaby/pliptool/plip/modules/supplemental.py", line 38, in <module>
    import pybel
  File "/Users/gaby/Library/Python/2.7/lib/python/site-packages/pybel/__init__.py", line 61, in <module>
    from . import canonicalize
  File "/Users/gaby/Library/Python/2.7/lib/python/site-packages/pybel/canonicalize.py", line 14, in <module>
    from .parser.language import rev_abundance_labels
  File "/Users/gaby/Library/Python/2.7/lib/python/site-packages/pybel/parser/__init__.py", line 5, in <module>
    from .parse_bel import BelParser
  File "/Users/gaby/Library/Python/2.7/lib/python/site-packages/pybel/parser/parse_bel.py", line 16, in <module>
    from .modifiers import *
  File "/Users/gaby/Library/Python/2.7/lib/python/site-packages/pybel/parser/modifiers/__init__.py", line 3, in <module>
    from .fragment import FragmentParser
  File "/Users/gaby/Library/Python/2.7/lib/python/site-packages/pybel/parser/modifiers/fragment.py", line 59, in <module>
    from pyparsing import pyparsing_common as ppc, Keyword, Optional
ImportError: cannot import name pyparsing_common

If I try the following:

$ python

import plip
import openbabel
import pyparsing

there is no error. However, when I run the following command

from plip.modules.preparation import PDBComplex

I get the same error mentioned above.

Now, regarding pyparsing_common: before trying the plip command, I tried importing "pyparsing_common" on its own to check whether it was the problem. When I ran it this morning (before trying plip):
$ python

from pyparsing import pyparsing_common as ppc, Keyword, Optional

it worked just fine. Then I ran the same plip commands as above and hit the same import error!

I understand this seems crazy, but I have no idea what to do. I have tried everything.

How I installed plip:

Method 1: using pip

I installed plip using pip, which also installed the needed dependencies openbabel, numpy, and lxml.

Then I installed PyMol and all its dependencies using homebrew as indicated here
https://pymolwiki.org/index.php/MAC_Install

Result: same error

Method 2: manual

Before I installed using pip, I tried installing manually by downloading the plip-stable zip file and following the commands specified in the README file. When I reach the step

plip -i 1vsn -yv

I get the same error, "ImportError: cannot import name pyparsing_common".
In this case I also installed openbabel and all other dependencies and added the required entries to the PATH variable.

python version: 2.7.10
OS X el capitan 10.11.3

Any help will be appreciated; I don't know how to solve this.

Thanks a lot, and great work on this amazing software (I tried the web service and it provides very useful information about the interactions, especially the 3D visualization).
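
The traceback above passes through pybel/parser/parse_bel.py and a BelParser class, which suggests that the pybel being imported may be a different package from Open Babel's Python bindings that happens to share the name. This is a guess from the paths in the report, not a confirmed diagnosis. A minimal sketch for checking where a module name resolves, written for Python 3 (on Python 2.7, `import pybel; print(pybel.__file__)` gives the same information); the helper names `locate_module` and `looks_like_bel_package` are made up for this illustration:

```python
import importlib.util
import os


def locate_module(name):
    """Return the path Python would import `name` from, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    # Packages expose their directory through submodule_search_locations.
    if spec.submodule_search_locations:
        return list(spec.submodule_search_locations)[0]
    return spec.origin


def looks_like_bel_package(location):
    """Heuristic (assumption): the traceback shows a parser/parse_bel.py file."""
    if location is None or not os.path.isdir(location):
        return False
    return os.path.isfile(os.path.join(location, "parser", "parse_bel.py"))


if __name__ == "__main__":
    where = locate_module("pybel")
    print("pybel resolves to:", where)
    print("contains parser/parse_bel.py:", looks_like_bel_package(where))
```

If the second line prints True, the name clash would be worth ruling out before looking further at pyparsing.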

ValueError: inchikey is not a recognised Open Babel format

Hi, when I run test_literature_validated.py, I get errors as shown below:

Error
Traceback (most recent call last):
  File "D:\anaconda\envs\my-python2.7\lib\unittest\case.py", line 329, in run
    testMethod()
  File "E:\plip-stable\plip\test\test_literature_validated.py", line 469, in test_1aku
    tmpmol.characterize_complex(ligand)
  File "E:\plip-stable\plip\modules\preparation.py", line 1409, in characterize_complex
    lig_obj = Ligand(self, ligand)
  File "E:\plip-stable\plip\modules\preparation.py", line 1003, in __init__
    self.inchikey = self.molecule.write(format='inchikey')
  File "D:\anaconda\envs\my-python2.7\lib\site-packages\pybel.py", line 527, in write
    format)
ValueError: inchikey is not a recognised Open Babel format

my environment info:

Not successful to run PLIP in python 3.5.2 64bit

Here is the verbose output from running python setup.py install:

`Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.

C:\Python35\py35whl\plip>python setup.py install
running install
running bdist_egg
running egg_info
creating plip.egg-info
writing requirements to plip.egg-info\requires.txt
writing dependency_links to plip.egg-info\dependency_links.txt
writing top-level names to plip.egg-info\top_level.txt
writing plip.egg-info\PKG-INFO
writing manifest file 'plip.egg-info\SOURCES.txt'
reading manifest file 'plip.egg-info\SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'plip.egg-info\SOURCES.txt'
installing library code to build\bdist.win-amd64\egg
running install_lib
running build_py
creating build
creating build\lib
creating build\lib\plip
copying plip_init_.py -> build\lib\plip
creating build\lib\plip\modules
copying plip/modules\chimeraplip.py -> build\lib\plip/modules
copying plip/modules\config.py -> build\lib\plip/modules
copying plip/modules\detection.py -> build\lib\plip/modules
copying plip/modules\mp.py -> build\lib\plip/modules
copying plip/modules\plipremote.py -> build\lib\plip/modules
copying plip/modules\plipxml.py -> build\lib\plip/modules
copying plip/modules\preparation.py -> build\lib\plip/modules
copying plip/modules\pymolplip.py -> build\lib\plip/modules
copying plip/modules\report.py -> build\lib\plip/modules
copying plip/modules\supplemental.py -> build\lib\plip/modules
copying plip/modules\visualize.py -> build\lib\plip/modules
copying plip/modules\webservices.py -> build\lib\plip/modules
copying plip/modules_init_.py -> build\lib\plip/modules
creating build\bdist.win-amd64
creating build\bdist.win-amd64\egg
creating build\bdist.win-amd64\egg\plip
creating build\bdist.win-amd64\egg\plip\modules
copying build\lib\plip\modules\chimeraplip.py -> build\bdist.win-amd64\egg\plip
modules
copying build\lib\plip\modules\config.py -> build\bdist.win-amd64\egg\plip\modul
es
copying build\lib\plip\modules\detection.py -> build\bdist.win-amd64\egg\plip\mo
dules
copying build\lib\plip\modules\mp.py -> build\bdist.win-amd64\egg\plip\modules
copying build\lib\plip\modules\plipremote.py -> build\bdist.win-amd64\egg\plip\m
odules
copying build\lib\plip\modules\plipxml.py -> build\bdist.win-amd64\egg\plip\modu
les
copying build\lib\plip\modules\preparation.py -> build\bdist.win-amd64\egg\plip
modules
copying build\lib\plip\modules\pymolplip.py -> build\bdist.win-amd64\egg\plip\mo
dules
copying build\lib\plip\modules\report.py -> build\bdist.win-amd64\egg\plip\modul
es
copying build\lib\plip\modules\supplemental.py -> build\bdist.win-amd64\egg\plip
\modules
copying build\lib\plip\modules\visualize.py -> build\bdist.win-amd64\egg\plip\mo
dules
copying build\lib\plip\modules\webservices.py -> build\bdist.win-amd64\egg\plip
modules
copying build\lib\plip\modules_init_.py -> build\bdist.win-amd64\egg\plip\mod
ules
copying build\lib\plip_init_.py -> build\bdist.win-amd64\egg\plip
byte-compiling build\bdist.win-amd64\egg\plip\modules\chimeraplip.py to chimerap
lip.cpython-35.pyc
byte-compiling build\bdist.win-amd64\egg\plip\modules\config.py to config.cpytho
n-35.pyc
byte-compiling build\bdist.win-amd64\egg\plip\modules\detection.py to detection.
cpython-35.pyc
byte-compiling build\bdist.win-amd64\egg\plip\modules\mp.py to mp.cpython-35.pyc

byte-compiling build\bdist.win-amd64\egg\plip\modules\plipremote.py to plipremot
e.cpython-35.pyc
byte-compiling build\bdist.win-amd64\egg\plip\modules\plipxml.py to plipxml.cpyt
hon-35.pyc
byte-compiling build\bdist.win-amd64\egg\plip\modules\preparation.py to preparat
ion.cpython-35.pyc
byte-compiling build\bdist.win-amd64\egg\plip\modules\pymolplip.py to pymolplip.
cpython-35.pyc
byte-compiling build\bdist.win-amd64\egg\plip\modules\report.py to report.cpytho
n-35.pyc
File "build\bdist.win-amd64\egg\plip\modules\report.py", line 100
print et.tostring(self.xmlreport, pretty_print=True)
^
SyntaxError: invalid syntax

byte-compiling build\bdist.win-amd64\egg\plip\modules\supplemental.py to supplem
ental.cpython-35.pyc
byte-compiling build\bdist.win-amd64\egg\plip\modules\visualize.py to visualize.
cpython-35.pyc
byte-compiling build\bdist.win-amd64\egg\plip\modules\webservices.py to webservi
ces.cpython-35.pyc
byte-compiling build\bdist.win-amd64\egg\plip\modules_init_.py to init.cp
ython-35.pyc
byte-compiling build\bdist.win-amd64\egg\plip_init_.py to init.cpython-35
.pyc
creating build\bdist.win-amd64\egg\EGG-INFO
installing scripts to build\bdist.win-amd64\egg\EGG-INFO\scripts
running install_scripts
running build_scripts
creating build\scripts-3.5
copying and adjusting plip\plipcmd -> build\scripts-3.5
creating build\bdist.win-amd64\egg\EGG-INFO\scripts
copying build\scripts-3.5\plipcmd -> build\bdist.win-amd64\egg\EGG-INFO\scripts
copying plip.egg-info\PKG-INFO -> build\bdist.win-amd64\egg\EGG-INFO
copying plip.egg-info\SOURCES.txt -> build\bdist.win-amd64\egg\EGG-INFO
copying plip.egg-info\dependency_links.txt -> build\bdist.win-amd64\egg\EGG-INFO

copying plip.egg-info\not-zip-safe -> build\bdist.win-amd64\egg\EGG-INFO
copying plip.egg-info\requires.txt -> build\bdist.win-amd64\egg\EGG-INFO
copying plip.egg-info\top_level.txt -> build\bdist.win-amd64\egg\EGG-INFO
creating dist
creating 'dist\plip-1.3.3-py3.5.egg' and adding 'build\bdist.win-amd64\egg' to i
t
removing 'build\bdist.win-amd64\egg' (and everything under it)
Processing plip-1.3.3-py3.5.egg
creating c:\python35\lib\site-packages\plip-1.3.3-py3.5.egg
Extracting plip-1.3.3-py3.5.egg to c:\python35\lib\site-packages
File "c:\python35\lib\site-packages\plip-1.3.3-py3.5.egg\plip\modules\report.p
y", line 100
print et.tostring(self.xmlreport, pretty_print=True)
^
SyntaxError: invalid syntax

Adding plip 1.3.3 to easy-install.pth file
Installing plipcmd script to C:\Python35\Scripts

Installed c:\python35\lib\site-packages\plip-1.3.3-py3.5.egg
Processing dependencies for plip==1.3.3
Searching for lxml==3.7.0
Best match: lxml 3.7.0
Adding lxml 3.7.0 to easy-install.pth file

Using c:\python35\lib\site-packages
Searching for numpy==1.11.3+mkl
Best match: numpy 1.11.3+mkl
Adding numpy 1.11.3+mkl to easy-install.pth file

Using c:\python35\lib\site-packages
Searching for openbabel==2.4.1
Best match: openbabel 2.4.1
Adding openbabel 2.4.1 to easy-install.pth file

Using c:\python35\lib\site-packages
Finished processing dependencies for plip==1.3.3

C:\Python35\py35whl\plip>`

Here is the result when I ran python plipcmd in the Python35 Scripts directory:

`C:\Python35\Scripts>python plipcmd
Traceback (most recent call last):
File "c:\python35\lib\site-packages\plip-1.3.3-py3.5.egg\EGG-INFO\scripts\plip
cmd", line 25, in
from plip.modules.preparation import *
File "C:\Python35\lib\site-packages\plip-1.3.3-py3.5.egg\plip\modules\preparat
ion.py", line 24, in
from detection import *
ImportError: No module named 'detection'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "plipcmd", line 4, in
import('pkg_resources').run_script('plip==1.3.3', 'plipcmd')
File "C:\Python35\lib\site-packages\pkg_resources_init_.py", line 719, in r
un_script
self.require(requires)[0].run_script(script_name, ns)
File "C:\Python35\lib\site-packages\pkg_resources_init_.py", line 1504, in
run_script
exec(code, namespace, namespace)
File "c:\python35\lib\site-packages\plip-1.3.3-py3.5.egg\EGG-INFO\scripts\plip
cmd", line 33, in
from modules.preparation import *
ImportError: No module named 'modules'

C:\Python35\Scripts>`

How can I resolve this?
I need your assistance, thanks.
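
The SyntaxError in the install log points at a Python 2 print statement in report.py, which Python 3 does not parse. The later "No module named 'detection'" looks like a separate Python 3 change, the removal of implicit relative imports. A small sketch that reproduces the first failure without needing PLIP or lxml installed; it only parses the quoted line and never runs it, and the helper name is made up for this illustration:

```python
def compiles_under_this_python(source):
    """Return True if `source` parses under the running interpreter."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


# Line 100 of report.py as quoted in the log (Python 2 print statement).
py2_line = "print et.tostring(self.xmlreport, pretty_print=True)"
# The same call written as a function call, which parses on Python 3.
py3_line = "print(et.tostring(self.xmlreport, pretty_print=True))"

print("print statement parses:", compiles_under_this_python(py2_line))
print("print function parses:", compiles_under_this_python(py3_line))
```

On a Python 3 interpreter the first check fails and the second passes, which matches the log: version 1.3.3 as installed here is Python 2 code.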

python modules PLIP is unable to import pybel & openbabel with OB 3.x

I think this is a small bug.
In the "supplemental.py" and "detection.py" scripts, the way to import pybel & openbabel is wrong.
see https://open-babel.readthedocs.io/en/latest/UseTheLibrary/migration.html#python-module
It should be like:

# External libraries
# import pybel
from openbabel import pybel
# from pybel import *
from openbabel.pybel import *
# from openbabel import *
from openbabel.openbabel import *

Best wishes!
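
If both Open Babel 2.x and 3.x need to be supported, one option is to try the 3.x import path first and fall back to the 2.x one. This is a sketch of that idea, not PLIP's actual code; the function name `import_pybel` and the layout labels are invented here:

```python
def import_pybel():
    """Return (module, layout): layout is 'ob3', 'ob2', or None if pybel is absent."""
    try:
        # Open Babel 3.x: pybel lives inside the openbabel package.
        from openbabel import pybel
        return pybel, "ob3"
    except ImportError:
        pass
    try:
        # Open Babel 2.x: pybel is a top-level module.
        import pybel
        return pybel, "ob2"
    except ImportError:
        return None, None


pybel_module, layout = import_pybel()
if pybel_module is None:
    print("Open Babel Python bindings not found")
else:
    print("Using pybel with layout:", layout)
```

Note that the 2.x fallback will accept any top-level module called pybel, so it cannot tell Open Babel's bindings apart from an unrelated package of the same name.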

request for feature: project isolated ligand or protein into pharmacophore feature space

Hello,

Currently, PLIP does the interaction calculation only for protein-ligand complexes.

Would it be possible to detect and list possible interaction sites for a ligand or a protein only?

It is OK for one atom to be listed with several possible interactions.

Regards,
Francois.

Glutamic acid as hydrogen bond donor

Hi!

As shown below, a hydrogen bond is predicted between a glutamic acid side chain and an oxygen, with the glutamic acid as the donor. At physiological pH the side chain should be deprotonated, so shouldn't it be unable to act as a hydrogen bond donor?

Set up continuous integration

Because this project is already on GitHub, I think it would be great to set up continuous integration (CI) for this repository to improve code quality and cross-check pull requests. A good example can be found in our TU Dresden INLOOP project at https://github.com/st-tu-dresden/inloop. We use Travis and Codacy to review our code before merging it into master and deploying it on our servers. Pull request st-tu-dresden/inloop#217 in the INLOOP repository shows how committed code is only let through once the code checks pass. Setting up CI is also very simple and can be done right away, so I propose CI for this repository. What do you think, @ssalentin?

Is this a bug from PLIP v2.1.0-beta?

I am trying to run 'python plipcmd.py -i 1s3v -xv' and encounter the following problem:

in extract_ligand() of preparation.py
hetatoms_res = set([(obatom.GetIdx(), obatom) for obatom in pybel.ob.OBResidueAtomIter(obresidue)
TypeError: unhashable type: 'OBAtom'
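
In Python 3, a class that defines `__eq__` without `__hash__` becomes unhashable, so its instances cannot be placed in a set, even inside a tuple. Whether that is the reason OBAtom is unhashable in this build is an assumption. The sketch below uses a stand-in class (`FakeAtom`, invented here) to reproduce the error and shows one possible workaround, keying a dict by the atom index instead of building a set:

```python
class FakeAtom:
    """Stand-in for an object that defines __eq__ but not __hash__."""

    def __init__(self, idx):
        self.idx = idx

    def GetIdx(self):
        return self.idx

    def __eq__(self, other):
        return isinstance(other, FakeAtom) and self.idx == other.idx


atoms = [FakeAtom(1), FakeAtom(2), FakeAtom(1)]

# Building a set of (index, atom) tuples hashes the atom and fails.
try:
    set((a.GetIdx(), a) for a in atoms)
    set_worked = True
except TypeError as exc:
    set_worked = False
    print("set failed:", exc)

# Workaround sketch: deduplicate by index, keeping the objects as values.
unique_by_idx = {a.GetIdx(): a for a in atoms}
print("unique indices:", sorted(unique_by_idx))
```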

Segfault with Python 3.6

The example in the README segfaults with Python 3.6

Python 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 17:14:51)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.1.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from plip.modules.preparation import PDBComplex

In [2]: my_mol = PDBComplex()

In [3]: my_mol.load_pdb('1EVE.pdb')

In [4]: print(my_mol)
Protein structure 1eve with ligands:
NAG:A:3002
NAG:A:3001
NAG:A:3004
NAG:A:3005
E20:A:2001

In [5]: my_bsid = 'E20:A:2001'

In [6]: my_mol.analyze()
[1] 29106 segmentation fault (core dumped) ipython

XMLParser: MetalComplex'es have no ligcoo and protcoo

In the XML parser, when I have a metal complex (say complex = pm.bsites['CU:_:2'].metal_complexes[0]) and I get the ligcoo or protcoo attributes, I get an empty list [] instead of a coordinates tuple.

This makes sense when I look at the XML itself: there is indeed no ligcoo/protcoo but instead metalcoo and targetcoo.

At this point I'm not sure if it's an issue with documentation (which has: "All | ligcoo" and "All | protcoo") or the XML writer in plipcmd. What is the reason to use different XML elements for metal complexes?
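
Until the naming is settled, a consumer of the XML could look for either element name. The sketch below uses only the standard library and a hand-written sample; the x/y/z child layout and the pairing of ligcoo with metalcoo and protcoo with targetcoo are assumptions made for the illustration, and `read_coo` is a made-up helper:

```python
import xml.etree.ElementTree as ET

# Hand-written sample, not real PLIP output.
SAMPLE = """
<metal_complex id="1">
  <metalcoo><x>1.0</x><y>2.0</y><z>3.0</z></metalcoo>
  <targetcoo><x>4.0</x><y>5.0</y><z>6.0</z></targetcoo>
</metal_complex>
"""


def read_coo(element, *tags):
    """Return (x, y, z) from the first of `tags` found under `element`, else None."""
    for tag in tags:
        node = element.find(tag)
        if node is not None:
            return tuple(float(node.findtext(axis)) for axis in ("x", "y", "z"))
    return None


complex_el = ET.fromstring(SAMPLE)
# Try the documented name first, then the name seen in metal complex entries.
ligcoo = read_coo(complex_el, "ligcoo", "metalcoo")
protcoo = read_coo(complex_el, "protcoo", "targetcoo")
print(ligcoo, protcoo)
```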

plipcmd not in the good directory

Hi,

After cloning the repo, the first time I tried to launch plip I got the following error:
Traceback (most recent call last):
  File "plipcmd", line 12, in <module>
    from plip.modules.preparation import *
ModuleNotFoundError: No module named 'plip'

It seems plipcmd should be in the parent directory, or the import should be modules.preparation rather than plip.modules.preparation.

plipcmd segfaults with PDBID 1u6b

plipcmd throws a segmentation fault when PDB entry 1u6b is used as an input. I haven't tried to debug the problem yet, but may look into it later.

$ plipcmd -v -i 1u6b

**********************************************
* Protein-Ligand Interaction Profiler v1.3.0 *
**********************************************


Checking status of PDB ID 1u6b ... entry is up to date.
Downloading file from PDB ... file downloaded as /tmp/1u6b.pdb


Starting analysis of 1u6b.pdb
===============================
PDB structure successfully read.
Excluded molecules as ligands: 5MU,A23,GTP
Analyzing 4 ligands...

K-MG-K-G-C-C-G-U-G-U... [RNA+ION] -- K:B:2 + MG:B:3 + K:B...
-------------------------------------------------------------
  [Warning] Could not write SMILES for this ligand.
Segmentation fault (core dumped)

PLIP version 1.3.0
Python 2.7.6
Ubuntu 14.04

Worked in version 1.3.3, but fails in version 1.3.4 for some PDB IDs

With 1.3.3, PDB ID 1bju gave me all results (options vptyx). After upgrading to 1.3.4, it reports an error:

Run PLIP
**********************************************
* Protein-Ligand Interaction Profiler v1.3.4 *
**********************************************
Starting analysis of 1bju.pdb
================================
2217 lines automatically fixed in PDB input file.
Error:  File contains no valid molecules.

A fixed PDB file was written to the output folder (attached as f1bju-locally.pdb), but running PLIP on it still gives an error. When I downloaded the fixed PDB file from https://projects.biotec.tu-dresden.de/plip-web/plip/ for the same PDB ID (attached as f1bju-web.pdb), it worked like a charm.

This is my system:
Win 7 32 bit (virtual: 1 core), Python 2.7.12, PyMOL 1.7.6.0, OpenBabel 2.4.1
Thanks

f1bju.zip

Error missing builtins module

Hi,
When I execute the command
plip -i 1vsn -yv

I get an error:
File "$MY_PLIPTOOL_PATH/plip/modules/preparation.py", line 9, in <module>
from builtins import filter
ImportError: No module named builtins
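
On Python 2 the builtins module is not part of the standard library; it is provided by the future package, which is why installing future resolves this error. For code that only needs a lazy filter, a guarded import is one alternative. This is a sketch of the pattern, not what PLIP does:

```python
try:
    # Python 3, or Python 2 with the `future` package installed.
    from builtins import filter
except ImportError:
    # Plain Python 2: itertools.ifilter is the lazy equivalent.
    from itertools import ifilter as filter

# Works the same way on either branch: filter returns a lazy iterator.
evens = list(filter(lambda n: n % 2 == 0, range(6)))
print(evens)
```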

Results as JSON

Is your feature request related to a problem? Please describe.
The standard XML result format of PLIP leads to large result files and makes it more difficult to integrate the tool into existing microservice architectures.

Describe the solution you'd like
Change the default output format of PLIP to JSON.

Describe alternatives you've considered
Offer it as an alternative output format.
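
Until a JSON output exists, a consumer can convert the XML report after the fact. The sketch below is a generic XML-to-dict conversion using only the standard library; the element names in the sample are invented for the illustration and are not taken from a real PLIP report:

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(el):
    """Convert an element tree to nested dicts; repeated child tags become lists.

    Simplifications: attributes on leaf elements are dropped, and an attribute
    that shares a name with a child tag is not handled specially.
    """
    children = list(el)
    if not children:
        return (el.text or "").strip()
    out = dict(el.attrib)
    for child in children:
        value = element_to_dict(child)
        if child.tag in out:
            if not isinstance(out[child.tag], list):
                out[child.tag] = [out[child.tag]]
            out[child.tag].append(value)
        else:
            out[child.tag] = value
    return out


# Hypothetical sample, not real PLIP output.
SAMPLE = (
    '<report><bindingsite id="1">'
    "<hbond><resnr>24</resnr></hbond>"
    "<hbond><resnr>43</resnr></hbond>"
    "</bindingsite></report>"
)

as_dict = element_to_dict(ET.fromstring(SAMPLE))
print(json.dumps(as_dict, indent=2))
```

All values stay strings in this sketch; a real converter would also need to decide which fields to parse as numbers.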

Additional context

PLIP fails to recognize water bridges

For some structures, PLIP fails to recognize water bridges with:

Python 3.6
OpenBabel 3.0.0

Examples include: 4CUM where PLIP does not identify water bridges when used with Python 3 and OpenBabel 3.

Please port to Python3

Hello,
the Debian Med team is maintaining PLIP for official Debian. The recently released Debian 10 was the last Debian release featuring Python2 since this programming language is EOL. If you are interested that we continue to maintain PLIP in official Debian (and that users of other modern distributions will have no problems to install PLIP on their systems) I'd recommend you port your code to Python3. The 2to3 tool might be of great help here.
Kind regards, Andreas.

Example of python library

This piece of code was not working for me (in 2. Run PLIP - as a python library):

print [pistack.resnr for pistack in s.pistacking] # NameError: name 's' is not defined

I am not sure whether this is what was intended, but the following works:
print [pistack.resnr for pistack in my_interactions.pistacking]
It prints [84, 129].

Unstable results and large angle differences with Python 3.6

We are porting SWISS-MODEL to Python 3 (3.6.6 compiled with GCCcore-7.3.0). I am having trouble with the tests related to PLIP. It seems that the ordering of residues in the XML output (bs_residue) is arbitrary, and changes from run to run. This also results in slightly different numerical results from run to run. This would be fine, but some changes are surprisingly large.

To demonstrate the problem, I ran the following command on 4DST:

wget https://files.rcsb.org/download/4DST.pdb
plipcmd.py -f 4DST.pdb --name 4DST_report -o . -x

with PLIP 1.4.5 (from the stable branch), installed with pip install . under different conditions. I am attaching the XML outputs below (gzipped because GitHub doesn't accept XML directly).

First with Python 2.7.11: 4DST_report_2.7.xml.gz
Then with Python 3.6.6 (run 1): 4DST_report_3.6_1.xml.gz
Then again with Python 3.6.6 (run 2): 4DST_report_3.6_2.xml.gz

Running diff -u 4DST_report_3.6_1.xml 4DST_report_3.6_2.xml highlights changes in ordering and rounding, which are fairly minor. But some changes are uncomfortably large. For instance, if you compare the angles of the pi stack between GCP:202 and PHE:28, the first decimal is off:

-          <angle>80.14</angle>
-          <offset>1.24</offset>
+          <angle>80.62</angle>
+          <offset>1.28</offset>

Is this expected?

Note that with Python 2.7, I was getting 80.21 reliably as an angle, and that this is what I get with the web version of PLIP.
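
One general mechanism by which iteration order can leak into numerical output is floating-point summation, which is not associative. Whether this accounts for the differences reported above is not established; a change of about half a degree may well have another cause, such as a different set of atoms entering the calculation. The sketch only shows that order alone can change a result, and that sorting before iterating, or using `math.fsum`, removes that particular source of variation:

```python
import math

values = [0.1, 0.2, 0.3]

# Same numbers, different order, different floating-point sum.
forward = sum(values)
backward = sum(reversed(values))
print("order changes the sum:", forward != backward)

# math.fsum tracks partial sums exactly, so the order no longer matters.
stable_forward = math.fsum(values)
stable_backward = math.fsum(reversed(values))
print("fsum is order independent here:", stable_forward == stable_backward)

# Iterating a set in sorted order gives the same sequence on every run.
residues = {"PHE:28", "VAL:29", "GCP:202"}
print(sorted(residues))
```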

Installation of PLIP

Dear developers, I am trying to install plip both from the downloaded local copy and with pip, following the instructions. However, I get this error when installing from the local setup.py:

Reading https://pypi.python.org/simple/pymol/
No local packages or download links found for pymol
error: Could not find suitable distribution for Requirement.parse('pymol')

On the other hand, after installing with pip I could not find the plip command in the command prompt.

Please help me troubleshoot this.

Best regards,
Rehan

ModuleNotFoundError: No module named 'plip.basic'

I am trying to install the latest PLIP beta on Ubuntu 19.10. I'm using an apt-get installed python 3.7.5, and a manually compiled and installed OpenBabel 3.1.1.

Then I tried to install the beta PLIP:

$ git checkout tags/v2.1.0-beta
$ python3 setup.py install --user
/usr/lib/python3/dist-packages/setuptools/dist.py:474: UserWarning: Normalizing '2.1.0-beta' to '2.1.0b0'
  normalized_version,
running install
running bdist_egg
running egg_info
writing plip.egg-info/PKG-INFO
writing dependency_links to plip.egg-info/dependency_links.txt
writing requirements to plip.egg-info/requires.txt
writing top-level names to plip.egg-info/top_level.txt
reading manifest file 'plip.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'plip.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build
creating build/lib
creating build/lib/plip
copying plip/plipcmd.py -> build/lib/plip
copying plip/__init__.py -> build/lib/plip
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/plip
copying build/lib/plip/plipcmd.py -> build/bdist.linux-x86_64/egg/plip
copying build/lib/plip/__init__.py -> build/bdist.linux-x86_64/egg/plip
byte-compiling build/bdist.linux-x86_64/egg/plip/plipcmd.py to plipcmd.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/plip/__init__.py to __init__.cpython-37.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
installing scripts to build/bdist.linux-x86_64/egg/EGG-INFO/scripts
running install_scripts
running build_scripts
creating build/scripts-3.7
copying and adjusting plip/plipcmd.py -> build/scripts-3.7
changing mode of build/scripts-3.7/plipcmd.py from 644 to 755
creating build/bdist.linux-x86_64/egg/EGG-INFO/scripts
copying build/scripts-3.7/plipcmd.py -> build/bdist.linux-x86_64/egg/EGG-INFO/scripts
changing mode of build/bdist.linux-x86_64/egg/EGG-INFO/scripts/plipcmd.py to 755
copying plip.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying plip.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying plip.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying plip.egg-info/not-zip-safe -> build/bdist.linux-x86_64/egg/EGG-INFO
copying plip.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying plip.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
creating dist
creating 'dist/plip-2.1.0b0-py3.7.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing plip-2.1.0b0-py3.7.egg
creating /home/xavier/.local/lib/python3.7/site-packages/plip-2.1.0b0-py3.7.egg
Extracting plip-2.1.0b0-py3.7.egg to /home/xavier/.local/lib/python3.7/site-packages
Adding plip 2.1.0b0 to easy-install.pth file
Installing plipcmd.py script to /home/xavier/.local/bin

Installed /home/xavier/.local/lib/python3.7/site-packages/plip-2.1.0b0-py3.7.egg
Processing dependencies for plip==2.1.0b0
Searching for pymol==2.2.0
Best match: pymol 2.2.0
Adding pymol 2.2.0 to easy-install.pth file

Using /usr/lib/python3/dist-packages
Searching for openbabel==3.1.0
Best match: openbabel 3.1.0
Adding openbabel 3.1.0 to easy-install.pth file

Using /home/xavier/.local/lib/python3.7/site-packages
Searching for numpy==1.16.2
Best match: numpy 1.16.2
Adding numpy 1.16.2 to easy-install.pth file
Installing f2py script to /home/xavier/.local/bin
Installing f2py3 script to /home/xavier/.local/bin
Installing f2py3.7 script to /home/xavier/.local/bin

Using /usr/lib/python3/dist-packages
Searching for lxml==4.5.0
Best match: lxml 4.5.0
Adding lxml 4.5.0 to easy-install.pth file

Using /home/xavier/.local/lib/python3.7/site-packages
Finished processing dependencies for plip==2.1.0b0

As indicated by the log, plipcmd.py is correctly installed in /home/xavier/.local/bin. However it doesn't work:

$ which plipcmd.py
/home/xavier/.local/bin/plipcmd.py
$ plipcmd.py 
Traceback (most recent call last):
  File "/home/xavier/.local/bin/plipcmd.py", line 4, in <module>
    __import__('pkg_resources').run_script('plip==2.1.0b0', 'plipcmd.py')
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 666, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 1462, in run_script
    exec(code, namespace, namespace)
  File "/home/xavier/.local/lib/python3.7/site-packages/plip-2.1.0b0-py3.7.egg/EGG-INFO/scripts/plipcmd.py", line 16, in <module>
    from plip.basic import config, logger
ModuleNotFoundError: No module named 'plip.basic'

It looks like there was a problem during the installation, and some python files weren't copied. The plip folder within the egg is nearly empty:

$ ls /home/xavier/.local/lib/python3.7/site-packages/plip-2.1.0b0-py3.7.egg/plip/
__init__.py  __pycache__  plipcmd.py

And there is no plip folder directly in site-packages

$ ls /home/xavier/.local/lib/python3.7/site-packages/plip
ls: cannot access '/home/xavier/.local/lib/python3.7/site-packages/plip': No such file or directory

I'm not sure what's going on and how to proceed. Please let me know if you need any additional information to help solve the issue.
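
The install log only copies plip/plipcmd.py and plip/__init__.py, which would be consistent with a build that picks up just the top-level package and none of its sub-packages. That is a guess from the log, not a confirmed cause. A small standard-library sketch for listing which packages exist under a directory, which can be run against both the source checkout and the installed egg to compare them; the helper name and the demo tree are invented for the illustration:

```python
import os
import tempfile


def list_packages(root):
    """Return dotted names of directories under `root` that contain __init__.py."""
    root = os.path.abspath(root)
    base = os.path.dirname(root)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if "__init__.py" in filenames:
            rel = os.path.relpath(dirpath, base)
            found.append(rel.replace(os.sep, "."))
        else:
            # Not a package: do not descend any further.
            dirnames[:] = []
    return sorted(found)


# Demo on a throwaway tree: `plip.basic` is a package, `data` is not.
with tempfile.TemporaryDirectory() as tmp:
    for sub in ("plip", os.path.join("plip", "basic"), os.path.join("plip", "data")):
        os.makedirs(os.path.join(tmp, sub))
    for pkg in ("plip", os.path.join("plip", "basic")):
        open(os.path.join(tmp, pkg, "__init__.py"), "w").close()
    demo_packages = list_packages(os.path.join(tmp, "plip"))

print(demo_packages)
```

If the checkout lists sub-packages that the egg does not, the packaging configuration is the place to look.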

No multicore support in Windows environment

When I do not set --maxthread to 1, plip does not work properly: the process gets stuck in a loop, with output mentioning something like "..multiprocessing/fork..".
Plip works again when I set it to 1 core.

Would you consider adding multicore support on Windows in a future update?

Thanks

PyMol in setup.py requirements

PyMol on PyPI is not Schrödinger's pymol, and as far as I know there's no package for that on PyPI. In my opinion it should be removed, because it's misleading and causes an error:

$> pip install pymol
ERROR: Could not find a version that satisfies the requirement pymol (from versions: none)
ERROR: No matching distribution found for pymol

This was initially noticed in conda-forge/plip-feedstock#1 (comment)

conda forge package

I have created a conda forge package for plip here: conda-forge/staged-recipes#11680

I can add you as a maintainer if you want, just leave a comment on the PR.

Update Web Server's Download link

It still points to https://github.com/ssalentin/plip/archive/stable.zip

Hydrogen Bond Annotation is Non-Deterministic

This might be linked with issue #49 I just opened, but this might be different so I am reporting it separately. Running with Python 3.6.6 and PLIP 1.4.5 from branch stable.

I am using 4DST again as before:

wget https://files.rcsb.org/download/4DST.pdb
plipcmd.py -f 4DST.pdb --name 4DST_report -o . -x

Here are two runs that display the problem:

Run 1: 4DST_report_3.6_1.xml.gz
Run 2: 4DST_report_3.6_4.xml.gz

I normally see 16 hydrogen bonds reported for ligand GCP:202. However if I run the same command again I can sometimes see a 17th bond popping up. In 4DST_report_3.6_4.xml you will see the new bond id="9" to VAL:29 on line 1030. It is the same as bond with id="10" but with the donor and acceptor atoms swapped.

Only one of the two is reported in XML file 4DST_report_3.6_1.xml.gz.

Is it expected to see different bonds from run to run?
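
Whether reporting both directions is intended is the open question here. For comparing runs in a test suite in the meantime, a consumer could treat two bonds between the same pair of atoms as one, regardless of which atom is listed as donor. A sketch with invented field names (`donor_idx`, `acceptor_idx`, `resnr`, `restype`) and made-up atom indices:

```python
from collections import namedtuple

# Hypothetical record layout, not PLIP's own data structure.
HBond = namedtuple("HBond", "donor_idx acceptor_idx resnr restype")


def drop_swapped_duplicates(bonds):
    """Keep the first bond seen for each unordered (donor, acceptor) atom pair."""
    seen = set()
    unique = []
    for bond in bonds:
        key = frozenset((bond.donor_idx, bond.acceptor_idx))
        if key not in seen:
            seen.add(key)
            unique.append(bond)
    return unique


bonds = [
    HBond(10, 20, 29, "VAL"),
    HBond(20, 10, 29, "VAL"),  # same atom pair with donor and acceptor swapped
    HBond(11, 30, 28, "PHE"),
]
unique_bonds = drop_swapped_duplicates(bonds)
print(len(bonds), "->", len(unique_bonds))
```

This hides the nondeterminism rather than explaining it, so it is only suitable for making comparisons stable.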

Create ARM compatible docker images

The current images aren't ARM compatible and give an exec error when you try to run them.

Different results between web and command line

Describe the bug

Relevant files attached: 3sgj_smr.tar.gz

I'm trying to obtain the PLIP interactions for file 3sgj_smr.pdb (derived from PDB 3SGJ, see attachment above). When I run the file on the command line, I get very different results than when I run it on the web server.

I submitted a request to the web interface by uploading this file: 68269632-0eee-493b-993d-b2a5362840e0. In the SMALLMOLECULE section, under NAG, I can see several hydrogen bonds between NAG-_-1 and several residues in chain A: LYS.24, ASP.43, ASN.75, ARG.79.

I downloaded the XML report from this web report (3sgj_smr_plipweb_report.xml) and I can clearly see those interactions in there.

When I run PLIP on the command line with

plip -f 3sgj_smr.pdb -x

I can only see NAG-_-1 making hydrogen bonds with LYS.24. I can see no interactions with ASP.43, ASN.75 or ARG.79. See the file 3sgj_smr_plipcmd_report.xml in attachment.

This XML report was generated with PLIP 2.1.3, Python 3.6.6 and OpenBabel 3.1.1. However I get similar results with our older build PLIP 1.4.2/Python 2.7.11/OpenBabel 2.4.1 (all on Centos7 systems), as well as the current code on the master branch (which I guess is equivalent to v.2.1.4).

To Reproduce
Web:

Go to https://projects.biotec.tu-dresden.de/plip-web/plip/index
Select PDB file and Browse for file 3sgj_smr.pdb
Run analysis
Wait a bit
Open SMALLMOLECULE > NAG, > NAG-_-1 and look at Hydrogen bonds table.
Download XML with "... in XML format" link (giving 3sgj_smr_plipweb_report.xml)

Command line:

Run command plip -f 3sgj_smr.pdb -x
Look at report.xml file (equivalent to 3sgj_smr_plipcmd_report.xml).

Expected behavior
I was expecting to see the same interactions reported from the command line run and on the web.

What could explain these differences?

TypeError: '<' not supported between instances of 'Atom' and 'Atom' with Python 3

Running the current 'stable' branch from GitHub, Python 3.6.6 and OpenBabel 2.4.1, it seems that PLIP fails on water bridges:

$ python plip/plipcmd.py -i 3GKR
Traceback (most recent call last):
  File "plip/plipcmd.py", line 311, in <module>
    main_init()
  File "plip/plipcmd.py", line 307, in main_init
    main(expanded_path, arguments.pdbid)  # Start main script
  File "plip/plipcmd.py", line 171, in main
    process_pdb(pdbpath, config.OUTPATH, outputprefix=outputprefix)
  File "plip/plipcmd.py", line 66, in process_pdb
    mol.characterize_complex(ligand)
  File "/scicore/home/schwede/zohixe92/projects/plip/plip/modules/preparation.py", line 1439, in characterize_complex
    pli_obj = PLInteraction(lig_obj, bs_obj, self)
  File "/scicore/home/schwede/zohixe92/projects/plip/plip/modules/preparation.py", line 628, in __init__
    self.water_bridges = self.refine_water_bridges(self.water_bridges, self.hbonds_ldon, self.hbonds_pdon)
  File "/scicore/home/schwede/zohixe92/projects/plip/plip/modules/preparation.py", line 865, in refine_water_bridges
    wb_dict2[water] = sorted(wb_dict2[water])
TypeError: '<' not supported between instances of 'Atom' and 'Atom'

This apparently affects many PDB entries; a quick count on our side indicates around 30% of new entries. To cite a few: 3EMS, 4CUM, 5DDR, 1D7P, 6R79.
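
Python 2 would order arbitrary objects by a fallback rule, while Python 3 raises a TypeError unless the objects define ordering. A common way around this is to pass an explicit key to `sorted`. The sketch uses a stand-in `Atom` class with an `idx` attribute; what exactly is stored in wb_dict2 and which attribute would be the right sort key are not known from the traceback alone:

```python
class Atom:
    """Stand-in with no ordering methods, like the objects in the traceback."""

    def __init__(self, idx):
        self.idx = idx


atoms = [Atom(3), Atom(1), Atom(2)]

# Without a key, Python 3 has no way to compare two Atom instances.
try:
    sorted(atoms)
    plain_sort_ok = True
except TypeError as exc:
    plain_sort_ok = False
    print("plain sort failed:", exc)

# An explicit key makes the ordering well defined and reproducible.
by_idx = sorted(atoms, key=lambda atom: atom.idx)
print([atom.idx for atom in by_idx])
```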

Usermanual

Is your feature request related to a problem? Please describe.
At least for me, it is very hard to process a large number of molecules using PLIP as part of a Python script. I have tried everything to save PNG visualizations with PyMOL for various complexes, but got nothing.

Describe the solution you'd like
Add python module manual

Describe alternatives you've considered
Add a quick tutorial
