acellera / moleculekit

MoleculeKit: Your favorite molecule manipulation kit

License: Other

Shell 0.25% Python 68.10% Rich Text Format 0.08% Jupyter Notebook 0.38% Makefile 0.01% Cython 8.35% HTML 0.63% C 11.20% C++ 10.99% Dockerfile 0.01%

molecule machine-learning proteins drug-discovery molecular-modeling molecular-simulation

moleculekit's Introduction

MoleculeKit

A molecule manipulation library

Getting started

We recommend installing Miniconda on your machine to better manage python packages and environments.

You can install moleculekit either in the "base" conda environment or in a new conda environment. We recommend the second.

Install it into the base conda environment

With conda

Installation Instructions

With pip

The pip version of moleculekit is VERY limited and not officially supported. Use at your own risk.

(base) user@computer:~$ pip install moleculekit

Optional dependencies of moleculekit

Moleculekit has a small number of optional dependencies which are needed for some of its functionalities. They were not added to the default dependencies to keep moleculekit a fast and small installation and to avoid unnecessary conflicts with other software. However, if you want to leverage all of its functionality you can install the rest of the dependencies with the following commands:

(moleculekit) user@computer:~$ wget https://raw.githubusercontent.com/Acellera/moleculekit/master/extra_requirements.txt
(moleculekit) user@computer:~$ conda install --file extra_requirements.txt -c acellera

Using moleculekit in ipython

Install ipython in the correct conda environment using the following command. If you have installed the extra dependencies as above, you can skip this step since ipython is already installed with them.

(moleculekit) user@computer:~$ conda install ipython

Now you can start an ipython console with

(moleculekit) user@computer:~$ ipython

In the ipython console you can now import any of the modules of moleculekit and use it as normal.

from moleculekit.molecule import Molecule

mol = Molecule('3ptb')
mol.view()

API

For the official documentation of the moleculekit API head over to https://software.acellera.com/moleculekit/index.html

Issues

For any bugs or questions on usage feel free to use the issue tracker of this github repo.

Dev

If you are using moleculekit without installing it by using the PYTHONPATH env var you will need to compile the C++ extensions in-place with the following command:

python setup.py build_ext --inplace

Building for WebAssembly

Install emscripten https://emscripten.org/docs/getting_started/downloads.html

mamba create -n pyodide-build
mamba activate pyodide-build
mamba install python=3.11
pip install pyodide-build==0.25.1

# Activate the emscripten environment
cd ../emsdk
./emsdk install 3.1.46
./emsdk activate 3.1.46
source emsdk_env.sh
cd -

# Build the package
export PYO3_CROSS_INCLUDE_DIR="HACK"
export PYO3_CROSS_LIB_DIR="HACK"
rm -rf .pyodide-xbuildenv
pyodide build -o dist_pyodide
cp dist_pyodide/*.whl test_wasm/wheels/
cd test_wasm
python3 -m http.server

If you get an error at building about numpy missing, check this issue pyodide/pyodide#4347

Citing MoleculeKit

If you use this software in your publication please cite:

Stefan Doerr, Matthew J. Harvey, Frank Noé, and Gianni De Fabritiis. HTMD: High-throughput molecular dynamics for molecular discovery. Journal of Chemical Theory and Computation, 2016, 12 (4), pp 1845–1852. doi:10.1021/acs.jctc.6b00049

moleculekit's Issues

tmalign.dll can't be found

When I run code that uses moleculekit on Windows, an error is raised: couldn't find tmalign.dll.
Where can I find this DLL file?
Thanks.

RuntimeError: Element So not found in the periodictable.

HTMD reads this structure.pdb.txt, but MoleculeKit fails.

In [1]: import htmd                                                                                                                                                      

In [2]: htmd.__version__                                                                                                                                                 
Out[2]: '1.13.10'

In [3]: from htmd.molecule.molecule import Molecule                                                                                                                      

In [4]: Molecule('structure.pdb')                                                                                                                                        
Out[4]: 
<htmd.molecule.molecule.Molecule object at 0x7fc51d9a8240>
Molecule with 32404 atoms and 1 frames
Atom field - altloc shape: (32404,)
Atom field - atomtype shape: (32404,)
Atom field - beta shape: (32404,)
Atom field - chain shape: (32404,)
Atom field - charge shape: (32404,)
Atom field - coords shape: (32404, 3, 1)
Atom field - element shape: (32404,)
Atom field - insertion shape: (32404,)
Atom field - masses shape: (32404,)
Atom field - name shape: (32404,)
Atom field - occupancy shape: (32404,)
Atom field - record shape: (32404,)
Atom field - resid shape: (32404,)
Atom field - resname shape: (32404,)
Atom field - segid shape: (32404,)
Atom field - serial shape: (32404,)
angles shape: (0, 3)
bonds shape: (64854, 2)
bondtype shape: (64854,)
box shape: (3, 1)
boxangles shape: (3, 1)
crystalinfo: {}
dihedrals shape: (0, 4)
fileloc shape: (1, 2)
impropers shape: (0, 4)
reps: 
ssbonds shape: (0,)
step shape: (1,)
time shape: (1,)
topoloc: /home/user/acemd3.git/tests/functional/models/dhfr/equil_charmm/structure.pdb
viewname: structure.pdb

In [5]: import moleculekit                                                                                                                                               

In [6]: moleculekit.__version__                                                                                                                                          
Out[6]: '0.1.4'

In [7]: from moleculekit.molecule import Molecule                                                                                                                        

In [8]: Molecule('structure.pdb')                                                                                                                                        
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-8-f80fca32ed08> in <module>
----> 1 Molecule('structure.pdb')

~/conda/lib/python3.6/site-packages/moleculekit/molecule.py in __init__(self, filename, name, **kwargs)
    239 
    240         if filename is not None:
--> 241             self.read(filename, **kwargs)
    242 
    243     @staticmethod

~/conda/lib/python3.6/site-packages/moleculekit/molecule.py in read(self, filename, type, skip, frames, append, overwrite, keepaltloc, guess, guessNE, _logger, **kwargs)
    951             for rr in readers:
    952                 try:
--> 953                     mol = rr(fname, frame=frame, topoloc=tmppdb, **kwargs)
    954                 except FormatError:
    955                     continue

~/conda/lib/python3.6/site-packages/moleculekit/readers.py in PDBread(filename, mode, frame, topoloc)
    895     topo.crystalinfo = crystalinfo
    896     traj = Trajectory(coords=coords)
--> 897     return MolFactory.construct(topo, traj, filename, frame)
    898 
    899 

~/conda/lib/python3.6/site-packages/moleculekit/readers.py in construct(topos, trajs, filename, frame)
    142             if topo is not None:
    143                 mol._emptyTopo(natoms)
--> 144                 MolFactory._parseTopology(mol, topo, filename)
    145             if traj is not None:
    146                 mol._emptyTraj(natoms)

~/conda/lib/python3.6/site-packages/moleculekit/readers.py in _parseTopology(mol, topo, filename)
    223             el = mol.element[i].lower().capitalize()
    224             if el not in periodictable:
--> 225                 raise RuntimeError('Element {} not found in the periodictable.'.format(el))
    226             mol.element[i] = el
    227 

RuntimeError: Element So not found in the periodictable.

The PDB was built with HTMD as a part of the ACEMD3 tests by Alberto.
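
The traceback shows the reader normalizing each element string with `.lower().capitalize()` and then rejecting anything that is absent from its periodic table. A minimal sketch of that check is given below. It assumes a small stand-in set of symbols (`KNOWN_ELEMENTS`) and a hypothetical helper name (`normalize_element`), neither of which comes from moleculekit, and it only illustrates why an element field containing `SO` is rejected:

```python
# Hypothetical stand-in for the library's periodic table; only a few symbols.
KNOWN_ELEMENTS = {"H", "C", "N", "O", "S", "Na", "Cl"}


def normalize_element(raw):
    """Normalize an element string as in the traceback, then validate it."""
    el = raw.lower().capitalize()
    if el not in KNOWN_ELEMENTS:
        raise RuntimeError("Element {} not found in the periodictable.".format(el))
    return el


print(normalize_element("CL"))  # upper-case input is normalized to "Cl"
try:
    normalize_element("SO")     # becomes "So", which is not an element symbol
except RuntimeError as err:
    print(err)
```

Under these assumptions, the failure points at the element column of the PDB (or at whatever guessed it), not at the normalization step itself.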

mol.append with collisions issue

from moleculekit.molecule import Molecule
mol = Molecule()
mol1 = Molecule('3ptb')
# mol.append(mol1) # works
mol.append(mol1, collisions=True) # crashes

mol.bonds error after prepareProteinForAtomtyping()

I encountered another issue with PDB ID 5DOW. Below is the minimal code:

from moleculekit.molecule import Molecule
from moleculekit.tools.atomtyper import prepareProteinForAtomtyping
from moleculekit.tools.voxeldescriptors import getVoxelDescriptors


mol = Molecule("pdb5dow.ent.gz")
mol.filter("protein or ions and (not resname CL) and (not resname NA)")
prepareProteinForAtomtyping(mol)
getVoxelDescriptors(mol)

It resulted in the error below at getChannels() -> getPDBQTAtomTypesAndCharges() -> atomtypingValidityChecks().

ValueError: The protein has less bonds than (number of atoms - 1). This seems incorrect. You can assign bonds with mol.bonds = mol._getBonds()

The version of moleculekit is 0.1.30. Is there any suggestion to avoid it?
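
The error message rests on a counting argument: a connected structure of N atoms needs at least N - 1 bonds. The sketch below shows only that argument. `has_plausible_bond_count` is a hypothetical helper, and a plain list of index pairs stands in for `mol.bonds`:

```python
def has_plausible_bond_count(n_atoms, bonds):
    """Return True if there are at least n_atoms - 1 bonds.

    A connected graph with n_atoms nodes needs at least n_atoms - 1 edges,
    so fewer bonds means some atoms cannot be connected to the rest.
    """
    return len(bonds) >= n_atoms - 1


# Four atoms in a chain: 0-1-2-3 (three bonds, the minimum for one fragment).
chain = [(0, 1), (1, 2), (2, 3)]
print(has_plausible_bond_count(4, chain))      # True
print(has_plausible_bond_count(4, chain[:2]))  # False, atom 3 is unbonded
```

Passing this count does not prove the bonds are correct; it only catches the obviously incomplete case.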

problem with prepareProtein

Hello, fantastic work with moleculekit and HTMD, just fantastic.

I have a problem getting prepareProtein or prepareProteinForAtomtyping to run (I'm trying to do the voxelization tutorial) and I've reached the limit of my ability to debug the problem. Using the conda install of moleculekit 0.3.2 on python 3.6, in a moleculekit conda environment with the extra requirements (pandas 0.22.0, numpy 1.18.5), prepareProtein errors with
ValueError: The 'dtype' option is not supported with the 'python-fwf' engine

On my base conda environment (python 3.7.6, pandas 1.0.3, numpy 1.18.1), prepareProtein seems to throw an exception in the preparationdata.py module, line 123 (def _findRes), at pos = int(np.argwhere(mask)) with

Exception: Data must be 1-dimensional

python3.6.txt
python3.7.txt

Any insights you have to solving this problem would be greatly appreciated.

        Drew

SmallMol object missing getCoords() function

Was there ever a getCoords() function in the SmallMol object?

A module released with this publication (https://github.com/compsciencelab/ligdream), about 7 months old, tries to call this function and I get an error.

Thanks

VMD visualization does not pop up

Hello,

I have followed the voxelization tutorial and the VMD visualization doesn't pop up. In particular, nothing happens after executing the following lines:

from moleculekit.molecule import Molecule
from moleculekit.tools.voxeldescriptors import getVoxelDescriptors, viewVoxelFeatures
from moleculekit.tools.atomtyper import prepareProteinForAtomtyping
from moleculekit.smallmol.smallmol import SmallMol
from moleculekit.home import home
import os

slig = SmallMol('benzamidine.mol2')   # I have placed benzamidine.mol2 in the same folder
lig_vox, lig_centers, lig_N = getVoxelDescriptors(slig, voxelsize=0.5, buffer=1)
slig.view(guessBonds=False)
viewVoxelFeatures(lig_vox, lig_centers, lig_N)

I have tried both in jupyter notebook and in iPython. I have installed moleculekit through conda (conda install moleculekit -c acellera) and the version of VMD installed in my system is VMD for LINUXAMD64, version 1.9.3 (December 1, 2016). VMD is precompiled. My OS is Ubuntu 18.04.

I have searched the documentation but I haven't found any info on how to make VMD work. Do you think you could give me some guidance? Thanks.

Change behaviour of selection in MetricSASA

Currently it filters out all other atoms. The more intuitive behaviour would be to calculate the SASA including all atoms and return only the values for the selected atoms. This will need a filtersel option for filtering if desired.
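
The difference between the two behaviours can be illustrated with a toy exposure measure. This is not a real SASA calculation: each atom's exposure simply drops as more atoms sit within a cutoff. All names here (`toy_exposure`, `exposure_for_selection`, `filter_first`) are hypothetical and only serve the illustration:

```python
def toy_exposure(coords, cutoff=1.5):
    """Toy stand-in for SASA: 1 / (1 + number of neighbours within cutoff)."""
    values = []
    for i, xi in enumerate(coords):
        neighbours = sum(
            1 for j, xj in enumerate(coords) if j != i and abs(xi - xj) <= cutoff
        )
        values.append(1.0 / (1 + neighbours))
    return values


def exposure_for_selection(coords, selected, filter_first):
    """Compute exposure for the selected atom indices.

    filter_first=True mimics the current behaviour (other atoms are removed
    before the calculation); False computes on all atoms and then subsets.
    """
    if filter_first:
        return toy_exposure([coords[i] for i in selected])
    full = toy_exposure(coords)
    return [full[i] for i in selected]


coords = [0.0, 1.0, 2.0, 3.0]  # four atoms on a line, one unit apart
selected = [1]                 # an atom with a neighbour on each side

print(exposure_for_selection(coords, selected, filter_first=True))   # fully exposed
print(exposure_for_selection(coords, selected, filter_first=False))  # partly buried
```

Filtering first reports the selected atom as fully exposed because its neighbours no longer exist, which is the unintuitive result described above.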

SIGSEGV-Error

Hi Devs,

I have found a new problem. Maybe this is worth a new ticket.

I have observed that calling getVoxelDescriptors sometimes leads to a SIGSEGV error. From what I have read about this error, it is often a software-side problem. Have you seen this before?

Voxelization of PDB 1E2X ends with a strange error message

I faced a RuntimeError when I tried to voxelize 1E2X.

# I tried to voxelize based on https://software.acellera.com/docs/latest/moleculekit/tutorials/voxelization_tutorial.html

from moleculekit.molecule import Molecule
from moleculekit.tools.voxeldescriptors import getVoxelDescriptors, viewVoxelFeatures
from moleculekit.tools.atomtyper import prepareProteinForAtomtyping
from moleculekit.smallmol.smallmol import SmallMol
from moleculekit.home import home
import os

prot = Molecule("pdb1e2x.ent.gz")
prot.filter("protein or ions") # remove H2O and SO4
prot = prepareProteinForAtomtyping(prot)

prot_vox, prot_centers, prot_N = getVoxelDescriptors(prot, buffer=1)
# -> it causes a RuntimeError
# RuntimeError: Found atoms with resnames ['GLN'] in the Molecule 
# which can cause issues with the voxelization. 
# Please make sure to only pass protein atoms and metals.

Issue with mol.time mol.step

In [33]: mol.step                                                                                                                                                     
Out[33]: 
array([ 85899345930, 171798691870, 257698037810, 343597383750,
       429496729690, 515396075630, 601295421570, 687194767510,
       773094113450, 858993459390,            0,            0,
                  0,            0,            0,            0,
                  0,            0,            0,            0])

In [34]: mol.time                                                                                                                                                     
Out[34]: 
array([ 0.04      ,  0.08      ,  0.12      ,  0.16      ,  0.2       ,
        0.23999999,  0.28      ,  0.31999999,  0.36000001,  0.40000001,
        0.44      ,  0.47999999,  0.51999998,  0.56      ,  0.60000002,
        0.63999999,  0.68000001,  0.72000003,  0.75999999,  0.80000001], dtype=float32)

Should mol.time be rounded somehow?
mol.step seems wrong. The XTC reader needs to be fixed.
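
The printed `mol.step` values are consistent with pairs of 32-bit integers being reinterpreted as single 64-bit integers: 85899345930 equals 10 + 20 * 2**32. The sketch below reproduces the numbers under that assumption, with the true steps taken to be 10, 20, ..., 200 stored as little-endian int32. It is a guess at the cause that happens to match the output, not a confirmed diagnosis of the XTC reader:

```python
import struct

# Assumed true step values: 20 frames, one every 10 steps.
steps = list(range(10, 201, 10))

# Store them as 20 little-endian 32-bit integers (80 bytes) ...
raw = struct.pack("<20i", *steps)

# ... then misread the same bytes as 10 little-endian 64-bit integers.
misread = struct.unpack("<10q", raw)

print(misread[0])   # 85899345930  == 10 + 20 * 2**32
print(misread[-1])  # 858993459390 == 190 + 200 * 2**32
```

If this is what happens, the trailing zeros in the printed array would be the unfilled second half of a 20-element 64-bit buffer.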

Problem with vmd parser

(My apologies ... I posted this in the htmd section.)

I want to test htmd on the moleculekit tutorial files. When I type mol.filter('protein'), the program complains and returns an error that the selection ('protein') is not valid. When I change it to mol.filter('all') it works. I also get the same error from the prepareProtein module, which reports that "resname DUM" is not a valid selection.

Maybe libvmdlib requires an update?

Thank you,
Alireza

Metals in voxel generation

Hi Stefan,

I was trying to generate voxels for structures that contain metals, but I'm not able to get past the prepareProteinForAtomtyping or the getChannels functions.
It fails at this check:

moleculekit/moleculekit/tools/atomtyper.py

Line 109 in 4be4e4f

if np.any(~protsel):

with

RuntimeError: Found non-protein atoms with resnames ['HG' 'ZN'] in the Molecule. Please make sure to only pass protein atoms.

Is there another way to make it work? Otherwise the metal channel will always be empty.
An example PDB that contains zinc would be 1atl

Updated SASA behavior

Hi,
I'm trying the new SASA behaviour with

sasa_met = MetricSasa(sel="segid P1 and resid 64", mode="residue")

which I used before, and I get the error

    134         else:
    135             logger.warning('Cannot calculate description of dimensions due to different topology files for each trajectory.')
--> 136         mapping = self.getMapping(uqMol)
    137 
    138         logger.debug('Metric: Starting projection of trajectories.')

/shared/pablo/github/htmd/htmd/projections/metric.py in getMapping(self, mol)
     99         for proj in self.projectionlist:
    100             if isinstance(proj, Projection):
--> 101                 pandamap = pandamap.append(proj.getMapping(mol), ignore_index=True)
    102         return pandamap
    103 

/shared/pablo/github/moleculekit/moleculekit/projections/metricsasa.py in getMapping(self, mol)
    139         elif self._mode == 'residue':
    140             _, firstidx = np.unique(atom_mapping, return_index=True)
--> 141             atomidx = np.where(atomsel)[0][firstidx]
    142         else:
    143             raise ValueError('mode must be one of "residue", "atom". "{}" supplied'.format(self._mode))

IndexError: index 25 is out of bounds for axis 0 with size 21

AtomTyper bug

While processing 1a42_1 from scPDB, I got the following error.
File ".../site-packages/moleculekit/tools/atomtyper.py", line 56, in getPDBQTAtomType
bond = np.where(mol.bonds == aidx)[0][0]
IndexError: index 0 is out of bounds for axis 0 with size 0
I checked the case and one HG atom is present in the protein.
The getPDBQTAtomType code does not handle the HG atom case properly.
There are more than 50 proteins containing an HG atom in scPDB.
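
The failing line takes element `[0][0]` of a lookup that comes back empty when an atom takes part in no bond. A plain-Python sketch of a lookup that tolerates unbonded atoms follows. `first_bond_index` is a hypothetical helper, a list of index pairs stands in for `mol.bonds`, and what the atom typer should do with a `None` result is left open:

```python
def first_bond_index(bonds, atom_idx):
    """Return the index of the first bond involving atom_idx, or None."""
    for i, (a, b) in enumerate(bonds):
        if atom_idx in (a, b):
            return i
    return None


bonds = [(0, 1), (1, 2)]           # atom 3 takes part in no bond
print(first_bond_index(bonds, 1))  # 0
print(first_bond_index(bonds, 3))  # None
```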

Chloride converted to carbon

I have encountered this issue in the moleculekit package.

Writing a pdb
In a pdb file, when there is a CL atom the name field is expected to look like the following:

ATOM      7  C7  ZYI A   1      14.452   7.328  -0.008  1.00 30.28      2    C
ATOM      8 CL8  ZYI A   1      15.987   8.160  -0.102  1.00 36.23      2   CL
ATOM      9  C9  ZYI A   1      14.023   6.546  -1.093  1.00 29.82      2    C

But the Molecule object aligns the name field incorrectly, like the following:

ATOM      7  C7  ZYI A   1      14.452   7.328  -0.008  1.00 30.28      2    C
ATOM      8  CL8 ZYI A   1      15.987   8.160  -0.102  1.00 36.23      2   CL
ATOM      9  C9  ZYI A   1      14.023   6.546  -1.093  1.00 29.82      2    C

This causes problems for the third-party tools that we are using.

Using proteinPrepare
proteinPrepare converts the CL into C. I think the reason is the same as above, but in this case, even if I add a space for the correct alignment, I suspect the string is stripped at some point.
I say that because:

'CL8' --> converted to C
'CL8 ' --> converted to C
'CL81' --> not converted to C
[3pdcHTMD.txt](https://github.com/Acellera/moleculekit/files/4147728/3pdcHTMD.txt)
[3pdcORIGINAL.txt](https://github.com/Acellera/moleculekit/files/4147729/3pdcORIGINAL.txt)
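
By convention, the atom name occupies columns 13-16 of a PDB record. Names whose element symbol has one letter start in column 14, and names of two-letter elements start in column 13, which matches the expected records shown above. A minimal sketch of that alignment rule, using a hypothetical helper `format_pdb_atom_name` that is not moleculekit's writer:

```python
def format_pdb_atom_name(name, element):
    """Return the 4-character atom-name field (PDB columns 13-16)."""
    name = name.strip()
    if len(name) > 4:
        raise ValueError("Atom name {!r} does not fit in 4 columns".format(name))
    if len(name) == 4 or len(element.strip()) == 2:
        return name.ljust(4)      # name starts in column 13
    return (" " + name).ljust(4)  # name starts in column 14


for name, element in [("C7", "C"), ("CL8", "CL"), ("C9", "C")]:
    print(repr(format_pdb_atom_name(name, element)))
```

The rule depends on a reliable element field. If the element is instead guessed from an already stripped name, `CL8` and a carbon named `CL8` cannot be told apart, which would be in line with the conversions listed above.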

Error in Molecule and Smallmol while working with PDBBind dataset

I was trying to voxelize the PDBBind complexes with moleculekit but for most of the complexes, Moleculekit is throwing errors while building the Molecule or SmallMol object. Here is my code:

from moleculekit.molecule import Molecule
from moleculekit.tools.voxeldescriptors import getVoxelDescriptors, viewVoxelFeatures
from moleculekit.tools.atomtyper import prepareProteinForAtomtyping
from moleculekit.smallmol.smallmol import SmallMol
from moleculekit.home import home
import os
from glob import glob

data_dir = "../../dataset/refined-set-2016"
protein_files = sorted(glob('../../dataset/refined-set-2016/*/*protein.pdb'))
ligand_files = sorted(glob('../../dataset/refined-set-2016/*/*ligand.mol2'))

for p, l in zip(protein_files, ligand_files):
    try:
        prot = Molecule(p)
        ligand = SmallMol(l)
    except Exception as e:
        print(str(e))

Here is the error

Failed to read file ../../dataset/refined-set-2016/2cli/2cli_ligand.mol2. Try by setting the force_reading option as True.

Is there any workaround? I want to voxelize all the complexes after building the Molecule and SmallMol object
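
The loop above pairs proteins and ligands by position in two separately sorted lists, which silently misaligns if one file is missing. The sketch below pairs files by the name of their complex directory and records failures per complex instead of only printing them. `pair_by_complex` and `load_all` are hypothetical helpers, the example paths are made up, and the loaders are passed in as callables so that the sketch runs without moleculekit:

```python
from pathlib import PurePosixPath


def pair_by_complex(protein_files, ligand_files):
    """Pair protein and ligand paths that share the same parent directory name."""
    ligands = {PurePosixPath(l).parent.name: l for l in ligand_files}
    pairs = []
    for p in protein_files:
        complex_id = PurePosixPath(p).parent.name
        if complex_id in ligands:
            pairs.append((complex_id, p, ligands[complex_id]))
    return pairs


def load_all(pairs, load_protein, load_ligand):
    """Load every pair, collecting error messages per complex id."""
    loaded, failed = {}, {}
    for complex_id, p, l in pairs:
        try:
            loaded[complex_id] = (load_protein(p), load_ligand(l))
        except Exception as err:  # keep going, but remember what failed
            failed[complex_id] = str(err)
    return loaded, failed


proteins = ["set/1a1e/1a1e_protein.pdb", "set/2cli/2cli_protein.pdb"]
ligands = ["set/2cli/2cli_ligand.mol2"]  # the 1a1e ligand is missing
print(pair_by_complex(proteins, ligands))
```

With moleculekit installed, `load_protein` could be `Molecule` and `load_ligand` could be `SmallMol`, as in the loop above. The `failed` dictionary then lists which complexes need a closer look.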

can't import SmallMol

In [2]: from moleculekit.smallmol.smallmol import SmallMol
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-26fcfa8998d6> in <module>()
----> 1 from moleculekit.smallmol.smallmol import SmallMol

/shared/jose/moleculekit/moleculekit/smallmol/smallmol.py in <module>()
     29 
     30 _hybridizations_IdxToType = HybridizationType.values
---> 31 _hybridizations_StringToType = {'S': HybridizationType.S,
     32                                 'SP': HybridizationType.SP,
     33                                 'SP2': HybridizationType.SP2,

AttributeError: type object 'HybridizationType' has no attribute 'S'

SmallMol and SmallMolLib reading issues

Taking an example from the PDBBind refined set, SmallMol and SmallMolLib have issues. With Molecule:

from moleculekit.molecule import Molecule
m = Molecule('1a1e_ligand.mol2')

works. With SmallMol:

from moleculekit.smallmol.smallmol import SmallMol
sm = SmallMol('1a1e_ligand.mol2', removeHs=True, fixHs=True, force_reading=True)

works with:

[15:33:04] 1a1e_ligand: warning - O.co2 with non C.2 or S.o2 neighbor.
2019-06-04 15:33:04,717 - moleculekit.smallmol.smallmol - WARNING - Reading 1a1e_ligand.mol2 with force_reading procedure

but produces a molecule with a hydrogen completely out of place.

force_reading is necessary for this to work. Combinations of removeHs and fixHs produce the same output.

With SmallMolLib and using the sdf instead:

from moleculekit.smallmol.smallmollib import SmallMolLib
sml = SmallMolLib('1a1e_ligand.sdf')

fails with:

  0%|                                                     | 0/1 [00:00<?, ?it/s][15:38:20] Explicit valence for atom # 25 C, 6, is greater than permitted
[15:38:20] ERROR: Could not sanitize molecule ending on line 152
[15:38:20] ERROR: Explicit valence for atom # 25 C, 6, is greater than permitted

---------------------------------------------------------------------------
ArgumentError                             Traceback (most recent call last)
~/maindisk/SANDBOX/acemanager/latest_2019-05-02_11h59/lib/python3.6/site-packages/moleculekit/smallmol/smallmollib.py in sdfReader(file, removeHs, fixHs, isgzip)
     56         try:
---> 57             mols.append(SmallMol(mol, removeHs=removeHs, fixHs=fixHs))
     58         except:

~/maindisk/SANDBOX/acemanager/latest_2019-05-02_11h59/lib/python3.6/site-packages/moleculekit/smallmol/smallmol.py in __init__(self, mol, ignore_errors, force_reading, fixHs, removeHs)
    104         if fixHs:
--> 105             _mol = Chem.AddHs(_mol, addCoords=True)
    106 

ArgumentError: Python argument types in
    rdkit.Chem.rdmolops.AddHs(NoneType)
did not match C++ signature:
    AddHs(RDKit::ROMol mol, bool explicitOnly=False, bool addCoords=False, boost::python::api::object onlyOnAtoms=None)

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
<ipython-input-18-b44ad4f6fbb3> in <module>
----> 1 sml = SmallMolLib('1a1e_ligand.sdf')

~/maindisk/SANDBOX/acemanager/latest_2019-05-02_11h59/lib/python3.6/site-packages/moleculekit/smallmol/smallmollib.py in __init__(self, libfile, removeHs, fixHs)
    100     def __init__(self, libfile=None, removeHs=False, fixHs=True):  # , n_jobs=1
    101         if libfile is not None:
--> 102             self._mols = self._loadLibrary(libfile, removeHs=removeHs, fixHs=fixHs, ext=None)
    103 
    104 

~/maindisk/SANDBOX/acemanager/latest_2019-05-02_11h59/lib/python3.6/site-packages/moleculekit/smallmol/smallmollib.py in _loadLibrary(self, libfile, removeHs, fixHs, ext)
    112 
    113         if ext == '.sdf':
--> 114             return sdfReader(libfile, removeHs, fixHs, isgzip)
    115         elif ext == '.smi':
    116             return smiReader(libfile, removeHs, fixHs, isgzip)

~/maindisk/SANDBOX/acemanager/latest_2019-05-02_11h59/lib/python3.6/site-packages/moleculekit/smallmol/smallmollib.py in sdfReader(file, removeHs, fixHs, isgzip)
     57             mols.append(SmallMol(mol, removeHs=removeHs, fixHs=fixHs))
     58         except:
---> 59             if mol.HasProp('_Name'):
     60                 name = mol.GetProp('_Name')
     61             print('Failed to load molecule{}. Skipping to next molecule.'.format(' with name {}'.format(name)))

AttributeError: 'NoneType' object has no attribute 'HasProp'

The files: 1a1e.zip
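
The second traceback is raised inside the `except` block: the SDF entry failed sanitization, so `mol` is `None`, and the handler then calls `mol.HasProp`. A sketch of an error message builder that does not assume parsing succeeded is shown below. `FakeMol` is a hypothetical stand-in for a parsed molecule so that the example runs without RDKit, and `failure_message` is likewise a made-up name:

```python
class FakeMol:
    """Hypothetical stand-in for a parsed molecule with named properties."""

    def __init__(self, props):
        self._props = props

    def HasProp(self, key):
        return key in self._props

    def GetProp(self, key):
        return self._props[key]


def failure_message(mol):
    """Build the skip message without assuming that parsing succeeded."""
    name = ""
    if mol is not None and mol.HasProp("_Name"):
        name = " with name {}".format(mol.GetProp("_Name"))
    return "Failed to load molecule{}. Skipping to next molecule.".format(name)


print(failure_message(None))
print(failure_message(FakeMol({"_Name": "1a1e_ligand"})))
```

Initializing `name` before the check also avoids an unbound variable when the molecule has no `_Name` property.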

Issues with prepareProteinForAtomtyping

When mol.chain is entirely empty, line 83 in moleculekit/moleculekit/tools/preparation.py sets mol.chain to sequenceID(mol.segid). Line 94 then calls len() on its elements, which raises TypeError: object of type 'numpy.int64' has no len().

moleculekit/moleculekit/tools/preparation.py

Line 83 in ea58f7d

mol.chain = sequenceID(mol.segid)

moleculekit/moleculekit/tools/preparation.py

Line 94 in ea58f7d

if np.any([len(cc) > 1 for cc in chainids]):
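
One way to make the length check tolerate integer chain ids is to convert each id to a string first. Whether that is the intended fix is an assumption; the sketch only shows that the check itself can be written so that it does not depend on the id type. `has_long_chain_ids` is a hypothetical helper:

```python
def has_long_chain_ids(chainids):
    """Return True if any chain id is longer than one character.

    Ids are converted to str first, so integer ids do not break len().
    """
    return any(len(str(cc)) > 1 for cc in chainids)


print(has_long_chain_ids(["A", "B"]))  # False
print(has_long_chain_ids([0, 1, 12]))  # True, "12" has two characters
```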

Wrapping a whole trajectory

Wrapping a whole trajectory behaves differently than wrapping each frame individually. Water molecules are sometimes not wrapped correctly in the last frames. The first is always wrapped fine.

Conda build 3.7 fails

Due to competing dependencies. Should figure out how to build it locally first.

'protein' atomselect disagreement + bond guessing issue

Some scPDB proteins mentioned in issue #31, like 1a42_1, have terminal residues which are not correctly bonded to the rest of the protein. Guessing bonds does not seem to work in moleculekit; however, it works in VMD.

Ideas:

We are not sending all the data to VMD or to atomselect (I'm quite sure we do though)
Maybe there has been an update of the atomselection code in VMD?

Error with voxelization

Hi guys,

I am experiencing some problems in the voxelization procedure. Even when following the example, an error is raised.

from htmd.molecule.molecule import Molecule
from moleculekit.tools.voxeldescriptors import getVoxelDescriptors

m = Molecule('3PTB')
m.filter('protein')
features, centers, N = getVoxelDescriptors(m, buffer=8)

The error:

  File "current_session.py", line 7, in <module>
    features, centers, N = getVoxelDescriptors(m, buffer=8)
  File "/home/alberto/Software/miniconda3/envs/htmd/lib/python3.6/site-packages/moleculekit/tools/voxeldescriptors.py", line 221, in getVoxelDescriptors
    channels, mol = getChannels(mol, aromaticNitrogen, version, validitychecks)
  File "/home/alberto/Software/miniconda3/envs/htmd/lib/python3.6/site-packages/moleculekit/tools/voxeldescriptors.py", line 106, in getChannels
    mol.atomtype, mol.charge = getPDBQTAtomTypesAndCharges(mol, aromaticNitrogen=aromaticNitrogen, validitychecks=validitychecks)
  File "/home/alberto/Software/miniconda3/envs/htmd/lib/python3.6/site-packages/moleculekit/tools/atomtyper.py", line 222, in getPDBQTAtomTypesAndCharges
    atomtypingValidityChecks(mol)
  File "/home/alberto/Software/miniconda3/envs/htmd/lib/python3.6/site-packages/moleculekit/tools/atomtyper.py", line 196, in atomtypingValidityChecks
    raise ValueError('The protein has less bonds than (number of atoms - 1). This seems incorrect. You can assign bonds with `mol.bonds = mol._getBonds()`')
ValueError: The protein has less bonds than (number of atoms - 1). This seems incorrect. You can assign bonds with `mol.bonds = mol._getBonds()

I tried to follow the suggestion about the bonds:

from htmd.molecule.molecule import Molecule
from moleculekit.tools.voxeldescriptors import getVoxelDescriptors

m = Molecule('3PTB')
m.filter('protein')
m.bonds = m._getBonds()
features, centers, N = getVoxelDescriptors(m, buffer=8)

The error:

  File "current_session2.py", line 8, in <module>
    features, centers, N = getVoxelDescriptors(m, buffer=8)
  File "/home/alberto/Software/miniconda3/envs/htmd/lib/python3.6/site-packages/moleculekit/tools/voxeldescriptors.py", line 221, in getVoxelDescriptors
    channels, mol = getChannels(mol, aromaticNitrogen, version, validitychecks)
  File "/home/alberto/Software/miniconda3/envs/htmd/lib/python3.6/site-packages/moleculekit/tools/voxeldescriptors.py", line 106, in getChannels
    mol.atomtype, mol.charge = getPDBQTAtomTypesAndCharges(mol, aromaticNitrogen=aromaticNitrogen, validitychecks=validitychecks)
  File "/home/alberto/Software/miniconda3/envs/htmd/lib/python3.6/site-packages/moleculekit/tools/atomtyper.py", line 222, in getPDBQTAtomTypesAndCharges
    atomtypingValidityChecks(mol)
  File "/home/alberto/Software/miniconda3/envs/htmd/lib/python3.6/site-packages/moleculekit/tools/atomtyper.py", line 201, in atomtypingValidityChecks
    raise RuntimeError('The protein has duplicate bond information. This will mess up atom typing. Please keep only unique bonds in the molecule. If you want you can use moleculekit.molecule.calculateUniqueBonds for this.')
RuntimeError: The protein has duplicate bond information. This will mess up atom typing. Please keep only unique bonds in the molecule. If you want you can use moleculekit.molecule.calculateUniqueBonds for this.

Ok, as before I tried to follow the suggestion by using calculateUniqueBonds (following the example in the method).

from htmd.molecule.molecule import Molecule
from moleculekit.tools.voxeldescriptors import getVoxelDescriptors
from moleculekit.molecule import calculateUniqueBonds

m = Molecule('3PTB')
m.filter('protein')
m.bonds, m.bondtype = calculateUniqueBonds(m.bonds, m.bondtype)
features, centers, N = getVoxelDescriptors(m, buffer=8)

The error:

Traceback (most recent call last):
  File "current_session3.py", line 9, in <module>
    features, centers, N = getVoxelDescriptors(m, buffer=8)
  File "/home/alberto/Software/miniconda3/envs/htmd/lib/python3.6/site-packages/moleculekit/tools/voxeldescriptors.py", line 221, in getVoxelDescriptors
    channels, mol = getChannels(mol, aromaticNitrogen, version, validitychecks)
  File "/home/alberto/Software/miniconda3/envs/htmd/lib/python3.6/site-packages/moleculekit/tools/voxeldescriptors.py", line 106, in getChannels
    mol.atomtype, mol.charge = getPDBQTAtomTypesAndCharges(mol, aromaticNitrogen=aromaticNitrogen, validitychecks=validitychecks)
  File "/home/alberto/Software/miniconda3/envs/htmd/lib/python3.6/site-packages/moleculekit/tools/atomtyper.py", line 222, in getPDBQTAtomTypesAndCharges
    atomtypingValidityChecks(mol)
  File "/home/alberto/Software/miniconda3/envs/htmd/lib/python3.6/site-packages/moleculekit/tools/atomtyper.py", line 196, in atomtypingValidityChecks
    raise ValueError('The protein has less bonds than (number of atoms - 1). This seems incorrect. You can assign bonds with `mol.bonds = mol._getBonds()`')
ValueError: The protein has less bonds than (number of atoms - 1). This seems incorrect. You can assign bonds with `mol.bonds = mol._getBonds()

Do you have any suggestion?
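
For reference, the "duplicate bond information" error is about the same bond being listed more than once. The sketch below de-duplicates a bond list in plain Python, assuming that a bond and its reverse count as the same bond. It is a stand-in for illustration, not the implementation of `calculateUniqueBonds`:

```python
def unique_bonds(bonds, bondtypes):
    """Drop repeated bonds, treating (a, b) and (b, a) as the same bond.

    Keeps the bond type of the first occurrence and returns sorted pairs.
    """
    seen = {}
    for (a, b), btype in zip(bonds, bondtypes):
        key = (min(a, b), max(a, b))
        seen.setdefault(key, btype)
    keys = sorted(seen)
    return keys, [seen[k] for k in keys]


bonds = [(0, 1), (1, 0), (1, 2), (0, 1)]
types = ["1", "1", "2", "1"]
print(unique_bonds(bonds, types))  # ([(0, 1), (1, 2)], ['1', '2'])
```

Note that the third attempt above de-duplicates the original `m.bonds` without first assigning the guessed bonds, which may be why the first error returns. Applying the two suggestions in sequence (guess bonds, then keep only unique ones) is one thing to try.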

Molecules in getVoxelDescriptors()

Hi devs,

I have a short question on the topic of getVoxelDescriptors. As mentioned in issue #12, it is sometimes not possible to create a SmallMol. I am trying to reproduce the KDeep paper. You already told me that during the development of the paper the SmallMol class did not exist and you used the Molecule class.

If I create a Molecule object from a ligand, then I get an error in getVoxelDescriptors() saying that my molecule is not a protein.
Is the only option to deactivate the validitychecks?
How can I get a result similar to the one from the 'old' package?

If this is not possible, I think I have to use the old version of the package. But then I need to have my data in the pdbqt format from AutoDock4. Can you give me a hint on how you converted the files to the pdbqt format?

moleculekit.molecule.Molecule.view(viewer='webgl') does not work

In [1]: from moleculekit.molecule import Molecule

In [2]: m = Molecule('3ptb')

In [3]: m.view(viewer='webgl')
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-3-fb9322d6c848> in <module>
----> 1 m.view(viewer='webgl')

~/maindisk/SANDBOX/moleculekit/miniconda3/lib/python3.6/site-packages/moleculekit/molecule.py in view(self, sel, style, color, guessBonds, viewer, hold, name, viewerhandle, gui)
   1618             self._viewVMD(psf, pdb, xtc, viewerhandle, name, guessBonds)
   1619         elif viewer.lower() == 'ngl' or viewer.lower() == 'webgl':
-> 1620             retval = self._viewNGL(gui=gui)
   1621         else:
   1622             os.remove(xtc)

~/maindisk/SANDBOX/moleculekit/miniconda3/lib/python3.6/site-packages/moleculekit/molecule.py in _viewNGL(self, gui)
   1661 
   1662     def _viewNGL(self, gui=False):
-> 1663         from nglview import HTMDTrajectory
   1664         import nglview
   1665         traj = HTMDTrajectory(self)

ModuleNotFoundError: No module named 'nglview'

If HTMD is supposed to be the only package that ships with nglview, the import should perhaps be wrapped in a try block.
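A minimal sketch of the guarded import suggested above. The helper name `require_optional` and the message wording are hypothetical and not part of moleculekit; the idea is only to turn a bare ModuleNotFoundError into a message that tells the user what to install.

```python
import importlib


def require_optional(module_name, feature, install_hint):
    """Import an optional dependency or fail with an actionable message.

    Hypothetical helper for illustration; not a moleculekit function.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:  # ModuleNotFoundError is a subclass of ImportError
        raise ImportError(
            f"{feature} needs the optional package '{module_name}'. {install_hint}"
        ) from err


# An installed module is returned unchanged.
json_module = require_optional("json", "JSON export", "It ships with Python.")

# A missing module raises the friendlier error instead of a bare traceback.
try:
    require_optional("surely_not_installed_pkg", "The 'webgl' viewer", "Install it first.")
except ImportError as err:
    message = str(err)
```

Inside a viewer method, such a call could replace the unguarded import, so the failure points at the missing optional dependency rather than at library internals.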

viewCrystalPacking writes cell onto top molecule

It should instead first visualize the protein packing and then draw the cell on that molecule, or create a new dummy molecule on which to visualize the cell. Otherwise, if you call it twice for different molecules, it will draw the cell lines on the previous molecule.

SmallMol issue with writing multiple frames

from moleculekit.smallmol.smallmol import SmallMol
sm = SmallMol('CCOO')
sm.generateConformers(10)
for i in range(10):
  sm.write(f'/tmp/stefan{i}.pdb', frames=i)

crashes with

NameError                                 Traceback (most recent call last)
<ipython-input-17-aeef6cceb0cb> in <module>
      1 for i in range(10):
----> 2     sm.write(f'/tmp/stefan{i}.pdb', frames=[i,])
      3 

~/Work/moleculekit/moleculekit/smallmol/smallmol.py in write(self, fname, frames, merge)
    601         else:
    602             mol = self.toMolecule()
--> 603             mol.write(fname, frames)
    604 
    605     def view(self, *args, **kwargs):

~/Work/moleculekit/moleculekit/molecule.py in write(self, filename, sel, type, **kwargs)
   1309         if not (sel is None or (isinstance(sel, str) and sel == "all")):
   1310             src = self.copy()
-> 1311             src.filter(sel, _logger=False)
   1312 
   1313         if type in _WRITERS:

~/Work/moleculekit/moleculekit/molecule.py in filter(self, sel, _logger)
    784 
    785         if not isinstance(s, np.ndarray) or s.dtype != bool:
--> 786             raise NameError("Filter can only work with string inputs or boolean arrays")
    787         return self.remove(np.invert(s), _logger=_logger)
    788 

NameError: Filter can only work with string inputs or boolean arrays
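Reading the signatures in the traceback, SmallMol.write calls `mol.write(fname, frames)` while Molecule.write is declared as `write(self, filename, sel, type, **kwargs)`, so the frame list appears to be bound to `sel`. The sketch below reproduces that binding with stand-in functions. The functions are hypothetical, and whether Molecule.write accepts a `frames` keyword is an assumption that the traceback does not confirm.

```python
def molecule_write(filename, sel=None, type=None, **kwargs):
    """Stand-in that mirrors the parameter order shown in the traceback."""
    return {"filename": filename, "sel": sel, "kwargs": kwargs}


def smallmol_write_positional(fname, frames):
    # The second positional argument lands in `sel`, not in a frame option.
    return molecule_write(fname, frames)


def smallmol_write_keyword(fname, frames):
    # Passing by keyword keeps the frame list out of `sel`.
    # (Assumes the callee understands a `frames` keyword.)
    return molecule_write(fname, frames=frames)


bad = smallmol_write_positional("out.pdb", [0])
good = smallmol_write_keyword("out.pdb", [0])
```

In the positional variant the frame list ends up where an atom selection is expected, which matches the "string inputs or boolean arrays" complaint raised by filter.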

Make clear that distances care about molecule chains

Since we don't want to wrap distances inside a molecule fragment/chain, the chain field of the molecule needs to be defined correctly before calculating distances. We should:
a) Make it clear somewhere
b) Consider maybe even the segid field
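To make the point concrete, here is a small sketch of why the chain assignment matters. It is an assumption-laden illustration, not moleculekit's implementation: it uses an orthorhombic box, the hypothetical helper names `minimum_image` and `distance`, and it applies periodic wrapping only when the two atoms belong to different chains.

```python
import math


def minimum_image(delta, box_length):
    """Wrap one coordinate difference into [-box_length/2, box_length/2]."""
    return delta - box_length * round(delta / box_length)


def distance(a, b, box, same_chain):
    """Distance between points a and b in an orthorhombic box.

    Wrapping is skipped when both atoms belong to the same chain.
    """
    deltas = [bi - ai for ai, bi in zip(a, b)]
    if not same_chain:
        deltas = [minimum_image(d, length) for d, length in zip(deltas, box)]
    return math.sqrt(sum(d * d for d in deltas))


box = (10.0, 10.0, 10.0)
a, b = (1.0, 0.0, 0.0), (9.0, 0.0, 0.0)
inter = distance(a, b, box, same_chain=False)  # wrapped across the boundary
intra = distance(a, b, box, same_chain=True)   # left unwrapped
```

With these made-up coordinates the same pair of points is 2 units apart when treated as belonging to different chains and 8 units apart when treated as the same chain, so a wrong chain field changes the result.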

Issue with vmd viewer

Hi Dev,

I tried to view voxel features using the code below, but nothing shows up at the end.

slig = SmallMol("lig.mol2")
lig_vox, lig_centers, lig_N = getVoxelDescriptors(slig, voxelsize=0.5, buffer=1) # gives me 2d output
slig.view(guessBonds=False)
viewVoxelFeatures(lig_vox, lig_centers, lig_N)

I tried to solve it by checking the thread, but I could not find a clear solution to my problem. I am using a Mac with VMD installed in Applications, and I am running the code in a Jupyter notebook.

Thanks.

Error .view() for ligand

Hi guys,

It's me again. I was playing around with the Molecule object and trying to visualize the ligand.

from htmd.molecule.molecule import Molecule
m = Molecule('4eiy')
lig = m.copy()
lig.filter('resname ZMA')
m.write('4eiy.pdb')
lig.write('zma.pdb')
lig.view()
lig.view()
m.reps.add('protein', 'NewCartoon', 8)
m.reps.add('resname ZMA', 'Licorice')
m.view()

The first image is the visualization of the initial molecule m.

The second image is the visualization of the molecule object for the ligand only lig obtained by the filter command.

Finally, the last image is the ligand saved as a PDB file from the ligand object lig and then opened in VMD.

viewVoxelFeatures uses hardcoded set of features

Hi everyone,

Firstly, thanks for providing this super helpful package!
I noticed that moleculekit.tools.voxeldescriptors.viewVoxelFeatures uses a hardcoded set of eight features for the graphical representation, making it impossible to use the function to draw custom features. Providing a set of eight custom features still works, but will use an incorrect naming scheme. It would be great if this were more customizable.

Best,
Clemens

Error Smallmol reading file in jupyter-notebook

Hi guys,

I started to use HTMD again, and I was doing some simple tests. I was trying to load a mol2 file with SmallMol in a jupyter-notebook session, but apparently something goes wrong (see the image). If I use an ipython session, everything works.

OSX

Sorry to be the 'other arch' guy, but... binaries are included for all arches, while the conda package is built for linux only.

pip install seems to work though.

element Cg not found

I am reading protein.mol2 as input. However, when I read the file with Molecule, the following error comes up.

from moleculekit.molecule import Molecule
Molecule("/BiO/pekim/test/1jje/protein.mol2")
Traceback (most recent call last):
  File "", line 1, in
  File "/home/jlk/anaconda3/envs/LigVoxel/lib/python3.6/site-packages/moleculekit/molecule.py", line 270, in __init__
    self.read(filename, **kwargs)
  File "/home/jlk/anaconda3/envs/LigVoxel/lib/python3.6/site-packages/moleculekit/molecule.py", line 1102, in read
    mol = rr(fname, frame=frame, topoloc=tmppdb, **kwargs)
  File "/home/jlk/anaconda3/envs/LigVoxel/lib/python3.6/site-packages/moleculekit/readers.py", line 588, in MOL2read
    return MolFactory.construct(topologies[0], trajectories[0], filename, frame)
  File "/home/jlk/anaconda3/envs/LigVoxel/lib/python3.6/site-packages/moleculekit/readers.py", line 197, in construct
    uniqueBonds=uniqueBonds,
  File "/home/jlk/anaconda3/envs/LigVoxel/lib/python3.6/site-packages/moleculekit/readers.py", line 330, in _parseTopology
    MolFactory._elementChecks(mol, filename)
  File "/home/jlk/anaconda3/envs/LigVoxel/lib/python3.6/site-packages/moleculekit/readers.py", line 281, in _elementChecks
    el, filename
RuntimeError: Element Cg was read in file /BiO/pekim/test/1jje/protein.mol2 but was not found in the periodictable. To disable this check, pass validateElements=False to the Molecule constructor or read method.
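Before disabling the check, it can help to find which atoms carry the unexpected symbol. The sketch below scans the @<TRIPOS>ATOM section of a mol2 file and flags atoms whose atom type prefix is not a known element. It assumes the offending symbol comes from the atom type column, which the traceback does not confirm; the helper name, the element subset, and the sample records are all made up for illustration.

```python
# Illustrative subset only; a real check would use a full periodic table.
KNOWN_ELEMENTS = {"H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I",
                  "Na", "Mg", "K", "Ca", "Fe", "Zn"}


def suspicious_mol2_atoms(mol2_text):
    """Return (atom_id, atom_name, atom_type) for atoms with an unknown type prefix."""
    flagged = []
    in_atoms = False
    for line in mol2_text.splitlines():
        if line.startswith("@<TRIPOS>"):
            # Only lines inside the ATOM record section are inspected.
            in_atoms = line.strip() == "@<TRIPOS>ATOM"
            continue
        fields = line.split()
        if in_atoms and len(fields) >= 6:
            atom_id, atom_name, atom_type = fields[0], fields[1], fields[5]
            element = atom_type.split(".")[0]  # "C.3" -> "C"
            if element not in KNOWN_ELEMENTS:
                flagged.append((atom_id, atom_name, atom_type))
    return flagged


# Made-up example records, not taken from the reporter's file.
example = """@<TRIPOS>MOLECULE
demo
@<TRIPOS>ATOM
1 N 0.0 0.0 0.0 N.3 1 ALA1 0.0
2 CG 1.0 0.0 0.0 Cg 1 ALA1 0.0
@<TRIPOS>BOND
1 1 2 1
"""
flagged = suspicious_mol2_atoms(example)
```

If the flagged entries turn out to be mislabeled carbons, fixing the atom type in the file is likely a cleaner route than passing validateElements=False.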

Pydantic Models

Hi, really cool package! I was poking through your code (especially the Molecule objects) and noticed that you may be interested in the Pydantic package for automatic validation of shapes, class variables, schema, etc. See here for a similar Molecule model implementation with automatic shaping and validation of NumPy arrays and other objects.

This is part of the MolSSI QCArchive infrastructure tool stack. It may be worth a chat to see if we could collaborate on some core infrastructure!

Error in atom typing

prot = Molecule('3PTB', name='Trypsin')
slig = SmallMol('./benzamidine.mol2')
prot = prepareProteinForAtomtyping(prot)

The code snippet above gives a runtime error:
"Found atoms with resnames ['BEN'] in the Molecule which can cause issues with the voxelization. Please make sure to only pass protein atoms and metals".
Any input on how to resolve this?

Voxel descriptors - problem with assigning atom types to SmallMol

Hi devs,

Before the documentation is ready, I tried to figure out myself how to generate voxel descriptors for my small molecules. I used this molecule from the PDB as a test. Below is the code:

from rdkit import Chem

from moleculekit.smallmol.smallmol import SmallMol
from moleculekit.tools.voxeldescriptors import getVoxelDescriptors

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d

# read molecule
suppl = Chem.SDMolSupplier("../data/GDP_model.sdf")
mol = SmallMol(suppl[0])

# generate voxels
features, _, _ = getVoxelDescriptors(mol,
                                     boxsize=(16, 16, 16),
                                     voxelsize=0.5,
                                     center=mol.getCenter())
features = features.reshape(32, 32, 32, 8)

# visualize voxels
channel_names = ["hydrophobic", "aromatic", "hbond_acceptor", "hbond_donor",
                 "positive_ionizable", "negative_ionizable", "metal", "occupancies"]
threshold = 0.9
features = (features >= threshold).astype(float)

fig = plt.figure(figsize=(16, 8))

for i in range(8):
    ax = fig.add_subplot(241+i, projection="3d")
    ax.voxels(features[:, :, :, i], facecolors="red", edgecolor="k")
    ax.set_title(channel_names[i])

fig.tight_layout()
plt.show()

Most of the channels generated by this code look fine, but the hydrophobic and negative_ionizable channels are completely empty. This does not make sense chemically, since at least some carbon atoms on the guanine and ribose rings should be considered hydrophobic, and the hydroxyls on the phosphates are definitely negatively ionizable.

I noticed when a SmallMol object is passed into getVoxelDescriptors, the function calls _getPropertiesRDkit, which in turn calls some RDKit functions to assign properties to atoms. I tried to print the channels array generated by this function, and it looked like all atoms are assigned zeros for hydrophobic and negative_ionizable channels.

Did I do this wrong, or is this an issue with RDKit? Thank you
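A quick way to confirm which channels are empty at the atom level, before any voxelization, is to scan the per-atom channel table column by column. The sketch below is pure Python and assumes the table is a sequence of rows with one value per channel in the order listed in the code above; the helper name and the sample rows are made up.

```python
CHANNEL_NAMES = ["hydrophobic", "aromatic", "hbond_acceptor", "hbond_donor",
                 "positive_ionizable", "negative_ionizable", "metal", "occupancies"]


def empty_channels(per_atom_channels, names=CHANNEL_NAMES):
    """Return the names of channels that are zero for every atom."""
    return [
        name
        for idx, name in enumerate(names)
        if not any(row[idx] for row in per_atom_channels)
    ]


# Made-up per-atom rows, one value per channel.
rows = [
    [0, 1, 0, 1, 0, 0, 0, 1],
    [0, 0, 1, 0, 1, 0, 0, 1],
]
missing = empty_channels(rows)
```

If a channel is already empty here, the problem lies in the atom typing rather than in the voxel grid or the plotting threshold.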

Implement unique bonds from two files

We should try to keep unique bonds only. It seems that if we read a PSF and a PDB, we store each bond twice. While this is not an issue in most of moleculekit, it trips up pybel, which then detects wrong atom types for the atoms, which in turn affects the voxelization.
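One possible way to do this, sketched in plain Python: normalize each bond so the lower atom index comes first, then drop repeats while keeping the original order. The function name is hypothetical and the bond lists are made-up examples.

```python
def unique_bonds(bonds):
    """Drop duplicate bonds, treating (i, j) and (j, i) as the same bond."""
    seen = set()
    result = []
    for i, j in bonds:
        key = (i, j) if i <= j else (j, i)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


# Bonds as they might arrive from two files describing the same structure.
psf_bonds = [(0, 1), (1, 2)]
pdb_bonds = [(1, 0), (2, 1), (2, 3)]
merged = unique_bonds(psf_bonds + pdb_bonds)
```

Sorting the pair before comparing is what catches the case where the two files list the same bond in opposite atom order.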

_deduce_PDB_atom_name problem

When the mol2 file contains an invalid element such as 'Du',
the writer's _deduce_PDB_atom_name function is called with (name, resname).
When the function is called with ('XE', 'XE7') as arguments, the return value is ' XE '.
It should be 'XE ', not ' XE '.
The returned value is used as the element argument of the _getPDBElement function.
If element.isalpha() is true, the first two characters are used. In this case they are ' X', and the return value
becomes 'X' instead of 'XE'.
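For reference, the usual PDB convention right-justifies the element symbol in the first two columns of the 4-character atom-name field, so names of two-letter elements start in the first column and names of one-letter elements start in the second. A minimal sketch of that rule follows. It assumes the element is already known and is not moleculekit's implementation; `pdb_atom_name` is a hypothetical name.

```python
def pdb_atom_name(name, element):
    """Format an atom name for the 4-character PDB atom-name field."""
    name = name.strip()
    element = element.strip()
    if len(name) >= 4:
        # Four-character names fill the whole field.
        return name[:4]
    if len(element) == 2:
        # Two-letter elements start in the first column.
        return name.ljust(4)
    # One-letter elements leave the first column blank.
    return (" " + name).ljust(4)


xenon = pdb_atom_name("XE", "XE")
alpha_carbon = pdb_atom_name("CA", "C")
calcium = pdb_atom_name("CA", "Ca")
```

Under this rule the xenon atom starts in the first column, which is the behaviour the report asks for, while an alpha carbon and a calcium ion with the same name stay distinguishable by their padding.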

importing error for voxeldescriptor

Hi Stephan,

I was trying to voxelize a structure, but I got the following error when importing getVoxelDescriptors. Any hints on how to fix this?

from moleculekit.tools.voxeldescriptors import getVoxelDescriptors
Traceback (most recent call last):
  File "", line 1, in
  File "/home/wang123/anaconda3/envs/moleculekit/lib/python3.6/site-packages/moleculekit/tools/voxeldescriptors.py", line 23, in
    occupancylib = ctypes.cdll.LoadLibrary(os.path.join(libdir, "occupancy_ext.so"))
  File "/home/wang123/anaconda3/envs/moleculekit/lib/python3.6/ctypes/__init__.py", line 426, in LoadLibrary
    return self._dlltype(name)
  File "/home/wang123/anaconda3/envs/moleculekit/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/wang123/anaconda3/envs/moleculekit/lib/python3.6/site-packages/moleculekit/lib/Linux/occupancy_ext.so: undefined symbol: __exp_finite

Thanks a lot!

Debby

Error with getVoxelDescriptors: protein has less bonds than atoms

Hi,

I am trying to generate voxel descriptors for some peptide segments. If I use the mol.filter() method to select a segment in a protein then generate voxel descriptors, I get the following error. If I do not do the filtering, the code works fine without error.

from moleculekit.molecule import Molecule
from moleculekit.tools.atomtyper import prepareProteinForAtomtyping
from moleculekit.tools.voxeldescriptors import getVoxelDescriptors

mol = Molecule("1h8d")
mol.filter("protein")
mol.filter("chain H and resid 16 to 21")  # If I comment this line, the code works fine!
mol = prepareProteinForAtomtyping(mol)
mol.center()
features, _, _ = getVoxelDescriptors(mol,
                                     boxsize=(32., 32., 32.),
                                     voxelsize=1.,
                                     center=(0., 0., 0.))
features = features.reshape(32, 32, 32, 8)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-7702ebcfca0a> in <module>
     11                               boxsize=(32., 32., 32.),
     12                               voxelsize=1.,
---> 13                               center=(0., 0., 0.))
     14 v = v.reshape(32, 32, 32, 8)

~/.conda/envs/test-env/lib/python3.6/site-packages/moleculekit/tools/voxeldescriptors.py in getVoxelDescriptors(mol, boxsize, voxelsize, buffer, center, usercenters, userchannels, usercoords, aromaticNitrogen, method, version, validitychecks)
    193     channels = userchannels
    194     if channels is None:
--> 195         channels, mol = getChannels(mol, aromaticNitrogen, version, validitychecks)
    196 
    197     if channels.dtype == bool:

~/.conda/envs/test-env/lib/python3.6/site-packages/moleculekit/tools/voxeldescriptors.py in getChannels(mol, aromaticNitrogen, version, validitychecks)
     83         elif version == 2:
     84             from moleculekit.tools.atomtyper import getFeatures, getPDBQTAtomTypesAndCharges
---> 85             mol.atomtype, mol.charge = getPDBQTAtomTypesAndCharges(mol, aromaticNitrogen=aromaticNitrogen, validitychecks=validitychecks)
     86             channels = getFeatures(mol)
     87 

~/.conda/envs/test-env/lib/python3.6/site-packages/moleculekit/tools/atomtyper.py in getPDBQTAtomTypesAndCharges(mol, aromaticNitrogen, validitychecks)
    160 def getPDBQTAtomTypesAndCharges(mol, aromaticNitrogen=False, validitychecks=True):
    161     if validitychecks:
--> 162         atomtypingValidityChecks(mol)
    163 
    164     atomsProp = getProperties(mol)

~/.conda/envs/test-env/lib/python3.6/site-packages/moleculekit/tools/atomtyper.py in atomtypingValidityChecks(mol)
    139 
    140     if mol.bonds.shape[0] < mol.numAtoms:
--> 141         raise ValueError('The protein has less bonds than atoms. This seems incorrect. Assign them with `mol.bonds = mol._getBonds()`')
    142 
    143     if np.all(mol.segid == '') or np.all(mol.chain == ''):

ValueError: The protein has less bonds than atoms. This seems incorrect. Assign them with `mol.bonds = mol._getBonds()`

Interestingly, I found that a workaround is to save the filtered segment and load it again:

mol = Molecule("1h8d")
mol.filter("protein")
mol.filter("chain H and resid 16 to 21")
mol = prepareProteinForAtomtyping(mol)
mol.center()
# Save the Molecule object to file...
mol.write("test.pdb")
# ...and load it again
mol = Molecule("test.pdb")
# Then it works fine without error!
features, _, _ = getVoxelDescriptors(mol,
                                     boxsize=(32., 32., 32.),
                                     voxelsize=1.,
                                     center=(0., 0., 0.))
features = features.reshape(32, 32, 32, 8)

Any idea? Thanks.
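Two thresholds appear in the error messages on this page: the traceback above compares the bond count against the number of atoms, while another report quotes "(number of atoms - 1)". The sketch below contrasts the two on an unbranched chain. It only illustrates the graph-theory point that a connected, acyclic molecule with N atoms has exactly N - 1 bonds; it is not a diagnosis of this particular failure, and the function names are hypothetical.

```python
def passes_strict_check(num_atoms, num_bonds):
    """Mirrors `bonds < numAtoms` raising an error."""
    return num_bonds >= num_atoms


def passes_relaxed_check(num_atoms, num_bonds):
    """A connected molecule with N atoms needs at least N - 1 bonds."""
    return num_bonds >= num_atoms - 1


# An unbranched chain of 5 atoms has 4 bonds and no rings.
chain_strict = passes_strict_check(5, 4)
chain_relaxed = passes_relaxed_check(5, 4)

# A 6-membered ring has 6 atoms and 6 bonds.
ring_strict = passes_strict_check(6, 6)
```

A short fragment without rings can therefore fail the stricter comparison even when every bond is present, whereas a missing bond list fails both.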

Errors on .so file loading

/media/mydata/repos/moleculekit/moleculekit/util.py in <module>
     17     tmalignlib = ct.cdll.LoadLibrary(os.path.join(libdir, "tmalign.dll"))
     18 else:
---> 19     tmalignlib = ct.cdll.LoadLibrary(os.path.join(libdir, "tmalign.so"))
     20 
     21 

~/miniconda3/lib/python3.6/ctypes/__init__.py in LoadLibrary(self, name)
    424 
    425     def LoadLibrary(self, name):
--> 426         return self._dlltype(name)
    427 
    428 cdll = LibraryLoader(CDLL)

~/miniconda3/lib/python3.6/ctypes/__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error)
    346 
    347         if handle is None:
--> 348             self._handle = _dlopen(self._name, mode)
    349         else:
    350             self._handle = handle

OSError: /media/mydata/repos/moleculekit/moleculekit/lib/Linux/tmalign.so: failed to map segment from shared object

This happens on some machines. We might try static compilation.
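Independently of how the library is compiled, the load could be made non-fatal so that features which do not need it keep working. The sketch below is one hedged option, not the fix proposed above: `load_optional_library` is a hypothetical helper that returns None instead of raising when the shared object cannot be loaded.

```python
import ctypes
import os


def load_optional_library(path):
    """Try to load a shared library; return None instead of raising on failure."""
    try:
        return ctypes.cdll.LoadLibrary(path)
    except OSError as err:
        # Report the problem but let the import of the package continue.
        print(f"Could not load {os.path.basename(path)}: {err}")
        return None


# A path that does not exist stands in for a library that fails to load.
lib = load_optional_library(os.path.join("no_such_dir", "tmalign.so"))
```

Code that depends on the library would then check for None and raise a clear error at call time instead of at import time.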

smallmol containing Hs

Using SmallMol to load a molecule with 30 heavy atoms in version 0.1.12 returns: SmallMol with 30 atoms and 1 conformers
But after I updated moleculekit to version 0.4.7, it returns: SmallMol with 52 atoms and 1 conformers

Missing dependency openbabel

Hi,

I tried to read a ligand with SmallMol('ligands.sdf', force_reading=True), but openbabel seems to be missing from the dependency list. With force_reading=False it also fails and suggests setting it to True.

I installed moleculekit through conda install -c acellera moleculekit

In [1]: from moleculekit.smallmol.smallmol import SmallMol                                                                                                                               

In [2]: ligs = SmallMol('ligands.sdf', force_reading=True)                                                                                                                               
Reading ligands.sdf with force_reading procedure
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-2-f3cfc0eeef37> in <module>
----> 1 ligs = SmallMol('ligands.sdf', force_reading=True)

/shared/dominik/repos/moleculekit/moleculekit/smallmol/smallmol.py in __init__(self, mol, ignore_errors, force_reading, fixHs, removeHs)
     98         self._frame = 0
     99 
--> 100         _mol = self._initializeMolObj(mol, force_reading, ignore_errors)
    101 
    102         if removeHs:

/shared/dominik/repos/moleculekit/moleculekit/smallmol/smallmol.py in _initializeMolObj(self, mol, force_reading, ignore_errors)
    145                 if _mol is None and force_reading:
    146                     logger.warning('Reading {} with force_reading procedure'.format(mol))
--> 147                     sdf = openbabelConvert(mol, name_suffix, 'sdf')
    148                     _mol = Chem.SDMolSupplier(sdf, removeHs=False)[0]
    149                     os.remove(sdf)

/shared/dominik/repos/moleculekit/moleculekit/smallmol/util.py in openbabelConvert(input_file, input_format, output_format)
    213     """
    214 
--> 215     import openbabel
    216     import tempfile
    217     input_format = input_format[1:] if input_format.startswith('.') else input_format

ModuleNotFoundError: No module named 'openbabel'

Error on loading moleculekit (tmalign.dll?)

OSError                                   Traceback (most recent call last)
in
----> 1 from moleculekit.molecule import Molecule

C:\Anaconda3\envs\moleculekit\lib\site-packages\moleculekit\molecule.py in
      5 #
      6 import numpy as np
----> 7 from moleculekit.util import tempname, ensurelist
      8 from copy import deepcopy
      9 from os import path

C:\Anaconda3\envs\moleculekit\lib\site-packages\moleculekit\util.py in
     15 libdir = home(libDir=True)
     16 if platform.system() == "Windows":
---> 17     tmalignlib = ct.cdll.LoadLibrary(os.path.join(libdir, "tmalign.dll"))
     18 else:
     19     tmalignlib = ct.cdll.LoadLibrary(os.path.join(libdir, "tmalign.so"))

C:\Anaconda3\envs\moleculekit\lib\ctypes\__init__.py in LoadLibrary(self, name)
    424
    425     def LoadLibrary(self, name):
--> 426         return self._dlltype(name)
    427
    428 cdll = LibraryLoader(CDLL)

C:\Anaconda3\envs\moleculekit\lib\ctypes\__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error)
    346
    347         if handle is None:
--> 348             self._handle = _dlopen(self._name, mode)
    349         else:
    350             self._handle = handle

OSError: [WinError 126] The specified module could not be found

Error with getVoxelDescriptors

Hi guys,

I am experiencing problems using the getVoxelDescriptors() function.

~/anaconda3/lib/python3.7/site-packages/moleculekit/tools/atomtyper.py in getProperties(mol)
     78         from openbabel import pybel
     79     except ImportError:
---> 80         raise ImportError('Could not import openbabel. The atomtyper requires this dependency so please install it with `conda install openbabel -c conda-forge`')
     81
     82     name = NamedTemporaryFile(suffix='.pdb').name

ImportError: Could not import openbabel. The atomtyper requires this dependency so please install it with `conda install openbabel -c conda-forge`

I have installed openbabel. And I do not have any problems with importing:
import openbabel
from openbabel import *
from pybel import *
import pybel

How can I fix this problem?
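One thing worth checking: the traceback shows the atomtyper importing `from openbabel import pybel`, while the imports listed above load `pybel` as a top-level module. Open Babel 3 moved pybel inside the openbabel package, so a working top-level `import pybel` may point to an older 2.x installation. The sketch below reports which import style is available in the current environment; `find_pybel` is a hypothetical helper.

```python
import importlib


def find_pybel():
    """Return (module, style) for whichever pybel import style works, else (None, None)."""
    candidates = (
        ("openbabel.pybel", "from openbabel import pybel"),  # Open Babel 3 layout
        ("pybel", "import pybel"),                           # older top-level layout
    )
    for module_name, style in candidates:
        try:
            return importlib.import_module(module_name), style
        except ImportError:
            continue
    return None, None


pybel_module, import_style = find_pybel()
print("Working import style:", import_style)
```

If only the second style works, installing a 3.x build of openbabel into the same environment as moleculekit would be the first thing to try.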

Make ipython a requirement

(so that we avoid conda taking it from the base env and nothing working)

Problem in getChannels with the old version

While trying to reproduce the results of the DeepSite paper, I found that the channels reported in the paper are extracted by passing the version=1 argument to the getChannels function (something that should be stated more clearly in the code or the documentation). When I applied this function to my test case, the extracted channels were all zeros except for the last one, and I found that this is caused by mol.atomtype being an array of empty values. The version=2 branch of getChannels performs its own atom typing. Maybe something similar should be added to the version=1 branch as well?

Tutorial on voxelization and ML

After speaking to people at a conference: it would be good to add to the documentation a simple tutorial on using voxelization for ML with trivial PyTorch code.
