bp-kelley / descriptastorus
Descriptor computation (chemistry) and (optional) storage for machine learning
License: Other
When the following packages are installed:
descriptastorus==2.6.1
scipy==1.8.1
this error is raised:
python3.8/site-packages/descriptastorus/descriptors/rdNormalizedDescriptors.py, line 56, in <module>
dist = getattr(st, dist)
AttributeError: module 'scipy.stats' has no attribute 'gibrat'
The following combination works:
descriptastorus==2.6.1
scipy==1.9.0
So perhaps requirements.txt should require scipy>=1.9.0 rather than scipy>=1.7.0?
When trying to install descriptastorus via pip (through poetry), I see the following error:
poetry add git+https://github.com/bp-kelley/descriptastorus
PackageInfoError
Unable to determine package info for path: /var/folders/1m/281yfqmj42n90_znys0027y00000gp/T/pypoetry-git-descriptastorusw440te64
Fallback egg_info generation failed.
Command ['/var/folders/1m/281yfqmj42n90_znys0027y00000gp/T/tmp72kvq807/.venv/bin/python', 'setup.py', 'egg_info'] errored with the following return code 1, and output:
Descriptastorus requires rkdit to function, this is not installable by pip
see https://rdkit.org for more information
at ~/.poetry/lib/poetry/inspection/info.py:501 in _pep517_metadata
497│ venv.run_python("setup.py", "egg_info")
498│ return cls.from_metadata(path)
499│ except EnvCommandError as fbe:
500│ raise PackageInfoError(
→ 501│ path, "Fallback egg_info generation failed.", fbe
502│ )
503│ finally:
504│ os.chdir(cwd.as_posix())
505│
This error message is now outdated, because rdkit has a stable pip package, rdkit-pypi.
Hi, thanks for your great work. May I ask which normalization method you used to normalize the RDKit features?
I noticed that dists.py lists some values, but what do they mean, and how were those numbers calculated?
Thanks!
In #22 the following fix was applied to account for scipy renaming gilbrat to gibrat:
# fix change in scikit learn
if hasattr(dist, 'gilbrat'):
    dist = 'gilbrat'
else:
    dist = 'gibrat'
I think hasattr(dist, 'gilbrat') should be hasattr(st, 'gilbrat'). As written, the condition is always false: dist is a string, and a string has no attribute 'gilbrat'.
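A minimal sketch of the corrected check, under stated assumptions: resolve_dist_name is a hypothetical helper (not part of descriptastorus), and a SimpleNamespace stands in for scipy.stats so the example runs without scipy installed. The point is only that the probe targets the module rather than the name string:

```python
from types import SimpleNamespace


def resolve_dist_name(stats_module, candidates=("gibrat", "gilbrat")):
    """Return the first candidate name that exists on stats_module.

    Hypothetical helper for illustration; not part of descriptastorus.
    """
    for name in candidates:
        # Probe the module itself, not the distribution-name string.
        if hasattr(stats_module, name):
            return name
    raise AttributeError("none of %r found on the stats module" % (candidates,))


# Stand-ins for scipy.stats before and after the rename (assumption: only
# the attribute name matters for this check).
old_stats = SimpleNamespace(gilbrat=object())
new_stats = SimpleNamespace(gibrat=object())

old_name = resolve_dist_name(old_stats)
new_name = resolve_dist_name(new_stats)
print(old_name, new_name)
```

With scipy installed, passing scipy.stats as stats_module should behave the same way, and the returned name can then be handed to getattr(st, name).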
I installed descriptastorus via pip install git+https://github.com/bp-kelley/descriptastorus, and the installation succeeded. But every time I import descriptastorus, I get this error (although PyCharm reports "Process finished with exit code 0"):
ERROR:root:Unable to make new descriptors, descriptor generator not installed
When I try from descriptastorus.descriptors import rdDescriptors, rdNormalizedDescriptors, I also get an error:
ValueError: RDKit2D: Failed to initialize: unable to find specified properties:
fr_Al_COO
fr_Al_OH
fr_Al_OH_noTert
fr_ArN
fr_Ar_COO
fr_Ar_N
I have been stuck here for a few days. I tried reinstalling several times (via pip install git+https://github.com/bp-kelley/descriptastorus or from the GitHub source), but it didn't work. (RDKit 2021.9.2 and scikit-learn are already installed.)
Did I miss anything during deployment? Thanks for any suggestions.
Newer versions of numpy don't support numpy.float.
I get this error when using descriptastorus with numpy==1.24.3:
Traceback (most recent call last):
File "/opt/conda/envs/env/lib/python3.8/site-packages/descriptastorus/descriptors/rdDescriptors.py", line 336, in applyFunc
return FUNCS[name](m)
File "/opt/conda/envs/env/lib/python3.8/site-packages/rdkit/Chem/GraphDescriptors.py", line 124, in Ipc
cPoly = abs(Graphs.CharacteristicPolynomial(mol, adjMat))
File "/opt/conda/envs/env/lib/python3.8/site-packages/rdkit/Chem/Graphs.py", line 43, in CharacteristicPolynomial
res = numpy.zeros(nAtoms + 1, numpy.float)
File "/opt/conda/envs/env/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Importing MakeGenerator from descriptastorus.descriptors.DescriptorGenerator messes up how logging works. Below is an MWE demoing this unwanted behavior.
import logging
from descriptastorus.descriptors.DescriptorGenerator import MakeGenerator

logging.basicConfig(handlers=[logging.FileHandler("temp.log", "a", "utf-8"),
                              logging.StreamHandler()], level=logging.INFO,
                    format="%(asctime)s %(levelname)s [%(module)s.%(funcName)s]: %(message)s")

def status():
    logging.info("Hello!")

if __name__ == '__main__':
    status()
Running this code does not write "Hello!" to the log file temp.log as expected. If you comment out the line from descriptastorus.descriptors.DescriptorGenerator import MakeGenerator, the log entry is written without error.
To verify this issue I created a new conda environment, first installing rdkit with conda install -c rdkit rdkit, followed by pip install git+https://github.com/bp-kelley/descriptastorus and pip install scipy.
Would it be possible to tag the current head of the master branch and resume tagging each time a PR/groups of PRs is merged down? There are a number of tools that only allow a repo to be pulled based on a tag or release version, which forces forking your repo in order to use them. I would prefer to reference your repo directly.
Dear Team,
Please help me solve the following error.
I installed the package with !pip install git+https://github.com/bp-kelley/descriptastorus. Thank you.
ERROR:root:Unable to make new descriptors, descriptor generator not installed
Traceback (most recent call last):
File "tran_data.py", line 256, in
networks = [smile_to_graph(smile) for smile in smiles]
File "tran_data.py", line 256, in
networks = [smile_to_graph(smile) for smile in smiles]
File "tran_data.py", line 142, in smile_to_graph
graph_feature = rdkit_2d_normalized_features_generator(smile)
File "tran_data.py", line 135, in rdkit_2d_normalized_features_generator
raise ImportError('Failed to import descriptastorus. Please install descriptastorus '
ImportError: Failed to import descriptastorus. Please install descriptastorus (https://github.com/bp-kelley/descriptastorus) to use RDKit 2D normalized features.
conda install -c bp-kelley/label/bp-kelley kyotocabinet-python
This command does not work on Windows 10.
Second, I am facing this issue:
(base) C:\Users\Dr. Abdul Majid>pip install git+https://github.com/bp-kelley/descriptastorus
Collecting git+https://github.com/bp-kelley/descriptastorus
Cloning https://github.com/bp-kelley/descriptastorus to c:\users\dr9ee71.abd\appdata\local\temp\pip-req-build-owno_hf91.ABD\AppData\Local\Temp\pip-req-build-owno_hf9'
Running command git clone -q https://github.com/bp-kelley/descriptastorus 'C:\Users\DR9EE7
fatal: unable to access 'https://github.com/bp-kelley/descriptastorus/': Could not resolve host: github.com
ERROR: Command errored out with exit status 128: git clone -q https://github.com/bp-kelley/descriptastorus 'C:\Users\DR9EE7~1.ABD\AppData\Local\Temp\pip-req-build-owno_hf9' Check the logs for full command output.
Please advise on these two problems. I am using Windows 10.
Dear descriptastorus team,
Our software project Chemprop relies on descriptastorus (thanks for this great project btw) to compute a couple of descriptors for molecules. Newly installed versions of descriptastorus cause a couple of warnings when imported:
>>> import descriptastorus
WARNING:root:No normalization for BCUT2D_MWHI
WARNING:root:No normalization for BCUT2D_MWLOW
WARNING:root:No normalization for BCUT2D_CHGHI
WARNING:root:No normalization for BCUT2D_CHGLO
WARNING:root:No normalization for BCUT2D_LOGPHI
WARNING:root:No normalization for BCUT2D_LOGPLOW
WARNING:root:No normalization for BCUT2D_MRHI
WARNING:root:No normalization for BCUT2D_MRLOW
The descriptors seem to be unaffected by the warnings; however, the warnings are always a cause of concern for our users. What causes them, and is there anything we can do about it? They occur across different operating systems and Python versions (I can create a list if that would help). Thanks for your help!
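The WARNING:root: prefix indicates that these messages go through the root logger. Until the cause is addressed, one possible workaround on the caller's side is to raise the root logger's level temporarily around the import. quiet_root_logger below is a hypothetical helper, not part of descriptastorus or Chemprop, and it also hides any other root-logger warnings emitted inside the block.

```python
import contextlib
import logging


@contextlib.contextmanager
def quiet_root_logger(level=logging.ERROR):
    """Temporarily raise the root logger's level, restoring it afterwards."""
    root = logging.getLogger()
    previous = root.level
    root.setLevel(level)
    try:
        yield root
    finally:
        root.setLevel(previous)


level_before = logging.getLogger().level
with quiet_root_logger() as root:
    # In real use, `import descriptastorus` would go here (assumption).
    warning_enabled_inside = root.isEnabledFor(logging.WARNING)
    logging.warning("suppressed while the level is ERROR")
level_after = logging.getLogger().level
```

This only silences the messages; it does not add normalization for the BCUT2D descriptors.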
>>> from descriptastorus.descriptors import rdNormalizedDescriptors
ERROR:root:Unable to make new descriptors, descriptor generator not installed
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/dakka/miniconda3/envs/ligand_ml/lib/python3.6/site-packages/descriptastorus/descriptors/__init__.py", line 2, in <module>
from .DescriptorGenerator import *
File "/home/dakka/miniconda3/envs/ligand_ml/lib/python3.6/site-packages/descriptastorus/descriptors/DescriptorGenerator.py", line 35, in <module>
import pandas_flavor as pf
ModuleNotFoundError: No module named 'pandas_flavor'
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
Collecting git+https://github.com/bp-kelley/descriptastorus
Cloning https://github.com/bp-kelley/descriptastorus to /tmp/pip-req-build-Trc97p
Running command git clone -q https://github.com/bp-kelley/descriptastorus /tmp/pip-req-build-Trc97p
ERROR: Command errored out with exit status 1:
command: /home/tsa87/anaconda3/envs/python2/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-Trc97p/setup.py'"'"'; __file__='"'"'/tmp/pip-req-build-Trc97p/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base pip-egg-info
cwd: /tmp/pip-req-build-Trc97p/
Complete output (6 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-req-build-Trc97p/setup.py", line 36
print("Descriptastorus requires rkdit to function, this is not installable by pip", file=sys.stderr)
^
SyntaxError: invalid syntax
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
print is not a function in Python 2.x, which causes the syntax error. Adding from __future__ import print_function could make the code compatible with both versions.
In rdNormalizedDescriptors.py (line 49), distributions are loaded dynamically from scipy.stats.
In scipy version 1.11.0 the gilbrat function is replaced by the gibrat function (a typo fix).
This raises an error when rdNormalizedDescriptors.py is initialized.
Proposed fix:
Replace gilbrat with gibrat in dists.py
Require scipy>=1.11.0 in requirements.txt
While trying out this amazing package, I was able to replicate the results from #11.
python3.8 -c 'import rdkit;print(rdkit.__version__);from rdkit.Chem import Descriptors;print(len(Descriptors.descList))'
As you suggested in #11, I tried this and got results:
2021.09.3
123
I am using the conda-forge version of RDKit on Python 3.8, macOS 11 Big Sur (Apple Silicon). As you discussed, it seems that RDKit has moved fingerprints into rdkit.Chem.Fragments. Would you please look into this issue? Thanks for reading.
Hi,
In the rdDescriptors.py script there is the function:
def clip_sparse(vect, nbits):
    l = [0]*nbits
    for i,v in vect.GetNonzeroElements().items():
        l[i] = v if v > 255 else 255
    return l
I wonder whether the greater-than should be a less-than: l[i] = v if v < 255 else 255
Also, if np.int8 is used, then the upper limit is 127, right?
Thanks!
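As quoted, the expression v if v > 255 else 255 yields 255 for every count up to 255 and leaves larger counts unclipped, so the comparison does look inverted. Below is a sketch of the clipping that the function name suggests, written with min; it is not the maintainers' fix. FakeSparseVect is a hypothetical stand-in for an RDKit sparse count vector so the example runs without RDKit. The cap of 255 assumes the result is stored as unsigned 8-bit (np.uint8); signed np.int8 tops out at 127, so a cap of 127 would be needed in that case.

```python
def clip_sparse(vect, nbits, max_count=255):
    """Expand a sparse count vector to a dense list, capping each count."""
    counts = [0] * nbits
    for i, v in vect.GetNonzeroElements().items():
        # Keep small counts as they are; cap anything above max_count.
        counts[i] = min(v, max_count)
    return counts


class FakeSparseVect:
    """Hypothetical stand-in exposing only GetNonzeroElements()."""

    def __init__(self, elements):
        self._elements = dict(elements)

    def GetNonzeroElements(self):
        return dict(self._elements)


clipped = clip_sparse(FakeSparseVect({0: 3, 2: 300}), nbits=4)
clipped_int8 = clip_sparse(FakeSparseVect({1: 200}), nbits=2, max_count=127)
```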