hackingmaterials / matminer_examples
A repo of examples for the matminer (https://github.com/hackingmaterials/matminer) code
License: Other
and also put an image of it in the README so it shows up in the visual gallery
I am getting the following error:
Traceback (most recent call last):
File "MP_Citrine_MDF_MPDS_Masher.py", line 37, in <module>
from matminer.data_retrieval.retrieve_Citrine import CitrineDataRetrieval
File "matminer/data_retrieval/retrieve_Citrine.py", line 2, in <module>
from citrination_client import CitrinationClient, ChemicalFieldQuery,
ModuleNotFoundError: No module named 'citrination_client'
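The traceback indicates that citrination_client, which the Citrine retrieval module imports, is not installed in this environment. A minimal sketch of checking for an optional dependency before importing the code that needs it follows; the helper name has_module is hypothetical, and the pip package name in the message is an assumption.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the top-level module `name` can be found (hypothetical helper)."""
    return importlib.util.find_spec(name) is not None


if has_module("citrination_client"):
    print("citrination_client is available; the Citrine retrieval import should work.")
else:
    # Assumption: the package is published on PyPI as "citrination-client".
    print("citrination_client is missing; try: pip install citrination-client")
```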
It would be nice if the sklearn Pipeline tutorial had an intro section briefly explaining the concept of a pipeline and why it is useful. I think most people reviewing the examples repo will not be familiar with sklearn pipelines as they are materials scientists. So it would help them to gain some context as to why they want to create a pipeline.
I don't think you need to go overboard. Maybe 1 paragraph or 2, with appropriate links to externally hosted docs as needed.
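As a rough idea of what such an intro could show, here is a minimal sketch of a scikit-learn Pipeline that chains a scaler and a regressor. The synthetic data, the step names, and the choice of StandardScaler plus Ridge are illustrative assumptions, not the contents of the tutorial.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 100 samples, 3 features on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) * np.array([1.0, 100.0, 0.01])
y = X[:, 0] + 0.01 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Each step is a (name, estimator) pair; every step but the last transforms the data.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", Ridge(alpha=1.0)),
])

# fit() fits the scaler on the training rows, transforms them, then fits the model.
pipe.fit(X[:80], y[:80])

# predict() reuses the scaler statistics learned during fit() before predicting.
predictions = pipe.predict(X[80:])
print(predictions.shape)
```

The main benefit is that preprocessing and model travel together as one object, so the same steps are applied consistently at training and prediction time, and the whole chain can be passed to cross-validation utilities without leaking test data into the scaler.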
as requested on matminer help list (somehow not showing up on Google groups)
The bulk modulus notebook takes the absolute value of r^2 when computing the scoring metric. It should not do that: if r^2 is less than zero, the model is worse than predicting the mean, and taking the absolute value hides this.
It probably won't affect this specific notebook, but it is better not to have that code in there. The fix is very simple: remove np.abs() around any r2 computation.
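To illustrate why the absolute value is misleading, here is a small sketch that computes r^2 by hand from its definition, 1 - SS_res / SS_tot. The data and the two toy "models" are made-up values for illustration only.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
mean_model = [2.5, 2.5, 2.5, 2.5]    # always predicts the mean of y_true
biased_model = [2.5, 3.5, 4.5, 5.5]  # every prediction is 1.5 too high

score_mean = r2(y_true, mean_model)      # baseline: zero
score_biased = r2(y_true, biased_model)  # negative: worse than the baseline

print(score_mean)
print(score_biased)
print(abs(score_biased))  # the absolute value looks like a good score
```

Here the biased model scores about -0.8, but its absolute value, about 0.8, would read as a reasonably good fit.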
In the featurize function, the docstring reads:
"""
Get Coulomb matrix of input structure.
Args:
s: input Structure (or Molecule) object.
Returns:
m: (Nsites x Nsites matrix) Coulomb matrix.
"""
This says that the function returns an N x N matrix.
But it actually returns an N-dimensional vector, because it returns the eigenvalues of the Coulomb matrix.
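The shape difference described above can be seen with a small NumPy sketch. It builds a Coulomb matrix from the standard definition (0.5 * Z_i^2.4 on the diagonal, Z_i * Z_j / |R_i - R_j| off the diagonal) and then takes its eigenvalues. The helper name and the toy geometry are illustrative assumptions, units are ignored, and this is not matminer's implementation.

```python
import numpy as np


def coulomb_matrix(atomic_numbers, coords):
    """Build the (N x N) Coulomb matrix for N atoms (illustrative helper)."""
    z = np.asarray(atomic_numbers, dtype=float)
    r = np.asarray(coords, dtype=float)
    n = len(z)
    m = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: 0.5 * Z_i^2.4
                m[i, j] = 0.5 * z[i] ** 2.4
            else:
                # Off-diagonal term: Z_i * Z_j / |R_i - R_j|
                m[i, j] = z[i] * z[j] / np.linalg.norm(r[i] - r[j])
    return m


# Toy water-like geometry; the coordinates are made-up illustrative values.
numbers = [8, 1, 1]
positions = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]

matrix = coulomb_matrix(numbers, positions)
eigenvalues = np.linalg.eigvalsh(matrix)  # symmetric matrix, so N real eigenvalues

print(matrix.shape)       # (3, 3)
print(eigenvalues.shape)  # (3,)
```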
When running the notebooks in Binder (as per the documentation), the lack of API keys means the notebooks cannot be run interactively. Another issue is that matminer is not installed with its optional dependencies, so the MPDS examples fail with another error message.
Possible solutions:
If Binder provides other benefits beyond running the notebooks interactively, then I guess it could make sense to leave the link in the documentation.
I think it would be nice to have these as notebooks. Users would be able to see the figures rendered here on GitHub. Plus, iteratively tweaking figures is much faster in notebooks (at least for me), and I'd like to advocate their use here.
What is your take on porting these examples to notebooks?
Right now you are plotting the errors on the training set with PlotlyFig, which is misleading.
You should generate the cross-validation plot and update the examples to show how to do that.
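A minimal sketch of the difference between training-set and cross-validated predictions, using scikit-learn's cross_val_predict, is shown here. The synthetic data and the DecisionTreeRegressor are assumptions chosen to make the gap obvious; the notebooks use their own datasets and models.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): a noisy linear target.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.5, size=200)

model = DecisionTreeRegressor(random_state=0)

# Training-set predictions: the model has already seen every point it predicts.
train_pred = model.fit(X, y).predict(X)

# Cross-validated predictions: each point is predicted by a model that never saw it.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
cv_pred = cross_val_predict(model, X, y, cv=cv)

train_mae = mean_absolute_error(y, train_pred)
cv_mae = mean_absolute_error(y, cv_pred)
print(f"training MAE: {train_mae:.3f}")
print(f"cross-validation MAE: {cv_mae:.3f}")
```

Plotting y against cv_pred instead of y against train_pred gives a parity plot that reflects how the model behaves on unseen data.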
e.g., if a notebook stops working with the latest version of the code, this will tell a user which version of matminer to roll back to in order to get things working, at least in the short term.
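One lightweight way to record this is to print the installed package versions in the first cell of each notebook. The sketch below uses the standard-library importlib.metadata (Python 3.8+); the helper name and the package list are illustrative assumptions.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or 'not installed'."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


# The package list is an illustrative assumption; adjust it per notebook.
for name, version in installed_versions(["matminer", "pymatgen", "scikit-learn"]).items():
    print(f"{name}: {version}")
```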
I am using an Anaconda environment and the IPython notebook.
When using:
from matminer.featurizers.composition import ElementProperty
I get the following error:
NameError Traceback (most recent call last)
in
----> 1 from matminer.featurizers.composition import ElementProperty
~/anaconda/envs/py3/lib/python3.6/site-packages/matminer/featurizers/composition.py in
16
17 from matminer.featurizers.base import BaseFeaturizer
---> 18 from matminer.featurizers.utils.stats import PropertyStats
19 from matminer.utils.data import DemlData, MagpieData, PymatgenData,
20 CohesiveEnergyData, MixingEnthalpy, MatscholarElementData
~/anaconda/envs/py3/lib/python3.6/site-packages/matminer/featurizers/utils/stats.py in
13
14
---> 15 class PropertyStats(object):
16 """This class contains statistical operations that are commonly employed
17 when computing features.
~/anaconda/envs/py3/lib/python3.6/site-packages/matminer/featurizers/utils/stats.py in PropertyStats()
337 return np.array(data_lst).flatten()
338
--> 339 @staticmethodzoom
340 def quantile(data_lst, weights=None, q=0.5):
341 """
NameError: name 'staticmethodzoom' is not defined
The 'kernel_ridge_scm_ofm' example script produces the following error when trying to load the 'flla' dataset:
REMOVE UNSTABLE ENTRIES: False
USE FABER DATASET: True
USE TERNARY OXIDE DATASET: False
NUMBER OF JOBS: 24
DEBUG MODE: False
Traceback (most recent call last):
File "kernel_ridge_SCM_OFM.py", line 67, in <module>
df = load_dataset("flla")
File "/home/dennis/.local/lib/python3.5/site-packages/matminer/datasets/dataset_retrieval.py", line 63, in load_dataset
df = load_dataframe_from_json(data_path)
File "/home/dennis/.local/lib/python3.5/site-packages/matminer/utils/io.py", line 58, in load_dataframe_from_json
dataframe_data = json.load(f, cls=MontyDecoder)
File "/usr/lib/python3.5/json/__init__.py", line 268, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "/usr/lib/python3.5/json/__init__.py", line 312, in loads
s.__class__.__name__))
TypeError: the JSON object must be str, not 'bytes'
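The last frame suggests that json.loads received bytes. On Python 3.5, json.loads accepts only str; support for bytes input arrived in Python 3.6. A minimal sketch of a workaround, assuming the data is UTF-8 encoded JSON, is to decode before parsing. The record below is a made-up placeholder, not content from the flla dataset, and this is not matminer's own fix.

```python
import json

# Bytes as they would come from a file opened in binary mode ("rb").
# The record is a made-up placeholder.
raw = b'{"formula": "NaCl", "n_atoms": 2}'

# Decoding to str first works on every Python 3 version; Python 3.5's
# json.loads() rejects bytes, while 3.6+ also accepts them directly.
record = json.loads(raw.decode("utf-8"))
print(record["formula"])
```

Running the script on Python 3.6 or newer is another way to avoid this particular error.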
Any help in debugging is appreciated.
might be nice to see how to re-use the same ML pipeline for different data problems
Probably for @Doppe1g4nger
Hi,
I am following the matminer example that uses figrecipes.
But it fails at the first import line:
from matminer.figrecipes.plot import PlotlyFig
ModuleNotFoundError: No module named 'matminer.figrecipes'
OR
from figrecipes import PlotlyFig
ModuleNotFoundError: No module named 'figrecipes'
** matminer (0.7.2)
** plotly (4.14.3)
I want to know how to import matminer into a Jupyter notebook and what it requires to run correctly.
When I run the PRDF class, I encounter this error: "You must run 'fit' first!"
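The message suggests the featurizer has to learn something from the data (for a partial RDF, plausibly which elements are present) before it can featurize. The toy class below is hypothetical and is not matminer's PRDF implementation; it only sketches the fit-then-featurize pattern, with each "formula" represented as a list of element symbols.

```python
class ToyElementCounter:
    """Hypothetical featurizer that must learn the element list before featurizing."""

    def __init__(self):
        self.elements_ = None  # set by fit()

    def fit(self, formulas):
        # Learn the sorted set of elements seen across all samples.
        self.elements_ = sorted({el for formula in formulas for el in formula})
        return self

    def featurize(self, formula):
        if self.elements_ is None:
            raise RuntimeError("You must run 'fit' first!")
        # One count per learned element, so every sample gets the same feature length.
        return [formula.count(el) for el in self.elements_]


data = [["Na", "Cl"], ["Mg", "O"], ["Na", "Na", "O"]]
featurizer = ToyElementCounter()

try:
    featurizer.featurize(data[0])  # fails: nothing has been learned yet
except RuntimeError as err:
    print(err)

featurizer.fit(data)               # learn the element list from the data
features = featurizer.featurize(data[2])
print(featurizer.elements_, features)
```

With the real featurizer, the analogous step would be calling its fit method on the collection of structures before featurizing them.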
Sorry, but I just pip-installed the library, and the import fails.
Could anyone give me some idea about it?
Thanks.
Python 3.7.4 (default, Dec 17 2019, 17:07:17)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from automatminer import MatPipe
/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.metrics.scorer module is deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.metrics. Anything that cannot be imported from sklearn.metrics is now part of the private API.
warnings.warn(message, FutureWarning)
/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.feature_selection.base module is deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.feature_selection. Anything that cannot be imported from sklearn.feature_selection is now part of the private API.
warnings.warn(message, FutureWarning)
/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.neighbors.unsupervised module is deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.neighbors. Anything that cannot be imported from sklearn.neighbors is now part of the private API.
warnings.warn(message, FutureWarning)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/automatminer/__init__.py", line 2, in <module>
from automatminer.featurization import AutoFeaturizer # noqa
File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/automatminer/featurization/__init__.py", line 1, in <module>
from .core import AutoFeaturizer # noqa
File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/automatminer/featurization/core.py", line 27, in <module>
from automatminer.featurization.sets import (
File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/automatminer/featurization/sets.py", line 10, in <module>
import matminer.featurizers.structure as sf
File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/matminer/featurizers/structure.py", line 36, in <module>
from matminer.featurizers.site import OPSiteFingerprint, \
File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/matminer/featurizers/site.py", line 1670, in <module>
class LocalPropertyDifference(BaseFeaturizer):
File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/matminer/featurizers/site.py", line 1695, in LocalPropertyDifference
def __init__(self, data_source=MagpieData(), weight='area',
File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/matminer/utils/data.py", line 215, in __init__
prop_value = float(lines[atomic_no - 1])
IndexError: list index out of range
probably @ardunn
Otherwise I get errors with visualization, e.g., see the computed vs. experimental band gaps notebook:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-21-c406415a642e> in <module>()
13
14 hist_plot["data"][0]['name'] = 'train'
---> 15 hist_plot["data"][1]['name'] = 'test'
16 pf_rf.create_plot(hist_plot)
IndexError: list index out of range
I've been struggling a lot with getting CGCNNFeaturizer to work. The Google group doesn't seem to have much discussion on this, nor does the CGCNN repo have much to say about it other than a clarification about Python bindings. The matminer documentation is also a bit hard to follow, but I think I've understood the main components. Does someone have (or could someone put together) a basic .ipynb showing how to get features from one of the pretrained models (e.g. 'bulk-moduli')?
I think someone just browsing the repo should be able to see the PlotlyFig example outputs.
I'd suggest:
If you want to see an example README with images, just look at the main matminer README
Hi,
I import the module with from matminer.figrecipes.plot import PlotlyFig
and I tried this basic code:
pf = PlotlyFig(title="Basic Example", mode='notebook')
pf.xy(([1, 2, 3], [4, 5, 6]))
But the plot is not shown.
Can you tell me how to show this plot in Google Colab?
I've noticed a couple of typos and other minor bugs in the basic data retrieval notebook. I will keep track of them in this issue and submit a pull request containing fixes when I've run through all the notebook examples.
In data_retrieval_basics.ipynb:
df.to_cv(...) should be df.to_csv(...)
CitrineDataRetrieval. This should be removed. (I'll edit this message if I find anything else)
Hello! How can I get the same format as the 'structure' field in the example datasets? I only have some POSCAR files. Can someone help me?
Best wishes!
File "/Users/jason/opt/anaconda3/lib/python3.9/site-packages/matminer/featurizers/structure/matrix.py", line 292, in __init__
my_ohvs[Z] = self.get_ohv(el, period_tag)
File "/Users/jason/opt/anaconda3/lib/python3.9/site-packages/matminer/featurizers/structure/matrix.py", line 338, in get_ohv
my_ohv = np.zeros(self.size, np.int)
File "/Users/jason/opt/anaconda3/lib/python3.9/site-packages/numpy/__init__.py", line 305, in __getattr__
raise AttributeError(former_attrs[attr])
AttributeError: module 'numpy' has no attribute 'int'.
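This error comes from the np.int alias, which was deprecated in NumPy 1.20 and removed in NumPy 1.24. A minimal sketch of the equivalent call with the builtin int is shown here; the array length and index are illustrative values, not taken from matminer.

```python
import numpy as np

size = 5  # illustrative length

# np.int was only an alias for the builtin int, so passing int as the dtype
# gives the same result and works on NumPy versions old and new.
one_hot = np.zeros(size, dtype=int)
one_hot[2] = 1
print(one_hot, one_hot.dtype)
```

Since the failing line lives inside the installed matminer package, the practical options are likely either installing a NumPy version below 1.24 or upgrading to a matminer release that no longer uses np.int, if one is available for your environment.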