matminer_examples's People

Contributors

albalu, ardunn, blokhin, computron, kylebystrom, montoyjh, spacedome, utf, wardlt

matminer_examples's Issues

PlotlyFig examples - use matminer data sets?

@ardunn @albalu

Should the PlotlyFig examples use the matminer example data sets?

  • They are just as easy to load as sklearn data sets
  • They would be more relevant and interesting to materials scientists than plotting Boston housing prices or similar

Citrine informatics in the matminer_examples not working

I am getting the following error:

Traceback (most recent call last):
  File "MP_Citrine_MDF_MPDS_Masher.py", line 37, in <module>
    from matminer.data_retrieval.retrieve_Citrine import CitrineDataRetrieval
  File "matminer/data_retrieval/retrieve_Citrine.py", line 2, in <module>
    from citrination_client import CitrinationClient, ChemicalFieldQuery,
ModuleNotFoundError: No module named 'citrination_client'

pipeline tutorial could benefit from some background

@spacedome

It would be nice if the sklearn Pipeline tutorial had an intro section briefly explaining the concept of a pipeline and why it is useful. I think most people reviewing the examples repo will not be familiar with sklearn pipelines as they are materials scientists. So it would help them to gain some context as to why they want to create a pipeline.

I don't think you need to go overboard. Maybe 1 paragraph or 2, with appropriate links to externally hosted docs as needed.

Predicting bulk modulus - r^2 values

The bulk modulus notebook takes the absolute value of r^2 when computing the scoring metric. But, it should not do that - if r^2 is less than zero, it is worse than predicting the mean.

Probably it won't affect this specific notebook, but better not to have that code in there. Very simple fix - just remove np.abs() around any r2 computation.
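
To illustrate why the absolute value is misleading, here is a minimal sketch using a hand-written r^2 helper (the function name r2 and the toy data are assumptions made for illustration, not code from the notebook). A deliberately bad prediction gives a negative score, and wrapping it in abs() turns it into a score that looks excellent.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (hypothetical helper)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy data: the predictions are deliberately anti-correlated with the targets.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [4.0, 3.0, 2.0, 1.0]

raw = r2(y_true, y_pred)   # worse than predicting the mean, so negative
wrapped = abs(raw)         # what an np.abs() wrapper would report instead

print(raw)      # -3.0
print(wrapped)  # 3.0
```

Reporting the raw value keeps a failing model visible; the wrapped value would hide it.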

Inconsistency in matminer.featurizers.structure.CoulombMatrix()

In the featurize function, the docstring reads:
"""
Get Coulomb matrix of input structure.

    Args:
        s: input Structure (or Molecule) object.

    Returns:
        m: (Nsites x Nsites matrix) Coulomb matrix.

"""
This says that the function returns an N x N matrix.

In practice, however, it returns an N-dimensional vector, because it returns the eigenvalues of the Coulomb matrix.
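
The shape difference can be sketched with plain NumPy. This is an illustrative reimplementation of the standard Coulomb matrix definition, not matminer's code; the helper name coulomb_matrix and the water-like toy coordinates are assumptions.

```python
import numpy as np


def coulomb_matrix(atomic_numbers, coords):
    """Build an N x N Coulomb matrix (hypothetical helper, not matminer's API).

    Diagonal:     0.5 * Z_i ** 2.4
    Off-diagonal: Z_i * Z_j / |R_i - R_j|
    """
    z = np.asarray(atomic_numbers, dtype=float)
    r = np.asarray(coords, dtype=float)
    n = len(z)
    m = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                m[i, j] = 0.5 * z[i] ** 2.4
            else:
                m[i, j] = z[i] * z[j] / np.linalg.norm(r[i] - r[j])
    return m


# Toy three-atom geometry (values chosen only for illustration).
z = [8, 1, 1]
coords = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]

m = coulomb_matrix(z, coords)
eigvals = np.linalg.eigvalsh(m)  # the matrix is symmetric, so eigvalsh applies

print(m.shape)        # (3, 3)
print(eigvals.shape)  # (3,)
```

The docstring describes the first shape, while the behaviour reported in this issue matches the second.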

Binder examples cannot be run interactively

When running the notebooks in Binder (as per the documentation), the lack of API keys means the notebooks cannot be run interactively. Another issue is that matminer is not installed with its optional dependencies, so the MPDS examples fail with a different error message.

Possible solutions:

  • Set the proper environment variables in Binder. I don't know if this can be done without making the keys public in some way.
  • Remove the link to Binder, as the notebooks can be viewed in the GitHub repository, just not run interactively.

If Binder provides other benefits beyond running the notebooks interactively, then I guess it could make sense to leave the link in the documentation.

Can the figrecipe examples be ported to notebooks?

I think it would be nice to have these as notebooks. Users would be able to see the figures rendered here on GitHub. Plus, iteratively tweaking figures is much faster in notebooks (at least for me), and I'd like to advocate their use here.

What is your take on porting these examples to notebooks?

error: "NameError: name 'staticmethodzoom' is not defined"

I am using the Anaconda environment and the IPython notebook.

When using:
from matminer.featurizers.composition import ElementProperty

I get the following error:

NameError Traceback (most recent call last)
in
----> 1 from matminer.featurizers.composition import ElementProperty

~/anaconda/envs/py3/lib/python3.6/site-packages/matminer/featurizers/composition.py in
16
17 from matminer.featurizers.base import BaseFeaturizer
---> 18 from matminer.featurizers.utils.stats import PropertyStats
19 from matminer.utils.data import DemlData, MagpieData, PymatgenData,
20 CohesiveEnergyData, MixingEnthalpy, MatscholarElementData

~/anaconda/envs/py3/lib/python3.6/site-packages/matminer/featurizers/utils/stats.py in
13
14
---> 15 class PropertyStats(object):
16 """This class contains statistical operations that are commonly employed
17 when computing features.

~/anaconda/envs/py3/lib/python3.6/site-packages/matminer/featurizers/utils/stats.py in PropertyStats()
337 return np.array(data_lst).flatten()
338
--> 339 @staticmethodzoom
340 def quantile(data_lst, weights=None, q=0.5):
341 """

NameError: name 'staticmethodzoom' is not defined

JSON issues with 'kernel_ridge_SCM_OFM' example

The 'kernel_ridge_scm_ofm' example script produces the following error when trying to load the 'flla' dataset:

REMOVE UNSTABLE ENTRIES: False
USE FABER DATASET: True
USE TERNARY OXIDE DATASET: False
NUMBER OF JOBS: 24
DEBUG MODE: False
Traceback (most recent call last):
  File "kernel_ridge_SCM_OFM.py", line 67, in <module>
    df = load_dataset("flla")
  File "/home/dennis/.local/lib/python3.5/site-packages/matminer/datasets/dataset_retrieval.py", line 63, in load_dataset
    df = load_dataframe_from_json(data_path)
  File "/home/dennis/.local/lib/python3.5/site-packages/matminer/utils/io.py", line 58, in load_dataframe_from_json
    dataframe_data = json.load(f, cls=MontyDecoder)
  File "/usr/lib/python3.5/json/__init__.py", line 268, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.5/json/__init__.py", line 312, in loads
    s.__class__.__name__))
TypeError: the JSON object must be str, not 'bytes'

Any help in debugging is appreciated.
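
One possible explanation, offered as an assumption rather than a confirmed diagnosis: in Python 3.5, json.loads only accepts str, so reading a file object that yields bytes triggers exactly this TypeError (bytes input is accepted from Python 3.6 onward). A minimal sketch of a workaround that decodes before parsing follows; the helper name load_json_from_binary and the sample record are hypothetical, and the MontyDecoder used by matminer is left out to keep the example standard-library only.

```python
import io
import json


def load_json_from_binary(fileobj, encoding="utf-8"):
    """Read a file object, decode bytes to str if needed, then parse JSON.

    Hypothetical helper for illustration; not part of matminer.
    """
    raw = fileobj.read()
    if isinstance(raw, bytes):
        raw = raw.decode(encoding)
    return json.loads(raw)


# Stand-in for a dataset file opened in binary mode.
fake_file = io.BytesIO(b'{"formula": "NaCl", "n_sites": 2}')
record = load_json_from_binary(fake_file)

print(record["formula"])  # NaCl
```

Upgrading to Python 3.6 or later would sidestep the problem without code changes, if that is an option.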

No module named 'figrecipes'

Hi,

I am following the matminer figrecipes example, but it fails at the first import line:

from matminer.figrecipes.plot import PlotlyFig
ModuleNotFoundError: No module named 'matminer.figrecipes'

OR

from figrecipes import PlotlyFig
ModuleNotFoundError: No module named 'figrecipes'

** matminer (0.7.2)
** plotly (4.14.3)

from automatminer import MatPipe -> IndexError: list index out of range

Sorry, I just installed the library with pip, and the import fails.
Could anyone give me an idea of what is going wrong?
Thanks.

Python 3.7.4 (default, Dec 17 2019, 17:07:17) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from automatminer import MatPipe
/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.metrics.scorer module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.metrics. Anything that cannot be imported from sklearn.metrics is now part of the private API.
  warnings.warn(message, FutureWarning)
/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.feature_selection.base module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.feature_selection. Anything that cannot be imported from sklearn.feature_selection is now part of the private API.
  warnings.warn(message, FutureWarning)
/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.neighbors.unsupervised module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.neighbors. Anything that cannot be imported from sklearn.neighbors is now part of the private API.
  warnings.warn(message, FutureWarning)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/automatminer/__init__.py", line 2, in <module>
    from automatminer.featurization import AutoFeaturizer  # noqa
  File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/automatminer/featurization/__init__.py", line 1, in <module>
    from .core import AutoFeaturizer  # noqa
  File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/automatminer/featurization/core.py", line 27, in <module>
    from automatminer.featurization.sets import (
  File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/automatminer/featurization/sets.py", line 10, in <module>
    import matminer.featurizers.structure as sf
  File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/matminer/featurizers/structure.py", line 36, in <module>
    from matminer.featurizers.site import OPSiteFingerprint, \
  File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/matminer/featurizers/site.py", line 1670, in <module>
    class LocalPropertyDifference(BaseFeaturizer):
  File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/matminer/featurizers/site.py", line 1695, in LocalPropertyDifference
    def __init__(self, data_source=MagpieData(), weight='area',
  File "/home/inode01/xiaotong/code/mossbauer_preprocess/venv_moss/lib/python3.7/site-packages/matminer/utils/data.py", line 215, in __init__
    prop_value = float(lines[atomic_no - 1])
IndexError: list index out of range

error in bulk_modulus notebook

@albalu
@ardunn


---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-21-c406415a642e> in <module>()
     13 
     14 hist_plot["data"][0]['name'] = 'train'
---> 15 hist_plot["data"][1]['name'] = 'test'
     16 pf_rf.create_plot(hist_plot)

IndexError: list index out of range

CGCNNFeaturizer Example

I've been struggling a lot with getting CGCNNFeaturizer to work. The Google Group doesn't seem to have much discussion on this, and the CGCNN repo says little about it beyond a clarification about Python bindings. The matminer documentation is also a bit hard to follow, but I think I've understood the main components. Does someone have (or could someone put together) a basic .ipynb showing how to get features from one of the pretrained models (e.g. 'bulk-moduli')?

PlotlyFig examples - easy images

I think someone just browsing the repo should be able to see the PlotlyFig example outputs.

I'd suggest:

  • putting an image file (at least one for each plot type) in the repo itself that one can directly see
  • having a README file in the PlotlyFig example directory that links to those figures. Thus, when you navigate to the PlotlyFig examples directory, the README just directly shows all the image outputs. Much nicer than cloning and running all the code.

If you want to see an example README with images, just look at the main matminer README.

using figrecipes in google colab

Hi,

I am using PlotlyFig (from matminer.figrecipes.plot import PlotlyFig) and tried this basic code:

# A simple XY plot
pf = PlotlyFig(title="Basic Example", mode='notebook')

# Inputs are tuples containing a list of x variables and y variables
pf.xy(([1, 2, 3], [4, 5, 6]))

But the plot is not shown.

Can you tell me how to show this plot in Google Colab?

Typo and API key shown in notebook

I've noticed a couple of typos and other minor bugs in the basic data retrieval notebook. I will keep track of them in this issue and submit a pull request containing fixes when I've run through all the notebook examples.

In data_retrieval_basics.ipynb:

  • The order of cell execution is non-linear (so the numbers at the side of the cells are not incremental). This isn't necessarily a problem but may be confusing for people new to notebooks.
  • Under the Materials Project heading, example 2: df.to_cv(...) should be df.to_csv(...)
  • Under Citrine informatics: an API key is given when initializing the CitrineDataRetrieval. This should be removed.
  • Capitalise Globus
  • Empty cell at end of notebook

(I'll edit this message if I find anything else)

The 'Structure' object?

Hello! How can I get the same format as the 'structure' field in the example datasets? I only have some POSCAR files. Can someone help me?
Best wishes!

Issue with newer versions of numpy

  File "/Users/jason/opt/anaconda3/lib/python3.9/site-packages/matminer/featurizers/structure/matrix.py", line 292, in __init__
    my_ohvs[Z] = self.get_ohv(el, period_tag)
  File "/Users/jason/opt/anaconda3/lib/python3.9/site-packages/matminer/featurizers/structure/matrix.py", line 338, in get_ohv
    my_ohv = np.zeros(self.size, np.int)
  File "/Users/jason/opt/anaconda3/lib/python3.9/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(former_attrs[attr])
AttributeError: module 'numpy' has no attribute 'int'.
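
For context, the np.int alias was deprecated in NumPy 1.20 and removed in NumPy 1.24, which is why the call in the traceback fails on recent installs. A minimal sketch of the usual replacement, using the built-in int as the dtype, is shown here; the variable size is a hypothetical stand-in for self.size, and this is not a patch taken from matminer.

```python
import numpy as np

size = 5  # hypothetical stand-in for self.size in the traceback

# The built-in int works as a dtype on both old and new NumPy releases,
# whereas np.int raises AttributeError on NumPy >= 1.24.
my_ohv = np.zeros(size, dtype=int)
my_ohv[2] = 1  # mark one position, as a one-hot vector would

print(my_ohv.dtype.kind)  # i
```

Pinning NumPy below 1.24 or upgrading matminer to a release that no longer uses the alias are the other options users typically have.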
