aced-differentiate / auto_cat

Tools for automated structure generation of catalyst systems

License: MIT License

Python 100.00%

auto_cat's Introduction

AutoCat

AutoCat is a suite of Python tools for sequential learning in materials applications and for automating structure generation for DFT catalysis studies. Documentation for the package can be found here.

Development of this package stems from ACED, as part of the ARPA-E DIFFERENTIATE program.

Installation

There are two options for installation, either via pip or from the repo directly.

`pip` (recommended)

If you plan only to use AutoCat rather than contribute to its development, we recommend installing it with pip inside a virtual environment (e.g. conda). This can be done as follows:

pip install autocat

GitHub (for developers)

Alternatively, if you would like to contribute to the development of this software, AutoCat can be installed from a clone of the GitHub repo. First, clone the repo to your local machine (or wherever you'd like to use AutoCat) using git clone. Once the repo has been cloned, change into the created directory (the one with setup.py) and install AutoCat as an editable package via:

pip install -e .

Contributing

Contributions through issues, feature requests, and pull requests are welcome. Guidelines are provided here.

Acknowledgements

The code presented herein was funded by the Advanced Research Projects Agency-Energy (ARPA-E), U.S. Department of Energy, under Award Number DE-AR0001211 and in part by the National Science Foundation, under Award Number CBET-1554273. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

auto_cat's Issues

Cannot build slabs with high miller index

I was trying to build a bcc211 slab using generate_surface_structures and got the following error:
File "/opt/homebrew/lib/python3.9/site-packages/autocat/surface.py", line 236, in generate_surface_structures
    struct = ase_build_funcs[f"{cs}{facet}"](
KeyError: 'bcc211'
I guess this is because the code uses the ase.build.bccxxx series of functions, which only cover a fixed set of facets, instead of the general ase.build.surface function. I suggest building the surface with ase.build.surface instead.
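One way the suggested fallback could look is sketched below. This is only an illustration of the dispatch logic: parse_facet, build_slab and the two stand-in builders are hypothetical names, not part of AutoCat or ASE, and the facet parser assumes single-digit, non-negative Miller indices.

```python
from typing import Callable, Dict, Tuple

MillerIndex = Tuple[int, int, int]


def parse_facet(facet: str) -> MillerIndex:
    """Turn a three-digit facet string such as "211" into (2, 1, 1)."""
    if len(facet) != 3 or not facet.isdigit():
        raise ValueError(f"Expected a three-digit facet string, got {facet!r}")
    return tuple(int(ch) for ch in facet)


def build_slab(
    crystal_structure: str,
    facet: str,
    dedicated_builders: Dict[str, Callable[[], object]],
    generic_builder: Callable[[str, MillerIndex], object],
):
    """Use a dedicated builder if one exists, else fall back to the generic one."""
    key = f"{crystal_structure}{facet}"
    if key in dedicated_builders:
        return dedicated_builders[key]()
    # No KeyError for unlisted facets: hand the Miller indices to the fallback.
    return generic_builder(crystal_structure, parse_facet(facet))


# Stand-ins for illustration only. In practice these would wrap, for example,
# ase.build.bcc110 and ase.build.surface respectively.
dedicated = {"bcc110": lambda: "dedicated bcc110 slab"}


def generic(cs, hkl):
    return f"generic {cs} slab cut along {hkl}"


print(build_slab("bcc", "110", dedicated, generic))
print(build_slab("bcc", "211", dedicated, generic))
```

Keeping the dedicated builders for the facets they support would preserve existing behaviour, while the fallback handles everything else.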

view_ads_sites with constraints

view_ads_sites fails if the host slab has constraints that prevent the slab from being repeated

Enforce different SA & Host species

SAA functions currently do not enforce that the SA is a different species than the host

e.g. Pt1/Pt should not be allowed

This is especially important if multiple host and SA species are listed
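A minimal sketch of such a check is shown below, assuming hosts and single-atom dopants are given as lists of element symbols. The function name valid_saa_pairs is hypothetical and not part of AutoCat.

```python
from itertools import product
from typing import Iterable, List, Tuple


def valid_saa_pairs(
    host_species: Iterable[str], dopant_species: Iterable[str]
) -> List[Tuple[str, str]]:
    """Return all (host, dopant) pairs, skipping pairs of identical species."""
    return [
        (host, dopant)
        for host, dopant in product(host_species, dopant_species)
        if host != dopant
    ]


pairs = valid_saa_pairs(["Pt", "Cu"], ["Pt", "Ni"])
print(pairs)  # [('Pt', 'Ni'), ('Cu', 'Pt'), ('Cu', 'Ni')]
```

Filtering silently is one option; raising an error when a user explicitly requests a same-species pair would be another.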

Specifying rotations and heights of adsorbates

Rotations and heights for adsorbates currently have to be specified manually, even if no rotations should be applied.

Planning on rewriting such that the inputs for these parameters are dictionaries (currently lists) corresponding to the specified adatoms.

Rotations:

If the dict is empty, apply no rotations to any of the adatoms
If a specific adatom is requested but has no entry in the dict, apply no rotation to that adatom only

Initial Heights:

If the dict is empty, use the default height for all adatoms
If a specific adatom is requested but has no entry in the dict, use a default height for it
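The planned dictionary-based behaviour could be sketched roughly as follows. The function name, the (angle, axis) representation of a rotation, and the DEFAULT_HEIGHT value are all assumptions made for illustration, not AutoCat's actual interface or defaults.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Rotation = Tuple[float, str]  # (angle in degrees, axis); an assumed representation

DEFAULT_HEIGHT = 1.5  # placeholder value in angstroms, not AutoCat's default


def resolve_adsorbate_settings(
    adsorbates: Sequence[str],
    rotations: Optional[Dict[str, List[Rotation]]] = None,
    heights: Optional[Dict[str, float]] = None,
) -> Dict[str, dict]:
    """Fill in per-adsorbate rotations and heights, using defaults when absent."""
    rotations = rotations or {}
    heights = heights or {}
    return {
        ads: {
            # Missing entry (or empty dict) means no rotation is applied.
            "rotations": list(rotations.get(ads, [])),
            # Missing entry (or empty dict) means the default height is used.
            "height": heights.get(ads, DEFAULT_HEIGHT),
        }
        for ads in adsorbates
    }


settings = resolve_adsorbate_settings(
    ["OH", "O"], rotations={"OH": [(45.0, "x")]}, heights={"O": 1.2}
)
print(settings)
```

With this shape, callers who want no rotations and default heights can simply omit both dictionaries.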

Multiple Adsorption for Dissociative Mechanisms

Add remaining NRR intermediate definitions to intermediates.nrr

Automation of Bulk Lattice Parameters

Test issue

@aced-differentiate/dft-automation

Systematize ser/deser for all data structures

Most data structures, especially in the learning module, perform serialization (to JSON) and deserialization (from JSON) using custom, ad hoc logic for each data structure/class. A systematic way to handle ser/deser in general would be to make all data structures in autocat inherit from a base Serializable class that implements generic ser/deser functionality.

Example (basic) Serializable class:

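class Serializable:
    """Base abstract class for a serializable object."""

    def to_dict(self):
        """Convert and return object as dictionary."""
        # Strip leading underscores so private attributes are looked up via
        # their public (property) names, which must match the constructor args.
        keys = {k.lstrip("_") for k in vars(self)}
        return {k: Serializable._to_dict(getattr(self, k)) for k in keys}

    @staticmethod
    def _to_dict(obj):
        """Recursively convert obj into dictionaries and lists, and return it."""
        if isinstance(obj, list):
            return [Serializable._to_dict(i) for i in obj]
        elif isinstance(obj, dict):
            return {k: Serializable._to_dict(v) for k, v in obj.items()}
        elif isinstance(obj, Serializable):
            # Nested Serializable objects expose to_dict, not as_dict.
            return obj.to_dict()
        elif hasattr(obj, "as_dict"):
            return obj.as_dict()
        else:
            return obj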

    @classmethod
    def from_dict(cls, ddict):
        """Construct an object from the input dictionary."""
        return cls(**ddict)

AutoCat data structures then need only inherit from the Serializable class, as follows:

class AutoCatDesignSpace(Serializable):
    ...
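A JSON round trip with this pattern might look like the sketch below. It repeats a compact version of the base class so that the snippet runs on its own. ExampleDesignSpace and its fields are hypothetical and do not reflect the real AutoCatDesignSpace.

```python
import json


class Serializable:
    """Compact version of the base class sketched above."""

    def to_dict(self):
        # Private attributes (_name) are read through their public properties.
        keys = {k.lstrip("_") for k in vars(self)}
        return {k: getattr(self, k) for k in keys}

    @classmethod
    def from_dict(cls, ddict):
        return cls(**ddict)


class ExampleDesignSpace(Serializable):
    """Hypothetical data structure; not AutoCat's actual AutoCatDesignSpace."""

    def __init__(self, formulas, labels):
        self._formulas = list(formulas)
        self._labels = list(labels)

    @property
    def formulas(self):
        return self._formulas

    @property
    def labels(self):
        return self._labels


space = ExampleDesignSpace(["Pt3Ni", "Cu3Au"], [-0.42, 0.13])
payload = json.dumps(space.to_dict())  # serialize to a JSON string
restored = ExampleDesignSpace.from_dict(json.loads(payload))
print(restored.formulas, restored.labels)
```

The pattern relies on the constructor argument names matching the public attribute names, which is the main convention every subclass would have to follow.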

Allow atoms object for adsorption directory functions

gen_rxn_int_sym and gen_rxn_int_pos fail when an atoms object is fed in.

Need to:

Use symbols of atoms object as directory name
Update height, dict, and rotations dictionaries to take in symbols of atoms objects as keys
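One possible way to derive a common key for both input types is sketched here. adsorbate_key is a hypothetical helper, and FakeAtoms is a stand-in so that the snippet runs without ASE installed. With ASE, an ase.Atoms object would be passed instead, since it provides get_chemical_formula().

```python
def adsorbate_key(adsorbate) -> str:
    """Return a string usable as a dictionary key or directory name."""
    if isinstance(adsorbate, str):
        return adsorbate
    # ase.Atoms provides get_chemical_formula(); any object with that
    # method is accepted here.
    return adsorbate.get_chemical_formula()


class FakeAtoms:
    """Stand-in for ase.Atoms, used only to keep this example dependency-free."""

    def __init__(self, formula):
        self._formula = formula

    def get_chemical_formula(self):
        return self._formula


# Strings and atoms-like objects end up keyed the same way.
heights = {adsorbate_key(a): 1.5 for a in ["OH", FakeAtoms("H2N")]}
print(heights)
```

Note that two different geometries with the same formula would collide under this scheme, so a real implementation may need an extra disambiguating label.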

Replace pymatgen dopant generation functionality

The pymatgen doping functionality (generate_substitution_structures, used in auto_cat/src/autocat/saa.py, line 299 at commit 88ae733:

    pmg_substituted_structures = finder.generate_substitution_structures(dopant_element)

) looks half-baked, e.g., comparing coordinates by converting them into a string (https://github.com/materialsproject/pymatgen/blob/c3f139c8cd5aa7d55cc09ce56a6177d634355ae8/pymatgen/analysis/adsorption.py#L585). Better to either fix these issues upstream or implement functionality in-house.
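As a generic illustration of why comparing coordinates as strings is fragile (this is not a reproduction of the pymatgen code), the sketch below contrasts a string comparison with a tolerance-based one. Both helper names and the tolerance value are assumptions.

```python
import math


def same_site_by_string(a, b) -> bool:
    """Compare coordinates via their string representation (fragile)."""
    return str(tuple(a)) == str(tuple(b))


def same_site_by_tolerance(a, b, tol: float = 1e-8) -> bool:
    """Compare coordinates component-wise within an absolute tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b)
    )


site_a = (0.1 + 0.2, 0.0, 0.5)  # first coordinate is 0.30000000000000004
site_b = (0.3, 0.0, 0.5)

print(same_site_by_string(site_a, site_b))     # False
print(same_site_by_tolerance(site_a, site_b))  # True
```

An in-house implementation would likely also need to account for periodic boundary conditions, which this sketch ignores.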

Bidentate Adsorption

Documentation

Unittests

Need to be written for:

- bulk functions
- saa functions
- adsorption functions

Make unittests more modular

At present some of the tests for a given submodule rely on another submodule (e.g. tests in adsorption use functions from surface)

This can be avoided by generating the structure files beforehand and including those with the tests.

Guessing of BV lattice based on host species

Potentially use ASE defaults to make guesses.

Update to allow manual specification via input dictionary

Incorporating adsorption structures into SL framework

At present the AutoCatDesignSpace (and by extension AutoCatSequentialLearner) has a 1:1 correspondence between a single structure and corresponding label. So currently only the clean structures are featurized to learn the binding energies that are provided via labels.

Moving forward there are a few potential options that could generalize this:

Extend how structures are provided and stored so that the adsorbed structures are included, instead of directly supplying only a list of ase.Atoms objects
e.g. [{'substrate': ase.Atoms, 'adsorbed': OUTPUT_DICT_FROM_GENERATE_RXN_STRUCTURES}, {…}]
or [{'substrate': ase.Atoms, 'adsorbed': ase.Atoms}, {…}]
If the labels are going to be adsorption energies, they can be calculated internally from the ase.Atoms objects for the adsorbed and clean structures (together with user-specified reference states) rather than supplied separately as an np.ndarray. The result could then be placed within the corresponding adsorbed dictionary and pulled as needed
The downside of the above point is that it arguably sacrifices generalizability if the label is going to be something other than adsorption energies, e.g. d-band centers (unless there is a clean way to allow both approaches to coexist...)
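To make the internal label calculation concrete, here is a minimal sketch assuming the common convention E_ads = E(adsorbed) - E(clean) - E(reference). The entry layout stores plain energies instead of ase.Atoms objects to keep the example dependency-free. The key names and all numbers are made up for illustration.

```python
def adsorption_energy(e_adsorbed: float, e_clean: float, e_reference: float) -> float:
    """Adsorption energy relative to the clean slab and a reference state."""
    return e_adsorbed - e_clean - e_reference


# Hypothetical entries; in the proposal above these would hold ase.Atoms
# objects, from which the total energies would be read.
entries = [
    {"name": "Pt-OH", "e_clean": -100.0, "e_adsorbed": -110.5, "e_reference": -10.0},
    {"name": "Cu-OH", "e_clean": -80.0, "e_adsorbed": -90.25, "e_reference": -10.0},
]

labels = [
    adsorption_energy(e["e_adsorbed"], e["e_clean"], e["e_reference"])
    for e in entries
]
print(labels)  # [-0.5, -0.25]
```

Keeping the label calculation in a separate function like this would leave room for other label types, such as d-band centers, to be plugged in through the same interface.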