Synthesizability-stoi-CGNF

Synthesizability-stoi-CGNF is Python code for predicting the synthesizability score, a quantitative synthesizability metric for inorganic crystal compositions. It implements a partially supervised machine learning protocol (PU learning, i.e. positive-unlabeled learning) using the CGNF (Composition Graph Neural Fingerprint) atomic embedding method, developed by Prof. Yousung Jung's group (contact: [email protected]).

Developers

Jidon Jang, Juhwan Noh

Prerequisites

Python 3
NumPy
PyTorch
pymatgen

Publication

Jidon Jang, Juhwan Noh, Lan Zhou, Geun Ho Gu, John M. Gregoire, and Yousung Jung, "Synthesizability of materials stoichiometry using semi-supervised learning", Matter, 2024, 7(6), 2294-2312 (DOI: 10.1016/j.matt.2024.05.002)

Usage

[1] Define a customized data format and prepare atomic embedding vector file for generation of CGNF

To feed inorganic compositions to Synthesizability-stoi-CGNF, you need to define a customized dataset and pre-generate CGNF as pickle files for bootstrap aggregating in semi-supervised learning. This is required for both training and prediction. The following files are needed to generate CGNF.

id_prop.csv: a CSV file with two columns covering positive data (synthesizable) and unlabeled data (not yet synthesized). The first column records an inorganic composition (the formula string format of the Composition class in the pymatgen package is recommended), and the second column records the label (1 = positive, 0 = unlabeled) according to whether the composition has already been synthesized.

cgcnn_hd_rcut4_nn8.element_embedding.json: a JSON file containing atomic embedding vectors for generation of CGNF
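As an illustration of the two-column layout described above, here is a minimal sketch that writes such a file with the Python standard library. The example compositions, the file name 'id_prop_example.csv', the helper name write_id_prop, and the absence of a header row are all assumptions made for this sketch; check them against the data loader in this repository before use.

```python
import csv

# Hypothetical example entries: (composition formula, label).
# 1 = positive (already synthesized), 0 = unlabeled (not yet synthesized).
example_rows = [
    ("NaCl", 1),
    ("Fe2O3", 1),
    ("Li3Fe2O7", 0),
]


def write_id_prop(path, rows):
    """Write (formula, label) pairs as a two-column CSV.

    Assumption: no header row is written.
    """
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for formula, label in rows:
            if label not in (0, 1):
                raise ValueError(f"label must be 0 or 1, got {label!r}")
            writer.writerow([formula, label])


# Written under a separate name so an existing id_prop.csv is not overwritten.
write_id_prop("id_prop_example.csv", example_rows)
```

Formula strings can be normalized with the Composition class in pymatgen before writing, as recommended above.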

[2] Train a Synthesizability-stoi-CGNF model

python main_PU_learning.py --bag 100 --data id_prop.csv --embedding cgcnn_hd_rcut4_nn8.element_embedding.json --split ./split

This loads composition information from 'id_prop.csv' and generates the data split files for PU learning in the 'split' folder.
After training, prediction results for the test-unlabeled data are written as a CSV file for each iteration.
The result of bootstrap aggregating is saved as 'test_results_ensemble_100models.csv'.
You can change the number of bootstrap samples with the '--bag' option.
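The sketch below shows one common way to build the per-bag splits in PU learning with bootstrap aggregating. Each bag draws a random subset of the unlabeled data to act as pseudo-negatives and holds out the rest as test-unlabeled data. It is not taken from main_PU_learning.py: the function name make_pu_bags, the equal-size sampling rule, and the seed handling are assumptions for illustration only.

```python
import random


def make_pu_bags(positives, unlabeled, n_bags, seed=0):
    """Return one (train_pos, pseudo_neg, test_unlabeled) split per bag.

    Assumption: each bag samples as many pseudo-negatives from the
    unlabeled pool as there are positives (capped at the pool size).
    """
    rng = random.Random(seed)
    k = min(len(positives), len(unlabeled))
    bags = []
    for _ in range(n_bags):
        # Indices of unlabeled entries treated as negatives in this bag.
        neg_idx = set(rng.sample(range(len(unlabeled)), k))
        pseudo_neg = [u for i, u in enumerate(unlabeled) if i in neg_idx]
        # Remaining unlabeled entries are scored by this bag's model.
        held_out = [u for i, u in enumerate(unlabeled) if i not in neg_idx]
        bags.append((list(positives), pseudo_neg, held_out))
    return bags


# Hypothetical compositions, used only to show the shape of the output.
bags = make_pu_bags(["NaCl", "Fe2O3"], ["A2B", "AB3", "A2BC", "ABC2", "A3B"], n_bags=3)
for train_pos, pseudo_neg, held_out in bags:
    print(len(train_pos), len(pseudo_neg), len(held_out))
```

With many bags (for example '--bag 100'), each unlabeled composition is held out in many of them, so its scores can be averaged over the models that did not train on it.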

[3] Predict synthesizability of new compositions with pre-trained models

python predict_PU_learning.py --bag 100 --data id_prop_test.csv --embedding cgcnn_hd_rcut4_nn8.element_embedding.json --modeldir ./models

This loads composition information for the test materials from 'id_prop_test.csv' and the pre-trained models from the 'models' folder.
It then predicts the synthesizability of each composition in 'id_prop_test.csv' using the loaded models.
The result of bootstrap aggregating is saved as 'test_results_ensemble_100models.csv'.
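Bootstrap aggregating combines the per-model predictions into a single score per composition. A minimal sketch of that step, assuming the ensemble score is a plain average over models, is given below. The function name aggregate_scores and the dictionary layout are hypothetical, and the example numbers are made up for illustration; the exact aggregation in predict_PU_learning.py may differ.

```python
from collections import defaultdict


def aggregate_scores(per_model_scores):
    """Average prediction scores per composition across models.

    per_model_scores: list of {composition: score} dicts, one per model.
    A composition missing from a model's dict is skipped for that model.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for scores in per_model_scores:
        for composition, score in scores.items():
            totals[composition] += score
            counts[composition] += 1
    return {c: totals[c] / counts[c] for c in totals}


# Made-up scores from three hypothetical models.
ensemble = aggregate_scores([
    {"NaCl": 0.9, "A2BC": 0.2},
    {"NaCl": 0.7, "A2BC": 0.4},
    {"NaCl": 0.8},
])
print(ensemble)
```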
