
PS-VAE: Molecule Generation by Principal Subgraph Mining and Assembling

WARNING: This is a legacy repo and is no longer updated. Please refer to https://github.com/THUNLP-MT/PS-VAE for future updates.

This repo contains the code for our paper Molecule Generation by Principal Subgraph Mining and Assembling, accepted at NeurIPS 2022. We have also written a blog post that explains the paper in more detail.

If you use the code, please cite the following paper:

@article{kong2021molecule,
    title={Molecule Generation by Principal Subgraph Mining and Assembling},
    author={Kong, Xiangzhe and Huang, Wenbing and Tan, Zhixing and Liu, Yang},
    journal={Advances in neural information processing systems},
    year={2022}
}

Introduction

Our proposed method has two parts: principal subgraph extraction and VAE-based two-step subgraph generation and assembly.

The concept of a principal subgraph (PS) and its extraction algorithm make it possible to efficiently mine frequent subgraphs that reflect the atom combination patterns of a given dataset. As validated in our paper, a PS-based vocabulary significantly improves subgraph-level molecule generation methods compared to hand-crafted or rule-based vocabularies, because these subgraphs capture not only the general patterns in molecules but also their correlations with molecular properties.
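To make the idea of mining frequent subgraphs concrete, here is a minimal toy sketch. It assumes a bottom-up, merge-based procedure in the spirit of byte-pair encoding: every atom starts as its own fragment, and the most frequent merge of two neighboring fragments is repeatedly added to the vocabulary. All names (`mine_vocabulary`, `fragment_key`, `are_adjacent`) are hypothetical, and the fragment identifier is a crude sorted string of element symbols rather than a real canonical form. This is not the code in `ps` or `src`; please refer to those directories for the actual algorithm.

```python
from collections import Counter
from itertools import combinations


def fragment_key(atoms, fragment):
    # ASSUMPTION: a crude stand-in for a canonical subgraph identifier.
    # It ignores bond structure and only sorts the element symbols.
    return "".join(sorted(atoms[i] for i in fragment))


def are_adjacent(frag_a, frag_b, bonds):
    # Two fragments are neighbors if at least one bond crosses between them.
    return any(
        (i in frag_a and j in frag_b) or (i in frag_b and j in frag_a)
        for i, j in bonds
    )


def mine_vocabulary(molecules, num_merges):
    """molecules: list of (atoms, bonds), with atoms a list of element
    symbols and bonds a list of (i, j) atom-index pairs."""
    # Start with every atom as its own fragment.
    states = [[frozenset([i]) for i in range(len(atoms))] for atoms, _ in molecules]
    vocab = sorted({a for atoms, _ in molecules for a in atoms})

    for _ in range(num_merges):
        # Count every merge of two neighboring fragments across the dataset.
        counts = Counter()
        for (atoms, bonds), frags in zip(molecules, states):
            for a, b in combinations(frags, 2):
                if are_adjacent(a, b, bonds):
                    counts[fragment_key(atoms, a | b)] += 1
        if not counts:
            break
        # Pick the most frequent merged fragment (ties broken by key).
        best = max(counts, key=lambda k: (counts[k], k))
        vocab.append(best)

        # Apply that merge greedily in every molecule.
        for idx, ((atoms, bonds), frags) in enumerate(zip(molecules, states)):
            merged = True
            while merged:
                merged = False
                for a, b in combinations(frags, 2):
                    if are_adjacent(a, b, bonds) and fragment_key(atoms, a | b) == best:
                        frags = [f for f in frags if f not in (a, b)] + [a | b]
                        merged = True
                        break
            states[idx] = frags
    return vocab, states


# Toy dataset: a C-C-O chain and a C-C-C chain.
toy_molecules = [
    (["C", "C", "O"], [(0, 1), (1, 2)]),
    (["C", "C", "C"], [(0, 1), (1, 2)]),
]
toy_vocab, toy_states = mine_vocabulary(toy_molecules, num_merges=1)
```

In this toy run the C-C pair is the most frequent neighboring merge, so it becomes the first multi-atom vocabulary entry, and each molecule is decomposed into one two-carbon fragment plus one leftover atom.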

The two-step generation framework first generates subgraphs sequentially, treating each type of subgraph as a discrete token in the vocabulary. We then take the union of the generated subgraphs as a disconnected molecular graph for message passing and globally predict the connections between these subgraphs.
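The data layout for the second step can be sketched as follows. This is only an illustration under stated assumptions: the function names (`union_subgraphs`, `candidate_links`) are hypothetical, subgraphs are represented as plain lists of atoms and bonds, and the learned model that scores each candidate connection is left out entirely.

```python
def union_subgraphs(subgraphs):
    """Combine generated subgraphs into one disconnected graph.

    subgraphs: list of (atoms, bonds), with atoms a list of element
    symbols and bonds a list of (i, j) pairs local to that subgraph.
    """
    atoms, bonds, owner = [], [], []
    for sg_id, (sg_atoms, sg_bonds) in enumerate(subgraphs):
        offset = len(atoms)  # shift local atom indices into the global graph
        atoms.extend(sg_atoms)
        owner.extend([sg_id] * len(sg_atoms))  # remember which subgraph each atom came from
        bonds.extend((i + offset, j + offset) for i, j in sg_bonds)
    return atoms, bonds, owner


def candidate_links(owner):
    # ASSUMPTION: every pair of atoms from different subgraphs is a candidate
    # connection; a trained model would decide which candidates become bonds.
    n = len(owner)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if owner[i] != owner[j]]


# Two generated subgraphs: a C-C fragment and a single O atom.
generated = [(["C", "C"], [(0, 1)]), (["O"], [])]
atoms, bonds, owner = union_subgraphs(generated)
links = candidate_links(owner)
```

Here the union has three atoms and one intra-subgraph bond, and the two candidate links are the pairs joining each carbon to the oxygen.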

Below is the overall diagram of our method:

Codes

End-to-end Framework

The directory src contains the complete code for the principal subgraph extraction algorithm, our principal subgraph variational autoencoder (PS-VAE), and the checkpoints and data used in our experiments. If you are interested in training a PS-VAE on your own dataset or running experiments with PS-VAE, please refer to the instructions in that directory.

Principal Subgraph Extraction

We also provide a polished version of the principal subgraph extraction algorithm, decoupled from the rest of the code, in the directory ps. We recommend it if you are only interested in the extracted principal subgraphs and the subgraph-level decomposition of molecules. Please refer to that directory for detailed instructions.

Examples

Here are examples of extracted principal subgraphs on the ZINC250K dataset:

We provide examples of molecules generated by our PS-VAE as well:

Contact

Thank you for your interest in our work!

Please feel free to ask any questions about the algorithms or the code, or to report problems encountered in running them, so that we can make them clearer and better. You can either create an issue in the GitHub repo or contact us at [email protected].
