zpeng1989 / essay_for_molecular_generation

This project forked from keithtab/essay_for_molecular_generation

This repo collects the newest papers on molecular generation and evolution, and aims to make information about drug discovery easy for everyone to find.

License: Creative Commons Attribution 4.0 International

Shell 100.00%

essay_for_molecular_generation's Introduction

Essay list about Molecular Generation or Drug Discovery

Menu

Survey

  • [Elsevier 2022] Deep learning approaches for de novo drug design: An overview [Paper]

0.Basic datasets for Drug Discovery

0.1 Molecular Based Structure

0.1.1 Structure Database

  1. ChEMBL Datasets
  2. PubChem
  3. PDBbind
  4. Cortellis Drug Discovery Intelligence
  5. ZINC15 database
  6. DrugBank
  7. GDB-13
  8. ANI-1

Enumeration of 166 Billion Organic Small Molecules in the Chemical Universe Database GDB-17
Lars Ruddigkeit, Ruud van Deursen, Lorenz C. Blum, Jean-Louis Reymond
Journal of Chemical Information and Modeling, November 2012, https://doi.org/f4d9mt
DOI: 10.1021/ci300415d · PMID: 23088335

The PDBbind Database:  Methodologies and Updates
Renxiao Wang, Xueliang Fang, Yipin Lu, Chao-Yie Yang, Shaomeng Wang
Journal of Medicinal Chemistry, May 2005, https://doi.org/djbvfc
DOI: 10.1021/jm048957q · PMID: 15943484

970 Million Druglike Small Molecules for Virtual Screening in the Chemical Universe Database GDB-13
Lorenz C. Blum, Jean-Louis Reymond
Journal of the American Chemical Society, June 2009, https://doi.org/dwxj84
DOI: 10.1021/ja902302h · PMID: 19505099

ZINC 15 – Ligand Discovery for Everyone
Teague Sterling, John J. Irwin
Journal of Chemical Information and Modeling, November 2015, https://doi.org/gf4zg2
DOI: 10.1021/acs.jcim.5b00559 · PMID: 26479676 · PMCID: PMC4658288

ANI-1: A data set of 20M off-equilibrium DFT calculations for organic molecules
Justin S. Smith, Olexandr Isayev, Adrian E. Roitberg
arXiv, January 2018, https://arxiv.org/abs/1708.04987
DOI: 10.1038/sdata.2017.193 || code

“The ANI-1 potential was shown to be chemically accurate for systems of 50 atoms and more, demonstrating extensibility and transferability to much larger molecules than those in the training set.” “This phenomenon, whereby an ML model is trained on small systems (which could be thought of as fragments of large systems), then demonstrated to be extensible to large systems has also been confirmed in other recent studies.”

0.2 Protein Based Structure

0.2.1 Protein Structure Datasets

SidechainNet: An All-Atom Protein Structure Dataset for Machine Learning
Jonathan E. King, David Ryan Koes
arxiv || github::sidechainnet

TDC maintains a resource list that currently contains 22 tasks (and their datasets) related to small molecules and macromolecules, including PPI, DDI and so on. MoleculeNet published a small-molecule benchmark four years ago.

In terms of datasets and benchmarks, protein design is far less mature than drug discovery (see the Papers with Code drug discovery benchmarks). An evaluation of deep learning methods for protein design, especially deep generative models, should perhaps be added here.
Difficulties and opportunities always coexist. I am happy to see, and grateful for, the work of Christian Dallago, Jody Mou, Kadina E. Johnston, Bruce J. Wittmann, Nicholas Bhattacharya, Samuel Goldman, Ali Madani, Kevin K. Yang, and of Zhangyang Gao, Cheng Tan, Stan Z. Li.

0.3 Molecular Render Tools

0.3.1 PyMOL

If you are new to PyMOL, I recommend visiting the PymolGallery website; it is a fantastic introduction!

0.3.2 ChimeraX

0.3.3 VMD

0.3.4 Blender

0.3.5 Protein Imager

The Protein Imager: a full-featured online molecular viewer interface with server-side HQ-rendering capabilities
Gianluca Tomasello, Ilaria Armenia, Gianluca Molla
Bioinformatics, January 2020, https://doi.org/gqhbf2
DOI: 10.1093/bioinformatics/btaa009 · PMID: 31930403

0.4 Benchmark Datasets for Evaluation of Molecular Generation Models

0.4.1 Basic suite

Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models
Daniil Polykovskiy, Alexander Zhebrak, Benjamin Sanchez-Lengeling, Sergey Golovanov, Oktai Tatanov, Stanislav Belyaev, Rauf Kurbanov, Aleksey Artamonov, Vladimir Aladinskiy, Mark Veselov, … Alex Zhavoronkov
arXiv, October 2020, https://arxiv.org/abs/1811.12823 || pdf || code

GuacaMol: Benchmarking Models for de Novo Molecular Design
Nathan Brown, Marco Fiscato, Marwin H. S. Segler, Alain C. Vaucher
Journal of Chemical Information and Modeling, March 2019, https://doi.org/ggpn3x
DOI: 10.1021/acs.jcim.8b00839 · PMID: 30887799 || pdf || code

1.Repository Introduction

Inspired by YanZhe Zhang's papers_for_protein_design_using_DL, I am organizing recent deep learning papers on drug discovery, especially molecular generation, and this repo will be updated continuously. The list is built with Manubot. If you know of relevant literature, you are very welcome to open an issue with its doi/url/arxiv/PMID. Alternatively, you can contribute by creating or editing a file in the content directory, for example:

## Manubot example documentation and introduction link
url:https://greenelab.github.io/meta-review/ 
doi:10.1098/rsif.2017.0387 
url:https://github.com/manubot/manubot/ 
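
To make the expected shape of these reference lines concrete, here is a minimal Python sketch that splits each line into a prefix and an identifier. It is not part of Manubot or of this repo's scripts: `parse_reference_line` and `KNOWN_PREFIXES` are hypothetical names, and the set of accepted prefixes is an assumption taken from the kinds mentioned above (doi/url/arxiv/PMID).

```python
# Hypothetical helper, not part of Manubot: split "prefix:identifier" lines.
# Assumption: only the identifier kinds named in this README are accepted.
KNOWN_PREFIXES = {"doi", "url", "arxiv", "pmid"}


def parse_reference_line(line):
    """Return (prefix, identifier) for a line such as 'doi:10.1098/rsif.2017.0387'.

    Returns None for blank lines and markdown headings (lines starting with '#').
    Raises ValueError when the prefix is missing, unknown, or has no identifier.
    """
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    # Split on the first colon only, so URLs keep their own "https:" colon.
    prefix, sep, identifier = text.partition(":")
    prefix = prefix.lower()
    identifier = identifier.strip()
    if not sep or prefix not in KNOWN_PREFIXES or not identifier:
        raise ValueError(f"unrecognised reference line: {line!r}")
    return prefix, identifier


example = """\
## Manubot example documentation and introduction link
url:https://greenelab.github.io/meta-review/
doi:10.1098/rsif.2017.0387
url:https://github.com/manubot/manubot/
"""

parsed = [p for p in map(parse_reference_line, example.splitlines()) if p is not None]
for prefix, identifier in parsed:
    print(prefix, identifier)
```

A check like this could be run over a file in the content directory before opening a pull request, to catch lines that are missing a prefix.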

In this repository, README.md is created via continuous integration and should not be edited directly. Edit README-BASE.md to update this text. Update the reference lists in the content directory to add new sections or references. This is only a proof of concept that is not robust against errors in the scripts or merge conflicts.

The deploy.sh and environment.yml files were derived from https://github.com/manubot/rootstock (CC0 1.0 license).

In addition, this workflow only runs on issues with the label reference. If you are still unsure about the markdown format for references, please see #7 for an example :)

2.Environment Setup (Ubuntu-22.04)

sudo apt install pandoc-citeproc pandoc build-essential
pip install --upgrade git+https://github.com/manubot/manubot@$COMMIT 
pip install panflute==1.12.5

3.Molecular_Generation(Overview)

GitHub - admislf/MINN-DTI: Effective drug-target interaction prediction with mutual interaction neural network
GitHub
https://github.com/admislf/MINN-DTI

An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming
Minkai Xu, Wujie Wang, Shitong Luo, Chence Shi, Yoshua Bengio, Rafael Gomez-Bombarelli, Jian Tang
arXiv, June 2021, https://arxiv.org/abs/2105.07246

Learning Gradient Fields for Molecular Conformation Generation
Chence Shi, Shitong Luo, Minkai Xu, Jian Tang
arXiv, June 2021, https://arxiv.org/abs/2105.03902

GeoDiff: a Geometric Diffusion Model for Molecular Conformation Generation
Minkai Xu, Lantao Yu, Yang Song, Chence Shi, Stefano Ermon, Jian Tang
arXiv, March 2022, https://arxiv.org/abs/2203.02923

Deep Evolutionary Learning for Molecular Design
Karl Grantham, Muhetaer Mukaidaisi, Hsu Kiang Ooi, Mohammad Sajjad Ghaemi, Alain Tchagang, Yifeng Li
IEEE Computational Intelligence Magazine, May 2022, https://doi.org/gqdbrc
DOI: 10.1109/mci.2022.3155308

MGCVAE: Multi-Objective Inverse Design via Molecular Graph Conditional Variational Autoencoder
Myeonghun Lee, Kyoungmin Min
Journal of Chemical Information and Modeling, June 2022, https://doi.org/gqhf8q
DOI: 10.1021/acs.jcim.2c00487 · PMID: 35666276

3D Infomax improves GNNs for Molecular Property Prediction
Hannes Stärk, Dominique Beaini, Gabriele Corso, Prudencio Tossou, Christian Dallago, Stephan Günnemann, Pietro Liò
arXiv, June 2022, https://arxiv.org/abs/2110.04126

Pre-training Molecular Graph Representation with 3D Geometry
Shengchao Liu, Hanchen Wang, Weiyang Liu, Joan Lasenby, Hongyu Guo, Jian Tang
arXiv, May 2022, https://arxiv.org/abs/2110.07728 || pdf

EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction
Hannes Stärk, Octavian-Eugen Ganea, Lagnajit Pattanaik, Regina Barzilay, Tommi Jaakkola
arXiv, June 2022, https://arxiv.org/abs/2202.05146 || pdf

Spherical Message Passing for 3D Graph Networks
Yi Liu, Limei Wang, Meng Liu, Xuan Zhang, Bora Oztekin, Shuiwang Ji
arXiv, May 2022, https://arxiv.org/abs/2102.05013 || pdf

A Deep Generative Model for Molecule Optimization via One Fragment Modification
Ziqi Chen, Martin Renqiang Min, Srinivasan Parthasarathy, Xia Ning
arXiv, January 2022, https://arxiv.org/abs/2012.04231
DOI: 10.1038/s42256-021-00410-2

GF-VAE
Changsheng Ma, Xiangliang Zhang
Proceedings of the 30th ACM International Conference on Information & Knowledge Management, October 2021, https://doi.org/gp2883
DOI: 10.1145/3459637.3482260 || code

Geometry-Based Molecular Generation With Deep Constrained Variational Autoencoder
Chunyan Li, Junfeng Yao, Wei Wei, Zhangming Niu, Xiangxiang Zeng, Jin Li, Jianmin Wang
IEEE Transactions on Neural Networks and Learning Systems, 2022, https://doi.org/gpjb8f
DOI: 10.1109/tnnls.2022.3147790 · PMID: 35171779 || code

Molecular visual representation based on 3D spatial structure: Referring to the extensive application of CNNs in computer vision, we proposed a representation method that encodes molecular spatial structure into pictures, that is, converting molecular spatial coordinates into RGB attributes of pictures and using a CNN for feature extraction. The extracted features are then fed into the VAE model.
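
As a rough illustration of what "converting molecular spatial coordinates into RGB attributes" could mean, the sketch below maps each atom's (x, y, z) to one (R, G, B) pixel. This is only one possible reading and is not taken from the paper: the per-axis min-max scaling, the row-by-row pixel layout, the function names, and the example coordinates are all assumptions made for illustration.

```python
# Illustrative sketch only; the paper's actual encoding may differ.

def coords_to_rgb_pixels(coords):
    """Map each atom's (x, y, z) to an (R, G, B) triple with values in 0..255.

    Each axis is min-max scaled independently over the molecule, so the
    result does not change under translation but does change under rotation.
    """
    if not coords:
        return []
    lows = [min(c[i] for c in coords) for i in range(3)]
    highs = [max(c[i] for c in coords) for i in range(3)]
    pixels = []
    for atom in coords:
        pixel = []
        for value, lo, hi in zip(atom, lows, highs):
            span = hi - lo
            # A flat axis (all atoms share one value) maps to 0.
            pixel.append(0 if span == 0 else round(255 * (value - lo) / span))
        pixels.append(tuple(pixel))
    return pixels


def to_image(pixels, width):
    """Arrange pixels row by row into a width-wide grid, padding with black."""
    padded = pixels + [(0, 0, 0)] * (-len(pixels) % width)
    return [padded[i:i + width] for i in range(0, len(padded), width)]


# Hypothetical coordinates (in angstroms) for a three-atom molecule.
water = [(0.000, 0.000, 0.117), (0.000, 0.757, -0.469), (0.000, -0.757, -0.469)]
pixels = coords_to_rgb_pixels(water)
image = to_image(pixels, width=2)
```

The resulting grid of RGB triples is the kind of input a CNN feature extractor would accept; a practical encoding would also need to carry atom types and handle rotation, which this sketch leaves out.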

DeePKS+ABACUS as a Bridge between Expensive Quantum Mechanical Models and Machine Learning Potentials
Wenfei Li, Qi Ou, Yixiao Chen, Yu Cao, Renxi Liu, Chunyi Zhang, Daye Zheng, Chun Cai, Xifan Wu, Han Wang, … Linfeng Zhang
arXiv, June 2022, https://arxiv.org/abs/2206.10093

The Pre-main Sequence: Challenges and Prospects for Asteroseismology
Konstanze Zwintz, Thomas Steindl
arXiv, June 2022, https://arxiv.org/abs/2206.09171
DOI: 10.3389/fspas.2022.914738

LIMO: Latent Inceptionism for Targeted Molecule Generation
Peter Eckmann, Kunyang Sun, Bo Zhao, Mudong Feng, Michael K. Gilson, Rose Yu
arXiv, June 2022, https://arxiv.org/abs/2206.09010 || code || pdf

Attention-wise masked graph contrastive learning for predicting molecular property
Hui Liu, Yibiao Huang, Xuejun Liu, Lei Deng
arXiv, June 2022, https://arxiv.org/abs/2206.08262

Exploring Chemical Space with Score-based Out-of-distribution Generation
Seul Lee, Jaehyeong Jo, Sung Ju Hwang
arXiv, June 2022, https://arxiv.org/abs/2206.07632

A 3D Molecule Generative Model for Structure-Based Drug Design
Shitong Luo, Jiaqi Guan, Jianzhu Ma, Jian Peng
arXiv, March 2022, https://arxiv.org/abs/2203.10446 || pdf

Molecular Optimization by Capturing Chemist’s Intuition Using Deep Neural Networks
November 2020, https://doi.org/gqgzp7
DOI: 10.21203/rs.3.rs-101137/v1

GraphNVP: An Invertible Flow Model for Generating Molecular Graphs
Kaushalya Madhawa, Katsuhiko Ishiguro, Kosuke Nakago, Motoki Abe
arXiv, May 2019, https://arxiv.org/abs/1905.11600

Graph Residual Flow for Molecular Graph Generation
Shion Honda, Hirotaka Akita, Katsuhiko Ishiguro, Toshiki Nakanishi, Kenta Oono
arXiv, October 2019, https://arxiv.org/abs/1909.13521

Junction Tree Variational Autoencoder for Molecular Graph Generation
Wengong Jin, Regina Barzilay, Tommi Jaakkola
arXiv, April 2019, https://arxiv.org/abs/1802.04364

Grammar Variational Autoencoder
Matt J. Kusner, Brooks Paige, José Miguel Hernández-Lobato
arXiv, March 2017, https://arxiv.org/abs/1703.01925

Syntax-Directed Variational Autoencoder for Structured Data
Hanjun Dai, Yingtao Tian, Bo Dai, Steven Skiena, Le Song
arXiv, February 2018, https://arxiv.org/abs/1802.08786

GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders
Martin Simonovsky, Nikos Komodakis
arXiv, February 2018, https://arxiv.org/abs/1802.03480

Scaffold-constrained molecular generation
Maxime Langevin, Herve Minoux, Maximilien Levesque, Marc Bianciotto
arXiv, January 2021, https://arxiv.org/abs/2009.07778
DOI: 10.1021/acs.jcim.0c01015

MolGPT: Molecular Generation Using a Transformer-Decoder Model
Viraj Bagal, Rishal Aggarwal, P. K. Vinod, U. Deva Priyakumar
Journal of Chemical Information and Modeling, October 2021, https://doi.org/gnw9m7
DOI: 10.1021/acs.jcim.1c00600 · PMID: 34694798

Hierarchical Generation of Molecular Graphs using Structural Motifs
Wengong Jin, Regina Barzilay, Tommi Jaakkola
arXiv, April 2020, https://arxiv.org/abs/2002.03230

3.1 VAE-based

An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming
Minkai Xu, Wujie Wang, Shitong Luo, Chence Shi, Yoshua Bengio, Rafael Gomez-Bombarelli, Jian Tang
arXiv, June 2021, https://arxiv.org/abs/2105.07246

Geometry-Based Molecular Generation With Deep Constrained Variational Autoencoder
Chunyan Li, Junfeng Yao, Wei Wei, Zhangming Niu, Xiangxiang Zeng, Jin Li, Jianmin Wang
IEEE Transactions on Neural Networks and Learning Systems, 2022, https://doi.org/gpjb8f
DOI: 10.1109/tnnls.2022.3147790 · PMID: 35171779 || code

MGCVAE: Multi-Objective Inverse Design via Molecular Graph Conditional Variational Autoencoder
Myeonghun Lee, Kyoungmin Min
Journal of Chemical Information and Modeling, June 2022, https://doi.org/gqhf8q
DOI: 10.1021/acs.jcim.2c00487 · PMID: 35666276 || code || pdf

GF-VAE
Changsheng Ma, Xiangliang Zhang
Proceedings of the 30th ACM International Conference on Information & Knowledge Management, October 2021, https://doi.org/gp2883
DOI: 10.1145/3459637.3482260 || code

LIMO: Latent Inceptionism for Targeted Molecule Generation
Peter Eckmann, Kunyang Sun, Bo Zhao, Mudong Feng, Michael K. Gilson, Rose Yu
arXiv, June 2022, https://arxiv.org/abs/2206.09010 || code || pdf

Junction Tree Variational Autoencoder for Molecular Graph Generation
Wengong Jin, Regina Barzilay, Tommi Jaakkola
arXiv, April 2019, https://arxiv.org/abs/1802.04364

Grammar Variational Autoencoder
Matt J. Kusner, Brooks Paige, José Miguel Hernández-Lobato
arXiv, March 2017, https://arxiv.org/abs/1703.01925

Syntax-Directed Variational Autoencoder for Structured Data
Hanjun Dai, Yingtao Tian, Bo Dai, Steven Skiena, Le Song
arXiv, February 2018, https://arxiv.org/abs/1802.08786

GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders
Martin Simonovsky, Nikos Komodakis
arXiv, February 2018, https://arxiv.org/abs/1802.03480

3.2 Structure-Based-Drug-Design

Hierarchical Generation of Molecular Graphs using Structural Motifs
Wengong Jin, Regina Barzilay, Tommi Jaakkola
arXiv, April 2020, https://arxiv.org/abs/2002.03230

A 3D Molecule Generative Model for Structure-Based Drug Design
Shitong Luo, Jiaqi Guan, Jianzhu Ma, Jian Peng
arXiv, March 2022, https://arxiv.org/abs/2203.10446 || pdf || code

3.3 Transformer-Based

MolGPT: Molecular Generation Using a Transformer-Decoder Model
Viraj Bagal, Rishal Aggarwal, P. K. Vinod, U. Deva Priyakumar
Journal of Chemical Information and Modeling, October 2021, https://doi.org/gnw9m7
DOI: 10.1021/acs.jcim.1c00600 · PMID: 34694798 || code

Molecular Optimization by Capturing Chemist’s Intuition Using Deep Neural Networks
November 2020, https://doi.org/gqgzp7
DOI: 10.21203/rs.3.rs-101137/v1 || code

3.4 Flow-Based

GraphNVP: An Invertible Flow Model for Generating Molecular Graphs
Kaushalya Madhawa, Katsuhiko Ishiguro, Kosuke Nakago, Motoki Abe
arXiv, May 2019, https://arxiv.org/abs/1905.11600

Graph Residual Flow for Molecular Graph Generation
Shion Honda, Hirotaka Akita, Katsuhiko Ishiguro, Toshiki Nakanishi, Kenta Oono
arXiv, October 2019, https://arxiv.org/abs/1909.13521

3.5 Geometry-based

Shape-Based Generative Modeling for de Novo Drug Design
Miha Skalic, José Jiménez, Davide Sabbadin, Gianni De Fabritiis
Journal of Chemical Information and Modeling, February 2019, https://doi.org/gfv7f3
DOI: 10.1021/acs.jcim.8b00706 · PMID: 30762364 || code

Deep Generative Models for 3D Linker Design
Fergus Imrie, Anthony R. Bradley, Mihaela van der Schaar, Charlotte M. Deane
Journal of Chemical Information and Modeling, March 2020, https://doi.org/gnfhsq
DOI: 10.1021/acs.jcim.9b01120 · PMID: 32195587 · PMCID: PMC7189367 || code

Geometry-Based Molecular Generation With Deep Constrained Variational Autoencoder
Chunyan Li, Junfeng Yao, Wei Wei, Zhangming Niu, Xiangxiang Zeng, Jin Li, Jianmin Wang
IEEE Transactions on Neural Networks and Learning Systems, 2022, https://doi.org/gpjb8f
DOI: 10.1109/tnnls.2022.3147790 · PMID: 35171779 || code

GeoDiff: a Geometric Diffusion Model for Molecular Conformation Generation
Minkai Xu, Lantao Yu, Yang Song, Chence Shi, Stefano Ermon, Jian Tang
arXiv, March 2022, https://arxiv.org/abs/2203.02923

3.6 Scaffold-based

Scaffold-constrained molecular generation
Maxime Langevin, Herve Minoux, Maximilien Levesque, Marc Bianciotto
arXiv, January 2021, https://arxiv.org/abs/2009.07778
DOI: 10.1021/acs.jcim.0c01015

4.Protein Design(Overview)

If you want to learn more about protein design papers, I recommend visiting papers_for_protein_design_using_DL.

Machine-learning-guided directed evolution for protein engineering
Kevin K. Yang, Zachary Wu, Frances H. Arnold
Nature Methods, July 2019, https://doi.org/gf43h4
DOI: 10.1038/s41592-019-0496-6 · PMID: 31308553

Batched Stochastic Bayesian Optimization via Combinatorial Constraints Design
Kevin K. Yang, Yuxin Chen, Alycia Lee, Yisong Yue
arXiv, April 2019, https://arxiv.org/abs/1904.08102

Unified rational protein engineering with sequence-only deep representation learning
Ethan C. Alley, Grigory Khimulya, Surojit Biswas, Mohammed AlQuraishi, George M. Church
Cold Spring Harbor Laboratory, March 2019, https://doi.org/gf48g2
DOI: 10.1101/589333

Navigating the protein fitness landscape with Gaussian processes
Philip A. Romero, Andreas Krause, Frances H. Arnold
Proceedings of the National Academy of Sciences, December 2012, https://doi.org/f4k8bz
DOI: 10.1073/pnas.1215251110 · PMID: 23277561 · PMCID: PMC3549130

5.Single-cell-pseudotime(Overview)

A comparison of single-cell trajectory inference methods
Wouter Saelens, Robrecht Cannoodt, Helena Todorov, Yvan Saeys
Nature Biotechnology, April 2019, https://doi.org/gfxsgd
DOI: 10.1038/s41587-019-0071-9 · PMID: 30936559

GitHub - agitter/single-cell-pseudotime: An overview of algorithms for estimating pseudotime in single-cell RNA-seq data
GitHub
https://github.com/agitter/single-cell-pseudotime

Network Inference with Granger Causality Ensembles on Single-Cell Transcriptomic Data
Atul Deshpande, Li-Fang Chu, Ron Stewart, Anthony Gitter
Cold Spring Harbor Laboratory, January 2019, https://doi.org/gft4bb
DOI: 10.1101/534834
