PMDM: A dual diffusion model enables 3D binding bioactive molecule generation and lead optimization given target pockets

Official implementation of PMDM, a dual diffusion model that enables 3D binding bioactive molecule generation and lead optimization given target pockets, by Lei Huang.

📢 News

  • Our paper has been accepted by Nature Communications!

  1. Dependencies
    1. Conda environment
    2. QuickVina 2
    3. Pre-trained models
  2. Benchmarks
    1. CrossDocked Benchmark
    2. Binding MOAD
  3. Training
  4. Inference
    1. Test set sampling
    2. Sample molecules for a given pocket
    3. Metrics
    4. QuickVina2
  5. Citation

Dependencies

Conda environment

Create the conda environment from the provided environment file:

# Create the environment from mol.yml
conda env create -f mol.yml
# Activate the environment
conda activate mol

QuickVina 2

For docking, install QuickVina 2:

wget https://github.com/QVina/qvina/raw/master/bin/qvina2.1
chmod +x qvina2.1

Preparing the receptor for docking (pdb -> pdbqt) requires a separate environment based on Python 2.x, so create it as well:

# Create the environment from evaluation/env_adt.yml
conda env create -f evaluation/env_adt.yml
# Activate the environment
conda activate adt

Pre-trained models

The pre-trained models can be downloaded from Zenodo.

Benchmarks

CrossDocked

Data preparation

Download and extract the dataset provided on Zenodo.

The original CrossDocked dataset can be found at https://bits.csb.pitt.edu/files/crossdock2020/

Binding MOAD

Data preparation

Download the dataset

wget http://www.bindingmoad.org/files/biou/every_part_a.zip
wget http://www.bindingmoad.org/files/biou/every_part_b.zip
wget http://www.bindingmoad.org/files/csv/every.csv

unzip every_part_a.zip
unzip every_part_b.zip

Training

We provide two training scripts, train.py and train_ddp_op.py, for single-GPU and multi-GPU training, respectively.

Starting a new training run:

python -u train.py --config <config>.yml

An example configuration file is provided at configs/crossdock_epoch.yml.

Resuming a previous run:

python -u train.py --config <configure file path>

The config argument should be the upper path of the configuration file.

Inference

Sample molecules for all pockets in the test set

python -u sample_batch.py --ckpt <checkpoint> --num_samples <number of samples> --sampling_type generalized

Sample molecules for given customized pockets

python -u sample_for_pdb.py --ckpt <checkpoint> --pdb_path <pdb path> --num_atom <num atom> --num_samples <number of samples> --sampling_type generalized

num_atom is the number of atoms in each generated molecule.
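
When several pockets need to be sampled with the same settings, the command above can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper names build_sample_command and build_commands, the checkpoint name model.pt, and the pocket file names are all hypothetical. It only uses the flags shown in the command above.

```python
import shlex
from typing import List


def build_sample_command(ckpt: str, pdb_path: str, num_atom: int, num_samples: int) -> List[str]:
    """Return the argument list for one sample_for_pdb.py call.

    Only the flags documented above are used; sampling_type is fixed to
    "generalized" as in the example command.
    """
    return [
        "python", "-u", "sample_for_pdb.py",
        "--ckpt", ckpt,
        "--pdb_path", pdb_path,
        "--num_atom", str(num_atom),
        "--num_samples", str(num_samples),
        "--sampling_type", "generalized",
    ]


def build_commands(ckpt: str, pdb_paths: List[str], num_atom: int, num_samples: int) -> List[List[str]]:
    """Build one command per pocket file, sharing the same checkpoint and settings."""
    return [build_sample_command(ckpt, p, num_atom, num_samples) for p in pdb_paths]


if __name__ == "__main__":
    # Hypothetical checkpoint and pocket files, for illustration only.
    for cmd in build_commands("model.pt", ["pocket_a.pdb", "pocket_b.pdb"], num_atom=25, num_samples=10):
        print(shlex.join(cmd))
```

Each returned list can be passed to subprocess.run(cmd, check=True) from the repository root with the mol environment active, or the printed lines can be pasted into a shell.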

Sample novel molecules given seed fragments

python -u sample_frag.py --ckpt <checkpoint> --pdb_path <pdb path> --mol_file <mole file> --keep_index <seed fragments index> --num_atom <num atom> --num_samples <number of samples> --sampling_type generalized

num_atom is the number of atoms in the generated fragments. keep_index lists the atom indices of the seed fragments.

Sample novel molecules for linker

python -u sample_linker.py --ckpt <checkpoint> --pdb_path <pdb path> --mol_file <mole file> --keep_index <seed fragments index> --num_atom <num atom> --num_samples <number of samples> --sampling_type generalized

num_atom is the number of atoms of generated fragments. mask is the index of the linker that you would like to replace in the original molecule.

Metrics

Evaluate a batch of generated molecules (this requires enabling the save_results argument in the sample* scripts):

python -u evaluate --path <molecule_path>

If you want to evaluate a single molecule, use evaluate_single.py.

QuickVina2

First, convert all protein PDB files to PDBQT files using the adt environment.

conda activate adt
prepare_receptor4.py -r {} -o {}
cd evaluation
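
The {} placeholders above stand for an input PDB file and an output PDBQT file. For a directory of receptors, the per-file commands can be generated with a small script. This is a minimal sketch under stated assumptions: the helper name receptor_commands and the directory names pdb_in and pdbqt_out are hypothetical, and the script only prints commands, so it can run in any Python 3 environment while the printed commands are executed in the adt environment.

```python
import shlex
from pathlib import Path
from typing import List


def receptor_commands(pdb_dir: str, out_dir: str) -> List[List[str]]:
    """Build one prepare_receptor4.py command per *.pdb file in pdb_dir.

    Each output file keeps the input file's base name with a .pdbqt suffix.
    """
    commands = []
    for pdb_file in sorted(Path(pdb_dir).glob("*.pdb")):
        pdbqt_file = Path(out_dir) / (pdb_file.stem + ".pdbqt")
        commands.append(
            ["prepare_receptor4.py", "-r", str(pdb_file), "-o", str(pdbqt_file)]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical input and output directories, for illustration only.
    for cmd in receptor_commands("pdb_in", "pdbqt_out"):
        print(shlex.join(cmd))
```

Redirecting the output to a file gives a shell script that can be run after conda activate adt; the output directory must exist beforehand.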

Then, compute QuickVina scores:

conda deactivate
conda activate mol
python docking_2_single.py --receptor_file <prepare_receptor4_outdir> --sdf_file <sdf file> --out_dir <qvina_outdir>

Note: the scripts contain hard-coded paths to the mol and adt environments. Replace them with the paths of your own environments before running.

Citation

@article {Huang2023.01.28.526011,
	author = {Lei Huang and Tingyang Xu and Yang Yu and Peilin Zhao and Ka-Chun Wong and Hengtong Zhang},
	title = {A dual diffusion model enables 3D binding bioactive molecule generation and lead optimization given target pockets},
	elocation-id = {2023.01.28.526011},
	year = {2023},
	doi = {10.1101/2023.01.28.526011},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2023/01/30/2023.01.28.526011},
	eprint = {https://www.biorxiv.org/content/early/2023/01/30/2023.01.28.526011.full.pdf},
	journal = {bioRxiv}
}
