chao1224 / chatdrug

LLM for Drug Editing, ICLR 2024

Home Page: https://chao1224.github.io/ChatDrug

Conversational Drug Editing Using Retrieval and Domain Feedback

ICLR 2024

Authors: Shengchao Liu+, Jiongxiao Wang+, Yijin Yang, Chengpeng Wang, Ling Liu, Hongyu Guo*, Chaowei Xiao*

+ Equal contribution
* Equal advising

[Paper] [Project Page] [ArXiv]

ChatDrug performs conversational drug editing on three types of drugs:

  • Small Molecules
  • Peptides
  • Proteins

Environment

Set up Anaconda (skip this step if you already have conda):

wget https://repo.continuum.io/archive/Anaconda3-2019.10-Linux-x86_64.sh
bash Anaconda3-2019.10-Linux-x86_64.sh -b
export PATH=$PWD/anaconda3/bin:$PATH

Then create the environment and install the required Python packages:

conda create -n ChatDrug python=3.8
conda activate ChatDrug
pip install rdkit-pypi==2022.9.4
conda install -y numpy networkx scikit-learn
conda install -y -c conda-forge -c pytorch pytorch=1.9.1

pip install tensorflow
pip install mhcflurry
pip install levenshtein

pip install transformers
pip install lmdb
pip install seqeval
pip install openai
pip install fastchat
pip install psutil
pip install accelerate

pip install -e .
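
After the installation finishes, it can help to confirm that the packages are importable before running any task. The following is a minimal sketch using only the standard library. The helper `missing_modules` is hypothetical (not part of ChatDrug), and the import names in `EXPECTED` (for example `sklearn` for scikit-learn, `torch` for pytorch, `Levenshtein` for levenshtein) are assumptions about how the installed packages map to module names.

```python
import importlib.util
from typing import Iterable, List

# Assumed top-level import names for the packages installed above.
EXPECTED = [
    "rdkit", "numpy", "networkx", "sklearn", "torch",
    "tensorflow", "mhcflurry", "Levenshtein", "transformers",
    "lmdb", "seqeval", "openai", "psutil", "accelerate",
]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```

Running this inside the activated ChatDrug environment lists any module that still needs to be installed.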

Dataset

We provide the dataset at this link. You can download it manually and move it to the data folder, or use the following Python script:

from huggingface_hub import snapshot_download

snapshot_download(repo_id="chao1224/ChatDrug_data", repo_type="dataset", local_dir="data", local_dir_use_symlinks=False, ignore_patterns=["README.md"])

Please give credit to the original papers. For more details about the dataset, please check the data folder.

Evaluation

The evaluation tools for the three editing tasks are listed below:

Drug Type        Evaluation
Small Molecule   RDKit (conda install -y -c rdkit rdkit)
Peptide          MHCFlurry
Protein          ProteinDT paper, checkpoints

For evaluation on peptides and proteins, please read the following instructions:

  • For peptides (MHCFlurry), please run the following bash commands, where $MODEL_PATH stands for the path printed by the third command (it is not the shell's own $PATH variable):
> pip install mhcflurry
> mhcflurry-downloads fetch models_class1_presentation
> mhcflurry-downloads path models_class1_presentation
$MODEL_PATH
> mv $MODEL_PATH data/peptide/models_class1_presentation
  • For proteins (ProteinDT / ProteinCLAP), please run the following Python script:
from huggingface_hub import hf_hub_download

hf_hub_download(
  repo_id="chao1224/ProteinCLAP_pretrain_EBM_NCE_downstream_property_prediction",
  repo_type="model",
  filename="pytorch_model_ss3.bin",
  cache_dir="data/protein")
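
The last two peptide steps above (query the download path, then move the models) can also be scripted. This is a minimal sketch, assuming `mhcflurry-downloads path` prints only the directory path on stdout and that the models have already been fetched. The helper names `query_download_path` and `move_models` are hypothetical and not part of ChatDrug or MHCFlurry.

```python
import shutil
import subprocess
from pathlib import Path


def query_download_path(download_name: str = "models_class1_presentation") -> Path:
    """Ask the mhcflurry-downloads CLI where a fetched download is stored."""
    result = subprocess.run(
        ["mhcflurry-downloads", "path", download_name],
        capture_output=True, text=True, check=True,
    )
    # Assumption: stdout contains just the path, possibly with a trailing newline.
    return Path(result.stdout.strip())


def move_models(src: Path, dst: Path) -> Path:
    """Move the downloaded model directory to dst, creating parent folders."""
    if dst.exists():
        raise FileExistsError(f"{dst} already exists; remove it first")
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))
    return dst


# Example usage (run from the repository root after fetching the models):
#   src = query_download_path()
#   move_models(src, Path("data/peptide/models_class1_presentation"))
```

Refusing to overwrite an existing destination is a deliberate choice here, so that a second run does not nest one model directory inside another.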

Please give credit to the original papers. For more details about the evaluation, please check the data folder.

Prompt for Drug Editing

All the task prompts are defined in ChatDrug/task_and_evaluation. You can also find them at the Hugging Face link.

Usage

Please provide your OpenAI API Key in ChatDrug/task_and_evaluation/Conversational_LLMs_utils.py
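
If you prefer not to hard-code the key, one option is to read it from an environment variable and fail early when it is absent. This is a minimal sketch only: the helper `read_openai_key` and the variable name `OPENAI_API_KEY` are assumptions, and how Conversational_LLMs_utils.py actually stores or consumes the key is not shown here, so the returned value still has to be assigned wherever that file expects it.

```python
import os


def read_openai_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in env_var, or raise if it is unset or empty."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return key
```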

To use ChatDrug, please use the following command:

python main_ChatDrug.py --task task_id --log_file results/ChatDrug.log --record_file results/ChatDrug.json --C 2

Results will be saved in results/.

For protein editing tasks, repeated evaluation during the retrieval process is time-consuming. We therefore provide a fast version of the conversation setting. Run the following command to use the accelerated ChatDrug for protein editing tasks:

python main_ChatDrug.py --task task_id --log_file results/ChatDrug_fast_protein.log --record_file results/ChatDrug_fast_protein.json --C 2 --fast_protein

We also provide code for the in-context learning setting:

python main_InContext.py --task task_id --log_file results/InContext.log --record_file results/InContext.json
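
The three commands above differ only in the script name, the output file names, and the `--C` and `--fast_protein` flags, so they can be assembled programmatically when scripting several runs. The following is a minimal sketch; `build_command` is a hypothetical helper (not part of the repository), and only the script names and flags shown in the commands above are used. The string `"task_id"` remains a placeholder for a real task identifier.

```python
import sys
from typing import List


def build_command(task_id: str, setting: str = "ChatDrug", C: int = 2,
                  fast_protein: bool = False,
                  results_dir: str = "results") -> List[str]:
    """Build the argument list for main_ChatDrug.py or main_InContext.py."""
    if setting not in ("ChatDrug", "InContext"):
        raise ValueError("setting must be 'ChatDrug' or 'InContext'")
    if fast_protein and setting != "ChatDrug":
        raise ValueError("--fast_protein is only shown for main_ChatDrug.py")

    # Output file names follow the pattern used in the commands above.
    tag = setting + ("_fast_protein" if fast_protein else "")
    cmd = [
        sys.executable, f"main_{setting}.py",
        "--task", str(task_id),
        "--log_file", f"{results_dir}/{tag}.log",
        "--record_file", f"{results_dir}/{tag}.json",
    ]
    if setting == "ChatDrug":
        cmd += ["--C", str(C)]
        if fast_protein:
            cmd.append("--fast_protein")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it; to execute, pass the list to
    # subprocess.run(cmd, check=True) from the repository root.
    print(" ".join(build_command("task_id")))
```

Building the command as a list rather than a single string avoids shell quoting issues when it is later passed to `subprocess.run`.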

Cite Us

Feel free to cite this work if you find it useful!

@inproceedings{liu2024chatdrug,
    title={Conversational Drug Editing Using Retrieval and Domain Feedback},
    author={Shengchao Liu and Jiongxiao Wang and Yijin Yang and Chengpeng Wang and Ling Liu and Hongyu Guo and Chaowei Xiao},
    booktitle={The Twelfth International Conference on Learning Representations},
    year={2024},
    url={https://openreview.net/forum?id=yRrPfKyJQ2}
}
