
grade_if's Introduction

GraDe_IF: Graph Denoising Diffusion for Inverse Protein Folding (NeurIPS 2023)

GraDe_IF

Description

Implementation of "Graph Denoising Diffusion for Inverse Protein Folding" (arXiv link).

Requirements

To install requirements:

conda env create -f environment.yml

Usage

As in denoising-diffusion-pytorch, here is a brief example showing how this discrete diffusion model works.

import sys
sys.path.append('diffusion')

import torch
from torch_geometric.data import Batch
from diffusion.gradeif import GraDe_IF, EGNN_NET
from dataset_src.generate_graph import prepare_graph

# Load a preprocessed protein graph and wrap it in a batch of size one.
# This must happen before building the network, which reads its input
# dimensions from the graph.
graph = torch.load('dataset/process/test/3fkf.A.pt')
input_graph = Batch.from_data_list([prepare_graph(graph)])

# Denoising network
gnn = EGNN_NET(
    input_feat_dim=input_graph.x.shape[1] + input_graph.extra_x.shape[1],
    hidden_channels=10,
    edge_attr_dim=input_graph.edge_attr.shape[1],
)

diffusion_model = GraDe_IF(gnn)

# One training step
loss = diffusion_model(input_graph)
loss.backward()

# Generate a sequence from the structure information
_, sample_seq = diffusion_model.ddim_sample(input_graph)

More details can be found in the Jupyter notebook.

Parameter Choices in Sampling

Below is an ablation study of two key parameters of the ddim_sample function, step and diverse, which were tuned to obtain the improved results presented in the paper. All results below were computed from 50 ensemble runs. The Jupyter notebook shows how to run ensembles.

BLOSUM Kernel - Diverse Mode

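| Step | Recovery Rate | Perplexity | Single Sample Recovery Rate |
|------|---------------|------------|-----------------------------|
| 500  | 0.5341        | 4.02       | 0.505                       |
| 250  | 0.5370        | 4.06       | 0.4679                      |
| 100  | 0.5356        | 4.98       | 0.4213                      |
| 50   | 0.4827        | 8.02       | 0.3745                      |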

BLOSUM Kernel - Non-Diverse Mode

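| Step | Recovery Rate | Perplexity | Single Sample Recovery Rate |
|------|---------------|------------|-----------------------------|
| 500  | 0.5342        | 4.02       | 0.505                       |
| 250  | 0.5373        | 4.12       | 0.4741                      |
| 100  | 0.5351        | 7.43       | 0.5016                      |
| 50   | 0.4999        | 16.74      | 0.4736                      |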

Uniform Kernel - Diverse Mode

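| Step | Recovery Rate | Perplexity | Single Sample Recovery Rate |
|------|---------------|------------|-----------------------------|
| 500  | 0.5286        | 4.08       | 0.5022                      |
| 250  | 0.5292        | 4.13       | 0.4325                      |
| 100  | 0.5329        | 5.28       | 0.4222                      |
| 50   | 0.5341        | 5.91       | 0.4212                      |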

Uniform Kernel - Non-Diverse Mode

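| Step | Recovery Rate | Perplexity | Single Sample Recovery Rate |
|------|---------------|------------|-----------------------------|
| 500  | 0.5286        | 4.08       | 0.5022                      |
| 250  | 0.5273        | 4.09       | 0.4357                      |
| 100  | 0.5238        | 9.49       | 0.5095                      |
| 50   | 0.5285        | 15.53      | 0.5113                      |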
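
For readers unfamiliar with the column names, the sketch below shows one common way to compute an ensemble recovery rate and perplexity from per-residue amino-acid probabilities. It assumes that ensembling means averaging the predicted probabilities over runs before taking the argmax. That is an assumption made for illustration, and the notebook's actual procedure may differ. The function name ensemble_metrics and the toy inputs are hypothetical and do not come from the repository.

```python
import numpy as np


def ensemble_metrics(prob_runs, native_idx):
    """Average per-residue probabilities over runs, then score the result.

    prob_runs:  array of shape (runs, residues, 20); each row sums to 1
    native_idx: array of shape (residues,) with native amino-acid indices
    Returns (recovery_rate, perplexity).
    """
    prob_runs = np.asarray(prob_runs, dtype=float)
    native_idx = np.asarray(native_idx)

    # Assumed ensembling rule: mean probability across runs
    mean_prob = prob_runs.mean(axis=0)  # (residues, 20)

    # Recovery rate: fraction of residues whose argmax matches the native one
    predicted = mean_prob.argmax(axis=1)
    recovery = float((predicted == native_idx).mean())

    # Perplexity: exp of the mean negative log-likelihood of native residues
    native_prob = mean_prob[np.arange(len(native_idx)), native_idx]
    nll = -np.log(np.clip(native_prob, 1e-12, None))
    perplexity = float(np.exp(nll.mean()))
    return recovery, perplexity


# Toy input: 3 runs, 4 residues, random distributions over 20 amino acids
rng = np.random.default_rng(0)
toy_probs = rng.dirichlet(np.ones(20), size=(3, 4))
toy_native = np.array([0, 5, 7, 19])
recovery, perplexity = ensemble_metrics(toy_probs, toy_native)
print(f"recovery={recovery:.3f} perplexity={perplexity:.2f}")
```

Under this definition, passing a single run (shape (1, residues, 20)) gives a single-sample score, and a uniform prediction over 20 amino acids gives a perplexity of 20.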

Comments

  • Our codebase for the EGNN models and discrete diffusion builds on EGNN and DiGress. Thanks for open-sourcing!

Citation

If you find our code and datasets useful, please cite:

@inproceedings{
      yi2023graph,
      title={Graph Denoising Diffusion for Inverse Protein Folding},
      author={Kai Yi and Bingxin Zhou and Yiqing Shen and Pietro Lio and Yu Guang Wang},
      booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
      year={2023},
      url={https://openreview.net/forum?id=u4YXKKG5dX}
      }

grade_if's People

Contributors

ykiiiiii


grade_if's Issues

Last residue deleted

I noticed on line 565 of cath_imem_2nd.py that you delete the last residue of the PDB file:
graph = self.remove_node(graph, graph.x.shape[0]-1)

Why is this done? Is it safe to comment out this line (code seems to run fine without it)? Is the last residue deleted when computing metrics with the baseline methods as well?
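
To make the question concrete, here is a minimal sketch of what removing a node from a graph usually involves: dropping its feature row, dropping every edge that touches it, and re-indexing the remaining edges. This is not the repository's remove_node. The signature, the NumPy representation, and the toy graph are all assumptions made for illustration.

```python
import numpy as np


def remove_node(x, edge_index, node):
    """Drop one node and every edge touching it (hypothetical helper).

    x:          (num_nodes, num_features) node features
    edge_index: (2, num_edges) source and target node indices
    """
    # Keep only edges whose two endpoints both differ from the removed node
    keep_edge = (edge_index != node).all(axis=0)
    kept = edge_index[:, keep_edge]
    # Nodes that came after the removed one shift down by one position
    kept = kept - (kept > node).astype(kept.dtype)
    return np.delete(x, node, axis=0), kept


# Toy 4-node cycle: 0 -> 1 -> 2 -> 3 -> 0
x = np.arange(8).reshape(4, 2)
edge_index = np.array([[0, 1, 2, 3],
                       [1, 2, 3, 0]])

last = x.shape[0] - 1  # same index expression as in the quoted line
new_x, new_edges = remove_node(x, edge_index, last)
print(new_x.shape, new_edges.tolist())
```

In this toy case, removing the last node also removes the two edges that connect it to its neighbours, so any edge features or neighbour counts of the remaining residues change as well.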

Thanks!

no attribute '__check_input__'

Dear developer,

I tried to run the example, but it raised an error in egnn_pytorch_geometric.py:

File "diffusion\model\egnn_pytorch\egnn_pytorch_geometric.py", line 250, in forward
hidden_out, coors_out = self.propagate(edge_index, x=feats, edge_attr=edge_attr_feats,
File "diffusion\model\egnn_pytorch\egnn_pytorch_geometric.py", line 273, in propagate
size = self.__check_input__(edge_index, size)
AttributeError: 'EGNN_Sparse' object has no attribute '__check_input__'

After commenting out the line "size = self.__check_input__(edge_index, size)", I get the next error:
AttributeError: 'EGNN_Sparse' object has no attribute '__collect__'

problems about inverse_folding.ipynb

Hi, in the fourth block, the line "prob,sample_graph= diffusion.ema_model.ddim_sample(input_graph)" raises the error "too many values to unpack (expected 2)". Can you please give me some help?
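
This error is Python's way of saying that the call returned more values than there are names on the left-hand side. The sketch below reproduces it with a stand-in function and shows two ways to inspect or absorb the extra values. The function fake_sampler and its three return values are purely hypothetical. How many values the real ddim_sample returns is not stated here and should be checked in the repository.

```python
def fake_sampler():
    """Stand-in that returns three values; NOT the real ddim_sample."""
    return "prob", "sample_graph", "extra"


# Unpacking three values into two names raises ValueError
try:
    prob, sample_graph = fake_sampler()
except ValueError as err:
    message = str(err)
    print(message)

# Inspect what comes back before deciding how to unpack it
result = fake_sampler()
print(type(result).__name__, len(result))

# Starred unpacking keeps the first two values and collects the rest
prob, sample_graph, *rest = result
print(rest)
```

Printing the length of the returned tuple first is usually the quickest way to see how the left-hand side of the notebook cell needs to change.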

Provide Datasets from Paper

Hi, can you clarify which dataset is in the chain_set_splits.json you provided, and also provide the JSON files for all of the datasets used in the paper for reproducibility? Thank you!
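
While waiting for an answer, one way to get a first idea of what a split file contains is to count the entries under each top-level key. The sketch below assumes the file is a JSON object that maps split names to lists (or dicts) of chain identifiers. That layout, the helper name summarize_splits, and the toy file contents are assumptions, not facts about chain_set_splits.json.

```python
import json
import os
import tempfile


def summarize_splits(path):
    """Return {split_name: number_of_entries} for a JSON file of splits."""
    with open(path) as handle:
        splits = json.load(handle)
    return {
        name: len(entries)
        for name, entries in splits.items()
        if isinstance(entries, (list, dict))
    }


# Toy file standing in for a real split file; keys and IDs are made up
toy = {"train": ["1abc.A", "2xyz.B"], "validation": ["3def.A"], "test": ["3fkf.A"]}
with tempfile.TemporaryDirectory() as tmp:
    toy_path = os.path.join(tmp, "toy_splits.json")
    with open(toy_path, "w") as handle:
        json.dump(toy, handle)
    summary = summarize_splits(toy_path)

print(summary)
```

Comparing the resulting counts with the split sizes reported for a benchmark can hint at which dataset a file corresponds to, although only the authors can confirm it.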

'EGNN_Sparse' object has no attribute '__check_input__'

Hi, I ran inverse_folding.ipynb, and the fourth cell shows "AttributeError: 'EGNN_Sparse' object has no attribute '__check_input__'". My torch_geometric version is the same as yours, 2.2.0, and the newer 2.4.0 shows the same bug. I don't know how to solve it. Could you please help me?
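
An AttributeError like this typically means that the installed library does not expose the helper method under the name the code expects. A small, library-independent check is to ask the object which of several candidate names it actually has. In the sketch below, the helper first_available and the two dummy classes are hypothetical, and the alternative name _check_input is an assumption that should be verified against the MessagePassing class of the installed torch_geometric version.

```python
def first_available(obj, candidates):
    """Return the first name in `candidates` that `obj` has, else None."""
    for name in candidates:
        if hasattr(obj, name):
            return name
    return None


class OldStyleLayer:
    """Dummy class exposing the double-underscore name."""

    def __check_input__(self, edge_index, size):
        return size


class NewStyleLayer:
    """Dummy class exposing a single-underscore name (assumed variant)."""

    def _check_input(self, edge_index, size):
        return size


names = ("__check_input__", "_check_input")
print(first_available(OldStyleLayer(), names))
print(first_available(NewStyleLayer(), names))
print(first_available(object(), names))
```

Running the same check on an EGNN_Sparse instance, together with printing torch_geometric.__version__, shows which name the installed version provides and narrows down whether the problem is a version mismatch.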


about the paper.

Hi, I want to know whether this repository will be updated. It is currently empty. Thank you!

Recovery on the TS50 and T500 Datasets

Hi, could you release the checkpoint and config for the TS50 and T500 datasets? When I follow the setting in the paper and increase hidden_size to 256, the recovery on both datasets is about 5 percentage points lower. Maybe mean_attr.pt is not suitable for these datasets? I am confused, so I hope to receive a reply.

thanks!

Segmentation fault (core dumped)

Hi, I want to ask for help. When I run download_pdb.py, I get the error "Segmentation fault (core dumped)". A few days ago, I also hit this error while training the model. I can almost never complete a run, and I don't know the reason: the crash happens at a different point each time, and the error is strange. I have tried several measures (changing versions, reinstalling conda, and reinstalling the system), but the error still occurs.
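
A segmentation fault normally kills the interpreter without a Python traceback, which makes it hard to see where the crash happens. Python's standard faulthandler module can dump the traceback of every thread when such a signal arrives. The sketch below enables it and writes to a log file. The log file name is an arbitrary choice, and placing these lines at the top of download_pdb.py or the training script is a suggestion, not something the repository does.

```python
import faulthandler
import os
import tempfile

# Arbitrary log location; the file must stay open while the handler is active
log_path = os.path.join(tempfile.gettempdir(), "grade_if_crash.log")
crash_log = open(log_path, "w")

# On SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL, write the Python
# traceback of all threads to the log file before the process dies
faulthandler.enable(file=crash_log, all_threads=True)

print("faulthandler enabled:", faulthandler.is_enabled())
print("crash log:", log_path)
```

The same handler can be switched on without editing any code by running the script as python -X faulthandler download_pdb.py, in which case the traceback goes to standard error. The last Python frame in the dump usually points to the extension module that crashed.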
