DLTKcat

DLTKcat v1.0: Deep learning based prediction of temperature dependent enzyme turnover rates

Dataset curation from SABIO-RK and BRENDA

The dataset curation process is in /code/GetData.ipynb.

How to use DLTKcat?

Required inputs: substrate name, UniProt ID of the enzyme, and temperature.
Get SMILES strings and enzyme protein sequences using convert_input(path, enz_col, sub_col) in /code/feature_functions.py.
The input must be a CSV file with the columns 'smiles', 'seq', 'Temp_K_norm', and 'Inv_Temp_norm'.
'Temp_K_norm' and 'Inv_Temp_norm' are the normalized temperature and the normalized inverse temperature.
Run prediction:

python predict.py --model_path [default = /data/performances/model_latentdim=40_outlayer=4_rmsetest=0.8854_rmsedev=0.908.pth]
--param_dict_pkl [default = /data/hyparams/param_2.pkl]
--input [input.csv] --output [output file name]
--has_label [default = False]
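
The input file described in the steps above can be sketched as follows. This is a minimal illustration, not the repository's own preprocessing: the README does not state how the temperature columns are normalized, so min-max scaling is an assumption here, and the bounds T_MIN_K and T_MAX_K, the helper names, the example SMILES string, and the example sequence are all hypothetical. Consult /code/GetData.ipynb for the scaling actually used to train the model.

```python
import csv
import io

# Assumed normalization bounds in kelvin; hypothetical values, not taken from DLTKcat.
T_MIN_K = 273.15
T_MAX_K = 373.15


def min_max(value, low, high):
    """Scale value linearly so that low maps to 0 and high maps to 1."""
    return (value - low) / (high - low)


def make_row(smiles, seq, temp_k):
    """Build one input row with the four columns named in the steps above."""
    inv_low, inv_high = 1.0 / T_MAX_K, 1.0 / T_MIN_K
    return {
        "smiles": smiles,
        "seq": seq,
        "Temp_K_norm": min_max(temp_k, T_MIN_K, T_MAX_K),
        "Inv_Temp_norm": min_max(1.0 / temp_k, inv_low, inv_high),
    }


def write_input_csv(rows, handle):
    """Write rows to an open text handle as CSV with a header line."""
    fields = ["smiles", "seq", "Temp_K_norm", "Inv_Temp_norm"]
    writer = csv.DictWriter(handle, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)


# Placeholder substrate (ethanol) and a made-up short sequence, for illustration only.
rows = [make_row("CCO", "MKTAYIAKQR", 310.15)]
buffer = io.StringIO()
write_input_csv(rows, buffer)
csv_text = buffer.getvalue()
```

To produce a file for predict.py, pass an open file handle (for example open("input.csv", "w", newline="")) to write_input_csv instead of the in-memory buffer.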

Get attention weights of protein residues:

python get_attention.py --input [input.csv] --output [output file name]

Case studies

Mutants of Pyrococcus furiosus ornithine carbamoyltransferase obtained via directed evolution (/data/PFOCT/, /code/CaseStudy_PFOCT.ipynb).
Ref: https://doi.org/10.1128/jb.183.3.1101-1105.2001
Growth and metabolism of Lactococcus lactis and Streptococcus thermophilus at different temperatures (/data/GEMs, /code/GEMs.ipynb).
Ref: https://doi.org/10.1038/srep14199, https://doi.org/10.1111/j.1365-2672.2004.02418.x

Dependencies

PyTorch: https://pytorch.org/
Scikit-learn: https://scikit-learn.org/
RDKit: https://www.rdkit.org/
BRENDApyrser: https://github.com/Robaina/BRENDApyrser
COBRApy: https://github.com/opencobra/cobrapy
Seaborn (statistical data visualization): https://seaborn.pydata.org/index.html
Escher: https://github.com/zakandrewking/escher

Citation

DLTKcat: deep learning based prediction of temperature dependent enzyme turnover rates. Sizhe Qiu, Simiao Zhao, Aidong Yang. bioRxiv 2023.08.10.552798; doi: https://doi.org/10.1101/2023.08.10.552798

Issue

Users might encounter an "Index out of range" error at amino_vector = self.embedding_layer_amino(amino).
A potential fix is to increase n_atom and n_amino by 1 in the model parameters and then train a new model.
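
Why adding 1 can help may be easier to see with a small, library-free sketch. An embedding table with n rows accepts only the indices 0 to n-1, so a token index equal to n has no row to look up. The helper names and the example indices below are hypothetical, and whether this is the actual cause in a given run is an assumption; the sketch only checks index bounds and does not touch the DLTKcat model.

```python
def required_table_size(token_ids):
    """Smallest number of embedding rows that covers every index in token_ids."""
    return max(token_ids) + 1


def out_of_range(token_ids, table_size):
    """Return the indices that a table with table_size rows cannot look up."""
    return sorted({i for i in token_ids if i < 0 or i >= table_size})


# Hypothetical data: the indices 0..20 all occur in the encoded sequences.
amino_ids = list(range(21))
n_amino = 20  # one row too few, so index 20 has no matching row

bad = out_of_range(amino_ids, n_amino)        # indices that would raise the error
fixed = out_of_range(amino_ids, n_amino + 1)  # empty once the table has one more row
```

Comparing required_table_size on the encoded atoms and residues against n_atom and n_amino before training is one way to check whether the sizes need to be increased.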
