AF2Seq: AlphaFold backbone design pipeline

Installation

Create a conda environment:
conda env create -f environment_cpu.yml

PyRosetta has to be installed separately:
conda install -c https://NAME:[email protected] pyrosetta
Please refer to the PyRosetta website for detailed instructions.

AlphaFold weights can be downloaded by following the instructions in the official AlphaFold repository.

Go to the repository folder and run:
pip install .

GPU Version:

Install the CPU environment first, then run:
pip install --upgrade pip
pip install "jax[cuda]>=0.2,<0.3" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

The code was tested using:
gcc 8.4.0-cuda
cuda 11.1.1
cudnn 8.0.5.39-11.1-linux-x64

Starting sequence from secondary structure:

Install pydssp (a DSSP implementation) to enable automatic starting-sequence generation:
pip install pydssp

Then call the following function:

from af2seq.design.utils import generate_start_sequence

sequence = generate_start_sequence('path/to/pdb/file')
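
The command-line help for -st describes the starting-sequence convention as A for helix, V for b-sheet and G for unordered. The following is a minimal sketch of that mapping, not the package's own implementation: the function name start_sequence_from_ss is hypothetical, and the input is assumed to be a three-state secondary-structure string with H for helix and E for strand, where every other character counts as unordered.

```python
# Hypothetical helper for illustration; af2seq's generate_start_sequence may differ.
SS_TO_AA = {"H": "A", "E": "V"}  # helix -> alanine, strand -> valine


def start_sequence_from_ss(ss: str) -> str:
    """Map a per-residue secondary-structure string to a starting sequence.

    Any state other than H or E (for example '-') is treated as unordered
    and mapped to glycine.
    """
    return "".join(SS_TO_AA.get(state, "G") for state in ss.upper())


print(start_sequence_from_ss("--HHHHHH--EEEE-"))  # GGAAAAAAGGVVVVG
```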

Usage

from af2seq import GradientDesign, MCMCDesign

# Gradient-descent design
design = GradientDesign('path/to/weights', 'output/path')

design.design('path/to/pdb/file',
              iterations=500,
              lr=1e-3)

# MCMC design
mcmc = MCMCDesign('path/to/weights', 'output/path',
                  random_seed=0,
                  mcmc_muts=1)

mcmc.design('path/to/pdb/file',
            iterations=500)
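
To give a feel for what the random_seed and mcmc_muts parameters control, here is a generic Metropolis-style sequence search. It is a toy sketch and not AF2Seq's algorithm: the loss is a simple mismatch count standing in for the AlphaFold-based structure loss, and the names mcmc_design, toy_loss and temperature are assumptions made for this example.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_loss(seq, target):
    """Stand-in for a structure loss: number of mismatches to a target string."""
    return sum(a != b for a, b in zip(seq, target))


def mcmc_design(start, loss_fn, iterations=500, mcmc_muts=1,
                random_seed=0, temperature=0.1):
    """Generic Metropolis search over sequences (illustrative only)."""
    rng = random.Random(random_seed)  # seeded so runs are reproducible
    current, current_loss = start, loss_fn(start)
    best, best_loss = current, current_loss
    for _ in range(iterations):
        # Propose a sequence with mcmc_muts randomly chosen positions redrawn.
        proposal = list(current)
        for pos in rng.sample(range(len(proposal)), mcmc_muts):
            proposal[pos] = rng.choice(AMINO_ACIDS)
        proposal = "".join(proposal)
        proposal_loss = loss_fn(proposal)
        # Always accept improvements; accept worse proposals with
        # probability exp(-delta / temperature).
        delta = proposal_loss - current_loss
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_loss = proposal, proposal_loss
            if current_loss < best_loss:
                best, best_loss = current, current_loss
    return best, best_loss


target = "MKTAYIAKQR"
best, loss = mcmc_design("A" * len(target), lambda s: toy_loss(s, target))
print(best, loss)
```

A larger mcmc_muts makes each proposal a bigger jump in sequence space, which explores faster but tends to lower the acceptance rate.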

Plotting

from af2seq import plotting

plotting.plot_pred(design)

For larger structures that require more memory, set the following environment variables:
TF_FORCE_UNIFIED_MEMORY=1
XLA_PYTHON_CLIENT_MEM_FRACTION=2.0
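
When working from a script or notebook rather than a shell, the same variables can be set from Python. The sketch below assumes they are set before JAX initializes its GPU backend, which in practice means before af2seq or jax is imported; the function name configure_memory is hypothetical.

```python
import os


def configure_memory(fraction: str = "2.0") -> None:
    """Set the memory-related environment variables for the current process.

    Call this before importing jax or af2seq so the values are seen
    when the GPU backend starts up.
    """
    # Let the GPU allocator spill into host memory via unified memory.
    os.environ["TF_FORCE_UNIFIED_MEMORY"] = "1"
    # Fraction of GPU memory the XLA client may claim; values above 1.0
    # only make sense together with unified memory.
    os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = fraction


configure_memory()
# import af2seq  # import only after the variables are set
```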

Jupyter Notebook

jupyter notebook design.ipynb

Command Line

usage: Af2Seq [-h] [-n NAME] [-m MODEL] [-c CHAINS [CHAINS ...]] [-it ITER] [-s SEED] [--lr LR]
              [-r RECYCLES] [-cl CLAMP] [-am AA_MASK [AA_MASK ...]] [-fp FIX_POS [FIX_POS ...]]
              [-dlp DISABLE_LOSS_POS [DISABLE_LOSS_POS ...]] [-esl ENABLE_SC_LOSS [ENABLE_SC_LOSS ...]]
              [-st STARTSEQ [STARTSEQ ...]] [--msas MSAS [MSAS ...]] [-mm MCMC_MUTS] [-so SURF_OPTIM]
              [-l LOSS [LOSS ...]] [-lw LOSS_WEIGHTS [LOSS_WEIGHTS ...]]
              datadir target mode out

Fixed backbone design using AlphaFold

positional arguments:
  datadir               path to the directory that contains the Alphafold weights
  target                target PDB file that is used as ground truth
  mode                  Gradient descent (gd) or MCMC (mcmc)
  out                   path to output directory

optional arguments:
  -h, --help            show this help message and exit
  -n NAME, --name NAME  Name of the experiment
  -m MODEL, --model MODEL
                        Select a specific model: ptm or multimer
  -c CHAINS [CHAINS ...], --chains CHAINS [CHAINS ...]
                        chains that are targeted for design.
  -it ITER, --iter ITER
                        How many design steps should be performed
  -s SEED, --seed SEED  seed for mcmc
  --lr LR, --learning_rate LR
                        learning rate
  -l LOSS [LOSS ...], --loss LOSS [LOSS ...]
                        loss function that is used for the optimization process
  -lw LOSS_WEIGHTS [LOSS_WEIGHTS ...], --loss_weights LOSS_WEIGHTS [LOSS_WEIGHTS ...]
                        specifies the impact of each loss term
  -r RECYCLES, --recycles RECYCLES
                        AF recycles
  -cl CLAMP, --clamp CLAMP
                        FAPE loss clamp: clips the loss if the distance between two residues is greater
                        than 10 Å
  -am AA_MASK [AA_MASK ...], --aa_mask AA_MASK [AA_MASK ...]
                        which amino acids to mask
  -fp FIX_POS [FIX_POS ...], --fix_pos FIX_POS [FIX_POS ...]
                        which indexes to mask
  -dlp DISABLE_LOSS_POS [DISABLE_LOSS_POS ...], --disable_loss_pos DISABLE_LOSS_POS [DISABLE_LOSS_POS ...]
                        disable backbone FAPE for these positions
  -esl ENABLE_SC_LOSS [ENABLE_SC_LOSS ...], --enable_sc_loss ENABLE_SC_LOSS [ENABLE_SC_LOSS ...]
                        which positions we want use sidechain FAPE in the loss
  -st STARTSEQ [STARTSEQ ...], --startseq STARTSEQ [STARTSEQ ...]
                        starting sequence: A for helix, V for b-sheet, G for unordered
  --msas MSAS [MSAS ...]
                        MSA input path, None for no MSA
  -mm MCMC_MUTS, --mcmc_muts MCMC_MUTS
                        number of mutations introduced each MCMC round
  -so SURF_OPTIM, --surf_optim SURF_OPTIM
                        don't allow hydrophobic mutations on the surface
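
As an illustration of how the positional arguments and a few single-valued options combine, the following sketch assembles a command line from the usage text above. The helper build_af2seq_command is hypothetical, and the executable name Af2Seq is taken from the usage line; how the entry point is actually installed on your system is an assumption.

```python
import shlex


def build_af2seq_command(datadir, target, mode, out, iterations=None,
                         seed=None, lr=None, mcmc_muts=None,
                         executable="Af2Seq"):
    """Assemble an argument list from the documented flags (illustrative)."""
    if mode not in ("gd", "mcmc"):
        raise ValueError("mode must be 'gd' or 'mcmc'")
    cmd = [executable]
    if iterations is not None:
        cmd += ["-it", str(iterations)]
    if seed is not None:
        cmd += ["-s", str(seed)]
    if lr is not None:
        cmd += ["--lr", str(lr)]
    if mcmc_muts is not None:
        cmd += ["-mm", str(mcmc_muts)]
    # Positional arguments in the order given by the usage line.
    cmd += [datadir, target, mode, out]
    return cmd


cmd = build_af2seq_command("path/to/weights", "path/to/pdb/file", "mcmc",
                           "output/path", iterations=500, seed=0, mcmc_muts=1)
print(shlex.join(cmd))
# Af2Seq -it 500 -s 0 -mm 1 path/to/weights path/to/pdb/file mcmc output/path
```

List-valued options such as -c or -fp are left out of this sketch. The help text looks like argparse output; if so, a list-valued option placed directly before the positional arguments would read them as additional values, so such options are safer placed after the positionals or followed by another option.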

Contributors

bene837
