
making-it-rain's Introduction

Making-it-rain

Cloud-based molecular simulations for everyone

+ 28Apr2023: We have updated all the notebooks to Python 3.10, everything is working fine
+ 01May2023: We have fixed the Amber, AlphaFold2+MD and Protein-Ligand notebooks.
+ 15Dec2023: We have corrected all the notebooks, and everything should be functioning properly now.


Hello there!

This is a repository where you can find Jupyter notebook scripts for running Molecular Dynamics (MD) simulations using the OpenMM engine and AMBER and CHARMM force field files on Google Colab. This repository is supplementary material for the paper "Making it rain: Cloud-based molecular simulations for everyone", and we encourage you to read it before using this pipeline.

The main goal of this work is to demonstrate how to harness the power of cloud-computing to run microsecond-long MD simulations in a cheap and yet feasible fashion.

Important: We've updated the notebooks to CondaColab, so all dependencies now install in less than half the previous time. Each notebook has a CondaColab cell: run it on its own and wait a few seconds for the session to restart, which is normal and expected. Do not use the Run all option before that. Once the kernel has restarted, you can continue running the cells as usual, or run all of them if you want. Thank you for your support.

  1. AMBER Open In Colab - Using AMBER to generate topology and to build the simulation box
  2. CHARMM Open In Colab - Using inputs from CHARMM-GUI solution builder
  3. AlphaFold2+MD Open In Colab - Using AlphaFold2_mmseqs2 to generate protein model + MD simulation using AMBER to generate topology and to build simulation box

UPDATE (October 2021)

  1. Protein-Ligand simulations Open In Colab - Using AMBER to generate topology and to build the simulation box and for the ligand using GAFF2 force field
  2. Using AMBER Inputs Open In Colab - Using inputs from AMBER suite of biomolecular simulation program
  3. Using GROMACS Inputs Open In Colab - Using inputs from GROMACS biomolecular simulation package (AMBER, CHARMM and OPLS force fields are compatible)

UPDATE (March 2022)

  1. RESP Partial Charges Open In Colab - Using a SMILES as input and outputs a mol2 file with RESP derived partial charges. Options for setting method (HF, B3LYP, ...), basis set (3-21G, 6-31G*) and singlepoint or geometry optimization are available
  2. Small Molecules MD Open In Colab - Using a SMILES as a input, geometry optimization with TorchANI and topology with AMBER (GAFF2 force field)
  3. GLYCAM Open In Colab - Using inputs from GLYCAM server

UPDATE (August 2022)

  1. DRUDE Open In Colab - Using inputs from CHARMM-GUI Drude Prepper

UPDATE (March 2024)

  1. Protein-Membrane simulations Open In Colab - Using OpenFF to generate the topology and build the simulation box for protein-membrane systems with AMBER force fields.
  2. Martini+cg2all Open In Colab - Utilizing Vermouth, the Python library that powers Martinize2, to generate the topology and build the simulation box for protein systems using Martini force fields. Additionally, employing cg2at to predict all-atom trajectories from coarse-grained (CG) representations.
  3. AMBER Mutations Open In Colab - Performing mutations on protein/nucleic acid systems and utilizing AMBER to generate the topology and build the simulation box.

Tired of "just" running molecular simulations and want to try something new? Gabriel Monteiro da Silva (@GMondaSilva) and I are thrilled to share a Colab notebook for running the subsampled AlphaFold2 approach for predicting protein conformational ensembles. Check it out: Open In Colab


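Bugs

If you encounter any bugs, please report the issue to https://github.com/pablo-arantes/making-it-rain/issues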

Acknowledgments

  • We would like to thank the Psi4 team for developing an excellent and open source suite of ab initio quantum chemistry.
  • We would like to thank the Roitberg team for developing the fantastic TorchANI.
  • We would like to thank the OpenMM team for developing an excellent and open source engine.
  • We would like to thank the AlphaFold team for developing an excellent model and open sourcing the software.
  • We would like to thank the ChemosimLab (@ChemosimLab) team for their incredible ProLIF (Protein-Ligand Interaction Fingerprints) tool.
  • Credits to Sergey Ovchinnikov (@sokrypton), Milot Mirdita (@milot_mirdita) and Martin Steinegger (@thesteinegger) for their fantastic ColabFold
  • Making it rain by Pablo R. Arantes (@pablitoarantes), Marcelo D. Polêto (@mdpoleto), Conrado Pedebos (@ConradoPedebos) and Rodrigo Ligabue-Braun (@ligabue_braun).
  • Also, credit to David Koes for his awesome py3Dmol plugin.
  • Finally, we would like to thank Professor Giulia Palermo for her support and thoughtful comments in the development of the present work.

Do you want to cite this work?

Arantes P.R., Depólo Polêto M., Pedebos C., Ligabue-Braun R. Making it rain: cloud-based molecular simulations for everyone. Journal of Chemical Information and Modeling 2021. DOI: 10.1021/acs.jcim.1c00998.


making-it-rain's Issues

Dynamic simulation using Martini force field

Hello

This summer we ran several dynamics simulations (OpenMM/CHARMM) using your notebook on Google Colab.
As said in a previous mail, many thanks for that.
Now we will have to run coarse-grained dynamics (Martini 3.0.0) with a system.top like the one below and, for example, a run script like the /bin/csh one below.

Is there any possibility that a Google Colab notebook for this will be available in the coming weeks? Maybe there is little interest in that? However, running a dynamics simulation on Google Colab is very convenient for us (low CPU, no GPU in the lab).

Thank you for your time and help

Alain

#!/bin/csh

# Generated by CHARMM-GUI (http://www.charmm-gui.org)

# 1) Use Gromacs 5.1 or newer to run these simulations

# Minimization

setenv GMX_MAXCONSTRWARN -1

# step4.0 - soft-core minimization

gmx grompp -f step4.0_minimization.mdp -o step4.0_minimization.tpr -c step3_charmm2gmx.pdb -r step3_charmm2gmx.pdb -p system.top -n index.ndx
gmx mdrun -deffnm step4.0_minimization

# step4.1

gmx grompp -f step4.1_minimization.mdp -o step4.1_minimization.tpr -c step4.0_minimization.gro -r step3_charmm2gmx.pdb -p system.top -n index.ndx
gmx mdrun -deffnm step4.1_minimization
unsetenv GMX_MAXCONSTRWARN

# Equilibration

gmx grompp -f step4.2_equilibration.mdp -o step4.2_equilibration.tpr -c step4.1_minimization.gro -r step3_charmm2gmx.pdb -p system.top -n index.ndx
gmx mdrun -deffnm step4.2_equilibration

# Production

gmx grompp -f step5_production.mdp -o step5_production.tpr -c step4.2_equilibration.gro -p system.top -n index.ndx
gmx mdrun -deffnm step5_production

system.top
#include "toppar/martini_v3.0.0.itp"
#include "toppar/martini_v3.0.0_ions_v1.itp"
#include "toppar/martini_v3.0.0_nucleobases_v1.itp"
#include "toppar/martini_v3.0.0_phospholipids_v1.itp"
#include "toppar/martini_v3.0.0_phospholipids_v1_matthieu.itp"
#include "toppar/martini_v3.0.0_small_molecules_v1.itp"
#include "toppar/martini_v3.0.0_solvents_v1.itp"
#include "toppar/martini_v3.0.0_sugars_v1.itp"
#include "p15c2_proa.itp"
#include "p15c2_prob.itp"
#include "p15c2_proc.itp"

[ system ]
; name
Martini system

[ molecules ]
; name number
PROA 1
PROB 1
PROC 1
W 38366
NA 428
CL 426


name 'vol' is not defined

Hi.
Thank you for your great repository. I have used AMBER notebook several times in the past few days, and it worked fine. But now, when I even try the example PDB ID file 1AKI, it returns an error about the vol of the file.
Name 'vol' is not defined.
Please help me to fix this error

Error during parametrization

Hi. The error that I'm reporting here has been seen with many protein-ligand pairs.

In the parametrization step (in the Protein_ligand.ipynb notebook), the following error was seen.

NameError                                 Traceback (most recent call last)
<ipython-input-6-b8ef11d5d9db> in <module>()
    158         vol = float(line.split()[1])
    159 
--> 160 vol_lit  = vol * pow(10, -27)
    161 atom_lit = 9.03 * pow(10, 22)
    162 conc = float(Concentration)

NameError: name 'vol' is not defined

Line 158 reads its value from a temporary copy of the part of the leap.log file that contains 'Volume:'. However, the word 'Volume:' is not present in the leap.log file.
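The NameError above is consistent with the loop never finding a 'Volume:' line, so `vol` is never assigned. Below is a minimal sketch of a more defensive read, assuming the log text is available as a string. `read_box_volume` and `require_box_volume` are hypothetical helper names, not part of the notebook, and the sample log line in the usage comment is illustrative only.

```python
import re

# Matches e.g. "  Volume: 282327.938 A^3" and captures the number.
VOLUME_PATTERN = re.compile(r"Volume:\s*([-+]?\d+(?:\.\d*)?)")


def read_box_volume(log_text):
    """Return the last 'Volume:' value found in the log text, or None."""
    volume = None
    for line in log_text.splitlines():
        match = VOLUME_PATTERN.search(line)
        if match:
            volume = float(match.group(1))
    return volume


def require_box_volume(log_text):
    """Like read_box_volume, but fail with a readable message."""
    volume = read_box_volume(log_text)
    if volume is None:
        raise ValueError(
            "No 'Volume:' line found in the LEaP log; the solvation step "
            "probably failed earlier, so inspect leap.log for errors."
        )
    return volume


# Usage (illustrative log text):
# require_box_volume("  Volume: 282327.938 A^3\n")
```

With a guard like this, a failed LEaP run surfaces as a message pointing at leap.log instead of a NameError a few lines later.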

This error was also seen in the issue #5

Attached with this is one of the pdb files which gave this error. (5ES1.txt)
I have used the ligand from this pdb file and made a complex with another protein.

Any help would be deeply appreciated.
Thank you

  • Hemant

stride_time_ps is not defined

I am trying to use the AMBER notebook (https://colab.research.google.com/github/pablo-arantes/Making-it-rain/blob/main/Amber.ipynb) to run a simulation on the PDB file 2XQW. I have followed the instructions throughout the notebook, but when I tried to run the Concatenate and Align Trajectory cell, I got the following error:

image

My runtime disconnected after the "Runs a Production MD simulation (NPT ensemble) after equilibration" step, so I am not sure if this has something to do with the error.

Multiple GPUs

OpenMM has support for multiple GPUs. Is it possible to implement this in making-it-rain, or is it limited by the number of CPUs on Colab?
If so, what is the minimum number of CPUs needed to run multiple GPUs?

Thanks in advance :)

Making it rain on local host

Dear all,

I was wondering whether the Making it rain code can be used on a local host instead of Google Colab. The intention behind my question is to make AMBER and GROMACS even easier for beginners.

As experts, you know how time-consuming and difficult it is for beginners to learn GROMACS and AMBER protein-ligand MD simulation. If the Jupyter scripts could be used on a local host with a GPU, I think it would open a big door.

Any undergraduate, graduate student or researcher would opt for it. It would be very helpful for the whole community.

new issues

Hi,
I get error messages as below just recently;

ModuleNotFoundError Traceback (most recent call last)
in
----> 1 import pybel
2 import rdkit
3 import mdtraj as md
4 from rdkit import Chem
5 from rdkit.Chem import AllChem,Draw

ModuleNotFoundError: No module named 'pybel'


DCD file does not exist

Hi! I am trying to use the Amber.ipynb Colab notebook to run a MD simulation on the structure with PDB ID 5cvw. At the step "Concatenate and align the trajectory", after running everything above it successfully with default settings, I receive the error below:

OSError Traceback (most recent call last)
in ()
38 ref = [template % int(1)]
39
---> 40 u1 = mda.Universe(pdb, flist)
41 u2 = mda.Universe(pdb, ref)
42

5 frames
/usr/local/lib/python3.7/dist-packages/MDAnalysis/coordinates/DCD.py in __init__(self, filename, convert_units, dt, **kwargs)
133 super(DCDReader, self).__init__(
134 filename, convert_units=convert_units, **kwargs)
--> 135 self._file = DCDFile(self.filename)
136 self.n_atoms = self._file.header['natoms']
137

MDAnalysis/lib/formats/libdcd.pyx in MDAnalysis.lib.formats.libdcd.DCDFile.__cinit__()

MDAnalysis/lib/formats/libdcd.pyx in MDAnalysis.lib.formats.libdcd.DCDFile.open()

OSError: DCD file does not exist


I do not know why this is the case, because a file 5cvw_prod_1.dcd has been saved to my Google Drive from the previous cell in the run; can you help me fix this?
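One way to narrow this down is to check which trajectory files actually exist at the paths the cell will try to open. This is a sketch under assumptions: the traceback shows the file names being built as `template % int(1)`, so it assumes a `%`-style template such as `5cvw_prod_%s.dcd`. `split_trajectory_files` is a hypothetical helper, not part of the notebook.

```python
import os


def split_trajectory_files(template, n_strides):
    """Return (found, missing) paths for strides 1..n_strides.

    `template` is a %-style pattern, e.g. "/path/5cvw_prod_%s.dcd".
    """
    expected = [template % stride for stride in range(1, n_strides + 1)]
    found = [path for path in expected if os.path.isfile(path)]
    missing = [path for path in expected if not os.path.isfile(path)]
    return found, missing


# Usage (hypothetical Drive path and stride count):
# found, missing = split_trajectory_files(
#     "/content/drive/MyDrive/5cvw_prod_%s.dcd", 10)
# print("missing:", missing)
```

If the list of missing files is not empty, the number of strides requested in the concatenation cell is larger than the number of strides that finished, or the job name or working directory differs from the one used in the production cell.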

name 'vol' is not defined

Hi,
I'm facing a strange issue. When I use a complex structure (protein with zinc ions and a ligand), I find after a successful simulation that the zinc ions have been removed. The system was freshly prepared through CHARMM-GUI, and I have tried many times, but I receive the same error at the "Parameter to generate topology" step. The error is below.

NameError Traceback (most recent call last)
in ()
158 vol = float(line.split()[1])
159
--> 160 vol_lit = vol * pow(10, -27)
161 atom_lit = 9.03 * pow(10, 22)
162 conc = float(Concentration)

NameError: name 'vol' is not defined

Ash

Gromacs_inputs

I am new to molecular dynamics and I am trying to do some simulations of mutant GPCRs in membranes. To create a system for your Colab notebook for simulation I used GPCRmodsim to get a short simulation and necessary GROMACS files. Unfortunately, when I run the simulation, I get the following error:

Too few fields in [ dihedraltypes ] line: SI OS 1 0.000 3.766 3

Any help would be appreciated.

I have included the gro and top files and also for good measure the FF files.
WTactive.zip
FFfiles.zip

FileNotFoundError:

Hi, I am trying to run notebook 2 with my own files. I don't think there is a path error, because all my files are picked up. The toppar.str is read, but I don't know why nothing can be read from the toppar directory.

FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/MyDrive/p1-charm/open read card unit 10 name toppar/top_all36_prot.rtf'

Any suggestions to handle it.
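The path in the error contains a whole CHARMM command (`open read card unit 10 name toppar/top_all36_prot.rtf`), which suggests that each line of toppar.str was used as a file name as-is. A minimal sketch of extracting only the file paths, assuming the file mixes plain paths with CHARMM `open ... name <path>` and `stream <path>` lines. `toppar_paths` is a hypothetical helper, not the notebook's code, and the second file name in the usage comment is illustrative.

```python
import posixpath


def toppar_paths(toppar_text, base_dir):
    """Collect parameter/topology file paths from toppar.str text."""
    paths = []
    for raw_line in toppar_text.splitlines():
        line = raw_line.split("!")[0].strip()  # '!' starts a CHARMM comment
        if not line or line.startswith("*"):    # skip blanks and title lines
            continue
        tokens = line.split()
        lowered = [token.lower() for token in tokens]
        if lowered[0] == "open" and "name" in lowered:
            index = lowered.index("name") + 1
            if index < len(tokens):
                paths.append(posixpath.join(base_dir, tokens[index]))
        elif lowered[0] == "stream" and len(tokens) == 2:
            paths.append(posixpath.join(base_dir, tokens[1]))
        elif len(tokens) == 1:                  # already a plain path
            paths.append(posixpath.join(base_dir, tokens[0]))
    return paths


# Usage:
# text = ("open read card unit 10 name toppar/top_all36_prot.rtf\n"
#         "read rtf card unit 10\n")
# toppar_paths(text, "/content/drive/MyDrive/p1-charm")
```

Lines such as `read rtf card unit 10` carry no file name and are skipped. Whether the resulting list is the right input for the notebook depends on which toppar.str format it expects, so treat this only as a diagnostic aid.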

Amber Error

Hi @mdpoleto and @pablo-arantes !
Thank you so much for guidance as I have been able to run 18 ns production file till now. But while viewing the trajectory and PCA file I came across some error. So i request you to please guide on it.

ValueError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/matplotlib/axes/_axes.py in _parse_scatter_color_args(c, edgecolors, kwargs, xsize, get_next_color_func)
4238 try: # Is 'c' acceptable as PathCollection facecolors?
-> 4239 colors = mcolors.to_rgba_array(c)
4240 except ValueError:

9 frames
ValueError: Invalid RGBA argument: 0.0

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/matplotlib/axes/_axes.py in _parse_scatter_color_args(c, edgecolors, kwargs, xsize, get_next_color_func)
4240 except ValueError:
4241 if not valid_shape:
-> 4242 raise invalid_shape_exception(c.size, xsize)
4243 # Both the mapping and the RGBA conversion failed: pretty
4244 # severe failure => one may appreciate a verbose feedback.

ValueError: 'c' argument has 36 elements, which is inconsistent with 'x' and 'y' with size 624.

How to Deal Metallo-proteins with an Organic Molecule as a Ligand?

Hello, I found that GAFF failed to parameterize the metal ions and, as a result, the organic molecule (ligand) jumped out of the pocket. On closer inspection of the test run, where I tried both GAFF and OpenFF, both removed the metal ions (Zn in my case). Could you please advise whether there is any other way of handling this using these notebooks?

Simulation accuracy of making it rain pipeline

I was wondering if you could let me know the simulation accuracy of the Making it rain pipeline. A system that I have simulated produced very unexpected data.

I used 5000 cycles of energy minimization and a 2 fs integration time step. After a 10 ns equilibration simulation, I ran a 100 ns MD production. However, the result is a bit weird. I also tried increasing the number of minimization cycles, but it doesn't produce good results. The same system simulated with DESMOND gives an RMSD of around 2.5 angstrom.

Force field amber 14
Tip3P water model, temp 298K, pressure 1 bar

Please see the attachment and check where the problem is arising from?
rmsd_ca

ImportError: cannot import name '_openmm' from 'openmm'

Started getting this error yesterday after using the CHARMM-GUI notebook for months. Any ideas what's changed? Thanks in advance!

Here's the entire error message:

ImportError Traceback (most recent call last)
in ()
17 sys.path.append('/usr/local/lib/python3.7/site-packages/')
18 from biopandas.pdb import PandasPdb
---> 19 import openmm as mm
20 from openmm import *
21 from openmm.app import *

/usr/local/lib/python3.7/site-packages/openmm/openmm.py in ()
11 # Import the low-level C/C++ module
12 if __package__ or "." in __name__:
---> 13 from . import _openmm
14 else:
15 import _openmm

ImportError: cannot import name '_openmm' from 'openmm' (/usr/local/lib/python3.7/site-packages/openmm/__init__.py)


Protein_ligand.ipynb pybel issue

Hi,

I was running the Protein_ligand notebook.

While providing the input files, there was an error message saying that pybel cannot be imported.

Also, when I tried "Enumerate Stereoisomers to generate ligand topology", I saw the error message below:


AttributeError Traceback (most recent call last)
in
8 ##@markdown You can find the smiles for your lingad at: https://pubchem.ncbi.nlm.nih.gov/
9
---> 10 mol= [m for m in pybel.readfile(filename=ligand_pdb2, format='pdb')][0]
11 mol.calccharges
12 mol.addh()

AttributeError: module 'pybel' has no attribute 'readfile'
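An imported `pybel` module without `readfile` suggests that Python picked up a different package that happens to share the name, rather than the Open Babel bindings. In Open Babel 3 the bindings are imported as `from openbabel import pybel`. A minimal sketch of a check that accepts only a module exposing `readfile`; `import_openbabel_pybel` is a hypothetical helper, not part of the notebook.

```python
import importlib


def import_openbabel_pybel():
    """Return the Open Babel pybel module, or None if it is unavailable.

    Tries the Open Babel 3 location first, then the legacy top-level name,
    and rejects any module that does not provide `readfile`.
    """
    for module_name in ("openbabel.pybel", "pybel"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        if hasattr(module, "readfile"):
            return module
    return None


pybel = import_openbabel_pybel()
if pybel is None:
    print("Open Babel's pybel is not available in this environment.")
```

If the helper returns None even though a `pybel` package is installed, the installed package is most likely not Open Babel's, and the fix is on the installation side rather than in the ligand cell.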

Could you look into this issue?

Best,

RuntimeError: n_atoms = 0

Hi @pablo-arantes

I got this issue "RuntimeError: n_atoms = 0: make sure to load correct Topology filename or load supported topology (pdb, amber parm, psf, ...)" for the cell "Runs an Equilibration MD simulation (NPT ensemble)".

I used all the file given by Solution Builder.


Thanks for your response,

Have a nice day.

Install dependencies - Protein_ligand.ipynb

when i install the dependencies i get this message :

WARNING:openff.toolkit.utils.toolkit_registry:
Warning: Unable to load toolkit 'OpenEye Toolkit'. The Open Force Field Toolkit does not require the OpenEye Toolkits, and can use RDKit/AmberTools instead. However, if you have a valid license for the OpenEye Toolkits, consider installing them for faster performance and additional file format support: https://docs.eyesopen.com/toolkits/python/quickstart-python/linuxosx.html OpenEye offers free Toolkit licenses for academics: https://www.eyesopen.com/academic-licensing
WARNING:root:Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.

CHARMM_GUI ERROR

While running Equilibration MD simulation (NPT ensemble), I faced some error. I request you to please guide in resolving the issue. I am hereby copying the error from my notebook.

Simulation details:

Job name = /content/drive/MyDrive/MD/NOD1_equil
Coordinate file = /content/drive/MyDrive/MD/step3_pbcsetup.crd
PDB file = /content/drive/MyDrive/MD/step3_pbcsetup.pdb
Topology file = /content/drive/MyDrive/MD/step3_pbcsetup.psf
Force Field files = ()

Simulation_time = 100.0 ps
Integration timestep = 2 fs
Total number of steps = 50000

Save coordinates each 100 ps
Print in log file each 10 ps

Temperature = 298.0 K
Pressure = 1.0 bar

Setting the system:

- Reading topology and structure file...
- Setting box (using user's information)...
- Creating system and setting parameters...

KeyError Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/openmm/app/charmmpsffile.py in loadParameters(self, parmset)
618 else:
--> 619 atype = parmset.atom_types_str[atom.attype]
620 except KeyError:

KeyError: 'NH3'

During handling of the above exception, another exception occurred:

MissingParameter Traceback (most recent call last)
2 frames
in ()
164
165 print("\t- Creating system and setting parameters...")
--> 166 system = psf.createSystem(charmm_params, nonbondedMethod=PME, nonbondedCutoff=1.2*nanometers, switchDistance=1.0*nanometer, ewaldErrorTolerance = 0.0005, constraints=HBonds)
167
168 print("\t- Applying restraints. Force Constant = " + str(Force_constant) + "kJ/mol")

/usr/local/lib/python3.7/site-packages/openmm/app/charmmpsffile.py in createSystem(self, params, nonbondedMethod, nonbondedCutoff, switchDistance, constraints, rigidWater, implicitSolvent, implicitSolventKappa, implicitSolventSaltConc, temperature, soluteDielectric, solventDielectric, removeCMMotion, hydrogenMass, ewaldErrorTolerance, flexibleConstraints, verbose, gbsaModel, drudeMass)
872 """
873 # Load the parameter set
--> 874 self.loadParameters(params)
875 hasbox = self.topology.getUnitCellDimensions() is not None
876 # Check GB input parameters

/usr/local/lib/python3.7/site-packages/openmm/app/charmmpsffile.py in loadParameters(self, parmset)
620 except KeyError:
621 raise MissingParameter('Could not find atom type for %s' %
--> 622 atom.attype)
623 atom.type = atype
624 # Change to string attype to look up the rest of the parameters

MissingParameter: Could not find atom type for NH3

Small problems running the script and information

Hi @pablo-arantes

I have some very noob questions, but I thought of adding them here as other users may have the same problems.

So, in the production protocol, to be able to continue the simulation after Google has withdrawn the GPU, it is a good idea to set a higher "Number_of_strides", so that the strides already performed are saved, I suppose, in the default file 1aki_equil.rst. My problem is that if I start this step again (after the 9 h stop cycle forced by Google), Colab reruns it from the beginning and I lose what was achieved before (around 5% of the run). Should we change something in the config file so that the notebook detects that some data have already been calculated?
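The resume logic asked about here can be sketched as "find the first stride whose restart file is not on disk yet". This is only an illustration: the `<jobname>_<stride>.rst` naming is an assumption modelled on the file names mentioned in these issues, and `next_stride` is a hypothetical helper, not the notebook's implementation.

```python
import os


def next_stride(jobname, n_strides, suffix=".rst"):
    """Return the first 1-based stride without a restart file.

    Returns None when every stride up to n_strides already has one.
    """
    for stride in range(1, n_strides + 1):
        if not os.path.isfile(f"{jobname}_{stride}{suffix}"):
            return stride
    return None


# Usage (hypothetical job name on Google Drive):
# start = next_stride("/content/drive/MyDrive/1aki_prod", 20)
# if start is None:
#     print("All strides are done.")
# else:
#     print(f"Resume from stride {start}.")
```

For this to help after a disconnect, the restart files have to be written to persistent storage such as Google Drive, not to the temporary Colab disk.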

Another problem is that if I close the web browser and come back to the same notebook after Google disconnects me, I have to set up the environment again (installation of py3Dmol, etc.). Is this normal?

Finally, I am getting 16 ns/day from Colab with a GPU (~200K atoms, using the CHARMM script). I can see from your manuscript that this value should be a little higher (I understand that this has nothing to do with your software, but rather with Google's free tier). Do you have any tips on how to improve this?
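The numbers quoted above (16 ns/day and a 9 h session limit) are enough for a rough planning estimate. The sketch below is plain arithmetic with hypothetical function names. It ignores setup time and assumes constant throughput, so real sessions will yield somewhat less.

```python
import math


def ns_per_session(ns_per_day, session_hours):
    """Simulated time (ns) reachable in one session at a given throughput."""
    return ns_per_day * session_hours / 24.0


def sessions_needed(total_ns, ns_per_day, session_hours):
    """Number of sessions required to reach total_ns, rounded up."""
    return math.ceil(total_ns / ns_per_session(ns_per_day, session_hours))


print(ns_per_session(16, 9))        # 6.0 ns per 9 h session
print(sessions_needed(100, 16, 9))  # 17 sessions for 100 ns
```

In practice this suggests choosing a stride length comfortably below about 6 ns for a system of this size, so that at least one stride completes and is saved in every session.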

Thank you very much for this wonderful python script, I was running my simulation with NAMD2 but it takes a lot of time and was almost giving up. Your code really helps a lot.

Regards.

Install dependencies

ImportError: cannot import name '_openmm' from partially initialized module 'openmm' (most likely due to a circular import) (/usr/local/lib/python3.7/site-packages/openmm/__init__.py)
1

UsageError: Line magic function `%%capture` not found

Debian Bullseye/Jupyterlab

Copied making-it-rain-main files and sub-directories to Miniconda/envs/Jupyterlab/CC. Opened CHARMM_GUI in Notebook. Here are the results:

Open In Colab
Hello there!

This is a Jupyter notebook for running Molecular Dynamics (MD) simulations using OpenMM engine and AMBER force field for Protein and Ligand systems. This notebook is a supplementary material of the paper "Making it rain: Cloud-based molecular simulations for everyone" (link here) and we encourage you to read it before using this pipeline.

The main goal of this notebook is to demonstrate how to harness the power of cloud-computing to run microsecond-long MD simulations in a cheap and yet feasible fashion.

This notebook is NOT a standard protocol for MD simulations! It is just simple MD pipeline illustrating each step of a simulation protocol.

Bugs

If you encounter any bugs, please report the issue to https://github.com/pablo-arantes/making-it-rain/issues

Acknowledgments

We would like to thank the OpenMM team for developing an excellent and open source engine.

We would like to thank the ChemosimLab (@ChemosimLab) team for their incredible ProLIF (Protein-Ligand Interaction Fingerprints) tool.

A Making-it-rain by Pablo R. Arantes (@pablitoarantes), Marcelo D. Polêto (@mdpoleto), Conrado Pedebos (@ConradoPedebos) and Rodrigo Ligabue-Braun (@ligabue_braun).

Also, credit to David Koes for his awesome py3Dmol plugin.

For related notebooks see: Making-it-rain

Introduction

In general, MD simulations rely on 1) a set of atomic coordinates of all atoms on a simulation box and 2) a set of force field parameters that describes the interaction energies between atoms.

In terms of inputs, we will need:

A .pdb file of the protein and a .pdb file of the ligand containing a set of atomic coordinates.

In this notebook, we will simulate PDB 3HTB. To build our simulation box, we will use LEaP program (https://ambermd.org/tutorials/pengfei/index.php). The LEaP program is a portal between many chemical structure file types (.pdb and .mol2, primarily), and the Amber model parameter file types such as .lib, .prepi, parm.dat, and .frcmod. Each of the parameter files contains pieces of information needed for constructing a simulation, whether for energy minimization or molecular dynamics. LEaP functions within a larger workflow described in Section 1.1 of the Amber Manual.

To build ligand topology we will use general AMBER force field (GAFF - http://ambermd.org/antechamber/gaff.html) and The Open Force Field Toolkit (OpenFF - https://openforcefield.org/). GAFF is compatible with the AMBER force field and it has parameters for almost all the organic molecules made of C, N, O, H, S, P, F, Cl, Br and I. As a complete force field, GAFF is suitable for study of a great number of molecules in an automatic fashion. The Open Force Field Toolkit, built by the Open Force Field Initiative, is a Python toolkit for the development and application of modern molecular mechanics force fields based on direct chemical perception and rigorous statistical parameterization methods.

You can download the input files examples from here;

Setting the environment for MD calculation

Firstly, we need to install all necessary libraries and packages for our simulation. The main packages we will be installing are:

Anaconda (https://docs.conda.io/en/latest/miniconda.html)
OpenMM (https://openmm.org/)
PyTraj (https://amber-md.github.io/pytraj/latest/index.html)
py3Dmol (https://pypi.org/project/py3Dmol/)
ProLIF (https://github.com/chemosim-lab/ProLIF)
Numpy (https://numpy.org/)
Matplotlib (https://matplotlib.org/)
AmberTools (https://ambermd.org/AmberTools.php)

#@title Install dependencies

#@markdown It will take a few minutes, please, drink a coffee and wait. ;-)

%%capture

import sys

!pip -q install py3Dmol 2>&1 1>/dev/null

!pip install --upgrade MDAnalysis 2>&1 1>/dev/null

!pip install git+https://github.com/pablo-arantes/biopandas 2>&1 1>/dev/null

!pip install rdkit-pypi

!pip install Cython

!git clone https://github.com/pablo-arantes/ProLIF.git

prolif1 = "cd /content/ProLIF"

prolif2 = "sed -i 's/mdanalysis.*/mdanalysis==2.0.0/' setup.cfg"

prolif3 = "pip install ."

original_stdout = sys.stdout # Save a reference to the original standard output

with open('prolif.sh', 'w') as f:

    sys.stdout = f # Change the standard output to the file we created.

    print(prolif1)

    print(prolif2)

    print(prolif3)

    sys.stdout = original_stdout # Reset the standard output to its original value

!chmod 700 prolif.sh 2>&1 1>/dev/null

!bash prolif.sh >/dev/null 2>&1

# install conda

!wget -qnc https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

!bash Miniconda3-latest-Linux-x86_64.sh -bfp /usr/local 2>&1 1>/dev/null

!rm -r Miniconda3-latest-Linux-x86_64.sh /content/ProLIF prolif.sh

!conda install -y -q -c conda-forge openmm=7.6 python=3.7 pdbfixer 2>&1 1>/dev/null

!conda install -c conda-forge ambertools --yes 2>&1 1>/dev/null

!conda install -c ambermd pytraj --yes 2>&1 1>/dev/null

!conda install -c conda-forge parmed --yes 2>&1 1>/dev/null

!conda install -c conda-forge openff-toolkit --yes 2>&1 1>/dev/null

!conda install -c bioconda pybel --yes

!conda install -c openbabel openbabel --yes

#load dependencies

sys.path.append('/usr/local/lib/python3.7/site-packages/')

from openmm import app, unit

from openmm.app import HBonds, NoCutoff, PDBFile

from openff.toolkit.topology import Molecule, Topology

from openff.toolkit.typing.engines.smirnoff import ForceField

from openff.toolkit.utils import get_data_file_path

import parmed as pmd

from biopandas.pdb import PandasPdb

import openmm as mm

from openmm import *

from openmm.app import *

from openmm.unit import *

import os

import urllib.request

import numpy as np

import MDAnalysis as mda

import py3Dmol

import pytraj as pt

import platform

import scipy.cluster.hierarchy

from scipy.spatial.distance import squareform

import scipy.stats as stats

import matplotlib.pyplot as plt

import pandas as pd

from scipy.interpolate import griddata

import seaborn as sb

from statistics import mean, stdev

from pytraj import matrix

from matplotlib import colors

from IPython.display import set_matplotlib_formats

!wget https://raw.githubusercontent.com/openforcefield/openff-forcefields/master/openforcefields/offxml/openff_unconstrained-2.0.0.offxml 2>&1 1>/dev/null

UsageError: Line magic function %%capture not found.
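`%%capture` is an IPython cell magic, and IPython recognizes a cell magic only on the first line of a cell. In the pasted cell it comes after two `#@` comment lines, which is one plausible reason for this error outside Colab. Moving `%%capture` to the first line is the simplest thing to try. As an alternative that does not depend on cell magics at all, output can be silenced with the standard library. This is a sketch, with `run_quietly` as a hypothetical helper name.

```python
import contextlib
import io
import subprocess
import sys


def run_quietly(command):
    """Run a command given as a list of arguments, hide its output,
    and return its exit code."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode


# Capture anything printed by Python code inside the block.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    print("this line is captured, not shown")
captured = buffer.getvalue()

# Run an external command without echoing its output.
exit_code = run_quietly([sys.executable, "-c", "print('hidden')"])
```

The same `run_quietly` call could wrap an installation command, for example `[sys.executable, "-m", "pip", "install", "py3Dmol"]`, in place of a `!pip ... 1>/dev/null` line. Checking the returned exit code then shows whether the step succeeded.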

Please advise.

Thanks in advance.

Can the simulation box be a rhombic dodecahedron?

Hi

Prior to using OpenMM, I have been using GROMACS and there's absolutely no doubt that OMM is way faster than GROMACS.
However, I was just curious to know if we can optimize the box dimensions to further increase the simulation rates.

error running AF2 prediction

AF2 module gives the following error:


ModuleNotFoundError Traceback (most recent call last)
in ()
94 import sys
95 sys.path.append('/usr/local/lib/python3.7/site-packages/')
---> 96 from biopandas.pdb import PandasPdb
97 import os
98 import urllib.request

ModuleNotFoundError: No module named 'biopandas'


NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.

Problems concatenating trajectory

Hi Make-it-rain guys,

First of all, thanks for this awesome tool! I loved the idea!

I'm running a simulation using AMBER input, which I generated before using leap.

Everything was ok in the colab up to the point where I am trying to concatenate the trajectories. In this step, I have the following error message:


IndexError Traceback (most recent call last)

in ()
20
21 trajlist = pt.load(flist, pdb, stride=stride_traj)
---> 22 traj_image = trajlist.iterframe(autoimage=True, rmsfit=0)
23 traj_write = pt.write_traj(traj_end, traj_image, overwrite=True)
24 traj_load = pt.load(traj_end, pdb)

1 frames

/usr/local/lib/python3.7/site-packages/pytraj/trajectory/trajectory.py in __getitem__(self, index)
303 """
304 if self.n_frames == 0:
--> 305 raise IndexError("Your Trajectory is empty, how can I index it?")
306
307 if is_int(index):

IndexError: Your Trajectory is empty, how can I index it?


It does not seem to be system dependent, since I have already tried two different systems with the same issue. Any ideas of what might be going on here?

Thanks.

Can't Install dependencies

RuntimeError: This event loop is already running

We have previously run this code successfully. However, we are facing some issues now.
Whenever we try to install the dependencies, we end up with a RuntimeError.

How can we resolve this?

Thank you in advance

openmm circular import

Hi,

While it was working perfectly a few days ago, I have recently started getting the error messages below:


ImportError Traceback (most recent call last)
in
17 sys.path.append('/usr/local/lib/python3.7/site-packages/')
18 from biopandas.pdb import PandasPdb
---> 19 import simtk.openmm as mm
20 import simtk.openmm.app as app
21 from simtk.openmm import unit

2 frames
/usr/local/lib/python3.7/site-packages/openmm/openmm.py in
11 # Import the low-level C/C++ module
12 if package or "." in name:
---> 13 from . import _openmm
14 else:
15 import _openmm

ImportError: cannot import name '_openmm' from partially initialized module 'openmm' (most likely due to a circular import) (/usr/local/lib/python3.7/site-packages/openmm/__init__.py)


UsageError Protein_ligand tool: WARNING:openff.toolkit.utils.toolkit_registry:Warning: Unable to load toolkit 'OpenEye Toolkit'. The Open Force Field Toolkit does not require the OpenEye Toolkits, and can use RDKit/AmberTools instead. However, if you have a valid license for the OpenEye Toolkits, consider installing them for faster performance and additional file format support: https://docs.eyesopen.com/toolkits/python/quickstart-python/linuxosx.html OpenEye offers free Toolkit licenses for academics: https://www.eyesopen.com/academic-licensing WARNING:root:Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.

Cordial greetings. I get the following error when installing the tool in Google Colab, which prevents me from running the calculations (capture attached):

#@title Install dependencies
#@markdown It will take a few minutes, please, drink a coffee and wait. ;-)
%%capture
import sys
!pip -q install py3Dmol 2>&1 1>/dev/null
!pip install --upgrade MDAnalysis 2>&1 1>/dev/null
!pip install git+https://github.com/pablo-arantes/biopandas 2>&1 1>/dev/null
!pip install rdkit-pypi
!pip install Cython
!git clone https://github.com/pablo-arantes/ProLIF.git
prolif1 = "cd /content/ProLIF"
prolif2 = "sed -i 's/mdanalysis.*/mdanalysis==2.0.0/' setup.cfg"
prolif3 = "pip install ."

original_stdout = sys.stdout # Save a reference to the original standard output

with open('prolif.sh', 'w') as f:
    sys.stdout = f # Change the standard output to the file we created.
    print(prolif1)
    print(prolif2)
    print(prolif3)
    sys.stdout = original_stdout # Reset the standard output to its original value

!chmod 700 prolif.sh 2>&1 1>/dev/null
!bash prolif.sh >/dev/null 2>&1

# install conda

!wget -qnc https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
!bash Miniconda3-latest-Linux-x86_64.sh -bfp /usr/local 2>&1 1>/dev/null
!rm -r Miniconda3-latest-Linux-x86_64.sh /content/ProLIF prolif.sh
!conda install -y -q -c conda-forge openmm=7.6 python=3.7 pdbfixer 2>&1 1>/dev/null
!conda install -c conda-forge ambertools --yes 2>&1 1>/dev/null
!conda install -c ambermd pytraj --yes 2>&1 1>/dev/null
!conda install -c conda-forge parmed --yes 2>&1 1>/dev/null
!conda install -c conda-forge openff-toolkit --yes 2>&1 1>/dev/null
!conda install -c bioconda pybel --yes
!conda install -c openbabel openbabel --yes

#load dependencies
sys.path.append('/usr/local/lib/python3.7/site-packages/')
from openmm import app, unit
from openmm.app import HBonds, NoCutoff, PDBFile
from openff.toolkit.topology import Molecule, Topology
from openff.toolkit.typing.engines.smirnoff import ForceField
from openff.toolkit.utils import get_data_file_path
import parmed as pmd
from biopandas.pdb import PandasPdb
import openmm as mm
from openmm import *
from openmm.app import *
from openmm.unit import *
import os
import urllib.request
import numpy as np
import MDAnalysis as mda
import py3Dmol
import pytraj as pt
import platform
import scipy.cluster.hierarchy
from scipy.spatial.distance import squareform
import scipy.stats as stats
import matplotlib.pyplot as plt
import pandas as pd
from scipy.interpolate import griddata
import seaborn as sb
from statistics import mean, stdev
from pytraj import matrix
from matplotlib import colors
from IPython.display import set_matplotlib_formats
!wget https://raw.githubusercontent.com/openforcefield/openff-forcefields/master/openforcefields/offxml/openff_unconstrained-2.0.0.offxml 2>&1 1>/dev/null

WARNING:openff.toolkit.utils.toolkit_registry:Warning: Unable to load toolkit 'OpenEye Toolkit'. The Open Force Field Toolkit does not require the OpenEye Toolkits, and can use RDKit/AmberTools instead. However, if you have a valid license for the OpenEye Toolkits, consider installing them for faster performance and additional file format support: https://docs.eyesopen.com/toolkits/python/quickstart-python/linuxosx.html OpenEye offers free Toolkit licenses for academics: https://www.eyesopen.com/academic-licensing
WARNING:root:Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.

Molecular Simulation on Protein Ligand Complex in Gromacs

Hi,

First of all, thank you very much for this awesome work. However, I am only familiar with GROMACS. I see that there is a Gromacs_input file, but it handles the protein only. I would like to know how to simulate a protein-ligand complex in GROMACS with this method.

Thank you

Query on use of force fields.

Respected Sir,
Based on your advice, I increased the number of strides. However, when running the cell that builds the topology of the molecule, I wanted to use the amber03 (ff94/ff99) force field, because of its extensive modifications that provide a better QM solvent continuum model. I therefore edited the code to source that force field. However, every time I run it, it fails to generate the topology and reports that the ligand_gaff.pdb file could not be found. I would like to request that you consider adding the amber03 force field too.
Regards

Error while incorporating metal atoms in protein MD

I found the notebook to be working incredibly well for protein-ligand simulation when the ligand was an organic molecule.

Here is what I had done before I came across an error. I tried to simulate an APO protein with just Zn atoms. My input files were the protein.pdb and a pdb file containing just the Zn atoms (ligand.pdb). I was getting the following error in the parametrization step.

cat: /content/drive/MyDrive/Source/ligand_gaff.pdb: No such file or directory
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-6-b8ef11d5d9db> in <module>()
    158         vol = float(line.split()[1])
    159 
--> 160 vol_lit  = vol * pow(10, -27)
    161 atom_lit = 9.03 * pow(10, 22)
    162 conc = float(Concentration)

NameError: name 'vol' is not defined

Line #158 reads from a temporary copy of the part of the leap.log file that contains 'Volume:'. However, the word 'Volume:' is not present in the leap.log file.

Any help would be greatly appreciated.
Hemant
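
Note: a defensive way to read the box volume, sketched under assumptions. The traceback suggests the cell pulls a 'Volume:' line out of leap.log and converts it to a float. The function name `read_box_volume` and the sample log text below are hypothetical; only the 'Volume:' keyword and the `pow(10, -27)` factor come from the report above. Returning `None` when the keyword is absent lets the cell stop with a clear message instead of reaching the `NameError` a few lines later.

```python
import re

# Matches e.g. "Volume: 216000.000 A^3" and captures the number.
VOLUME_RE = re.compile(r"Volume:\s*(\d+(?:\.\d+)?)")

def read_box_volume(log_text):
    """Return the last 'Volume:' value found in the log text, or None."""
    matches = VOLUME_RE.findall(log_text)
    return float(matches[-1]) if matches else None

# Hypothetical log excerpt, for illustration only.
sample_log = (
    "  Total vdw box size: 60.0 60.0 60.0 angstroms.\n"
    "  Volume: 216000.000 A^3\n"
)

vol = read_box_volume(sample_log)
if vol is None:
    raise RuntimeError("No 'Volume:' line in leap.log; check that LEaP built the box.")

# Cubic angstroms to litres, the same factor used in the traceback above.
vol_lit = vol * 10 ** -27
```

If the check fails, the earlier `cat: ... ligand_gaff.pdb: No such file or directory` message is probably the place to look first, since a missing ligand file may have stopped LEaP before it built the solvent box.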

NameError: name 'firststride' is not defined

Hi,
I have completed a simulation in 10 strides of 1 ns each. When I try to concatenate them, I get
"NameError: name 'firststride' is not defined".
Am I making a mistake?

Unable to open output file

Hello.

  1. How do I visualize the output of the simulation? I am unable to open the output xtc, pdb, dcd or trr files.
  2. Is there any way to generate .tpr files?

Thank you

GROMACS Protein topology defined in another .itp file

This error:

/usr/local/lib/python3.7/site-packages/pytraj/io.py in load_topology(filename, option)
525 if top.n_atoms == 0:
526 raise RuntimeError(
--> 527 'n_atoms = 0: make sure to load correct Topology filename '
528 'or load supported topology (pdb, amber parm, psf, ...)')
529 return top

However in the topol.top file, I have put:

#include "topol_Protein.itp"
#include "topol_Ion2.itp"

The protein topology is defined in "topol_Protein.itp". The error disappears if I paste the contents of topol_Protein.itp directly into topol.top.

It would be great if the software could follow both #include lines above and read the information from the included files.
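
Note: a minimal sketch of the manual workaround described above (pasting the included files into topol.top), automated with the Python standard library. The function name `inline_includes` is hypothetical and is not part of the notebook or of pytraj. It only expands files that sit next to the including file; anything it cannot find (for example force-field files shipped with GROMACS) is left as an `#include` line. Whether the downstream loader accepts the merged file has not been verified here.

```python
import os
import re

INCLUDE_RE = re.compile(r'^\s*#include\s+"([^"]+)"')

def inline_includes(top_path, _seen=None):
    """Return the text of a topology file with local #include files expanded."""
    seen = set() if _seen is None else _seen
    top_path = os.path.abspath(top_path)
    seen.add(top_path)
    base = os.path.dirname(top_path)
    out = []
    with open(top_path) as fh:
        for line in fh:
            m = INCLUDE_RE.match(line)
            target = os.path.abspath(os.path.join(base, m.group(1))) if m else None
            # Expand only files that exist locally and were not expanded before
            # (the second condition guards against include cycles).
            if target and os.path.isfile(target) and target not in seen:
                text = inline_includes(target, seen)
                if text and not text.endswith("\n"):
                    text += "\n"
                out.append(text)
            else:
                out.append(line)
    return "".join(out)
```

A possible use is to write the result to a new file, for example `open("topol_merged.top", "w").write(inline_includes("topol.top"))`, and point the notebook at that file instead.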

Equilibration MD simulation Error

Hello,

Firstly, thanks a lot for making-it-rain, amazing!

I want to run a simulation with my protein using CHARMM but I am getting an error at the "Runs an Equilibration MD simulation (NPT ensemble)" step (see copied error attached). Everything before that works fine. As this protein was embedded in a bilayer, I thought that was the issue. But then, when I repeated the steps using 1AKI (the protein you used in the example) but with files I generated using the solution builder in CHARMM-GUI, it also stopped at the same step. I guess it has to do with the topology or something similar, but I can't figure it out (I'm no expert in MD or coding). Could you please let me know how to get around it?

Thanks a lot in advance.

Best,
Dimitris

Error.txt

RuntimeError: can not find /content/drive/MyDrive/GROMACS_MIR/1aki_prod_1.dcd

RuntimeError                              Traceback (most recent call last)
<ipython-input-26-a2c3b9fd379d> in <module>()
     19 #print(flist)
     20 
---> 21 trajlist = pt.load(flist, pdb, stride=stride_traj)
     22 traj_image = trajlist.iterframe(autoimage=True, rmsfit=0)
     23 traj_write = pt.write_traj(traj_end, traj_image, overwrite=True)

3 frames
/usr/local/lib/python3.7/site-packages/pytraj/io.py in load(filename, top, frame_indices, mask, stride)
    122     # load to TrajectoryIterator object first
    123     # do not use frame_indices_ here so we can optimize the slicing speed
--> 124     traj = load_traj(filename, top, stride=stride)
    125 
    126     # do the slicing and other things if needed.

/usr/local/lib/python3.7/site-packages/pytraj/io.py in load_traj(filename, top, *args, **kwd)
    244 
    245     if 'stride' in kwd:
--> 246         ts._load(filename, stride=kwd['stride'])
    247     elif 'frame_slice' in kwd:
    248         ts._load(filename, frame_slice=kwd['frame_slice'])

/usr/local/lib/python3.7/site-packages/pytraj/trajectory/trajectory_iterator.py in _load(self, filename, top, frame_slice, stride)
    233                 self._frame_slice_list.append(frame_slice)
    234                 super(TrajectoryIterator, self)._load(
--> 235                     fname, top_, frame_slice=fslice)
    236         else:
    237             raise ValueError("filename must a string or a list of string")

pytraj/trajectory/c_traj/c_trajectory.pyx in pytraj.trajectory.c_traj.c_trajectory.TrajectoryCpptraj._load()

/usr/local/lib/python3.7/site-packages/pytraj/utils/check_and_assert.py in ensure_exist(filename)
     78     if not os.path.exists(filename):
     79         txt = "can not find %s" % filename
---> 80         raise RuntimeError(txt)
     81 
     82 

RuntimeError: can not find /content/drive/MyDrive/GROMACS_MIR/1aki_prod_1.dcd

I am getting this error even though I executed the cells above "Concatenate and align the trajectory" after I reconnected.

Thank you
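
Note: a quick pre-check that could be run before the concatenation cell, sketched under assumptions. The `<jobname>_<stride>.dcd` naming is inferred from the path in the error above (`1aki_prod_1.dcd`); the helper names `expected_stride_files` and `missing_stride_files` are hypothetical and are not part of the notebook. The idea is to list every per-stride trajectory the concatenation step will ask for and report the ones that are absent or empty, so that a wrong working directory or job name shows up before pytraj is called.

```python
import os

def expected_stride_files(workdir, jobname, n_strides, ext="dcd"):
    """Build the per-stride trajectory paths, e.g. <workdir>/<jobname>_1.dcd."""
    return [
        os.path.join(workdir, f"{jobname}_{i}.{ext}")
        for i in range(1, n_strides + 1)
    ]

def missing_stride_files(paths):
    """Return the paths that do not exist or are zero bytes long."""
    return [p for p in paths if not os.path.isfile(p) or os.path.getsize(p) == 0]
```

For the case above, `missing_stride_files(expected_stride_files("/content/drive/MyDrive/GROMACS_MIR", "1aki_prod", 1))` would return the one path from the error message if the file is really not on the mounted drive. An empty list would instead point to a mismatch in the variables used by the concatenation cell.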

SyntaxError: invalid syntax

File "/usr/local/lib/python3.7/dist-packages/biopandas/pdb/pandas_pdb.py", line 369
if s := header.split():
^
SyntaxError: invalid syntax

I have been getting this error since yesterday. Is there any solution?
Thank you
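
Note: the `:=` operator (assignment expression, PEP 572) was added in Python 3.8, and the path in the traceback shows a Python 3.7 environment, which would explain the SyntaxError. The sketch below shows a version check and a 3.7-compatible spelling of the same pattern. The function names `supports_walrus` and `first_token` are hypothetical, and the body of `first_token` is only an illustration; what biopandas does after that `if` is not shown in the report.

```python
import sys

def supports_walrus(version_info=sys.version_info):
    """Assignment expressions (:=) need Python 3.8 or newer."""
    return tuple(version_info[:2]) >= (3, 8)

def first_token(header):
    """Python 3.7-compatible form of `if s := header.split(): ...`."""
    s = header.split()  # assign first ...
    if s:               # ... then test, instead of doing both with :=
        return s[0]
    return None
```

In practice the choice is between running the notebook on Python 3.8 or newer and installing a package version that still supports 3.7; which one applies depends on the notebook's other pinned dependencies.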

Append from unfinished simulation

Hi,

Thank you for the software. I have a question about how the simulation is run. As I understand it, "Making it rain" works around the 12h/24h session limits by splitting the run into a stride time and a number of strides. However, I was wondering why not use the restart function within GROMACS, i.e. "gmx mdrun -cpi state.cpt", during the simulation? Currently, my simulation keeps stopping at 1.9 ns (the stride time is 2 ns), and the next restart begins that stride from the start again. I would be very grateful if you could clarify this for me.

Best,

Ben
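
Note: a sketch of how a stride-based run can decide where to resume, under assumptions. From the description above, state appears to be saved only when a stride finishes, so an interrupted stride is repeated in full. The helper `first_incomplete_stride`, the `<jobname>_<stride>.rst` naming and the `rst` extension are all hypothetical and serve only to illustrate the idea; the notebook's real file names may differ.

```python
import os

def first_incomplete_stride(workdir, jobname, n_strides, ext="rst"):
    """Return the 1-based index of the first stride whose restart file is
    missing or empty, or None when every stride has finished."""
    for i in range(1, n_strides + 1):
        path = os.path.join(workdir, f"{jobname}_{i}.{ext}")
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            return i
    return None
```

With this scheme the work lost to a disconnect is at most one stride, so choosing a shorter stride time (for example 0.5 ns instead of 2 ns) reduces the amount that has to be repeated, at the cost of more files to concatenate afterwards.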

Having issue with Ramachandran plot

Sorry, I am new to Colab and was not sure how to troubleshoot this issue. Can you help me with this? The Colab I used is AlphaFold2+MD.
The error I get is NameError: name 'inp' is not defined. I used the amino acid sequence as input and left everything on default.

AttributeError: module 'jaxlib.pocketfft' has no attribute 'pocketfft'

Encountered when executing the Run Prediction cell in the AlphaFold2+MD colab.

AttributeError Traceback (most recent call last)
[<ipython-input in ()
14 del os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"]
15
---> 16 from colabfold.colabfold import plot_protein
17 from pathlib import Path
18 import matplotlib.pyplot as plt

7 frames
[/usr/local/lib/python3.7/dist-packages/jax/_src/lax/fft.py] in ()
143 batching.primitive_batchers[fft_p] = fft_batching_rule
144 if pocketfft:
--> 145 xla.backend_specific_translations['cpu'][fft_p] = pocketfft.pocketfft

Possibility to append/continue simulations

Respected Sir,
I came across your Google Colab notebook (and white paper) while searching for cloud-based computation for protein-ligand simulations. I was impressed by the way you have assembled the entire toolset required for performing simulations. However, when I tried it for my own research, it estimated that a 100 ns protein-ligand simulation would take about 2 days. This is difficult because the maximum runtime Google Colab offers me is 6-12 hours. I therefore wanted to break the simulation into parts and append them from time to time, but I couldn't find any feature that would let me continue the simulation from a checkpoint. Could you please help, if you know how to continue a simulation from where it halted?
