pAPRika tutorial 7 - APR/NAMD with Colvars restraints

In this tutorial, we will perform APR calculations for the butane (BUT)–cucurbit[6]uril (CB6) host-guest system with the Generalized-Born implicit solvent. This tutorial will use NAMD for the simulation and Colvars for the restraints. Although it is possible to run NAMD with Plumed we will not use Plumed in this tutorial. Users should refer to the previous tutorial to see how to use Plumed restraints.

🔵 Since we have a prepared the host-guest-dummy setup from the first tutorial, we will skip the initial tleap steps and go right into initializing the restraints. If you haven’t completed tutorial 1 please go back and run the notebook to generate the starting structure.

Initialize

[1]:
import os

Define names

We will store the files created in this tutorial in a folder called namd so we don’t mix files with the previous tutorial.

[2]:
base_name = "cb6-but-dum"
work_dir = "namd"
complex_dir = "complex"
data = "../../../paprika/data/cb6-but"

Configure APR Restraints

Define anchor atoms

See tutorial 1 for the choice of selection

[3]:
# Guest atoms
G1 = ":BUT@C"
G2 = ":BUT@C3"

# Host atoms
H1 = ":CB6@C"
H2 = ":CB6@C31"
H3 = ":CB6@C18"

# Dummy atoms
D1 = ":DM1"
D2 = ":DM2"
D3 = ":DM3"

Determine the number of windows

Before we add the restraints, it is helpful to set the \(\lambda\) fractions that control the strength of the force constants during attach and release, and to define the distances for the pulling phase.

The attach fractions go from 0 to 1 and we place more points at the bottom of the range to sample the curvature of \(dU/d \lambda\). Next, we generally apply a distance restraint until the guest is ~18 Angstroms away from the host, in increments of 0.4 Angstroms. This distance should be at least twice the Lennard-Jones cutoff in the system. These values have worked well for us, but this is one aspect that should be carefully checked for new systems.

[4]:
import numpy as np
[5]:
attach_string = "0.00 0.40 0.80 1.60 2.40 4.00 5.50 8.65 11.80 18.10 24.40 37.00 49.60 74.80 100.00"
attach_fractions = [float(i) / 100 for i in attach_string.split()]

initial_distance = 6.0
pull_distances = np.arange(0.0 + initial_distance, 18.0 + initial_distance, 1.0)

release_fractions = []

windows = [len(attach_fractions), len(pull_distances), len(release_fractions)]
print(f"There are {windows} windows in this attach-pull-release calculation.")
There are [15, 18, 0] windows in this attach-pull-release calculation.

Load structure

[6]:
import parmed as pmd
  • Load complex structure

[7]:
structure = pmd.load_file(
    os.path.join(complex_dir, f"{base_name}.prmtop"),
    os.path.join(complex_dir, f"{base_name}.rst7"),
    structure = True,
)

Host Static Restraints

See tutorial 1 for an explanation of the static restraints.

[8]:
import paprika.restraints as restraints
[9]:
static_restraints = []
[10]:
r = restraints.static_DAT_restraint(restraint_mask_list = [D1, H1],
                                    num_window_list = windows,
                                    ref_structure = structure,
                                    force_constant = 5.0,
                                    amber_index=True)

static_restraints.append(r)
[11]:
r = restraints.static_DAT_restraint(restraint_mask_list = [D2, D1, H1],
                                    num_window_list = windows,
                                    ref_structure = structure,
                                    force_constant = 100.0,
                                    amber_index=True)

static_restraints.append(r)
[12]:
r = restraints.static_DAT_restraint(restraint_mask_list = [D3, D2, D1, H1],
                                    num_window_list = windows,
                                    ref_structure = structure,
                                    force_constant = 100.0,
                                    amber_index=True)

static_restraints.append(r)
[13]:
r = restraints.static_DAT_restraint(restraint_mask_list = [D1, H1, H2],
                                    num_window_list = windows,
                                    ref_structure = structure,
                                    force_constant = 100.0,
                                    amber_index=True)

static_restraints.append(r)
[14]:
r = restraints.static_DAT_restraint(restraint_mask_list = [D2, D1, H1, H2],
                                    num_window_list = windows,
                                    ref_structure = structure,
                                    force_constant = 100.0,
                                    amber_index=True)

static_restraints.append(r)
[15]:
r = restraints.static_DAT_restraint(restraint_mask_list = [D1, H1, H2, H3],
                                    num_window_list = windows,
                                    ref_structure = structure,
                                    force_constant = 100.0,
                                    amber_index=True)

static_restraints.append(r)

Guest translational and rotational restraints

See tutorial 1 for an explanation of the guest restraints

[16]:
guest_restraints = []
[17]:
r = restraints.DAT_restraint()
r.mask1 = D1
r.mask2 = G1
r.topology = structure
r.auto_apr = True
r.continuous_apr = True
r.amber_index = True

r.attach["target"] = pull_distances[0]          # Angstroms
r.attach["fraction_list"] = attach_fractions
r.attach["fc_final"] = 5.0                      # kcal/mol/Angstroms**2

r.pull["target_final"] = 24.0                   # Angstroms
r.pull["num_windows"] = windows[1]

r.initialize()
guest_restraints.append(r)
[18]:
r = restraints.DAT_restraint()
r.mask1 = D2
r.mask2 = D1
r.mask3 = G1
r.topology = structure
r.auto_apr = True
r.continuous_apr = True
r.amber_index = True

r.attach["target"] = 180.0                      # Degrees
r.attach["fraction_list"] = attach_fractions
r.attach["fc_final"] = 100.0                    # kcal/mol/radian**2

r.pull["target_final"] = 180.0                  # Degrees
r.pull["num_windows"] = windows[1]

r.initialize()
guest_restraints.append(r)
[19]:
r = restraints.DAT_restraint()
r.mask1 = D1
r.mask2 = G1
r.mask3 = G2
r.topology = structure
r.auto_apr = True
r.continuous_apr = True
r.amber_index = True

r.attach["target"] = 180.0                      # Degrees
r.attach["fraction_list"] = attach_fractions
r.attach["fc_final"] = 100.0                    # kcal/mol/radian**2

r.pull["target_final"] = 180.0                  # Degrees
r.pull["num_windows"] = windows[1]

r.initialize()
guest_restraints.append(r)

Create APR windows

We use the guest restraints to create a list of windows with the appropriate names and then create the directories.

[20]:
from paprika.restraints.utils import create_window_list
[21]:
window_list = create_window_list(guest_restraints)
[22]:
if not os.path.isdir(work_dir):
    os.makedirs(work_dir)

for window in window_list:
    folder = os.path.join(work_dir, window)
    if not os.path.isdir(folder):
        os.makedirs(os.path.join(work_dir, window))

Write APR restraints to a Colvars module file

In this section we create an instance of Colvars() from paprika.restraints.colvars, which is a class to generate the Colvars restraint files. This class is analogous to the Plumed class. We need to specify the list of restraints used throughout the APR calculations and the corresponding windows list. In this tutorial we will print the host static restraints and the guest restraints. The Colvars class includes a method to add restraints to dummy atoms (add_dummy_atoms_to_file). We will add the positional restraints on the dummy atoms after generating the structures for each window.

Note: be careful when specifying the force constants in DAT_restraints. We follow the AMBER (and CHARMM) convention where the force constant is already multiplied by a factor of 1/2 but Colvars (and Plumed) requires the user to specify the force constant without this factor, i.e.

\[\begin{split}\begin{align} U_{amber} &= K_{amber} (r-r_{0})^2 \\ U_{colvars} &= \frac{1}{2} k_{colvars} (r - r_{0})^2 \end{align}\end{split}\]

thus \(k_{colvars} = 2 \times K_{amber}\). If AMBER force constants was used in generating the DAT_restraints (the case in this tutorial) we need to set the variable uses_legacy_k to True (this is on by default).

[23]:
from paprika.restraints.colvars import Colvars
[24]:
restraints_list = (static_restraints + guest_restraints)

colvars = Colvars()
colvars.file_name = "colvars.tcl"
colvars.path = work_dir
colvars.window_list = window_list
colvars.restraint_list = restraints_list
colvars_uses_legacy_k = True

colvars.dump_to_file()

Prepare host-guest system

Translate guest molecule

For the attach windows, we will use the initial, bound coordinates for the host-guest complex. Only the force constants change during this phase, so a single set of coordinates is sufficient. For the pull windows, we will translate the guest to the target value, and for the release windows, we will use the coordinates from the final pull window.

[25]:
import shutil
[26]:
for window in window_list:
    if window[0] == "a":
        shutil.copy(os.path.join(complex_dir, f"{base_name}.prmtop"),
                    os.path.join(work_dir, window, f"{base_name}.prmtop"))
        shutil.copy(os.path.join(complex_dir, f"{base_name}.rst7"),
                    os.path.join(work_dir, window, f"{base_name}.rst7"))

    elif window[0] == "p":
        structure = pmd.load_file(
            os.path.join(complex_dir, f"{base_name}.prmtop"),
            os.path.join(complex_dir, f"{base_name}.rst7"),
            structure = True
        )
        target_difference = guest_restraints[0].phase['pull']['targets'][int(window[1:])] -\
                            guest_restraints[0].pull['target_initial']
        print(f"In window {window} we will translate the guest {target_difference.magnitude:0.1f}.")

        for atom in structure.atoms:
            if atom.residue.name == "BUT":
                atom.xz += target_difference.magnitude

        structure.save(os.path.join(work_dir, window, f"{base_name}.prmtop"), overwrite=True)
        structure.save(os.path.join(work_dir, window, f"{base_name}.rst7"), overwrite=True)
In window p000 we will translate the guest 0.0 Angstroms.
In window p001 we will translate the guest 1.1 Angstroms.
In window p002 we will translate the guest 2.1 Angstroms.
In window p003 we will translate the guest 3.2 Angstroms.
In window p004 we will translate the guest 4.2 Angstroms.
In window p005 we will translate the guest 5.3 Angstroms.
In window p006 we will translate the guest 6.4 Angstroms.
In window p007 we will translate the guest 7.4 Angstroms.
In window p008 we will translate the guest 8.5 Angstroms.
In window p009 we will translate the guest 9.5 Angstroms.
In window p010 we will translate the guest 10.6 Angstroms.
In window p011 we will translate the guest 11.6 Angstroms.
In window p012 we will translate the guest 12.7 Angstroms.
In window p013 we will translate the guest 13.8 Angstroms.
In window p014 we will translate the guest 14.8 Angstroms.
In window p015 we will translate the guest 15.9 Angstroms.
In window p016 we will translate the guest 16.9 Angstroms.
In window p017 we will translate the guest 18.0 Angstroms.

Add dummy atom restraints

Here we will add positional restraints on the dummy atoms. The reference positions are extracted from the structure files we just created in each window.

[27]:
for window in window_list:
    folder = os.path.join(work_dir, window)

    structure = pmd.load_file(
        os.path.join(folder, f"{base_name}.prmtop"),
        os.path.join(folder, f"{base_name}.rst7"),
        structure = True
    )

    colvars.add_dummy_atom_restraints(structure, window)

Simulation

Now that we have prepared the structure and colvars.tcl file for each window, we have everything ready to go. pAPRika comes with a python wrapper for NAMD that can help set up default parameters for the simulation. There are some high level options that we set directly, like simulation.path, and then we call the function config_gb_min() to setup reasonable default simulation parameters for a minimization and production run. The temperature of the system will be controlled with the Langevin thermostat, which is the default. For running simulations with a different thermostat parse the thermostat of choice using the Thermostat Enum class to config_gb_md(). For example, if we want to run our simulations with the Lowe-Anderson thermostat:

simulation = NAMD()
...
simulation.config_gb_md(NAMD.Thermostat.LoweAnderson)
simulation.run()

See the pAPRika documentation for more details on the different temperature control algorithm implemented.

Note: NAMD version 2 is similar to GROMACS in that more than one thread for every GPU is preferred to deliver better speed. We set the number of processors with simulation.n_threads = 8 and the GPU device with simulation.gpu_devices = 0. Starting with NAMD version 3 (currently in alpha phase) a smaller number of n_threads is required per GPU.

[28]:
from paprika.simulate import NAMD

Initialize logger

[29]:
import logging
from importlib import reload
reload(logging)

logger = logging.getLogger()
logging.basicConfig(
    format='%(asctime)s %(message)s',
    datefmt='%Y-%m-%d %I:%M:%S %p',
    level=logging.INFO
)

Energy Minimization

[30]:
for window in window_list:
    simulation = NAMD()
    simulation.executable = "namd2"
    simulation.n_threads = 4
#     simulation.gpu_devices = 0
    simulation.path = os.path.join(work_dir, window)

    simulation.topology = f"{base_name}.prmtop"
    simulation.coordinates = f"{base_name}.rst7"
    simulation.prefix = "minimize"
    simulation.colvars_file = "colvars.tcl"

    simulation.config_gb_min()

    logging.info(f"Running minimization in window {window}...")
    simulation.run(overwrite=True)
2020-10-09 11:31:02 AM Running minimization in window a000...
2020-10-09 11:31:07 AM Running minimization in window a001...
2020-10-09 11:31:13 AM Running minimization in window a002...
2020-10-09 11:31:20 AM Running minimization in window a003...
2020-10-09 11:31:25 AM Running minimization in window a004...
2020-10-09 11:31:30 AM Running minimization in window a005...
2020-10-09 11:31:34 AM Running minimization in window a006...
2020-10-09 11:31:39 AM Running minimization in window a007...
2020-10-09 11:31:43 AM Running minimization in window a008...
2020-10-09 11:31:48 AM Running minimization in window a009...
2020-10-09 11:31:53 AM Running minimization in window a010...
2020-10-09 11:31:59 AM Running minimization in window a011...
2020-10-09 11:32:03 AM Running minimization in window a012...
2020-10-09 11:32:08 AM Running minimization in window a013...
2020-10-09 11:32:12 AM Running minimization in window p000...
2020-10-09 11:32:16 AM Running minimization in window p001...
2020-10-09 11:32:21 AM Running minimization in window p002...
2020-10-09 11:32:25 AM Running minimization in window p003...
2020-10-09 11:32:30 AM Running minimization in window p004...
2020-10-09 11:32:34 AM Running minimization in window p005...
2020-10-09 11:32:39 AM Running minimization in window p006...
2020-10-09 11:32:43 AM Running minimization in window p007...
2020-10-09 11:32:48 AM Running minimization in window p008...
2020-10-09 11:32:56 AM Running minimization in window p009...
2020-10-09 11:33:01 AM Running minimization in window p010...
2020-10-09 11:33:05 AM Running minimization in window p011...
2020-10-09 11:33:10 AM Running minimization in window p012...
2020-10-09 11:33:14 AM Running minimization in window p013...
2020-10-09 11:33:19 AM Running minimization in window p014...
2020-10-09 11:33:23 AM Running minimization in window p015...
2020-10-09 11:33:28 AM Running minimization in window p016...
2020-10-09 11:33:32 AM Running minimization in window p017...

Thermalization

Here, we will do a slow heating process to demonstrate the custom_run_command feature of the NAMD wrapper. We will heat the system from 50 K to 300 K in steps of 50 K and will do a total of 500 MD steps for each temperature.

[31]:
for window in window_list:
    simulation = NAMD()
    simulation.executable = "namd2"
    simulation.n_threads = 4
#     simulation.gpu_devices = 0
    simulation.path = os.path.join(work_dir, window)

    simulation.topology = f"{base_name}.prmtop"
    simulation.coordinates = f"{base_name}.rst7"
    simulation.prefix = "thermalization"
    simulation.checkpoint = "minimize"
    simulation.colvars_file = "colvars.tcl"

    simulation.config_gb_md()

    # Slow heating
    simulation.custom_run_commands = [
        "for {set temp 50} {$temp <= 300} {incr temp 50} {",
        "langevinTemp $temp",
        "reinitvels $temp",
        "run 500",
        "}",
    ]

    logging.info(f"Running thermalization in window {window}...")
    simulation.run(overwrite=True)
2020-10-09 11:33:37 AM Running thermalization in window a000...
2020-10-09 11:33:39 AM Running thermalization in window a001...
2020-10-09 11:33:40 AM Running thermalization in window a002...
2020-10-09 11:33:42 AM Running thermalization in window a003...
2020-10-09 11:33:44 AM Running thermalization in window a004...
2020-10-09 11:33:46 AM Running thermalization in window a005...
2020-10-09 11:33:47 AM Running thermalization in window a006...
2020-10-09 11:33:49 AM Running thermalization in window a007...
2020-10-09 11:33:51 AM Running thermalization in window a008...
2020-10-09 11:33:53 AM Running thermalization in window a009...
2020-10-09 11:33:54 AM Running thermalization in window a010...
2020-10-09 11:33:56 AM Running thermalization in window a011...
2020-10-09 11:33:58 AM Running thermalization in window a012...
2020-10-09 11:34:00 AM Running thermalization in window a013...
2020-10-09 11:34:02 AM Running thermalization in window p000...
2020-10-09 11:34:06 AM Running thermalization in window p001...
2020-10-09 11:34:09 AM Running thermalization in window p002...
2020-10-09 11:34:13 AM Running thermalization in window p003...
2020-10-09 11:34:15 AM Running thermalization in window p004...
2020-10-09 11:34:17 AM Running thermalization in window p005...
2020-10-09 11:34:18 AM Running thermalization in window p006...
2020-10-09 11:34:20 AM Running thermalization in window p007...
2020-10-09 11:34:22 AM Running thermalization in window p008...
2020-10-09 11:34:24 AM Running thermalization in window p009...
2020-10-09 11:34:25 AM Running thermalization in window p010...
2020-10-09 11:34:27 AM Running thermalization in window p011...
2020-10-09 11:34:29 AM Running thermalization in window p012...
2020-10-09 11:34:30 AM Running thermalization in window p013...
2020-10-09 11:34:32 AM Running thermalization in window p014...
2020-10-09 11:34:34 AM Running thermalization in window p015...
2020-10-09 11:34:36 AM Running thermalization in window p016...
2020-10-09 11:34:37 AM Running thermalization in window p017...

We will skip the equilibration step and go straight to production

Production Run

We will run the production phase for 5,000 steps.

Note: a proper production run would require much longer simulation.

[32]:
for window in window_list:
    simulation = NAMD()
    simulation.executable = "namd2"
    simulation.n_threads = 4
#     simulation.gpu_devices = 0
    simulation.path = os.path.join(work_dir, window)

    simulation.topology = f"{base_name}.prmtop"
    simulation.coordinates = f"{base_name}.rst7"
    simulation.prefix = "production"
    simulation.checkpoint = "thermalization"
    simulation.colvars_file = "colvars.tcl"

    simulation.config_gb_md()
    simulation.control["run"] = 5000

    logging.info(f"Running production in window {window}...")
    simulation.run(overwrite=True)
2020-10-09 11:34:39 AM Running production in window a000...
2020-10-09 11:34:41 AM Running production in window a001...
2020-10-09 11:34:44 AM Running production in window a002...
2020-10-09 11:34:46 AM Running production in window a003...
2020-10-09 11:34:49 AM Running production in window a004...
2020-10-09 11:34:52 AM Running production in window a005...
2020-10-09 11:34:54 AM Running production in window a006...
2020-10-09 11:34:57 AM Running production in window a007...
2020-10-09 11:35:00 AM Running production in window a008...
2020-10-09 11:35:02 AM Running production in window a009...
2020-10-09 11:35:05 AM Running production in window a010...
2020-10-09 11:35:10 AM Running production in window a011...
2020-10-09 11:35:15 AM Running production in window a012...
2020-10-09 11:35:17 AM Running production in window a013...
2020-10-09 11:35:21 AM Running production in window p000...
2020-10-09 11:35:25 AM Running production in window p001...
2020-10-09 11:35:29 AM Running production in window p002...
2020-10-09 11:35:33 AM Running production in window p003...
2020-10-09 11:35:36 AM Running production in window p004...
2020-10-09 11:35:40 AM Running production in window p005...
2020-10-09 11:35:42 AM Running production in window p006...
2020-10-09 11:35:46 AM Running production in window p007...
2020-10-09 11:35:51 AM Running production in window p008...
2020-10-09 11:35:54 AM Running production in window p009...
2020-10-09 11:35:58 AM Running production in window p010...
2020-10-09 11:36:06 AM Running production in window p011...
2020-10-09 11:36:15 AM Running production in window p012...
2020-10-09 11:36:21 AM Running production in window p013...
2020-10-09 11:36:27 AM Running production in window p014...
2020-10-09 11:36:32 AM Running production in window p015...
2020-10-09 11:36:37 AM Running production in window p016...
2020-10-09 11:36:41 AM Running production in window p017...

Analysis

Once the simulation is completed, we can using the Analysis module to determine the binding free energy. We supply the location of the parameter information, a string or list for the file names (wildcards supported), the location of the windows, and the restraints on the guest.

In this example, we use the method ti-block which determines the free energy using thermodynamic iintegration and then estimates the standard error of the mean at each data point using blocking analysis. Bootstrapping it used to determine the uncertainty of the full thermodynamic integral for each phase.

After running compute_free_energy(), a dictionary called results will be populated, that contains the free energy and SEM for each phase of the simulation.

[33]:
import paprika.analysis as analysis
[35]:
free_energy = analysis.fe_calc()
free_energy.topology = "cb6-but-dum.prmtop"
free_energy.trajectory = 'production*.dcd'
free_energy.path = work_dir
free_energy.restraint_list = guest_restraints
free_energy.collect_data()
free_energy.methods = ['ti-block']
free_energy.ti_matrix = "full"
free_energy.boot_cycles = 1000
free_energy.compute_free_energy()

We also need to calculate the free-energy cost of releasing the restraints on the guest molecule.

[36]:
free_energy.compute_ref_state_work([
    guest_restraints[0], guest_restraints[1], None,
    None, guest_restraints[2], None
])

Then we add the free-energies together and combine the uncertainties to get the binding-free energy

[37]:
binding_affinity = -1 * (
    free_energy.results["attach"]["ti-block"]["fe"] + \
    free_energy.results["pull"]["ti-block"]["fe"] + \
    free_energy.results["ref_state_work"]
)

sem = np.sqrt(
    free_energy.results["attach"]["ti-block"]["sem"]**2 + \
    free_energy.results["pull"]["ti-block"]["sem"]**2
)
[38]:
print(f"The binding affinity of butane to cucurbit[6]uril = {binding_affinity.magnitude:0.2f} +/- {sem.magnitude:0.2f} kcal/mol")
The binding affinity of butane to cucurbit[6]uril = -6.16 +/- 4.40 kcal/mol