pAPRika tutorial 4 - APR/OpenMM

In this tutorial, we will perform APR calculations for butane (BUT)–cucurbit[6]uril (CB6). This is a repeat of tutorial 1 using OpenMM instead of AMBER as the simulation engine. We will go through the process of converting APR restraints constructed with pAPRika and AMBER structure files to an OpenMM system.

🔵 Since we have a prepared the host-guest-dummy setup from the first tutorial, we will skip the initial tleap steps and go right into initializing the restraints.

Initialize

[1]:
import os
import json
import shutil
import numpy as np
import parmed as pmd

Initialize logger

[2]:
import logging
from importlib import reload
reload(logging)

logger = logging.getLogger()
logging.basicConfig(
    format='%(asctime)s %(message)s',
    datefmt='%Y-%m-%d %I:%M:%S %p',
    level=logging.INFO
)

Define names

We will store the files created in this tutorial in a folder called openmm so we don’t mix files with the previous tutorial.

[3]:
base_name = "cb6-but-dum"
work_dir = "openmm"
complex_dir = "complex"

Generate APR Restraints

NOTE: The only difference here is to set amber_index to False since OpenMM atom numbering starts from 0.

Define anchor atoms

See tutorial 1 for the choice of selection

[4]:
# Guest atoms
G1 = ":BUT@C"
G2 = ":BUT@C3"

# Host atoms
H1 = ":CB6@C"
H2 = ":CB6@C31"
H3 = ":CB6@C18"

# Dummy atoms
D1 = ":DM1"
D2 = ":DM2"
D3 = ":DM3"

Determine the number of windows

Before we add the restraints, it is helpful to set the \(\lambda\) fractions that control the strength of the force constants during attach and release, and to define the distances for the pulling phase.

The attach fractions go from 0 to 1 and we place more points at the bottom of the range to sample the curvature of \(dU/d \lambda\). Next, we generally apply a distance restraint until the guest is ~18 Angstroms away from the host, in increments of 0.4 Angstroms. This distance should be at least twice the Lennard-Jones cutoff in the system. These values have worked well for us, but this is one aspect that should be carefully checked for new systems.

[5]:
attach_string = "0.00 0.40 0.80 1.60 2.40 4.00 5.50 8.65 11.80 18.10 24.40 37.00 49.60 74.80 100.00"
attach_fractions = [float(i) / 100 for i in attach_string.split()]

initial_distance = 6.0
pull_distances = np.arange(0.0 + initial_distance, 18.0 + initial_distance, 1.0)

release_fractions = []

windows = [len(attach_fractions), len(pull_distances), len(release_fractions)]
logging.info(f"There are {windows} windows in this attach-pull-release calculation.")
2020-08-19 10:30:19 PM There are [15, 18, 0] windows in this attach-pull-release calculation.

Load complex structure

[7]:
structure = pmd.load_file(
    os.path.join(complex_dir, f"{base_name}.prmtop"),
    os.path.join(complex_dir, f"{base_name}.rst7"),
    structure = True,
)

Host Static Restraints

See tutorial 1 for an explanation of the static restraints

[ ]:
import paprika.restraints as restraints
[8]:
static_restraints = []
[9]:
r = restraints.static_DAT_restraint(restraint_mask_list = [D1, H1],
                                    num_window_list = windows,
                                    ref_structure = structure,
                                    force_constant = 5.0,
                                    amber_index=False)

static_restraints.append(r)
[10]:
r = restraints.static_DAT_restraint(restraint_mask_list = [D2, D1, H1],
                                    num_window_list = windows,
                                    ref_structure = structure,
                                    force_constant = 100.0,
                                    amber_index=False)

static_restraints.append(r)
[11]:
r = restraints.static_DAT_restraint(restraint_mask_list = [D3, D2, D1, H1],
                                    num_window_list = windows,
                                    ref_structure = structure,
                                    force_constant = 100.0,
                                    amber_index=False)

static_restraints.append(r)
[12]:
r = restraints.static_DAT_restraint(restraint_mask_list = [D1, H1, H2],
                                    num_window_list = windows,
                                    ref_structure = structure,
                                    force_constant = 100.0,
                                    amber_index=False)

static_restraints.append(r)
[13]:
r = restraints.static_DAT_restraint(restraint_mask_list = [D2, D1, H1, H2],
                                    num_window_list = windows,
                                    ref_structure = structure,
                                    force_constant = 100.0,
                                    amber_index=False)

static_restraints.append(r)
[14]:
r = restraints.static_DAT_restraint(restraint_mask_list = [D1, H1, H2, H3],
                                    num_window_list = windows,
                                    ref_structure = structure,
                                    force_constant = 100.0,
                                    amber_index=False)

static_restraints.append(r)

Guest translational and rotational restraints

See tutorial 1 for an explanation of the guest restraints

[15]:
guest_restraints = []
[16]:
r = restraints.DAT_restraint()
r.mask1 = D1
r.mask2 = G1
r.topology = structure
r.auto_apr = True
r.continuous_apr = True
r.amber_index = False

r.attach["target"] = pull_distances[0]          # Angstroms
r.attach["fraction_list"] = attach_fractions
r.attach["fc_final"] = 5.0                      # kcal/mol/Angstroms**2

r.pull["target_final"] = 24.0                   # Angstroms
r.pull["num_windows"] = windows[1]

r.initialize()
guest_restraints.append(r)
[17]:
r = restraints.DAT_restraint()
r.mask1 = D2
r.mask2 = D1
r.mask3 = G1
r.topology = structure
r.auto_apr = True
r.continuous_apr = True
r.amber_index = False

r.attach["target"] = 180.0                      # Degrees
r.attach["fraction_list"] = attach_fractions
r.attach["fc_final"] = 100.0                    # kcal/mol/radian**2

r.pull["target_final"] = 180.0                  # Degrees
r.pull["num_windows"] = windows[1]

r.initialize()
guest_restraints.append(r)
[18]:
r = restraints.DAT_restraint()
r.mask1 = D1
r.mask2 = G1
r.mask3 = G2
r.topology = structure
r.auto_apr = True
r.continuous_apr = True
r.amber_index = False

r.attach["target"] = 180.0                      # Degrees
r.attach["fraction_list"] = attach_fractions
r.attach["fc_final"] = 100.0                    # kcal/mol/radian**2

r.pull["target_final"] = 180.0                  # Degrees
r.pull["num_windows"] = windows[1]

r.initialize()
guest_restraints.append(r)

Create APR windows

We use the guest restraints to create a list of windows with the appropriate names and then create the directories.

[19]:
from paprika.restraints.restraints import create_window_list
[20]:
window_list = create_window_list(guest_restraints)
2020-08-19 10:30:19 PM Restraints appear to be consistent
[21]:
for window in window_list:
    folder = os.path.join(work_dir, window)
    if not os.path.isdir(folder):
        os.makedirs(os.path.join(work_dir, window))

Prepare host-guest system

Translate guest molecule

For the attach windows, we will use the initial, bound coordinates for the host-guest complex. Only the force constants change during this phase, so a single set of coordinates is sufficient. For the pull windows, we will translate the guest to the target value of the restraint before solvation, and for the release windows, we will use the coordinates from the final pull window.

[22]:
for window in window_list:
    if window[0] == "a":
        shutil.copy(os.path.join(complex_dir, f"{base_name}.prmtop"),
                    os.path.join(work_dir, window, f"{base_name}.prmtop"))
        shutil.copy(os.path.join(complex_dir, f"{base_name}.rst7"),
                    os.path.join(work_dir, window, f"{base_name}.rst7"))

    elif window[0] == "p":
        structure = pmd.load_file(
            os.path.join(complex_dir, f"{base_name}.prmtop"),
            os.path.join(complex_dir, f"{base_name}.rst7"),
            structure = True
        )
        target_difference = guest_restraints[0].phase['pull']['targets'][int(window[1:])] -\
                            guest_restraints[0].pull['target_initial']
        logging.info(f"In window {window} we will translate the guest {target_difference.magnitude:0.1f}.")

        for atom in structure.atoms:
            if atom.residue.name == "BUT":
                atom.xz += target_difference.magnitude

        structure.save(os.path.join(work_dir, window, f"{base_name}.prmtop"), overwrite=True)
        structure.save(os.path.join(work_dir, window, f"{base_name}.rst7"), overwrite=True)
2020-08-19 10:30:19 PM In window p000 we will translate the guest 0.0 Angstroms.
2020-08-19 10:30:19 PM In window p001 we will translate the guest 1.1 Angstroms.
2020-08-19 10:30:19 PM In window p002 we will translate the guest 2.1 Angstroms.
2020-08-19 10:30:19 PM In window p003 we will translate the guest 3.2 Angstroms.
2020-08-19 10:30:19 PM In window p004 we will translate the guest 4.2 Angstroms.
2020-08-19 10:30:19 PM In window p005 we will translate the guest 5.3 Angstroms.
2020-08-19 10:30:19 PM In window p006 we will translate the guest 6.4 Angstroms.
2020-08-19 10:30:19 PM In window p007 we will translate the guest 7.4 Angstroms.
2020-08-19 10:30:19 PM In window p008 we will translate the guest 8.5 Angstroms.
2020-08-19 10:30:19 PM In window p009 we will translate the guest 9.5 Angstroms.
2020-08-19 10:30:19 PM In window p010 we will translate the guest 10.6 Angstroms.
2020-08-19 10:30:19 PM In window p011 we will translate the guest 11.6 Angstroms.
2020-08-19 10:30:19 PM In window p012 we will translate the guest 12.7 Angstroms.
2020-08-19 10:30:19 PM In window p013 we will translate the guest 13.8 Angstroms.
2020-08-19 10:30:19 PM In window p014 we will translate the guest 14.8 Angstroms.
2020-08-19 10:30:19 PM In window p015 we will translate the guest 15.9 Angstroms.
2020-08-19 10:30:19 PM In window p016 we will translate the guest 16.9 Angstroms.
2020-08-19 10:30:19 PM In window p017 we will translate the guest 18.0 Angstroms.

Create OpenMM system and apply restraints

Here, we will convert the AMBER *.prmtop & *.rst7 to an OpenMM system object for each window and convert it to a XML file. The Generalized Born Implicit Solvent model we will use is HCT, which is equivalent to igb=1 in AMBER. Afterwords, we will apply restraints on the dummy atoms using apply_positional_restraints and the static & guest restraints with apply_dat_restraint.

[23]:
import openmm.unit as unit
import openmm.app as app
import openmm as openmm

from paprika.restraints.utils import parse_window
from paprika.restraints.openmm import apply_positional_restraints, apply_dat_restraint
[24]:
for window in window_list:
    # Current window
    folder = os.path.join(work_dir, window)
    window_number, phase = parse_window(window)
    logging.info(f"Creating XML for in window {window}")

    # Load Amber
    prmtop = app.AmberPrmtopFile(os.path.join(folder, f'{base_name}.prmtop'))
    inpcrd = app.AmberInpcrdFile(os.path.join(folder, f'{base_name}.rst7'))

    # Create PDB file
    with open(os.path.join(folder, 'system.pdb'), 'w') as file:
        app.PDBFile.writeFile(prmtop.topology, inpcrd.positions, file, keepIds=True)

    # Create an OpenMM system from the Amber topology
    system = prmtop.createSystem(
        nonbondedMethod=app.NoCutoff,
        constraints=app.HBonds,
        implicitSolvent=app.HCT,
    )

    # Apply positional restraints on the dummy atoms
    apply_positional_restraints(os.path.join(folder, 'system.pdb'), system, force_group=15)

    # Apply host static restraints
    for restraint in static_restraints:
        apply_dat_restraint(system, restraint, phase, window_number, force_group=10)

    # Apply guest restraints
    for restraint in guest_restraints:
        apply_dat_restraint(system, restraint, phase, window_number, force_group=11)

    # Save OpenMM system to XML file
    system_xml = openmm.XmlSerializer.serialize(system)
    with open(os.path.join(folder, 'system.xml'), 'w') as file:
        file.write(system_xml)
2020-08-19 10:30:19 PM Creating XML for in window a000
2020-08-19 10:30:19 PM Creating XML for in window a001
2020-08-19 10:30:19 PM Creating XML for in window a002
2020-08-19 10:30:19 PM Creating XML for in window a003
2020-08-19 10:30:19 PM Creating XML for in window a004
2020-08-19 10:30:19 PM Creating XML for in window a005
2020-08-19 10:30:19 PM Creating XML for in window a006
2020-08-19 10:30:20 PM Creating XML for in window a007
2020-08-19 10:30:20 PM Creating XML for in window a008
2020-08-19 10:30:20 PM Creating XML for in window a009
2020-08-19 10:30:20 PM Creating XML for in window a010
2020-08-19 10:30:20 PM Creating XML for in window a011
2020-08-19 10:30:20 PM Creating XML for in window a012
2020-08-19 10:30:20 PM Creating XML for in window a013
2020-08-19 10:30:20 PM Creating XML for in window p000
2020-08-19 10:30:20 PM Creating XML for in window p001
2020-08-19 10:30:20 PM Creating XML for in window p002
2020-08-19 10:30:20 PM Creating XML for in window p003
2020-08-19 10:30:20 PM Creating XML for in window p004
2020-08-19 10:30:20 PM Creating XML for in window p005
2020-08-19 10:30:20 PM Creating XML for in window p006
2020-08-19 10:30:20 PM Creating XML for in window p007
2020-08-19 10:30:20 PM Creating XML for in window p008
2020-08-19 10:30:20 PM Creating XML for in window p009
2020-08-19 10:30:20 PM Creating XML for in window p010
2020-08-19 10:30:20 PM Creating XML for in window p011
2020-08-19 10:30:20 PM Creating XML for in window p012
2020-08-19 10:30:20 PM Creating XML for in window p013
2020-08-19 10:30:20 PM Creating XML for in window p014
2020-08-19 10:30:20 PM Creating XML for in window p015
2020-08-19 10:30:20 PM Creating XML for in window p016
2020-08-19 10:30:20 PM Creating XML for in window p017

Simulation

For this part, you need to have OpenMM installed and by default the simulations will run on the CPU. See the OpenMM documentation if you want to run the simulation on the GPU. We will set the integrator time step to 1 fs with a total of 50,000 steps for production run and the temperature is set to 300 K.

Energy Minimization

[25]:
for window in window_list:
    folder = os.path.join(work_dir, window)
    logging.info(f"Running minimization in window {window}...")

    # Load XML and input coordinates
    with open(os.path.join(folder, 'system.xml'), 'r') as file:
        system = openmm.XmlSerializer.deserialize(file.read())
    coords = app.PDBFile(os.path.join(folder, 'system.pdb'))

    # Integrator
    integrator = openmm.LangevinIntegrator(300 * unit.kelvin, 1.0 / unit.picoseconds, 1.0 * unit.femtoseconds)

    # Simulation Object
    simulation = app.Simulation(coords.topology, system, integrator)
    simulation.context.setPositions(coords.positions)

    # Minimize Energy
    simulation.minimizeEnergy(tolerance=1.0*unit.kilojoules_per_mole, maxIterations=5000)

    # Save final coordinates
    positions = simulation.context.getState(getPositions=True).getPositions()
    with open(os.path.join(folder, 'minimized.pdb'), 'w') as file:
        app.PDBFile.writeFile(simulation.topology, positions, file, keepIds=True)
2020-08-19 10:30:20 PM Running minimization in window a000...
2020-08-19 10:30:21 PM Running minimization in window a001...
2020-08-19 10:30:21 PM Running minimization in window a002...
2020-08-19 10:30:21 PM Running minimization in window a003...
2020-08-19 10:30:21 PM Running minimization in window a004...
2020-08-19 10:30:21 PM Running minimization in window a005...
2020-08-19 10:30:21 PM Running minimization in window a006...
2020-08-19 10:30:21 PM Running minimization in window a007...
2020-08-19 10:30:22 PM Running minimization in window a008...
2020-08-19 10:30:22 PM Running minimization in window a009...
2020-08-19 10:30:22 PM Running minimization in window a010...
2020-08-19 10:30:22 PM Running minimization in window a011...
2020-08-19 10:30:22 PM Running minimization in window a012...
2020-08-19 10:30:22 PM Running minimization in window a013...
2020-08-19 10:30:22 PM Running minimization in window p000...
2020-08-19 10:30:23 PM Running minimization in window p001...
2020-08-19 10:30:23 PM Running minimization in window p002...
2020-08-19 10:30:23 PM Running minimization in window p003...
2020-08-19 10:30:23 PM Running minimization in window p004...
2020-08-19 10:30:23 PM Running minimization in window p005...
2020-08-19 10:30:23 PM Running minimization in window p006...
2020-08-19 10:30:24 PM Running minimization in window p007...
2020-08-19 10:30:24 PM Running minimization in window p008...
2020-08-19 10:30:24 PM Running minimization in window p009...
2020-08-19 10:30:24 PM Running minimization in window p010...
2020-08-19 10:30:24 PM Running minimization in window p011...
2020-08-19 10:30:24 PM Running minimization in window p012...
2020-08-19 10:30:24 PM Running minimization in window p013...
2020-08-19 10:30:25 PM Running minimization in window p014...
2020-08-19 10:30:25 PM Running minimization in window p015...
2020-08-19 10:30:25 PM Running minimization in window p016...
2020-08-19 10:30:25 PM Running minimization in window p017...
  • Here we will skip the equilibration step and go straight to production!

Production Run

[26]:
for window in window_list:
    folder = os.path.join(work_dir, window)
    logging.info(f"Running production in window {window}...")

    # Load XML and input coordinates
    with open(os.path.join(folder, 'system.xml'), 'r') as file:
        system = openmm.XmlSerializer.deserialize(file.read())
    coords = app.PDBFile(os.path.join(folder, 'minimized.pdb'))

    # Integrator
    integrator = openmm.LangevinIntegrator(300 * unit.kelvin, 1.0 / unit.picoseconds, 1.0 * unit.femtoseconds)

    # Reporters
    dcd_reporter = app.DCDReporter(os.path.join(folder, 'production.dcd'), 500)
    state_reporter = app.StateDataReporter(
        os.path.join(folder, 'production.log'),
        500,
        step=True,
        kineticEnergy=True,
        potentialEnergy=True,
        totalEnergy=True,
        temperature=True,
    )

    # Simulation Object
    simulation = app.Simulation(coords.topology, system, integrator)
    simulation.context.setPositions(coords.positions)
    simulation.reporters.append(dcd_reporter)
    simulation.reporters.append(state_reporter)

    # MD steps
    simulation.step(50000)

    # Save final coordinates
    positions = simulation.context.getState(getPositions=True).getPositions()
    with open(os.path.join(folder, 'production.pdb'), 'w') as file:
        app.PDBFile.writeFile(simulation.topology, positions, file, keepIds=True)
2020-08-19 10:30:25 PM Running production in window a000...
2020-08-19 10:30:28 PM Running production in window a001...
2020-08-19 10:30:31 PM Running production in window a002...
2020-08-19 10:30:33 PM Running production in window a003...
2020-08-19 10:30:36 PM Running production in window a004...
2020-08-19 10:30:39 PM Running production in window a005...
2020-08-19 10:30:42 PM Running production in window a006...
2020-08-19 10:30:45 PM Running production in window a007...
2020-08-19 10:30:47 PM Running production in window a008...
2020-08-19 10:30:50 PM Running production in window a009...
2020-08-19 10:30:53 PM Running production in window a010...
2020-08-19 10:30:56 PM Running production in window a011...
2020-08-19 10:30:59 PM Running production in window a012...
2020-08-19 10:31:01 PM Running production in window a013...
2020-08-19 10:31:04 PM Running production in window p000...
2020-08-19 10:31:07 PM Running production in window p001...
2020-08-19 10:31:10 PM Running production in window p002...
2020-08-19 10:31:13 PM Running production in window p003...
2020-08-19 10:31:15 PM Running production in window p004...
2020-08-19 10:31:18 PM Running production in window p005...
2020-08-19 10:31:21 PM Running production in window p006...
2020-08-19 10:31:24 PM Running production in window p007...
2020-08-19 10:31:26 PM Running production in window p008...
2020-08-19 10:31:29 PM Running production in window p009...
2020-08-19 10:31:32 PM Running production in window p010...
2020-08-19 10:31:35 PM Running production in window p011...
2020-08-19 10:31:38 PM Running production in window p012...
2020-08-19 10:31:40 PM Running production in window p013...
2020-08-19 10:31:43 PM Running production in window p014...
2020-08-19 10:31:46 PM Running production in window p015...
2020-08-19 10:31:49 PM Running production in window p016...
2020-08-19 10:31:52 PM Running production in window p017...

Analysis

Once the simulation is completed, we can using the analysis module to determine the binding free energy. We supply the location of the parameter information, a string or list for the file names (wildcards supported), the location of the windows, and the restraints on the guest.

In this example, we use the method ti-block which determines the free energy using thermodynamic iintegration and then estimates the standard error of the mean at each data point using blocking analysis. Bootstrapping it used to determine the uncertainty of the full thermodynamic integral for each phase.

After running compute_free_energy(), a dictionary called results will be populated, that contains the free energy and SEM for each phase of the simulation.

[27]:
import paprika.analysis as analysis
[28]:
free_energy = analysis.fe_calc()
free_energy.topology = "system.pdb"
free_energy.trajectory = "production.dcd"
free_energy.path = work_dir
free_energy.restraint_list = guest_restraints
free_energy.collect_data()
free_energy.methods = ['ti-block']
free_energy.ti_matrix = "diagonal"
free_energy.boot_cycles = 1000
free_energy.compute_free_energy()
[29]:
free_energy.compute_ref_state_work([
    guest_restraints[0], guest_restraints[1], None, None,
    guest_restraints[2], None
])

binding_affinity = -1 * (
    free_energy.results["attach"]["ti-block"]["fe"] + \
    free_energy.results["pull"]["ti-block"]["fe"] + \
    free_energy.results["ref_state_work"]
)

sem = np.sqrt(
    free_energy.results["attach"]["ti-block"]["sem"]**2 + \
    free_energy.results["pull"]["ti-block"]["sem"]**2
)
[30]:
print(f"The binding affinity of butane to cucurbit[6]uril = {binding_affinity.magnitude:0.2f} +/- {sem.magnitude:0.2f} kcal/mol")
The binding affinity of butane to cucurbit[6]uril = -8.96 +/- 1.14 kcal/mol

The value above is very close to the value obtained from running the APR calculations with Amber.