This chapter will introduce some basic usage of DMFF. All scripts can be found in examples/
directory in which Jupyter notebook-based demos are provided.
DMFF uses OpenMM to parse input files, including coordinates file, topology specification file and force field parameter file. Then, the core class Hamiltonian
inherited from openmm.ForceField
will be initialized and the method createPotential
will be called to create differentiable potential energy functions for different energy terms. Take parametrzing an organic moleclue with GAFF2 force field as an example:
import jax
import jax.numpy as jnp
import openmm.app as app
import openmm.unit as unit
from dmff import Hamiltonian, NeighborList
app.Topology.loadBondDefinitions("lig-top.xml")
pdb = app.PDBFile("lig.pdb")
ff = Hamiltonian("gaff-2.11.xml", "lig-prm.xml")
potentials = ff.createPotential(pdb.topology)
for k in potentials.dmff_potentials.keys():
pot = potentials.dmff_potentials[k]
print(pot)
In this example, lig.pdb
is the PDB file containing atomic coordinates, and lig-top.xml
specifying bond connections within a molecule and this information is required by openmm.app
to generate molecular topology. Note that this file is not always required, if bond conncections are defined in .pdb file by CONNECT
keyword. gaff-2.11.xml
contains GAFF2 force field parameters (bonds, angles, torsion and vdW), and lig-prm.xml
contains atomic partial charges (GAFF2 requests a user-defined charge assignment process). This xml format is compatitable with OpenMM definitions, and a detailed description can be found in OpenMM user guide or XML-format force fields section.
If you run this script in examples/classical
, you will get the following output.
<function HarmonicBondJaxGenerator.createForce.<locals>.potential_fn at 0x112504af0>
<function HarmonicAngleJaxGenerator.createForce.<locals>.potential_fn at 0x1124cd820>
<function PeriodicTorsionJaxGenerator.createForce.<locals>.potential_fn at 0x18509b790>
<function NonbondJaxGenerator.createForce.<locals>.potential_fn at 0x18509baf0>
The force field parameters are stored as a Python dict in the Hamiltonian.
params = ff.getParameters()
nbparam = params['NonbondedForce']
nbparam
{
'sigma': DeviceArray([0.33152124, ...], dtype=float32),
'epsilon': DeviceArray([0.4133792, ...], dtype=float32),
'epsfix': DeviceArray([], dtype=float32),
'sigfix': DeviceArray([], dtype=float32),
'charge': DeviceArray([-0.75401515, ...], dtype=float32),
'coulomb14scale': DeviceArray([0.8333333], dtype=float32),
'lj14scale': DeviceArray([0.5], dtype=float32)
}
Each generated function will read coordinates, box, pairs and force field parameters as inputs.
positions = jnp.array(pdb.getPositions(asNumpy=True).value_in_unit(unit.nanometer))
box = jnp.array([
[10.0, 0.0, 0.0],
[ 0.0, 10.0, 0.0],
[ 0.0, 0.0, 10.0]
])
nbList = NeighborList(box, rc=4)
nbList.allocate(positions)
pairs = nbList.pairs
Note that in order to take advantages of the auto-differentiable implementation in JAX, the input arrays have to be jax.numpy.ndarray
, otherwise DMFF will raise an error. pairs
is a rcut
. As shown above, this can be calculated with dmff.NeighborList
class which is supported by jax_md
.
The potential energy function will give energy (a scalar, in kJ/mol) as output:
nbfunc = potentials.dmff_potentials['NonbondedForce']
nbene = nbfunc(positions, box, pairs, params)
print(nbene)
If everything works fine, you will get -425.41412
as a result. In addition, you can also use getPotentialFunc()
and getParameters()
to obtain the whole potential energy function and force field parameter set, instead of seperated functions for different energy terms.
efunc = potentials.getPotentialFunc()
params = ff.getParameters()
totene = efunc(positions, box, pairs, params)
Different from conventional programming frameworks, explicit definition of atomic force calculation functions are no longer needed. Instead, the forces can be evaluated in an automatic manner with jax.grad
.
pos_grad_func = jax.grad(efunc, argnums=0)
force = -pos_grad_func(positions, box, pairs, params)
Similarly, the derivative of energy with regard to force field parameters can also be computed easily.
param_grad_func = jax.grad(nbfunc, argnums=-1)
pgrad = param_grad_func(positions, box, pairs, params)
print(pgrad["NonbondedForce"]["charge"])
[ 652.7753 55.108738 729.36115 -171.4929 502.70837
-44.917206 129.63994 -142.31796 -149.62088 453.21503
46.372574 140.15303 575.488 461.46902 294.4358
335.25153 27.828705 671.3637 390.8903 519.6835
220.51129 238.7695 229.97302 210.58838 231.8734
196.40994 237.08563 35.663574 457.76416 77.4798
256.54382 402.2121 611.9573 440.8465 -52.09662
421.86688 592.46265 237.98883 110.286194 150.65375
218.61087 240.20477 -211.85376 150.7331 310.89404
208.65228 -139.23026 -168.8883 114.3645 3.7261353
399.6282 298.28455 422.06445 526.18463 521.27563
575.85767 606.74744 394.40845 549.84033 556.4724
485.1427 512.1267 558.55896 560.4667 562.812
333.74194 ]
Besides atom-typing based methods, DMFF also supports assigning force field parameters with SMIRKS. SMIRKS is an extenstion of SMARTS language which allows users not only to specify chemical substructures with certain patterns, but also to numerically tag the matched atoms for assigning parameters. This approach avoid the duplicate atom-typing definition process, which enables new parameters to be easily introduced to existing force field parameters sets. OpenFF[1-2] series are examples of SMIRKS-based force fields for organic molecules.
Fig. 1 Illustration of matching chemical substructures with SMIRKS language. Adapted from [[1]](#sminorff).The SMIRKS pattern matching is supported by RDKit package, which can be install with conda:
conda install rdkit -c conda-forge
To begin with, we need a molecule encoded in rdkit.Chem.Mol
object. As an example, we will load a N-methylacetamide molecule defined in examples/smirks/C3H7NO.mol
:
from rdkit import Chem
from dmff import Hamiltonian
mol = Chem.MolFromMolFile("C3H7NO.mol", removeHs=False) # hydrogens must be preserved
Then load force field parameters in xml format. Instuctions about how to write a SMIRKS-based force field XML file can be found in the Chapter 4.
h_smk = Hamiltonian("C3H7NO.xml", noOmmSys=True)
Note that the argument noOmmSys
is set to False
so that DMFF will not create an openmm system, as openmm
does not support SMIRKS-based force field definitions.
Build an openmm topology and parametrize the molecule to create differentiable potential energy functions:
top = h_smk.buildTopologyFromMol(mol)
potObj = h_smk.createPotential(top, rdmol=mol)
So far, we can utilize this dmff.Potential
object to calculate energy and forces as we did in the previous sections.
Bond charge correction[3-4] is an approach to obtain high-accuracy atomic partial charges (e.g. HF/6-31G* ESP-fit charges) by adopting corrections to low-accuracy atomic charges (e.g. AMI Mulliken charges). In order to ensure a zero of total correction values within a molecule, these correction parameters are usually defined based on bond types, which suggests that they can also be defined by SMIRKS patterns.
Virtual sites are additional off-centered charged sites which are introduced to improve the desciption of electrostatic effects caused by sigma hole (halogen bond) or lone pairs. The positions of virtual sites are calculated directly by its parent atoms, not by integrating the equations of motion. This approach is well known for its application in TIP4P[5] and TIP5P[6] water models, and it also proves to be useful in drug-like moelcular force fields like OPLS series[7-8]. Basically, the parameters to define a virtual site includes : where to add virtual sites, how the virtual sites' positions are determined and the charges.
Not surprisingly, all these parameters can all be defined in SMIRKS pattern and as well as can be parsed with DMFF by adding terms in <NonbondedForce>
, such as:
<NonbondedForce>
<BondChargeCorrection smirks="[#0:1]~[#17:2]" bcc="0.000000"/>
<BondChargeCorrection smirks="[#0:1]~[#35:2]" bcc="0.000000"/>
<BondChargeCorrection smirks="[#0:1]~[#53:2]" bcc="0.000000"/>
<BondChargeCorrection smirks="[#0:1]~[#7:2]" bcc="0.000000"/>
<VirtualSite smirks="[#17,#35:1]-[#6X3;a:2]" vtype="1" distance="0.1600" />
<VirtualSite smirks="[#53:1]-[#6X3;a:2]" vtype="1" distance="0.1800" />
<VirtualSite smirks="[#7X2;a:1](:[*;a:2]):[*;a:3]" vtype="2" distance="0.0400" />
</NonbondedForce>
In Chapter 4, we will explain the meaning of these XML-format parameters in detail. Here, we will give a simple example to parametrize 2-chloropyridine with BCC parameters and virtual sites.
As introduced above, we first load molecule and force field parameters.
import jax.numpy as jnp
from rdkit import Chem
from dmff import Hamiltonian
mol = Chem.MolFromMolFile("clpy.mol", removeHs=False)
h_vsite = Hamiltonian("clpy_vsite.xml", noOmmSys=True)
Next, we build the dmff potential. We can see the BCC and virtual site parameters are successfully parsed.
top = h_vsite.buildTopologyFromMol(mol)
potObj = h_vsite.createPotential(top, rdmol=mol)
Then we can add virtual sites to the molecule and obtain a new rdkit.Chem.Mol
object.
mol_vsite = h_vsite.addVirtualSiteToMol(rdmol, h_vsite.getParameters())
Chem.MolToMolFile(mol_vsite, "clpy_vsite.mol")
By dumping this molecule to mol file and visualize it, we can see that as expected, two virtual sites are added along the bond between aromatic carbon (arC) and chloroine and also along the bisector of the arC-N-arC angle.
Alternatively, we can also add coordinates of virtual sites by taking atomic positions matrix as an input.
pos = jnp.array(mol.GetConformer().GetPositions()) / 10 # convert angstrom to nm
pos_vsite = h_vsite.addVirtualSiteCoords(pos, h_vsite.getParameters())