Adam Kraut

Thirty Years of (Bio)Molecular Simulation: How Far Have We Come?

19 March, 2009 - 5 min read

This was originally intended to be a micro-blogged talk, probably on FriendFeed. But when I walked into the old Chevron building on the Pitt campus to listen to Professor Wilfred van Gunsteren, the wireless was spotty, so I saved my notes for a triumphant return to normal blogging. The talk is part of a lecture series presented by the CMMS at the University of Pittsburgh. Since this was probably the intended purpose when I started Bleeding Edge Biotech, here is my notepad of the distinguished lecturer’s slides and talking points.

Computation based on molecular models is playing an increasingly important role in biology, biological chemistry, and biophysics. Since only a very limited number of properties of biomolecular systems is actually accessible to measurement by experimental means, computer simulation can complement experiment by providing not only averages, but also distributions and time series of any definable – observable or non-observable – quantity, for example conformational distributions or interactions between parts of molecular systems. Present day biomolecular modelling is limited in its application by four main problems: 1) the force-field problem, 2) the search (sampling) problem, 3) the ensemble (sampling) problem, and 4) the experimental problem. These four problems will be discussed and illustrated by practical examples. Progress over the past thirty years will be briefly reviewed. Perspectives will be outlined for pushing forward the limitations of molecular modelling.

Why Thirty Years?

…first simulations were performed in 1976..

Molecular modeling choices to make:

- Degrees of freedom: atoms are elementary
- Forces (interactions between atoms)
- Boundary conditions
- Methods to generate configurations of atoms: Newton’s equation

Simulations can:

- explain experiment
- provoke experiment
- replace experiment
- aid in establishing intellectual property

The four problems

1. The force field problem
2. The search (sampling) problem
3. The ensemble (sampling) problem
4. The experimental problem

The Force Field problem

- small free energy differences
- account for entropic effects
- variety of atoms and molecules (keep it simple; transferable parameters)

…using only the PDB for force field development just doesn’t work out.

The most dominant fold is not difficult; equilibria between folds are more important. Should be able to get melting temperatures from simulations. Solvent viscosity drives the kinetics of folding. Todo: polarizable force fields.

The searching (sampling) problem

A. convergence
B. alleviated
C. aggravated

Methods to compute free energy

- counting configurations
- thermodynamic integration (many simulations)
- perturbation formula (one simulation)
- one-step perturbation (few simulations)

For one-step perturbation, use “soft-core” atoms for each site where the inhibitors will interact. The original Viagra and Levitra could have benefited from this method (IP, patents).
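To make the difference between the perturbation formula and thermodynamic integration concrete, here is a minimal sketch on a toy system that was not part of the talk: a 1D harmonic oscillator whose spring constant changes from `k_a` to `k_b`. Everything here is an assumption for illustration (reduced units with kT = 1, exact Gaussian sampling instead of a real MD run, hypothetical function names). The toy is convenient because the exact answer, 0.5 kT ln(k_b/k_a), is known.

```python
import math
import random

KT = 1.0  # thermal energy in reduced units (assumption for this toy example)


def sample_harmonic(k, n, rng):
    """Draw n exact Boltzmann samples for U(x) = 0.5 * k * x^2."""
    sigma = math.sqrt(KT / k)
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def fep_delta_f(k_a, k_b, n, rng):
    """Perturbation formula: dF = -kT ln < exp(-(U_B - U_A)/kT) >_A.

    Uses ONE simulation, run in state A only.
    """
    xs = sample_harmonic(k_a, n, rng)
    boltzmann = [math.exp(-0.5 * (k_b - k_a) * x * x / KT) for x in xs]
    return -KT * math.log(sum(boltzmann) / n)


def ti_delta_f(k_a, k_b, n, rng, n_lambda=11):
    """Thermodynamic integration: dF = integral over lambda of <dU/dlambda>.

    Uses MANY simulations, one per lambda value, with
    U(lambda) = 0.5 * (k_a + lambda * (k_b - k_a)) * x^2.
    """
    lambdas = [i / (n_lambda - 1) for i in range(n_lambda)]
    means = []
    for lam in lambdas:
        k_lam = k_a + lam * (k_b - k_a)
        xs = sample_harmonic(k_lam, n, rng)
        # dU/dlambda = 0.5 * (k_b - k_a) * x^2, averaged over this ensemble
        means.append(sum(0.5 * (k_b - k_a) * x * x for x in xs) / n)
    # Trapezoid rule over the evenly spaced lambda grid
    h = lambdas[1] - lambdas[0]
    return h * (0.5 * means[0] + sum(means[1:-1]) + 0.5 * means[-1])


def exact_delta_f(k_a, k_b):
    """Analytic free energy difference between two harmonic wells."""
    return 0.5 * KT * math.log(k_b / k_a)


if __name__ == "__main__":
    rng = random.Random(0)
    print("exact:", exact_delta_f(1.0, 2.0))
    print("perturbation:", fep_delta_f(1.0, 2.0, 20000, rng))
    print("integration:", ti_delta_f(1.0, 2.0, 5000, rng))
```

In this toy case both estimators land close to the exact value because states A and B overlap well. The perturbation formula degrades quickly when they do not, which is presumably part of the motivation for soft-core reference states.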

The ensemble (sampling) problem

- Entropy
- Averaging
- Non-linear averaging

Coiled-coil stability has a strong entropic component. For monomers the solute-solvent interaction decreases. For trimers the solute-solute interaction decreases. Entropy increases with temperature. In trimers atomic fluctuations do not increase with temperature but solute entropy increases with temperature.

The experimental problem

- Averaging
- Insufficient data
- Insufficient accuracy

“Averages are dangerous”
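A small numerical illustration of why averaging is dangerous, under assumptions of my own rather than anything shown in the talk: NOE-derived distances are commonly modelled as an r^-6 average over the ensemble, and the two distances below are made up. The function names are hypothetical.

```python
def noe_effective_distance(distances, weights=None):
    """Effective distance <r^-6>^(-1/6) seen by an r^-6-averaged observable."""
    n = len(distances)
    if weights is None:
        weights = [1.0 / n] * n  # equally populated conformers
    mean_inv6 = sum(w * r ** -6 for w, r in zip(weights, distances))
    return mean_inv6 ** (-1.0 / 6.0)


def arithmetic_mean(distances):
    """Plain linear average of the same distances."""
    return sum(distances) / len(distances)


# Hypothetical two-state ensemble: a proton pair at 2 A half the time
# and at 6 A the other half (values invented for illustration).
distances = [2.0, 6.0]
r_noe = noe_effective_distance(distances)
r_mean = arithmetic_mean(distances)

if __name__ == "__main__":
    print("r^-6 averaged distance:", round(r_noe, 2))
    print("arithmetic mean distance:", r_mean)
```

The r^-6 average is dominated by the short-distance conformer, so a single structure built to satisfy it would sit near 2 Å even though the molecule spends half its time at 6 Å. Neither number describes a conformation that is representative of the ensemble.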

Conclusions:

- Experimental data cannot determine the average structure
- Experimental data cannot determine the biomolecular structure
- Artifacts of XPLOR NMR refinement disagree with simulations guided by NOE restraints
- Two ensembles with no overlap, given the same experimental data

“Experimental data is not sufficient”

Don’t rely on structural data (It’s derived; strive for primary data)

History

- 1957: First molecule
- 1964: Atomic liquid (argon)
- 1971: Molecular liquid (water)

Future

- 2001-2029: Biomolecules in water
- 2034: E. coli
- 2056: Mammalian cell (10^-9 sec)
- 2080: Biomolecules in water (fast as nature), 10^6
- 2172: Human body (10^27 atoms), 1 sec

So what if you could simulate every atom in your body for 1 second?

There are much better things simulation can answer; ask better questions.

Polarizable Force Field

- improves transferability between different environments
- working on these force fields
- solvation drives protein processes

Coarse-graining

- Need to switch between fine-grained (FG) and coarse-grained (CG) models, back and forth
- Run simulations in parallel
- Easy to clamp 5 atoms to 1 but not easy to map 1 to 5
- FG/CG replica-exchange simulation enhances sampling
- Much faster to cross barriers in CG mode if you can switch
- Both force fields must be thermodynamically calibrated

We need simulations to explain experiment, so we can see the numbers. For molecular modelers, there’s still enough work to do at least until 2172!

Questions from the audience

Q: What’s the state of NMR determination?
A: It depends; narrow bundles should have more motion. Stable proteins are easy. The averaging problem is present even in crystallography. Can’t get R-values. Many, many structures are not that good (the XPLOR force field is simple, no solvent). Found that 20% of side-chain J-values cannot be right. Simulation is getting to the point where it can correct experiment.

Q: Could you comment on the CG model ‘clamping atoms’ and potential problems related to entropy?
A: Take 5 atoms, make a ball, you lose entropy. You should compensate for that at the energy level. You must balance it.
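The clamping described in this answer can be sketched in a few lines. This is only an illustration under my own assumptions: a center-of-mass mapping (one common choice, not necessarily what GROMOS does), with invented masses and coordinates and a hypothetical helper name.

```python
def center_of_mass(positions, masses):
    """Map a group of atoms to one bead located at their center of mass."""
    total_mass = sum(masses)
    return tuple(
        sum(m * p[axis] for m, p in zip(masses, positions)) / total_mass
        for axis in range(3)
    )


# Hypothetical 5-atom group: one heavy atom plus four light ones.
masses = [12.0, 1.0, 1.0, 1.0, 1.0]

# Two different fine-grained configurations of the same group.
config_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0),
            (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
config_b = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0),
            (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]

bead_a = center_of_mass(config_a, masses)
bead_b = center_of_mass(config_b, masses)

fg_dof = 3 * len(masses)  # 15 Cartesian degrees of freedom for 5 atoms
cg_dof = 3                # 3 for the single bead

if __name__ == "__main__":
    print("bead from config A:", bead_a)
    print("bead from config B:", bead_b)
    print("degrees of freedom:", fg_dof, "->", cg_dof)
```

Both configurations collapse onto the same bead, so the 5-to-1 direction is trivial while the 1-to-5 direction has no unique answer. The internal arrangements that the bead can no longer distinguish are where the lost entropy lives, which is why the CG energy function has to be calibrated to compensate.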

Q: Is path integral still useful?
A: No, we’d like to remove it in the next version of GROMOS.

Professor van Gunsteren is a big believer in using all the data you can get your hands on.