Yong Wang:
Hybrid Methods and Atomistic Models to Explore Free Energies, Rates and Pathways of Protein Shape Changes

Date: 15-10-2016    Supervisor: Kresten Lindorff-Larsen

When I joined the Lindorff-Larsen group as a new PhD student, the Nobel Prize in Chemistry that year was awarded for the "development of multiscale models for complex chemical systems", recognizing the pioneering work of Martin Karplus, Michael Levitt and Arieh Warshel. As a computational biologist, I was proud and excited by the news, as this prize honors not only them but also the whole computational biology community. The modeling of protein dynamics has progressed in recent years, and it has become clear that computer simulations play an irreplaceable role, rather than merely a supporting role to wet-lab experiments, in obtaining a complete understanding of complex biomolecules. Some of this progress is introduced in the first chapter of this thesis. Despite its enormous success, the field is not yet fully developed. In some respects, for example in accurately quantifying the free energy differences and transition times of protein conformational exchange and their dependence on sequence modifications, we are still at an early stage.

In this dissertation, I present a number of methodological improvements and applications for protein folding, conformational exchange and ligand binding at long time scales. In Chapter 2, we benchmarked how well current force fields and molecular dynamics (MD) simulations can model changes in structure, dynamics, free energy and kinetics for T4 lysozyme (T4L), an extensively studied protein whose conformational dynamics are nevertheless still not fully understood. We found that modern simulation methods and force fields are able to capture key aspects of how this protein changes its shape, paving the way for future studies of systems that are difficult to study experimentally. In Chapter 3, we revisited the problem of accurately quantifying thermodynamics and kinetics by following a novel route. In this route, both the forward and backward rates are calculated directly from MD simulations using a recently developed enhanced sampling method called "infrequent metadynamics", and are subsequently used to estimate the free energy difference under a two-state assumption. To show its practical utility, we applied this approach to the T4L-benzene system as a model system, for which the binding free energies obtained from kinetics, free energy perturbation and experiments are all in good agreement. This route has also been applied to calculate the kinetics and thermodynamics of the conformational exchange of T4L (as shown in Chapter 2). In Chapter 4, we designed a novel method called "pace-adaptive metadynamics", in which the frequency of bias deposition is adjusted during the course of the simulation. By testing it on a simple model system and applying it to the binding and unbinding of T4L with two different ligands, we showed that the pace-adaptive scheme can improve the reliability and accuracy of kinetics estimation, importantly without requiring extra computational resources. This strategy therefore allows us to use limited computational resources more efficiently. In Chapter 5, we further illustrated the possibility of combining the free energy flooding potential obtained from the variational method with infrequent metadynamics to calculate rates at long time scales. This hybrid method was again tested by calculating the unbinding time of the T4L-benzene complex. The results suggest that this hybrid method can obtain results similar to those of infrequent metadynamics with fewer computational resources. It is therefore promising to apply this hybrid method to calculate the kinetics of escape from a deep free energy well, e.g. the drug residence time. In Chapter 6, we developed an atomistic hybrid model by integrating physics-based and structure-based potentials in the context of Monte Carlo software packages. We showed that our models are able to distinguish the folding mechanisms of four topologically similar proteins.
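
As a rough illustration of the kinetics-to-thermodynamics route described for Chapter 3, the sketch below shows how a biased simulation time might be rescaled to an unbiased time in infrequent metadynamics, and how a binding free energy could then follow from the two rates under a two-state assumption. It is a minimal sketch and not the thesis implementation: the function names, the 300 K default temperature, the 1 M standard concentration and the unit conventions (kcal/mol, rates in 1/(M s) and 1/s) are all assumptions made for the example.

```python
import math

# Gas constant in kcal/(mol K); energies below are assumed to be in kcal/mol.
R_KCAL = 0.0019872041


def acceleration_factor(bias_values, temperature=300.0):
    """Average of exp(V / RT) over bias values sampled along a biased run.

    bias_values: hypothetical list of bias energies (kcal/mol) recorded
    while the system is still in the starting basin.
    """
    beta = 1.0 / (R_KCAL * temperature)
    return sum(math.exp(beta * v) for v in bias_values) / len(bias_values)


def unbiased_time(simulation_time, bias_values, temperature=300.0):
    """Rescale the biased escape time by the acceleration factor."""
    return simulation_time * acceleration_factor(bias_values, temperature)


def binding_free_energy(k_on, k_off, temperature=300.0, standard_conc=1.0):
    """Two-state estimate: K_d = k_off / k_on, dG = RT ln(K_d / C0).

    k_on is assumed to be in 1/(M s), k_off in 1/s, and standard_conc in M.
    """
    k_d = k_off / k_on
    return R_KCAL * temperature * math.log(k_d / standard_conc)
```

With rates estimated from many independent escape events, a call such as `binding_free_energy(k_on, k_off)` would give a value that can be compared against free energy perturbation or experiment, in the spirit of the comparison described above. The rate estimation itself (for example fitting escape times to a Poisson process) is deliberately left out of this sketch.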