Proteins are the building blocks of life and participate in almost all cellular processes in our body. They transport smaller molecules, catalyze reactions, regulate metabolism, and support our immune system. All these functions require intensive interaction with other molecules, which is only possible if the shapes of the interacting partners fit together. To allow such a fit, a protein continuously has to change and adjust its three-dimensional shape (see figure). Understanding the motion of a protein therefore provides insights into the function it can perform. Most importantly, it can enable therapeutic intervention to regulate this function, leading to drugs that target severe diseases such as Alzheimer's, AIDS, or cancer.
The size of proteins, and the time scales on which their interactions take place, prevent current experimental techniques from observing their motion directly. In the Robotics and Biology lab, we are working towards a “computational microscope”: an efficient and biologically accurate simulator for protein motion.
Due to the complex nature of proteins, we cannot simulate every type and scale of motion caused by the different physical and biochemical phenomena involved. However, we believe that we do not have to simulate everything, because the function of a protein has to be robust against small disturbances. We hypothesize that some motions in some parts of the protein contribute more to its function than others, and that a simplified model therefore suffices to explain the protein's motion, and hence its function. We believe that this model has to be more detailed and accurate in relevant areas and less so elsewhere. Our goal is to concentrate our computational power on the areas of the protein that exhibit function-relevant motions.
What might this simplified model look like? We investigate two different coarse-graining strategies: one is based on the well-established Elastic Network Models, the other uses kinematic models and Matrix Structural Analysis.
Elastic network models (ENMs) have emerged as powerful methods for studying protein motion at a coarse-grained scale. Using Normal Mode Analysis (NMA), they determine the most dominant deformation modes of a protein. Despite their success, ENMs have an important limitation that calls their practical relevance into question: to predict protein motions accurately, they require the underlying contact topology to remain unchanged while the protein moves. This prevents ENMs from explaining localized functional protein motions, which often involve substantial changes in the contact topology. We propose a novel elastic network model, lmcENM, that overcomes this limitation by leveraging information about the dynamic behavior of contacts (see Putz et al. 2017). By learning which contacts are maintained throughout the movement, lmcENM is able to accurately predict protein motions even for localized and uncorrelated functional transitions with changing contact topology.
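To make the idea of an elastic network concrete, the following is a minimal sketch of a Gaussian network model, one of the simplest ENM variants. It is not the lmcENM implementation. It assumes one bead per residue, a uniform spring constant, and a 7 Å distance cutoff. The function names and the optional `contacts` argument (which lets a caller supply an explicit contact list instead of the cutoff rule) are illustrative assumptions.

```python
import numpy as np

def kirchhoff_matrix(coords, cutoff=7.0, contacts=None):
    """Build the Kirchhoff (graph Laplacian) matrix of a Gaussian network model.

    coords   : (n, 3) array of bead positions, e.g. one bead per residue.
    cutoff   : beads closer than this distance are connected by a spring
               (assumed value, in the same units as coords).
    contacts : optional iterable of (i, j) pairs; if given, it replaces
               the cutoff rule entirely.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    if contacts is None:
        dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
        adjacency = (dists <= cutoff) & ~np.eye(n, dtype=bool)
    else:
        adjacency = np.zeros((n, n), dtype=bool)
        for i, j in contacts:
            adjacency[i, j] = adjacency[j, i] = True
    gamma = -adjacency.astype(float)                  # off-diagonal: -1 per contact
    np.fill_diagonal(gamma, adjacency.sum(axis=1))    # diagonal: number of contacts
    return gamma

def normal_modes(gamma, n_modes=3, tol=1e-8):
    """Return the n_modes softest non-trivial modes (eigenvalues, eigenvectors)."""
    eigvals, eigvecs = np.linalg.eigh(gamma)          # ascending eigenvalues
    keep = eigvals > tol                              # drop zero (rigid-body) modes
    return eigvals[keep][:n_modes], eigvecs[:, keep][:, :n_modes]

# Toy example: six beads on a line, spaced 3.8 Å apart.
coords = np.array([[3.8 * i, 0.0, 0.0] for i in range(6)])
gamma = kirchhoff_matrix(coords)
eigvals, modes = normal_modes(gamma, n_modes=2)
```

In this picture, the modes with the smallest non-zero eigenvalues are the dominant, large-amplitude deformations. Passing a contact list that contains only contacts expected to be maintained is one way such a network could be restricted, in the spirit of the approach described above.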
We show that simply leaving out the contacts that are observed to break enables ENMs to accurately capture localized functional transitions. We developed a machine-learning classifier (a support vector machine, SVM) that differentiates breaking from maintained contacts based on a graph-based encoding of their local physicochemical environment. The characteristics of these environments capture how tightly different parts of the protein are bound to each other, how this affects their movements, and ultimately their contact topology. Based on the output of the classifier, we then build our novel elastic network of learned maintained contacts, lmcENM.
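The classification step can be illustrated with a small linear SVM. This is only a sketch under strong assumptions: the two-dimensional feature vectors below are made-up placeholders, not the graph-based encoding used for lmcENM, and the classifier is a plain linear SVM trained by subgradient descent in NumPy rather than the classifier actually used. Labels are +1 for a maintained contact and -1 for a breaking contact.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear SVM by full-batch subgradient descent on the
    L2-regularized hinge loss. Labels in y must be +1 or -1."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        violators = margins < 1.0                     # samples inside the margin
        yv = y[violators][:, None]
        grad_w = lam * w - (yv * X[violators]).sum(axis=0) / len(X)
        grad_b = -y[violators].sum() / len(X)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Return +1 (maintained) or -1 (breaking) for each feature vector."""
    return np.where(np.asarray(X, dtype=float) @ w + b >= 0.0, 1, -1)

# Hypothetical, hand-made features for six contacts (placeholders only).
X = np.array([[2, 2], [3, 2], [2, 3], [-2, -2], [-3, -2], [-2, -3]], dtype=float)
y = np.array([1, 1, 1, -1, -1, -1])

w, b = train_linear_svm(X, y)
labels = predict(X, w, b)
```

Contacts predicted as maintained would then be the ones kept when the elastic network is built.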
We can model the peptide chain of a protein as a kinematic structure, where each peptide plane is a link and each peptide bond is a joint (Fig. 2). Methods from robotics give us efficient ways to study the motions of such a kinematic chain (see Jagodzinski et al. 2007).
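As an illustration of what treating a chain as a kinematic structure means, the sketch below computes forward kinematics for a generic chain of point atoms whose only degrees of freedom are torsion angles. It is a simplification and an assumption on our part: bond lengths and bond angles are uniform and fixed (1.5 Å and 110° are placeholder values), and the function names are hypothetical. It does not reproduce the peptide-plane model of Fig. 2.

```python
import numpy as np

def place_atom(a, b, c, bond_length, bond_angle, torsion):
    """Place atom d after a, b, c so that |cd| = bond_length,
    angle(b, c, d) = bond_angle and dihedral(a, b, c, d) = torsion."""
    bc = (c - b) / np.linalg.norm(c - b)
    n = np.cross(b - a, bc)
    n /= np.linalg.norm(n)                # normal of the plane through a, b, c
    m = np.cross(n, bc)                   # completes the local frame (bc, m, n)
    return (c
            - bond_length * np.cos(bond_angle) * bc
            + bond_length * np.sin(bond_angle) * np.cos(torsion) * m
            + bond_length * np.sin(bond_angle) * np.sin(torsion) * n)

def build_chain(torsions, bond_length=1.5, bond_angle=np.deg2rad(110.0)):
    """Forward kinematics: torsion angles (radians) -> atom coordinates."""
    atoms = [np.zeros(3), np.array([bond_length, 0.0, 0.0])]
    # Third seed atom in the xy-plane, at the prescribed bond angle.
    atoms.append(atoms[1] + bond_length *
                 np.array([-np.cos(bond_angle), np.sin(bond_angle), 0.0]))
    for t in torsions:
        atoms.append(place_atom(atoms[-3], atoms[-2], atoms[-1],
                                bond_length, bond_angle, t))
    return np.array(atoms)

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) defined by four points."""
    b0, b1, b2 = p0 - p1, p2 - p1, p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    v = b0 - np.dot(b0, b1) * b1          # b0 projected off the central bond
    w = b2 - np.dot(b2, b1) * b1          # b2 projected off the central bond
    return np.arctan2(np.dot(np.cross(b1, v), w), np.dot(v, w))

torsions = np.deg2rad([60.0, -120.0, 45.0])
chain = build_chain(torsions)
```

Changing a single torsion angle moves every atom downstream of that joint while leaving the upstream atoms in place, which is the defining property of a serial kinematic chain.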
This model, however, only considers kinematic constraints. It ignores forces generated by biochemical and physical phenomena. These additional constraints, rather than making the problem more complex, provide us with a way to simplify the model further. By incorporating these constraints into our model, we can partition a protein into smaller subparts. The kinematic structure representing each subpart in the original model can then be replaced with a simpler representation, thus reducing the complexity of our model. Using Matrix Structural Analysis (MSA), we are able to determine the most dominant deformation modes of a protein. We expect that this will result in biologically accurate, yet computationally tractable models.
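The core mechanics of Matrix Structural Analysis can be sketched with the direct stiffness method on a toy one-dimensional structure. This is an illustrative assumption rather than the lab's protein model: nodes are connected by linear springs along a single axis, and the function names and stiffness values are hypothetical.

```python
import numpy as np

def assemble_stiffness(n_nodes, elements):
    """Assemble the global stiffness matrix from spring elements.

    elements : iterable of (i, j, k), a spring of stiffness k
               between nodes i and j (one degree of freedom per node).
    """
    K = np.zeros((n_nodes, n_nodes))
    for i, j, k in elements:
        # Add the 2x2 element stiffness matrix [[k, -k], [-k, k]]
        # into the rows and columns of nodes i and j.
        K[i, i] += k
        K[j, j] += k
        K[i, j] -= k
        K[j, i] -= k
    return K

def solve_displacements(K, forces, fixed):
    """Solve K u = f for the free nodes, with the fixed nodes held at zero."""
    fixed = set(fixed)
    free = [n for n in range(len(K)) if n not in fixed]
    u = np.zeros(len(K))
    f = np.asarray(forces, dtype=float)
    u[free] = np.linalg.solve(K[np.ix_(free, free)], f[free])
    return u

# Three nodes: a stiff spring (k = 100) followed by a soft one (k = 1).
K = assemble_stiffness(3, [(0, 1, 100.0), (1, 2, 1.0)])
u = solve_displacements(K, forces=[0.0, 0.0, 1.0], fixed=[0])
```

In this toy case almost all of the deformation occurs in the soft spring, while the two nodes joined by the stiff spring move nearly as one unit. This is the intuition behind replacing a tightly bound subpart with a simpler representation.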