Understanding the Role of Individual Residues in Proteins: a Study of Missense Variants and their Impact on Function

Research output: Book/Report > Ph.D. thesis > Research

Proteins play a central role in virtually all known biological processes. They act as catalysts, transporters, motors and structural components, and their use in fields such as biotechnology and pharmaceuticals is increasing. It is therefore vital to understand how proteins work and how their function can be altered. Changes in their molecular mechanisms can also lead to dysfunction and disease, which is why proteins are well-established targets for pharmaceutical drugs.

Proteins exert their functions through groups of residues, usually located in functional sites. The activity of these sites is typically regulated by the organism’s internal mechanisms, which adjust it as needed. However, the normal activity of a functional site can be altered by mutations in the amino acid sequence of the protein. Missense variants, single amino acid changes in which the wild-type residue is replaced by another, account for between one-third and one-half of all pathogenic variants currently catalogued in clinical databases. Over the past decade they have been used to study disease onset and the effect of mutations on the molecular mechanism of a protein.

High-throughput experiments have been widely used to study the effect of missense mutations on the molecular mechanism of a protein. However, it is not easy to separate mutations that directly affect function (e.g. by altering catalysis, binding or signalling) from those that affect function indirectly, for example by destabilising the fold or accelerating protein degradation. Modelling the effect on the mechanism therefore requires a way to separate these two signals. While some strategies exploit the experimental information alone, another possible solution is to integrate the experimental data with computational predictions, which provide a cheaper and more scalable way to extend the analysis.

This thesis addresses the challenge of developing a robust methodology to study the mechanism behind loss-of-function (LOF) and to identify functional sites in proteins using missense variants as a source of information. First, the focus is on how to use the available experimental data to investigate the effect on the molecular mechanism. This is achieved by developing a strategy that combines different high-throughput measurements to understand how missense variants affect mechanisms and to identify the role of residues in a protein. The thesis then shows how these findings have been scaled up using computational predictors, integrating statistical models of protein sequences with biophysical models of stability to build a predictor that identifies functional sites in a target protein. Finally, the thesis presents a complete analysis of the human proteome using the functional site predictions of the developed model. This analysis provides insight into the mechanism behind LOF for most human variants, identifies the role of residues in human proteins and defines the mechanisms behind clinically annotated variants, offering new information for the development of precision medicine solutions.
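
The idea of combining two measurements to separate direct from indirect loss of function can be illustrated with a minimal sketch. It is not the method developed in the thesis. It assumes each variant has a function score and an abundance (stability) score, both normalised so that wild type is about 1 and complete loss is about 0. The cutoffs, the function names and the toy data are all hypothetical.

```python
from collections import defaultdict

# Hypothetical cutoffs; real data would need calibrated thresholds.
FUNCTION_CUTOFF = 0.5
ABUNDANCE_CUTOFF = 0.5


def classify_variant(function_score, abundance_score):
    """Assign a coarse mechanism label to a single missense variant."""
    if function_score >= FUNCTION_CUTOFF:
        # Function is retained, whatever the abundance score says.
        return "wt-like"
    if abundance_score < ABUNDANCE_CUTOFF:
        # Function is lost and the protein is depleted: indirect effect.
        return "stability-loss"
    # Function is lost although the protein is still present: direct effect.
    return "function-loss"


def candidate_functional_positions(variants, min_fraction=0.5):
    """Return positions where most variants lose function without losing abundance.

    variants: iterable of (position, function_score, abundance_score) tuples.
    """
    labels = defaultdict(list)
    for position, function_score, abundance_score in variants:
        labels[position].append(classify_variant(function_score, abundance_score))

    positions = []
    for position in sorted(labels):
        position_labels = labels[position]
        fraction = position_labels.count("function-loss") / len(position_labels)
        if fraction >= min_fraction:
            positions.append(position)
    return positions


# Toy data (invented for illustration): position, function score, abundance score.
toy_variants = [
    (10, 0.1, 0.9), (10, 0.2, 0.8), (10, 0.9, 1.0),
    (25, 0.1, 0.1), (25, 0.2, 0.3),
    (40, 1.0, 0.9), (40, 0.9, 1.1),
]
functional = candidate_functional_positions(toy_variants)
print(functional)
```

In this toy example only position 10 is flagged, because two of its three variants lose function while abundance stays high. The same logic could in principle be applied to computational predictions in place of experimental scores, which is the kind of scaling the abstract describes.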
Original language: English
Publisher: Department of Biology, Faculty of Science, University of Copenhagen
Number of pages: 155
Publication status: Published - 2023

ID: 380302443