Algorithms for Protein Structure Prediction

Research output: Book/Report › Ph.D. thesis › Research

Standard

Algorithms for Protein Structure Prediction. / Paluszewski, Martin.

København : Department of Computer Science, University of Copenhagen, 2008. 196 p.

Harvard

Paluszewski, M 2008, Algorithms for Protein Structure Prediction. Department of Computer Science, University of Copenhagen, København.

APA

Paluszewski, M. (2008). Algorithms for Protein Structure Prediction. Department of Computer Science, University of Copenhagen.

Vancouver

Paluszewski M. Algorithms for Protein Structure Prediction. København: Department of Computer Science, University of Copenhagen, 2008. 196 p.

Author

Paluszewski, Martin. / Algorithms for Protein Structure Prediction. København : Department of Computer Science, University of Copenhagen, 2008. 196 p.

Bibtex

@phdthesis{199ade00ac0411debc73000ea68e967b,
title = "Algorithms for Protein Structure Prediction",
abstract = "The problem of predicting the three-dimensional structure of a protein given its amino acid sequence is one of the most important open problems in bioinformatics. One of the carbon atoms in amino acids is the Cα-atom and the overall structure of a protein is often represented by a so-called Cα-trace. Here we present three different approaches for reconstruction of Cα-traces from predictable measures. In our first approach [63, 62], the Cα-trace is positioned on a lattice and a tabu-search algorithm is applied to find minimum energy structures. The energy function is based on half-sphere-exposure (HSE) and contact number (CN) measures only. We show that the HSE measure is much more information-rich than CN; nevertheless, HSE does not appear to provide enough information to reconstruct the Cα-traces of real-sized proteins. Our experiments also show that using tabu search (with our novel tabu definition) is more robust than standard Monte Carlo search. In the second approach for reconstruction of Cα-traces, an exact branch and bound algorithm has been developed [67, 65]. The model is discrete and makes use of secondary structure predictions, HSE, CN and radius of gyration. We show how to compute good lower bounds for partial structures very fast. Using these lower bounds, we are able to find global minimum structures in a huge conformational space in reasonable time. We show that many of these global minimum structures are of good quality compared to the native structure. Our branch and bound algorithm is competitive in quality and speed with other state-of-the-art decoy generation algorithms. Our third Cα-trace reconstruction approach is based on bee-colony optimization [24]. We demonstrate why this algorithm has some important properties that make it suitable for protein structure prediction. Our approach for model quality assessment (MQA) [64] makes use of distance constraints extracted from alignments to templates. We show how to use CN probabilities in an optimization algorithm for selecting good distance constraints and we introduce the concept of non-contacts. When comparing our algorithm with state-of-the-art MQA algorithms on the CASP7 benchmark, our algorithm is among the top-ranked algorithms. We are currently participating in CASP8 MQA with this algorithm.",
author = "Martin Paluszewski",
year = "2008",
language = "English",
publisher = "Department of Computer Science, University of Copenhagen",

}

RIS

TY - BOOK

T1 - Algorithms for Protein Structure Prediction

AU - Paluszewski, Martin

PY - 2008

Y1 - 2008

AB - The problem of predicting the three-dimensional structure of a protein given its amino acid sequence is one of the most important open problems in bioinformatics. One of the carbon atoms in amino acids is the Cα-atom and the overall structure of a protein is often represented by a so-called Cα-trace. Here we present three different approaches for reconstruction of Cα-traces from predictable measures. In our first approach [63, 62], the Cα-trace is positioned on a lattice and a tabu-search algorithm is applied to find minimum energy structures. The energy function is based on half-sphere-exposure (HSE) and contact number (CN) measures only. We show that the HSE measure is much more information-rich than CN; nevertheless, HSE does not appear to provide enough information to reconstruct the Cα-traces of real-sized proteins. Our experiments also show that using tabu search (with our novel tabu definition) is more robust than standard Monte Carlo search. In the second approach for reconstruction of Cα-traces, an exact branch and bound algorithm has been developed [67, 65]. The model is discrete and makes use of secondary structure predictions, HSE, CN and radius of gyration. We show how to compute good lower bounds for partial structures very fast. Using these lower bounds, we are able to find global minimum structures in a huge conformational space in reasonable time. We show that many of these global minimum structures are of good quality compared to the native structure. Our branch and bound algorithm is competitive in quality and speed with other state-of-the-art decoy generation algorithms. Our third Cα-trace reconstruction approach is based on bee-colony optimization [24]. We demonstrate why this algorithm has some important properties that make it suitable for protein structure prediction. Our approach for model quality assessment (MQA) [64] makes use of distance constraints extracted from alignments to templates. We show how to use CN probabilities in an optimization algorithm for selecting good distance constraints and we introduce the concept of non-contacts. When comparing our algorithm with state-of-the-art MQA algorithms on the CASP7 benchmark, our algorithm is among the top-ranked algorithms. We are currently participating in CASP8 MQA with this algorithm.

M3 - Ph.D. thesis

BT - Algorithms for Protein Structure Prediction

PB - Department of Computer Science, University of Copenhagen

CY - København

ER -

ID: 14772122