NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning. / Klausen, Michael Schantz; Jespersen, Martin Closter; Nielsen, Henrik; Jensen, Kamilla Kjærgaard; Jurtz, Vanessa Isabell; Sønderby, Casper Kaae; Sommer, Morten Otto Alexander; Winther, Ole; Nielsen, Morten; Petersen, Bent; Marcatili, Paolo.

In: Proteins: Structure, Function, and Bioinformatics, Vol. 87, No. 6, 2019, p. 520-527.

Harvard

Klausen, MS, Jespersen, MC, Nielsen, H, Jensen, KK, Jurtz, VI, Sønderby, CK, Sommer, MOA, Winther, O, Nielsen, M, Petersen, B & Marcatili, P 2019, 'NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning', Proteins: Structure, Function, and Bioinformatics, vol. 87, no. 6, pp. 520-527. https://doi.org/10.1002/prot.25674

APA

Klausen, M. S., Jespersen, M. C., Nielsen, H., Jensen, K. K., Jurtz, V. I., Sønderby, C. K., Sommer, M. O. A., Winther, O., Nielsen, M., Petersen, B., & Marcatili, P. (2019). NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning. Proteins: Structure, Function, and Bioinformatics, 87(6), 520-527. https://doi.org/10.1002/prot.25674

Vancouver

Klausen MS, Jespersen MC, Nielsen H, Jensen KK, Jurtz VI, Sønderby CK et al. NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning. Proteins: Structure, Function, and Bioinformatics. 2019;87(6):520-527. https://doi.org/10.1002/prot.25674

Author

Klausen, Michael Schantz; Jespersen, Martin Closter; Nielsen, Henrik; Jensen, Kamilla Kjærgaard; Jurtz, Vanessa Isabell; Sønderby, Casper Kaae; Sommer, Morten Otto Alexander; Winther, Ole; Nielsen, Morten; Petersen, Bent; Marcatili, Paolo. / NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning. In: Proteins: Structure, Function, and Bioinformatics. 2019; Vol. 87, No. 6. pp. 520-527.

Bibtex

@article{f5447764f7404f56af0882e9f591cc10,
title = "NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning",
abstract = "The ability to predict local structural features of a protein from the primary sequence is of paramount importance for unraveling its function in absence of experimental structural information. Two main factors affect the utility of potential prediction tools: their accuracy must enable extraction of reliable structural information on the proteins of interest, and their runtime must be low to keep pace with sequencing data being generated at a constantly increasing speed. Here, we present NetSurfP-2.0, a novel tool that can predict the most important local structural features with unprecedented accuracy and runtime. NetSurfP-2.0 is sequence-based and uses an architecture composed of convolutional and long short-term memory neural networks trained on solved protein structures. Using a single integrated model, NetSurfP-2.0 predicts solvent accessibility, secondary structure, structural disorder, and backbone dihedral angles for each residue of the input sequences. We assessed the accuracy of NetSurfP-2.0 on several independent test datasets and found it to consistently produce state-of-the-art predictions for each of its output features. We observe a correlation of 80% between predictions and experimental data for solvent accessibility, and a precision of 85% on secondary structure 3-class predictions. In addition to improved accuracy, the processing time has been optimized to allow predicting more than 1000 proteins in less than 2 hours, and complete proteomes in less than 1 day.",
keywords = "deep learning, disorder, local structure prediction, secondary structure, solvent accessibility",
author = "Klausen, {Michael Schantz} and Jespersen, {Martin Closter} and Henrik Nielsen and Jensen, {Kamilla Kj{\ae}rgaard} and Jurtz, {Vanessa Isabell} and S{\o}nderby, {Casper Kaae} and Sommer, {Morten Otto Alexander} and Ole Winther and Morten Nielsen and Bent Petersen and Paolo Marcatili",
year = "2019",
doi = "10.1002/prot.25674",
language = "English",
volume = "87",
pages = "520--527",
journal = "Proteins: Structure, Function, and Bioinformatics",
issn = "0887-3585",
publisher = "John Wiley \& Sons, Inc.",
number = "6",

}

RIS

TY - JOUR

T1 - NetSurfP-2.0

T2 - improved prediction of protein structural features by integrated deep learning

AU - Klausen, Michael Schantz

AU - Jespersen, Martin Closter

AU - Nielsen, Henrik

AU - Jensen, Kamilla Kjærgaard

AU - Jurtz, Vanessa Isabell

AU - Sønderby, Casper Kaae

AU - Sommer, Morten Otto Alexander

AU - Winther, Ole

AU - Nielsen, Morten

AU - Petersen, Bent

AU - Marcatili, Paolo

PY - 2019

Y1 - 2019

N2 - The ability to predict local structural features of a protein from the primary sequence is of paramount importance for unraveling its function in absence of experimental structural information. Two main factors affect the utility of potential prediction tools: their accuracy must enable extraction of reliable structural information on the proteins of interest, and their runtime must be low to keep pace with sequencing data being generated at a constantly increasing speed. Here, we present NetSurfP-2.0, a novel tool that can predict the most important local structural features with unprecedented accuracy and runtime. NetSurfP-2.0 is sequence-based and uses an architecture composed of convolutional and long short-term memory neural networks trained on solved protein structures. Using a single integrated model, NetSurfP-2.0 predicts solvent accessibility, secondary structure, structural disorder, and backbone dihedral angles for each residue of the input sequences. We assessed the accuracy of NetSurfP-2.0 on several independent test datasets and found it to consistently produce state-of-the-art predictions for each of its output features. We observe a correlation of 80% between predictions and experimental data for solvent accessibility, and a precision of 85% on secondary structure 3-class predictions. In addition to improved accuracy, the processing time has been optimized to allow predicting more than 1000 proteins in less than 2 hours, and complete proteomes in less than 1 day.

AB - The ability to predict local structural features of a protein from the primary sequence is of paramount importance for unraveling its function in absence of experimental structural information. Two main factors affect the utility of potential prediction tools: their accuracy must enable extraction of reliable structural information on the proteins of interest, and their runtime must be low to keep pace with sequencing data being generated at a constantly increasing speed. Here, we present NetSurfP-2.0, a novel tool that can predict the most important local structural features with unprecedented accuracy and runtime. NetSurfP-2.0 is sequence-based and uses an architecture composed of convolutional and long short-term memory neural networks trained on solved protein structures. Using a single integrated model, NetSurfP-2.0 predicts solvent accessibility, secondary structure, structural disorder, and backbone dihedral angles for each residue of the input sequences. We assessed the accuracy of NetSurfP-2.0 on several independent test datasets and found it to consistently produce state-of-the-art predictions for each of its output features. We observe a correlation of 80% between predictions and experimental data for solvent accessibility, and a precision of 85% on secondary structure 3-class predictions. In addition to improved accuracy, the processing time has been optimized to allow predicting more than 1000 proteins in less than 2 hours, and complete proteomes in less than 1 day.

KW - deep learning

KW - disorder

KW - local structure prediction

KW - secondary structure

KW - solvent accessibility

U2 - 10.1002/prot.25674

DO - 10.1002/prot.25674

M3 - Journal article

C2 - 30785653

AN - SCOPUS:85054928375

VL - 87

SP - 520

EP - 527

JO - Proteins: Structure, Function, and Bioinformatics

JF - Proteins: Structure, Function, and Bioinformatics

SN - 0887-3585

IS - 6

ER -