WASCO: A Wasserstein-based Statistical Tool to Compare Conformational Ensembles of Intrinsically Disordered Proteins

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

WASCO: A Wasserstein-based Statistical Tool to Compare Conformational Ensembles of Intrinsically Disordered Proteins. / González-Delgado, Javier; Sagar, Amin; Zanon, Christophe; Lindorff-Larsen, Kresten; Bernadó, Pau; Neuvial, Pierre; Cortés, Juan.

In: Journal of Molecular Biology, Vol. 435, No. 14, 168053, 2023.

Harvard

González-Delgado, J, Sagar, A, Zanon, C, Lindorff-Larsen, K, Bernadó, P, Neuvial, P & Cortés, J 2023, 'WASCO: A Wasserstein-based Statistical Tool to Compare Conformational Ensembles of Intrinsically Disordered Proteins', Journal of Molecular Biology, vol. 435, no. 14, 168053. https://doi.org/10.1016/j.jmb.2023.168053

APA

González-Delgado, J., Sagar, A., Zanon, C., Lindorff-Larsen, K., Bernadó, P., Neuvial, P., & Cortés, J. (2023). WASCO: A Wasserstein-based Statistical Tool to Compare Conformational Ensembles of Intrinsically Disordered Proteins. Journal of Molecular Biology, 435(14), Article 168053. https://doi.org/10.1016/j.jmb.2023.168053

Vancouver

González-Delgado J, Sagar A, Zanon C, Lindorff-Larsen K, Bernadó P, Neuvial P et al. WASCO: A Wasserstein-based Statistical Tool to Compare Conformational Ensembles of Intrinsically Disordered Proteins. Journal of Molecular Biology. 2023;435(14):168053. https://doi.org/10.1016/j.jmb.2023.168053

Author

González-Delgado, Javier; Sagar, Amin; Zanon, Christophe; Lindorff-Larsen, Kresten; Bernadó, Pau; Neuvial, Pierre; Cortés, Juan. / WASCO: A Wasserstein-based Statistical Tool to Compare Conformational Ensembles of Intrinsically Disordered Proteins. In: Journal of Molecular Biology. 2023; Vol. 435, No. 14.

Bibtex

@article{59a4c7d6f6034c50b9ea6abe8542e41f,
title = "WASCO: A Wasserstein-based Statistical Tool to Compare Conformational Ensembles of Intrinsically Disordered Proteins",
abstract = "The structural investigation of intrinsically disordered proteins (IDPs) requires ensemble models describing the diversity of the conformational states of the molecule. Due to their probabilistic nature, there is a need for new paradigms that understand and treat IDPs from a purely statistical point of view, considering their conformational ensembles as well-defined probability distributions. In this work, we define a conformational ensemble as an ordered set of probability distributions and provide a suitable metric to detect differences between two given ensembles at the residue level, both locally and globally. The underlying geometry of the conformational space is properly integrated, one ensemble being characterized by a set of probability distributions supported on the three-dimensional Euclidean space (for global-scale comparisons) and on the two-dimensional flat torus (for local-scale comparisons). The inherent uncertainty of the data is also taken into account to provide finer estimations of the differences between ensembles. Additionally, an overall distance between ensembles is defined from the differences at the residue level. We illustrate the potential of the approach with several examples of applications for the comparison of conformational ensembles: (i) produced from molecular dynamics (MD) simulations using different force fields, and (ii) before and after refinement with experimental data. We also show the usefulness of the method to assess the convergence of MD simulations, and discuss other potential applications such as in machine-learning-based approaches. The numerical tool has been implemented in Python through easy-to-use Jupyter Notebooks available at https://gitlab.laas.fr/moma/WASCO.",
keywords = "conformational ensembles, intrinsically disordered proteins, molecular dynamics simulations, SAXS/NMR ensemble refinement, Wasserstein distance matrices",
author = "Javier Gonz{\'a}lez-Delgado and Amin Sagar and Christophe Zanon and Kresten Lindorff-Larsen and Pau Bernad{\'o} and Pierre Neuvial and Juan Cort{\'e}s",
note = "Publisher Copyright: {\textcopyright} 2023 Elsevier Ltd",
year = "2023",
doi = "10.1016/j.jmb.2023.168053",
language = "English",
volume = "435",
journal = "Journal of Molecular Biology",
issn = "0022-2836",
publisher = "Academic Press",
number = "14",
pages = "168053",
}

RIS

TY - JOUR

T1 - WASCO

T2 - A Wasserstein-based Statistical Tool to Compare Conformational Ensembles of Intrinsically Disordered Proteins

AU - González-Delgado, Javier

AU - Sagar, Amin

AU - Zanon, Christophe

AU - Lindorff-Larsen, Kresten

AU - Bernadó, Pau

AU - Neuvial, Pierre

AU - Cortés, Juan

N1 - Publisher Copyright: © 2023 Elsevier Ltd

PY - 2023

Y1 - 2023

N2 - The structural investigation of intrinsically disordered proteins (IDPs) requires ensemble models describing the diversity of the conformational states of the molecule. Due to their probabilistic nature, there is a need for new paradigms that understand and treat IDPs from a purely statistical point of view, considering their conformational ensembles as well-defined probability distributions. In this work, we define a conformational ensemble as an ordered set of probability distributions and provide a suitable metric to detect differences between two given ensembles at the residue level, both locally and globally. The underlying geometry of the conformational space is properly integrated, one ensemble being characterized by a set of probability distributions supported on the three-dimensional Euclidean space (for global-scale comparisons) and on the two-dimensional flat torus (for local-scale comparisons). The inherent uncertainty of the data is also taken into account to provide finer estimations of the differences between ensembles. Additionally, an overall distance between ensembles is defined from the differences at the residue level. We illustrate the potential of the approach with several examples of applications for the comparison of conformational ensembles: (i) produced from molecular dynamics (MD) simulations using different force fields, and (ii) before and after refinement with experimental data. We also show the usefulness of the method to assess the convergence of MD simulations, and discuss other potential applications such as in machine-learning-based approaches. The numerical tool has been implemented in Python through easy-to-use Jupyter Notebooks available at https://gitlab.laas.fr/moma/WASCO.

AB - The structural investigation of intrinsically disordered proteins (IDPs) requires ensemble models describing the diversity of the conformational states of the molecule. Due to their probabilistic nature, there is a need for new paradigms that understand and treat IDPs from a purely statistical point of view, considering their conformational ensembles as well-defined probability distributions. In this work, we define a conformational ensemble as an ordered set of probability distributions and provide a suitable metric to detect differences between two given ensembles at the residue level, both locally and globally. The underlying geometry of the conformational space is properly integrated, one ensemble being characterized by a set of probability distributions supported on the three-dimensional Euclidean space (for global-scale comparisons) and on the two-dimensional flat torus (for local-scale comparisons). The inherent uncertainty of the data is also taken into account to provide finer estimations of the differences between ensembles. Additionally, an overall distance between ensembles is defined from the differences at the residue level. We illustrate the potential of the approach with several examples of applications for the comparison of conformational ensembles: (i) produced from molecular dynamics (MD) simulations using different force fields, and (ii) before and after refinement with experimental data. We also show the usefulness of the method to assess the convergence of MD simulations, and discuss other potential applications such as in machine-learning-based approaches. The numerical tool has been implemented in Python through easy-to-use Jupyter Notebooks available at https://gitlab.laas.fr/moma/WASCO.

KW - conformational ensembles

KW - intrinsically disordered proteins

KW - molecular dynamics simulations

KW - SAXS/NMR ensemble refinement

KW - Wasserstein distance matrices

U2 - 10.1016/j.jmb.2023.168053

DO - 10.1016/j.jmb.2023.168053

M3 - Journal article

C2 - 36934808

AN - SCOPUS:85151248153

VL - 435

JO - Journal of Molecular Biology

JF - Journal of Molecular Biology

SN - 0022-2836

IS - 14

M1 - 168053

ER -
