Ensuring scientific reproducibility in bio-macromolecular modeling via extensive, automated benchmarks

Research output: Contribution to journal › Journal article › Research › peer-review

  • Julia Koehler Leman
  • Sergey Lyskov
  • Steven M. Lewis
  • Jared Adolf-Bryfogle
  • Rebecca F. Alford
  • Kyle Barlow
  • Ziv Ben-Aharon
  • Daniel Farrell
  • Jason Fell
  • William A. Hansen
  • Ameya Harmalkar
  • Jeliazko Jeliazkov
  • Georg Kuenze
  • Justyna D. Krys
  • Ajasja Ljubetič
  • Amanda L. Loshbaugh
  • Jack Maguire
  • Rocco Moretti
  • Vikram Khipple Mulligan
  • Morgan L. Nance
  • Phuong T. Nguyen
  • Shane Ó Conchúir
  • Shourya S. Roy Burman
  • Rituparna Samanta
  • Shannon T. Smith
  • Frank Teets
  • Andrew Watkins
  • Hope Woods
  • Brahm J. Yachnin
  • Christopher D. Bahl
  • Chris Bailey-Kellogg
  • David Baker
  • Rhiju Das
  • Frank DiMaio
  • Sagar D. Khare
  • Tanja Kortemme
  • Jason W. Labonte
  • Jens Meiler
  • William Schief
  • Ora Schueler-Furman
  • Justin B. Siegel
  • Vladimir Yarov-Yarovoy
  • Brian Kuhlman
  • Andrew Leaver-Fay
  • Dominik Gront
  • Jeffrey J. Gray
  • Richard Bonneau

Each year vast international resources are wasted on irreproducible research. The scientific community has been slow to adopt standard software engineering practices, despite the increases in high-dimensional data, complexities of workflows, and computational environments. Here we show how scientific software applications can be created in a reproducible manner when simple design goals for reproducibility are met. We describe the implementation of a test server framework and 40 scientific benchmarks, covering numerous applications in Rosetta bio-macromolecular modeling. High performance computing cluster integration allows these benchmarks to run continuously and automatically. Detailed protocol captures are useful for developers and users of Rosetta and other macromolecular modeling tools. The framework and design concepts presented here are valuable for developers and users of any type of scientific software and for the scientific community to create reproducible methods. Specific examples highlight the utility of this framework, and the comprehensive documentation illustrates the ease of adding new tests in a matter of hours.

Original language: English
Article number: 6947
Journal: Nature Communications
Volume: 12
Number of pages: 15
ISSN: 2041-1723
Publication status: Published - 2021

Bibliographical note

Publisher Copyright:
© 2021, The Author(s).

ID: 288711473