Predicting water quality from geospatial lake, catchment, and buffer zone characteristics in temperate lowland lakes

Research output: Contribution to journal › Journal article › Research › peer-review

Documents

  • Fulltext

    Final published version, 2.18 MB, PDF document

Lakes provide essential ecosystem services and strongly influence landscape nutrient and carbon cycling. Therefore, monitoring water quality is essential for the management of element transport, biodiversity, and public goods in lakes. We investigated the ability of machine learning models to predict eight important water quality variables (alkalinity, pH, total phosphorus, total nitrogen, chlorophyll a, Secchi depth, color, and pCO₂) using monitoring data from 924 to 1054 lakes. The geospatial predictor variables comprise a wide range of potential drivers at the lake, buffer zone, and catchment level. We compared the performance of nine predictive models of varying complexity for each of the eight water quality variables. The best models (Random Forest and Support Vector Machine in six and two cases, respectively) generally performed well on the test set (R² = 0.28–0.60). Models were then used to predict water quality for all 180,377 mapped Danish lakes. Additionally, we trained models to predict each water quality variable from the predictions generated for the remaining seven variables, which improved model performance (R² = 0.45–0.78). Overall, the uncovered relationships were in line with the findings of previous studies: for example, total nitrogen was positively related to catchment agriculture, while chlorophyll a, Secchi depth, and alkalinity were influenced by soil type and landscape history. Remarkably, buffer zone geomorphology (curvature, ruggedness, and elevation) had a strong influence on nutrients, chlorophyll a, and Secchi depth; for example, curvature was positively related to nutrients and chlorophyll a and negatively related to Secchi depth. Lake area was a strong predictor of multiple variables, especially pH (positive relationship), pCO₂ (negative), and color (negative).
Our analysis shows that the combination of machine learning methods and geospatial data can be used to predict lake water quality and improve national upscaling of predictions related to nutrient and carbon cycling.
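
The evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline, which the abstract does not specify. It only shows how a test-set R² can be computed, how the best of several candidate models might be picked by that score, and how second-stage predictors could be assembled from the predictions for the other variables. All function names, variable names, and numbers below are hypothetical.

```python
from statistics import fmean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def best_model(observed, predictions_by_model):
    """Return (model_name, R2) for the candidate with the highest test-set R2."""
    scores = {
        name: r_squared(observed, pred)
        for name, pred in predictions_by_model.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]


def second_stage_features(predicted, target):
    """One feature row per lake, built from first-stage predictions
    for every water quality variable except `target`."""
    others = [var for var in predicted if var != target]
    n_lakes = len(predicted[target])
    return [[predicted[var][i] for var in others] for i in range(n_lakes)]


if __name__ == "__main__":
    # Made-up test-set values for a single variable and two candidate models.
    observed = [1.0, 2.0, 3.0, 4.0]
    candidates = {
        "random_forest": [1.0, 2.0, 3.0, 5.0],
        "support_vector_machine": [2.5, 2.5, 2.5, 2.5],
    }
    print(best_model(observed, candidates))

    # Made-up first-stage predictions for three variables and two lakes.
    first_stage = {
        "pH": [7.9, 8.2],
        "total_nitrogen": [1.4, 0.9],
        "secchi_depth": [1.1, 2.3],
    }
    print(second_stage_features(first_stage, target="pH"))
```

In this sketch the second-stage rows would be passed to whatever regressor is trained for the target variable; the choice of regressor is left open because the abstract does not say which one was used at that stage.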

Original language: English
Article number: 158090
Journal: Science of the Total Environment
Volume: 851
Number of pages: 12
ISSN: 0048-9697
Publication status: Published - 2022

Bibliographical note

Publisher Copyright:
© 2022 The Authors

    Research areas

  • Carbon dioxide, Geomorphology, Machine learning, Nutrients, Predictive modeling, Watershed

ID: 322801993