Modeling gross primary productivity across different European ecosystem types
Evaluating the versatility of SARIMAX, XGBoost, and LSTM using ICOS FLUXNET and Sentinel-2 data
Anna Spinosa (TU Delft - Electrical Engineering, Mathematics and Computer Science, Deltares)
Karisma Karisma (University of Twente, Deltares)
Marieke A. Eleveld (Deltares, TU Delft - Civil Engineering & Geosciences)
Mario Alberto Fuentes-Monjaraz (Deltares)
Valeria Mobilia (Deltares)
Ulf Mallast (UFZ - Helmholtz Centre for Environmental Research)
Johannes Peterseil (Environmental Agency of Austria)
Ghada El Serafy (TU Delft - Electrical Engineering, Mathematics and Computer Science, Deltares)
Abstract
Predicting gross primary productivity (GPP) is key to understanding ecosystem health and quantifying the global carbon cycle. While data-driven models have shown strong performance in capturing GPP dynamics at specific sites, their ability to generalize across ecosystems without site-specific recalibration remains largely untested. This study addresses this gap by evaluating how well XGBoost and LSTM models estimate GPP across different European ecosystems. We developed a unified (cross-site) modeling framework that uses incremental learning to integrate in-situ eddy covariance observations with Sentinel-2–derived vegetation indices. Model performance was assessed in two settings: (i) site-specific models, developed to capture individual site characteristics, and (ii) cross-site generalization, including evaluation on an independent dataset of unseen ecosystems. SARIMAX was included as a site-specific statistical benchmark. Our findings indicate that XGBoost consistently outperformed the other models, achieving site-specific R² values above 0.90 in forest and grassland ecosystems and an average R² of 0.72 across unseen sites (range 0.66–0.78). LSTM predicted GPP peaks more accurately at the site-specific level, particularly in cropland and forest ecosystems. At the site level, SARIMAX performed comparably to XGBoost but struggled to capture the rapid temporal variation of GPP. These findings demonstrate the feasibility of a data-driven framework for cross-site GPP monitoring within European flux-tower networks and mark a first step toward transferable GPP prediction without site-specific recalibration.
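The cross-site figures quoted in the abstract (a mean R² and its range over unseen sites) can be illustrated with a minimal sketch. It assumes that R² is computed separately for each held-out site and then averaged, which the abstract does not state explicitly. The function names `r2_score` and `summarize_unseen_sites`, the site labels, and the GPP values are hypothetical toy inputs, not data or code from the study.

```python
from statistics import mean


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed series is constant; R2 is undefined")
    return 1.0 - ss_res / ss_tot


def summarize_unseen_sites(site_series):
    """Score each held-out site on its own, then report mean and range.

    Assumption: per-site scoring followed by averaging; the study may
    aggregate differently.
    """
    scores = {
        site: r2_score(obs, pred) for site, (obs, pred) in site_series.items()
    }
    values = list(scores.values())
    return scores, mean(values), (min(values), max(values))


# Made-up (observed, predicted) GPP pairs for two hypothetical unseen sites.
toy_sites = {
    "site_A": ([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]),
    "site_B": ([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 3.0, 3.0]),
}

scores, avg_r2, (min_r2, max_r2) = summarize_unseen_sites(toy_sites)
print(f"mean R2 = {avg_r2:.2f}, range = {min_r2:.2f} to {max_r2:.2f}")
```

Scoring each site separately keeps a data-rich site from dominating the summary, which is one plausible reason to report both a mean and a range rather than a single pooled value.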