Use of multiway Partial Least Squares Regression (N-PLS) as model emulator to quantify climate change induced uncertainty in future marine chlorophyll-a concentrations


Traditionally, quantifying climate change induced uncertainty in ecological indicators requires stochastic simulation with a chain of physically-based models describing processes such as hydrodynamics, waves, sediment transport and ecology. Such Monte Carlo simulation of the entire model chain, especially with a large sample size, is however computationally expensive and often infeasible. This paper investigates how regression models can potentially replace physically-based models and predict chlorophyll-a concentration directly from meteorological variables. Since several correlated meteorological variables are used to estimate one ecological response variable, a multicollinearity problem is present, which makes Partial Least Squares (PLS) regression a favourable supervised technique. In addition, the climate change projection dataset at hand is multidimensional: it contains several variables that vary not only over time but also over space. Consequently, a multiway regression model that can account for the spatial dimension should be applied. The multiway PLS regression (N-PLS) algorithm is a promising candidate for this purpose. N-PLS is an extension of the ordinary two-way PLS regression algorithm to multiway data, in which the bilinear model of the predictors is essentially replaced with a multilinear model. To test its efficiency, the N-PLS algorithm was compared with other unsupervised and supervised, two-way and multiway techniques using both synthetic and real datasets. The real dataset consists of meteorological variables from KNMI (Royal Netherlands Meteorological Institute) and chlorophyll-a concentrations obtained from the Delft3D WAQ ecological model.
Firstly, it was confirmed that supervised techniques should be favoured over unsupervised ones, because they take the correlation with the response variable into account, which reduces prediction error. Moreover, the results suggest that multiway methods can improve prediction accuracy, although the magnitude of the improvement is case dependent. In conclusion, N-PLS, as a supervised multiway method, was found to be a promising regression model for predicting chlorophyll-a concentration directly from meteorological variables. Finally, owing to its short computation time, the algorithm could be suitable for stochastic simulation with large sample sizes for the assessment of climate change induced uncertainty in coastal ecosystem indicators. Future work will focus on applying the fitted N-PLS model to EURO-CORDEX climate change projections and quantifying the related uncertainties in the Wadden Sea ecosystem.
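
To illustrate how a multilinear model of the predictors replaces the bilinear one, the following is a minimal sketch of a tri-linear PLS1 fit for a single response. It follows the commonly described formulation of the algorithm and is not the implementation used in this study. The function names (`npls1_fit`, `npls1_predict`), the array layout (samples × variables × locations) and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np


def npls1_fit(X, y, n_components):
    """Fit a tri-linear PLS1 sketch to X (samples x J x K) and y (samples,)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, J, K = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr = X - x_mean   # centred predictors, deflated inside the loop
    y0 = y - y_mean   # centred response
    yr = y0.copy()    # current response residual
    WJ = np.zeros((J, n_components))  # weights along the second mode
    WK = np.zeros((K, n_components))  # weights along the third mode
    T = np.zeros((n, n_components))   # sample scores
    b = np.zeros(0)
    rss = []
    for f in range(n_components):
        # Z[j, k] = sum_i yr[i] * Xr[i, j, k]
        Z = np.einsum("i,ijk->jk", yr, Xr)
        # The leading singular vectors of Z give the rank-one weights that
        # maximise the covariance between the score vector and the residual.
        U, _, Vt = np.linalg.svd(Z, full_matrices=False)
        wj, wk = U[:, 0], Vt[0]
        t = np.einsum("ijk,j,k->i", Xr, wj, wk)
        WJ[:, f], WK[:, f], T[:, f] = wj, wk, t
        # Regress the centred response on all scores found so far.
        b, *_ = np.linalg.lstsq(T[:, : f + 1], y0, rcond=None)
        # Deflate the predictors with the rank-one term, update the residual.
        Xr = Xr - np.einsum("i,j,k->ijk", t, wj, wk)
        yr = y0 - T[:, : f + 1] @ b
        rss.append(float(yr @ yr))
    return {"x_mean": x_mean, "y_mean": y_mean, "WJ": WJ, "WK": WK,
            "b": b, "rss": rss}


def npls1_predict(model, X):
    """Predict the response for new three-way data with a fitted sketch model."""
    Xr = np.asarray(X, dtype=float) - model["x_mean"]
    WJ, WK = model["WJ"], model["WK"]
    T = np.zeros((Xr.shape[0], WJ.shape[1]))
    for f in range(WJ.shape[1]):
        # Same score computation and deflation sequence as during fitting.
        t = np.einsum("ijk,j,k->i", Xr, WJ[:, f], WK[:, f])
        T[:, f] = t
        Xr = Xr - np.einsum("i,j,k->ijk", t, WJ[:, f], WK[:, f])
    return model["y_mean"] + T @ model["b"]


# Synthetic example (not the KNMI / Delft3D WAQ data): the response depends
# on the predictors through a rank-one (tri-linear) coefficient array.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4, 5))
a, c = rng.normal(size=4), rng.normal(size=5)
y = np.einsum("ijk,j,k->i", X, a, c) + 0.1 * rng.normal(size=200)

model = npls1_fit(X[:150], y[:150], n_components=3)
y_hat = npls1_predict(model, X[150:])
print("training RSS per component:", model["rss"])
print("hold-out RMSE:", float(np.sqrt(np.mean((y_hat - y[150:]) ** 2))))
```

Once fitted, prediction involves only a few tensor contractions per sample, which is why an emulator of this kind can be evaluated cheaply for large stochastic samples. The number of components would normally be chosen by cross-validation, which this sketch omits.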