A Non-Parametric Bayesian Network Hydrologic Model: A Case Study of a Lowland Catchment

Abstract

Over the past decades, the increasing availability of data has paved the way for a new, data-driven generation of models. This research proposes a non-parametric Bayesian network (NPBN) to model hydrologic processes. A Bayesian network (BN) is a directed acyclic graph in which the nodes represent the variables and the arcs represent the conditional probability distributions between variable pairs. NPBNs are computationally less expensive than many conceptual hydrologic models and are flexible enough to handle different continuous data sources. The goal of this thesis is to build an NPBN for a lowland catchment and test its performance. The case study concerns the catchment of the Vledder, Wapserveense and Steenwijker Aa. With this catchment, this research is the first in which an NPBN is comprehensively implemented for (1) a single catchment in which the catchment processes are modelled, (2) a Dutch catchment, and (3) a lowland, partially managed catchment. Seven hydro-meteorological variables have been selected for the BN, complemented by the target variable, the monthly maximum daily average discharge (MMDAD). The aim of the BN is to predict the MMDAD accurately, and the Kling-Gupta efficiency (KGE) serves as the performance indicator by which the BN's parameters are optimized. The Gaussian copula was selected for all variable combinations in the BN, because it allows the multivariate normal distribution to be used to calculate a conditioned network. The fit of the Gaussian copula to the data is tested in this thesis. This approach is far more convenient than the vine-copula alternative and most likely gives a better fit than the other alternatives. Three candidate distributions are compared for modelling the marginal distributions, of which the Gaussian mixture model has been selected.
This mixture model extrapolated too little, so a novel alteration function has been proposed to shift the predictions. Several other parameters in the BN have been analysed as well. A sensitivity analysis has been performed to understand how artificially introduced errors would influence the model. In general, random errors have a low influence on the model's predictions, whereas new systematic errors have a larger influence. Criteria for a practical, well-performing BN have been presented, and a strategy for creating a model that satisfies these criteria has been assembled. This strategy left the choice of some connection implementations open to interpretation; in those cases, the implementation that produced the best predictions of the relevant variable within the network was chosen. The final model achieved a median KGE of 0.73 in k-fold testing when predicting the MMDAD. It is also analysed how well the MMDAD is predicted when not all other variables are fixed. Another novelty is that the BN model is benchmarked against a SOBEK model, a neural network, and a multiple linear regression model. Compared to these models, the BN performs well. Moreover, each of these other models lacks some of the advantages that the unsaturated BN has.
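
As an illustration of two ingredients named in this abstract, the following minimal Python sketch shows (a) the KGE score and (b) how a Gaussian copula lets a conditional prediction be computed in standard-normal space. It is not the thesis implementation. The function names `kge` and `condition_pair` are hypothetical, the KGE is assumed to be in its common 2009 form (correlation, variability ratio, bias ratio), the marginals are plain normal distributions standing in for the Gaussian mixture models, and only a single conditioning variable is handled, whereas the full network would condition a multivariate normal on several variables at once.

```python
from math import sqrt
from statistics import NormalDist, fmean, pstdev

STD_NORMAL = NormalDist()  # standard normal: the latent space of the Gaussian copula


def kge(sim, obs):
    """Kling-Gupta efficiency, assumed 2009 form; 1.0 is a perfect score."""
    mu_s, mu_o = fmean(sim), fmean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    cov = fmean([(s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)])
    r = cov / (sd_s * sd_o)  # linear correlation
    alpha = sd_s / sd_o      # variability ratio
    beta = mu_s / mu_o       # bias ratio
    return 1.0 - sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def condition_pair(x_obs, marg_obs, marg_target, rho):
    """Conditional median and 90% interval of a target given one observed variable.

    marg_obs and marg_target are hypothetical marginals exposing cdf() and
    inv_cdf(); rho is the copula correlation in normal space, with |rho| < 1.
    """
    # 1. Map the observation to standard-normal space through its marginal CDF.
    z = STD_NORMAL.inv_cdf(marg_obs.cdf(x_obs))
    # 2. Condition the bivariate standard normal: Z_t | Z_o = z.
    cond = NormalDist(mu=rho * z, sigma=sqrt(1.0 - rho ** 2))

    # 3. Map quantiles of the conditional distribution back to data space.
    def to_data(q):
        return marg_target.inv_cdf(STD_NORMAL.cdf(cond.inv_cdf(q)))

    return to_data(0.5), (to_data(0.05), to_data(0.95))


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    rainfall = NormalDist(mu=10.0, sigma=2.0)     # stand-in marginal of a predictor
    discharge = NormalDist(mu=50.0, sigma=10.0)   # stand-in marginal of the target
    median, interval = condition_pair(12.0, rainfall, discharge, rho=0.8)
    print("conditional median:", median, "90% interval:", interval)
    print("KGE:", kge([1.1, 2.0, 2.7, 4.2], [1.0, 2.0, 3.0, 4.0]))
```

Working in normal space is what makes the Gaussian copula convenient: the conditional distribution has a closed form, so no sampling or numerical integration is needed to obtain a prediction and its uncertainty band.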