Forecasting Extreme Precipitation Using k-nearest Forest Neighbors

Bachelor Thesis (2018)
Author(s)

Y. Dai (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J. Cai – Mentor

J.J. Velthoen – Mentor

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2018 Yinghao Dai
Publication Year
2018
Language
English
Graduation Date
23-07-2018
Awarding Institution
Delft University of Technology
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Precipitation has high spatial and temporal uncertainty, which makes it challenging to predict. We focus specifically on extreme amounts of precipitation. The Royal Netherlands Meteorological Institute (KNMI) uses a numerical model, which approximates the solutions to partial differential equations, to forecast precipitation and other weather variables. These forecasts have systematic errors, due to the model’s high sensitivity to input parameters. The errors can be corrected with statistical methods that exploit the relation between the predicted and the actual precipitation. We use a non-parametric regression set-up to estimate the conditional expectation of the actual weather given the forecasts of the KNMI's numerical weather prediction model. Specifically, we focus on predicting the maximum precipitation in a three-by-three-kilometer area in the Netherlands. Several methods exist for solving non-parametric regression problems; in this thesis we focus on k-nearest neighbors and random forests. A simulation study shows, however, that neither of these methods is capable of dealing with more complex regression problems, such as forecasting extreme precipitation. Therefore, we propose a newly developed method, called k-nearest forest neighbors, which is a generalization of the random forests approach. On the simulated data, this new method performs significantly better than k-nearest neighbors and random forests. When the methods are applied to a precipitation data set obtained from the KNMI, the method we developed also turns out to have more predictive power than the numerical weather model and the existing non-parametric regression approaches.
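
As an illustration of the non-parametric regression set-up described above, the following minimal sketch shows plain k-nearest neighbors regression, one of the two existing methods the abstract names. It estimates the conditional expectation of the observed value given a forecast by averaging the responses of the k closest training forecasts. The function name `knn_predict`, the simulated data, and the choice k = 25 are assumptions made for this example only; it does not use KNMI data and is not the thesis's k-nearest forest neighbors method.

```python
import random


def knn_predict(train_x, train_y, x, k):
    """Estimate E[Y | X = x] as the mean response of the k nearest training points."""
    # Rank training indices by squared Euclidean distance to the query point.
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


# Hypothetical simulated data: a one-dimensional "forecast" and an "observed"
# value that is twice the forecast plus Gaussian noise (a systematic error).
rng = random.Random(0)
train_x = [(rng.uniform(0.0, 10.0),) for _ in range(500)]
train_y = [2.0 * x[0] + rng.gauss(0.0, 1.0) for x in train_x]

# Corrected estimate for a forecast of 5.0; the noise-free value would be 10.0.
estimate = knn_predict(train_x, train_y, (5.0,), k=25)
print(estimate)
```

Because the estimate is a local average, it is pulled toward typical values in the neighborhood of the query, which is one intuitive reason a plain average of neighbors can struggle with rare, extreme responses.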

Files

License info not available