G. Jongbloed

21 records found

We consider the problem of online nonparametric regression for signals of length n with total variation at most Cn whose observations are contaminated by σ-subgaussian noise. While there exist many algorithms which achieve optimal performance under the assumption of independent n ...
This thesis examines rainfall event characteristics in the Netherlands over a 26-year period (1998–2023) using radar-derived precipitation data. Extreme precipitation is a major contributor to flooding, which impacts human life, infrastructure, and ecosystems. A life-cycle-based ...

Advancing Gaussian Process Bandit Optimization for Time-Varying Functions

Online Learning in the Continuous Time-Varying Setting

This thesis investigates the problem of time-varying function optimization. In particular, we study techniques to minimize the cumulative regret when optimizing a time-varying function in the Gaussian process setting. First, we introduce the problem and present a literature revie ...

A Novel Optimal Execution Strategy

Using data-driven methods and stochastic modeling, with application in the FX Spot Market

This thesis presents a novel approach to optimize execution strategies in the Foreign Exchange Spot Market, focusing on the application of data-driven methodologies and stochastic modeling.
It begins by proposing a new measure to evaluate the limit order book volume imbalanc ...
This thesis aims to enhance existing models that infer parameters describing the spread of a virus by analyzing the distribution of empirical cluster sizes of identical genetic sequences. An approach that has gained recent popularity assumes that each individual cluster can be mo ...
The growing demand for renewable energy and the increased installation of wind turbines have brought challenges related to operational efficiency and predictive accuracy. In this thesis, we extend the method 'Narrowest Significance Pursuit' (NSP) to non-linear frameworks and expl ...
With the ever-increasing need to reduce the use of fossil fuels, Tesla is accelerating the world's transition to sustainable energy. This means replacing all internal combustion vehicles with electric ones over time. The growing number of Tesla vehicles on the road poses interest ...
The Bayesian approach is an important tool for tackling problems in statistics. It involves choosing a distribution that reflects the prior knowledge and thus takes all knowledge into account, in contrast to the frequentist approach. It also assumes that the parameters (t ...

Evaluating Constant Failure Rates in Storm Surge Barriers

A Statistical Framework Applied to Censored Component Lifetimes of the Oosterscheldekering

This study examines the validity of constant failure rates in the reliability assessment of storm surge barriers, with a focus on the Stormvloedkering Oosterschelde (SVKO). Analysing a dataset of 1,501 malfunctions, including 87 critical incidents over six years, we employ Expone ...
In mathematics, the 'size' of a set is known by the term 'cardinality'. Because the cardinality of the natural numbers cannot be expressed by a finite natural number n, the cardinality of this set is uniquely defined as aleph ...

Improving data quality is of the utmost importance for any data-driven company, as data quality is unmistakably tied to business analytics and processes. One method to improve data quality is to restore missing and incorrect data entries.

The goal of this research is to construct an algorithm that can restore missing and incorrect data entries while making use of a human adaptive framework. This algorithm has been constructed in a modular fashion and consists of three main modules: Data Transformation, Data Structure Analysis and Model Selection. Data Transformation converts raw data to data types and forms the other modules can use.

Data Structure Analysis has been designed to deal with correctly missing data and dichotomy in the target feature by making use of three clustering algorithms: DBSCAN, K-Means and Diffusion Maps. DBSCAN is used to determine the necessity of clustering as well as the initialisation of the K-Means algorithm. K-Means and Diffusion Maps have been used as clustering methods in the one-dimensional target feature and the two-dimensional input-target feature pairs, respectively. Data Structure Analysis has further been designed to perform feature selection through three filter methods: CorrCoef, FCBF and Treelet.
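The interplay of DBSCAN and K-Means described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes a one-dimensional target feature, uses a simplified DBSCAN with min_samples = 1 (which in one dimension reduces to splitting the sorted values at gaps larger than eps), and all function names, the data and the eps value are hypothetical.

```python
from statistics import mean


def density_groups(values, eps):
    """Simplified 1-D DBSCAN with min_samples=1: split sorted values at gaps > eps."""
    if not values:
        return []
    ordered = sorted(values)
    groups = [[ordered[0]]]
    for v in ordered[1:]:
        if v - groups[-1][-1] > eps:
            groups.append([v])  # gap too large: start a new dense group
        else:
            groups[-1].append(v)
    return groups


def kmeans_1d(values, centers, max_iter=100):
    """Lloyd's algorithm in one dimension, started from the given centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return centers, clusters


def cluster_target(values, eps):
    """Use the density step to decide whether to cluster and to initialise K-Means."""
    groups = density_groups(values, eps)
    if len(groups) < 2:
        return None  # a single dense group: clustering is deemed unnecessary
    initial_centers = [mean(g) for g in groups]
    return kmeans_1d(values, initial_centers)


# Hypothetical target feature with two well-separated regimes.
target = [1.0, 1.2, 0.9, 1.1, 10.0, 10.3, 9.8]
centers, clusters = cluster_target(target, eps=1.0)
```

In this sketch the density step plays both roles named in the abstract: it signals whether clustering is needed at all, and it supplies the number of clusters and starting centers for K-Means.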

Model Selection proposes a novel approach to selecting the best model from a candidate set through the optimisation of a conditional model ranking strategy based on the prior construction of theoretical testing. Our candidate set consisted of Expectation Maximisation, K-Means, Multi-Layer Perceptron, Nearest Neighbor, Random Forest, Linear Regression, Polynomial Regression and ElasticNet Regression.
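As a rough illustration of choosing the best model from a candidate set, the sketch below ranks candidates by their error on held-out entries whose true values are known. The conditional ranking strategy and the theoretical tests of the thesis are not specified in this abstract, so plain hold-out validation is swapped in here; the three small candidates, the data and all names are hypothetical.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_linear(xs, ys):
    """Candidate: one-dimensional least-squares linear regression."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def fit_nearest(xs, ys):
    """Candidate: 1-nearest-neighbour lookup."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def rank_models(candidates, train, holdout):
    """Return (name, mean absolute hold-out error) pairs, best first."""
    xs, ys = zip(*train)
    scores = []
    for name, fit in candidates.items():
        predict = fit(xs, ys)
        error = mean(abs(predict(x) - y) for x, y in holdout)
        scores.append((name, error))
    return sorted(scores, key=lambda s: s[1])


candidates = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}
train = [(0, 1.0), (1, 3.0), (2, 5.0), (4, 9.0)]  # known (input, target) entries
holdout = [(3, 7.0), (5, 11.0)]  # entries hidden during fitting
ranking = rank_models(candidates, train, holdout)
best_name = ranking[0][0]
```

The top-ranked model would then be the one used to restore the entries that are actually missing.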

In terms of restorability, it was shown that the optimal configuration of the Cleansing Algorithm for the restoration of missing data was obtained by opting not to use clustering, using a custom alteration to the Treelet algorithm for feature selection and making use of the model selection strategy. This not only led to the greatest restorability of 56.90% on Aegon data sets, which was an improvement of 44.83% when compared to not using the Cleansing Algorithm, but also to the reduction of computation time by over 400%. A more realistic restorability, given the presence of correctly missing data, was obtained with the same configuration making use of one-dimensional output clustering. This resulted in a restorability on Aegon data sets of 43.10%. As such, it was deemed possible to restore missing data on Aegon data sets.

With respect to the human adaptive framework, the algorithm was constructed to be modular, in the sense that any alternative feature selection or clustering approach can be implemented with ease. Furthermore, the model selection module allows us to customize the theoretical testing and the choice of regression or classification models for the restoration of missing data. In doing so, the algorithm has laid the foundations for human adaptivity of the Cleansing Algorithm.

This thesis examines statistical methods to find the right timing of intrauterine insemination treatment relative to the start of the follow-up of the couples. Intrauterine insemination is a fertility treatment conducted by injecting refined sperm into a woman's uterus. Lots of r ...

Classification in football

Activity classification using sensor data in football

The goal of this report is to present and describe the effort surrounding the completion of the Master Thesis Project of classification in football. Classification is a procedure which belongs in the field of Statistics. The objective is to capture, detect and distinguish certain act ...

Forensic speaker recognition

Based on text analysis of transcribed speech fragments

Currently, speaker recognition research is mainly based on phonetics and speech signal processing. This research addresses speaker recognition from a new perspective, analysing the transcription of a fragment of speech with text analysis methods. Since text analysis is based on t ...

Web-Based Economic Activity Classification

Comparing semi-supervised text classification methods to deal with noisy labels

In order to provide accurate statistics for industries, the classification of enterprises by economic activity is an important task for national statistical institutes. The economic activity codes in the Dutch business register are less accurate for small enterprises since small ...
Since 2015/2016, havo students taking wiskunde A have received a sheet with several rules of thumb at their final exam. In this thesis I investigated the background of these rules of thumb and critically examined the rules themselves.
What is the actual value of a house and which factors contribute the most to it? In this thesis we conduct thorough research and try to come up with an answer. We will set up models that approximate the current market values for all houses in the Netherlands. To do this, we use a lo ...
This thesis is dedicated to the application of data science to sports data. The research for this thesis is part of a bigger project on injury prevention and sport performance called Citius Altius Sanius (CAS). Two data sets from two different projects within CAS are analysed, wi ...
In this report, inhomogeneous Lévy processes are studied in a discrete observational model based on derivatives of the process. First, homogeneous Lévy models are defined and an already known nonparametric method, using Fourier techniques and call and put option prices, for estim ...
When the fire brigade arrives at a burning building, it is of vital importance that people who are still inside can quickly be found. In this thesis we contribute to an ultrasonic sound sensor for human presence detection in smoke-filled spaces. This type of sensor could assist the r ...