T.J. Viering | TU Delft Repository

Transformers can do Bayesian Clustering

Master thesis (2025) - P. Bhaskaran, T.J. Viering, O.K. Shirekar, M.J.T. Reinders, J.W. Böhmer

Motivation: Clustering is an unsupervised learning task with broad applications. Traditional clustering methods often rely on point estimates of model parameters, which can limit their ability to capture uncertainty. Bayesian clustering addresses this by incorporating unce ...

Extrapolating Learning Curves: When Do Neural Networks Outperform Parametric Models?

Bachelor thesis (2025) - A. Cazacu, T.J. Viering, C. Yan, S. Mukherjee, M.T.J. Spaan

Learning curve extrapolation helps practitioners predict model performance at larger data scales, enabling better planning for data collection and computational resource allocation. This paper investigates when neural networks outperform parametric models for this task. We conduc ...

Sharpness-Aware Optimization for Stability Gap Reduction

Bachelor thesis (2025) - K. Sycheva, G.M. van de Ven, T.J. Viering, A. Hanjalic

One of the problems in continual learning, where models are trained sequentially on tasks, is a sudden drop in performance after switching to a new task, called the stability gap. The presence of a stability gap likely indicates that training is not done optimally. In this work we aim ...

The Impact of Imbalanced Training Data on Learning Curve Prior-Fitted Networks

Bachelor thesis (2025) - B. Kostov, Cheng Yan, Sayak Mukherjee, Tom Viering, M.T.J. Spaan

Learning curves represent the relationship between the amount of training data and the error rate in machine learning. An important use case for learning curves is extrapolating them in order to predict how much data is needed to achieve a certain performance. One way to do such ...

How Noisy Is Too Noisy?

Robust Extrapolation of Learning Curves with LC-PFN

Bachelor thesis (2025) - R.M. Gherasa, C. Yan, S. Mukherjee, T.J. Viering, M.T.J. Spaan

Accurately predicting a machine learning model’s final performance based on only partial training data can save substantial computational resources and guide early stopping, model selection, and automated machine learning (AutoML) workflows. Learning Curve Prior-Fitted Networks ( ...

Reaching for Resilience: Understanding How Optimizers Affect the Stability Gap in Continual Learning

Bachelor thesis (2025) - C. Obis, G.M. van de Ven, T.J. Viering, A. Hanjalic

In the context of continual learning, recent work has identified a significant and recurring performance drop, followed by a gradual recovery, upon the introduction of a new task. This phenomenon is referred to as the stability gap. Investigating it and the potential solutions ...

Effectiveness of Machine Learning Models in Classifying Learners Based on Learning Curves

Improving Our Understanding of Learning Curves Through the Process of Classification

Bachelor thesis (2025) - S. Basaran, C. Yan, S. Mukherjee, T.J. Viering, M.T.J. Spaan

In machine learning, learning curves are a metric that plots performance versus training set size. They inform decisions about data acquisition, model selection, and hyperparameter tuning. Despite their importance, recent research suggests that our understanding of learning curve ...

The Effect of Domain Shift on Learning Curve Extrapolation

Bachelor thesis (2025) - M. Soeters, T.J. Viering, C. Yan, S. Mukherjee, M.T.J. Spaan

Domain shift is when the distribution of data differs between the training of a model and its testing. This can happen when the conditions of training are slightly different from the conditions that will happen when a model is tested or used. This is a problem for generalizabilit ...

Mind the Gap: Layerwise Proximal Replay for Stable Continual Learning

Bachelor thesis (2025) - O.S.E. Hage, T.J. Viering, G.M. van de Ven, A. Hanjalic

Continual learning aims to train models that can incrementally acquire new knowledge over a sequence of tasks while retaining previously learned information, even in the absence of access to past data. A key challenge in this setting is maintaining stability at task transitions, ...

I Fought the Low

Decreasing Stability Gap with Neuronal Decay

Bachelor thesis (2025) - K. Zhankov, G.M. van de Ven, T.J. Viering, A. Hanjalic

Task-based continual learning setups suffer from temporary dips in performance shortly after switching to new tasks, a phenomenon referred to as the stability gap. State-of-the-art methods that considerably mitigate catastrophic forgetting do not necessarily decrease the stability ga ...

Stability Gap in Continual Learning: The Role of Learning Rate

Bachelor thesis (2025) - P.K. Sobocińska, T.J. Viering, G.M. van de Ven, A. Hanjalic

Continual learning aims to enable neural networks to acquire new knowledge sequentially without forgetting what they have already learned. While many strategies have been developed to address catastrophic forgetting, a subtler challenge known as the stability gap—a temporary drop ...

Revisiting SVM Training

Optimizing SVM Hyperparameter tuning using early stopping in the SMO algorithm

Master thesis (2025) - I. Dekker, M.J.T. Reinders, T.J. Viering, O.T. Turan, I. Barcelos Carneiro M Da R

Support Vector Machines (SVMs) are widely used in various domains, with their performance heavily dependent on hyperparameter selection. However, hyperparameter tuning is computationally demanding due to the SVM training complexity, which is at best $O(n^2)$, where $n$ represents ...

How does scaling a learning curve influence the curve fitting process?

Bachelor thesis (2025) - C. van den Oudenhoven, O.T. Turan, C. Yan, T.J. Viering, A. van Deursen

Learning curves show the learning rate of a classifier by plotting the dataset size used to train the classifier versus the error rate. By extrapolating these curves it is possible to predict how well the classifier will perform when trained on dataset sizes that are currently ...

What is the effect of Gaussian filtering applied before curve fitting?

Bachelor thesis (2025) - Ionut-Liviu Moanta, T.J. Viering, O.T. Turan, C. Yan, A. van Deursen

Learning curves are graphical representations of the relationship between dataset size and error rate in machine learning. Curve fitting is the process of estimating a learning curve using a mathematical formula. This paper analyzes two ways of performing curve fitting: interpola ...

Starting Right: Exploring the impact of random distribution sampling on initial Parameter selection for curve fitting

Bachelor thesis (2025) - D. Darie, O.T. Turan, T.J. Viering, C. Yan, A. van Deursen

Learning curves are used to evaluate the performance of a machine learning (ML) model with respect to the amount of data used when training. Curve fitting finds the unknown optimal coefficients by minimizing the error prediction for a learning curve. This research analyzed ...

Malware Evolution

Unraveling Malware Genomics: Synergistic Approach using Deep Learning and Phylogenetic Analysis for Evolutionary Insights

Master thesis (2024) - A. Amalan, G. Smaragdakis, T.J. Viering, H.J. Griffioen

The rapid advancement of artificial intelligence technologies has significantly increased the complexity of polymorphic and metamorphic malware, presenting new challenges to cybersecurity defenses. Our study introduces a novel bioinformatics-inspired approach, leveraging deep learning and phylogenetic analysis to understand the evolutionary dynamics of such malware. Working with a dataset of 103,883 malware samples, we transformed features extracted through pseudo-static, dynamic, and image analyses into embeddings with deep learning techniques, combining them into what we refer to as the "genome" of malware. These combined embeddings were used to construct phylogenetic trees employing the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) and the Neighbor-Joining (NJ) method. We were the first to utilize OpenAI's state-of-the-art embeddings for converting pseudo-static and dynamic features into embeddings. In addition, we discovered that transfer learning with ResNet-50 is highly effective compared to traditional CNNs, producing image embeddings that outperform others in terms of classification accuracy.

We also introduced new validation techniques for phylogenetic trees, making use of VirusTotal timestamps and embedding drift analysis. These methods confirmed that the NJ method was more accurate. Furthermore, we developed techniques to simplify the analysis of these extensive phylogenetic trees, enabling efficient derivation of relationships within and between malware families. The insights from our NJ-built phylogenetic trees closely align with public data and lay a foundation for generating evolutionary-informed signatures that enhance tailored detection strategies. Our method significantly expedites the identification of connections among 538 malware families, reducing the timeframe from months or years to just weeks, which is much faster than traditional reverse engineering approaches for tracing malware evolution.
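
As a rough illustration of the tree-building step described in this abstract, the sketch below runs UPGMA (average-linkage agglomeration) on a handful of toy vectors. It is not the thesis implementation: the function names `euclidean` and `upgma`, the use of Euclidean distance, the nested-tuple tree output, and the four 2-D "embeddings" are all assumptions made for the example. The NJ method, which the abstract reports as more accurate, is not shown.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def upgma(labels, vectors):
    """Build a tree by UPGMA and return it as nested tuples of labels.

    Hypothetical helper for illustration; branch lengths are not tracked.
    """
    # Live clusters: id -> (subtree, number of leaves in it).
    clusters = {i: (labels[i], 1) for i in range(len(labels))}
    # Pairwise distances between live clusters, keyed by unordered id pair.
    dist = {
        frozenset((i, j)): euclidean(vectors[i], vectors[j])
        for i, j in combinations(range(len(labels)), 2)
    }
    next_id = len(labels)
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        pair = min(dist, key=dist.get)
        i, j = sorted(pair)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances (the UPGMA rule).
        for k in clusters:
            d_ik = dist.pop(frozenset((i, k)))
            d_jk = dist.pop(frozenset((j, k)))
            dist[frozenset((next_id, k))] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        del dist[pair]
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Toy stand-ins for per-sample embeddings (made-up values).
toy_labels = ["A", "B", "C", "D"]
toy_vectors = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.5)]
tree = upgma(toy_labels, toy_vectors)
print(tree)
```

On these toy points the two nearby pairs are grouped first and then joined at the root. On real embeddings, the choice of distance metric would likely matter a great deal.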

Deciphering Learning Curve Characteristics via K-Means Clustering of Curve Model Parameters

Bachelor thesis (2024) - E.A. Ozgur, O.T. Turan, T.J. Viering, H.S. Hung

Learning curves illustrate the relationship between the performance of learning algorithms and the increasing volume of training data [1, 2, 3]. While the concept of learning curves is well-established, clustering these curves based on fitting parameters remains an underexplored ...

Prevalence of non-monotonicity in learning curves

Bachelor thesis (2024) - D. Gafton, T.J. Viering, O.T. Turan, H.S. Hung

Learning curves are useful to determine the amount of data needed for a certain performance. The conventional belief is that increasing the amount of data improves performance. However, recent work challenges this assumption, and shows nonmonotonic behaviors of certain learners o ...

Learning Curve Extrapolation using Machine Learning

Benefits and Limitations of using LCPFN for Learning Curve Extrapolation

Bachelor thesis (2024) - P. Johari, T.J. Viering, O.T. Turan, H.S. Hung

This study explores the extrapolation of learning curves, a crucial aspect in evaluating learner performance with varying dataset sample sizes. We use the Learning Curve Prior Fitted Network (LC-PFN), a transformer pre-trained on synthetic data with proficiency in approximate Bay ...

Learning Curves

How do Data Imbalances affect the Learning Curves using Nearest Mean Model?

Bachelor thesis (2024) - J.J. Feng, T.J. Viering, O.T. Turan

This research investigates the impact of data imbalances on the learning curve using the nearest mean model. Learning curves are useful to represent the performance of the model as the training size increases. Imbalanced datasets are often encountered in real-life scenarios and p ...