
A. van Deursen

72 records found

Comparing the hint quality of a Small Language Model and a Large Language Model in automatic hint generation

Replacing the LLM inside the JetBrains Academy AI hint generation system with a RAG-augmented SLM

The rapid advancement of Large Language Models (LLMs) in recent years is not without drawbacks, such as a lack of privacy, environmental impact, and financial cost. It might therefore be beneficial to use Small Language Models (SLMs) instead, which are more accessible to be ru ...
This thesis presents the first known implementation of a model checker for the Java memory model JAM21 within the GenMC framework - a tool for stateless model checking using custom memory models. In addition to the baseline GenMC implementation, we introduce a more efficient mode ...

SMURF: a Methodology for Energy Profiling Software Systems

Simulate and Measure to Understand Resource Footprints

Understanding the energy profile of a complex, multi-faceted software system is difficult. In this thesis, we present SMURF, a novel five-step methodology that gives insight into the energy consumption of a complex system. The methodology is broadly applica ...
Learning curves plot the performance of a machine learning model against the size of the dataset used for training. Curve fitting is a process that attempts to optimize algorithm parameters by minimizing the error in its loss function, thereby achieving the best possible fit to t ...
Learning curves show the learning rate of a classifier by plotting the dataset size used to train the classifier versus the error rate. By extrapolating these curves it is possible to predict how well the classifier will perform when trained on dataset sizes that are currently ...
The rapid advancement of neural network applications, including multilayer perceptrons (MLP) and deep convolutional neural networks (CNN), has revolutionized domains such as image recognition, speech processing, and classification. However, the increasing depth and complexity of ...
Learning curves are used to evaluate the performance of a machine learning (ML) model with respect to the amount of data used when training. Curve fitting finds the unknown optimal coefficients by minimizing the prediction error for a learning curve. This research analyzed ...
Learning curves are graphical representations of the relationship between dataset size and error rate in machine learning. Curve fitting is the process of estimating a learning curve using a mathematical formula. This paper analyzes two ways of performing curve fitting: interpola ...
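As a concrete illustration of the curve fitting and extrapolation these abstracts describe, the sketch below fits a power-law learning curve to synthetic anchor points; the functional form, sizes, and coefficients are illustrative assumptions, not taken from any of the theses above.

```python
import numpy as np

# Synthetic error rates measured at small training-set sizes
# (illustrative data generated from err(n) = a * n**(-b), a=2.0, b=0.5).
sizes = np.array([100, 200, 400, 800, 1600], dtype=float)
errors = 2.0 * sizes ** -0.5

# A power law is linear in log-log space: log(err) = log(a) - b*log(n),
# so an ordinary least-squares line fit recovers the coefficients.
slope, intercept = np.polyfit(np.log(sizes), np.log(errors), 1)
a, b = np.exp(intercept), -slope

# Extrapolation: predict the error rate at a dataset size
# larger than any the model has been trained on so far.
predicted_error = a * 6400.0 ** -b
```

Interpolation would instead evaluate the fitted curve between the measured sizes; the fitting step is identical.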
Fuzzing has been a popular approach in the domain of software testing due to its efficiency and capability to uncover unexpected bugs. Fuzz testing was originally developed in the days of sequential programs. With the rise of multi-core devices and increasing demand for computat ...
Conflict-free replicated data types (CRDTs) offer high-availability low-latency updates to data without the need for central coordination. Despite the current vast collection of CRDTs, little work has been done on maintaining probabilistic membership information using CRDTs. In th ...
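As a hedged sketch of the general idea (not the construction from the thesis above): a Bloom filter becomes a state-based CRDT when its merge is bitwise OR, which is commutative, associative, and idempotent, exactly the properties a state-based merge needs to converge.

```python
class BloomCRDT:
    """Illustrative state-based CRDT for probabilistic membership:
    a Bloom filter whose state is an m-bit integer and whose merge
    is bitwise OR. Parameters m and k are arbitrary choices here."""

    def __init__(self, m=64, k=3):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, item):
        # k derived bit positions per item (simple in-process hashing).
        for i in range(self.k):
            yield hash((i, item)) % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item):
        # May yield false positives, never false negatives.
        return all((self.bits >> p) & 1 for p in self._positions(item))

    def merge(self, other):
        # Commutative, associative, idempotent: safe to apply in any
        # order, any number of times, on any replica.
        self.bits |= other.bits
```

Merging two replicas that each added different items yields a state that contains both, without coordination.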
A reliable dependency resolution process should minimize dependency-related issues. We identify transparency, stability, and flexibility as the three core properties that define a reliable resolution process and discuss how different dependency declaration strategies affect them. ...
The rise of graph processing has led to an increase in the usage of graph databases and the availability of various frameworks. Graph databases have become more accessible and, in specific instances, can compete with relational databases. Testing an application with a relational ...

Beyond Acceptance Rates: The Impact of JetBrains AI Assistant and FLCC

Analysis of the behavior of users assisted by LLMs in 13 JetBrains IDEs

LLM (Large Language Model) powered AI (Artificial Intelligence) assistants are popular tools among programmers, but what impact do they have? In this thesis we investigate two such tools designed by JetBrains: AI Assistant and FLCC (Full Line Code Completion).
We collecte ...
Research on open-source software evolution has gained popularity in the last decade, focusing on theoretical determining factors. Further work modeled growth patterns using time-series techniques on small projects and metric samples, or on non-openly available larger dat ...
This thesis presents a methodology for the formal verification of memory organizations in System-on-Chip (SoC) designs described in IP-XACT. The approach involves modeling the address map structures of the design's IP-XACT description and its spreadsheet-based global address map ...
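As a sketch of one property such a verification flow might check (the tuple format and names here are assumptions, not the thesis's IP-XACT model): that no two regions in an address map overlap.

```python
def overlapping_regions(regions):
    """regions: iterable of (name, base_address, size) tuples.
    Returns pairs of region names whose address ranges clash
    (one witness pair per detected overlap)."""
    spans = sorted((base, base + size, name) for name, base, size in regions)
    clashes = []
    prev_name, prev_end = None, -1
    for start, end, name in spans:
        # A region overlaps if it starts before the furthest end seen.
        if start < prev_end:
            clashes.append((prev_name, name))
        if end > prev_end:
            prev_name, prev_end = name, end
    return clashes
```

A formal flow would discharge this as a proof obligation rather than a runtime check, but the property itself is the same.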

Exploring the Generation and Detection of Weaknesses in LLM Generated Code

LLMs cannot be trusted to produce secure code, but they can detect insecure code

Large Language Models (LLMs) have gained a lot of popularity for code generation in recent years. Developers might use LLM-generated code in projects where the security of software matters. A relevant question is therefore: what is the prevalence of code weaknesses in LLM-generat ...

Exploring Speed/Quality Trade-offs in Dimensionality of Attention Mechanism

Optimization with Grouped Query Attention and Diverse Key-Query-Value Dimensionalities

The advent of transformer architectures revolutionized natural language processing, particularly with the popularity of decoder-only transformers for text generation tasks like GPT models. However, the autoregressive nature of these models challenges their inference speed, crucia ...
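To make the dimensionality trade-off concrete, here is a minimal sketch of grouped-query attention (GQA), where several query heads share one key/value head; all shapes, head counts, and names are illustrative assumptions, not the configurations studied in the thesis above.

```python
import numpy as np

rng = np.random.default_rng(0)

seq, d_head = 4, 8
n_q_heads, n_kv_heads = 4, 2          # fewer KV heads than query heads
group = n_q_heads // n_kv_heads       # query heads per shared KV head

Q = rng.standard_normal((n_q_heads, seq, d_head))
K = rng.standard_normal((n_kv_heads, seq, d_head))
V = rng.standard_normal((n_kv_heads, seq, d_head))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

outputs = []
for h in range(n_q_heads):
    kv = h // group                   # map each query head to its KV head
    scores = Q[h] @ K[kv].T / np.sqrt(d_head)
    outputs.append(softmax(scores) @ V[kv])
out = np.stack(outputs)               # (n_q_heads, seq, d_head)
```

Shrinking `n_kv_heads` shrinks the KV cache that autoregressive decoding must keep and re-read, which is where the speed/quality trade-off arises.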

Sustainability of Edge AI at Scale

An empirical study on the sustainability of Edge AI in terms of energy consumption

Edge AI is an architectural deployment tactic that brings AI models closer to the user and data, reducing internet bandwidth usage and providing low latency and privacy. It remains unclear how this tactic performs at scale, since the distribution overhead could impact the total ...

Measuring up to Stability

Guidelines towards accurate energy consumption measurement results of Rust benchmarks

In Sustainable Software Engineering there is a need for tooling and guidelines for developers. In this research we aim to provide such guidelines. We find that, for our experimental setup and set of benchmarks, 500 samples give results that are likely stable at a 1% threshold in t ...
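As an illustrative sketch of that kind of stability check (the criterion below, comparing running means against the final mean, is an assumption for illustration, not the thesis's exact procedure):

```python
import random

random.seed(0)
# Hypothetical per-iteration energy samples (joules) for one benchmark.
samples = [random.gauss(50.0, 2.0) for _ in range(500)]

def first_stable_count(measurements, threshold=0.01):
    """Smallest sample count n such that every running mean from n
    onward stays within `threshold` (relative) of the final mean."""
    total, means = 0.0, []
    for i, s in enumerate(measurements, start=1):
        total += s
        means.append(total / i)
    final = means[-1]
    stable_from = len(measurements)
    # Scan backwards: stop at the first running mean outside the band.
    for n in range(len(measurements), 0, -1):
        if abs(means[n - 1] - final) / abs(final) <= threshold:
            stable_from = n
        else:
            break
    return stable_from

needed = first_stable_count(samples)
```

Comparing `needed` across benchmarks and thresholds is one way to turn raw measurements into a sample-count guideline.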
In large-scale ML, data size becomes a critical variable, especially in the context of large companies, where models already exist and are hard to change and fine-tune. Time to market and model quality are essential metrics, thus looking for ways to select, prune and augment the ...