
Thomas Abeel

23 records found

Reliable estimation of intra-host viral diversity is essential for understanding viral evolution,
treatment resistance, and outbreak dynamics. However, technical artefacts introduced during
sample preparation and sequencing can distort variant frequencies and lead to inco ...
This paper investigates how the local optimization method and strategy affect the efficiency of genetic algorithms (GAs) for Lennard-Jones (LJ) clusters. Several ASE-implemented optimizers were considered; however, only BFGS, FIRE, and Conjugate Gradient (CG) proved viable for in ...
Finding the lowest-energy structure of a cluster of atoms is an NP-Hard problem with applications in materials science. Genetic Algorithms (GAs) have shown promise in solving this problem due to their ability to explore complex energy landscapes. A critical component of GAs is t ...
Global Geometry (or Cluster) optimization is the process of finding the most stable formations of a cluster of atoms. A genetic algorithm was developed to efficiently find the global minimum of a cluster using the Lennard-Jones atom interaction model. Determining the optimal ...
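The three abstracts above all minimize cluster energy under the Lennard-Jones interaction model. As an illustration of the objective such a genetic algorithm evaluates, here is a minimal sketch of the pairwise LJ energy in reduced units (the function name and the dimer example are my own, not from the papers):

```python
import itertools
import math

def lj_energy(positions, epsilon=1.0, sigma=1.0):
    """Total Lennard-Jones energy of a cluster: sum of 4*eps*((s/r)^12 - (s/r)^6)
    over all atom pairs."""
    total = 0.0
    for p, q in itertools.combinations(positions, 2):
        r = math.dist(p, q)
        sr6 = (sigma / r) ** 6
        total += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return total

# Two atoms at the LJ equilibrium distance 2^(1/6)*sigma have energy -epsilon
dimer = [(0.0, 0.0, 0.0), (2 ** (1 / 6), 0.0, 0.0)]
print(lj_energy(dimer))  # -1.0 (up to floating-point rounding)
```

A GA would call such an energy function as its fitness, typically after relaxing each candidate with a local optimizer (the abstracts mention ASE's BFGS, FIRE, and CG).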

Genetic Algorithms for Solving the Global Geometry Optimization Problem

Evaluating Initialization and Crossover Strategies for Lennard-Jones Cluster Optimization

Discovery of new materials is essential in many different fields, such as space exploration and the maritime industry. To stop new materials from undergoing spontaneous reactions or reacting with the environment, they have to be stable or at least metastable. That is where Globa ...

Improving research data reusability through data conversations

Bridging gaps in metadata supply and demand

Efficient and inclusive data reuse across research disciplines depends on high-quality metadata that bridges the gap between data producers and consumers. This gap, referred to as the metadata gap, arises when the metadata provided by producers do not meet the needs of consumers ...
Decision-Focused Learning (DFL) addresses a setting where a system receives some features as input and must predict the coefficients of a downstream optimization problem. Classically, one would apply a two-stage solution, which trains the predictor as a regression task and only uses ...
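To make the contrast between prediction error and decision quality concrete, here is a toy sketch. The top-k selection problem and all names are my own illustration, not the paper's setup; it only shows why a low-MSE predictor can still induce high decision regret:

```python
def top_k_selection(values, k):
    """Downstream optimization: pick the k items with the highest value."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]

def regret(predicted, true, k):
    """Objective lost by optimizing over predicted instead of true coefficients."""
    chosen = top_k_selection(predicted, k)
    best = top_k_selection(true, k)
    return sum(true[i] for i in best) - sum(true[i] for i in chosen)

true_vals = [1.0, 1.1, 0.0]
pred_a = [1.05, 1.05, 0.0]  # tiny squared error, but picks the wrong item
pred_b = [0.5, 2.0, 0.0]    # large squared error, yet the decision is correct
print(regret(pred_a, true_vals, 1))  # positive regret
print(regret(pred_b, true_vals, 1))  # 0.0
```

DFL methods train the predictor against this decision regret (or a surrogate of it) rather than against regression loss alone.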

Sparse Transformers are (in)Efficient Learners

Comparing Sparse Feedforward Layers in Small Transformers

Although transformers are state-of-the-art models for natural language tasks, obtaining reasonable performance still often requires large transformers which are expensive to train and deploy. Fortunately, there are techniques to increase the size of transformers without extra com ...
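As a rough illustration of the kind of sparse feedforward layer the abstract refers to, here is a toy Mixture-of-Experts-style routing sketch in pure Python. The expert and router stand-ins are illustrative assumptions, not the paper's architecture:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def sparse_ffn(token, experts, router, k=2):
    """Route a token vector through only the top-k expert FFNs.

    Capacity grows with the number of experts, but compute per token stays
    proportional to k -- the sparsity the abstract alludes to.
    """
    scores = router(token)
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    gate = softmax([scores[i] for i in top])
    out = [0.0] * len(token)
    for g, i in zip(gate, top):
        y = experts[i](token)  # only k experts run; the rest cost nothing
        out = [o + g * yi for o, yi in zip(out, y)]
    return out

# Stand-in experts and router (learned networks in a real model)
experts = [lambda t, s=s: [s * x for x in t] for s in (1.0, 2.0, 3.0, 4.0)]
router = lambda t: [0.1, 0.9, 0.2, 0.05]
print(sparse_ffn([1.0, 1.0], experts, router, k=2))
```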

Tokenization Matters: Training your Tokenizer Right

Testing the Impact of Tokenization on Language Modelling with (Small) Transformers

Large language models (LLMs) are rapidly increasing in parameter count, but this growth is not matched by the availability of high-quality data. This discrepancy raises concerns about the sustainability of current approaches to language model improvement, especially as forecasts ...
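Since the abstract concerns tokenizer training, a tiny byte-pair-encoding (BPE) merge-learning sketch may help. This is a generic toy BPE over a word-frequency dictionary, not the paper's tokenizer:

```python
from collections import Counter

def bpe_merges(word_freqs, num_merges):
    """Learn BPE merges: repeatedly fuse the most frequent adjacent symbol pair."""
    vocab = {tuple(w): c for w, c in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, count in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += count
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged = best[0] + best[1]
        new_vocab = {}
        for symbols, count in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] = count
        vocab = new_vocab
    return merges

print(bpe_merges({"low": 5, "lower": 2, "lowest": 3}, 2))
```

Real tokenizers (e.g. SentencePiece or Hugging Face tokenizers) add byte fallback, normalization, and special tokens on top of this core loop.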

Pushing the Limits of the Compressive Memory Introduced in Infini-Attention

Architectural Decisions for Language Modelling with (Small) Transformers

Transformers are a type of neural network architecture used in natural language processing. They excel in tasks such as translation, text generation, and language modeling by capturing long-range dependencies. Increasing input sequence length enhances performance but at a h ...
RNA viruses, characterized by high replication rates and the absence of proofreading mechanisms,
are susceptible to errors during replication. This characteristic allows them to form diverse
communities of genome mutants known as "viral quasispecies". Each individual geno ...

Evaluating Adaptive Activation Functions in Language Models

Does the choice of activation function matter in smaller Language Models?

The rapid expansion of large language models (LLMs) driven by the transformer architecture has raised concerns about the lack of high-quality training data. This study investigates the role of activation functions in smaller-scale language models, specifically those with app ...
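For reference, here are minimal definitions of three activations commonly compared in this setting. The selection is my own illustration; the abstracts' "adaptive" activations are typically ones with a trainable parameter, such as the beta in Swish below:

```python
import math

def relu(x):
    return max(0.0, x)

def gelu(x):
    # tanh approximation of GELU, common in transformer implementations
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

def swish(x, beta=1.0):
    # Swish/SiLU; making beta a trainable parameter gives an adaptive activation
    return x / (1.0 + math.exp(-beta * x))
```

All three agree at 0 and approach the identity for large positive inputs; they differ in smoothness and in how much negative signal they pass through.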
Graph neural networks (GNNs), while effective at various tasks on complex graph-structured data, lack interpretability. Post-hoc explainability techniques, developed to overcome this inherent uninterpretability, have been applied to the additional task of d ...
Predicting properties, such as toxicity or water solubility of unknown molecules with Graph Neural Networks has applications in drug research. Because of the ethical concerns associated with using artificial intelligence techniques in the medical field, explainable artificial int ...
AI explainers are tools capable of approximating how a neural network arrived at a given prediction by providing parts of the input data most relevant for the model’s choice. These tools have become a major point of research due to a need for human-verifiable predictions in m ...

Exploring Speed/Quality Trade-offs in Dimensionality of Attention Mechanism

Optimization with Grouped Query Attention and Diverse Key-Query-Value Dimensionalities

The advent of transformer architectures revolutionized natural language processing, particularly with the popularity of decoder-only transformers for text generation tasks like GPT models. However, the autoregressive nature of these models challenges their inference speed, crucia ...
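As a minimal illustration of the grouped-query-attention idea named in the title, here is a toy single-position sketch in pure Python. All shapes and names are illustrative assumptions; the point is only that consecutive query heads share one KV head, shrinking the KV cache by the ratio of query heads to KV heads:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def grouped_query_attention(queries, keys, values, num_kv_heads):
    """Toy GQA for one token: `queries` holds one vector per query head;
    `keys`/`values` hold one list of position vectors per KV head."""
    num_q_heads = len(queries)
    group = num_q_heads // num_kv_heads  # query heads per shared KV head
    d = len(queries[0])
    out = []
    for h, q in enumerate(queries):
        kv = h // group  # which KV head this query head reads from
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys[kv]]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values[kv]))
                    for j in range(d)])
    return out

# 4 query heads sharing 2 KV heads over 2 cached positions
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
keys = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
values = keys
print(grouped_query_attention(queries, keys, values, num_kv_heads=2))
```

Multi-head attention is the special case num_kv_heads == num_q_heads; multi-query attention is num_kv_heads == 1.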
The evaluation metrics commonly used for machine learning models often fail to adequately reveal the inner workings of the models, which is particularly necessary in critical fields like healthcare. Explainable AI techniques, such as counterfactual explanations, offer a way to ...

Interactive semantic segmentation of 3D medical images

Comparative analysis of discrete and gradient descent based batch query retrieval methods in active learning

Accurate segmentation of anatomical structures and abnormalities in medical images is crucial, but manual segmentation is time-consuming and automated approaches lack clinical accuracy. In recent years, active learning approaches that aim to combine automatic segmentation with ma ...
Segmentation of 3D medical images is useful for various medical tasks. However, fully automated segmentation lacks the accuracy required for medical purposes while manual segmentation is too time-consuming. Therefore, an active learning method can be used to generate an accurate ...
Although automated segmentation of 3D medical images produces near-ideal results, it encounters limitations and occasional errors, necessitating manual intervention for error correction. Recent studies introduce an active learning pipeline as an efficient solution for this, requi ...
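As a minimal illustration of a discrete batch query strategy of the kind compared above, here is an entropy-based uncertainty-sampling sketch. The ranking rule and names are generic illustrations, not the specific methods evaluated in the theses:

```python
import math

def entropy(p):
    """Shannon entropy of a predicted class distribution (higher = less certain)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def select_batch(probabilities, batch_size):
    """Discrete batch query: ask the annotator about the samples (e.g. voxels or
    slices) the model is least certain about."""
    scores = [entropy(p) for p in probabilities]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]

# Three samples with predicted foreground/background probabilities
probs = [[0.5, 0.5], [0.9, 0.1], [0.6, 0.4]]
print(select_batch(probs, 2))  # [0, 2] -- the two most uncertain samples
```

Gradient-descent-based alternatives instead optimize a continuous query directly; this discrete ranking is the baseline they are compared against.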