M.T.J. Spaan

56 records found

Efficient exploration is a major issue in reinforcement learning, particularly in environments with sparse rewards. In these environments, traditional methods like ε-greedy fail to efficiently reach an optimal policy. A new method proposed by Fortunato et al. s ...
We present a large-scale empirical study of Bootstrapped DQN (BDQN) and Randomized-Prior BDQN (RP-BDQN) in the DeepSea environment, aimed at characterizing their scaling properties. Our primary contribution is a unified scaling law that accurately models the probability of reward ...
Learning curve extrapolation helps practitioners predict model performance at larger data scales, enabling better planning for data collection and computational resource allocation. This paper investigates when neural networks outperform parametric models for this task. We conduc ...
Deep Reinforcement Learning has achieved superhuman performance in many tasks, such as robotic control or autonomous driving. Algorithms in Deep Reinforcement Learning still suffer from a sample efficiency problem, where, in many cases, millions of samples are needed to achieve g ...

How Noisy Is Too Noisy?

Robust Extrapolation of Learning Curves with LC-PFN

Accurately predicting a machine learning model’s final performance based on only partial training data can save substantial computational resources and guide early stopping, model selection, and automated machine learning (AutoML) workflows. Learning Curve Prior-Fitted Networks ( ...
Learning curves represent the relationship between the amount of training data and the error rate in machine learning. An important use case for learning curves is extrapolating them in order to predict how much data is needed to achieve a certain performance. One way to do such ...

Effectiveness of Machine Learning Models in Classifying Learners Based on Learning Curves

Improving Our Understanding of Learning Curves Through the Process of Classification

In machine learning, learning curves plot performance against training set size. They inform decisions about data acquisition, model selection, and hyperparameter tuning. Despite their importance, recent research suggests that our understanding of learning curve ...
Domain shift occurs when the data distribution differs between a model's training and its testing. This can happen when the training conditions differ slightly from the conditions under which the model is later tested or used. This is a problem for generalizabilit ...
With the introduction of autonomous vehicles on public roads, their performance in emergency situations has become a strong focus. Collision Imminent Control (CIC) concerns the planning and control of aggressive evasive maneuvers for collision avoidance of automated vehicles. CIC ...
Reusable tools for engineering software languages can bridge the gap between formal specification and implementation, lowering the bar for engineers to design and implement programming languages. Such tools include NaBL2 and its successor Statix, which are meta-languages for ...
Algebraic effects and handlers have become a popular abstraction for effectful computation, with implementations even in mainstream programming languages, such as OCaml. The operations of an algebraic effect define the syntax of the effect, while handlers define the semantics. Th ...
Intrusion detection systems (IDSs) are essential for protecting computer systems and networks from malicious attacks. However, IDSs face challenges in dealing with dynamic and imbalanced data, as well as limited label availability. In this thesis, we propose a novel elastic gradi ...

VoBERT: Unstable Log Sequence Anomaly Detection

Introducing Vocabulary-Free BERT

With the ever-increasing digitalisation of society and the explosion of internet-enabled devices with the Internet of Things (IoT), keeping services and devices secure is becoming more important. Logs play a critical role in sustaining system reliability. Manual analysis of logs ...
How convenient would it be to have an AI that relieves us programmers from the burden of coding? Program synthesis is a technique that achieves exactly that: it automatically generates simple programs that meet a given set of examples or adhere to a provided specification. This i ...
Phylogenetic networks are a specific type of directed acyclic graph (DAG), used to depict evolutionary relationships among, for example, species or other groups of organisms. To solve computationally hard problems, treewidth has been used to parametrize algorithms in phylogenetic ...
The Statix meta-language has been developed in order to simplify the definition of static semantics in programming languages. A high-level static semantics definition of a language in Statix can be used to generate a type-checker, hence abstracting over the shared implementation ...
Sequential decision-making problems are problems where the goal is to find a sequence of actions that complete a task in an environment. A particularly difficult type of sequential decision-making problem to solve is one in which the environment has sparse rewards, a large state ...
Reinforcement learning (RL) has grown tremendously over the past decade and a half and is increasingly appearing in real-life applications. However, the application of RL is still limited by its low training efficiency and excessive training cost. The sampling and computation ...
Dependently typed languages such as Agda can provide users with certain guarantees about the correctness of the code that they write; however, this comes at the cost of excess code that is not used at run time. Agda code is currently compiled to another language before it is run, th ...
Agda is a functional programming language with built-in support for dependent types. A dependent type depends on a value. This allows the developer to specify strict constraints for the types used in an application. Writing code with dependent types results in fewer type-related ...