R. Babuska | TU Delft Repository

SmartAlert

Machine learning-based patient-ventilator asynchrony detection system in intensive care units

Journal article (2025) - Jaroslav Pažout, Milan Němý, Jakub Mikeš, Jan Jirman, Jan Kubr, Eliška Niebauerová, Miroslav Macík, Robert Babuška, František Duška, More authors...

Background and Objective: Patient-ventilator asynchronies (PVA) are associated with ventilator-induced lung injury and increased mortality. Current detection methods rely on static thresholds, extensive preprocessing, or proprietary ventilator data. This study aimed to develop and validate a fully online, real-time system that detects and classifies PVAs directly from ventilator screen data while alerting clinicians based on severity. Methods: The SmartAlert system was developed using ventilator screen recordings from ICU patients. It extracts pressure and flow waveforms from video recordings, converts them into time-series data, and employs deep neural networks to classify asynchronies and assign alarm levels from no urgency to most urgent. A dataset of 381,280 double-breath units was independently annotated by two expert intensivists. Two deep learning models were trained: one for alarm prediction and another for asynchrony classification (ineffective triggering, double cycling, high inspiratory effort, no asynchrony). Performance was evaluated using accuracy, sensitivity, specificity, and AUC-ROC, compared to expert consensus. Results: SmartAlert demonstrated strong performance for alarm level prediction (overall accuracy: 83.8 %, weighted AUC-ROC: 0.943 [95 % CI: 0.941–0.945]) and PVA classification (weighted accuracy: 89.3 %, weighted AUC-ROC: 0.951 [95 % CI: 0.950–0.953]). It showed high specificity for urgent alarms (99.9 % for level 3) and PVA types (98.5 % for ineffective triggering, 96.9 % for double cycling, 94.8 % for high inspiratory effort). Conclusions: We developed and internally validated SmartAlert, an automated system that detects PVAs, classifies severity, and alerts clinicians in real time. Its potential to reduce alarm fatigue, optimize ventilator settings, and improve patient outcomes remains to be tested in clinical trials. ...
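The abstract above reports per-class sensitivity and specificity. As a minimal sketch of how such one-vs-rest rates are commonly derived from a multi-class confusion matrix, consider the following. The function name `one_vs_rest_rates` and the toy counts are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def one_vs_rest_rates(confusion: np.ndarray) -> dict:
    """Per-class sensitivity and specificity from a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    total = confusion.sum()
    rates = {}
    for k in range(confusion.shape[0]):
        tp = confusion[k, k]                 # class k predicted as k
        fn = confusion[k, :].sum() - tp      # class k predicted as something else
        fp = confusion[:, k].sum() - tp      # other classes predicted as k
        tn = total - tp - fn - fp            # everything else
        rates[k] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }
    return rates

# Made-up counts for three hypothetical classes (illustration only).
toy = np.array([
    [90, 5, 5],
    [10, 80, 10],
    [0, 10, 90],
])
rates = one_vs_rest_rates(toy)
accuracy = np.trace(toy) / toy.sum()
```

A high specificity for a class means few breaths of other classes are flagged as that class, which is the property most relevant to limiting false alarms.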

Neuro-Evolutionary Approach to Physics-Aware Symbolic Regression

Conference paper (2025) - Jiri Kubalik, Robert Babuska

Symbolic regression is a technique that can automatically derive analytic models from data. Traditionally, symbolic regression has been implemented primarily through genetic programming that evolves populations of candidate solutions sampled by the genetic operators crossover and mutation. More recently, neural networks have been employed to learn the entire analytical model, i.e., its structure and coefficients, using regularized gradient-based optimization. Although this approach tunes the model's coefficients better, it is prone to premature convergence to suboptimal model structures. Here, we propose a neuro-evolutionary symbolic regression method that combines the strengths of evolutionary-based search for optimal neural network (NN) topologies with gradient-based tuning of the network's parameters. Due to the inherent high computational demand of evolutionary algorithms, it is not feasible to learn the parameters of every candidate NN topology to full convergence. Thus, our method employs a memory-based strategy and population perturbations to enhance exploitation and reduce the risk of being trapped in suboptimal NNs. In this way, each NN topology can be trained using only a short sequence of back-propagation iterations. The proposed method was experimentally evaluated on three real-world test problems and has been shown to outperform other NN-based approaches regarding the quality of the models obtained. ...

REX

GPU-Accelerated Sim2Real Framework with Delay and Dynamics Estimation

Journal article (2025) - Bas van der Heijden, Jens Kober, Robert Babuska, Laura Ferranti

Sim2real, the transfer of control policies from simulation to the real world, is crucial for efficiently solving robotic tasks without the risks associated with real-world learning. However, discrepancies between simulated and real environments, especially due to unmodeled dynamics and latencies, significantly impact the performance of these transferred policies. In this paper, we address the challenges of sim2real transfer caused by latency and asynchronous dynamics in real-world robotic systems. Our approach involves developing a novel framework, REX (Robotic Environments with jaX), that uses a graph-based simulation model to incorporate latency effects while optimizing for parallelization on accelerator hardware. Our framework simulates the asynchronous, hierarchical nature of real-world systems, while simultaneously estimating system dynamics and delays from real-world data and implementing delay compensation strategies to minimize the sim2real gap. We validate our approach on two real-world systems, demonstrating its effectiveness in improving sim2real performance by accurately modeling both system dynamics and delays. Our results show that the proposed framework supports both accelerated simulation and real-time processing, making it valuable for robot learning. ...

Scalable Task Planning via Large Language Models and Structured World Representations

Journal article (2025) - Rodrigo Pérez-Dattari, Z. Li, R. Babuska, J. Kober, C. Della Santina

Planning methods often struggle with computational intractability when solving task-level problems in large-scale environments. This work explores how the commonsense knowledge encoded in Large Language Models (LLMs) can be leveraged to enhance planning techniques for such complex scenarios. Specifically, we propose an approach that uses LLMs to efficiently prune irrelevant components from the planning problem's state space, thereby substantially reducing its complexity. We demonstrate the efficacy of our system through extensive experiments in a household simulation environment as well as real-world validation on a 7-DoF manipulator (video: https://youtu.be/6ro2UOtOQS4). ...

Embedded Hierarchical MPC for Autonomous Navigation

Journal article (2025) - Dennis Benders, Johannes Kohler, Thijs Niesten, Robert Babuska, Javier Alonso-Mora, Laura Ferranti

To efficiently deploy robotic systems in society, mobile robots must move autonomously and safely through complex environments. Nonlinear model predictive control (MPC) methods provide a natural way to find a dynamically feasible trajectory through the environment without colliding with nearby obstacles. However, the limited computation power available on typical embedded robotic systems, such as quadrotors, poses a challenge to running MPC in real time, including its most expensive tasks: constraints generation and optimization. To address this problem, we propose a novel hierarchical MPC scheme that consists of a planning and a tracking layer. The planner constructs a trajectory with a long prediction horizon at a slow rate, while the tracker ensures trajectory tracking at a relatively fast rate. We prove that the proposed framework avoids collisions and is recursively feasible. Furthermore, we demonstrate its effectiveness in simulations and lab experiments with a quadrotor that needs to reach a goal position in a complex static environment. The code is efficiently implemented on the quadrotor's embedded computer to ensure real-time feasibility. Compared to a state-of-the-art single-layer MPC formulation, this allows us to increase the planning horizon by a factor of 5, which results in significantly better performance. ...

ILeSiA

Interactive Learning of Robot Situational Awareness From Camera Input

Journal article (2025) - Petr Vanc, Giovanni Franzese, Jan Kristof Behrens, Cosimo Della Santina, Karla Stepanova, Jens Kober, Robert Babuska

Learning from demonstration is a promising approach for teaching robots new skills. However, a central challenge in the execution of acquired skills is the ability to recognize faults and prevent failures. This is essential because demonstrations typically cover only a limited set of scenarios and often only the successful ones. During task execution, unforeseen situations may arise, such as changes in the robot's environment or interaction with human operators. To recognize such situations, this paper focuses on teaching the robot situational awareness by using a camera input and labeling frames as safe or risky. We train a Gaussian Process (GP) regression model fed by a low-dimensional latent space representation of the input images. The model outputs a continuous risk score ranging from zero to one, quantifying the degree of risk at each timestep. This allows for pausing task execution in unsafe situations and directly adding new training data, labeled by the human user. Our experiments on a robotic manipulator show that the proposed method can reliably detect both known and novel faults using only a single example for each new fault. In contrast, a standard multi-layer perceptron (MLP) performs well only on faults it has encountered during training. Our method enables the next generation of cobots to be rapidly deployed with easy-to-set-up, vision-based risk assessment, proactively safeguarding humans and detecting misaligned parts or missing objects before failures occur. ...

Engine Agnostic Graph Environments for Robotics (EAGERx)

A Graph-Based Framework for Sim2real Robot Learning

Journal article (2025) - Bas van der Heijden, Jelle Luijkx, Laura Ferranti, Jens Kober, Robert Babuska

Sim2real, that is, the transfer of learned control policies from simulation to the real world, is an area of growing interest in robotics because of its potential to efficiently handle complex tasks. The sim2real approach faces challenges because of mismatches between simulation and reality. These discrepancies arise from inaccuracies in modeling physical phenomena and asynchronous control, among other factors. To this end, we introduce Engine Agnostic Graph Environments for Robotics (EAGERx), a framework with a unified software pipeline for both real and simulated robot learning. It can support various simulators and aids in integrating state, action, and time scale abstractions to facilitate learning. EAGERx’s integrated delay simulation, domain randomization features, and proposed synchronization algorithm contribute to narrowing the sim2real gap. We demonstrate (in the context of robot learning and beyond) the efficacy of EAGERx in accommodating diverse robotic systems and maintaining consistent simulation behavior. EAGERx is open source, and its code is available at https://eagerx.readthedocs.io ...

A Security Risk Taxonomy for Prompt-Based Interaction With Large Language Models

Review (2024) - Erik Derner, Kristina Batistic, Jan Zahalka, Robert Babuska

As large language models (LLMs) permeate more and more applications, an assessment of their associated security risks becomes increasingly necessary. The potential for exploitation by malicious actors, ranging from disinformation to data breaches and reputation damage, is substantial. This paper addresses a gap in current research by specifically focusing on security risks posed by LLMs within the prompt-based interaction scheme, which extends beyond the widely covered ethical and societal implications. Our work proposes a taxonomy of security risks along the user-model communication pipeline and categorizes the attacks by target and attack type alongside the commonly used confidentiality, integrity, and availability (CIA) triad. The taxonomy is reinforced with specific attack examples to showcase the real-world impact of these risks. Through this taxonomy, we aim to inform the development of robust and secure LLM applications, enhancing their safety and trustworthiness. ...

An Empirical Investigation on Variational Autoencoder-Based Dynamic Modeling of Deformable Objects from RGB Data

Conference paper (2024) - Tomas Coleman, Robert Babuska, Jens Kober, Cosimo Della Santina

Formulating the dynamics of continuously deformable objects and other mechanical systems analytically from first principles is an exceedingly challenging task, often impractical in real-world scenarios. What makes this challenge even harder to solve is that, usually, the object has not been observed previously, and the only information that we can get from it is a stream of RGB camera data. In this study, we explore the use of deep learning techniques to solve this nonlinear identification problem. We specifically focus on extracting dynamic models of simple deformable objects from the high-dimensional sensor input coming from an RGB camera. We investigate a two-stage approach to achieve this goal. First, we train a variational autoencoder to extract an extremely low-dimensional representation of the object configuration. Then, we learn a dynamic model that predicts the evolution of these latent space variables. The proposed architecture can accurately predict the object's state up to one second into the future. ...

Robotic Grasping of Harvested Tomato Trusses Using Vision and Online Learning

Conference paper (2024) - Luuk Van Den Bent, Tomás Coleman, Robert Babuška

Currently, truss tomato weighing and packaging require significant manual work. The main obstacle to automation lies in the difficulty of developing a reliable robotic grasping system for already harvested trusses. We propose a method to grasp trusses that are stacked in a crate with considerable clutter, which is how they are commonly stored and transported after harvest. The method consists of a deep learning-based vision system to first identify the individual trusses in the crate and then determine a suitable grasping location on the stem. To this end, we have introduced a grasp pose ranking algorithm with online learning capabilities. After selecting the most promising grasp pose, the robot executes a pinch grasp without needing touch sensors or geometric models. Lab experiments with a robotic manipulator equipped with an eye-in-hand RGB-D camera showed a 100% clearance rate when tasked to pick all trusses from a pile. 93% of the trusses were successfully grasped on the first try, while the remaining 7% required more attempts. ...

SymFormer

End-to-End Symbolic Regression Using Transformer-Based Architecture

Journal article (2024) - Martin Vastl, Jonas Kulhanek, Jiri Kubalik, Erik Derner, Robert Babuska

Many real-world systems can be naturally described by mathematical formulas. The task of automatically constructing formulas to fit observed data is called symbolic regression. Evolutionary methods such as genetic programming have been commonly used to solve symbolic regression tasks, but they have significant drawbacks, such as high computational complexity. Recently, neural networks have been applied to symbolic regression, among which the transformer-based methods seem to be most promising. After training a transformer on a large number of formulas, the actual inference, i.e., finding a formula for new, unseen data, is very fast (in the order of seconds). This is considerably faster than state-of-the-art evolutionary methods. The main drawback of transformers is that they generate formulas without numerical constants, which have to be optimized separately, yielding suboptimal results. We propose a transformer-based approach called SymFormer, which predicts the formula by outputting the symbols and the constants simultaneously. This helps to generate formulas that fit the data more accurately. In addition, the constants provided by SymFormer serve as a good starting point for subsequent tuning via gradient descent to further improve the model accuracy. We show on several benchmarks that SymFormer outperforms state-of-the-art methods while having faster inference. ...
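The abstract above notes that predicted constants can serve as a starting point for further tuning via gradient descent. A minimal sketch of that refinement step, assuming a fixed formula structure `c0 * sin(c1 * x)` and a nearby initial guess, might look as follows. The formula, the data, the initial values, and the function names are all hypothetical and do not reproduce SymFormer itself.

```python
import numpy as np

def model(c, x):
    # Hypothetical formula structure with two tunable constants.
    return c[0] * np.sin(c[1] * x)

def refine_constants(c_init, x, y, lr=0.05, steps=2000):
    """Plain gradient descent on the mean squared error w.r.t. the constants."""
    c = np.array(c_init, dtype=float)
    for _ in range(steps):
        r = model(c, x) - y  # residuals
        grad = np.array([
            np.mean(2.0 * r * np.sin(c[1] * x)),             # dMSE/dc0
            np.mean(2.0 * r * c[0] * x * np.cos(c[1] * x)),  # dMSE/dc1
        ])
        c -= lr * grad
    return c

# Synthetic, noise-free data generated with constants (2.0, 1.5).
x = np.linspace(-2.0, 2.0, 50)
y = 2.0 * np.sin(1.5 * x)

# Start from a rough guess, standing in for constants predicted by a model.
c_refined = refine_constants([1.8, 1.4], x, y)
```

Because the loss is non-convex in `c1`, a descent of this kind generally depends on the starting point being reasonably close, which is why a good initial estimate of the constants matters.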

Imitrob

Imitation Learning Dataset for Training and Evaluating 6D Object Pose Estimators

Journal article (2023) - Jiri Sedlar, Karla Stepanova, Radoslav Skoviera, Jan K. Behrens, Matus Tuna, Gabriela Sejnova, Josef Sivic, Robert Babuska

This letter introduces a dataset for training and evaluating methods for 6D pose estimation of hand-held tools in task demonstrations captured by a standard RGB camera. Despite the significant progress of 6D pose estimation methods, their performance is usually limited for heavily occluded objects, which is a common case in imitation learning, where the object is typically partially occluded by the manipulating hand. Currently, there is a lack of datasets that would enable the development of robust 6D pose estimation methods for these conditions. To overcome this problem, we collect a new dataset (Imitrob) aimed at 6D pose estimation in imitation learning and other applications where a human holds a tool and performs a task. The dataset contains image sequences of nine different tools and twelve manipulation tasks with two camera viewpoints, four human subjects, and left/right hand. Each image is accompanied by an accurate ground truth measurement of the 6D object pose obtained by the HTC Vive motion tracking device. The use of the dataset is demonstrated by training and evaluating a recent 6D object pose estimation method (DOPE) in various setups. ...

Toward Physically Plausible Data-Driven Models

A Novel Neural Network Approach to Symbolic Regression

Journal article (2023) - Jiri Kubalik, Erik Derner, Robert Babuska

Many real-world systems can be described by mathematical models that are human-comprehensible, easy to analyze and help explain the system's behavior. Symbolic regression is a method that can automatically generate such models from data. Historically, symbolic regression has been predominantly realized by genetic programming, a method that evolves populations of candidate solutions that are subsequently modified by the genetic operators crossover and mutation. However, this approach suffers from several deficiencies: it does not scale well with the number of variables and samples in the training data, models tend to grow in size and complexity without an adequate accuracy gain, and it is hard to fine-tune the model coefficients using just genetic operators. Recently, neural networks have been applied to learn the whole analytic model, i.e., its structure and the coefficients, using gradient-based optimization algorithms. This paper proposes a novel neural network-based symbolic regression method that constructs physically plausible models based on even very small training data sets and prior knowledge about the system. The method employs an adaptive weighting scheme to effectively deal with multiple loss function terms and an epoch-wise learning process to reduce the chance of getting stuck in poor local optima. Furthermore, we propose a parameter-free method for choosing the model with the best interpolation and extrapolation performance out of all the models generated throughout the whole learning process. We experimentally evaluate the approach on four test systems: the TurtleBot 2 mobile robot, the magnetic manipulation system, the equivalent resistance of two resistors in parallel, and the longitudinal force of the anti-lock braking system. The results clearly show the potential of the method to find parsimonious models that comply with the prior knowledge provided. ...
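For the parallel-resistor test system mentioned above, the target formula is the standard textbook expression below. The listed properties are examples of the kind of prior knowledge such a model would be expected to satisfy. They are given here as an illustration and are not quoted from the paper.

```latex
% Equivalent resistance of two resistors in parallel
R_{\mathrm{eq}}(R_1, R_2) = \frac{R_1 R_2}{R_1 + R_2}, \qquad R_1, R_2 > 0

% Properties that follow directly from the formula
R_{\mathrm{eq}}(R_1, R_2) = R_{\mathrm{eq}}(R_2, R_1)              % symmetry
R_{\mathrm{eq}}(R_1, R_2) < \min(R_1, R_2)                         % upper bound
R_{\mathrm{eq}}(R, R) = \tfrac{R}{2}                               % equal resistors
```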

Where to Look Next: Learning Viewpoint Recommendations for Informative Trajectory Planning

Conference paper (2022) - M. Lodel, B.F. Ferreira de Brito, A. Serra Gomez, L. Ferranti, R. Babuska, J. Alonso-Mora

Search missions require motion planning and navigation methods for information gathering that continuously replan based on new observations of the robot's surroundings. Current methods for information gathering, such as Monte Carlo Tree Search, are capable of reasoning over long horizons, but they are computationally expensive. An alternative for fast online execution is to train, offline, an information gathering policy, which indirectly reasons about the information value of new observations. However, these policies lack safety guarantees and do not account for the robot dynamics. To overcome these limitations we train an information-aware policy via deep reinforcement learning, that guides a receding-horizon trajectory optimization planner. In particular, the policy continuously recommends a reference viewpoint to the local planner, such that the resulting dynamically feasible and collision-free trajectories lead to observations that maximize the information gain and reduce the uncertainty about the environment. In simulation tests in previously unseen environments, our method consistently outperforms greedy next-best-view policies and achieves competitive performance compared to Monte Carlo Tree Search, in terms of information gains and coverage time, with a reduction in execution time by three orders of magnitude. ...

Learning 3D Shape Proprioception for Continuum Soft Robots with Multiple Magnetic Sensors

Journal article (2022) - T.A. Baaij, Marn Klein Holkenborg, Maximilian Stölzle, Daan van der Tuin, Jonatan Naaktgeboren, Robert Babuska, Cosimo Della Santina

Sensing the shape of continuum soft robots without obstructing their movements and modifying their natural softness requires innovative solutions. This letter proposes to use magnetic sensors fully integrated into the robot to achieve proprioception. Magnetic sensors are compact, sensitive, and easy to integrate into a soft robot. We also propose a neural architecture to make sense of the highly nonlinear relationship between the perceived intensity of the magnetic field and the shape of the robot. By injecting a priori knowledge from the kinematic model, we obtain an effective yet data-efficient learning strategy. We first demonstrate in simulation the value of this kinematic prior by investigating the proprioception behavior when varying the sensor configuration, which does not require us to re-train the neural network. We validate our approach in experiments involving one soft segment containing a cylindrical magnet and three magnetoresistive sensors. During the experiments, we achieve mean relative errors of 4.5%. ...

OpenDR: An Open Toolkit for Enabling High Performance, Low Footprint Deep Learning for Robotics

Conference paper (2022) - N. Passalis, S. Pedrazzi, More Authors..., R. Babuska, W. Burgard, F. Ferro, M. Gabbouj, E. Kayacan, J. Kober, R. Pieters, A. Valada

Existing Deep Learning (DL) frameworks typically do not provide ready-to-use solutions for robotics, where very specific learning, reasoning, and embodiment problems exist. Their relatively steep learning curve and the different methodologies employed by DL compared to traditional approaches, along with the high complexity of DL models, which often leads to the need of employing specialized hardware accelerators, further increase the effort and cost needed to employ DL models in robotics. Also, most of the existing DL methods follow a static inference paradigm, as inherited by the traditional computer vision pipelines, ignoring active perception, which can be employed to actively interact with the environment in order to increase perception accuracy. In this paper, we present the Open Deep Learning Toolkit for Robotics (OpenDR). OpenDR aims at developing an open, non-proprietary, efficient, and modular toolkit that can be easily used by robotics companies and research institutions to efficiently develop and deploy AI and cognition technologies to robotics applications, providing a solid step towards addressing the aforementioned challenges. We also detail the design choices, along with an abstract interface that was created to overcome these challenges. This interface can describe various robotic tasks, spanning beyond traditional DL cognition and inference, as known by existing frameworks, incorporating openness, homogeneity and robotics-oriented perception e.g., through active perception, as its core design principles. ...

ViewFormer

NeRF-Free Neural Rendering from Few Images Using Transformers

Conference paper (2022) - Jonáš Kulhánek, Erik Derner, Torsten Sattler, Robert Babuška

Novel view synthesis is a long-standing problem. In this work, we consider a variant of the problem where we are given only a few context views sparsely covering a scene or an object. The goal is to predict novel viewpoints in the scene, which requires learning priors. The current state of the art is based on Neural Radiance Field (NeRF), and while achieving impressive results, the methods suffer from long training times as they require evaluating millions of 3D point samples via a neural network for each image. We propose a 2D-only method that maps multiple context views and a query pose to a new image in a single pass of a neural network. Our model uses a two-stage architecture consisting of a codebook and a transformer model. The codebook is used to embed individual images into a smaller latent space, and the transformer solves the view synthesis task in this more compact space. To train our model efficiently, we introduce a novel branching attention mechanism that allows us to use the same model not only for neural rendering but also for camera pose estimation. Experimental results on real-world scenes show that our approach is competitive compared to NeRF-based methods while not reasoning explicitly in 3D, and it is faster to train. ...

Foreword - Proceedings 6th IFAC Conference on Intelligent Control and Automation Sciences, ICONS 2022

Journal article (2022) - Jus Kocijan, Robert Babuska, Kevin Guelton, Zsófia Lendek, Lucían Busoniu

DeepKoCo

Efficient latent planning with a task-relevant Koopman representation

Conference paper (2021) - Bas van der Heijden, Laura Ferranti, Jens Kober, Robert Babuska

This paper presents DeepKoCo, a novel model-based agent that learns a latent Koopman representation from images. This representation allows DeepKoCo to plan efficiently using linear control methods, such as linear model predictive control. Compared to traditional agents, DeepKoCo learns task-relevant dynamics, thanks to the use of a tailored lossy autoencoder network that allows DeepKoCo to learn latent dynamics that reconstruct and predict only observed costs, rather than all observed dynamics. As our results show, DeepKoCo achieves a final performance similar to that of traditional model-free methods on complex control tasks, while being considerably more robust to distractor dynamics, making the proposed agent more amenable to real-life applications. ...

Multi-objective symbolic regression for physics-aware dynamic modeling

Journal article (2021) - Jiří Kubalík, Erik Derner, Robert Babuška

Virtually all dynamic system control methods benefit from the availability of an accurate mathematical model of the system. This also includes methods like reinforcement learning, which can be vastly sped up and made safer by using a dynamic system model. However, obtaining a sufficient amount of informative data for constructing dynamic models can be difficult. Consequently, standard data-driven model learning techniques using small data sets that do not cover all important properties of the system yield models that are partly incorrect, for instance, in terms of their steady-state characteristics or local behavior. However, often some knowledge about the desired physical properties of the model is available. This knowledge should be incorporated into the model learning process to compensate for data insufficiency, and several symbolic regression approaches that do so were proposed recently. In this paper, we consider a multi-objective symbolic regression method that optimizes models with respect to their training error and the measure of how well they comply with the desired physical properties. We propose an extension to the existing algorithm that helps generate a diverse set of high-quality models. Further, we propose a method for selecting a single final model out of the pool of candidate output models. We experimentally demonstrate the approach on three real systems: the TurtleBot 2 mobile robot, the Parrot Bebop 2 drone and the magnetic manipulation system. The results show that the proposed model-learning algorithm yields accurate models that are physically justified. The improvement in terms of the model's compliance with prior knowledge over the models obtained when no prior knowledge was involved in the learning process is of several orders of magnitude. ...
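The abstract above describes optimizing two objectives at once: training error and compliance with desired physical properties. A minimal sketch of the underlying notion of Pareto dominance, assuming both objectives are minimized, is given below. The function name `pareto_front`, the candidate names, and the numbers are made up for illustration, and the sketch does not reproduce the paper's algorithm or its final-model selection method.

```python
def pareto_front(candidates):
    """Return the candidates that no other candidate dominates.

    Each candidate is a tuple (name, train_error, violation); lower is
    better for both objectives.
    """
    front = []
    for i, a in enumerate(candidates):
        dominated = any(
            # b dominates a: no worse in both objectives, strictly better in one
            b[1] <= a[1] and b[2] <= a[2] and (b[1] < a[1] or b[2] < a[2])
            for j, b in enumerate(candidates)
            if j != i
        )
        if not dominated:
            front.append(a)
    return front

# Hypothetical candidate models: (name, training error, constraint violation).
models = [
    ("m1", 0.10, 0.50),
    ("m2", 0.20, 0.05),
    ("m3", 0.25, 0.30),
    ("m4", 0.40, 0.00),
]
front = pareto_front(models)
```

Here `m3` is dropped because `m2` is better in both objectives. The remaining models represent different trade-offs between fitting the data and respecting the prior knowledge.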