J.H.G. Dauwels | TU Delft Repository

Motion representations for privacy-aware cross-domain action recognition

Journal article (2026) - P. Benschop, J.C. van Gemert, J.P. Mense, J.H.G. Dauwels

Video captured for action recognition often contains sensitive appearance cues such as faces, skin color, and clothing. Models trained on such data may exploit these cues rather than the underlying motion, raising privacy concerns in real-world deployment. In this work, we study action recognition under a motion-focused constraint: the model receives only motion representations that capture pixel displacement over time, while reducing appearance cues that expose identity or scene context. We focus on motion-history images and optical flow as learning-free representations that reduce identifiable appearance information while retaining action recognition accuracy. Our motion I3D model achieves approximately 31% and 52% zero-shot top-1 accuracy on HMDB-51 and UCF-101, respectively, outperforming non-CLIP direct-transfer baselines trained on Kinetics-400 despite operating without any appearance input. In 16-shot adaptation, the same model reaches 52% and 83% top-1 accuracy. In the domain adaptation setting on TP-HMDB↔TP-UCF, our motion-focused models achieve higher action recognition accuracy than prior privacy-preserving methods. Sensitive attribute predictability is reduced relative to RGB by a comparable margin, without requiring a learned privacy filter. On PA-HMDB51, optical flow is the strongest motion representation for privacy preservation, approaching chance level for skin-color prediction and remaining below RGB on most privacy attributes, indicating that motion representations retain useful action information while exposing less personal information. ...
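As a rough illustration of a learning-free motion representation, the sketch below computes a motion-history image from a sequence of grayscale frames. It follows the classic formulation (recently moved pixels are set to a maximum value and older motion decays linearly); the abstract does not state the exact variant used in the paper, so the function names and the `tau` and `threshold` parameters are assumptions made for this example.

```python
def update_mhi(mhi, prev_frame, frame, tau=10, threshold=30):
    """Return the motion-history image after observing one new grayscale frame.

    Pixels whose absolute intensity change exceeds `threshold` are set to `tau`;
    every other pixel decays by 1, floored at 0. (Hypothetical parameters.)
    """
    new_mhi = []
    for mhi_row, prev_row, cur_row in zip(mhi, prev_frame, frame):
        new_row = []
        for history, prev_px, cur_px in zip(mhi_row, prev_row, cur_row):
            if abs(cur_px - prev_px) > threshold:
                new_row.append(tau)                   # fresh motion
            else:
                new_row.append(max(history - 1, 0))   # older motion fades
        new_mhi.append(new_row)
    return new_mhi


def motion_history_image(frames, tau=10, threshold=30):
    """Fold a list of equally sized grayscale frames into a single MHI."""
    height, width = len(frames[0]), len(frames[0][0])
    mhi = [[0] * width for _ in range(height)]
    for prev_frame, frame in zip(frames, frames[1:]):
        mhi = update_mhi(mhi, prev_frame, frame, tau, threshold)
    return mhi
```

The resulting image encodes where and how recently motion occurred while discarding the original pixel intensities, which is why such representations carry less appearance information than RGB.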

Cross-Border e-Commerce Customs Risk Management

Exploring the Potential of Linking Digital Product Passport Data, X-Ray Scanned Images, and AI

Conference paper (2026) - B.D. Rukanova, J.H.G. Dauwels, Ger C. M. Koomen, Y. Tan, Susana Wong Chan, Frank Janssens, Toni Männistö

Cross-border e-commerce is growing rapidly, which makes it difficult for authorities to monitor and control the large volumes of goods entering the EU via postal and express services. Digital Product Passports (DPPs) are seen as a digital tool that can enable e-commerce monitoring; however, the roles DPPs can play are not yet fully understood. In this research, based on real-life piloting with scanned X-ray images of 10 packages containing textiles and toys, together with product data, we gained insights and defined further research directions for exploring the potential of DPPs, AI, and scanned images for customs risk management in the context of cross-border e-commerce. ...

Testing individual and group markers of collaboration in a team-based learning classroom

Journal article (2025) - Y. H. Victoria Chua, Justin Dauwels, Preman Rajalingam, Chew Lee Teo, Suzy J. Styles

Background: Intra-group discussions during actual TBL sessions play a huge role in knowledge consolidation and learning but are often understudied. Aims: Using a pre-registered study framework, we examined if participation equity (H1), reciprocal interaction (H2), information density (H3), mutual understanding (H4), and emotional rapport (H5) affected how much students learn from their intra-group team-based learning discussions and how they rated their team's discussions. Sample: Participants were 165 undergraduate students assigned to 28 teams. Methods: Using linguistic, conversational, and socio-affective features extracted from recordings of Year 1 and 2 medical students engaging in team-based learning, each construct was conceptualised at the level of the group and the individual. We used linear mixed-effects models and a competing-models approach to establish which of our metrics best account for the observed variance in individual learning gains and perceived collaboration quality. The analysis plan was preregistered, including correction for multiple comparisons. Results: None of our individual-level or group-level metrics significantly predicted individual learning gains. One of the group-level metrics significantly predicted perceived collaboration quality: reciprocal interaction. Our exploratory analysis found that the individual baseline score of the best performer in the team positively predicted individual learning gains for others in their team, regardless of other interaction metrics. Conclusion: While students perceived the highest collaboration quality when turn-taking in their team was evenly distributed, the strongest predictor of learning gains for a student was the knowledge level of their top-scoring team-mate. This finding has implications for classroom equity, group formation and activity planning. ...

Uncertainty-Aware Gate-Lifetime Prediction of p-GaN Gate HEMTs Using Gaussian Processes

Conference paper (2025) - S. Zhao, R. T. Rajan, A. N. Tallarico, M. Millesimo, V. Volosov, A. Imbruglia, J. Dauwels

The accurate prediction of Gallium Nitride High-Electron Mobility Transistors (GaN HEMTs) lifetime is essential for ensuring the reliability of power electronics. However, the complex and often competing degradation mechanisms within a single GaN-based transistor make lifetime extrapolation particularly challenging, especially under limited-data scenarios. In this work, we explore two machine learning approaches, i.e., XGBoost Regression and Gaussian Process Regression (GPR), for static gate lifetime prediction based on early measurements of current and ON-state resistance. In particular, we use features derived from empirical models to improve accuracy and model-specific methods to estimate uncertainty. We compare bootstrapped XGBoost ensembles, which yield empirical confidence intervals, with GPR, which provides analytical uncertainty estimates. Experiments on a time-dependent gate breakdown (TDGB) dataset spanning 16 voltage–temperature combinations show that GPR achieves an SMAPE of 8.8% and ECE of 0.028, outperforming XGBoost in Leave-One-Condition-Out Cross-Validation. These results highlight the feasibility of our proposed uncertainty-aware gate-lifetime prediction for Schottky p-GaN gate HEMTs in small-sample settings, and provide a basis for extending the framework towards time-dependent degradation modeling. ...
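For readers unfamiliar with the reported error metric, here is a minimal sketch of the symmetric mean absolute percentage error (SMAPE). Several SMAPE variants exist and the abstract does not say which one the authors use; this example assumes the common form with the mean of the absolute actual and predicted values in the denominator, and the function name is hypothetical.

```python
def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent.

    Assumed variant: mean over samples of |p - a| / ((|a| + |p|) / 2).
    A sample where both values are zero contributes zero error.
    """
    terms = []
    for a, p in zip(actual, predicted):
        denominator = (abs(a) + abs(p)) / 2.0
        terms.append(0.0 if denominator == 0 else abs(p - a) / denominator)
    return 100.0 * sum(terms) / len(terms)
```

Under this definition the metric is bounded between 0% and 200%, which makes it less sensitive than plain percentage error when lifetimes span several orders of magnitude.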

Enhancing Autonomous Vehicle Navigation Through Computer Vision

Techniques for Lane Marker Detection and Rain Removal

Book chapter (2025) - Sarat Chandra Nagavarapu, Anuj Abraham, Sihao Li, Justin Dauwels

Autonomous Vehicles (AVs) equipped with camera systems have emerged as a pivotal solution for smart urban mobility. The escalating demand for AVs emphasizes the need to prioritize driving safety, especially in challenging weather conditions like heavy rain. In this context, the accurate perception of environmental features, notably lane markers, becomes imperative for effective autonomous navigation. Severe weather can lead to camera image degradation, including blur and loss of details, impacting the accuracy of subsequent image processing. Despite the prevalence of camera-based methods, sensitivity to environmental noise, such as rain streaks, poses a challenge, necessitating preprocessing mechanisms like rain removal to enhance lane detection accuracy. This chapter focuses on the development of a vision-based algorithm dedicated to detecting and tracking lane markers, coupled with an efficient rain streak removal algorithm. A progressive approach to lane detection on city roads is presented, incorporating sliding windows and Kalman filter methodologies into a model-based method. Integration of the Kalman filter has yielded a notable improvement in video processing speeds, from 1.67 to 2.72 frames/s, enhancing overall operational efficiency. Furthermore, a novel neural network structure, amalgamating convolutional neural networks (CNNs) and long short-term memory (LSTM), is introduced for rain streak removal before performing lane marker detection. Comparative analysis against existing methods demonstrates an average 2.3% improvement in peak signal-to-noise ratio (PSNR) for rain removal and an 8% enhancement in Google Vision test results. ...
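The rain-removal result above is reported in terms of peak signal-to-noise ratio. As a small illustration, the sketch below computes PSNR between a reference image and a restored image, both given as nested lists of grayscale values. The function name and the default peak value of 255 (8-bit images) are assumptions for this example, not details taken from the chapter.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    squared_error = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        for ref_px, res_px in zip(ref_row, res_row):
            squared_error += (ref_px - res_px) ** 2
            count += 1
    mse = squared_error / count  # mean squared error over all pixels
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the de-rained frame is closer to the clean reference.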

Surgical Workflow Analysis

An Explainable Approach

Conference paper (2025) - Christos Spiliadis, Yiheng Chang, Justin Dauwels, Chavdar Bachvarov, John J. Van Den Dobbelsteen, Benno H.W. Hendriks, Maarten Van Der Elst, Markku Eskola

Surgical workflow analysis optimizes efficiency, resource use, and patient safety in catheterization labs. Traditional manual methods are labour-intensive and inconsistent, driving the need for automated solutions that utilize machine learning and computer vision. This thesis introduces an explainable two-stage model for workflow analysis using ceiling-mounted cameras. The approach combines a YOLOv8 object detection model with a Gaussian Mixture Model - Hidden Markov Model (GMM-HMM). The first stage detects key objects for input into the second stage, where the GMM-HMM infers workflow phases by modelling spatial and temporal dynamics for real-time classification. Validation on two hospital datasets achieves 95.2% accuracy for the RdGG dataset and 95.4% for HH Tampere, demonstrating generalizability across environments. Experimental results show high accuracy in detecting workflow phases, highlighting explainability and robustness. The combined efficiencies of YOLOv8 and GMM-HMM allow for precise phase transition identification. The model's real-time application and adaptability across hospitals suggest its clinical implementation potential. This research furthers automated workflow analysis by enhancing interpretability and adaptability. Future work aims to improve robustness against occlusions, integrate audio data, and explore applications in other surgical settings. ...

Physics-Informed Intelligent Motor Fault Detection

Conference paper (2025) - S. Li, R. T. Rajan, E. Marth, P. Zorn, W. Gruber, J. Dauwels

Intelligent Fault Detection (IFD) has garnered significant attention, with recent advances in AI-empowered predictive maintenance. A key challenge in applying IFD models lies in the interpretability of the methods, since the mechanisms are typically complex and difficult to integrate with data-driven approaches. In addition, the integration of edge devices is an emerging trend, which ensures fault detection and subsequent decision making on the edge, thus offering an instant response compared to a conventional centralized server-based architecture. However, to realize Edge-based IFD, the primary constraints are low storage capacity and limited computational resources. In this paper, we address various critical challenges in automatic Edge-based IFD for motors in industrial settings, focusing on three key constraints, i.e., (a) limited availability of training data, (b) the lack of method interpretability, and (c) the computational and storage limitations of edge devices. To overcome these challenges, we propose a suite of lightweight Physics-Informed (PI) AI algorithms to achieve Edge-based IFD without compromising detection performance. We validate our proposed methods on experimental data for motor fault detection, and additionally present results from the implementation of these methods on an edge device. We discuss the benefits of our proposed solutions, and give directions for future work. ...

Prediction of Postinduction Hypotension by Machine Learning

Conference paper (2024) - Shuoyan Zhao, Alan Hamo, Niki Ottenhof, Jan Wiebe H. Korstanje, Justin Dauwels

Post-induction hypotension (PIH) occurs shortly after anesthesia induction and is related to several post-operative complications. Medications delivered during induction and maintenance of anesthesia are significantly related to PIH occurrence, which remains common due to the intricate nature of clinical factors. To enhance decision-making on anesthetic dosing, machine learning (ML) is proposed to predict the risk of PIH associated with specific anesthetic dosages. This study focuses on the development of a prediction model for PIH to support anesthesia decision-making. Trained on 320 cases from the VitalDB database, the model incorporates demographic data, vital signs, and medication dosing information. By including the dosage of propofol administered during the induction period as an input variable, the algorithm predicts PIH risk before induction, providing valuable insights into the safety of propofol dosage plans. The results were validated using nested cross-validation, achieving high performance (precision of 0.83 and recall of 0.84). Moreover, an advisory model demonstrates the potential for personalizing a safe propofol dosage range for an individual patient. ...

Letter to the Editor

Announcement of a Call for Proposals for biomedical waveform coding

Journal article (2024) - J. J. Halford, G. Campobello, B. H. Brinkmann, M. Stead, S. Rampp, J. Rémi, K. B. Nilsen, J. Dauwels, M. Galanti, More authors...

MoReSo

A DNN Framework Expediting Content-based Video Image Retrieval (CBVIR)

Conference paper (2024) - Sinian Li, Doruk Barokas Profeta, Justin Dauwels

With the exponential growth of video data, individuals, particularly scholars in the fields of history and sociology, are increasingly reliant on video materials. However, the task of locating specific frames within videos remains a laborious and time-consuming endeavor. Advanced machine learning-assisted video processing techniques have emerged, including text-based video searches, video summarization, real-time object detection, and person re-identification. However, distinct from these, the main challenge of retrieving video frames based on given visual content is how to efficiently and accurately pinpoint the instance occurrences. To expedite the process while maintaining retrieval performance, we propose a two-stage approach, combining KeyFrame Extraction (KFE) and Content-based Image Retrieval (CBIR), underpinned by a DNN-empowered framework called MoReSo. Our innovations include 1) the integration of improved statistical features with dynamic clustering in the KFE stage and 2) the development of the MoReSo framework, which consists of MobileNet and ResNet backbones with an SOA layer to jointly represent video frames, achieving a 2.67x increase in efficiency compared to existing solutions. Our framework is evaluated on two datasets: the annotated EHM Historical Database provided by digital history researchers and the widely used Oxford and Paris image retrieval benchmark datasets. The experimental results show that the proposed framework and scheme outperform other models in the CBVIR task. We make our code available for further exploration through our GitHub repository. This repository contains the implementation of our model and CBVIR system with a GUI prototype. ...

Towards Robust Object Detection in Unseen Catheterization Laboratories

Conference paper (2024) - Zipeng Wang, Rick Butler, John van den Dobbelsteen, Benno Hendriks, Maarten van der Elst, Justin Dauwels

Deep learning-based object detectors, while offering exceptional performance, are data-dependent and can suffer from generalization issues. In this work, we investigated deep neural networks for detecting people and medical instruments for the vision-based workflow analysis system inside Catheterization Laboratories (Cath Labs). The central problem explored in this paper is the fact that the performance of the detector can degrade drastically if it is trained and tested on data from different Cath Labs. Our research aimed to investigate the underlying causes of this specific performance degradation and find solutions to mitigate this issue. We employed the YOLOv8 object detector and created datasets from clinical procedures recorded at Reinier de Graaf Hospital (RdGG) and Philips Best Campus, supplemented with publicly accessible images. Through a series of experiments complemented by data visualization, we discovered that the performance degradation primarily stems from data distribution shifts in the feature space. Notably, the object detector trained on non-sensitive online images can generalize to unseen Cath Labs, outperforming the model trained on a procedure recording from a different Cath Lab. The detector trained on the online images achieved an mAP@0.5 of 0.517 on the RdGG dataset. Furthermore, by switching to the most suitable camera for each object in the Cath Lab, the multi-camera system can further improve the detection performance significantly. An aggregated L-camera mAP@0.5 of 0.679 is achieved for single-object classes on the RdGG dataset. ...
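The "@0.5" in mAP@0.5 refers to an intersection-over-union (IoU) threshold: a predicted box is normally counted as correct only if its IoU with a ground-truth box of the same class is at least 0.5. The sketch below shows only that overlap criterion, assuming boxes in `(x1, y1, x2, y2)` corner format; the function names are hypothetical, and the full mAP computation (one-to-one matching, ranking by confidence, averaging precision over classes) is not shown.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def meets_iou_threshold(predicted_box, ground_truth_box, threshold=0.5):
    """True if the predicted box overlaps the ground truth enough to count as a match."""
    return iou(predicted_box, ground_truth_box) >= threshold
```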

Precipitation Nowcasting Using Physics Informed Discriminator Generative Models

Conference paper (2024) - Junzhe Yin, Cristian Meo, Ankush Roy, Zeineh Bou Cher, Mircea Lică, Yanbo Wang, Ruben Imhoff, Remko Uijlenhoet, Justin Dauwels

Nowcasting leverages real-time atmospheric conditions to forecast weather over short periods. State-of-the-art models, including PySTEPS, encounter difficulties in accurately forecasting extreme weather events because of their unpredictable distribution patterns. In this study, we design a physics-informed neural network to perform precipitation nowcasting using the precipitation and meteorological data from the Royal Netherlands Meteorological Institute (KNMI). This model draws inspiration from the novel Physics-Informed Discriminator GAN (PID-GAN) formulation, directly integrating physics-based supervision within the adversarial learning framework. The proposed model adopts a GAN structure, featuring a Vector Quantization Generative Adversarial Network (VQ-GAN) and a Transformer as the generator, with a temporal discriminator serving as the discriminator. Our findings demonstrate that the PID-GAN model outperforms numerical and SOTA deep generative models in terms of precipitation nowcasting downstream metrics. ...

NeuroDots

From Single-Target to Brain-Network Modulation: Why and What Is Needed?

Review (2024) - Dirk De Ridder, Muhammad Ali Siddiqi, Justin Dauwels, Wouter A. Serdijn, Christos Strydis

Objectives: Current techniques in brain stimulation are still largely based on a phrenologic approach that a single brain target can treat a brain disorder. Nevertheless, meta-analyses of brain implants indicate an overall success rate of 50% improvement in 50% of patients, irrespective of the brain-related disorder. Thus, there is still a large margin for improvement. The goal of this manuscript is to 1) develop a general theoretical framework of brain functioning that is amenable to surgical neuromodulation, and 2) describe the engineering requirements of the next generation of implantable brain stimulators that follow from this theoretic model. Materials and Methods: A neuroscience and engineering literature review was performed to develop a universal theoretical model of brain functioning and dysfunctioning amenable to surgical neuromodulation. Results: Even though a single target can modulate an entire network, research in network science reveals that many brain disorders are the consequence of maladaptive interactions among multiple networks rather than a single network. Consequently, targeting the main connector hubs of those multiple interacting networks involved in a brain disorder is theoretically more beneficial. We, thus, envision next-generation network implants that will rely on distributed, multisite neuromodulation targeting correlated and anticorrelated interacting brain networks, juxtaposing alternative implant configurations, and finally providing solid recommendations for the realization of such implants. In doing so, this study pinpoints the potential shortcomings of other similar efforts in the field, which somehow fall short of the requirements. Conclusion: The concept of network stimulation holds great promise as a universal approach for treating neurologic and psychiatric disorders. ...

Machine Learning Algorithm to Estimate Cardiac Output Based On Less-Invasive Arterial Blood Pressure Measurements

Conference paper (2024) - Alan Hamo, Niki Ottenhof, Jan Wiebe H. Korstanje, Justin Dauwels

Cardiac output (CO) is a vital hemodynamic parameter that reflects the blood volume pumped by the heart per minute. A less-invasive way to estimate CO is by analyzing arterial blood pressure (ABP) waveforms. However, the relationship between CO and blood pressure is unknown. This study uses machine learning and feature engineering techniques to discover the relationship between CO and ABP. We apply the sparse identification of nonlinear dynamics (SINDy) algorithm to discover features. Additionally, we investigate the optimum number of cardiac cycles required for feature extraction to achieve the best performance. The proposed approach achieves clinically acceptable performance regarding radial limits of agreement (RLOA) and bias (RBias). Further, the proposed approach is validated on an external dataset. Finally, similarities to the Navier-Stokes equations are presented. ...

LGM³A 2024 Chairs’ Welcome

Conference paper (2024) - Shihao Xu, Yiyang Luo, Justin Dauwels, Andy Khong, Zheng Wang, Qianqian Chen, Chen Cai, Wei Shi, Tat Seng Chua

Fusion of Probabilistic Projections of Sea-Level Rise

Journal article (2024) - Benjamin S. Grandey, Justin Dauwels, Zhi Yang Koh, Benjamin P. Horton, Lock Yue Chew

A probabilistic projection of sea-level rise uses a probability distribution to represent scientific uncertainty. However, alternative probabilistic projections of sea-level rise differ markedly, revealing ambiguity, which poses a challenge to scientific assessment and decision-making. To address the challenge of ambiguity, we propose a new approach to quantify a best estimate of the scientific uncertainty associated with sea-level rise. Our proposed fusion combines the complementary strengths of the ice sheet models and expert elicitations that were used in the Sixth Assessment Report (AR6) of the Intergovernmental Panel on Climate Change (IPCC). Under a low-emissions scenario, the fusion's very likely range (5th–95th percentiles) of global mean sea-level rise is 0.3–1.0 m by 2100. Under a high-emissions scenario, the very likely range is 0.5–1.9 m. The 95th percentile projection of 1.9 m can inform a high-end storyline, supporting decision-making for activities with low uncertainty tolerance. By quantifying a best estimate of scientific uncertainty, the fusion caters to diverse users. ...

Tide–surge interaction observed at Singapore and the east coast of Peninsular Malaysia using a semi-empirical model

Journal article (2024) - Zhi Yang Koh, Benjamin S. Grandey, Dhrubajyoti Samanta, Adam D. Switzer, Benjamin P. Horton, J.H.G. Dauwels, Lock Yue Chew

Tide–surge interaction plays a substantial role in determining the characteristics of coastal water levels over shallow regions. We study the tide–surge interaction observed at seven tide gauges along Singapore and the east coast of Peninsular Malaysia, focusing on the timing of extreme non-tidal residuals relative to tidal high water. We propose a modified statistical framework using a no-tide–surge interaction (no-TSI) null distribution that accounts for asymmetry and variation in the duration of tidal cycles. We find that our modified framework can mitigate false-positive signals of tide–surge interaction in this region. We find evidence of tide–surge interaction at all seven locations, with characteristics varying smoothly along the coastline: the highest non-tidal residuals are found to occur most frequently before tidal high water in the south, both before and after tidal high water in the central region, and after tidal high water in the north. We also propose a semi-empirical model to investigate the effects of tidal-phase alteration, which is one mechanism of tide–surge interaction. Results of our semi-empirical model reveal that tidal-phase alteration caused by storm surges is substantial enough to generate significant change in the timing of extreme non-tidal residuals. To mitigate the effect of tidal-phase alteration on return level estimation, skew surge can be used. We conclude that (1) tide–surge interaction influences coastal water levels in this region, (2) our semi-empirical model provides insight into the mechanism of tidal-phase alteration, and (3) our no-TSI distribution should be used for similar studies globally. ...

Unveiling Hidden Anomalies

A Hybrid Approach for Surface Mounted Electronics

Conference paper (2024) - Amir Ghorbani Ghezeljehmeidan, Willem Dirk van Driel, Justin Dauwels

Industrial assembly lines are the heartbeat of modern manufacturing, where precision and efficiency are paramount. This paper introduces a novel hybrid Explainable artificial intelligence (XAI) approach to enhance monitoring and analysis in industrial assembly. By fusing the power of vision anomaly detection models with the clarity of the gradient tree boosting algorithm, this framework not only boosts defect detection accuracy but also provides transparent, actionable insights. This synergy transforms how operators and engineers interact with AI, fostering trust and enhancing operational excellence. ...

LGM3A 2024

The 2nd Workshop on Large Generative Models Meet Multimodal Applications

Conference paper (2024) - Shihao Xu, Yiyang Luo, Justin Dauwels, Andy Khong, Zheng Wang, Qianqian Chen, Chen Cai, Wei Shi, Tat Seng Chua

This workshop aims to explore the potential of large generative models to revolutionize how we interact with multimodal information. A Large Language Model (LLM) represents a sophisticated form of artificial intelligence engineered to comprehend and produce natural language text, exemplified by technologies such as GPT, LLaMA, Flan-T5, ChatGLM, Qwen, etc. These models undergo training on extensive text datasets, exhibiting commendable attributes including robust language generation, zero-shot transfer capabilities, and In-Context Learning (ICL). With the surge in multimodal content (encompassing images, videos, audio, and 3D models) over the recent period, Large MultiModal Models (LMMs) have seen significant enhancements. These improvements enable the augmentation of conventional LLMs to accommodate multimodal inputs or outputs, as seen in BLIP, Flamingo, KOSMOS, LLaVA, Gemini, GPT-4, etc. Concurrently, certain research initiatives have developed specific modalities, with Kosmos2 and MiniGPT-5 focusing on image generation, and SpeechGPT on speech production. There are also endeavors to integrate LLMs with external tools to achieve a near "any-to-any" multimodal comprehension and generation capacity, illustrated by projects like Visual-ChatGPT, ViperGPT, MMREACT, HuggingGPT, and AudioGPT. Collectively, these models, spanning not only text and image generation but also other modalities, are referred to as large generative models. This workshop will allow researchers, practitioners, and industry professionals to explore the latest trends and best practices in the multimodal applications of large generative models. ...

Evaluation of different classification methods using electronic nose data to diagnose sarcoidosis

Journal article (2023) - Iris G. van der Sar, Nynke van Jaarsveld, Imme A. Spiekerman, Floor J. Toxopeus, Quint L. Langens, Marlies S. Wijsenbeek, Justin Dauwels, Catharina C. Moor

Electronic nose (eNose) technology is an emerging diagnostic application, using artificial intelligence to classify human breath patterns. These patterns can be used to diagnose medical conditions. Sarcoidosis is an often difficult to diagnose disease, as no standard procedure or conclusive test exists. An accurate diagnostic model based on eNose data could therefore be helpful in clinical decision-making. The aim of this paper is to evaluate the performance of various dimensionality reduction methods and classifiers in order to design an accurate diagnostic model for sarcoidosis. Various methods of dimensionality reduction and multiple hyperparameter-optimised classifiers were tested and cross-validated on a dataset of patients with pulmonary sarcoidosis (n = 224) and other interstitial lung disease (n = 317). Best performing methods were selected to create a model to diagnose patients with sarcoidosis. Nested cross-validation was applied to calculate the overall diagnostic performance. A classification model with feature selection and random forest (RF) classifier showed the highest accuracy. The overall diagnostic performance resulted in an accuracy of 87.1% and area-under-the-curve of 91.2%. After comparing different dimensionality reduction methods and classifiers, a highly accurate model to diagnose a patient with sarcoidosis using eNose data was created. The RF classifier and feature selection showed the best performance. The presented systematic approach could also be applied to other eNose datasets to compare methods and select the optimal diagnostic model. ...