Spatio-Temporal Transformer for Load Estimation using EMG and IMU in Assistive Robotics

Master Thesis (2026)

Author(s)

B.C. Wingen (TU Delft - Mechanical Engineering)

Contributor(s)

A.H.A. Stienen – Mentor (TU Delft - Mechanical Engineering)

X. Zhang – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Mechanical Engineering

Deep Learning, Transformer, IMU, Robotics, Regression, EMG, Exoskeleton, Sensor fusion, Load Estimation, Assistive

To reference this document use

https://resolver.tudelft.nl/uuid:eca68a74-0969-417c-aba5-f7b63633ec18

More Info

Publication Year

2026

Language

English

Graduation Date

23-06-2026

Awarding Institution

Delft University of Technology

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Intuitive control of assistive robotic devices, such as exoskeletons and arm supports, requires inferring the user’s interaction with objects in the environment. Surface electromyography (EMG) and inertial measurement units (IMUs) provide complementary information about muscle activation and limb kinematics, but interpreting these sensing modalities for real-time control remains challenging. Deep learning is effective for modeling human motion intention but has seen limited use in estimating the handheld load during object manipulation. This paper proposes a sensor-fused spatio-temporal transformer (ST-Transformer) that regresses the handheld load from synchronized EMG and IMU signals, together with a real-time acquisition and processing pipeline for an arm support device. Data from 17 participants performing a weight-movement task spanning six weight classes (0 to 6 kg) were used. EMG and IMU normalization, dataset-balancing augmentation, dropout, and weight decay were applied to improve cross-participant generalization. When trained and tested on the same participants, the sensor-fused model estimated load accurately (all metrics participant-class-balanced; R² = 0.935, MAE = 0.316 kg, RMSE = 0.441 kg) and significantly outperformed an EMG-only model (R² = 0.913, MAE = 0.380 kg, RMSE = 0.520 kg). Under Leave-One-Participant-Out (LOPO) cross-validation, however, the fused model (R² = 0.853, MAE = 0.536 kg, RMSE = 0.680 kg) retained only a slight, statistically non-significant edge over EMG alone (R² = 0.839, MAE = 0.546 kg, RMSE = 0.703 kg), while the IMU-only model degraded sharply. This indicates that the transferable load information is carried primarily by muscle activation, while the complementary IMU contribution is largely entangled with participant-specific characteristics. An attribution analysis localizes the load-relevant signal to the forearm muscles, indicating that a compact forearm-worn sensor set captures most of the usable signal. The model (approximately 1.03 × 10⁶ parameters) is feasible for real-time on-device inference on current microcontrollers.
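The Leave-One-Participant-Out protocol and the three reported metrics can be illustrated with a minimal sketch. It is not taken from the thesis: the function names (`lopo_splits`, `regression_metrics`, `balanced_mae`) are hypothetical, and the reading of "participant-class-balanced" as an unweighted mean over (participant, weight class) cells is an assumption, since the abstract does not define the term.

```python
import numpy as np


def lopo_splits(participant_ids):
    """Yield (held_out, train_idx, test_idx), holding out one participant per fold."""
    participant_ids = np.asarray(participant_ids)
    for held_out in np.unique(participant_ids):
        test_mask = participant_ids == held_out
        yield held_out, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


def regression_metrics(y_true, y_pred):
    """Plain (unbalanced) R^2, MAE and RMSE for a load regression in kg."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
    }


def balanced_mae(y_true, y_pred, participant_ids, class_ids):
    """MAE averaged over (participant, class) cells.

    ASSUMPTION: one possible reading of "participant-class-balanced";
    every cell contributes equally, regardless of its sample count.
    """
    cells = {}
    for t, p, pid, cid in zip(y_true, y_pred, participant_ids, class_ids):
        cells.setdefault((pid, cid), []).append(abs(float(p) - float(t)))
    return float(np.mean([np.mean(errors) for errors in cells.values()]))


if __name__ == "__main__":
    # Toy data: 3 participants, 2 samples each (loads in kg).
    participants = [0, 0, 1, 1, 2, 2]
    for held_out, train_idx, test_idx in lopo_splits(participants):
        print(held_out, train_idx.tolist(), test_idx.tolist())
```

In a real evaluation, a model would be trained on `train_idx` and scored on `test_idx` in each fold, so that no data from the held-out participant influences training. Balancing by cell keeps a participant or weight class with many windows from dominating the reported score.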

Files

Master_Thesis_Bas_Wingen_final... (pdf)

(pdf | 11.6 Mb)

License info not available