Unsupervised Representation Learning for Monitoring Rail Infrastructures With High-Frequency Moving Vibration Sensors

Journal Article (2025)
Authors

W. Phusakulkajorn (TU Delft - Railway Engineering)

Y. Zeng (TU Delft - Railway Engineering)

Z. Li (TU Delft - Railway Engineering)

Alfredo Núñez (TU Delft - Railway Engineering)

Research Group
Railway Engineering
To reference this document use:
https://doi.org/10.1109/TITS.2025.3557712
Publication Year
2025
Language
English
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Nowadays, rolling stock can be equipped with high-frequency vibration sensors to continuously monitor rail infrastructures and detect defects. These moving sensors measure at high speeds and sampling frequencies, generating a massive amount of data that covers each track position with very short signal durations. These data contain a variety of dynamic and transient responses that vary significantly along the track and are affected by noise. This leads to a large amount of unlabeled and noisy data, complicating the extraction of dynamic responses for effective anomaly detection. To address these challenges, this paper proposes an unsupervised representation learning methodology to automatically capture and extract characteristic features of dynamic responses that reflect the conditions of rail infrastructures. The unsupervised nature allows exploratory analysis of high-frequency vibration signals when prior knowledge or reference information about infrastructure conditions is unavailable or very limited. A collaborative optimization process that synchronizes empirical mode decomposition (EMD) with a convolutional autoencoder (CAE) is presented. The EMD level is tuned to remove noise while preserving effective vibration responses. The CAE is trained using demodulated signals that are considered normal to generate representations that ensure reconstruction quality and differentiate between normal and abnormal conditions. Furthermore, a Gaussian mixture model is used to showcase the effectiveness of the learned representations for rail infrastructures. Applied to validated axle box acceleration data for rail defect detection and train-borne laser Doppler vibrometer data for rail fastener monitoring, our method outperforms other variants of autoencoder-based models and the wavelet-based CAE in accurately identifying the conditions. It achieves an average improvement of 16% with the axle box acceleration data and 21% with the laser Doppler vibrometer data.
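
The last step described in the abstract, using a Gaussian mixture model to separate normal from abnormal conditions, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the dimensionality of the learned representations or the mixture configuration, so the sketch assumes a single scalar score per track segment (for example, a one-dimensional summary of a representation) and a two-component mixture fitted with plain expectation-maximization. The names `fit_gmm_1d` and `predict_label`, and the synthetic scores, are hypothetical and serve only as illustration.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(values, n_iter=100, min_var=1e-6):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, variances) with components sorted by mean.
    """
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n

    # Assumption: initialise the two means at the extremes of the data.
    means = [min(values), max(values)]
    variances = [max(overall_var, min_var)] * 2
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            p = [weights[k] * gaussian_pdf(v, means[k], variances[k]) for k in range(2)]
            total = p[0] + p[1]
            if total == 0.0:
                resp.append([0.5, 0.5])  # guard against numerical underflow
            else:
                resp.append([p[0] / total, p[1] / total])

        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var_k, min_var)

    # Sort components so that index 1 is always the higher-mean component.
    order = sorted(range(2), key=lambda k: means[k])
    return (
        [weights[k] for k in order],
        [means[k] for k in order],
        [variances[k] for k in order],
    )


def predict_label(x, weights, means, variances):
    """Return 0 for the lower-mean component and 1 for the higher-mean one."""
    p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
    return 0 if p[0] >= p[1] else 1


# Synthetic scores (hypothetical, not data from the paper):
# many low scores standing in for normal segments, a few high ones for abnormal.
random.seed(0)
scores = [random.gauss(0.1, 0.02) for _ in range(300)]
scores += [random.gauss(0.6, 0.1) for _ in range(30)]

weights, means, variances = fit_gmm_1d(scores)
labels = [predict_label(s, weights, means, variances) for s in scores]
```

Treating the higher-mean component as "abnormal" is itself an assumption of this sketch. In an unsupervised setting such as the one the abstract describes, the interpretation of each mixture component would have to be checked against whatever validated reference information is available.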