Motor Fault Detection Using Transformer-Based Models
J. Zhou (TU Delft - Electrical Engineering, Mathematics and Computer Science)
J. Dauwels – Mentor (TU Delft - Signal Processing Systems)
Qing Wang – Graduation committee member (TU Delft - Embedded Systems)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
With the rapid development of industrial systems, stability, reliability, and robustness have become increasingly critical requirements. Fault detection has emerged as a key research area that aims to prevent unexpected failures and performance degradation. Recent advances in feature extraction and machine learning have enabled the development of intelligent, autonomous fault detection systems.
This thesis proposes two Transformer-based models for motor fault detection. The first is a supervised classification model that incorporates discrete wavelet transform (DWT) to decompose time-series signals into multi-scale components, which are then processed by Transformer-based architectures to extract features for classification. Two structural variants are explored: one using masked attention over concatenated coefficients, and another employing upsampling and linear attention for efficient fusion. The second approach is an unsupervised forecasting-based model, where only normal samples are used for training. At inference time, samples are classified based on whether their forecasting error exceeds a threshold determined via ROC curve analysis on a validation set.
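The thresholding step of the forecasting-based approach can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract states only that the threshold comes from ROC curve analysis on a validation set, so the selection criterion used here (maximizing Youden's J, i.e. true-positive rate minus false-positive rate), the function names `select_threshold` and `is_faulty`, and the toy error values are all assumptions made for illustration.

```python
def select_threshold(errors, labels):
    """Pick a forecasting-error threshold from labeled validation samples.

    errors: forecasting error per validation sample (higher = more anomalous)
    labels: 1 for a faulty sample, 0 for a normal one
    Assumption: the ROC operating point is chosen by maximizing
    Youden's J = TPR - FPR; the source does not name the criterion.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("validation set needs both normal and faulty samples")

    best_threshold, best_j = None, float("-inf")
    # Each distinct error value is a candidate threshold (one ROC point each).
    for threshold in sorted(set(errors)):
        # A sample is flagged as faulty when its error exceeds the threshold.
        tp = sum(1 for e, y in zip(errors, labels) if e > threshold and y == 1)
        fp = sum(1 for e, y in zip(errors, labels) if e > threshold and y == 0)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold


def is_faulty(error, threshold):
    """Classify one sample at inference time from its forecasting error."""
    return error > threshold


# Hypothetical validation errors: four normal samples and two faulty ones.
val_errors = [0.10, 0.12, 0.15, 0.40, 0.55, 0.11]
val_labels = [0, 0, 0, 1, 1, 0]
threshold = select_threshold(val_errors, val_labels)
```

Because the forecaster in such a setup is trained on normal samples only, labeled faults are needed just for the validation set used to pick the threshold. Other ROC criteria, such as fixing a maximum false-positive rate, would fit the same structure.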
Experiments conducted on the JKU and CWRU datasets demonstrate the effectiveness of both approaches. The classification-based method achieves high accuracy in distinguishing between fault types, while the forecasting-based method shows strong robustness to previously unseen fault categories without retraining. The findings indicate that the models are capable of capturing informative temporal patterns for fault detection, showing promise for further exploration in real-world settings.