Anatomy-Aware Masked Autoencoders for Hip Osteoarthritis Classification in X-ray Images
J.C. van Beusekom (TU Delft - Electrical Engineering, Mathematics and Computer Science)
G. van Tulder – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
J.H. Krijthe – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
Michael Weinmann – Graduation committee member (TU Delft - Computer Graphics and Visualisation)
Abstract
Self-Supervised Learning (SSL) has been shown to effectively exploit unlabelled data for pre-training models used in downstream medical tasks. This allows SSL to draw on much larger datasets than supervised models, which require manually labelled data. Medical classification tasks often require identifying patterns inside a small Region Of Interest (ROI) known to be relevant for radiographic diagnosis, in contrast to standard image classification tasks, which generally rely on broader patterns. To guide a model towards learning such anatomically relevant features, we investigated the hip osteoarthritis classification performance of an ROI-guided Masked Autoencoder (MAE) with a Convolutional Neural Network (CNN)-based architecture. Unlike conventional MAEs, which learn latent features by reconstructing randomly masked images, our alternative uses generated anatomical landmarks to exclusively mask either the ROI or the background. Contrary to similar research on Vision Transformer (ViT)-based MAEs, random masking outperformed our ROI-guided alternatives, revealing a fundamental difference in what drives performance for the two architectures and informing future research on more sophisticated ROI-guided masking strategies. The code is available on GitHub: https://github.com/Jasperdetweede/AnatAMAE/
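To make the contrast between the two pre-training strategies concrete, the sketch below illustrates conventional random patch masking alongside an ROI-guided variant that hides only patches overlapping an anatomical region of interest (or only the background). This is a minimal illustration under stated assumptions, not the authors' implementation: the patch size, mask ratio, and the binary `roi_mask` input (e.g. rasterised from hip-joint landmarks) are all illustrative choices.

```python
# Minimal sketch: random vs ROI-guided patch masking for an MAE.
# Assumptions (not taken from the paper): 16-pixel square patches,
# a 0.75 mask ratio, and a binary H x W `roi_mask` derived from
# anatomical landmarks.
import numpy as np


def random_mask(image_hw, patch=16, ratio=0.75, rng=None):
    """Conventional MAE masking: hide a random subset of patches."""
    rng = rng or np.random.default_rng()
    h, w = image_hw
    gh, gw = h // patch, w // patch
    n = gh * gw
    idx = rng.permutation(n)[: int(n * ratio)]
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    return mask.reshape(gh, gw)  # True = patch is masked out


def roi_guided_mask(roi_mask, patch=16, target="roi"):
    """ROI-guided masking: hide only patches overlapping the ROI
    (target='roi') or only background patches (target='background').
    `roi_mask` is a binary H x W array marking the region of interest."""
    h, w = roi_mask.shape
    gh, gw = h // patch, w // patch
    # View the image as a grid of patches; a patch counts as ROI
    # if any of its pixels fall inside the region of interest.
    patches = roi_mask[: gh * patch, : gw * patch].reshape(gh, patch, gw, patch)
    in_roi = patches.any(axis=(1, 3))
    return in_roi if target == "roi" else ~in_roi
```

In this sketch, the reconstruction target of the autoencoder is steered by which patches are hidden: masking the ROI forces the model to reconstruct the anatomically relevant region from context, while masking the background keeps the ROI visible and asks the model to reconstruct everything around it.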