Anatomy-aware data augmentation techniques in contrastive self-supervised learning for diagnosing hip osteoarthritis in X-ray images
Z. Yancheva (TU Delft - Electrical Engineering, Mathematics and Computer Science)
JH Krijthe – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
G. van Tulder – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
Michael Weinmann – Graduation committee member (TU Delft - Computer Graphics and Visualisation)
Abstract
Supervised learning approaches have proven useful in diagnosing osteoarthritis from X-ray images, aiding professionals in an otherwise time-consuming and subjective process. However, labeled data is scarce in the medical field. For this reason, we investigate SimCLR, a contrastive self-supervised approach capable of learning useful representations from unlabeled data. Specifically, we explore a core component of this method: its data augmentation techniques. While these augmentations are highly effective at introducing variability in conventional image datasets, they are too aggressive for medical images, often altering their semantic meaning. In this paper, we implement custom anatomy-aware augmentation techniques that aim to preserve the main region of interest needed for a diagnosis. We evaluate these anatomy-aware augmentations, including Gaussian blur, contrast enhancement, random resized crop, and random erasing, against their classical counterparts by training multiple encoders on different combinations of these augmentations. Our findings show that applying the anatomy-aware approach to all of a model's data augmentations does not lead to a significant improvement in performance. However, selectively applying anatomy-awareness to the geometry-based augmentations appears to show promising initial results.
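To make the idea of an anatomy-aware augmentation concrete, the sketch below shows one possible form of random erasing that leaves a region of interest untouched. The abstract does not describe how the augmentations are implemented, so everything here is an assumption for illustration: the function names, the rectangular `roi` box in `(top, left, height, width)` form, the `scale` and aspect-ratio ranges, and the rejection-sampling strategy are all hypothetical and are not taken from the thesis.

```python
import numpy as np


def boxes_overlap(a, b):
    """Return True if two (top, left, height, width) boxes intersect."""
    at, al, ah, aw = a
    bt, bl, bh, bw = b
    return at < bt + bh and bt < at + ah and al < bl + bw and bl < al + aw


def anatomy_aware_erase(image, roi, rng, scale=(0.02, 0.2), max_tries=50):
    """Erase one random rectangle that does not overlap the ROI.

    image: 2-D grayscale array.
    roi:   hypothetical (top, left, height, width) box around the joint.
    rng:   a numpy.random.Generator.
    Returns a modified copy; if no valid rectangle is found within
    max_tries, the copy is returned unchanged.
    """
    h, w = image.shape
    out = image.copy()
    for _ in range(max_tries):
        # Sample the erased area and aspect ratio (assumed ranges).
        area = rng.uniform(*scale) * h * w
        aspect = rng.uniform(0.3, 3.3)
        eh = int(round(np.sqrt(area * aspect)))
        ew = int(round(np.sqrt(area / aspect)))
        if eh < 1 or ew < 1 or eh >= h or ew >= w:
            continue
        top = int(rng.integers(0, h - eh + 1))
        left = int(rng.integers(0, w - ew + 1))
        # Reject candidates that would touch the region of interest.
        if not boxes_overlap((top, left, eh, ew), roi):
            out[top:top + eh, left:left + ew] = 0
            return out
    return out
```

A classical random erasing step would skip the overlap check and could therefore blank out the joint itself. Rejection sampling is only one way to avoid that; other designs, such as sampling directly from the area outside the ROI, would serve the same purpose.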