Self-supervised learning (SSL) is a promising approach for medical imaging tasks because it reduces the need for labeled data, but most existing SSL methods treat each scan as an isolated sample and overlook the fact that patients often have multiple radiographs taken over time. These longitudinal sequences (multiple scans of the same hip acquired at different visits) encode the natural progression of osteoarthritis (OA) and thus could enrich representation learning. In this study, we evaluate whether incorporating temporal information from these longitudinal radiographic sequences into SSL pretraining yields more transferable representations and improves downstream classification of hip OA severity. We focus on a temporal contrastive task (Contrastive Predictive Coding, CPC), which learns to predict future scan representations from earlier ones, and compare it to SimCLR-based pretraining, which treats each radiograph independently. We also investigate a multitask framework that combines both objectives, either by pretraining sequentially with CPC followed by SimCLR or by interleaving the two tasks. Experiments on the Osteoarthritis Initiative (OAI) dataset for binary classification of Kellgren–Lawrence (KL) grade severity show that CPC alone does not surpass SimCLR-based pretraining. However, both the sequential and interleaved multitask approaches significantly improve classification accuracy over either single-task method. These findings demonstrate that, although temporal prediction by itself is not sufficient, combining temporal and within-scan contrastive learning can yield stronger models for hip OA severity assessment.
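To make the temporal objective concrete, the sketch below shows one way a CPC-style InfoNCE loss over longitudinal pairs could be implemented. This is a simplified, single-step illustration under our own assumptions, not the authors' code: the function name `cpc_infonce_loss`, the `predictor` head, and the `temperature` value are all hypothetical, and a full CPC setup would typically add an autoregressive context network over the visit sequence.

```python
# Hypothetical sketch of a CPC-style temporal InfoNCE loss (not the paper's
# implementation). Given embeddings of a patient's earlier and later scans,
# a small prediction head maps the earlier embedding toward the later one;
# the later embeddings of other patients in the batch act as negatives.
import torch
import torch.nn.functional as F


def cpc_infonce_loss(z_past: torch.Tensor,
                     z_future: torch.Tensor,
                     predictor: torch.nn.Module,
                     temperature: float = 0.1) -> torch.Tensor:
    """z_past, z_future: (batch, dim) embeddings of earlier/later visits."""
    pred = F.normalize(predictor(z_past), dim=-1)    # predicted future representation
    target = F.normalize(z_future, dim=-1)
    logits = pred @ target.t() / temperature         # (batch, batch) similarities
    labels = torch.arange(z_past.size(0), device=z_past.device)
    return F.cross_entropy(logits, labels)           # diagonal pairs are positives


# Illustrative usage with a linear prediction head over 128-d embeddings:
predictor = torch.nn.Linear(128, 128)
z_past, z_future = torch.randn(32, 128), torch.randn(32, 128)
loss = cpc_infonce_loss(z_past, z_future, predictor)
```

Under this formulation, the interleaved multitask variant described in the abstract would alternate batches of this temporal loss with a standard SimCLR loss over augmented views of a single radiograph, while the sequential variant would run the two stages one after the other.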