Annotation-Efficient Osteophyte Severity Estimation in Hip X-rays

Combining Binary Presence Labels with Limited OARSI Grade Supervision

Bachelor Thesis (2026)

Author(s)

D. Gogoana (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

G. van Tulder – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

J.H. Krijthe – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

I.M. Olkhovskaia – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

Osteophytes, Weak-supervision

To reference this document use

https://resolver.tudelft.nl/uuid:62f184d6-b2f8-4175-8135-795a87c272f1

More Info

Publication Year

2026

Language

English

Graduation Date

26-06-2026

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project, Detecting osteophytes in hip X-ray images with weakly supervised learning

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Detailed OARSI grading of osteophytes, an important radiographic indicator of hip osteoarthritis, is expensive because it requires expert annotation, whereas coarser binary presence labels are far easier to obtain. This study investigates how effectively these binary labels can be combined with a limited number of graded labels to estimate ordinal osteophyte severity in hip X-ray crops, and whether the choice of which samples to grade matters. We formulate the task as cumulative ordinal regression over four anatomical locations per hip, in which binary labels supervise the presence threshold and graded labels supervise the higher severity thresholds, while thresholds with no available grade are left unsupervised. A binary-only baseline detected osteophyte presence well and produced confidence scores that rose with true grade, but could not resolve the higher grades. A few graded labels enabled ordinal expected-severity estimates and reduced macro-averaged mean absolute error, with the largest gains at the smallest budgets and diminishing returns beyond. Comparing score-stratified sampling against random selection of the graded subset, the score-based strategy was competitive but not consistently better, indicating that most of the benefit comes from adding graded supervision rather than from how the samples are chosen. All results are reported on a held-out test set, averaged over three seeds. Combining many binary labels with relatively few graded labels is a promising way to reduce expert annotation burden while still producing useful ordinal severity estimates.
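The partially supervised cumulative formulation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the thesis implementation. It assumes grades run from 0 to 3 (so there are three thresholds: grade > 0, grade > 1, grade > 2), uses -1 as a hypothetical marker for "no grade available", and assumes a per-threshold binary cross-entropy loss. The function names (build_targets, masked_bce, expected_severity, macro_mae) are illustrative only.

```python
import numpy as np

NUM_THRESHOLDS = 3  # assumption: grades 0..3 give thresholds grade>0, grade>1, grade>2


def build_targets(presence, grade):
    """Build per-threshold targets and a supervision mask.

    presence: 0/1 binary labels, available for every sample.
    grade: integer grades, with -1 (hypothetical marker) where no grade exists.
    """
    presence = np.asarray(presence)
    grade = np.asarray(grade)
    n = presence.shape[0]
    targets = np.zeros((n, NUM_THRESHOLDS))
    mask = np.zeros((n, NUM_THRESHOLDS))
    # The presence threshold is supervised by the binary label for all samples.
    targets[:, 0] = presence
    mask[:, 0] = 1.0
    # Higher thresholds are supervised only where a grade is available.
    graded = grade >= 0
    for k in range(1, NUM_THRESHOLDS):
        targets[graded, k] = grade[graded] > k
        mask[graded, k] = 1.0
    return targets, mask


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def masked_bce(logits, targets, mask, eps=1e-7):
    """Binary cross-entropy averaged over supervised thresholds only."""
    probs = np.clip(sigmoid(logits), eps, 1.0 - eps)
    bce = -(targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs))
    return float((bce * mask).sum() / mask.sum())


def expected_severity(logits):
    """Sum of threshold probabilities P(grade > k), an estimate of E[grade]."""
    return sigmoid(logits).sum(axis=1)


def macro_mae(y_true, y_pred):
    """Mean absolute error computed per true grade, then averaged over grades."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred, dtype=float)
    per_grade = [np.abs(y_pred[y_true == g] - g).mean() for g in np.unique(y_true)]
    return float(np.mean(per_grade))


# Three samples: one graded (grade 2), two with binary labels only.
targets, mask = build_targets(presence=[1, 0, 1], grade=[2, -1, -1])
loss = masked_bce(np.zeros((3, NUM_THRESHOLDS)), targets, mask)
```

In this sketch the mask zeroes out the loss terms for thresholds without a grade, so binary-only samples contribute only to the presence threshold. Averaging the error per grade before averaging across grades keeps rare high grades from being swamped by the common low ones.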

Files

Annotation_Efficient_Osteophyt... (pdf)

(pdf | 0.432 Mb)

License info not available