Analyzing and Mitigating Bias for Vulnerable Road Users by Addressing Class Imbalance in Datasets

Journal Article (2025)
Authors

D. Katare (TU Delft - Information and Communication Technology)

David Solans Noguero (Telefónica Research)

Souneil Park (Telefónica Research)

Nicolas Kourtellis (Telefónica Research)

Marijn Janssen (TU Delft - Engineering, Systems and Services)

Aaron Yi Ding (TU Delft - Information and Communication Technology)

Research Group
Information and Communication Technology
To reference this document use:
https://doi.org/10.1109/OJITS.2025.3564558
Publication Year
2025
Language
English
Volume number
6
Pages (from-to)
590-604
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Vulnerable road users (VRUs), including pedestrians, cyclists, and motorcyclists, account for approximately 50% of road traffic fatalities globally, according to the World Health Organization. The accuracy and fairness of the perception applications used in autonomous driving are therefore critical to reducing these risks. For machine learning models performing object classification and detection tasks, the focus has been on improving accuracy and other model performance metrics; however, issues such as biases inherited by models, statistical imbalances, and disparities within datasets are often overlooked. Our research addresses these issues by exploring class imbalance among vulnerable road users through class distribution analysis, model performance evaluation, and bias impact assessment. Using popular CNN models and Vision Transformers (ViTs) with the nuScenes dataset, our evaluation reveals detection disparities for underrepresented classes. In contrast to related work, we focus on metric-specific and cost-sensitive learning for model optimization and bias mitigation, including data augmentation and resampling. With the proposed mitigation approaches, the IoU (%) and NDS (%) metrics improve from 71.3 to 75.6 and from 80.6 to 83.7 for the CNN model; similarly, for the ViT, they improve from 74.9 to 79.2 and from 83.8 to 87.1. This research contributes to developing reliable models while addressing inclusiveness for minority classes in datasets. Code can be accessed at: BiasDet.
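The paper's exact mitigation pipeline is not reproduced here, but the two ideas the abstract names for countering class imbalance, cost-sensitive learning and resampling, can be illustrated with a minimal sketch. The function names and the use of inverse-frequency weights and majority-matched oversampling are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def inverse_frequency_weights(labels, n_classes):
    """Per-class loss weights inversely proportional to class frequency,
    so rare classes (e.g., cyclists) contribute more to the training loss."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    return counts.sum() / (n_classes * np.maximum(counts, 1.0))

def oversample_minority(labels, seed=None):
    """Return sample indices in which every class appears as often as the
    most frequent one, by sampling minority classes with replacement."""
    rng = np.random.default_rng(seed)
    counts = np.bincount(labels)
    target = counts.max()
    chosen = []
    for c in range(len(counts)):
        cls_idx = np.flatnonzero(labels == c)
        if cls_idx.size == 0:
            continue
        chosen.append(rng.choice(cls_idx, size=target, replace=True))
    return np.concatenate(chosen)

# Toy label distribution: 90 cars, 8 pedestrians, 2 cyclists.
labels = np.array([0] * 90 + [1] * 8 + [2] * 2)
weights = inverse_frequency_weights(labels, 3)   # rarest class gets the largest weight
balanced_idx = oversample_minority(labels, seed=0)
```

The weights would typically be passed to a weighted cross-entropy loss, and the resampled indices to a data loader, so that both the loss and the batch composition stop favoring the majority class.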