Social Impact Regularization in IQ-Learn
Steering Social Intent in Heterogeneous Driving Demonstrations
P. Koev (TU Delft - Electrical Engineering, Mathematics and Computer Science)
L. Cavalcante Siebert – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)
A. Mone – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)
C.A. Raman – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Autonomous driving relies heavily on Reinforcement Learning (RL) to train agents in sequential decision-making settings. RL's success, however, is bottlenecked by the need to specify a reward function by hand, a notoriously difficult task when the reward must balance safety, efficiency, and social etiquette in highly interactive domains. Inverse Reinforcement Learning (IRL) sidesteps this challenge by inferring latent objectives directly from expert data. Standard IRL, however, assumes that all demonstrations stem from a single, homogeneous behavioural profile. Real traffic is heterogeneous: it mixes distinct driving styles ranging from calm and cooperative to aggressive and assertive. Applied to such mixed datasets, standard IRL struggles to fit a single reward function to the conflicting behaviours. The recovered reward typically collapses into an arbitrary average that misrepresents the individual driving profiles and ignores the social context of driving. To resolve this ambiguity, this thesis introduces the Social Impact Regularized IQ-Learn framework. The approach decomposes the driving reward into two components: an individual reward capturing the ego vehicle's own progress, and an ego-centric social impact signal measuring how the vehicle's actions directly affect its neighbours. The two components are combined into a social scoring function, which enters the IQ-Learn objective as an additive regularizer encoding a normative prior. This formulation exploits a separation of roles: the core IQ-Learn objective absorbs universal physical driving dynamics from the entire mixed dataset, while the regularizer steers the social interpretation of those dynamics towards a specific, designer-chosen behavioural target.
Evaluations spanning a tabular gridworld proof-of-concept, a multi-agent stochastic environment, and a continuous-observation intersection simulator confirm that the regularizer resolves behavioural ambiguity and can steer the recovered policy towards a targeted social alignment. By making the social orientation of the learned policy an explicit and inspectable parameter, the method gives designers and regulators a concrete, auditable mechanism for verifying that an autonomous vehicle's social behaviour matches its intended design.
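The abstract does not give the exact form of the regularizer, so the following is only a minimal tabular sketch of how an additive social prior could sit next to an IQ-Learn-style objective. Everything specific here is an assumption made for illustration: the function names, the weights `beta` and `lam`, the linear social score (individual reward plus `beta` times social impact), the squared-error penalty, and the use of the simplest (identity) form of the IQ-Learn expert term. It should not be read as the thesis's actual implementation.

```python
import numpy as np

def soft_value(Q):
    """Soft state value V(s) = log sum_a exp Q(s, a), computed stably."""
    m = Q.max(axis=1)
    return m + np.log(np.exp(Q - m[:, None]).sum(axis=1))

def implied_reward(Q, s, a, s_next, gamma):
    """Reward implied by Q through the inverse soft Bellman operator."""
    return Q[s, a] - gamma * soft_value(Q)[s_next]

def social_score(r_individual, social_impact, beta):
    """Hypothetical social scoring function: ego progress plus weighted impact.

    beta > 0 would favour cooperative behaviour, beta < 0 assertive behaviour
    (an assumed convention, not taken from the source).
    """
    return r_individual + beta * social_impact

def regularized_objective(Q, batch, s0, score, gamma=0.9, lam=0.5):
    """IQ-Learn-style objective (to be maximized) with an additive social prior.

    batch: (s, a, s_next) integer arrays of demonstration transitions.
    s0:    integer array of initial states.
    score: table of social scores with the same shape as Q.
    """
    s, a, s_next = batch
    r = implied_reward(Q, s, a, s_next, gamma)
    # Core term: fit the mixed demonstrations (identity concave function).
    iq_term = r.mean() - (1.0 - gamma) * soft_value(Q)[s0].mean()
    # Assumed regularizer: pull the implied reward towards the social score.
    penalty = ((r - score[s, a]) ** 2).mean()
    return iq_term - lam * penalty

# Toy problem: 3 states, 2 actions, two demonstration transitions.
Q = np.zeros((3, 2))
batch = (np.array([0, 1]), np.array([0, 1]), np.array([1, 2]))
s0 = np.array([0])
score = social_score(np.zeros((3, 2)), np.zeros((3, 2)), beta=1.0)
value = regularized_objective(Q, batch, s0, score)
```

In this sketch, changing `beta` changes only the target of the penalty term, while the demonstration-fitting term is untouched. That mirrors the separation described in the abstract: dynamics are learned from the whole mixed dataset, and the social orientation is set by a single inspectable parameter.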