Learning collision risk proactively from naturalistic driving data at scale
Yiru Jiao (TU Delft - Traffic Systems Engineering)
Simeon C. Calvert (TU Delft - Traffic Systems Engineering)
Sander van Cranenburgh (TU Delft - Transport and Logistics)
Hans van Lint (TU Delft - Traffic Systems Engineering)
Abstract
Accurately and proactively alerting drivers or automated systems to emerging collisions is crucial for road safety, particularly in highly interactive and complex urban environments. Existing methods require labour-intensive annotation of sparse risk, struggle to consider varying contextual factors or are tailored to limited scenarios. Here we present the generalized surrogate safety measure (GSSM), a data-driven approach that learns collision risk from naturalistic driving without the need for crash or risk labels. Trained on diverse datasets and evaluated on 2,591 real-world crashes and near-crashes, a basic GSSM using only instantaneous motion kinematics achieves an area under the precision–recall curve of 0.9 and secures a median time advance of 2.6 s to prevent potential collisions. Incorporating more interaction patterns and contextual factors provides further performance gains. Across interaction scenarios, such as rear end, merging and turning, GSSM consistently outperforms existing baselines in terms of accuracy and timeliness. These results establish GSSM as a scalable, context-aware and generalizable foundation for identifying risky interactions before they become unavoidable and support proactive safety in autonomous driving systems and traffic incident management.
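The two headline metrics in the abstract, area under the precision–recall curve and median time advance, can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names, the toy labels and scores, and the reading of "time advance" as the gap between the first alert and the critical moment of an event are all assumptions made here for illustration. The area is computed as step-wise average precision, one common estimator of the area under the precision–recall curve, and the sketch assumes no tied scores.

```python
from statistics import median


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    labels: 1 for a risky event (crash or near-crash), 0 otherwise (hypothetical).
    scores: predicted risk, higher means more dangerous (hypothetical).
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("need at least one positive label")
    # Rank events from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            true_positives += 1
            # Precision at this rank, weighted by the recall step 1/total_positives.
            area += (true_positives / rank) / total_positives
    return area


def median_time_advance(alert_times, critical_times):
    """Median lead time in seconds between first alert and the critical moment.

    Assumes one alert time and one critical time per event, in the same order.
    """
    return median(c - a for a, c in zip(alert_times, critical_times))


# Toy data, invented for illustration only (not results from the paper).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.95, 0.80, 0.70, 0.60, 0.30, 0.10]
ap = average_precision(labels, scores)

alert_times = [1.0, 2.5, 0.5]      # seconds at which a warning first fires
critical_times = [4.0, 5.0, 3.5]   # seconds at which the event becomes critical
lead = median_time_advance(alert_times, critical_times)

print(f"average precision: {ap:.3f}")
print(f"median time advance: {lead:.1f} s")
```

A higher average precision means risky events are ranked above safe ones more consistently, and a larger median time advance means warnings arrive earlier before the critical moment.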