Safety assessment of automated vehicles

how to determine whether we have collected enough field data?

Journal Article (2019)
Author(s)

Erwin De Gelder (TU Delft - Team Bart De Schutter, TNO)

Jan Pieter Paardekooper (Radboud Universiteit Nijmegen, TNO)

Olaf Op den Camp (TNO)

Bart De Schutter (TU Delft - Delft Center for Systems and Control, TU Delft - Team Bart De Schutter)

Research Group
Team Bart De Schutter
Copyright
© 2019 E. de Gelder, Jan Pieter Paardekooper, Olaf Op den Camp, B.H.K. De Schutter
DOI related publication
https://doi.org/10.1080/15389588.2019.1602727
Publication Year
2019
Language
English
Issue number
sup1
Volume number
20
Pages (from-to)
S162-S170
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Objective: The amount of field data collected in naturalistic driving studies is increasing quickly. The data are used for, among other things, developing automated driving technologies (such as crash avoidance systems), studying driver interaction with such technologies, and gaining insight into the variety of scenarios in real-world traffic. Because data collection is time consuming and requires high investments and resources, questions like “Do we have enough data?”, “How much more information can we gain when obtaining more data?”, and “How far are we from obtaining completeness?” are highly relevant. In fact, deducing safety claims from collected data (for example, through testing scenarios based on those data) requires knowledge about the degree of completeness of the data used. We propose a method for quantifying the completeness of the so-called activities in a data set. This enables us to partly answer the aforementioned questions.
Method: In this article, the (traffic) data are interpreted as a sequence of different so-called scenarios that can be grouped into a finite set of scenario classes. The building blocks of scenarios are the activities. For every activity, there exists a parameterization that encodes all information in the data of each recorded activity. For each type of activity, we estimate a probability density function (pdf) of the associated parameters. Our proposed method quantifies the degree of completeness of a data set using the estimated pdfs.
Results: To illustrate the proposed method, two different case studies are presented. First, a case study with an artificial data set, for which the underlying pdfs are known, is carried out to illustrate that the proposed method correctly quantifies the completeness of the activities. Next, a case study with real-world data is performed to quantify the degree of completeness of the acquired data, for which the true pdfs are unknown.
Conclusion: The presented case studies illustrate that the proposed method is able to quantify the degree of completeness of a small set of field data and can be used to deduce whether sufficient data have been collected for the purpose of the field study. Future work will focus on applying the proposed method to larger data sets. The proposed method will be used to evaluate the level of completeness of the data collection on Singaporean roads, aimed at defining relevant test cases for the autonomous vehicle road approval procedure that is being developed in Singapore.
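Illustrative sketch: The idea behind the artificial-data case study can be illustrated with a small Python example. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a single scalar activity parameter drawn from a known standard normal pdf, a Gaussian kernel density estimate with a normal-reference bandwidth, and the integrated squared error between the estimated and true pdf as a stand-in completeness measure. The abstract does not specify the estimator or the metric, and all function names below are hypothetical.

```python
import math
import random
import statistics


def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def kde(samples, grid):
    """Gaussian kernel density estimate evaluated on a grid.

    Assumption: normal-reference rule-of-thumb bandwidth,
    h = 1.06 * std * n^(-1/5).
    """
    n = len(samples)
    h = 1.06 * statistics.stdev(samples) * n ** (-0.2)
    return [sum(gaussian_pdf(x, s, h) for s in samples) / n for x in grid]


def integrated_squared_error(estimate, truth, dx):
    """Riemann-sum approximation of the integral of (estimate - truth)^2."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth)) * dx


def completeness_curve(sample_sizes, repeats=5, seed=0):
    """Average error of the estimated pdf for each data-set size.

    A smaller error means the data describe the (known) pdf more
    completely; a flattening curve suggests extra data add little.
    """
    rng = random.Random(seed)
    dx = 0.05
    grid = [-5.0 + dx * i for i in range(201)]
    truth = [gaussian_pdf(x) for x in grid]
    curve = {}
    for n in sample_sizes:
        errors = []
        for _ in range(repeats):
            samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
            errors.append(integrated_squared_error(kde(samples, grid), truth, dx))
        curve[n] = sum(errors) / repeats
    return curve


if __name__ == "__main__":
    for n, err in completeness_curve([50, 200, 800]).items():
        print(f"n = {n:4d}  mean integrated squared error = {err:.5f}")
```

With real-world data the true pdf is unknown, so the error cannot be computed this way and has to be estimated from the data themselves; the sketch covers only the known-pdf setting.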