T. Driessen | TU Delft Repository

Assessing Human Drivers

From Raw Data to Context-Aware Interpretations

Doctoral thesis (2025) - T. Driessen, J.C.F. de Winter, D. Dodou, Dick de Waard

Road traffic accidents remain a major public health concern worldwide. Technological advances in vehicle sensing, automation, and artificial intelligence present novel opportunities to assess and improve human driving. This dissertation explores these opportunities by developing and evaluating algorithms to assess the behavior of car and truck drivers.

Initial research establishes the perspectives of driving examiners and professional truck drivers on the acceptance of data-driven tools to assess driver behavior. The work then demonstrates that practical methods using readily available GPS and accelerometer data can successfully identify driving styles and predict negative outcomes like fines and damage incidents at a population level. However, these simple metrics prove insufficient for fair individual assessment due to the lack of situational context embedded in such data.

To address this limitation, the thesis explores modern AI-based approaches. It demonstrates how AI systems from automated driving can provide continuous behavioral references to evaluate human performance, and concludes by showing that vision-language models can establish a more holistic, "context-aware" risk assessment using images of typical traffic situations. ...

Putting ChatGPT vision (GPT-4V) to the test: risk perception in traffic images

Journal article (2024) - Tom Driessen, Dimitra Dodou, Pavlo Bazilinskyy, Joost De Winter

Vision-language models are of interest in various domains, including automated driving, where computer vision techniques can accurately detect road users, but where the vehicle sometimes fails to understand context. This study examined the effectiveness of GPT-4V in predicting the level of 'risk' in traffic images as assessed by humans. We used 210 static images taken from a moving vehicle, each previously rated by approximately 650 people. Based on psychometric construct theory and using insights from the self-consistency prompting method, we formulated three hypotheses: (i) repeating the prompt under effectively identical conditions increases validity, (ii) varying the prompt text and extracting a total score increases validity compared to using a single prompt, and (iii) in a multiple regression analysis, the incorporation of object detection features, alongside the GPT-4V-based risk rating, significantly contributes to improving the model's validity. Validity was quantified by the correlation coefficient with human risk scores, across the 210 images. The results confirmed the three hypotheses. The eventual validity coefficient was r = 0.83, indicating that population-level human risk can be predicted using AI with a high degree of accuracy. The findings suggest that GPT-4V must be prompted in a way equivalent to how humans fill out a multi-item questionnaire. ...
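
The aggregation idea behind hypotheses (i) and (ii) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the nested-list layout, and all numbers below are made-up assumptions for illustration. The sketch averages repeated ratings within each prompt variant, averages across variants to get a total score per image, and then computes the Pearson correlation with human scores as the validity coefficient.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def total_score(ratings_per_prompt):
    """Average repeated ratings per prompt variant, then average across variants.

    ratings_per_prompt: one inner list of repeated ratings per prompt variant
    (hypothetical layout, chosen for this example only).
    """
    return mean(mean(repeats) for repeats in ratings_per_prompt)


# Toy data (invented): 4 images, 2 prompt variants, 3 repetitions each.
model_ratings = [
    [[2, 3, 2], [3, 3, 2]],
    [[5, 6, 5], [6, 5, 6]],
    [[8, 7, 8], [9, 8, 8]],
    [[4, 4, 5], [3, 4, 4]],
]
human_scores = [2.1, 5.4, 8.3, 4.0]  # invented mean human risk ratings

model_scores = [total_score(r) for r in model_ratings]
validity = pearson_r(model_scores, human_scores)
print(f"validity coefficient r = {validity:.2f}")
```

Averaging over repetitions and prompt variants plays the same role as summing items in a multi-item questionnaire: random variation in single responses partly cancels out in the total score.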

How AI from Automated Driving Systems Can Contribute to the Assessment of Human Driving Behavior

Journal article (2024) - T. Driessen, O. Siebinga, T.A.B. de Boer, D. Dodou, Dick de Waard, J.C.F. de Winter

This paper proposes a novel approach to measuring human driving performance by using the AI capabilities of automated driving systems, illustrated through three example scenarios. Traditionally, the assessment of human driving has followed a bottom-up methodology, where raw data are compared to fixed thresholds, yielding indicators such as the number of hard braking events. However, acceleration threshold exceedances are often heavily influenced by the driving context. We propose a top-down context-aware approach to driving assessments, in which recordings of human-driven vehicles are analyzed by an automated driving system. By comparing the human driver’s speed to the AI’s recommended speed, we derive a level of disagreement that can be used to distinguish between hard braking caused by aggressive driving and emergency braking in response to a critical event. The proposed method may serve as an alternative to the metrics currently used by some insurance companies and may serve as a template for future AI-based driver assessment. ...
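
One possible reading of the disagreement idea is sketched below. The abstract does not give thresholds or a decision rule, so the deceleration threshold, the look-back window, the disagreement limit, the labels, and all function names are hypothetical assumptions. The sketch flags hard braking in the usual bottom-up way, then checks how far the human's speed exceeded the AI's recommended speed just before each event.

```python
from statistics import mean


def hard_braking_indices(speed, dt, decel_threshold=-3.0):
    """Indices where acceleration (m/s^2) falls at or below an assumed threshold."""
    return [
        i
        for i in range(1, len(speed))
        if (speed[i] - speed[i - 1]) / dt <= decel_threshold
    ]


def label_braking(human_speed, ai_speed, dt, window=3, disagreement_limit=2.0):
    """Label each hard-braking sample by the mean speed disagreement before it.

    Disagreement is human speed minus AI-recommended speed (m/s), averaged over
    up to `window` samples preceding the event. The limit is an assumption.
    """
    labels = {}
    for i in hard_braking_indices(human_speed, dt):
        start = max(0, i - window)
        disagreement = mean(
            h - a for h, a in zip(human_speed[start:i], ai_speed[start:i])
        )
        if disagreement > disagreement_limit:
            labels[i] = "possibly aggressive"
        else:
            labels[i] = "possibly context-driven"
    return labels


# Toy data (invented), sampled at 1 Hz, speeds in m/s.
human = [20, 20, 20, 15, 10]
ai_slower = [14, 14, 14, 12, 10]   # AI recommended slowing down well in advance
ai_similar = [20, 20, 20, 14, 9]   # AI agreed with the human until a sudden event

labels_fast = label_braking(human, ai_slower, dt=1.0)
labels_event = label_braking(human, ai_similar, dt=1.0)
print(labels_fast)
print(labels_event)
```

The same braking profile receives different labels depending on the AI reference, which is the point of a context-aware assessment: the raw threshold exceedance alone cannot separate the two cases.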

The use of ChatGPT for personality research

Administering questionnaires using generated personas

Journal article (2024) - Joost C.F. de Winter, T. Driessen, Dimitra Dodou

Personality research has traditionally relied on questionnaires, which bring with them inherent limitations, such as response style bias. With the emergence of large language models such as ChatGPT, the question arises as to what extent these models can be used in personality research. In this study, ChatGPT (GPT-4) generated 2000 text-based personas. Next, for each persona, ChatGPT completed a short form of the Big Five Inventory (BFI-10), the Brief Sensation Seeking Scale (BSSS), and a Short Dark Triad (SD3). The mean scores on the BFI-10 items were found to correlate strongly with means from previously published research, and principal component analysis revealed a clear five-component structure. Certain relationships between traits, such as a negative correlation between the age of the persona and the BSSS score, were clearly interpretable, while some other correlations diverged from the literature. An additional analysis using four new sets of 2000 personas each, including a set of ‘realistic’ personas and a set of cinematic personas, showed that the correlation matrix among personality constructs was affected by the persona set. It is concluded that evaluating questionnaires and research hypotheses prior to engaging with real individuals holds promise. ...

Exploring the challenges faced by Dutch truck drivers in the era of technological advancement

Journal article (2024) - J.C.F. de Winter, T. Driessen, D. Dodou, Aschwin Cannoo

Introduction: Despite their important role in the economy, truck drivers face several challenges, including adapting to advancing technology. The current study investigated the occupational experiences of Dutch truck drivers to detect common patterns. Methods: A questionnaire was distributed to professional drivers in order to collect data on public image, traffic safety, work pressure, transport crime, driver shortage, and sector improvements. Results: The findings based on 3,708 respondents revealed a general dissatisfaction with the image of the industry and reluctance to recommend the profession. A factor analysis of the questionnaire items identified two primary factors: ‘Work Pressure’, more common among national drivers, and ‘Safety & Security Concerns’, more common among international drivers. A ChatGPT-assisted analysis of textbox comments indicated that vehicle technology received mixed feedback, with praise for safety and fuel-efficiency improvements, but concerns about reliability and intrusiveness. Discussion: In conclusion, Dutch professional truck drivers indicate a need for industry improvements. While the work pressure for truck drivers in general may not be high relative to certain other occupational groups, truck drivers appear to face a deficit of support and respect. ...

Using mobile devices for driving test assessment

A study of acceleration and GPS data

Journal article (2024) - Tom Driessen, David Stefan, Daniël Heikoop, Dimitra Dodou, Joost de Winter

There is a need to improve the validity of the driving test as a measure of an individual’s ability to drive safely. This paper explores the use of algorithms to analyze acceleration and GPS data from a smartphone and a GoPro to distinguish between different driving styles, as performed by experienced examiners portraying stereotypical driving test candidates. Measures from nine driving tests were analyzed, including (harsh) accelerations, jerk, mean speed, and speeding. Results showed that the type of car, instructed driving style, and driving route impacted the dependent measures. It is concluded that GPS and accelerometer data can effectively distinguish between cautious, normal, and aggressive driving. However, it is important to consider additional sensors, such as cameras, to allow for more context-aware assessments of driving behavior. Furthermore, we demonstrate methods to quantify variations in road conditions and provide suggestions for presenting the data to driving examiners. ...
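
Measures of the kind listed above can be sketched as follows. This is an illustration under assumptions, not the paper's processing chain: the sampling interval, the harsh-acceleration threshold, the speed limit, the function names, and the data are all invented.

```python
def jerk(accel, dt):
    """Finite-difference jerk (m/s^3) from acceleration samples (m/s^2)."""
    return [(accel[i] - accel[i - 1]) / dt for i in range(1, len(accel))]


def count_harsh_events(accel, threshold=3.0):
    """Count separate episodes in which |acceleration| exceeds the threshold.

    Consecutive samples above the threshold count as one event.
    """
    count = 0
    above = False
    for a in accel:
        now_above = abs(a) > threshold
        if now_above and not above:
            count += 1
        above = now_above
    return count


def speeding_fraction(speed, limit):
    """Fraction of samples in which speed exceeds the limit."""
    return sum(s > limit for s in speed) / len(speed)


# Toy data (invented): acceleration in m/s^2 at 2 Hz, speed in m/s.
accel = [0.0, 1.0, 3.5, 4.0, 1.0, -3.2, -0.5]
speed = [12, 14, 15, 13, 11]

jerk_values = jerk(accel, dt=0.5)
harsh_events = count_harsh_events(accel)
fraction_speeding = speeding_fraction(speed, limit=13.9)  # about 50 km/h
mean_speed = sum(speed) / len(speed)
print(harsh_events, fraction_speeding, mean_speed)
```

Counting episodes rather than samples keeps one long braking manoeuvre from being scored as many separate events.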

Predicting Damage Incidents, Fines, and Fuel Consumption from Truck Driver Data

A Study from the Netherlands

Journal article (2023) - Tom Driessen, Dimitra Dodou, Dick de Waard, Joost de Winter

Trucks are disproportionately involved in fatal traffic accidents and contribute significantly to CO₂ emissions. Gathering data from trucks presents a unique opportunity for estimating driver-specific costs associated with truck operation. Although research has been published on the predictive validity of driver data, such as in the contexts of pay-how-you-drive insurance and naturalistic driving studies, the investigation into how telematics data relate to the negative consequences of truck driving remains limited. In the present study, driving data from 180 truck drivers, collected over a 2-year period, were examined to predict damage incidents, traffic fines, and fuel consumption. Correlation analysis revealed that the number of fines and damage incidents could be predicted based on the number of harsh braking events per hour of driving, whereas fuel consumption was predicted by engine torque exceedances. Our analysis also sheds light on the impact of covariates, including the engine capacity of the truck operated and time of day, among others. We conclude that the damage incidents and fines incurred by truck drivers can be predicted not only from their number of harsh decelerations but also through driving demands that extend beyond the driver’s immediate control. It is recommended that transportation companies adopt a systemic approach to mitigating truck-driving-related expenses. ...

Driving examiners’ views on data-driven assessment of test candidates

An interview study

Journal article (2021) - Tom Driessen, Angèle Picco, Dimitra Dodou, Dick de Waard, Joost de Winter

Vehicles are increasingly equipped with sensors that capture the state of the driver, the vehicle, and the environment. These developments are relevant to formal driver testing, but little is known about the extent to which driving examiners would support the use of sensor data in their job. This semi-structured interview study examined the opinions of 37 driving examiners about data-driven assessment of test candidates. The results showed that the examiners were supportive of using data to explain their pass/fail verdict to the candidate. According to the examiners, data in an easily accessible form such as graphs of eye movements, headway, speed, or braking behavior, and color-coded scores, supplemented with camera images, would allow them to eliminate doubt or help them convince disagreeing test-takers. The examiners were skeptical about higher levels of decision support, noting that forming an overall picture of the candidate's abilities requires integrating multiple context-dependent sources of information. The interviews yielded other possible applications of data collection and sharing, such as selecting optimal routes, improving standardization, and training and pre-selecting candidates before they are allowed to take the driving test. Finally, the interviews focused on an increasingly viable form of data collection: simulator-based driver testing. This yielded a divided picture, with about half of the examiners being positive and half negative about using simulators in driver testing. In conclusion, this study has provided important insights regarding the use of data as an explanation aid for examiners. Future research should consider the views of test candidates and experimentally evaluate different forms of data-driven support in the driving test. ...

Feeling uncertain: effects of a vibrotactile belt that communicates vehicle sensor uncertainty

Journal article (2020) - Matti Krüger, Tom Driessen, Christiane B. Wiebel-Herboth, Joost C.F. de Winter, Heiko Wersing

With the rise of partially automated cars, drivers are increasingly required to judge the degree of responsibility that can be delegated to vehicle assistance systems. This judgment can be supported by interfaces that intuitively convey the real-time reliability of system functions such as environment sensing. We designed a vibrotactile interface that communicates spatiotemporal information about surrounding vehicles and encodes a representation of spatial uncertainty in a novel way. We evaluated this interface in a driving simulator experiment with high and low levels of machine and human confidence, induced by simulated degradation of vehicle sensor precision and by a limited human visibility range, respectively. We examined whether drivers (i) could perceive and understand the vibrotactile encoding of spatial uncertainty, (ii) would subjectively benefit from the encoded information, (iii) would be disturbed in cases of information redundancy, and (iv) would gain objective safety benefits from the encoded information. To measure subjective understanding and benefit, we used a custom questionnaire, Van der Laan acceptance ratings, and NASA-TLX scores. To measure objective benefit, we computed the minimum time-to-contact as a measure of safety and gaze distributions as an indicator of attention guidance. The results indicate that participants were able to understand the encoded uncertainty and spatiotemporal information and used it purposefully when needed. The tactile interface provided meaningful support despite sensory restrictions. By encoding spatial uncertainties, it successfully extended the operating range of the assistance system. ...
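
The minimum time-to-contact measure can be illustrated with a short sketch. The abstract does not say how the measure was computed, so this uses the common constant-speed, same-lane definition (gap divided by closing speed) as an assumption; the function names and data are invented.

```python
import math


def time_to_contact(gap, ego_speed, lead_speed):
    """Time (s) until the gap (m) closes at constant speeds (m/s).

    Returns infinity when the ego vehicle is not closing in on the lead vehicle.
    """
    closing_speed = ego_speed - lead_speed
    return gap / closing_speed if closing_speed > 0 else math.inf


def minimum_ttc(gaps, ego_speeds, lead_speeds):
    """Smallest time-to-contact over a recorded drive."""
    return min(
        time_to_contact(g, e, l) for g, e, l in zip(gaps, ego_speeds, lead_speeds)
    )


# Toy data (invented): four samples from one simulated drive.
gaps = [40, 30, 18, 20]          # m
ego_speeds = [25, 25, 22, 18]    # m/s
lead_speeds = [20, 20, 16, 20]   # m/s

min_ttc = minimum_ttc(gaps, ego_speeds, lead_speeds)
print(f"minimum time-to-contact: {min_ttc:.1f} s")
```

A lower minimum indicates a closer call, so the measure summarizes the most critical moment of a drive in a single number.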