Safe, Efficient, and Socially Compliant Automated Driving in Mixed Traffic
Sensing, Anomaly Detection, Planning and Control
Abstract
Background
The steady development of automated vehicles (AVs) promises significant benefits in terms of traffic safety and efficiency. However, the transition to fully automated vehicles and their deployment on the road will be gradual, leading to a phase of mixed-traffic conditions in which AVs at various levels of automation coexist with human-driven vehicles (HDVs). This transition poses unprecedented hurdles, requiring a deeper understanding of the emerging challenges for AVs in sensing and perceiving road environments, as well as in the novel interactions between AVs and HDVs. Furthermore, the social compliance of AVs and the optimization of their deployment strategies also need to be considered.
Contents of this Thesis
This thesis addresses the multifaceted challenges associated with AVs’ development and deployment in mixed-traffic environments. Its main objective is to enhance the capabilities of AVs, widening their Operational Design Domain (ODD) and thereby facilitating the implementation of safe, efficient, and socially compliant automated driving in mixed traffic. Following the modular design of AV systems, the thesis tackles three key perspectives: sensing and perception, anomaly detection, and planning and control. Specifically:
Chapters 2-4 focus on enhancing sensing and perception capabilities through the development of hybrid spatial-temporal deep learning models and self-supervised pretraining methods. Lane detection is chosen as the focus of these chapters since it is vital for current vehicle localization and positioning, and it is also the foundation of various automated driving features. The main findings of these chapters are summarized as follows.
Chapter 2 presents a pioneering hybrid spatial-temporal sequence-to-one deep learning architecture tailored for vision-based lane detection tasks. By integrating the spatial convolutional neural network (SCNN) with spatial-temporal Recurrent Neural Network (RNN) modules, this architecture effectively captures correlations and dependencies among continuous image frames. Through extensive experimentation on various driving scenes, including challenging scenarios, the proposed model variants exhibit superior performance over existing state-of-the-art models. Notably, even the lighter model variants demonstrate remarkable accuracy, outperforming their counterparts while maintaining lower computational complexity.
Building upon the foundation laid in Chapter 2, Chapter 3 focuses on refining vision-based sensing and perception through the development of customized spatial-temporal attention mechanisms. These mechanisms, including temporal attention, spatial-temporal attention, and spatial-temporal attention with fully connected layers, are designed to make better use of spatial-temporal correlations across different regions of interest within consecutive image frames. Leveraging linear Long Short-Term Memory (LSTM) neural networks in conjunction with the proposed attention blocks, this chapter demonstrates the feasibility of lightweight and computationally efficient solutions for sequential deep neural networks (DNNs). Experiments, ablation studies, and comparative analysis across diverse datasets establish the effectiveness of the proposed attention mechanisms in enhancing lane detection performance.
In Chapter 4, the exploration of enhancing vision-based sensing and perception capabilities continues with the introduction of a self-supervised pretraining method employing masked sequential autoencoders (MSAE). This innovative approach leverages both labelled and unlabelled data to improve detection accuracy and expedite the training process of DNN models dedicated to lane detection tasks. Additionally, a customized Focal Loss based PolyLoss is introduced to further enhance the detection accuracy. Through comprehensive experimentation and comparative analysis, the efficacy of the proposed pretraining method and loss function is demonstrated, showcasing substantial improvements in lane detection performance across diverse driving scenarios. Specifically, the utilization of MSAE-based pretraining and the adoption of the customized PolyLoss result in superior performance metrics, underscoring the pivotal role of self-supervised learning techniques and tailored loss functions in fortifying the robustness and efficiency of vision-based sensing and perception systems in AVs.
These chapters address the challenges of vision-based lane detection, crucial for AV navigation and safety.
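As an illustration of the Focal Loss based PolyLoss mentioned for Chapter 4, the sketch below implements the generic Poly-1 variant of focal loss for a binary lane/background pixel. It is a minimal sketch, not the customized loss of the thesis: the function names and the default values of `gamma`, `alpha`, and `epsilon` are assumptions chosen for illustration.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one pixel.

    p: predicted probability that the pixel belongs to a lane.
    y: ground-truth label (1 = lane, 0 = background).
    gamma, alpha: illustrative defaults, not values from the thesis.
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, 1e-7), 1.0)              # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def poly1_focal_loss(p, y, gamma=2.0, alpha=0.25, epsilon=1.0):
    """Poly-1 focal loss: focal loss plus epsilon * (1 - p_t)^(gamma + 1)."""
    p_t = p if y == 1 else 1.0 - p
    return focal_loss(p, y, gamma, alpha) + epsilon * (1.0 - p_t) ** (gamma + 1.0)


def mean_poly1_focal_loss(probs, labels, **kwargs):
    """Average the per-pixel loss over a flattened prediction map."""
    losses = [poly1_focal_loss(p, y, **kwargs) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Three pixels: a confident hit, an uncertain lane pixel, a confident miss.
    print(mean_poly1_focal_loss([0.9, 0.5, 0.1], [1, 1, 1]))
```

Setting `epsilon=0` recovers plain focal loss. A positive `epsilon` adds weight to the leading polynomial term, which penalizes poorly classified pixels more strongly; this matters for lane detection because lane pixels are a small minority of each image.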
Chapters 5-6 delve into anomaly detection, investigating techniques for identifying abnormal lane rendering in digital map applications and detecting anomalies in driving behaviour.
Chapter 5 introduces an innovative approach to anomaly detection in lane rendering images of digital map applications, utilizing Transformer-based models with self-supervised pretraining and customized fine-tuning. By transforming anomaly detection into a classification problem, the chapter proposes a four-phase pipeline that includes data pre-processing, self-supervised pre-training with masked image modelling (MiM), customized fine-tuning using cross-entropy-based loss, and post-processing. Experimental results demonstrate the pipeline’s effectiveness, with significant improvements in detection accuracy and reduced training time achieved through self-supervised pre-training. Ablation studies regarding tackling the problem with different numbers of classes further validate the pipeline’s performance enhancements, particularly in addressing data imbalance. This approach not only enhances anomaly detection accuracy but also contributes to reducing labour costs associated with manual labelling and anomaly detection efforts, offering significant societal benefits.
Additionally, Chapter 6 explores the critical task of detecting abnormal driving behaviour, addressing the need for more feasible and efficient approaches by leveraging semi-supervised ML methods. Utilizing large-scale real-world driving data, the study develops a semi-supervised ML model based on Hierarchical Extreme Learning Machines (HELM). This approach utilizes partially labelled data and introduces Surrogate Safety Measures (SSMs), specifically the event-based safety indicators of Two-Dimensional Time-To-Collision (2D-TTC), as the pivotal input features to enhance performance. Results demonstrate the effectiveness of the proposed semi-supervised ML model, which outperforms the baseline methods. The integration of SSMs significantly improves detection accuracy, highlighting their importance as input features. By leveraging unlabelled data for training and only a small sample of labelled data for fine-tuning, the proposed semi-supervised approach achieves competitive performance while reducing dependency on fully labelled datasets, making it suitable for real-world applications.
To sum up, the exploration of semi-supervised and self-supervised ML methods presents promising avenues in anomaly detection. The pioneering research presented in this thesis represents a significant stride towards leveraging data-driven ML-based anomaly detection methodologies to enhance the safety of driving.
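To make the idea of a two-dimensional time-to-collision feature from Chapter 6 concrete, the sketch below computes a simplified TTC for two vehicles moving in a plane at constant velocity. It is only a sketch under strong assumptions: each vehicle is approximated by a circle, whereas the thesis does not state its geometry here, and the function names, radii, and the 3-second threshold are hypothetical.

```python
import math


def ttc_2d(pos_a, vel_a, pos_b, vel_b, radius_a=1.0, radius_b=1.0):
    """Time until two circular vehicles first touch, assuming constant velocities.

    Positions are (x, y) in metres and velocities are (vx, vy) in m/s.
    Returns math.inf if the vehicles never touch on their current courses.
    """
    dx, dy = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]      # relative position
    dvx, dvy = vel_b[0] - vel_a[0], vel_b[1] - vel_a[1]    # relative velocity
    contact = radius_a + radius_b

    # Solve |d + dv * t| = contact, a quadratic a*t^2 + b*t + c = 0 in t.
    a = dvx * dvx + dvy * dvy
    b = 2.0 * (dx * dvx + dy * dvy)
    c = dx * dx + dy * dy - contact * contact

    if c <= 0.0:
        return 0.0            # already overlapping
    if a < 1e-12:
        return math.inf       # no relative motion
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return math.inf       # paths pass without contact
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t >= 0.0 else math.inf   # negative root: moving apart


def is_critical(ttc, threshold_s=3.0):
    """Flag a time step as a safety-critical event (threshold is hypothetical)."""
    return ttc < threshold_s


if __name__ == "__main__":
    # Follower at 10 m/s approaching a stopped vehicle 50 m ahead.
    print(ttc_2d((0.0, 0.0), (10.0, 0.0), (50.0, 0.0), (0.0, 0.0)))
    # Two vehicles on perpendicular paths that cross at (30, 0).
    print(ttc_2d((0.0, 0.0), (10.0, 0.0), (30.0, -30.0), (0.0, 10.0)))
```

Unlike the classic one-dimensional TTC, this formulation also covers crossing and merging conflicts, which is why a 2D measure is a natural input feature for lateral as well as longitudinal interactions.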
Chapters 7-9 shift the focus to planning and control strategies for AVs, presenting a comprehensive examination of decision-making frameworks and control algorithms. These chapters introduce a conceptual framework aimed at fostering socially compliant driving behaviour and propose a range of model-based and learning-based approaches.
Chapter 7 lays the groundwork by introducing a conceptual framework that emphasizes socially compliant automated driving. This framework encompasses various social components such as cultural nuances, norms, and driving styles. A key innovation is the introduction of bidirectional behavioural adaptation, highlighting the dynamic interactions between AVs and human drivers. Furthermore, the framework advocates for the incorporation of a spatial-temporal memory module to enable continuous refinement of driving strategies, thereby promoting adaptability and safety in diverse traffic scenarios. Validation through an online expert survey lends credence to the framework’s efficacy. This conceptual framework lays a solid foundation for learning-based and model-based approaches for implementing planning and control algorithms for automated driving.
In the learning-based approach explored in Chapter 8, Deep Reinforcement Learning (DRL) takes centre stage, with a focus on integrating safety, efficiency, comfort, and energy consumption considerations into the learning framework. Multiple DRL algorithms are evaluated across diverse driving manoeuvres, particularly roundabout driving, highlighting the importance of real-world requirements in reward function design and simulation-based training. Among the compared DRL algorithms, Trust Region Policy Optimization (TRPO) emerges as leading in safety and efficiency, while Proximal Policy Optimization (PPO) excels in comfort during roundabout driving. Moreover, extending the training environment to encompass various driving scenarios shows that DRL can be used to train a single, unified driving model for real traffic environments, signalling promising avenues for future research.
Regarding the model-based approach, Chapter 9 introduces the DRF-SVO-MPCC algorithm, aimed at making AVs more understandable and predictable to human drivers, particularly during interactions with HDVs when driving through roundabouts, a challenging manoeuvre that involves large curvature and requires both longitudinal and lateral control. This algorithm integrates the perceived Driving Risk Field (DRF), Social Value Orientation (SVO), and Model Predictive Contouring Control (MPCC), enabling AVs to navigate social scenarios with sensitivity to the welfare of surrounding HDVs. Simulation experiments, conducted on various roundabout scenarios, underscore the algorithm’s superiority in trajectory tracking and adaptability to different driving styles, ensuring safety and social compliance. The findings illuminate the potential of the DRF-SVO-MPCC algorithm in fostering harmonious interactions between AVs and HDVs, setting a precedent for socially aware automated driving systems.
Overall, this thesis represents a solid endeavour to advance the planning and control capabilities of AVs in mixed-traffic environments. Through the development of novel conceptual frameworks and innovative model-based and learning-based algorithmic solutions, it lays the groundwork for the realization of safe, efficient, socially compliant, and adaptable automated driving, contributing to safer and more harmonious transportation systems.
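The Social Value Orientation (SVO) concept used in Chapter 9 is commonly expressed as an angle that trades off the ego vehicle's reward against that of another road user. The sketch below shows this standard weighting only. How the thesis combines it with the DRF and MPCC is not described here, so the function names, the candidate manoeuvres, and their reward values are all hypothetical.

```python
import math


def svo_utility(reward_ego, reward_other, svo_angle):
    """Standard SVO weighting: cos(angle) * own reward + sin(angle) * other's reward.

    svo_angle = 0 is egoistic, pi/4 is prosocial, pi/2 is altruistic.
    """
    return math.cos(svo_angle) * reward_ego + math.sin(svo_angle) * reward_other


def choose_action(candidates, svo_angle):
    """Pick the candidate manoeuvre with the highest SVO-weighted utility.

    candidates maps a manoeuvre name to (reward_ego, reward_other).
    """
    return max(candidates, key=lambda name: svo_utility(*candidates[name], svo_angle))


if __name__ == "__main__":
    # Hypothetical rewards for an AV about to enter a roundabout next to an HDV.
    candidates = {
        "enter_now": (1.0, -1.0),  # fast for the AV, forces the HDV to brake
        "yield": (0.6, 0.5),       # slightly slower for the AV, smooth for the HDV
    }
    print(choose_action(candidates, 0.0))          # egoistic setting
    print(choose_action(candidates, math.pi / 4))  # prosocial setting
```

With these illustrative numbers, the egoistic setting selects `enter_now` and the prosocial setting selects `yield`, which shows how a single angle can shift the planner towards behaviour that accounts for surrounding drivers.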
Conclusion and perspectives
In summary, this thesis contributes to advancing the knowledge of how to improve automated driving systems in the realms of sensing and perception, anomaly detection, as well as planning and control. By integrating theoretical frameworks, methodological innovations, and data-driven empirical evaluations, notable progress has been achieved in fostering the development of safe, efficient, and socially compliant automated driving within mixed-traffic environments.
Despite the considerable progress made, several directions for future research have been identified. These include the imperative for more expansive high-quality datasets, exploration of domain adaptation techniques for both sensing and anomaly detection tasks, as well as the seamless integration of model-based and learning-based methodologies for planning and control. Additionally, transitioning towards a unified driving model and effectively addressing the complexities of multi-agent interactions in intricate urban settings remain pivotal areas for further exploration. Furthermore, interdisciplinary collaboration will be instrumental in harnessing the full potential of automated vehicles to revolutionize transportation systems.