Title
Reinforcement Learning from Simulation to Real World Autonomous Driving using Digital Twin
Author
Voogd, Kevin (TU Delft Mechanical, Maritime and Materials Engineering)
Contributor
Alonso Mora, J. (mentor) 
Tong, Son (mentor)
Allamaa, Jean Pierre (mentor)
Shyrokau, B. (graduation committee) 
Mazo, M. (graduation committee) 
Degree granting institution
Delft University of Technology
Programme
Mechanical Engineering | Vehicle Engineering | Cognitive Robotics
Project
FOCETA
Date
2022-11-22
Abstract
Autonomous driving is attracting growing attention because of the potential advantages it offers in safety, leisure, energy efficiency, reduced emissions, and less traffic congestion. Current research focuses on artificial-intelligence methods for complex planning and decision-making tasks, object detection, and simultaneous localization and mapping.
However, training and testing these methods in the real world is unsafe and can be harmful to other road users. For that reason, such methods are first developed in a simulated, controlled environment and the control strategy is then transferred to the target domain. The issue with this approach is that performance drops when the domain changes, a problem known as the sim2real gap. In addition, certain methods cannot be deployed in real-time applications because their computational complexity is too high to produce outputs within a constrained time frame.
To date, research on reducing the sim2real gap for autonomous driving has focused on domain adaptation or domain randomization techniques; however, none has combined them with a high-fidelity vehicle dynamics model. This work aims to validate a safe transfer learning approach that reduces the reality gap in a zero-shot manner by combining the advantages of simulated and real-world data, a digital twin of the vehicle, and a traffic scenario simulator.
To test this framework, reinforcement learning agents are trained to track different paths. Further evaluation includes the influence of using a high-fidelity model and real-world data, as well as the effect of the reward function on the control strategy of the learning agents. The training process of this transfer learning framework starts with virtually generated scenarios, which are simpler than their real-life counterparts and noise-free. The motion of the learning agent is simulated with the digital twin, whose parameters are randomized in every episode; in addition, noise is added to the control action. These randomizations help the controller operate under uncertainty and avoid overfitting to model inaccuracies. Once performance saturates, real-world logged data is included so that the learning agent adapts to the target distribution, i.e., its noise levels and driving style. After performance stops improving, the results are tested in Model-in-the-Loop and Vehicle-in-the-Loop with the SimRod, an all-electric drive-by-wire vehicle.
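The per-episode randomization described above can be sketched as follows. This is an illustrative outline only, not the thesis implementation: the parameter names, ranges, and noise level are hypothetical stand-ins for the digital-twin parameters that the abstract says are resampled each episode, with Gaussian noise added to the control action.

```python
# Hypothetical sketch of per-episode domain randomization: resample
# digital-twin parameters at the start of each episode and perturb the
# policy's control action with noise. All names and ranges are assumed.
import random


def randomize_twin_params(rng):
    """Resample vehicle-model parameters within assumed bounds."""
    return {
        "mass": rng.uniform(900.0, 1100.0),        # kg (illustrative range)
        "tire_stiffness": rng.uniform(0.8, 1.2),   # scale factor
        "steering_delay": rng.uniform(0.0, 0.05),  # seconds
    }


def noisy_action(action, rng, sigma=0.02):
    """Add Gaussian noise to each component of the control action."""
    return [a + rng.gauss(0.0, sigma) for a in action]


def run_episode(policy, simulate_step, rng, horizon=100):
    """Run one training episode with freshly randomized twin parameters."""
    params = randomize_twin_params(rng)  # new parameters every episode
    state, total_reward = [0.0, 0.0], 0.0
    for _ in range(horizon):
        action = noisy_action(policy(state), rng)
        state, reward = simulate_step(state, action, params)
        total_reward += reward
    return total_reward
```

In a full pipeline, `run_episode` would be called inside the RL training loop (e.g., PPO or SAC rollout collection), so that no two episodes share identical vehicle dynamics.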
The results of this study indicate that this zero-shot transfer learning framework yields substantially better tracking accuracy in the physical world (about 30% on average) compared with controllers trained only on lower-fidelity models or synthetically generated data. The study stresses that a combination of the three sim2real methods, together with both synthetically generated and real-world data, is necessary to reduce the reality gap. Additional results cover the influence of the reward function design and the use of different reinforcement learning algorithms (proximal policy optimization and soft actor-critic).
Subject
Learning and adaptation in autonomous vehicles
Trajectory Tracking and Path Following
Sim2Real
Reinforcement Learning Control
To reference this document use:
http://resolver.tudelft.nl/uuid:31573fc6-8138-4f64-9cfa-0edec7de1510
Embargo date
2025-11-22
Part of collection
Student theses
Document type
master thesis
Rights
© 2022 Kevin Voogd