A Data Augmentation Pipeline: Leveraging Controllable Diffusion Models and Automotive Simulation Software

Master Thesis (2024)

Author(s)

J.S. van Leuven (TU Delft - Mechanical Engineering)

Contributor(s)

Son Tong – Mentor

J. Kober – Mentor (TU Delft - Learning & Autonomous Control)

Holger Caesar – Mentor (TU Delft - Intelligent Vehicles)

Faculty

Mechanical Engineering

Keywords

Diffusion models, Domain gap, Simulation in the loop

To reference this document use:

https://resolver.tudelft.nl/uuid:2bfdfa15-d62e-4ef0-88d2-3c3f3f3d0828

Publication Year

2024

Language

English

Graduation Date

26-07-2024

Awarding Institution

Delft University of Technology

Programme

Mechanical Engineering | Vehicle Engineering | Cognitive Robotics

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Training models for autonomous vehicles (AVs) requires large volumes of high-quality data, because model performance correlates strongly with dataset size. However, acquiring such datasets is labor-intensive and expensive, requiring significant resources for collection and labeling. Augmenting the available data or generating synthetic data is a cost-effective and efficient way to get more out of it. Traditional methods that operate within the RGB domain frequently overlook crucial information, such as object frequency, scene composition, and agent trajectories. To address these limitations, a pipeline employing controllable diffusion models and vehicle simulation software is proposed. Collected data is loaded into a physics-based simulator, which allows for augmentation beyond the pixel space into the structural space. The augmented simulated data is subsequently transformed back into the photorealistic domain using generative artificial intelligence. This process generates high-fidelity synthetic data, enabling models to train on an expanded and more varied dataset and improving robustness through the increased variation. The proposed method is evaluated in both the image and video domains to assess its effectiveness.

Files

Thesis_v20.pdf

(pdf | 17.8 MB)

License info not available