An Empirical Study on Auxiliary Task Joint-Training for Diffusion Policy

Master Thesis (2026)

Author(s)

Q. Luo (TU Delft - Mechanical Engineering)

Contributor(s)

C. Della Santina – Mentor (TU Delft - Mechanical Engineering)

Z. Li – Mentor (TU Delft - Mechanical Engineering)

J. Kober – Graduation committee member (TU Delft - Mechanical Engineering)

Faculty

Mechanical Engineering

Keywords

Joint training, Auxiliary task, Diffusion policy

To reference this document use

https://resolver.tudelft.nl/uuid:4cd94466-cdd0-4011-9b58-3feb38bfedf3

More Info

Publication Year

2026

Language

English

Graduation Date

30-06-2026

Awarding Institution

Delft University of Technology

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Diffusion Policy has emerged as a powerful framework for robotic manipulation because it can model complex action distributions, but its deployment is heavily constrained by the high cost of collecting demonstrations. This study presents a systematic empirical investigation into whether joint training with visual auxiliary tasks can improve the sample efficiency of diffusion policies in single-task spatial generalization settings (i.e., under variations in object orientation and initial location). Restricting observation inputs to raw 2D images and low-dimensional robot proprioception, we consider four candidate auxiliary tasks: image reconstruction, active object mask extraction, keypoint prediction, and optical flow estimation. We evaluate them in a joint-training framework on two simulated manipulation tasks and one real-world robotic task, using varying amounts of demonstration data. Our findings show that joint training with auxiliary tasks does improve sample efficiency, particularly in intermediate data regimes. In certain cases, however, optimization conflicts and gradient interference between the auxiliary and primary tasks diminish these benefits, especially in the data-starved and data-rich regimes of the simulated settings.

Files

An_Empirical_Study_on_Auxiliar... (pdf)

(pdf | 9.04 Mb)

License info not available