Towards Federated Diffusion for Robot Navigation

Distributed Training of Generative Control Policies

Master Thesis (2026)
Author(s)

T.C.W.M. Cramer (TU Delft - Mechanical Engineering)

Contributor(s)

L. Ferranti – Mentor (TU Delft - Mechanical Engineering)

R. Babuska – Graduation committee member (TU Delft - Mechanical Engineering)

Faculty
Mechanical Engineering
Publication Year
2026
Language
English
Graduation Date
23-03-2026
Awarding Institution
Delft University of Technology
Programme
Mechanical Engineering, Vehicle Engineering, Cognitive Robotics
Downloads counter
66
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Diffusion Policies enable imitation learning by generating action sequences through iterative denoising. In robotics, policies are commonly trained centrally by pooling demonstrations into a single dataset, but demonstrations are often distributed across robots or sites and cannot easily be shared because of privacy, ownership, bandwidth, or operational constraints. This thesis studies whether diffusion-based navigation policies can be trained with federated learning while maintaining closed-loop performance under client heterogeneity and limited local data. We introduce FedDiff, a pipeline combining Diffusion Policy training with Federated Averaging (FedAvg), and evaluate it on a controlled 2D navigation benchmark with low-dimensional beam observations.

We compare three training regimes: (i) individual training (no sharing), where each client trains only on its own data; (ii) centralized training on pooled data; and (iii) federated training via FedAvg. Experiments cover three scenario families: near-IID (clients have similar data distributions) with sufficient data, near-IID with scarce data, and non-IID compositional generalization across distinct navigation primitives. Across two sensing configurations (4-beam and 8-beam), we train 36 models and report success rates on held-out start-goal configurations and, where applicable, on environments not used during training. Federated training is most beneficial under data scarcity and, in the 8-beam near-IID setting, can match or slightly exceed centralized training on average, while individual (single-client) training remains strongest on its own training room in the 4-beam data-sufficient case. In the non-IID setting, federated and centralized models perform strongly on primitive navigation skills but remain limited on composed environments that require combining those primitives. We also discuss training dynamics and practical sensitivities relevant to real-world deployment.
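As an illustration of the aggregation step behind regime (iii), the following is a minimal sketch of standard Federated Averaging, in which the server replaces the global parameters with the average of the client parameters, weighted by each client's number of local samples. It is not taken from the thesis: the function name `fedavg`, the `Params` type, and the toy parameter values are hypothetical, and a real diffusion policy would hold tensors rather than short lists of floats.

```python
from typing import Dict, List, Sequence

# Hypothetical parameter container: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def fedavg(client_params: Sequence[Params], client_sizes: Sequence[int]) -> Params:
    """Return the sample-count-weighted average of the client parameters."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client, and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    # Each client's weight is its share of the total training data.
    weights = [n / total for n in client_sizes]

    averaged: Params = {}
    for name, reference in client_params[0].items():
        averaged[name] = [
            sum(w * params[name][i] for w, params in zip(weights, client_params))
            for i in range(len(reference))
        ]
    return averaged


# Toy example (assumed values): two clients holding 100 and 300 samples.
clients = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
global_params = fedavg(clients, [100, 300])
print(global_params)
```

In a full federated round, each client would first run local training from the current global parameters, and only then would the server apply this averaging step; the sketch covers the averaging alone.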
