Towards Federated Diffusion for Robot Navigation

Distributed Training of Generative Control Policies

Master Thesis (2026)

Author(s)

T.C.W.M. Cramer (TU Delft - Mechanical Engineering)

Contributor(s)

L. Ferranti – Mentor (TU Delft - Mechanical Engineering)

R. Babuska – Graduation committee member (TU Delft - Mechanical Engineering)

Faculty

Mechanical Engineering

Federated Learning Diffusion Policy Distributed Robot Navigation

To reference this document use

https://resolver.tudelft.nl/uuid:066a23af-54b8-4102-94d6-45f729006631

Publication Year

2026

Language

English

Graduation Date

23-03-2026

Awarding Institution

Delft University of Technology

Programme

Mechanical Engineering, Vehicle Engineering, Cognitive Robotics

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Diffusion Policies enable imitation learning by generating action sequences through iterative denoising. In robotics, policies are commonly trained centrally by pooling demonstrations into a single dataset, but demonstrations are often distributed across robots or sites and cannot easily be shared because of privacy, ownership, bandwidth, or operational constraints. This thesis studies whether diffusion-based navigation policies can be trained with federated learning while maintaining closed-loop performance under client heterogeneity and limited local data. We introduce FedDiff, a pipeline that combines Diffusion Policy training with Federated Averaging (FedAvg), and evaluate it on a controlled 2D navigation benchmark with low-dimensional beam observations.

We compare three training regimes: (i) individual training (no sharing), where each client trains only on its own data; (ii) centralized training on pooled data; and (iii) federated training via FedAvg. Experiments cover three scenario families: near-IID (clients have similar data distributions) with sufficient data, near-IID with scarce data, and non-IID compositional generalization across distinct navigation primitives. Across two sensing configurations (4-beam and 8-beam), we train 36 models and report success rates on held-out start-goal configurations and, where applicable, on environments not used during training. Federated training is most beneficial under data scarcity and, in the 8-beam near-IID setting, can match or slightly exceed centralized training on average, while individual (single-client) training remains strongest on its own training room in the 4-beam data-sufficient case. In the non-IID setting, federated and centralized models perform strongly on primitive navigation skills but remain limited on composed environments that require combining those primitives. We also discuss training dynamics and practical sensitivities relevant to real-world deployment.
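The abstract does not describe how FedDiff aggregates client models, so the following is only a minimal sketch of the FedAvg aggregation step as it is commonly defined: the server averages client parameters, weighting each client by its number of local samples. The function name `fedavg`, the `Params` type, and the toy parameter names `w` and `b` are hypothetical, and plain Python lists stand in for the weights of a denoising network. It is not the thesis's implementation.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for a model's weights: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def fedavg(client_params: Sequence[Params], client_sizes: Sequence[int]) -> Params:
    """Average client parameters, weighting each client by its local sample count."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    averaged: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        # Weighted mean of each parameter entry across all clients.
        averaged[name] = [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Toy example (assumed values): two clients with 100 and 300 local demonstrations.
clients = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 6.0], "b": [4.0]},
]
sizes = [100, 300]
global_params = fedavg(clients, sizes)
print(global_params)
```

In a full federated round, each client would first run several local training steps on its own demonstrations before sending its parameters to the server; only the server-side averaging is shown here.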

Files

Master_Thesis-Thomas_Cramer.pd... (pdf)

(pdf | 9.99 Mb)

License info not available