Decentralized Real-Time Planning for Multi-UAV Cooperative Manipulation via Imitation Learning

Conference Paper (2026)

Author(s)

S. Agarwal (Student TU Delft)

Javier Alonso-Mora (TU Delft - Mechanical Engineering)

Sihao Sun (TU Delft - Mechanical Engineering)

Research Group

Learning & Autonomous Control

DOI related publication

https://doi.org/10.1109/MRS66243.2025.11357262 Final published version

To reference this document use

https://resolver.tudelft.nl/uuid:fa0cbfeb-4eae-44fe-94a3-2fadd5db5742

Publication Year

2026

Language

English

Publisher

IEEE

ISBN (electronic)

979-8-3315-9359-9

Event

2025 IEEE International Symposium on Multi-Robot and Multi-Agent Systems, MRS 2025 (2025-12-04 - 2025-12-05), Singapore, Singapore

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Existing approaches for transporting and manipulating cable-suspended loads along reference trajectories using multiple UAVs typically rely on either centralized control architectures or reliable inter-agent communication. In this work, we propose a novel machine learning-based method for decentralized kinodynamic planning that operates effectively under partial observability and without inter-agent communication. Our method uses imitation learning to train a decentralized student policy for each UAV by imitating a centralized kinodynamic motion planner that has access to privileged global observations. The student policy generates smooth trajectories using physics-informed neural networks that respect the derivative relationships in motion. During training, the student policies use the full trajectory generated by the teacher policy, which improves sample efficiency. Moreover, each student policy can be trained in under two hours on a standard laptop. We validate our method in both simulation and real-world experiments on tracking an agile reference trajectory, demonstrating performance comparable to that of centralized approaches.
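The abstract does not spell out how the student's trajectories "respect the derivative relationships in motion" or how the full teacher trajectory enters the training loss. The following is a minimal illustrative sketch of one way such a constraint could be expressed, not the authors' architecture: positions and velocities are obtained by integrating a predicted acceleration sequence, so they are consistent by construction, and the imitation loss is averaged over every step of the teacher's trajectory. The function names `integrate_trajectory` and `imitation_loss`, the semi-implicit Euler scheme, the time step, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def integrate_trajectory(p0, v0, acc, dt):
    """Roll out velocities and positions from an acceleration sequence.

    Hypothetical helper using semi-implicit Euler integration:
        v[k] = v[k-1] + a[k] * dt
        p[k] = p[k-1] + v[k] * dt
    Position, velocity, and acceleration are therefore consistent
    by construction instead of being predicted independently.
    """
    acc = np.asarray(acc, dtype=float)
    vel = v0 + np.cumsum(acc, axis=0) * dt
    pos = p0 + np.cumsum(vel, axis=0) * dt
    return pos, vel


def imitation_loss(student_pos, teacher_pos):
    """Mean squared position error over the whole trajectory horizon."""
    diff = np.asarray(student_pos) - np.asarray(teacher_pos)
    return float(np.mean(np.sum(diff ** 2, axis=-1)))


# Toy data (assumed): a 10-step, 3-D trajectory with a 0.1 s time step.
dt = 0.1
p0 = np.zeros(3)
v0 = np.array([1.0, 0.0, 0.0])

# Stand-in for the centralized teacher's planned accelerations.
teacher_acc = np.tile([0.0, 0.0, 0.5], (10, 1))
teacher_pos, teacher_vel = integrate_trajectory(p0, v0, teacher_acc, dt)

# Stand-in for an imperfect student prediction.
student_acc = teacher_acc + 0.1
student_pos, student_vel = integrate_trajectory(p0, v0, student_acc, dt)

# Supervising every step of the teacher trajectory, not only the first
# action, yields many training targets per teacher rollout.
loss = imitation_loss(student_pos, teacher_pos)
```

In an actual training setup the acceleration sequence would come from a neural network conditioned on each UAV's local observations, and the loss would be minimized with a gradient-based optimizer; both are omitted here to keep the sketch self-contained.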

Files

Decentralized_Real-Time_Planni... (pdf)

(pdf | 1.24 Mb)

Taverne

File under embargo until 28-07-2026