SADCHER: Scheduling using Attention-based Dynamic Coalitions of Heterogeneous Robots in Real-Time

Master Thesis (2025)
Author(s)

J.D. Bichler (TU Delft - Mechanical Engineering)

Contributor(s)

J. Alonso-Mora – Mentor (TU Delft - Learning & Autonomous Control)

A. Matoses Gimenez – Mentor (TU Delft - Learning & Autonomous Control)

B. Atasoy – Graduation committee member (TU Delft - Transport Engineering and Logistics)

Faculty
Mechanical Engineering
Publication Year
2025
Language
English
Graduation Date
09-07-2025
Awarding Institution
Delft University of Technology
Programme
Mechanical Engineering | Vehicle Engineering | Cognitive Robotics
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

We present Sadcher, a real-time task assignment framework for heterogeneous multi-robot teams that incorporates dynamic coalition formation and task precedence constraints. Sadcher is trained through Imitation Learning (IL) and combines graph attention and transformers to predict assignment rewards between robots and tasks. Based on the predicted rewards, a relaxed bipartite matching step generates high-quality schedules with feasibility guarantees. We explicitly model robot and task positions, task durations, and robots' remaining processing times, enabling advanced temporal and spatial reasoning and generalization to environments with spatiotemporal distributions different from those seen during training. Trained on optimally solved small-scale instances, our method scales to larger task sets and team sizes. Sadcher outperforms other learning-based and heuristic baselines on randomized, unseen problems for small and medium-sized teams, with computation times suitable for real-time operation. We also explore sampling-based variants and evaluate scalability across robot and task counts. To address performance limitations identified on large-scale problem instances, we experiment with Reinforcement Learning (RL) to fine-tune the imitation-learned model. Both discrete and continuous RL formulations are explored, leveraging Proximal Policy Optimization (PPO). This allows training on larger problem instances for which optimal solutions for IL are infeasible to obtain. Our RL experiments provide insights into the comparative advantages and trade-offs of IL and RL methods. In addition, we release our dataset of 250,000 optimal schedules to facilitate future research. We include a detailed description of the instance generation and the Mixed-Integer Linear Programming (MILP) formulation used to solve them optimally. Furthermore, this thesis contributes a lightweight simulation environment with visualization tools for benchmarking different task assignment algorithms.
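
The pipeline described in the abstract (predict a robot-task reward matrix, then extract a feasible assignment through bipartite matching) can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in, not the thesis implementation: the assign_tasks helper, the reward matrix, and the feasibility mask are illustrative, and SciPy's Hungarian solver (linear_sum_assignment) substitutes for the relaxed bipartite matching step described in the thesis.

# Minimal sketch, assuming a learned model has already produced a reward matrix.
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_tasks(rewards: np.ndarray, feasible: np.ndarray) -> list[tuple[int, int]]:
    """rewards[i, j]: predicted reward for robot i executing task j.
    feasible[i, j]: False where capability/precedence constraints forbid the pair."""
    cost = -rewards.copy()            # maximizing reward == minimizing negated reward
    cost[~feasible] = 1e9             # large penalty masks infeasible pairs
    rows, cols = linear_sum_assignment(cost)
    # Drop any penalized pair the solver was forced to pick (that robot idles
    # this round), so only feasible robot-task assignments are returned.
    return [(i, j) for i, j in zip(rows, cols) if feasible[i, j]]

# Toy example: 3 robots, 4 tasks, one infeasible pairing.
rng = np.random.default_rng(0)
R = rng.uniform(size=(3, 4))
mask = np.ones((3, 4), dtype=bool)
mask[0, 2] = False                    # robot 0 cannot execute task 2
print(assign_tasks(R, mask))

In this sketch the masking-plus-matching step is what gives the feasibility guarantee: infeasible pairs can never appear in the returned schedule regardless of how the predicted rewards are distributed.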

Files

Bichler_Thesis_Report.pdf
(PDF | 7.44 MB)