Automatic Design of Verifiable Robot Swarms
M. Coppola (TU Delft - Control & Simulation, TU Delft - Space Systems Egineering)
G. C. H. E. de Croon – Promotor (TU Delft - Control & Simulation)
E.K.A. Gill – Promotor (TU Delft - Space Engineering)
J Guo – Promotor (TU Delft - Space Systems Egineering)
More Info
expand_more
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
The paradigm of swarm robotics aims to enable several independent robots to collaborate together toward collective goals. The distributed nature of a swarm, whereby each robot acts independently in accordance with its perceived environment, is expected to provide the system with a high degree of flexibility, robustness, and scalability. However, this comes at the cost of increased system complexity. This thesis explores how to automatically design a collective behavior in a way that is transparent and verifiable. The thesis begins by taking a step back and analyzing the design choices that need to be made when designing a swarm of robots. Through an in-depth literature study, focusing on swarms of small drones as a case study, we found how sensor and actuator choices can create constraints for the swarm behavior that can be achieved, and how desired swarm behaviors can create requirements for the hardware design and local-level controllers. Coincidentally, we found a prominent example of this in our own research on relative localization sensors for swarms of tiny drones (performed in addition to the research in this thesis), whereby we developed a communication-based relative localization approach that enabled teams of tiny drones to fly together in tight areas, the advantages being: omni-directional sensing, independence from lighting conditions and/or visual clutter, low mass, and low computational costs. However, this solution also comes with the restriction of ensuring that robots never move parallel to each other, as this will present an unobservable situation. Based on such lessons, the remainder of the thesis aims for a framework that is agnostic with respect to the robot and the swarm's collective task. The framework proposed in this thesis is centered around the following notion: a collective goal can be broken down into a set of locally observable objectives which the robots can sense, referred to as ``desired'' objectives. The robots then take actions in order to reach these desired objectives. When all robots achieve the desired objectives, then the global goal and/or collective behavior emerges. This framework was first developed for the specific case study of pattern formation by cognitively limited robots, which could only sense the relative location of close-by neighbors. It was later generalized, and its use was demonstrated on other collective tasks, namely: aggregation, consensus, and foraging. Through a local model of agent transitions, it was possible to: 1) identify potential obstructions to achieving the collective goal, and 2) optimize the behavior of the robots so as to maximize the likelihood of achieving the desired objectives. The optimization is performed by an evolutionary algorithm that leverages the local model, whereby the fitness function maximizes the probability of being in a desired local state. Using this approach, the policy evaluation only scales with the size of the local state space, and demands much less computation than swarm simulations would. In the final stage of this research, a complete framework was further developed to alleviate the need to manually define the desired objectives as well as the local models required for potential verification and/or optimization. The framework uses a data-driven approach to automatically extract two models: 1) a deep neural network that estimates the global performance of the swarm from the distribution of local sensor data, and 2) a probabilistic state transition model that explicitly models the local state transitions (i.e., transitions in observations from the perspective of a single robot in a swarm) given a policy. The framework can efficiently lead to effective controllers, as demonstrated via multiple case studies. It can also be used in combination with an evolutionary optimization process, leading to higher efficiency, or for heterogeneous online learning. Overall, the methods and insights developed in this thesis propose a new way to approach the development of verifiable and understandable behaviors for swarms of robots, using models in order to perform analysis, verification, and optimization.