Learning-Based Multi-UAV Flocking Control With Limited Visual Field and Instinctive Repulsion

Journal Article (2024)

Author(s)

Chengchao Bai (Harbin Institute of Technology)

Peng Yan (Harbin Institute of Technology)

Haiyin Piao (SADRI Institute, Northwestern Polytechnical University)

W. Pan (TU Delft - Robot Dynamics, The University of Manchester)

Jifeng Guo (Harbin Institute of Technology)

Research Group

Robot Dynamics

DOI related publication

https://doi.org/10.1109/TCYB.2023.3246985

Keywords

Optimization; Sensors; UAVs; Collision avoidance; Reinforcement learning; Visualization; Training; Autonomous aerial vehicles; Deep reinforcement learning (DRL); Flocking control; Inter-unmanned aerial vehicle (UAV) collision avoidance; Limited visual field

To reference this document use:

https://resolver.tudelft.nl/uuid:9c9e7446-41fa-4ba4-84c3-03575873e5b0

Publication Year

2024

Language

English

Issue number

1

Volume number

54

Pages (from-to)

462-475

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This article explores deep reinforcement learning (DRL) for the flocking control of unmanned aerial vehicle (UAV) swarms. The flocking control policy is trained using a centralized-learning-decentralized-execution (CTDE) paradigm, where a centralized critic network augmented with additional information about the entire UAV swarm is utilized to improve learning efficiency. Instead of learning inter-UAV collision avoidance capabilities, a repulsion function is encoded as an inner-UAV 'instinct.' In addition, the UAVs can obtain the states of other UAVs through onboard sensors in communication-denied environments, and the impact of varying visual fields on flocking control is analyzed. Through extensive simulations, it is shown that the proposed policy with the repulsion function and limited visual field has a success rate of 93.8% in training environments, 85.6% in environments with a high number of UAVs, 91.2% in environments with a high number of obstacles, and 82.2% in environments with dynamic obstacles. Furthermore, the results indicate that the proposed learning-based methods are more suitable than traditional methods in cluttered environments.
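
The abstract mentions two mechanisms without giving their form: a hard-coded repulsion function for inter-UAV collision avoidance and a limited visual field for sensing neighbors. The following is a minimal illustrative sketch of how such mechanisms could look, not the authors' formulation. It assumes a 2-D world, a sector-shaped field of view, and a linear repulsion law. The function names and the parameters `fov_deg`, `max_range`, `safe_dist`, and `gain` are hypothetical.

```python
import math


def visible_neighbors(position, heading, others, fov_deg=120.0, max_range=10.0):
    """Return the neighbors inside a 2-D sector centered on `heading` (radians).

    Assumption: the visual field is a circular sector; the paper's sensor
    model may differ.
    """
    half_fov = math.radians(fov_deg) / 2.0
    visible = []
    for other in others:
        dx, dy = other[0] - position[0], other[1] - position[1]
        dist = math.hypot(dx, dy)
        if dist == 0.0 or dist > max_range:
            continue
        # Bearing of the neighbor relative to the heading, wrapped to [-pi, pi].
        bearing = math.atan2(dy, dx) - heading
        bearing = math.atan2(math.sin(bearing), math.cos(bearing))
        if abs(bearing) <= half_fov:
            visible.append(other)
    return visible


def repulsion(position, neighbors, safe_dist=2.0, gain=1.0):
    """Sum velocity corrections pointing away from neighbors closer than `safe_dist`.

    Assumption: the correction grows linearly as the gap shrinks; this is an
    illustrative choice, not the repulsion function used in the article.
    """
    vx, vy = 0.0, 0.0
    for other in neighbors:
        dx, dy = position[0] - other[0], position[1] - other[1]
        dist = math.hypot(dx, dy)
        if 0.0 < dist < safe_dist:
            scale = gain * (safe_dist - dist) / dist
            vx += scale * dx
            vy += scale * dy
    return (vx, vy)
```

In a setup like this, a learned policy would propose a velocity command and the repulsion term would be added to it, using only the neighbors returned by `visible_neighbors`. That combination is again an assumption made for illustration.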

Files

Learning_Based_Multi_UAV_Flock... (pdf)

(pdf | 7.02 Mb)

- Embargo expired on 08-09-2023

License info not available