Reinforcement Learning-Based Guidance and Control for Aerial-to-Aerial Pest Interception

Master Thesis (2025)
Author(s)

M.S. Broekers (TU Delft - Aerospace Engineering)

Contributor(s)

G.C.H.E. de Croon – Mentor (TU Delft - Control & Simulation)

R.W. Vos – Mentor (TU Delft - Control & Simulation)

M. Yedutenko – Mentor (TU Delft - Control & Simulation)

C. de Wagter – Graduation committee member (TU Delft - Control & Simulation)

A. Bombelli – Graduation committee member (TU Delft - Operations & Environment)

Faculty
Aerospace Engineering
Publication Year
2025
Language
English
Graduation Date
11-12-2025
Awarding Institution
Delft University of Technology
Programme
Aerospace Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

PATS-X is a greenhouse pest suppression system that uses a depth camera and an autonomous micro air vehicle (MAV) to detect, track, and physically intercept flying insects. This study targets guidance and control for reliable aerial-to-aerial interception. Reinforcement learning (RL) is used to learn policies from insect flight recordings. We evaluate control policies at increasing levels of abstraction: direct motor commands, collective thrust and body rates (CTBR), and acceleration. In simulation, lower abstraction levels yield better interception performance; moving from acceleration to motor commands reduces the median time to first interception by about 41%. A systematic variation of the observation space reveals that the most effective observations are body-frame relative position and velocity, and that short temporal histories add no benefit beyond noise filtering. Compared with a state-of-the-art classical benchmark, Fast Response Proportional Navigation (FRPN), the best motor-level RL policy in simulation achieves a median time to first interception of 0.85 [0.76–1.07] s with a 99.1% interception rate, versus 1.90 [1.04–2.80] s and 95.6% for FRPN. To address the reality gap, we compare how well the different control abstractions transfer to hardware. CTBR policies deploy on hardware with the least performance loss relative to simulation. Motor-level policies also transfer when trained with modest domain randomization (DR) plus an action-difference penalty that limits command jitter and thermal load. Acceleration-level policies did not transfer. In a PATS-X proof of concept, an RL controller deployed on the actual system reached a 95.6% interception rate against virtual moths, versus 80.0% for the existing controller. Moreover, the RL controller shortened the time to first interception by 0.70 s, indicating the potential of RL-based guidance for the PATS-X system.
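
The abstract names two ingredients that can be illustrated compactly: observations expressed as body-frame relative position and velocity, and an action-difference penalty that discourages command jitter. The Python sketch below is illustrative only and is not taken from the thesis. The function names, the body-to-world rotation-matrix convention, the quadratic form of the penalty, and the weight of 0.1 are all assumptions made for the example.

```python
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def rotate_world_to_body(r_body_to_world: Matrix, v_world: Vector) -> List[float]:
    """Express a world-frame vector in the body frame.

    Assumes r_body_to_world is an orthonormal 3x3 rotation matrix, so its
    transpose is its inverse; the result is R^T v.
    """
    return [
        sum(r_body_to_world[row][col] * v_world[row] for row in range(3))
        for col in range(3)
    ]


def body_frame_observation(
    mav_pos: Vector,
    mav_vel: Vector,
    target_pos: Vector,
    target_vel: Vector,
    r_body_to_world: Matrix,
) -> List[float]:
    """Return [relative position, relative velocity] in the MAV body frame."""
    rel_pos = [t - m for t, m in zip(target_pos, mav_pos)]
    rel_vel = [t - m for t, m in zip(target_vel, mav_vel)]
    return (
        rotate_world_to_body(r_body_to_world, rel_pos)
        + rotate_world_to_body(r_body_to_world, rel_vel)
    )


def action_difference_penalty(
    action: Vector, prev_action: Vector, weight: float = 0.1
) -> float:
    """Hypothetical quadratic penalty on the change between consecutive actions.

    The weight and the squared-difference form are assumptions for this sketch.
    """
    return weight * sum((a - b) ** 2 for a, b in zip(action, prev_action))


if __name__ == "__main__":
    # MAV yawed 90 degrees about z: its body x-axis points along world y.
    yaw_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    obs = body_frame_observation(
        mav_pos=[0.0, 0.0, 0.0],
        mav_vel=[0.0, 2.0, 0.0],
        target_pos=[0.0, 1.0, 0.0],
        target_vel=[0.0, 0.0, 0.0],
        r_body_to_world=yaw_90,
    )
    print(obs)
    print(action_difference_penalty([0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.0]))
```

In a training loop, a penalty of this kind would typically be subtracted from the per-step reward, so that a policy that changes its motor commands abruptly between steps scores lower than one that reaches the same outcome smoothly.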

Files

License info not available