Autonomous UAV Landing on Stochastic Maritime Targets

A reinforcement learning approach for maritime UAV applications

Master Thesis (2025)

Author(s)

H.S. Hennecken (TU Delft - Aerospace Engineering)

Contributor(s)

M.J. Ribeiro – Mentor (TU Delft - Aerospace Engineering)

O. Pfeifle – Mentor (Royal Netherlands Aerospace Centre)

E. van Kampen – Graduation committee member (TU Delft - Aerospace Engineering)

J.S. Sun – Mentor (TU Delft - Technology, Policy and Management)

Faculty

Aerospace Engineering

To reference this document use

https://resolver.tudelft.nl/uuid:c22debc0-b16f-464a-bc19-7ec3ca8f9d78

More Info

Publication Year

2025

Language

English

Graduation Date

11-11-2025

Awarding Institution

Delft University of Technology

Programme

Aerospace Engineering

Downloads counter

112

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Reliable autonomous recovery of Unmanned Aerial Vehicles (UAVs) on moving maritime platforms remains a critical challenge, primarily because of complex, stochastic deck motion (particularly vertical heave) and unpredictable environmental disturbances. Existing Reinforcement Learning (RL) approaches often simplify this environment, which limits their real-world applicability. This thesis investigates the robustness trade-offs of RL-based guidance controllers under realistic, highly dynamic maritime conditions. We benchmarked a classical Proportional-Integral-Derivative (PID) controller against two RL architectures trained with Soft Actor-Critic (SAC) in a high-fidelity PyBullet simulation: a Full RL 3D controller and a novel Hybrid RL 1D controller, which applies RL only to the critical, stochastic vertical (heave) axis. The results show that the Hybrid RL 1D architecture (86.6% success rate) achieved the best overall robustness and efficiency. Notably, the RL controllers sharply reduced average landing time (RL_1D: 3.31 s vs. Baseline: 11.51 s), although the classical PID baseline maintained higher horizontal precision (Err_XY of 0.17 ± 0.17 m). The Hybrid RL 1D controller maintained a superior success rate of up to 89% in high sea states (SS7) and was more resilient to sensor noise. However, a critical limitation was identified: both RL-based policies suffered a pronounced performance collapse under strong wind disturbances not seen during training, a regime in which the non-adaptive classical PID baseline proved unexpectedly stable. These findings confirm the benefits of hybrid control for maximizing robustness and highlight that wind disturbance rejection remains a significant, unresolved shortcoming of current RL guidance systems.
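
The hybrid split described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `PID` class, the `hybrid_command` function, the gains, and the `placeholder_policy` stand-in for a trained SAC actor are all hypothetical names and values chosen for illustration. The sketch only shows the structural idea that the horizontal axes are driven by classical PID loops while the vertical (heave) command comes from a learned policy.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class PID:
    """Textbook PID loop (hypothetical gains, not taken from the thesis)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        # Accumulate the integral term and take a finite-difference derivative.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def placeholder_policy(vertical_error: float) -> float:
    """Stand-in for a trained SAC actor (assumption): a clipped proportional rule."""
    return max(-1.0, min(1.0, 0.5 * vertical_error))


def hybrid_command(
    rel_pos: Sequence[float],
    pid_x: PID,
    pid_y: PID,
    vertical_policy: Callable[[float], float],
    dt: float,
) -> Tuple[float, float, float]:
    """Combine PID commands on x and y with a learned command on z."""
    ex, ey, ez = rel_pos  # deck position minus UAV position, per axis
    vx = pid_x.step(ex, dt)   # classical control on the horizontal axes
    vy = pid_y.step(ey, dt)
    vz = vertical_policy(ez)  # learned control on the heave axis only
    return vx, vy, vz


if __name__ == "__main__":
    cmd = hybrid_command(
        (2.0, -1.0, -4.0),
        PID(kp=1.0, ki=0.0, kd=0.0),
        PID(kp=1.0, ki=0.0, kd=0.0),
        placeholder_policy,
        dt=0.1,
    )
    print(cmd)
```

In an actual setup the placeholder would be replaced by the actor network of a trained SAC agent, and its observation would likely include more than the vertical error (for example deck velocity), which this sketch deliberately leaves out.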

Files

MSc_Thesis_Hennecken.pdf

(pdf | 1.87 MB)

License info not available