Reinforcement Learning for a Six Degree of Freedom Martian Landing
A. El Ghalbzouri (TU Delft - Aerospace Engineering)
E. van Kampen – Mentor (TU Delft - Aerospace Engineering)
E.J.J. Smeur – Graduation committee member (TU Delft - Aerospace Engineering)
M.C. Naeije – Graduation committee member (TU Delft - Aerospace Engineering)
I.Z. El-Hajj – Graduation committee member (TU Delft - Aerospace Engineering)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
This thesis presents a simulation of a Martian lander controlled with Reinforcement Learning. The objective is to train an agent to land on Mars using the Proximal Policy Optimization (PPO) method. The spacecraft is controlled through control allocation, in which translations and rotations are controlled independently. A PD controller is also designed for the same lander for comparison.
The PD controller is found to be more accurate than the reinforcement learning controller. A second reinforcement learning controller is built on a thruster model, in which the spacecraft is controlled by five thrusters and its motions are coupled. This lander requires more control effort, is less accurate in tracking a reference velocity, and is less accurate in landing on the designated landing spot.
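As an illustration of the reference-velocity tracking mentioned in the abstract, the following is a minimal sketch of a PD controller on a single vertical axis. It is not the controller from the thesis: the gains `kp` and `kd`, the gravity feedforward term, the one-dimensional point-mass dynamics, and the names `PDController` and `simulate_descent` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

MARS_GRAVITY = 3.71  # m/s^2, approximate surface gravity of Mars


@dataclass
class PDController:
    kp: float  # proportional gain (hypothetical value, not from the thesis)
    kd: float  # derivative gain (hypothetical value, not from the thesis)
    prev_error: Optional[float] = None

    def command(self, v_ref: float, v: float, dt: float) -> float:
        """Return a commanded acceleration from the velocity tracking error."""
        error = v_ref - v
        # Skip the derivative term on the first call to avoid a startup kick.
        d_error = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.kd * d_error


def simulate_descent(v0=-50.0, v_ref=-2.0, dt=0.01, steps=3000):
    """Integrate 1-D vertical velocity under gravity plus PD-commanded thrust."""
    ctrl = PDController(kp=2.0, kd=0.1)
    v = v0
    for _ in range(steps):
        # Gravity feedforward plus PD correction; thrust can only push upward.
        a_thrust = max(0.0, MARS_GRAVITY + ctrl.command(v_ref, v, dt))
        v += (a_thrust - MARS_GRAVITY) * dt  # explicit Euler step
    return v


if __name__ == "__main__":
    print(f"final vertical velocity: {simulate_descent():.3f} m/s")
```

With these assumed gains the vertical velocity settles at the reference value. A six-degree-of-freedom lander would need one such loop per controlled axis, together with attitude dynamics and a mapping from commanded accelerations to actuators, none of which is modelled here.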