Controlling MXER Tether Dynamics for Extended Payload Rendezvous
Z.H. du Toit (TU Delft - Aerospace Engineering)
M.C. Naeije – Mentor (TU Delft - Astrodynamics & Space Missions)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
This thesis addresses the challenge of extending the brief payload rendezvous window of Momentum Exchange with Electrodynamic Reboost (MXER) tether systems, a technology for propellantless space transportation. It compares three distinct actuator configurations using a two-dimensional rigid-body dynamical model, and evaluates a conventional optimal control method against a modern, model-free Reinforcement Learning (RL) algorithm.
The investigation identifies the reeler actuator configuration as the most effective for extending the rendezvous window in an unconstrained dynamic environment. This configuration, which incorporates an intermediate reeling mass, extended the rendezvous window threefold, from 0.6 seconds uncontrolled to 1.8 seconds. This duration, measured within trajectory tracking tolerances of 10 m for position and 10 m/s for velocity relative to the payload, exceeded that of both the baseline tip-reeling (0.8 s) and climber (1.0 s) configurations. The superior performance is primarily attributed to the reeler's greater control authority over the tether tip's velocity profile, which lets it counteract more effectively the characteristic V-shaped relative velocity curve inherent to rendezvous.
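The window definition used above (time during which both tolerances hold) can be illustrated with a minimal sketch. The function name `rendezvous_window`, the sampled-signal interface, and the synthetic V-shaped velocity profile are assumptions made for illustration; they are not taken from the thesis.

```python
import numpy as np

def rendezvous_window(t, rel_pos, rel_vel, pos_tol=10.0, vel_tol=10.0):
    """Longest continuous span (s) in which both tolerances hold.

    t        : sample times in seconds
    rel_pos  : magnitude of tip-to-payload relative position (m)
    rel_vel  : magnitude of tip-to-payload relative velocity (m/s)
    The default tolerances are the 10 m and 10 m/s quoted in the abstract.
    """
    t = np.asarray(t, dtype=float)
    ok = (np.abs(rel_pos) <= pos_tol) & (np.abs(rel_vel) <= vel_tol)
    best, start = 0.0, None
    for i, inside in enumerate(ok):
        if inside and start is None:
            start = i                      # a new in-tolerance run begins
        if start is not None and (not inside or i == len(ok) - 1):
            end = i if inside else i - 1   # last in-tolerance sample of the run
            best = max(best, t[end] - t[start])
            start = None
    return best

# Hypothetical V-shaped relative velocity: 20 m/s per second either side of t = 1 s.
t = np.linspace(0.0, 2.0, 201)
rel_vel = 20.0 * np.abs(t - 1.0)
rel_pos = np.zeros_like(t)                 # position error ignored in this toy case
window = rendezvous_window(t, rel_pos, rel_vel)
print(f"window: {window:.2f} s")
```

In this toy profile the velocity tolerance alone sets the window, which shows why flattening the bottom of the V is what lengthens it.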
In the unconstrained scenario, both the conventional iterative Linear Quadratic Regulator (iLQR) and the model-free Soft Actor-Critic (SAC) RL agent produced control policies that achieved the 1.8-second rendezvous window. However, the SAC agent's policy used the actuators sporadically and less smoothly, a trait undesirable in practical applications because of potential structural loads, component wear, and the excitation of unmodelled high-frequency wave dynamics.
The study of constrained control revealed the inherent difficulty of the problem. When realistic operational limits on tether tension, g-loads, and actuator usage were imposed, neither the Augmented-Lagrangian iLQR (AL-iLQR) nor the SAC-based controller could achieve a sustained rendezvous window. The AL-iLQR proved overly conservative, satisfying constraints but failing to exploit the system's full dynamic potential. Conversely, the SAC agent, guided by a simple penalty-based reward function, did not robustly enforce critical constraints, notably violating tension requirements, which would lead to system failure.
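A penalty-based reward of the kind described above might look like the following sketch. The function name, the weights, and the limit values (`t_min`, `t_max`, `g_max`) are hypothetical placeholders; the abstract does not state the actual reward terms or limits.

```python
def penalty_reward(pos_err, vel_err, tension, g_load, action,
                   t_min=0.0, t_max=1.0e5, g_max=3.0,
                   w_pos=1.0, w_vel=1.0, w_pen=10.0, w_act=0.01):
    """Soft-constraint reward: tracking cost plus weighted constraint violations.

    All limits and weights are illustrative assumptions, not thesis values.
    """
    # Tracking term: smaller position and velocity errors give a higher reward.
    tracking = -(w_pos * pos_err**2 + w_vel * vel_err**2)
    # Violation is zero inside the limits and grows linearly outside them.
    violation = (max(0.0, t_min - tension)     # slack tether
                 + max(0.0, tension - t_max)   # over-tension
                 + max(0.0, g_load - g_max))   # excessive g-load
    # Quadratic actuator-usage cost discourages large control inputs.
    return tracking - w_pen * violation - w_act * action**2

r_ok = penalty_reward(pos_err=5.0, vel_err=5.0, tension=5.0e4, g_load=1.0, action=0.0)
r_slack = penalty_reward(pos_err=5.0, vel_err=5.0, tension=-1.0, g_load=1.0, action=0.0)
r_tradeoff = penalty_reward(pos_err=0.0, vel_err=0.0, tension=-1.0, g_load=1.0, action=0.0)
```

Because the penalty weight is finite, a policy can still score better by violating a limit when that buys enough tracking accuracy (compare `r_tradeoff` with `r_ok`), which is one plausible reading of why a simple penalty did not robustly enforce the tension requirement.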
Verification and validation studies confirmed the fidelity of the rigid-body model. A variance-based sensitivity analysis highlighted tether length uncertainty as the dominant factor affecting rendezvous accuracy. Additionally, a comprehensive hyperparameter optimisation study for the SAC RL agent identified the learning rate and batch size as highly influential parameters for performance. A brief generalisation test also showed that the RL agent, trained on the reeler configuration, did not successfully generalise to the climber configuration, though its velocity control performance indicated potential for improvement.
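For readers unfamiliar with variance-based sensitivity analysis, the sketch below estimates first-order Sobol indices with a Saltelli-type pick-and-freeze estimator. The estimator choice, the function names, and the linear stand-in model are assumptions; the abstract does not say which estimator, tooling, or input ranges the thesis used.

```python
import numpy as np

def first_order_sobol(model, n_inputs, n_samples=50_000, seed=0):
    """Estimate first-order Sobol indices for inputs sampled from U(0, 1)."""
    rng = np.random.default_rng(seed)
    a = rng.random((n_samples, n_inputs))   # base sample matrix A
    b = rng.random((n_samples, n_inputs))   # independent resample B
    y_a, y_b = model(a), model(b)
    mean = 0.5 * (y_a.mean() + y_b.mean())  # centring reduces estimator variance
    var = np.concatenate([y_a, y_b]).var()  # total output variance
    indices = np.empty(n_inputs)
    for i in range(n_inputs):
        ab = a.copy()
        ab[:, i] = b[:, i]                  # A with column i taken from B
        y_ab = model(ab)
        indices[i] = np.mean((y_b - mean) * (y_ab - y_a)) / var
    return indices

def toy_miss_distance(x):
    """Hypothetical stand-in for the simulation: input 0 dominates by design."""
    return 4.0 * x[:, 0] + 1.0 * x[:, 1]

s = first_order_sobol(toy_miss_distance, n_inputs=2)
print(s)
```

For this linear toy model the exact indices are 16/17 and 1/17, so the estimate can be checked analytically; in a real study `model` would wrap the tether simulation and the inputs would be scaled to their uncertainty ranges.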
Ultimately, this thesis answers its primary research questions: it demonstrates how actuator configuration influences rendezvous window controllability and affirms RL's potential, albeit with current limitations concerning constraint satisfaction and control smoothness. All project goals were met, from model derivation and iLQR implementation to the deployment and evaluation of the SAC RL algorithm, laying groundwork for future advancements in MXER tether control.