Conventional and Reinforcement Learning Control of MXER Tether Dynamics for Extended Payload Rendezvous
Abstract
Momentum Exchange with Electrodynamic Reboost (MXER) tethers transfer captured payloads to higher orbits using a long, rotating tether. The transfer occurs through an exchange of momentum from the tether to the payload, after which the tether's orbital energy is restored via electrodynamic thrusting. MXER tethers offer a sustainable, reusable, and near-propellantless alternative to rockets for orbital and interplanetary payload transfer. However, the short rendezvous window for payload capture, typically lasting mere seconds, presents a significant challenge to the use of these systems. This research investigates the control of MXER tether dynamics, aiming to improve payload capture success by extending the rendezvous window. It compares three actuator configurations (a baseline tip-reeling system, a climbing actuator mass, and a reeling actuator mass) previously studied for librating tethers, adapting them for a rotating MXER system based on the Cislunar Tether Transport System design. A 2D rigid-body model is used to simulate the system dynamics. First, a conventional iterative Linear Quadratic Regulator (iLQR) establishes a baseline for control performance. Subsequently, the model-free Soft Actor-Critic (SAC) deep reinforcement learning (RL) algorithm is implemented and trained. Both control methods are tested with and without dynamic system constraints, and the performance of each configuration is evaluated based on rendezvous window extension and constraint satisfaction. In the unconstrained case, the reeler configuration is the most effective, extending the rendezvous window from 0.6 seconds in the uncontrolled case to 1.8 seconds. The SAC algorithm matches the performance of the tuned iLQR controller but produces a less smooth control policy with sporadic actuator use.
The constrained control proved more challenging, with neither the augmented-Lagrangian iLQR nor the SAC-based controller managing to extend the rendezvous window; the former was overly conservative, while the latter failed to satisfy operational constraints.
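Since the rendezvous window is the central performance metric, the following minimal sketch illustrates one way such a window could be measured from simulation output. It is not the method used in this work: the function name `rendezvous_window`, the capture tolerance `tol`, and the definition of the window as the longest contiguous interval in which the tip-to-payload separation stays within that tolerance are all assumptions made for illustration, and the example data are synthetic.

```python
import numpy as np

def rendezvous_window(t, separation, tol):
    """Longest contiguous time span (s) with separation <= tol.

    Hypothetical metric for illustration only; the study's actual
    window definition may differ.
    """
    t = np.asarray(t, dtype=float)
    inside = np.asarray(separation, dtype=float) <= tol
    best = 0.0
    start = None  # index where the current in-tolerance run began
    for i, ok in enumerate(inside):
        if ok and start is None:
            start = i
        # Close the run when tolerance is exceeded or the data ends.
        if start is not None and (not ok or i == len(inside) - 1):
            end = i if ok else i - 1
            best = max(best, t[end] - t[start])
            start = None
    return best

# Synthetic fly-by (not data from the study): the payload passes the
# tether tip at a constant relative speed of 5 m/s, closest at t = 0.
t = np.linspace(-2.0, 2.0, 401)   # 0.01 s sampling
separation = 5.0 * np.abs(t)      # metres
window = rendezvous_window(t, separation, tol=1.02)
print(f"window: {window:.2f} s")
```

Under this assumed definition, a controller that extends the window would appear as a longer in-tolerance run in the separation history, which is how the controlled and uncontrolled cases could be compared on equal terms.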