This work investigates adaptive switching mechanisms for reinforcement learning (RL) controllers in high-speed autonomous drone racing. While domain randomization (DR) improves generalization, single-policy controllers remain constrained by their training distributions and may fail under unmodeled conditions such as wind, hardware wear, or sensor degradation. To address this, we introduce two complementary online switching strategies: the look-back switch, which retrospectively identifies the dynamics model that best matches the recent flight history, and the look-ahead switch, which prospectively selects the most suitable policy by simulating candidate controllers over short horizons. Both mechanisms leverage pre-trained policies without requiring retraining, enabling safer and faster flight. We evaluate the methods in simulation on two platforms (a 5-inch racing quadcopter and the high-fidelity A2RL competition drone) and across two tracks (a Figure-8 track and the A2RL Grand Challenge track). Results show that adaptive switching improves robustness and reduces lap times compared to fixed-policy baselines, demonstrating its potential for real-world deployment.
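To make the two strategies concrete, the sketch below illustrates one plausible reading of them; it is not the paper's implementation. It assumes a library of dynamics models `models` paired one-to-one with pre-trained policies `policies`, a recent buffer of `(state, action, next_state)` transitions, and illustrative `simulate`, `reward`, and `horizon` parameters that are not specified in the abstract. The look-back switch scores each model by one-step prediction error on the recent history; the look-ahead switch rolls each candidate policy forward in simulation and picks the highest short-horizon return.

```python
import numpy as np

def look_back_switch(history, models):
    """Retrospective selection: return the index of the dynamics model
    whose one-step predictions best explain the recent flight history.

    history: iterable of (state, action, next_state) tuples
    models:  list of callables f(state, action) -> predicted next state
    """
    errors = [
        sum(np.linalg.norm(f(s, a) - s_next) for s, a, s_next in history)
        for f in models
    ]
    return int(np.argmin(errors))

def look_ahead_switch(state, policies, simulate, reward, horizon=20):
    """Prospective selection: roll each candidate policy forward for a
    short horizon in simulation and return the index of the policy with
    the highest accumulated reward (e.g., progress along the track).

    policies: list of callables pi(state) -> action
    simulate: callable (state, action) -> next state (current best model)
    reward:   callable state -> float
    """
    returns = []
    for pi in policies:
        s, total = state, 0.0
        for _ in range(horizon):
            a = pi(s)
            s = simulate(s, a)
            total += reward(s)
        returns.append(total)
    return int(np.argmax(returns))
```

In this reading, the look-back switch answers "which training-time dynamics does the world currently resemble?", while the look-ahead switch answers "which policy would perform best from here?"; both reuse the fixed library of pre-trained policies, so no retraining is needed online.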