Hypersonic glide vehicle trajectory optimization requires generating complete reachable footprints for mission planning under strict path constraints. Current methods face limitations including equilibrium-glide assumptions, fixed angle-of-attack profiles, and incomplete footprint coverage that requires re-optimization for each target direction. This work presents the first reinforcement learning approach to complete global footprint generation with dual control authority (bank angle and angle of attack) on a rotating spherical-Earth model, including direct target-point guidance and no-fly-zone avoidance. The trained policy generates footprints for any location on Earth while incorporating full three-degree-of-freedom dynamics, including Coriolis effects and control-rate limitations. All trajectories satisfy operational constraints on dynamic pressure, g-load, and temperature. The policy learns purely from the objective of maximizing range in every direction, without pre-designed control profiles. Using the Soft Actor-Critic algorithm, optimal control strategies are learned directly from environment interaction. Large-scale Monte Carlo validation with 100 million randomly sampled trajectories confirms that the learned policy discovers the boundaries of the reachable domain, with the reinforcement learning footprint exceeding the Monte Carlo footprint by 5.1% in area. The framework provides precision target guidance to arbitrary global coordinates and supports no-fly-zone avoidance.
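As an illustrative sketch (not taken from the paper), the footprint-area comparison described above can be carried out by tracing each method's boundary points and computing the enclosed area, e.g. with the shoelace formula on a planar projection. The function name and the toy boundary coordinates below are hypothetical; the paper's actual comparison is on a spherical Earth model.

```python
def shoelace_area(boundary):
    # Polygon area via the shoelace formula; boundary is an ordered
    # list of (x, y) vertices around the footprint edge.
    area = 0.0
    n = len(boundary)
    for i in range(n):
        x1, y1 = boundary[i]
        x2, y2 = boundary[(i + 1) % n]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

# Hypothetical footprint boundaries (planar approximation, toy data):
rl_footprint = [(0.0, 0.0), (2.1, 0.0), (2.1, 1.0), (0.0, 1.0)]
mc_footprint = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]

# Relative area gain of the RL footprint over the Monte Carlo footprint.
gain = shoelace_area(rl_footprint) / shoelace_area(mc_footprint) - 1.0
print(f"RL footprint exceeds MC footprint by {gain:.1%}")  # → 5.0%
```

On a spherical Earth the same comparison would use a spherical polygon area (spherical excess) rather than the planar shoelace formula, but the ratio-based comparison is identical in structure.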