Exploring Reinforcement Learning for Constrained Wing Shape Optimization


In this paper, the Proximal Policy Optimization (PPO) algorithm is used to perform a constrained wing shape optimization. PPO is a Machine Learning (ML) algorithm that improves by repeatedly performing the same optimization and learning from its results. The complete adaptation of the PPO framework to the design problem is detailed and evaluated. The PPO framework consistently optimized the wing 4% further than the Particle Swarm Optimization (PSO) algorithm, yielding more efficient wing shapes, and, once the model was fully trained, it did so 35 times faster. The trained PPO model was also able to optimize the wings of other, similar aircraft without direct retraining. These results illustrate that PPO could be a promising technique for automated aerospace design problems. However, because of the significant training time of the ML approach, PPO is not an effective replacement for traditional optimization algorithms in design problems where only a single optimization is required.
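
The trade-off between training cost and per-run speed can be illustrated with a short break-even calculation. The sketch below is not taken from the paper: the function name and the training and run times are hypothetical, and only the 35-times speed-up comes from the reported results. It assumes the training cost is paid once and that every later optimization takes 1/35 of the PSO time.

```python
import math


def break_even_runs(training_time, pso_time_per_run, speedup=35.0):
    """Smallest number of optimizations for which a trained model is no slower overall.

    Hypothetical helper: training_time and pso_time_per_run are illustrative
    inputs in the same time unit; speedup=35 is the factor reported in the text.
    """
    if speedup <= 1.0:
        raise ValueError("speedup must exceed 1 for training to pay off")
    # Time of one optimization with the trained model (assumed constant per run).
    ppo_time_per_run = pso_time_per_run / speedup
    # Time saved on each optimization relative to PSO.
    saving_per_run = pso_time_per_run - ppo_time_per_run
    # Training is amortized once the accumulated savings cover its cost.
    return math.ceil(training_time / saving_per_run)


# Illustrative numbers only: 340 hours of training, 35 hours per PSO run.
print(break_even_runs(training_time=340.0, pso_time_per_run=35.0))  # -> 10
```

With these made-up inputs, the trained model only pays off from the tenth optimization onward, which is consistent with the observation that a single optimization does not justify the training effort.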