R.K. Georgiev

1 record found

Bachelor thesis (2026) - R.K. Georgiev, F. Yu, F.A. Oliehoek, N. Yorke-Smith
Pairs trading has grown increasingly popular over the past several decades, and its application has extended into the domain of portfolio optimization. Reinforcement learning (RL) strategies, particularly Proximal Policy Optimization (PPO), have been used to address this problem. However, while substantial research exists for the single-pair case, a systematic investigation of RL models for portfolio optimization across multiple pairs simultaneously has been lacking. To address this gap, we develop and compare two PPO models that trade on several cointegrated pairs identified within the energy sector of the S&P 500. The two models differ in their information set: one is given explicit knowledge of the asset pairs it trades, while the other operates without this information, learning to allocate capital from price and portfolio data alone. We find that the pair-aware model achieves an annual return of 20.1% and a Sharpe ratio of 0.877, and maintains consistent performance across varying numbers of traded pairs, though no clear relationship emerges between the number of pairs traded and performance. These results suggest that the multi-pair approach to portfolio optimization is promising and highlight the need for further investigation. ...