The use of Reinforcement Learning in Algorithmic Trading
What are the impacts of different possible reward functions on the ability of the RL model to learn, and on the performance of the RL model?
J. Bertasius (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Antonis Papapantoleon – Mentor (TU Delft - Applied Probability)
M.A. Sharifi Kolarijani – Mentor (TU Delft - Team Amin Sharifi Kolarijani)
N. Yorke-Smith – Mentor (TU Delft - Algorithmics)
Julia Olkhovskaya – Graduation committee member (TU Delft - Sequential Decision Making)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Algorithmic trading already dominates modern financial markets, yet most live systems still rely on fixed heuristics that falter when conditions change. Deep reinforcement learning (RL) agents promise adaptive decision making, but their behaviour is driven entirely by the reward function, a design choice that remains under-researched. Given the critical importance of reward design in RL problems, this paper investigates the impact of different reward functions for trading in the Forex market, focusing on the EUR/USD pair. We compare several reward schemes: profit-only, risk-adjusted, multi-objective, imitation-learning based, and self-rewarding mechanisms. Overall, we demonstrate the effect of careful reward engineering on performance and learning efficiency, highlighting reward design as a critical and previously under-examined component of building reliable RL trading models.
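To make the contrast between reward schemes concrete, the sketch below shows two of the per-step rewards named in the abstract, a profit-only reward and a risk-adjusted reward, in illustrative form. This is a minimal sketch, not the paper's implementation: the function names, the log-return formulation, and the trailing-window Sharpe-style scaling are assumptions for illustration only.

```python
import numpy as np

def profit_reward(prev_equity: float, equity: float) -> float:
    # Profit-only reward: raw log return of account equity over one step.
    return float(np.log(equity / prev_equity))

def risk_adjusted_reward(step_returns, eps: float = 1e-8) -> float:
    # Risk-adjusted reward (Sharpe-style): mean step return divided by
    # its volatility over a trailing window; eps avoids division by zero.
    r = np.asarray(step_returns, dtype=float)
    return float(r.mean() / (r.std() + eps))

# Toy equity path for a few trading steps (hypothetical numbers).
equity_curve = [100.0, 101.0, 100.5, 102.0]
step_returns = np.diff(np.log(equity_curve))

print(profit_reward(100.0, 101.0))        # single-step profit-only reward
print(risk_adjusted_reward(step_returns)) # window-based risk-adjusted reward
```

A profit-only agent is rewarded for any gain regardless of volatility, while the risk-adjusted variant penalises erratic equity paths, which is one way the choice of reward function can steer learned trading behaviour.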