The use of Reinforcement Learning in Algorithmic Trading

What are the impacts of different reward functions on the RL model's ability to learn and on its performance?

Bachelor Thesis (2025)
Author(s)

J. Bertasius (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Antonis Papapantoleon – Mentor (TU Delft - Applied Probability)

M.A. Sharifi Kolarijani – Mentor (TU Delft - Team Amin Sharifi Kolarijani)

N. Yorke-Smith – Mentor (TU Delft - Algorithmics)

Julia Olkhovskaya – Graduation committee member (TU Delft - Sequential Decision Making)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
24-06-2025
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Algorithmic trading already dominates modern financial markets, yet most live systems still rely on fixed heuristics that falter when conditions change. Deep reinforcement learning (RL) agents promise adaptive decision making, but their behaviour is driven entirely by the reward function, a design choice that remains under-researched. Given the critical importance of reward design in RL, this paper investigates how different reward functions affect an agent trading the EUR/USD pair in the Forex market. We explore several reward formulations: profit-only, risk-adjusted, multi-objective, imitation-learning based, and self-rewarding mechanisms. Overall, we assess whether careful reward engineering can boost performance and learning efficiency, and we highlight reward design as a critical and previously under-examined part of deploying reliable RL trading models.
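
The abstract contrasts several reward formulations; as a minimal illustrative sketch (not the thesis's actual implementation), the Python code below shows how two of them, a profit-only reward and a rolling risk-adjusted (Sharpe-style) reward, might be computed from an agent's step-wise portfolio values and returns. Function names, the trailing-window choice, and the example numbers are hypothetical.

import numpy as np

def profit_only_reward(equity_prev: float, equity_now: float) -> float:
    # Reward equal to the raw change in portfolio value over one step.
    return equity_now - equity_prev

def risk_adjusted_reward(returns: np.ndarray, eps: float = 1e-8) -> float:
    # Sharpe-style reward: mean step return divided by its volatility
    # over a trailing window of returns (hypothetical formulation).
    if len(returns) < 2:
        return 0.0
    return float(np.mean(returns) / (np.std(returns) + eps))

# Example usage with hypothetical step returns of a EUR/USD trading agent.
step_returns = np.array([0.002, -0.001, 0.003, 0.0005, -0.002])
print(profit_only_reward(10_000.0, 10_020.0))   # 20.0
print(risk_adjusted_reward(step_returns))       # Sharpe-like value over the window

The multi-objective, imitation-learning based, and self-rewarding variants discussed in the thesis would combine or replace such terms rather than use a single scalar signal like these.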

Files

Research_Paper_5_.pdf
(pdf | 4.92 Mb)
License info not available