The use of Reinforcement Learning in Algorithmic Trading

What are the impacts of different reward functions on the RL model's ability to learn and on its performance?

Bachelor Thesis (2025)
Author(s)

J. Bertasius (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Antonis Papapantoleon – Mentor (TU Delft - Applied Probability)

M.A. Sharifi Kolarijani – Mentor (TU Delft - Team Amin Sharifi Kolarijani)

N. Yorke-Smith – Mentor (TU Delft - Algorithmics)

Julia Olkhovskaya – Graduation committee member (TU Delft - Sequential Decision Making)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
24-06-2025
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Algorithmic trading already dominates modern financial markets, yet most live systems still rely on fixed heuristics that falter when conditions change. Deep reinforcement learning (RL) agents promise adaptive decision making, but their behaviour is driven entirely by the reward function, a design choice that remains under-researched. Given the critical importance of reward design in RL, this paper investigates how different reward functions affect an agent trading the EUR/USD pair in the Forex market. We explore several reward formulations: profit-only, risk-adjusted, multi-objective, imitation-learning based, and self-rewarding mechanisms. Overall, we assess whether careful reward engineering can boost performance and learning efficiency, and we highlight reward design as a critical and previously under-examined part of deploying reliable RL trading models.
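
The abstract contrasts several reward formulations; as a minimal illustrative sketch (not the thesis's actual implementation), the Python code below shows how two of them, a profit-only reward and a rolling risk-adjusted (Sharpe-style) reward, might be computed from an agent's step-wise portfolio values and returns. Function names, the trailing-window choice, and the example numbers are hypothetical.

import numpy as np

def profit_only_reward(equity_prev: float, equity_now: float) -> float:
    # Reward equal to the raw change in portfolio value over one step.
    return equity_now - equity_prev

def risk_adjusted_reward(returns: np.ndarray, eps: float = 1e-8) -> float:
    # Sharpe-style reward: mean step return divided by its volatility
    # over a trailing window of returns (hypothetical formulation).
    if len(returns) < 2:
        return 0.0
    return float(np.mean(returns) / (np.std(returns) + eps))

# Example usage with hypothetical step returns of a EUR/USD trading agent.
step_returns = np.array([0.002, -0.001, 0.003, 0.0005, -0.002])
print(profit_only_reward(10_000.0, 10_020.0))   # 20.0
print(risk_adjusted_reward(step_returns))       # Sharpe-like value over the window

The multi-objective, imitation-learning based, and self-rewarding variants discussed in the thesis would combine or replace such terms rather than use a single scalar signal like these.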

Files

Research_Paper_5_.pdf
(pdf | 4.92 Mb)
License info not available