Imitation learning from neural networks with continuous action spaces using regression trees
T.S. Cichocki (TU Delft - Electrical Engineering, Mathematics and Computer Science)
D.A. Vos – Mentor (TU Delft - Algorithmics)
A. Lukina – Graduation committee member (TU Delft - Algorithmics)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Reinforcement learning models are being utilised in a wide range of industries where even minor mistakes can have severe consequences. For safety reasons, it is important that a human expert can verify a model's decision-making process; this is where interpretable reinforcement learning proves its importance. This research focuses on training size-limited decision tree policies and evaluating them on continuous action space environments. To this end, the DAgger algorithm is used with modifications that account for the continuous setting. The results demonstrate that small decision trees can replicate high-performing neural network policies (e.g., TD3), achieving scores close to the benchmarks. It is therefore possible to explain a complex model's behaviour with far more understandable structures.
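The imitation procedure described above can be sketched as a DAgger-style loop: roll out the current student policy, label the visited states with the expert's continuous actions, and refit a size-limited regression tree on the aggregated dataset. This is a minimal illustrative sketch, not the thesis implementation: the toy environment dynamics, the `expert_policy` stand-in for a trained TD3 network, and the 16-leaf limit are all assumptions for demonstration.

```python
# Hedged sketch of DAgger for a continuous-action expert, imitated by a
# small regression tree. The expert and dynamics below are toy stand-ins
# (assumptions), not the TD3 policies or benchmark environments of the paper.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def expert_policy(state):
    # Stand-in for a trained neural network policy: a smooth mapping from
    # a 2-D state to a scalar action in [-1, 1].
    return float(np.tanh(state @ np.array([0.5, -0.3])))

def rollout(policy, n_steps=200):
    # Toy dynamics: the state drifts under the chosen action plus noise.
    states, s = [], rng.normal(size=2)
    for _ in range(n_steps):
        states.append(s.copy())
        a = policy(s)
        s = 0.9 * s + 0.1 * np.array([a, -a]) + 0.05 * rng.normal(size=2)
    return np.array(states)

# Initial dataset: states from expert rollouts, labelled by the expert.
X = rollout(expert_policy)
y = np.array([expert_policy(s) for s in X])
tree = DecisionTreeRegressor(max_leaf_nodes=16).fit(X, y)

# DAgger iterations: collect states under the *student*, but always label
# them with the expert's continuous actions, then refit the tree.
for _ in range(5):
    student = lambda s: float(tree.predict(s.reshape(1, -1))[0])
    new_X = rollout(student)
    new_y = np.array([expert_policy(s) for s in new_X])
    X, y = np.vstack([X, new_X]), np.concatenate([y, new_y])
    tree = DecisionTreeRegressor(max_leaf_nodes=16).fit(X, y)
```

Aggregating states visited by the student (rather than only the expert) is the key modification DAgger makes over plain behavioural cloning: the tree is trained on the state distribution it will actually encounter, which mitigates compounding errors; the continuous setting simply replaces a classifier with a regressor.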