Discretising Continuous Action Spaces for Optimal Decision Trees

None, None

Discretising Continuous Action Spaces for Optimal Decision Trees

Verifiable Policies for Continuous Environments in Reinforcement Learning

Bachelor Thesis (2025)

Author(s)

M.M.J. van der Kuil (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

D.A. Vos – Mentor (TU Delft - Algorithmics)

A. Lukina – Graduation committee member (TU Delft - Algorithmics)

L. Cavalcante Siebert – Graduation committee member (TU Delft - Interactive Intelligence)

Faculty

Electrical Engineering, Mathematics and Computer Science

Reinforcement Learning Explainable AI Decision Trees Discretisation Techniques Search Algorithms

To reference this document use:

https://resolver.tudelft.nl/uuid:d2514471-e98c-4e0a-9186-39af68eae826

More Info

expand_more

Publication Year

2025

Language

English

Graduation Date

24-06-2025

Awarding Institution

Delft University of Technology

Project

['CSE3000 Research Project']

Programme

['Computer Science and Engineering']

Faculty

Electrical Engineering, Mathematics and Computer Science

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Complex reinforcement learning (RL) models that receive high rewards in their environments are often hard to understand. To this end, more interpretable models can be used, such as decision trees. To be able to deploy these models in safety-critical environments, they need to be high-performing and verifiable. Optimal decision trees can fulfill both these goal. While some methods already exist to find optimal decision trees, none have applied them to RL environments with continuous state
and action spaces [4; 13]. Broccoli is a state-of-the-art approach to synthesizing decision tree policies for blackbox environments [3]. Given a discretisation of the continuous state space, it can find the optimal tree in discrete action environments.
This research will focus on discretising action spaces of continuous environments. In doing so, discrete action spaces are created for which Broccoli can find optimal decision trees. Additionally, a goal is to show that it is possible to discretise continuous action spaces and compute corresponding optimal decision trees in feasible time. Furthermore, it proposes informed discretisation techniques that result in better-performing decision tree policies.

Files

Research_Project_Main_paper.pd... (pdf)

(pdf | 0.513 Mb)

License info not available