Can we use LLMs for abstraction in MDPs?

A deep dive into the potential and limitations of LLMs

Master Thesis (2025)
Author(s)

D. Lentschig (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

F.A. Oliehoek – Mentor (TU Delft - Sequential Decision Making)

J. He – Mentor (TU Delft - Sequential Decision Making)

P.K. Murukannaiah – Graduation committee member (TU Delft - Interactive Intelligence)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2025
Language
English
Graduation Date
22-10-2025
Awarding Institution
Delft University of Technology
Programme
Electrical Engineering | Embedded Systems
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This thesis explores whether Large Language Models (LLMs) can generate abstractions in Markov Decision Processes (MDPs) to reduce the complexity of planning with Monte Carlo Tree Search (MCTS). A complete pipeline was developed to extract and validate cluster-based abstractions from LLMs. The pipeline combines modular prompt engineering, post-processing, and evaluation through both structural-similarity and performance metrics. Experiments in gridworld environments show that DeepSeek-R1 models consistently outperform LLaMA models, with architecture and training proving more important than parameter count. Structured prompts, especially those using JSON representations and rationale-driven responses, significantly improved abstraction quality. While LLMs can approximate, and sometimes even recover, the ideal abstractions in simple environments, performance deteriorates in larger or less regular domains. These findings highlight both the potential and the current limitations of LLM-based abstraction, and suggest directions for future research, including more complex environments, richer abstraction types, and advanced prompting strategies.
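The cluster-based abstractions and structural-similarity evaluation described in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's actual pipeline: the gridworld size, the row-based "LLM-proposed" clustering, and the pairwise agreement score are all assumptions chosen for clarity.

```python
from itertools import product

# Hypothetical 4x4 gridworld: ground states are (row, col) cells.
GRID = 4
states = list(product(range(GRID), range(GRID)))

# A cluster-based abstraction maps each ground state to a cluster id.
# Here the "LLM-proposed" abstraction groups states by row (an assumption
# for illustration; the thesis extracts such mappings from model output).
llm_abstraction = {s: s[0] for s in states}

def is_valid_partition(abstraction, states):
    """Post-processing check: every ground state is assigned a cluster."""
    return set(abstraction) == set(states)

def pairwise_agreement(a, b, states):
    """Fraction of state pairs on which two abstractions agree about
    being co-clustered -- a simple structural-similarity proxy
    (the thesis's actual metrics may differ)."""
    pairs = [(s, t) for i, s in enumerate(states) for t in states[i + 1:]]
    agree = sum((a[s] == a[t]) == (b[s] == b[t]) for s, t in pairs)
    return agree / len(pairs)

# Reference abstraction for comparison: also rows, so agreement is perfect.
reference = {s: s[0] for s in states}
print(is_valid_partition(llm_abstraction, states))             # True
print(pairwise_agreement(llm_abstraction, reference, states))  # 1.0
```

In practice the candidate abstraction would come from parsing an LLM's (e.g. JSON-formatted) response rather than being hand-written, and an abstraction that passes validation would then be scored both structurally and by the planning performance MCTS achieves on the abstract MDP.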
