Program Synthesis from Rewards using Probe and FrAngel

Impact of Exploration-Exploitation Configurations on Probe and FrAngel in Minecraft

Bachelor Thesis (2024)
Author(s)

N. Filat (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Sebastijan Dumančić – Mentor (TU Delft - Algorithmics)

T.R. Hinnerichs – Mentor (TU Delft - Algorithmics)

Wendelin Böhmer – Graduation committee member (TU Delft - Sequential Decision Making)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
21-06-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Program synthesis is the task of finding a program that satisfies the user's intent, which is typically specified through input/output examples or formal mathematical specifications. This paper explores a novel form of specification for program synthesis: learning from rewards.
We apply two existing synthesizers, Probe and FrAngel, to navigation tasks in the popular game Minecraft. The problem formulation is inspired by reinforcement learning and adapted to program synthesis. As in reinforcement learning, balancing exploration and exploitation is essential for solving a task efficiently. Excessive exploration can prevent the synthesizer from finding a correct program, because feedback from the environment goes unused. Excessive exploitation is also harmful, as seemingly promising programs might not lead to an actual solution. This work compares different exploration-exploitation trade-offs for Probe and FrAngel when applied to Minecraft environments.

Files

Final_Paper.pdf
(pdf | 0.746 Mb)
License info not available