What are the implications of Curriculum Learning strategy on IRL methods?
Investigating Inverse Reinforcement Learning from Human Behavior
M. Vlasenko (TU Delft - Electrical Engineering, Mathematics and Computer Science)
L. Cavalcante Siebert – Mentor (TU Delft - Interactive Intelligence)
A. Caregnato Neto – Mentor (TU Delft - Interactive Intelligence)
J.M. Weber – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)
Abstract
Inverse Reinforcement Learning (IRL) is a subfield of Reinforcement Learning (RL) that focuses on recovering the reward function from expert demonstrations. Within IRL, Adversarial IRL (AIRL) is a promising algorithm that is postulated to recover non-linear rewards in environments with unknown dynamics. This study investigates the potential benefits of applying a Curriculum Learning (CL) strategy to the AIRL algorithm. For our experiments, we use a randomized partially observable Markov decision process in the form of a grid-world-like environment. Using only expert demonstrations obtained with an RL algorithm under the true reward function, we train AIRL in a variety of configurations and identify an effective curriculum. Our results show that a well-constructed curriculum can improve the performance of AIRL twofold in two key aspects: the speed of convergence and the efficiency with which expert demonstrations are used. We thus conclude that CL can be a useful addition to an AIRL-based solution. The full code is available online in the supplementary material: https://github.com/mikhail-vlasenko/curriculum-learning-IRL.
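To illustrate the general idea of a curriculum over environment difficulty, here is a minimal sketch of a stage schedule that presents easier tasks (e.g. smaller grids) early in training and harder ones later. The class and parameter names are hypothetical and not taken from the thesis; the actual curriculum construction used with AIRL is described in the full text.

```python
class CurriculumSchedule:
    """Hypothetical sketch: map a training step to a curriculum stage.

    Stages are ordered easiest-first (here, grid side lengths); training
    advances to the next stage after a fixed number of steps.
    """

    def __init__(self, stages, steps_per_stage):
        self.stages = stages                # e.g. grid sizes, easiest first
        self.steps_per_stage = steps_per_stage

    def stage_for(self, step):
        # Integer-divide the step count by the stage length,
        # capping at the final (hardest) stage.
        idx = min(step // self.steps_per_stage, len(self.stages) - 1)
        return self.stages[idx]


schedule = CurriculumSchedule(stages=[4, 6, 8], steps_per_stage=1000)
print(schedule.stage_for(0))      # easiest grid at the start
print(schedule.stage_for(2500))   # hardest grid once past all boundaries
```

In practice such a schedule would be queried when resetting the environment, so each new episode is generated at the difficulty of the current stage; more elaborate curricula advance stages based on learner performance rather than a fixed step budget.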