Inverse Reinforcement Learning (IRL) in Presence of Risk and Uncertainty Related Cognitive Biases

To what extent can IRL learn rewards from expert demonstrations with loss and risk aversion?

Bachelor Thesis (2023)

Author(s)

M. Ikiz (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

A. Caregnato Neto – Mentor (TU Delft - Interactive Intelligence)

Luciano C. Siebert – Mentor (TU Delft - Interactive Intelligence)

J.M. Weber – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)

Faculty

Electrical Engineering, Mathematics and Computer Science

Keywords

Reinforcement Learning, Inverse Reinforcement Learning, Cognitive biases

To reference this document use:

https://resolver.tudelft.nl/uuid:1571ca1f-3f46-4cf1-982e-b012316644f8

Publication Year

2023

Language

English

Graduation Date

29-06-2023

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

A key issue in Reinforcement Learning (RL) research is the difficulty of defining rewards. Inverse Reinforcement Learning (IRL) addresses this challenge by learning the rewards from expert demonstrations. In a realistic setting, expert demonstrations are collected from humans, and these demonstrations can deviate from rationality because of systematic errors known as cognitive biases. One group of these, the risk-sensitive cognitive biases, pertains to individuals' attitudes and behaviors towards risk and uncertainty. This paper investigates the extent to which IRL can learn from demonstrations that contain risk-sensitive cognitive biases such as loss aversion and risk aversion. The biases are modelled using concepts from Prospect Theory and the System 1 and System 2 model, and rewards are learned with the Maximum Entropy IRL algorithm. The paper concludes that IRL can recreate solutions similar to the experts', but that inferring the underlying motivations and the interactions between them is an intricate problem that requires novel approaches.

Files

BachelorThesis.pdf

(pdf | 1.74 MB)

License info not available