One-Shot Generalization in Offline Reinforcement Learning with WSAC-N

Bachelor Thesis (2024)

Author(s)

M.D.I. Museur (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

M.R. Weltevrede – Mentor (TU Delft - Sequential Decision Making)

MTJ Spaan – Mentor (TU Delft - Sequential Decision Making)

Elena Congeduti – Graduation committee member (TU Delft - Computer Science & Engineering-Teaching Team)

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use:

https://resolver.tudelft.nl/uuid:af3a0321-2f1a-4839-901f-eb906b63df34

Publication Year

2024

Language

English

Graduation Date

27-06-2024

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Recent work has shown that offline reinforcement learning (RL) generalizes worse to new environments than behavioral cloning (BC). We propose WSAC-N, an ensemble of soft actor-critic models that uses weights to de-emphasize actions with high variance. We compare the zero-shot generalization of WSAC-N against a BC baseline in a four-room, maze-like environment, testing on unseen tasks. Our findings indicate that WSAC-N generalizes worse than BC in the zero-shot setting, in line with previous work. We also investigate the impact of dataset characteristics on generalization and find that dataset size has a negligible impact, while trajectory quality generally has a positive effect. These results are consistent with prior research.

Files

Research_Project_Final.pdf

(pdf | 1.31 MB)

License info not available