- document
-
Peschl, M. (author), Zgonnikov, A. (author), Oliehoek, F.A. (author), Cavalcante Siebert, L. (author)Inferring reward functions from demonstrations and pairwise preferences are auspicious approaches for aligning Reinforcement Learning (RL) agents with human intentions. However, state-of-the art methods typically focus on learning a single reward model, thus rendering it difficult to trade off different reward functions from multiple experts. We...conference paper 2022
- document
-
Peschl, M. (author)We propose a deep reinforcement learning algorithm that employs an adversarial training strategy for adhering to implicit human norms alongside optimizing for a narrow goal objective. Previous methods which incorporate human values into reinforcement learning algorithms either scale poorly or assume hand-crafted state features. Our algorithm...conference paper 2021