Facial Feedback for Reinforcement Learning: A Case Study and Offline Analysis Using the TAMER Framework

Conference Paper (2021)
Author(s)

Guangliang Li (Ocean University of China)

Shimon Whiteson (University of Oxford)

Hamdi Dibeklioğlu (Bilkent University)

H.S. Hung (TU Delft - Pattern Recognition and Bioinformatics)

Research Group
Pattern Recognition and Bioinformatics
Copyright
© 2021 Guangliang Li, Shimon Whiteson, Hamdi Dibeklioğlu, H.S. Hung
Publication Year
2021
Language
English
Pages (from-to)
1735-1737
ISBN (electronic)
9781450383073
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Interactive reinforcement learning provides a way for agents to learn to solve tasks from evaluative feedback provided by a human user. Previous research showed that humans give copious feedback early in training but very sparsely thereafter. In this paper, we investigate the potential of agents learning from trainers' facial expressions by interpreting them as evaluative feedback. To do so, we implemented TAMER, a popular interactive reinforcement learning method, in a reinforcement-learning benchmark problem, Infinite Mario, and conducted the first large-scale study of TAMER, involving 561 participants. Using a CNN-RNN model we designed, our analysis shows that instructing trainers to use facial expressions and introducing competition can improve the accuracy of estimating positive and negative feedback from facial expressions. In addition, our results from a simulation experiment show that learning solely from feedback predicted from facial expressions is possible and that, with stronger prediction models or a regression method, facial responses could significantly improve agent performance. Furthermore, our experiment supports previous studies demonstrating the importance of bi-directional feedback and competitive elements in the training interface.
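To make the setting concrete, below is a minimal, illustrative sketch of a TAMER-style agent that learns a model of human reward and acts greedily with respect to it, with the feedback signal supplied by a facial-expression predictor. This is not the authors' implementation; the linear features, learning rate, and the `predict_facial_feedback` interface are hypothetical placeholders used only to show how predicted facial feedback could stand in for explicit key-press feedback.

```python
import numpy as np

def features(state, action, n_actions):
    """One-hot action indicator stacked with the state features."""
    phi = np.zeros(len(state) * n_actions)
    phi[action * len(state):(action + 1) * len(state)] = state
    return phi

class TamerAgent:
    """TAMER-style agent: learns H_hat(s, a), a model of human reward,
    and selects the action that maximizes it (sketch, not the paper's code)."""

    def __init__(self, n_state_features, n_actions, lr=0.01):
        self.n_actions = n_actions
        self.w = np.zeros(n_state_features * n_actions)  # weights of H_hat
        self.lr = lr

    def act(self, state):
        # Greedy action selection with respect to the predicted human reward.
        values = [self.w @ features(state, a, self.n_actions)
                  for a in range(self.n_actions)]
        return int(np.argmax(values))

    def update(self, state, action, human_reward):
        # Gradient step moving H_hat(s, a) toward the received feedback.
        phi = features(state, action, self.n_actions)
        error = human_reward - self.w @ phi
        self.w += self.lr * error * phi

# Training-loop sketch: feedback comes from a facial-expression model rather
# than explicit key presses. `predict_facial_feedback` is a hypothetical
# placeholder returning a scalar in {-1, 0, +1}.
#
#   action = agent.act(state)
#   feedback = predict_facial_feedback(face_frames)
#   if feedback != 0:
#       agent.update(state, action, feedback)
```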

Files

P1735.pdf
(pdf | 0.919 MB)
License info not available