Natural Language Processing and Reinforcement Learning to Generate Morally

Title

Natural Language Processing and Reinforcement Learning to Generate Morally: What is the optimal weight w to win the games while playing morally?

Author

Boudier, Kenzo (TU Delft Electrical Engineering, Mathematics and Computer Science)

Contributor

Liscio, E. (mentor)
Mambelli, D. (mentor)
Murukannaiah, P.K. (mentor)
Yang, J. (graduation committee)

Degree granting institution

Delft University of Technology

Programme

Computer Science and Engineering

Project

CSE3000 Research Project

Date

2023-07-03

Abstract

In everyday life, people increasingly interact with agents. However, these agents often lack a moral sense and prioritize accomplishing the task they are given. As a consequence, agents may unknowingly act immorally. Little research has been done, and little progress made, toward endowing agents with human morality and an internal sense of right and wrong. To date, agents have only a primitive representation of morality, often reduced to a single value. In contrast, humans have multiple reasons for judging an action as moral. In the hope of creating agents imbued with a more complex, human-like morality, we build upon the Jiminy Cricket environment. This preexisting environment contains multiple games with diverse scenarios, and the objective is to take the most moral action in order to maximize the reward.

To reference this document use:

http://resolver.tudelft.nl/uuid:9b72f10d-577e-4433-8b10-15cf964112d5

Part of collection

Student theses

Document type

bachelor thesis

Rights

© 2023 Kenzo Boudier

Files

PDF

CSE3000_Final_Report_2_.pdf

488.7 KB
