Title: Natural Language Processing and Reinforcement Learning to Generate Morally Aligned Text: Comparing a moral agent to an optimally playing agent
Author: Lubbers, Rob (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributors: Liscio, E. (mentor); Mambelli, D. (mentor); Murukannaiah, P.K. (mentor); Yang, J. (mentor)
Degree granting institution: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2023-07-03

Abstract: Large Language Models are becoming increasingly prevalent in society. However, these models act without a sense of morality; they only prioritize accomplishing their goal, and little research has evaluated them in this respect. Current state-of-the-art Reinforcement Learning models represent the morality of a statement as a single scalar value. This representation is inaccurate, since multiple features determine how moral a statement is. We leverage Moral Foundations Theory to represent morality more accurately, using a 5-dimensional vector of morality features. We implement several agents in an environment where decisions with possible moral implications must be made, each using a different action-selection policy: one always picks the most moral action, one always picks the most immoral action, two follow the same policies but also give some weight to game progression, and a final amoral agent ignores morality entirely. We compare these agents by percentage completion of the Infocom game Suspect. We find that the agent that does not take morality into account achieves the highest completion rate.
Agents that give morality a large weight almost immediately get stuck in an infinite loop without progression.

Subjects: Natural Language Processing; Reinforcement Learning; Morality; Ethics
To reference this document use: http://resolver.tudelft.nl/uuid:38b1be58-89dc-4d97-b003-6185b111a06d
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2023 Rob Lubbers
Files: CSE3000_Final_Paper.pdf (PDF, 281.77 KB)
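The action-selection scheme described in the abstract (a 5-dimensional moral vector per Moral Foundations Theory, optionally blended with a game-progression term) could be sketched as below. All function names, the mean-based scoring, and the example actions and values are illustrative assumptions, not the thesis's actual implementation:

```python
import numpy as np

# Moral Foundations Theory dimensions (care, fairness, loyalty, authority, purity).
MFT_DIMS = 5

def moral_score(moral_vec: np.ndarray) -> float:
    """Collapse a 5-D moral-foundations vector into one comparable scalar.
    (Illustrative: an unweighted mean; the thesis may combine dimensions differently.)"""
    return float(np.mean(moral_vec))

def select_action(actions, moral_vecs, progress_vals, alpha=1.0, beta=0.0):
    """Pick an action by a weighted blend of morality and game progression.

    alpha=1,  beta=0 -> 'most moral' policy
    alpha=-1, beta=0 -> 'most immoral' policy
    alpha=0,  beta=1 -> amoral policy (progression only)
    Nonzero alpha and beta -> a moral policy with some weight on progression.
    """
    scores = [alpha * moral_score(m) + beta * p
              for m, p in zip(moral_vecs, progress_vals)]
    return actions[int(np.argmax(scores))]

# Hypothetical candidate actions with made-up moral/progression annotations.
actions = ["give the detective the evidence", "steal the evidence"]
moral_vecs = [np.array([0.9, 0.8, 0.5, 0.6, 0.5]),
              np.array([0.1, 0.2, 0.4, 0.3, 0.5])]
progress = [0.2, 0.8]

print(select_action(actions, moral_vecs, progress, alpha=1.0, beta=0.0))
# moral policy -> "give the detective the evidence"
print(select_action(actions, moral_vecs, progress, alpha=0.0, beta=1.0))
# amoral policy -> "steal the evidence"
```

The sketch also makes the reported failure mode plausible: with a large alpha and near-zero beta, the agent can keep choosing a "safe" moral action that never advances the game state.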