Balancing multidimensional morality and progression

Evaluating the tradeoff for artificial agents playing text-based games

Abstract

Morality is a fundamental concept guiding human decision-making. Given the growing role of large language models in society, it is necessary to ensure that they adhere to human principles, among which morality is of substantial importance. While research exists on artificial agents behaving morally, current state-of-the-art implementations treat morality as one-dimensional, failing to capture its complexity and nuance. To address this, a multidimensional representation of morality is proposed, with each dimension corresponding to a different moral foundation. The performance of three types of artificial agents tasked with choosing actions while playing text-based games is then compared and analysed. The first agent chooses only the most moral action, without aiming to win the games; the second prioritizes moral actions over game progression; and the third strives to win the games while also playing morally. The third agent outperforms the others in terms of game progression while also taking few immoral actions. However, the agent prioritizing morality over progression performs only slightly worse while taking no immoral actions, showing that artificial agents can perform well while also behaving morally.
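The three agent types described above can be sketched as action-selection policies over a multidimensional moral score. The sketch below is illustrative only and not the thesis's actual implementation: the foundation names, the `Action` structure, the aggregation by summation, and the `weight` parameter are all assumptions introduced here for clarity.

```python
from dataclasses import dataclass

# Hypothetical moral-foundation dimensions (assumed, not from the source)
FOUNDATIONS = ("care", "fairness", "loyalty", "authority", "sanctity")

@dataclass
class Action:
    name: str
    moral: dict        # score per foundation; negative values mark violations
    progression: float # estimated contribution toward winning the game

def moral_vector(**scores):
    # Fill in zero for any foundation not explicitly scored
    return {f: scores.get(f, 0.0) for f in FOUNDATIONS}

def morality(a: Action) -> float:
    # Aggregate the multidimensional profile; a plain sum is one simple choice
    return sum(a.moral[f] for f in FOUNDATIONS)

def moral_only(actions):
    # Agent 1: pick the most moral action, ignoring game progression
    return max(actions, key=morality)

def moral_first(actions):
    # Agent 2: lexicographic -- morality first, progression as tie-breaker
    return max(actions, key=lambda a: (morality(a), a.progression))

def progress_morally(actions, weight=0.5):
    # Agent 3: maximize progression while penalizing moral violations
    return max(actions, key=lambda a: a.progression + weight * morality(a))

candidates = [
    Action("help npc",  moral_vector(care=1.0),      progression=0.2),
    Action("steal key", moral_vector(fairness=-1.0), progression=1.0),
    Action("open door", moral_vector(),              progression=0.8),
]
```

On this toy candidate set, the first two agents choose `"help npc"` (the most moral action), while the third trades a little morality for progression and chooses `"open door"`, illustrating the tradeoff the abstract evaluates.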
