Title: Evaluating interpretability of state-of-the-art NLP models for predicting moral values
Author: Constantinescu, Ionut (TU Delft Electrical Engineering, Mathematics and Computer Science; TU Delft Intelligent Systems)
Contributors: Liscio, E. (mentor); Murukannaiah, P.K. (mentor); Marroquim, Ricardo (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2021-07-02

Abstract: Understanding personal values is crucial for facilitating collaboration between AI and humans. However, deploying collaborative agents in real life depends heavily on the trust built in their relationship with people. To bridge this gap, more extensive analysis of the explainability of these systems is needed. We implement three deep learning models for text classification, LSTM, BERT, and FastText, and compare their interpretability on the task of predicting moral values from opinionated text. The results highlight the differing degrees to which the behaviour of the three models can be explained in the context of moral value prediction. Our experiments show that BERT, the current state of the art in natural language processing, achieves the best performance while also providing more interpretable predictions than the other two models.

Subjects: Moral foundations; Moral values; Natural Language Processing; Explainable AI
To reference this document use: http://resolver.tudelft.nl/uuid:f8560b2b-8831-4c79-923a-9de785aa3c85
Embargo date: 2022-12-31
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2021 Ionut Constantinescu
Files: Research_Paper_Ionut_Cons ... inescu.pdf (PDF, 1.11 MB)