Moral Embeddings: A closer look at their Performance, Generalizability and Transferability

Abstract

Moral values are abstract ideas that ground our judgements of what is right or wrong. With the rapid spread of moral rhetoric on social media, it becomes increasingly important to place these ideas in a moral frame, contain their harmful effects, and recognise their positive ones. So far, estimating values from opinionated text has posed a challenge because values are abstract and subjective. However, with the latest developments in Natural Language Processing (NLP), we foresee an opportunity to align the study of morality in text with state-of-the-art NLP architectures. The recently published Moral Foundations Twitter Corpus is a milestone for moral classification tasks: it offers a dataset that allows a closer look at how people express moral narratives on social media. In a text classification pipeline, embeddings convert words and sentences into meaningful vectors. Pre-trained on large corpora, they can be fine-tuned and domain-adapted. Starting from the available dataset, this study proposes a refinement model that learns to capture moral information in Sentence-BERT embeddings by applying a state-of-the-art supervised method (triplet loss). We further demonstrate how the refined embeddings improve the accuracy of moral classifiers. Finally, with an improvement of 5% in F1-score over models that use pre-trained embeddings, we pave the way towards a generalisable and transferable set of moral embeddings.
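
The triplet loss named above can be illustrated with a minimal sketch. This is not the project's implementation: the Euclidean distance, the margin of 1.0, the function names, and the two-dimensional toy vectors are all assumptions made for illustration, and real Sentence-BERT embeddings have many more dimensions. The idea is that an anchor text should end up closer to a text with the same moral label (the positive) than to one with a different label (the negative), by at least the margin.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(d(a, p) - d(a, n) + margin, 0).

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (a hypothetical default here).
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)


# Toy vectors standing in for sentence embeddings (illustrative only).
anchor = [0.0, 0.0]
positive = [3.0, 4.0]        # same moral label, distance 5 from the anchor
far_negative = [6.0, 8.0]    # different label, distance 10 from the anchor
near_negative = [0.0, 1.0]   # different label, distance 1 from the anchor

loss_satisfied = triplet_loss(anchor, positive, far_negative)
loss_violated = triplet_loss(anchor, positive, near_negative)
```

With the far negative the constraint is already met and the loss is zero. With the near negative the loss is positive, which during training would push the embedding model to move the negative away from the anchor and pull the positive closer.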

Files

Research_Project_Final.pdf
(.pdf | 0.374 Mb)
- Embargo expired on 31-12-2022