Decoding Sentiment with Large Language Models

Comparing Prompting Strategies Across Hard, Soft, and Subjective Label Scenarios

Bachelor Thesis (2024)
Author(s)

T. Oberhuber (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Luciano Cavalcante Siebert – Mentor (TU Delft - Interactive Intelligence)

Amir Homayounirad – Mentor (TU Delft - Interactive Intelligence)

Enrico Liscio – Mentor (TU Delft - Interactive Intelligence)

J. Yang – Graduation committee member (TU Delft - Web Information Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
27-06-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward, or distribute the text or any part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

This study evaluates the performance of different sentiment analysis methods in the context of public deliberation, focusing on hard-, soft-, and subjective-label scenarios to answer the research question: "Can a Large Language Model detect the subjective sentiment of statements within the context of public deliberation?" An affirmative answer would be a strong indicator that, with the help of longitudinal studies, sentiment analysis with large language models (LLMs) could be used to scale public deliberation by supporting moderators in such discussions. To answer this question, four prompting methods were tested: zero-shot, few-shot, chain-of-thought (CoT) zero-shot, and CoT few-shot, using a Frisian dataset of 50 statements annotated by 5 annotators. The findings indicate that the CoT few-shot method significantly outperforms the other methods in all scenarios, that soft labels outperform their hard equivalents, that the underlying data must be balanced for high-performing models, and that capturing the perspective of a specific annotator requires further research. Our study suggests that LLMs may perform best under the supervision of, or in collaboration with, a human, due to the multifaceted nature of sentiment.
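To make the four prompting strategies concrete, the sketch below builds one prompt template per method. This is an illustrative reconstruction, not the thesis's actual prompts: the label set, the example statements, and the reasoning chains are hypothetical placeholders and are not taken from the Frisian dataset.

```python
# Hypothetical sketch of the four prompting strategies compared in the study.
# All statements, labels, and reasoning chains below are illustrative only.

LABELS = ["positive", "neutral", "negative"]

def zero_shot(statement: str) -> str:
    # Ask for a label directly, with no examples.
    return (f"Classify the sentiment of the statement as one of: "
            f"{', '.join(LABELS)}.\nStatement: {statement}\nSentiment:")

def few_shot(statement: str, examples: list[tuple[str, str]]) -> str:
    # Prepend labelled (statement, sentiment) demonstrations.
    shots = "\n".join(f"Statement: {s}\nSentiment: {l}" for s, l in examples)
    return (f"Classify the sentiment of each statement as one of: "
            f"{', '.join(LABELS)}.\n{shots}\n"
            f"Statement: {statement}\nSentiment:")

def cot_zero_shot(statement: str) -> str:
    # Zero-shot plus an instruction to reason before answering.
    return zero_shot(statement) + " Let's think step by step."

def cot_few_shot(statement: str,
                 examples: list[tuple[str, str, str]]) -> str:
    # Each demonstration carries a short reasoning chain before its label.
    shots = "\n".join(f"Statement: {s}\nReasoning: {r}\nSentiment: {l}"
                      for s, r, l in examples)
    return (f"Classify the sentiment of each statement as one of: "
            f"{', '.join(LABELS)}. Explain your reasoning first.\n{shots}\n"
            f"Statement: {statement}\nReasoning:")

if __name__ == "__main__":
    prompt = cot_few_shot(
        "The new bike lanes made my commute much safer.",
        [("Traffic noise keeps me up all night.",
          "The speaker describes an ongoing nuisance.", "negative")],
    )
    print(prompt)
```

For the soft-label scenario, the same templates could instead request a probability per label rather than a single choice; the hard-label versions above are shown because they are the simplest to compare across methods.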
