Contrastive Self-Explanation Method (CoSEM): Generating Large Language Model Contrastive Self-Explanations

Master Thesis (2024)
Author(s)

R. Kargul (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

J. Yang – Mentor (TU Delft - Web Information Systems)

S.E. Carter – Mentor (TU Delft - Web Information Systems)

S.N.R. Buijsman – Mentor (TU Delft - Ethics & Philosophy of Technology)

Maria S. Pera – Graduation committee member (TU Delft - Web Information Systems)

M.L. Tielman – Graduation committee member (TU Delft - Interactive Intelligence)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
30-09-2024
Awarding Institution
Delft University of Technology
Programme
Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Large language models (LLMs) are widely used tools that assist us by answering a broad range of questions. Humans naturally think about and seek explanations in contrastive terms (i.e., "Why A and not B?"). Explainability remains a challenging aspect of LLMs, as we cannot truly assess how good their answers are. The open question is to what extent LLMs can generate effective contrastive self-explanations for users. We introduce the Contrastive Self-Explanation Method (CoSEM) to narrow this gap: it generates contrastive self-explanations and evaluates them, through automated analysis and a user study, on generality, usefulness, readability, and relevance. Our results indicate that LLMs are capable of generating effective contrastive self-explanations. Lexical analysis indicates that the explanations are no less general than the text they explain, and semantic analysis shows that more complex models generalize self-explanations more consistently. Although contrast in self-explanations is difficult to evaluate semantically, the user study shows that some models (Llama3-8B) help users understand the contrast. Moreover, task selection affects how readable users find the explanations: self-explanations on general topics (movie reviews) are more readable than those on specialized topics (medical diagnoses). Lastly, some models, such as Llama3-8B, excel at generating contrastive self-explanations that contain information relevant to the input text.
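To make the core idea concrete, the sketch below shows one way a "Why A and not B?" prompt could be posed back to the model that produced the original answer. It is a minimal illustration only; the prompt wording, the label names, and the example review are assumptions for exposition, not the exact templates used in the thesis.

```python
# Minimal sketch of eliciting a contrastive self-explanation
# ("Why A and not B?") from an LLM. The prompt template below is an
# illustrative assumption, not the thesis's actual wording.

def build_contrastive_prompt(input_text: str, fact: str, foil: str) -> str:
    """Ask the model to explain its own answer (fact) against a contrast
    case (foil) for the given input text."""
    return (
        f"Text: {input_text}\n"
        f"You classified this text as '{fact}'.\n"
        f"Explain why you chose '{fact}' and not '{foil}'."
    )

review = "The plot was predictable, but the performances were superb."
prompt = build_contrastive_prompt(review, "positive", "negative")
print(prompt)  # this prompt would be sent to the same model
# (e.g., Llama3-8B) to obtain its contrastive self-explanation
```

The resulting explanation text can then be scored automatically (e.g., lexically and semantically for generality) and shown to users for the usefulness, readability, and relevance judgments described above.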

Files

CoSEM.pdf
(pdf | 1.94 MB)
License info not available