Correctness is not Faithfulness in Retrieval Augmented Generation Attributions

Conference Paper (2025)
Author(s)

Jonas Wallat (L3S)

Maria Heuss (Universiteit van Amsterdam)

Maarten Rijke (Universiteit van Amsterdam)

A. Anand (TU Delft - Web Information Systems)

Research Group
Web Information Systems
DOI related publication
https://doi.org/10.1145/3731120.3744592
More Info
expand_more
Publication Year
2025
Language
English
Research Group
Web Information Systems
Pages (from-to)
22-32
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Large language models (LLMs) have transformed information retrieval through chat interfaces, but their hallucination tendencies pose significant risks. While Retrieval Augmented Generation (RAG) with citations has emerged as a solution by allowing users to verify responses through source attribution, current evaluation approaches focus primarily on citation correctness - whether cited documents support the corresponding statements. This is insufficient and we introduce citation faithfulness - whether the model's reliance on cited documents is genuine rather than post-rationalized to fit pre-existing knowledge. Our contributions are threefold: (i) we introduce coherent notions of attribution and introduce the concept of citation faithfulness; (ii) we propose desiderata for citations beyond correctness and accuracy needed for trustworthy systems; and (iii) we emphasize evaluating citation faithfulness by studying post-rationalization. Through experimentation, we reveal prevalent post-rationalization issues, finding that up to 57% of citations lack faithfulness. This undermines reliable attribution and may result in misplaced trust, highlighting a critical gap in current LLM-based IR systems. We demonstrate why both citation correctness and faithfulness must be considered when deploying LLMs in IR applications, contributing to a broader discussion of building more reliable and transparent information access systems.