Large language models (LLMs) have demonstrated impressive performance on knowledge-intensive tasks such as question answering when supported by external knowledge. However, their success relies not only on their reasoning capabilities and the accuracy of the external knowledge but also on the truthfulness of the prompts provided. False premises in prompts can lead to "hallucinations," where the generated content appears plausible but is factually incorrect. This issue is common in online questions, particularly when users search for information on unfamiliar topics, leading to confirmation bias in information retrieval.
Existing methods for detecting hallucinations may not handle false premises effectively, as they can be misled by coherent responses that align with those premises. Fact-checking methods may also be unsuitable, as LLMs can exhibit sycophantic behavior when attempting to satisfy user requirements.
To address this challenge, we propose a False Premise Detection with Abductive Reasoning (FPDAR) method for question answering with LLMs. Abductive reasoning enables backward thinking, pruning less plausible assumptions and working toward the correct answer through a bottom-up approach. FPDAR is designed as a plug-and-play module that can be integrated after the question-answering process.
FPDAR employs a two-stage abductive reasoning process. First, it infers the most plausible question intent from the factual context and the generated response, deliberately withholding the potentially problematic question as input. Comparing the inferred intent with the original question then exposes any false premises. Second, abductive reasoning helps generate a more plausible explanation aligned with the factual context, increasing the likelihood of correctness and ruling out less plausible, potentially hallucinated responses.
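The two-stage process above can be sketched in pseudocode form. This is a minimal illustration, not the paper's implementation: the prompt wordings and the `query_llm` function are placeholders for whatever LLM interface is used, and the yes/no premise comparison is a simplifying assumption.

```python
def query_llm(prompt: str) -> str:
    # Placeholder: in practice, this would call an actual LLM API.
    raise NotImplementedError

def fpdar(question: str, context: str, response: str, llm=query_llm):
    """Hedged sketch of the two-stage abductive reasoning pipeline."""
    # Stage 1: abduce the most plausible intent from context + response,
    # deliberately withholding the (possibly flawed) original question.
    inferred_intent = llm(
        f"Given the facts:\n{context}\nand the answer:\n{response}\n"
        "What question was most plausibly being asked?"
    )
    # Flag a false premise by comparing the inferred intent with the
    # original question (binary comparison is a simplification here).
    verdict = llm(
        "Do these two questions rest on the same premises?\n"
        f"A: {question}\nB: {inferred_intent}\nAnswer yes or no."
    )
    has_false_premise = verdict.strip().lower().startswith("no")

    # Stage 2: if a false premise is detected, abduce a corrected
    # explanation grounded only in the factual context, replacing the
    # potentially hallucinated response.
    if has_false_premise:
        response = llm(
            f"Using only these facts:\n{context}\n"
            f"Explain the false premise in: {question}\n"
            "and give a corrected answer."
        )
    return has_false_premise, response
```

Because the module takes the question, context, and response as inputs and returns a verdict plus a (possibly revised) answer, it can be attached after any existing question-answering pipeline, matching the plug-and-play design described above.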
To the best of our knowledge, this is the first study to introduce abductive reasoning for identifying and diagnosing false premises. We conduct extensive experiments on two question-answering benchmarks containing false premises to validate the effectiveness of FPDAR. The results show that FPDAR achieves high accuracy in terms of response correctness and substantial improvements over state-of-the-art methods, although it does not yet detect every false premise.