XCrowd: Combining Explainability and Crowdsourcing to Diagnose Models in Relation Extraction

Conference Paper (2024)
Author(s)

Alisa Smirnova (University of Fribourg)

J. Yang (TU Delft - Web Information Systems)

Philippe Cudré-Mauroux (University of Fribourg)

Research Group
Web Information Systems
DOI
https://doi.org/10.1145/3627673.3679777
Publication Year
2024
Language
English
Pages (from-to)
2097-2107
ISBN (electronic)
979-8-4007-0436-9
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Relation extraction methods are currently dominated by deep neural models, which capture complex statistical patterns yet remain brittle and vulnerable to perturbations in the data and its distribution. Explainability techniques offer a means of understanding such vulnerabilities, and thus an opportunity to mitigate future errors; yet existing methods are limited to describing what the model 'knows', while failing to explain what the model does not know. This paper presents a new method for diagnosing model predictions and detecting potential inaccuracies. Our approach breaks the problem down into two components: (i) determining the knowledge the model should possess to predict accurately, obtained through human annotations, and (ii) assessing the knowledge the model actually possesses, using explainable AI (XAI) methods. We apply our method to several relation extraction tasks and conduct an empirical study leveraging human specifications of what a model should know and does not know. Results show that human workers can accurately specify the model's should-knows despite variations in their specifications, that the alignment between what a model actually knows and what it should know is indeed indicative of model accuracy, and that the unknowns identified by our method allow us to foresee future errors that might not have been observed otherwise.
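
To make the alignment idea concrete, the following minimal sketch illustrates one way the two components could be compared; it is not the paper's implementation, and the function names, the mocked attribution scores, and the overlap metric are all illustrative assumptions.

```python
# Hypothetical sketch (not XCrowd's actual method): compare the tokens a model
# *should* rely on (human-specified rationale) with the tokens it *actually*
# relies on (per-token attribution scores from any XAI method, e.g. gradients
# or LIME). The alignment metric below is a simple overlap and is an assumption.

from typing import Dict, Set


def alignment_score(should_know: Set[str],
                    attributions: Dict[str, float],
                    top_k: int = 3) -> float:
    """Fraction of human-specified 'should-know' tokens that appear among
    the model's top-k most attributed tokens."""
    if not should_know:
        return 0.0
    ranked = sorted(attributions.items(), key=lambda kv: -kv[1])
    top_tokens = {tok for tok, _ in ranked[:top_k]}
    return len(should_know & top_tokens) / len(should_know)


# Toy example for the relation "founded_by" between "Apple" and "Steve Jobs".
human_rationale = {"founded", "Apple", "Jobs"}            # crowd annotation
model_attrib = {"Apple": 0.9, "was": 0.6, "founded": 0.1,  # mocked XAI output
                "by": 0.5, "Steve": 0.7, "Jobs": 0.8}

print(alignment_score(human_rationale, model_attrib))
# ~0.67: the model largely ignores "founded", flagging a potential unknown.
```

A low alignment score for an instance would suggest the model is missing knowledge the annotators deem necessary, and hence a higher risk of error on that instance or on similar future inputs.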