Maxplain – Value-based Evaluation of Explainable AI Techniques

Master Thesis (2023)
Author(s)

S. Deb (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Jie Yang – Mentor (TU Delft - Web Information Systems)

P. Lippmann – Mentor (TU Delft - Web Information Systems)

P.K. Murukannaiah – Graduation committee member (TU Delft - Interactive Intelligence)

Maria S. Pera – Graduation committee member (TU Delft - Web Information Systems)

Publication Year
2023
Language
English
Copyright
© 2023 Sreeparna Deb
Graduation Date
31-08-2023
Awarding Institution
Delft University of Technology
Programme
Computer Science | Artificial Intelligence
Faculty
Electrical Engineering, Mathematics and Computer Science
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

A 2022 Harvard Business Review report critically examines the readiness of AI for real-world decision-making. It cites several incidents, such as an experimental healthcare chatbot suggesting that a mock patient commit suicide in response to their distress, and a self-driving car trial being called off after it resulted in the death of a pedestrian.

These incidents, and the media frenzies and public outcries that followed, underscore a pressing concern: "How do these AI systems reach their conclusions?" This question has created an urgent demand for transparency and clarity in AI decision-making processes, which in turn has driven a significant uptick in work on Explainable AI (XAI). Such rapid growth makes consistent evaluation standards crucial for the field to advance in a streamlined way.

However, as a multidisciplinary field, XAI lacks consensus on what constitutes a "good" explanation. Stakeholders with diverse backgrounds and needs can have diverging expectations of XAI: some prioritize simple, concise explanations, while others want detailed information about AI predictions, depending on their end goal.

This thesis addresses the standardization of an evaluation framework for XAI methods that accounts for stakeholders' needs in different usage contexts. It presents a prototype that can be customized and extended to suit various XAI methods and tasks. The findings affirm the framework's ability to yield insightful comparisons between XAI methods, and they also surface issues with how humans perceive specific XAI features in those methods. This work contributes to the integration of XAI techniques into real-world applications by enabling more reliable and consistent performance assessment.
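To make the stakeholder-dependent notion of evaluation concrete, the sketch below illustrates one shape such a framework could take. It is not the prototype built in this thesis: the StakeholderProfile class, the criterion names, the weights, and the per-method scores are all hypothetical, and LIME and SHAP appear only as familiar example XAI methods.

from dataclasses import dataclass

@dataclass
class StakeholderProfile:
    """One stakeholder's priorities, as weights over evaluation criteria."""
    name: str
    weights: dict[str, float]  # criterion -> importance weight

def score_method(criterion_scores: dict[str, float],
                 profile: StakeholderProfile) -> float:
    """Weighted average of a method's per-criterion scores for one stakeholder."""
    total = sum(profile.weights.values())
    return sum(w * criterion_scores.get(c, 0.0)
               for c, w in profile.weights.items()) / total

# Hypothetical per-criterion scores (e.g., elicited from user studies).
methods = {
    "LIME": {"conciseness": 0.8, "fidelity": 0.6, "detail": 0.4},
    "SHAP": {"conciseness": 0.5, "fidelity": 0.8, "detail": 0.7},
}

# A clinician may value concise explanations; an auditor, detailed ones.
profiles = [
    StakeholderProfile("clinician", {"conciseness": 0.6, "fidelity": 0.3, "detail": 0.1}),
    StakeholderProfile("auditor", {"conciseness": 0.1, "fidelity": 0.4, "detail": 0.5}),
]

for profile in profiles:
    ranking = sorted(methods, key=lambda m: score_method(methods[m], profile),
                     reverse=True)
    print(f"{profile.name}: {ranking}")

Under this toy setup, the same pool of methods is ranked differently for each stakeholder (the clinician prefers LIME, the auditor SHAP), which is the core idea behind value-based evaluation.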
