Title: Investigation and Comparison of Evaluation Methods of Model-Agnostic Explainable AI Models
Author: Oedayrajsingh Varma, Vanisha (TU Delft Electrical Engineering, Mathematics and Computer Science)
Contributors: Lal, C. (mentor); Conti, M. (mentor); P. Gonçalves, Joana (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Science and Engineering
Project: CSE3000 Research Project
Date: 2022-06-24

Abstract:
Many artificial intelligence (AI) systems are built using black-box machine learning (ML) algorithms. Their lack of transparency and interpretability reduces their trustworthiness. In recent years, research into explainable AI (XAI) has increased. These systems are designed to tackle common ML issues such as trust, accountability, and transparency. However, research into the evaluation of XAI remains limited. This paper identifies common trends in the evaluation of state-of-the-art model-agnostic XAI models, as well as evaluation methods that are missing or undervalued. First, a taxonomy is explored and an overview of the evaluation metrics found in the literature is compiled. Using this overview, a thorough analysis and comparison of the evaluation methods of five state-of-the-art model-agnostic XAI models (LIME, SHAP, Anchors, PASTLE, and CASTLE) is then carried out. It was found that only a small subset of the identified evaluation metrics is used in the evaluation of these state-of-the-art models. Metrics that are rarely assessed in user studies but deserve more attention are (appropriate) trust, task time length, and task performance. For synthetic experiments, only fidelity is commonly assessed. Moreover, the models are evaluated only on proxy tasks; none are evaluated on real-world tasks. In addition, each identified metric was found to have several different measurement methods and units of measurement, indicating a lack of standardization.

Subject: Explainable Artificial Intelligence; Explainable Machine Learning; Evaluation; Evaluation metrics
To reference this document use: http://resolver.tudelft.nl/uuid:379a7d52-ade9-47a5-b0a5-cdba651db2ff
Part of collection: Student theses
Document type: bachelor thesis
Rights: © 2022 Vanisha Oedayrajsingh Varma
Files: Evaluation_of_Model_Agnos ... _Final.pdf (PDF, 133.93 KB)