Explainable AI: A Proof of Concept Demonstration in Financial Transaction Fraud Detection using TreeSHAP & Diverse Counterfactuals

The European Commission recently published a proposal for an Artificial Intelligence (AI) Act that requires the development of trustworthy AI systems for European Union markets. The proposal explicitly states that AI systems should make use of Explainable AI tools to increase transparency and interpretability. Financial institutions in the Netherlands are required by law to detect and monitor fraud in their infrastructure. Fraud detection in Financial Services (FS) and the FinTech industry is increasingly performed by Machine Learning (ML) and Artificial Neural Network (ANN) models that provide high classification performance. ML/ANN-based fraud detection systems, which are necessary for maintaining trust in the Dutch financial system, are classified as high-risk applications by the proposal for the EU AI Act. The EU AI Act will directly impact high-risk AI applications used within EU markets. Therefore, the Dutch financial institution sponsoring this research wants to future-proof its ML-based fraud detection and improve transparency and trust by solving the model interpretability problem. Explainable Artificial Intelligence (XAI) is a domain of AI research that seeks to solve the interpretability problems of black-box ML models. In this thesis research, proofs of concept are demonstrated for two XAI approaches, TreeSHAP and Diverse Counterfactuals, to improve the explainability or interpretability of ML/ANN-based fraud detection systems. This research pioneers the investigation of Diverse Counterfactuals to improve model interpretability in ML/ANN-based fraud detection systems. Based on the existing literature, this is the first instance of research investigating Diverse Counterfactuals for generating explanations for ML/ANN-based fraud detection models trained using synthetic transaction datasets.
Before demonstrating the proofs of concept, an extensive literature survey was conducted to map the XAI research landscape, formulate an XAI taxonomy, and carry out a comparative analysis of XAI approaches in order to select and describe in detail the approaches relevant to the use case at hand. Subsequently, several ML and ANN models were trained and tested using the PaySim synthetic transaction datasets. To overcome model performance challenges caused by data quality issues and high class imbalance in the datasets, several experimentation scenarios involving hyperparameter optimization, SMOTE oversampling, and class weighting were investigated. Two high-performing models from these experiments (XGBoost and MLP) were then used to demonstrate the proofs of concept by investigating the TreeSHAP and Diverse Counterfactuals algorithms. The TreeSHAP algorithm greatly improved the interpretability of the global and local model behavior of the XGBoost-based fraud detection models. The Diverse Counterfactuals algorithm generated diverse but infeasible counterfactuals. Counterfactual algorithms suffer from computational inefficiency, and further research therefore has to be conducted to generate feasible counterfactuals. Future work on model explainability should also conduct a human-grounded evaluation of the explanations to assess their quality or goodness. Finally, real-world transaction datasets should be used instead of synthetic datasets so that the research generalizes to the real world.
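
To make the idea behind the local TreeSHAP explanations more concrete, the following minimal sketch computes exact Shapley values by brute force for a tiny hand-written decision tree. It is not the TreeSHAP algorithm itself (which obtains such attributions efficiently for tree ensembles) and it is not the thesis's code: the `toy_tree` model, its thresholds, the three features, and the `BACKGROUND` rows are all hypothetical and are not taken from PaySim. The sketch only illustrates the additive property that makes these explanations readable: the per-feature attributions sum to the difference between the model's output for one transaction and its average output over a background sample.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy "model": a hand-written decision tree that returns a fraud
# score from three assumed features (amount, balance_after, is_transfer).
def toy_tree(row):
    amount, balance_after, is_transfer = row
    if is_transfer == 1:
        if amount > 200_000:
            return 0.9 if balance_after == 0 else 0.6
        return 0.2
    return 0.05

# Hypothetical background sample used to "average out" missing features.
BACKGROUND = [
    (1_000, 5_000, 0),
    (250_000, 0, 1),
    (50_000, 20_000, 1),
    (300_000, 10_000, 0),
]

def coalition_value(model, x, background, subset):
    """Mean model output when the features in `subset` are fixed to x's values
    and the remaining features are taken from each background row."""
    total = 0.0
    for ref in background:
        mixed = tuple(x[i] if i in subset else ref[i] for i in range(len(x)))
        total += model(mixed)
    return total / len(background)

def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every coalition of features.
    Exponential in the number of features, so only viable for toy cases."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = coalition_value(model, x, background, set(subset) | {i})
                without_i = coalition_value(model, x, background, set(subset))
                phi[i] += weight * (with_i - without_i)
    return phi

# Explain one hypothetical transaction.
x = (250_000, 0, 1)
phi = shapley_values(toy_tree, x, BACKGROUND)
baseline = coalition_value(toy_tree, x, BACKGROUND, set())

for name, value in zip(("amount", "balance_after", "is_transfer"), phi):
    print(f"{name}: {value:+.4f}")
print(f"baseline + sum(phi) = {baseline + sum(phi):.4f}, model output = {toy_tree(x):.4f}")
```

A positive attribution means the feature pushed this transaction's score above the background average, and a negative one means it pulled the score below it. In practice, a library implementation of TreeSHAP would be used on the trained XGBoost model rather than this enumeration.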