Influence of graph neural network architecture on explainability of protein-protein interaction networks
H.T. Janczak (TU Delft - Electrical Engineering, Mathematics and Computer Science)
M. Khosla – Mentor (TU Delft - Multimedia Computing)
J.M. Weber – Mentor (TU Delft - Pattern Recognition and Bioinformatics)
T.E.P.M.F. Abeel – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)
Abstract
AI explainers are tools that approximate how a neural network arrived at a given prediction by identifying the parts of the input data most relevant to the model's choice. These tools have become a major point of research due to the need for human-verifiable predictions in fields such as biomedical engineering. Graph Neural Networks (GNNs) are often used for such tasks, which led to the development of GNNSubnet, a tool capable of finding disease subnetworks in models trained on protein-protein interaction (PPI) data. This tool has been tested with only a single GNN architecture, leaving a knowledge gap about its performance under different models, which can differ significantly in the way they operate.

Here the question "How does the explainer performance vary with changes in the architecture of the trained models?" is answered. This paper explores this knowledge gap by training and evaluating two other models (GCN and GraphSAGE) to see whether the explanation performance of GNNSubnet changes. The performance is evaluated with BAGEL metrics, a tool developed for XAI analysis. These metrics allow explanations to be compared across multiple benchmarks. Three of them - RDT-Fidelity, Validity- and Validity+ - measure how accurate an explanation is at identifying important nodes. The last one - Sparsity - assesses the non-triviality of an explanation by measuring how few nodes have been identified as important.

The experiments show only small changes in the accuracy-related metrics - RDT-Fidelity, Validity- and Validity+ - across the different GNN architectures, which means that GNNSubnet is highly generalizable and not tied to a specific GNN model. However, the Sparsity score differs across models, with GIN providing the most concise - and therefore most useful - explanations.
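To make the metrics more concrete, the sketch below shows simplified, illustrative versions of two such scores for a per-node importance mask. It assumes a PyTorch Geometric style model and data object (model(x, edge_index), data.x, data.edge_index), all of which are assumptions for illustration; the keep-only-the-explanation check approximates Validity+ and the fraction-based score approximates Sparsity, while BAGEL's actual definitions differ in detail (e.g., RDT-Fidelity averages over randomized perturbations of the unselected input).

import torch

def toy_explanation_scores(model, data, node_mask, threshold=0.5):
    """Simplified, illustrative explanation scores (not BAGEL's exact formulas).

    Assumes a PyTorch Geometric style setup:
      - model(x, edge_index) returns class logits,
      - data.x is the node feature matrix, data.edge_index the edge list,
      - node_mask holds per-node importance scores in [0, 1] from an explainer.
    """
    important = node_mask >= threshold  # binarize the explanation

    # Fraction-based "sparsity": share of nodes left out of the explanation.
    # Higher means a more concise explanation.
    sparsity = 1.0 - important.float().mean().item()

    # Validity+-style check: keep only the features of the selected nodes and
    # see whether the model's prediction is unchanged.
    with torch.no_grad():
        full_pred = model(data.x, data.edge_index).argmax(dim=-1)
        kept_x = data.x * important.float().unsqueeze(-1)
        kept_pred = model(kept_x, data.edge_index).argmax(dim=-1)
    validity_plus = (full_pred == kept_pred).float().mean().item()

    return validity_plus, sparsity

In this toy version a good explanation is one that is both small (high sparsity) and sufficient on its own to reproduce the model's original prediction (high validity score); the benchmark metrics used in the paper capture the same intuitions with more careful definitions.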