Influence of graph neural network architecture on explainability of protein-protein interaction networks

Bachelor Thesis (2024)
Author(s)

H.T. Janczak (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

M. Khosla – Mentor (TU Delft - Multimedia Computing)

J.M. Weber – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

T.E.P.M.F. Abeel – Graduation committee member (TU Delft - Pattern Recognition and Bioinformatics)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
26-06-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

AI explainers are tools that approximate how a neural network arrived at a given prediction by identifying the parts of the input data most relevant to the model's choice. These tools have become a major point of research due to the need for human-verifiable predictions in multiple fields, such as biomedical engineering. Graph Neural Networks (GNNs) are often used for such tasks, which led to the development of GNNSubnet, a tool capable of finding disease subnetworks on models trained with protein-protein interaction (PPI) data. This tool has been tested with only a single GNN architecture, leaving a knowledge gap about its performance under different models, which can differ significantly in the way they operate.

Here the question "How does the explainer performance vary with a change in the architecture of the trained model?" is answered. This paper explores this knowledge gap by training and evaluating two other models (GCN and GraphSAGE) to see whether the explanation performance of GNNSubnet changes. The performance is evaluated with BAGEL metrics, a tool developed for XAI analysis. These metrics allow explanations to be compared on multiple benchmarks. Three of them, RDT-Fidelity, Validity- and Validity+, measure how accurately an explanation identifies important nodes. The last one, Sparsity, assesses the non-triviality of an explanation by measuring how few nodes have been identified as important.

The experimental process shows small performance changes across GNN architectures for the accuracy-related metrics (RDT-Fidelity, Validity- and Validity+), which means that GNNSubnet is highly generalizable and not tied to a specific GNN model. However, the Sparsity score differs across models, with GIN providing the most concise, and therefore most useful, explanations.
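To illustrate how a sparsity-style benchmark rewards concise explanations, the sketch below computes the fraction of nodes an explanation leaves out of its important set. This is a minimal, assumed formulation for illustration only; the function name, threshold and ratio are not taken from the thesis, and BAGEL's own Sparsity metric may be defined differently (for example via the entropy of the normalised importance mask).

```python
import numpy as np

def sparsity_score(node_importance: np.ndarray, threshold: float = 0.5) -> float:
    """Return the fraction of nodes NOT marked as important by an explanation.

    A concise explanation flags only a few nodes, so a higher value here
    corresponds to a sparser, less trivial explanation. Illustrative only;
    not the exact BAGEL definition.
    """
    selected = int((node_importance >= threshold).sum())
    return 1.0 - selected / node_importance.size

if __name__ == "__main__":
    # Hypothetical importance mask over 8 proteins in a PPI graph:
    # only two nodes exceed the threshold, giving a sparsity of 0.75.
    mask = np.array([0.9, 0.1, 0.05, 0.8, 0.2, 0.0, 0.3, 0.1])
    print(f"Sparsity: {sparsity_score(mask):.2f}")
```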
