Modified GNN-SubNet: leveraging local versus global Graph Neural Network explanations for disease subnetwork detection

Bachelor Thesis (2024)
Author(s)

E. Milchi (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

M. Khosla – Mentor (TU Delft - Multimedia Computing)

J.M. Weber – Mentor (TU Delft - Pattern Recognition and Bioinformatics)

Thomas Abeel – Coach (TU Delft - Pattern Recognition and Bioinformatics)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2024
Language
English
Graduation Date
26-06-2024
Awarding Institution
Delft University of Technology
Project
CSE3000 Research Project
Programme
Computer Science and Engineering
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

As graph neural networks (GNNs) become more widely used in the biomedical field, there is a growing need for insight into how their predictions are made. One algorithm that provides such insight is GNN-SubNet, developed to detect disease subnetworks in protein-protein interaction (PPI) networks. GNN-SubNet uses a sampling scheme to generate a global explanation in the form of a node mask, which indicates each node's importance across all of the GNN's predictions on a dataset. The aim of this study is to validate GNN-SubNet by comparing it with an alternative approach to obtaining global explanations. Instead of obtaining the node mask via a sampling scheme, multiple (local) explanations are optimized per dataset sample, and the resulting node masks are then aggregated per node by taking either the mean (Mean Aggregation) or the median (Median Aggregation).
GNN-SubNet is compared with its two modifications in two ways: first, by analyzing which disease subnetworks each algorithm detects, and second, by applying metrics devised to assess explainers for GNNs. The results show that all algorithms detect subnetworks associated with cancer. In terms of metric scores, Mean Aggregation yields the explanations with the highest fidelity; however, no algorithm produces sparse explanations. The study also indicates that GNN-SubNet's outcomes vary across multiple runs, and as such its results may not be reproducible.
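
The per-node aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `aggregate_node_masks`, the list-of-lists input layout (one row per local node mask, one column per node), and the toy values are all assumptions made for illustration.

```python
from statistics import mean, median


def aggregate_node_masks(local_masks, method="mean"):
    """Combine local node masks into a single global node mask.

    local_masks: list of equal-length lists; each row is assumed to be
    one local explanation (one importance value per node).
    method: "mean" (Mean Aggregation) or "median" (Median Aggregation).
    """
    reducers = {"mean": mean, "median": median}
    if method not in reducers:
        raise ValueError(f"unknown method: {method!r}")
    if not local_masks:
        raise ValueError("at least one local mask is required")
    n_nodes = len(local_masks[0])
    if any(len(mask) != n_nodes for mask in local_masks):
        raise ValueError("all local masks must cover the same nodes")
    reduce_column = reducers[method]
    # zip(*rows) iterates over columns, i.e. all values for one node.
    return [reduce_column(column) for column in zip(*local_masks)]


# Hypothetical toy data: 3 local masks over 4 nodes.
toy_masks = [
    [0.9, 0.1, 0.5, 0.0],
    [0.7, 0.2, 0.5, 0.9],
    [0.8, 0.3, 0.5, 0.0],
]
mean_mask = aggregate_node_masks(toy_masks, "mean")
median_mask = aggregate_node_masks(toy_masks, "median")
```

In this toy input the last node receives a high importance in only one local mask. The mean carries part of that outlier into the global mask, while the median ignores it, which is the practical difference between the two aggregation variants.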
