Sparse and Interpretable Graph Attention Networks

Master's Thesis (2023)
Author(s)

T.T. Naber (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

E. Isufi – Mentor (TU Delft - Multimedia Computing)

Marcos Treviso – Mentor (Instituto Superior Técnico (IST))

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2023
Language
English
Graduation Date
11-10-2023
Awarding Institution
Delft University of Technology
Programme
Computer Science | Artificial Intelligence | Multimedia Computing
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In this thesis, we study the impact of sparsity on both the performance and interpretability of Graph Attention Networks. Additionally, we introduce two novel methods that yield Sparse & Interpretable Graph Attention Networks.
In Chapter 1, we introduce the reader to the concept of Graph Attention Networks (GATs), sparse attention, and interpretability. Subsequently, we provide our research motivation and formulate the research question.
In Chapter 2, the necessary background information is presented at a fundamental level. First, the reader is formally introduced to Graph Neural Networks (GNNs) and GATs. Next, various approaches to sparse attention are discussed in detail. Finally, an overview of current approaches to explainability in GNNs is provided, with a focus on the category to which the methods in this research belong.
Chapter 3 consists of the paper written on this topic. As this report serves as a wrapper around the paper, the introduction and background of the paper overlap with the corresponding chapters in this report, although the paper is written more formally and concisely. Most importantly, this chapter contains an in-depth explanation of the methods developed in this study and their evaluation. Rigorous evaluations are performed and presented in the form of Pareto curves, providing insight into the performance-interpretability trade-off for all datasets. Finally, an appendix is provided containing details of the evaluation and some additional results.
The paper must function as stand-alone research and is therefore considered the core of this report. However, a paper is limited by its concise nature. Thus, Chapter 4 contains additional evaluations performed to gain insight into the behaviour of sparsity within the proposed methods and the effect of changing the sparsity parameter after training. Furthermore, we present a concept that ultimately failed, as it remains relevant for future work. Additional related work is presented in Chapter 5, where we discuss the idea of attention as an explanation and present other self-interpretable methods within the field.
This research is concluded by providing an answer to the research question in Chapter 6, along with suggestions for future work.
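The thesis's own methods are detailed in the paper itself and are not reproduced here. As a rough illustration of the two ideas the abstract combines, the sketch below implements a single-head GAT attention step with a hypothetical top-k sparsification of each node's attention distribution. All names, the masking rule, and the NumPy formulation are illustrative assumptions, not the methods developed in this thesis:

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D array
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_attention_topk(H, A, W, a, k=2, slope=0.2):
    """Single-head GAT layer with a hypothetical top-k sparsified attention.

    H: (N, F) node features, A: (N, N) adjacency (1 = edge),
    W: (F, F') shared weight matrix, a: (2*F',) attention vector.
    For each node, only its k highest-scoring neighbours keep non-zero
    attention; the rest are masked out before the softmax, yielding a
    sparse (and arguably more interpretable) attention distribution.
    """
    Z = H @ W                      # transformed features, shape (N, F')
    N = Z.shape[0]
    out = np.zeros_like(Z)
    for i in range(N):
        nbrs = np.where(A[i] > 0)[0]
        # raw attention logits e_ij = LeakyReLU(a^T [z_i || z_j])
        e = np.array([np.concatenate([Z[i], Z[j]]) @ a for j in nbrs])
        e = np.where(e > 0, e, slope * e)   # LeakyReLU
        # sparsification step: keep only the k largest logits per node
        if len(nbrs) > k:
            keep = np.argsort(e)[-k:]
            nbrs, e = nbrs[keep], e[keep]
        att = softmax(e)                    # normalise surviving logits
        out[i] = att @ Z[nbrs]              # weighted neighbour aggregation
    return out
```

In real sparse-attention work the hard top-k mask is often replaced by a differentiable sparse transformation (e.g. sparsemax or alpha-entmax) so that the sparsity pattern can be learned end-to-end; the hard mask above is only the simplest way to show where sparsity enters the attention computation.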

Files

Thesis_Report_Final.pdf
(pdf | 6.27 Mb)
- Embargo expired on 31-01-2024
License info not available