Property-Driven Comparison of GNNs on Multi-Label Graphs


Bachelor Thesis (2026)

Author(s)

V. Paiu (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

M. Khosla – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

E. Congeduti – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

C. Lofi – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

Multi-label classification, Graph Neural Networks

To reference this document use

https://resolver.tudelft.nl/uuid:efdaf2d7-7d28-4105-ae31-9716b7c5839f

More Info

Publication Year

2026

Language

English

Graduation Date

19-06-2026

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Multi-label node classification on graphs arises in domains where entities can carry several labels at once, such as biological, social, and recommendation networks. Most Graph Neural Network (GNN) research focuses on multi-class graphs, so it remains unclear how dataset properties affect model performance in multi-label settings. This thesis studies how structural, feature, and label properties influence the performance of the Graph Convolutional Network (GCN) and the Heterophilic Graph Convolutional Network (H2GCN). These models were chosen because they are widely used and represent homophilous and heterophilous graph learning, respectively. Synthetic graphs are used to vary these properties in a controlled way, with real-world datasets serving as validation points; a pooled Ridge regression then tests how well each property predicts model performance in a joint setting. The results show that no single property explains performance on its own. Label imbalance reduces the performance of both models similarly, structural noise harms GCN more than H2GCN, unlabeled nodes degrade the performance of H2GCN more quickly, and cross-class neighbourhood similarity adds information beyond homophily. All code, seeds, and trained-graph properties are released publicly.
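
For readers unfamiliar with the pooled Ridge regression mentioned in the abstract, the following is a minimal sketch of the general idea and not the thesis's released code. It assumes that "pooled" means rows from both models are stacked into one design matrix with a model indicator column. The column names (`label_homophily`, `label_imbalance`, `is_h2gcn`), the numbers, and the helpers `solve_linear` and `ridge_fit` are hypothetical placeholders chosen for illustration.

```python
# Illustrative sketch only: property names and all numbers are made-up
# placeholders, not values or results from the thesis.

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_fit(X, y, alpha=1.0):
    """Return (intercept, coefs) minimising ||y - b0 - Xw||^2 + alpha * ||w||^2.

    The data are centred first, so the intercept b0 is not penalised.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations: (Xc^T Xc + alpha * I) w = Xc^T yc
    gram = [[sum(Xc[i][j] * Xc[i][k] for i in range(n)) + (alpha if j == k else 0.0)
             for k in range(p)] for j in range(p)]
    rhs = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    coefs = solve_linear(gram, rhs)
    intercept = y_mean - sum(c * mu for c, mu in zip(coefs, x_mean))
    return intercept, coefs


# Pooled design matrix: one row per (graph, model) pair.
# Columns (hypothetical): label_homophily, label_imbalance, is_h2gcn.
X = [
    [0.2, 0.1, 0.0], [0.8, 0.3, 0.0], [0.5, 0.7, 0.0],  # GCN rows
    [0.2, 0.1, 1.0], [0.8, 0.3, 1.0], [0.5, 0.7, 1.0],  # H2GCN rows
]
# Placeholder scores generated from a known linear rule, so the fit can be checked.
y = [0.5 + 0.3 * h - 0.2 * imb + 0.05 * m for h, imb, m in X]

intercept, coefs = ridge_fit(X, y, alpha=1.0)
for name, c in zip(["label_homophily", "label_imbalance", "is_h2gcn"], coefs):
    print(f"{name}: {c:+.3f}")
```

In a sketch like this, the sign and size of each fitted coefficient indicate how a property relates to the score when all properties are considered jointly, and a larger `alpha` shrinks the coefficient vector toward zero. Properties would normally be standardised before fitting so that the coefficients are comparable across columns.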

Files

Victor_Paiu_research_paper.pdf

(pdf | 0.996 Mb)

License info not available