Heterophilic Methods on Multi-Label Graphs

How Do Methods Designed for Heterophilic Graphs Compare for Multi-Label Node Classification?

Bachelor Thesis (2026)

Author(s)

C. Turcan (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

M. Khosla – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

E. Congeduti – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

Graph Neural Networks, Heterophily, Multi-Label Node Classification

To reference this document use

https://resolver.tudelft.nl/uuid:978d9840-2d95-4b64-a9f4-6633e07d6767

Publication Year

2026

Language

English

Graduation Date

19-06-2026

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Graph neural networks for node classification usually assume homophily, meaning that connected nodes tend to share labels. A promising family of methods has been developed for heterophilic graphs, where neighbouring nodes instead tend to have different labels. These methods are almost always evaluated on multi-class datasets, in which each node has exactly one label. However, many real-world problems are multi-label, with each node carrying a set of labels. An important question is whether methods designed for heterophilic graphs remain effective for multi-label node classification.

To investigate this, we compare two simple baselines with six heterophily-oriented graph neural networks across several real multi-label graphs and one multi-class control dataset. This is complemented by two synthetic experiments: one varying homophily directly and one varying the number of labels per node. We also include a supplementary experiment that collapses the multi-label structure of real-world datasets into a multi-class setting. We report this for completeness but interpret it cautiously, since this transformation alters the datasets too substantially for the two settings to be considered equivalent.

We find that performance appears to depend at least as much on graph homophily as on model sophistication. When homophily is low, heterophilic models rarely outperform either a plain feature-only baseline or a simple structure-only embedding. The benefits of message passing tend to re-emerge only as homophily increases. These results suggest that the gains reported by heterophily-oriented methods on multi-class benchmarks may not transfer automatically to the low-homophily multi-label setting.
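The notion of homophily used in the abstract can be made concrete with a small sketch. The abstract does not state which homophily measure the thesis uses, so the code below assumes one common choice for multi-label graphs: the Jaccard similarity of the label sets at the two endpoints of each edge, averaged over all edges. The function names and the toy graph are hypothetical and serve only as illustration.

```python
from typing import Iterable, Sequence, Set, Tuple


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard similarity of two label sets; taken as 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def label_homophily(
    edges: Iterable[Tuple[int, int]], labels: Sequence[Set[int]]
) -> float:
    """Mean Jaccard similarity of endpoint label sets over all edges.

    Assumption: this is one possible multi-label homophily measure,
    not necessarily the one used in the thesis.
    """
    scores = [jaccard(labels[u], labels[v]) for u, v in edges]
    if not scores:
        raise ValueError("graph has no edges")
    return sum(scores) / len(scores)


# Hypothetical toy graph: 4 nodes, 3 undirected edges.
toy_labels = [{0, 1}, {1}, {2}, {0, 1}]
toy_edges = [(0, 1), (1, 2), (0, 3)]
toy_h = label_homophily(toy_edges, toy_labels)

# Multi-class special case: every node carries exactly one label.
single_labels = [{0}, {0}, {1}]
single_edges = [(0, 1), (1, 2)]
single_h = label_homophily(single_edges, single_labels)
```

When every node has exactly one label, the Jaccard similarity of an edge is 1 if the endpoint labels match and 0 otherwise, so this measure reduces to the fraction of edges that connect same-label nodes, which is the usual edge homophily ratio in the multi-class setting.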

Files

Main.pdf

(pdf | 1.36 MB)

License info not available