From Multi-Class to Multi-Label: Revisiting Edge Dropping for Graph Neural Networks

None, None

From Multi-Class to Multi-Label: Revisiting Edge Dropping for Graph Neural Networks

Bachelor Thesis (2026)

Author(s)

A. Andrei (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

M. Khosla – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

E. Congeduti – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)

C. Lofi – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Faculty

Electrical Engineering, Mathematics and Computer Science

Graph Neural Networks Edge Dropout Multi-Label Classification

To reference this document use

https://resolver.tudelft.nl/uuid:97d3b4e3-779e-424e-9b67-b80b0a0ff72c

More Info

expand_more

Publication Year

2026

Language

English

Graduation Date

19-06-2026

Awarding Institution

Delft University of Technology

Project

CSE3000 Research Project

Programme

Computer Science and Engineering

Faculty

Electrical Engineering, Mathematics and Computer Science

Downloads counter

20

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Many real-world tasks involve data that is naturally structured as a graph, such as proteins linked by interactions, papers linked by citations, or people connected in a social network. A common goal is to predict properties of each entity based on how it is connected to the rest. The models used for this, called graph neural networks, work by allowing each node to aggregate information from its neighbours, then from its neighbours’ neighbours, and so on.

However, looking too far can blur distant nodes together. A common remedy is to train the model on a random subset of the links and discard the rest, in the hope of preventing it from over-relying on any single part of the graph. This idea was proposed and tested only on the simpler problem in which each entity carries exactly one label. Many real-world problems are not like this. A single protein may participate in multiple biological processes simultaneously, so an effective predictor must assign several labels at once. Whether dropping links still helps in this more realistic multi-label setting has not been studied.

This work addresses that question using synthetic graphs with precisely controlled structure, ranging from strongly clustered to nearly random, as well as three real biological and bibliographic datasets spanning the same range. In almost every case, dropping links harms performance rather than improving it, and the damage increases with the fraction of removed links, with a single weak exception on the most strongly multi-label, lowest-homophily real graph.

The cause is not a flaw in the technique but a property of the multi-label task itself. When multiple labels must be predicted from the same node representation, each label receives only a fraction of the learning signal it would obtain in a single-label setting. Discarding links reduces this already limited signal even further.

Files

Alexandru_Andrei_thesis.pdf

(pdf | 0.42 Mb)

License info not available