Learning Curves of GNNs vs MLP vs Tikhonov
A Comparative Study on Semi-Supervised Node Classification
C. Radoi (TU Delft - Electrical Engineering, Mathematics and Computer Science)
C. Liu – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)
M.S. Jebali – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)
E. Isufi – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)
More Info
expand_more
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Graph neural networks (GNNs) are designed to use both node attributes and graph topology, but this extra information is not always beneficial. This paper compares learning curves for semi-supervised node classification under five informational and structural regimes: a ChebNet GNN (implicit topology and features), a feature-only multilayer perceptron (MLP), a topology-only graph Tikhonov baseline, and two variations enforcing explicit topological prior information via Graph Laplacian Regularisation (a regularized MLP and a regularized GNN). We evaluate Cora and PubMed as homophilic citation networks, and Chameleon and Squirrel as heterophilic web graphs over increasing labeled-node budgets, using fixed validation and test splits and repeated random budget sampling. We find that in the homophilic settings, the structural benefit of the GNN is substantial, though enforcing a strict smoothness prior makes the Tikhonov baseline surprisingly competitive at extremely low label budgets. In heterophilic settings, where the underlying structural smoothness prior is mathematically violated, model behaviour bifurcates based on structural density. On sparser heterophilic graphs (Chameleon), all informational regimes converge to an identical performance floor bounded by uninformative node features. Conversely, on massively dense heterophilic networks (Squirrel), classical topology-only solvers catastrophically collapse below feature-only baselines due to the amplified friction of the deceptive prior. Our results show that the benefit of explicit topological priors depends heavily on data availability. They provide clear advantages in low-label regimes under strict feature starvation, but become less effective as more labels or stronger feature signals become available.