GNN-LLM Hybrids for Node Classification
A Comparative Study of GNN, LLM-based, and Hybrid Models
Y. LI (TU Delft - Electrical Engineering, Mathematics and Computer Science)
E. Isufi – Mentor (TU Delft - Multimedia Computing)
M. Khosla – Mentor (TU Delft - Multimedia Computing)
T. Zhao – Mentor (TU Delft - Multimedia Computing)
R. Hai – Mentor (TU Delft - Web Information Systems)
Abstract
Node classification on text-attributed graphs requires both structural reasoning and rich semantic understanding.
While Graph Neural Networks (GNNs) have become the dominant solution by leveraging graph topology, they often rely on limited textual representations.
Recent work therefore explores integrating large language models (LLMs) to provide stronger semantic encoding for graph learning. However, existing studies often evaluate different LLM integration paradigms under individually designed experimental settings, making it difficult to assess their relative strengths for node classification. In this work, we present a controlled empirical comparison of classical message-passing GNNs, parameter-efficient LLM-GNN integration (ENGINE), prompt-based generative reasoning (LLaGA), and a lightweight hybrid model that combines structural message passing with label-aware semantic alignment. We evaluate all models on two widely used textual graph benchmarks, Cora and WikiCS, under a unified transductive evaluation protocol and varying levels of training supervision. Our results show that ENGINE consistently outperforms strong GNN baselines, whereas LLaGA is more sensitive to inference constraints and evaluation protocols. The hybrid model proposed in this thesis further demonstrates complementary benefits, particularly in low-supervision regimes. These findings clarify practical trade-offs between discriminative and generative LLM-based graph models and highlight hybrid designs as a promising direction for efficient and robust node classification on textual graphs.
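To make the structural message passing underlying the GNN baselines concrete, the sketch below shows a single GCN-style propagation step on a toy graph in numpy. This is only an illustration of the general mechanism, not the actual models or datasets evaluated in the thesis; the graph, features, and sizes here are invented for the example.

```python
import numpy as np

# Toy undirected graph with 4 nodes and edges (0-1, 1-2, 2-3),
# standing in for a small text-attributed citation graph.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# GCN-style symmetric normalization with self-loops:
# A_hat = D^{-1/2} (A + I) D^{-1/2}
A_tilde = A + np.eye(4)
d = A_tilde.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt

# Node features (e.g., text embeddings of each node's document).
X = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])

# One message-passing step: each node mixes its own features
# with a degree-normalized average of its neighbors' features.
H = A_hat @ X
print(H.round(3))
```

Stacking such propagation steps (with learned weight matrices and nonlinearities in between) yields the message-passing GNNs compared here; the LLM-based variants replace or enrich `X` with stronger semantic encodings of the node text.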