GNN-LLM Hybrids for Node Classification
A Comparative Study of GNN, LLM-based, and Hybrid Models
Y. LI (TU Delft - Electrical Engineering, Mathematics and Computer Science)
E. Isufi – Mentor (TU Delft - Multimedia Computing)
M. Khosla – Mentor (TU Delft - Multimedia Computing)
T. Zhao – Mentor (TU Delft - Multimedia Computing)
R. Hai – Mentor (TU Delft - Web Information Systems)
Abstract
Node classification on text-attributed graphs requires both structural reasoning and rich semantic understanding.
While Graph Neural Networks (GNNs) have become the dominant solution by leveraging graph topology, they often rely on limited textual representations.
Recent work therefore explores integrating large language models (LLMs) to provide stronger semantic encoding for graph learning. However, existing studies often evaluate different LLM integration paradigms under individually designed experimental settings, making it difficult to assess their relative strengths for node classification. In this work, we present a controlled empirical comparison of classical message-passing GNNs, parameter-efficient LLM-GNN integration (ENGINE), prompt-based generative reasoning (LLaGA), and a lightweight hybrid model that combines structural message passing with label-aware semantic alignment. We evaluate all models on two widely used textual graph benchmarks, Cora and WikiCS, under a unified transductive evaluation protocol and varying levels of training supervision. Our results show that ENGINE consistently outperforms strong GNN baselines, whereas LLaGA is more sensitive to inference constraints and evaluation protocols. The hybrid model proposed in this thesis further demonstrates complementary benefits, particularly in low-supervision regimes. These findings clarify practical trade-offs between discriminative and generative LLM-based graph models and highlight hybrid designs as a promising direction for efficient and robust node classification on textual graphs.
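To make the structural message passing underlying the GNN baselines concrete, the sketch below shows a single GCN-style propagation step on a toy graph in numpy. This is only an illustration of the general mechanism, not the actual models or datasets evaluated in the thesis; the graph, features, and sizes here are invented for the example.

```python
import numpy as np

# Toy undirected graph with 4 nodes and edges (0-1, 1-2, 2-3),
# standing in for a small text-attributed citation graph.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# GCN-style symmetric normalization with self-loops:
# A_hat = D^{-1/2} (A + I) D^{-1/2}
A_tilde = A + np.eye(4)
d = A_tilde.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt

# Node features (e.g., text embeddings of each node's document).
X = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])

# One message-passing step: each node mixes its own features
# with a degree-normalized average of its neighbors' features.
H = A_hat @ X
print(H.round(3))
```

Stacking such propagation steps (with learned weight matrices and nonlinearities in between) yields the message-passing GNNs compared here; the LLM-based variants replace or enrich `X` with stronger semantic encodings of the node text.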