M.S. Jebali

2 records found

Bachelor thesis (2026) - I. Markov, E. Isufi, C. Liu, M.S. Jebali, T.J. Viering
Graph Neural Networks (GNNs) achieve strong performance on node classification tasks, but their effectiveness depends on the quality of the supervision, and real-world labels are often noisy. Learning curves, which describe how test performance scales with the number of labelled training nodes, have been extensively studied in classical machine learning, but their behaviour under realistic annotation noise in GNNs remains poorly explored.

We present a systematic empirical study of how three label noise protocols—symmetric random flipping, feature-dependent asymmetric flipping, and structure-dependent flipping—affect the learning curve shape of ChebNet across four benchmark graphs spanning homophilic and heterophilic structure, at noise rates η ∈ {0.1, 0.3, 0.5}.

The central finding is that noise does not simply shift the learning curve downward: above a moderate noise rate it reduces the effective slope, so the gap between clean and noisy performance widens as the label budget grows. Feature-dependent asymmetric noise is consistently the most harmful protocol across all datasets and budgets for η ≥ 0.3, while structure-dependent noise is the least harmful on homophilic graphs. On graphs where the model already operates near its performance limit, noise type has little practical effect.

These findings suggest that beyond a moderate noise rate, cleaning existing labels yields greater returns than acquiring more noisy ones, and that the nature of annotation error interacts with graph structure in ways that single-budget evaluations cannot detect. ...
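
As an illustration of the simplest of the three protocols named in the abstract above, symmetric random flipping at noise rate η could be sketched as follows. This is a minimal sketch and not the thesis code: the function name `flip_labels_symmetric` is hypothetical, and it assumes the common convention that a corrupted label is replaced by a different class chosen uniformly at random (some definitions also allow the original class to be redrawn). The feature-dependent and structure-dependent protocols are not reproduced here, since the abstract does not specify them.

```python
import random


def flip_labels_symmetric(labels, num_classes, eta, seed=0):
    """Return a noisy copy of `labels` (hypothetical helper, not from the thesis).

    Each label is corrupted independently with probability `eta`. A corrupted
    label is replaced by one of the other `num_classes - 1` classes, chosen
    uniformly at random (assumed convention).
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    if num_classes < 2:
        raise ValueError("need at least two classes to flip a label")

    rng = random.Random(seed)  # local generator, so results are reproducible
    noisy = []
    for y in labels:
        if rng.random() < eta:
            # Draw from num_classes - 1 slots and skip over the true class,
            # which gives a uniform choice among the *other* classes.
            other = rng.randrange(num_classes - 1)
            noisy.append(other if other < y else other + 1)
        else:
            noisy.append(y)
    return noisy


# Usage on toy labels with the three noise rates mentioned in the abstract.
clean = [i % 4 for i in range(20)]
for eta in (0.1, 0.3, 0.5):
    noisy = flip_labels_symmetric(clean, num_classes=4, eta=eta, seed=42)
    flipped = sum(a != b for a, b in zip(clean, noisy))
    print(f"eta={eta}: {flipped} of {len(clean)} labels flipped")
```

In a learning-curve experiment of the kind described, such a function would presumably be applied to the training labels only, once per label budget, while test labels stay clean.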
Bachelor thesis (2026) - V. Georgiev, E. Isufi, C. Liu, M.S. Jebali, T.J. Viering
Learning curves describe how model performance changes as more labeled data becomes available and can help estimate whether collecting additional labels is worthwhile. However, it remains unclear which mathematical functions best represent and extrapolate learning curves for graph neural networks. This study compares power-law and exponential models for learning curves generated by a graph neural network on node-classification datasets with different graph characteristics. The models are evaluated separately on how well they describe observed performance and how accurately they predict performance at larger, unseen labeling budgets. The results show that neither model family is universally preferable. Exponential models provide better descriptive fit on some datasets, while power-law models provide better descriptive fit on others. In the extrapolation experiments, power-law models often give more accurate predictions at larger labeled-node budgets, although the preferred model still depends on the dataset and fitting range. These findings indicate that descriptive fit and extrapolation accuracy should be treated as separate objectives. Overall, power-law behaviour appears to be a useful modelling assumption for some GNN learning curves, especially for extrapolation, but it should not be assumed to hold universally.
...
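
The distinction the abstract above draws between descriptive fit and extrapolation accuracy can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the two-parameter forms `a * n**(-b)` and `a * exp(-b * n)` (with no asymptote term), the least-squares fit in log space, the function names, and the synthetic error values. The thesis may well use different parameterizations and a different fitting procedure.

```python
import math


def _linear_fit(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_power_law(budgets, errors):
    """Fit error(n) = a * n**(-b) via a straight line in log-log space."""
    slope, intercept = _linear_fit(
        [math.log(n) for n in budgets], [math.log(e) for e in errors]
    )
    a, b = math.exp(intercept), -slope
    return lambda n: a * n ** (-b)


def fit_exponential(budgets, errors):
    """Fit error(n) = a * exp(-b * n) via a straight line in semi-log space."""
    slope, intercept = _linear_fit(
        list(budgets), [math.log(e) for e in errors]
    )
    a, b = math.exp(intercept), -slope
    return lambda n: a * math.exp(-b * n)


def mean_squared_error(model, budgets, errors):
    """Average squared gap between model predictions and observed errors."""
    return sum((model(n) - e) ** 2 for n, e in zip(budgets, errors)) / len(budgets)


# Synthetic test-error curve (made-up numbers, not results from the thesis).
budgets = [10, 20, 40, 80, 160, 320, 640]
errors = [0.8 * n ** (-0.5) for n in budgets]

# Fit on the small budgets only, then score the two objectives separately.
fit_n, fit_e = budgets[:5], errors[:5]
held_n, held_e = budgets[5:], errors[5:]
for name, fit in (("power law", fit_power_law), ("exponential", fit_exponential)):
    model = fit(fit_n, fit_e)
    print(
        name,
        "descriptive MSE:", mean_squared_error(model, fit_n, fit_e),
        "extrapolation MSE:", mean_squared_error(model, held_n, held_e),
    )
```

Because the synthetic data here follow a power law by construction, the power-law model wins on both scores. On real learning curves the ranking on the fitted range and on the held-out larger budgets can differ, which is the point the abstract makes about treating the two objectives separately.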