E. Congeduti
16 records found
However, looking too far can blur distant nodes together. A common remedy is to train the model on a random subset of the links and discard the rest, in the hope of preventing it from over-relying on any single part of the graph. This idea was proposed and tested only on the simpler problem in which each entity carries exactly one label. Many real-world problems are not like this. A single protein may participate in multiple biological processes simultaneously, so an effective predictor must assign several labels at once. Whether dropping links still helps in this more realistic multi-label setting has not been studied.
This work addresses that question using synthetic graphs with precisely controlled structure, ranging from strongly clustered to nearly random, as well as three real biological and bibliographic datasets spanning the same range. In almost every case, dropping links harms performance rather than improving it, and the damage increases with the fraction of removed links, with a single weak exception on the most strongly multi-label, lowest-homophily real graph.
The cause is not a flaw in the technique but a property of the multi-label task itself. When multiple labels must be predicted from the same node representation, each label receives only a fraction of the learning signal it would obtain in a single-label setting. Discarding links reduces this already limited signal even further.
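As a rough illustration of the link-dropping idea described in this abstract, the sketch below removes each edge independently with a fixed probability and returns the rest for one training pass. The function name `drop_edges`, the edge-list representation, and the `seed` parameter are illustrative assumptions and are not taken from the thesis.

```python
import random


def drop_edges(edges, drop_rate, seed=None):
    """Return a random subset of `edges`.

    Each edge is removed independently with probability `drop_rate`.
    This is a hypothetical helper for illustration, not the thesis code.
    """
    if not 0.0 <= drop_rate <= 1.0:
        raise ValueError("drop_rate must be in [0, 1]")
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so drop_rate=0 keeps every edge
    # and drop_rate=1 removes every edge.
    return [edge for edge in edges if rng.random() >= drop_rate]


# Toy graph: a path over five nodes, stored as (source, target) pairs.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]

# A training loop would typically draw a fresh subset in every epoch.
for epoch in range(3):
    kept = drop_edges(toy_edges, drop_rate=0.5, seed=epoch)
    print(epoch, kept)
```

In such a setup the full edge list would normally still be used at evaluation time; only the training passes see the thinned graph.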
Evaluating Graph Neural Additive Networks for Multi-Label Node Classification
How does Graph Neural Additive Network (GNAN) perform on different multi-label node classification datasets, and what do the resulting explanations reveal about the data?
Heterophilic Methods on Multi-Label Graphs
How Do Methods Designed for Heterophilic Graphs Compare for Multi-Label Node Classification?
To investigate this, we compare two simple baselines with six heterophily-oriented graph neural networks across several real multi-label graphs and one multi-class control dataset. This is complemented by two synthetic experiments: one varying homophily directly and one varying the number of labels per node. We also include a supplementary experiment that collapses the multi-label structure of real-world datasets into a multi-class setting. We report this for completeness but interpret it cautiously, since this transformation alters the datasets too substantially for the two settings to be considered equivalent.
We find that performance appears to depend at least as much on graph homophily as on model sophistication. When homophily is low, heterophilic models rarely outperform either a plain feature-only baseline or a simple structure-only embedding. The benefits of message passing tend to re-emerge only as homophily increases. These results suggest that the gains reported by heterophily-oriented methods on multi-class benchmarks may not transfer automatically to the low-homophily multi-label setting.
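The abstract does not say how homophily is measured. One possible measure for multi-label graphs, assumed here purely for illustration, is the average Jaccard similarity between the label sets of connected nodes. The function name `label_homophily` and the toy data are hypothetical.

```python
def label_homophily(edges, labels):
    """Mean Jaccard similarity between the label sets of each edge's endpoints.

    `labels` maps a node id to a set of labels. Edges whose endpoints
    are both unlabeled are skipped. This is an assumed measure, not
    necessarily the one used in the thesis.
    """
    scores = []
    for u, v in edges:
        union = labels[u] | labels[v]
        if union:
            scores.append(len(labels[u] & labels[v]) / len(union))
    return sum(scores) / len(scores) if scores else 0.0


# Toy example: node 0 and node 1 share one of two labels, node 2 shares none.
toy_labels = {0: {"a", "b"}, 1: {"a"}, 2: {"c"}}
toy_edges = [(0, 1), (1, 2)]
print(label_homophily(toy_edges, toy_labels))  # (0.5 + 0.0) / 2 = 0.25
```

Under this measure, values near 1 mean that neighbours tend to carry the same labels, and values near 0 correspond to the low-homophily regime discussed above.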
Graph Neural Networks for Long-Term Traffic Forecasting
Can GNNs effectively handle long-term predictions and how does their accuracy degrade over time?
Graph Neural Networks Training Set Analysis
Effect of Training Data Size
Effectiveness of Graph Neural Networks and Simpler Network Models in Various Traffic Scenarios
Graph Neural Networks for Traffic Forecasting
Scalability of Graph Neural Networks in Traffic Forecasting
Assessing Accuracy and Computational Efficiency in Varying Road Network Sizes and Complexities
Comparative Analysis of LSTM, ARIMA, and Facebook’s Prophet for Traffic Forecasting
Advancements, Challenges, and Limitations
Long term predictions for traffic forecasting
How does the accuracy degrade with time?
Deep learning approaches to short term traffic forecasting
Capturing the spatial temporal relation in historic traffic data