Polarisation and influence in online social networks
Annealed Heuristic Search for Local Community Detection in Probabilistically Weighted Graphs
A. Bărdaş (TU Delft - Electrical Engineering, Mathematics and Computer Science)
A.L.D. Latour – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)
M. Khosla – Graduation committee member (TU Delft - Electrical Engineering, Mathematics and Computer Science)
More Info
expand_more
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
In a polarised online social network, a user is exposed to many posts and accounts from across the platform. As a result, the social graph is dense and only weakly clustered. However, most real, meaningful interactions stay within a user's own community. Community membership is therefore hard to read from the topology (a user links to many groups), but clear in the interaction intensity (the edge weights). We study how to extract a local community from a seed user when these edge weights are probabilities rather than certainties. We use the standard negative-log transform to turn each edge probability into an additive cost, so a shortest-path search grows the community by selecting the most likely ties. Our method is an annealing schedule for a structural heuristic that we suppress while the set is small and phase in once the community is large enough to have a stable boundary. A single run produces one ordering of nodes, from which a budget-free patience policy selects the community without knowing its size in advance. We tune every method on training graphs and test on disjoint graphs. Under this fair comparison, the schedule helps in one specific scenario: when the community structure is weak, but the edge weights still carry a partial community signal. There it beats strong, individually tuned diffusion baselines by up to +0.115 in F1 on hard LFR graphs. The margin remains significant after correction for multiple comparisons. Elsewhere, it does no better, and often worse. We confirm the same effect on two real school contact networks, where it outperforms the best tuned baseline on both schools (+0.041 and +0.044 F1).