Diffusion based temporal network embedding for link prediction

More Info
expand_more

Abstract

Link prediction in complex networks has attracted increasing attention. The link prediction algorithms can be used to retrieve missing information, identify spurious interactions, capturing net- work evolution, and so on. Recently, network embedding has been proposed as a new strategy to embed network into low-dimensional vector space. By embedding nodes into vectors, the link pre- diction problem can be converted into a similarity comparison task. Nodes with similar vectors are more likely to connect. Some traditional network embedding methods include matrix factorization, random walk paradigm and deep neural network models.
In this thesis, we propose SISNE, a diffusion based paradigm for node embedding, applying Susceptible-Infected-Susceptible (SIS) model to extract node neighborhood structure. Both random walk based algorithms and our proposed method sample node sequences as input and feed them into a Skip-gram model, a representative language model that embeds words into vectors. Specially, our proposed model provides flexibility to explore the network topology by operating in- formation spreading on networks. Another contribution of the proposed model is that SISNE takes into the account of the evolving nature of complex networks. To verify the efficacy, we conduct experiments on missing link prediction task and show that our SIS diffusion based model outperforms other state-of-the-art network embedding algorithms across all four empirical datasets, reaching a maximal 7% improvement. Importantly, even when the input size is small, the performance remains stable whereas other baseline models drop dramatically, which indicates that our proposed model is less sensitive to input size and suggests that the model is applicable to large-scale networks. Moreover, we further show that as long as the infection probability β is larger than the threshold value of the diffusion model, we can obtain a relatively high performance for link prediction task. Taken together, our work has shown great effectiveness and efficiency in learning embeddings in temporal networks.