Node Influence Prediction in Complex Networks
Towards network embedding based features
Abstract
The study of epidemic spreading processes on contact-based complex networks has gained considerable traction in recent years. These processes model a variety of phenomena, such as disease spreading, opinion spreading in social networks, or even airport congestion in airline networks. A key task in this area of research, and the focus of this work, is predicting the final epidemic size of an outbreak in a network, given that a contagion process has been initiated by a seed node. More specifically, the objective is to predict, using a supervised learning model, to what extent a seed node is able to activate the rest of the nodes in a network. In this work, this task is termed the “node influence prediction problem”. Being able to predict the epidemic footprint of a node allows the design of robust networks and the application of efficient intervention strategies.
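To make the prediction target concrete, the following minimal sketch estimates the influence of a seed node by simulation. It assumes a discrete-time SIR-type process in which each infected node is infectious for exactly one step; the abstract does not state which epidemic model is used, so the model, the infection probability `beta`, the number of runs, and the toy star graph are all illustrative assumptions.

```python
import random


def simulate_sir(adj, seed_node, beta, rng):
    """Run one outbreak from seed_node; return the fraction of nodes ever infected.

    Assumed model (not taken from the source): discrete-time SIR where every
    infected node tries to infect each susceptible neighbour with probability
    beta and then recovers after one step.
    """
    infected = {seed_node}
    recovered = set()
    while infected:
        newly_infected = set()
        for u in infected:
            for v in adj[u]:
                susceptible = v not in infected and v not in recovered
                if susceptible and rng.random() < beta:
                    newly_infected.add(v)
        recovered |= infected          # current cases recover
        infected = newly_infected      # next generation of cases
    return len(recovered) / len(adj)


def node_influence(adj, seed_node, beta=0.3, runs=500, seed=0):
    """Average final epidemic size over several simulated outbreaks."""
    rng = random.Random(seed)
    total = sum(simulate_sir(adj, seed_node, beta, rng) for _ in range(runs))
    return total / runs


# Hypothetical toy network: a star with hub 0 and leaves 1..5.
star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}

hub_influence = node_influence(star, 0)
leaf_influence = node_influence(star, 1)
print(f"hub: {hub_influence:.3f}, leaf: {leaf_influence:.3f}")
```

In this toy setting the hub is expected to reach a larger share of the network than a leaf. Values of this kind, computed for a small subset of nodes, would serve as the labels of the supervised learning problem.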
Recently, a limited number of studies have proposed methods that use classical network-topology-based features to predict nodal influence. However, two main challenges persist: (1) individual topology-based features do not fully capture the information of a node, and (2) these features are costly to obtain for nodes in large-scale networks. As an alternative, this work uses network-embedding-based features, where the feature vectors of the nodes are learned from the network topology. We assume that the network topology and the nodal influence of a small subset of the nodes are known. We then show how to build and optimize a machine learning framework in which only 10% of the nodes are used as training data and which could be applicable even to large-scale networks. Additionally, we demonstrate why network-embedding-based features are applicable to the node influence prediction task.
The findings show that node pairs that are closer to each other in the network are also embedded closer together in the embedding space (exhibiting a higher similarity). The performance evaluation of the predictive models illustrates that network-embedding-based features can compete with classical topological metrics, despite the disadvantage of their higher dimensionality. This is achieved by combining the embedding features with individual low-cost topology features such as the degree.
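The overall setup, with influence labels known for 10% of the nodes and predicted for the rest from node feature vectors, can be sketched as follows. This is only an illustration under stated assumptions: the k-nearest-neighbour regressor with cosine similarity is a simple stand-in and not the model used in this work, and the two-dimensional feature vectors and influence values are synthetic placeholders rather than learned embeddings or measured results.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def split_nodes(nodes, train_fraction=0.1, seed=0):
    """Randomly pick a fraction of the nodes as training nodes; the rest are test nodes."""
    rng = random.Random(seed)
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    n_train = max(1, round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


def knn_predict(features, train_labels, node, k=3):
    """Predict a node's influence as the mean label of its k most similar training nodes."""
    ranked = sorted(
        train_labels,
        key=lambda t: cosine_similarity(features[node], features[t]),
        reverse=True,
    )
    nearest = ranked[:k]
    return sum(train_labels[t] for t in nearest) / len(nearest)


# Synthetic placeholder data (hypothetical): 100 nodes whose feature vectors
# lie on a quarter circle and whose influence varies smoothly along it.
n = 100
features = {i: (math.cos(i / (n - 1) * math.pi / 2),
                math.sin(i / (n - 1) * math.pi / 2)) for i in range(n)}
influence = {i: i / (n - 1) for i in range(n)}

train_nodes, test_nodes = split_nodes(range(n), train_fraction=0.1)
train_labels = {i: influence[i] for i in train_nodes}   # only 10% of labels are used
predictions = {i: knn_predict(features, train_labels, i) for i in test_nodes}

mean_abs_error = sum(abs(predictions[i] - influence[i]) for i in test_nodes) / len(test_nodes)
print(f"train: {len(train_nodes)}, test: {len(test_nodes)}, MAE: {mean_abs_error:.3f}")
```

A similarity-based predictor of this kind only works if similar feature vectors correspond to nodes with similar influence, which is why the relation between network proximity and embedding similarity matters. A low-cost topological feature such as the degree could be appended to each vector as an extra dimension.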