Epidemic Processes on Complex Networks

Modelling, Simulation and Algorithms

More Info
expand_more

Abstract

Local interactions on a graph will lead to global dynamic behaviour. In this thesis we focus on two types of dynamic processes on graphs: the Susceptible-Infected-Susceptilbe (SIS) virus spreading model, and gossip style epidemic algorithms. The largest part of this thesis is devoted to the SIS model. We first introduce the SIS model in chapter 2. Even though the SIS model is a Markov process, and good mathematical tools exist to analyse Markov processes, the exploding state space of the SIS model make many of the standard approaches infeasible. Mean-field approximations and simulations are usually our best bet when trying to solve for the average fraction of infected nodes during an outbreak. We compare two of the mean-field approximations commonly used in literature, and benchmark them against a modified version of the SIS model, the "-SIS model. We show that especially around the epidemic threshold, which separates a regime where the virus dies out quickly from a regime where it stays in the network for a very long time (called the meta-stable state), the two approximations can deviate from the true value. In general, the NIMFA approximation has the upper hand in terms of accuracy. Although the time behaviour of the SIS model is used loosely in the description of the meta-stable state and the epidemic threshold, it has received less attention in research than the average fraction of infected nodes in meta-stable state. In chapter 3 we show that the survival time of an SIS process is exponentially distributed, and derive exact expressions for the average survival time in the complete and the star graph. We also propose a definition of the epidemic threshold using the survival time, and show in the star graph and the complete graph that it is accurate. In general, viruses do not live in isolation. In chapter 4 we show that two viruses competing for the same healthy nodes in a network form a rich dynamic system. If the viruses are prevented from dying out, one will completely dominate the other in terms of the number of infected nodes, but this domination will not last. Even if the dominated virus is down to only a few infected nodes, it will regain strength and eventually dominate the other virus. We compute the average domination time and show that it depends on the number of infected nodes at the start of a domination period. Also, we show that weaker or slower viruses can still dominate their stronger or quicker competitors, but for shorter periods of time. Real world spreading phenomena generally have infection and curing processes that are not well modelled by Poisson processes. In the context of content propagation, for example, it has been shown that human (online) behaviour is bursty: long periods of inactivity are followed by short periods of intense activity. In chapter 5, we analyse the effect of non exponential inter-arrival time distributions on the behaviour of an SIS process using a Weibull distribution. As the exponential distribution is a special case of the Weibull distribution, we can gradually steer the distribution away from exponential. The effects on the average fraction of infected nodes in the meta-stable state for the same average inter-arrival time is dramatic. However, by keeping the expected number of infection attempts during an infectious period constant, the NIMFA equations still hold in the non-Markovian setting. The fact that spreading attempts are no longer uniformly distributed over the infectious period of a node leads to shorter or longer survival times and can lead to a higher or lower fraction of infected nodes. When infection attempts tend to be concentrated at either the beginning or the end of the infectious period, the survival time is shorter. Also, the accuracy of the NIMFA approximation depends on the inter-arrival time distribution. The class of epidemic or gossip algorithms mimic the spreading of an infection in its communication process. In chapter 6 we develop COUNT, an epidemic algorithm that can count, average and find minima and maxima in a large distributed dynamic network. COUNT can be sped up dramatically by using BEACON, an algorithm where nodes compete amongst themselves to become the beacon node in a network towards which the message in COUNT can be forwarded. The two algorithms together form GOSSIPCO, a much faster version of COUNT that also includes convergence detection. GOSSIPICO is robust against network dynamics by the virtue of an automatic recount process that each node can trigger in response to changes in the network. The algorithm converges in logarithmic time in random graphs and scale-free graphs, and in a time proportional to the average hopcount in graphs such as lattices and paths. Finally, in chapter 7 we develop a framework to extract a network from a dataset based on a mapping. The interactions between entities that are recorded in the dataset, in our case a record of millions of instances of played matches in the online game Defence of the Ancients (DOTA), are mapped to links in the graph. We show that different mappings lead to different graphs, highlighting different implicit social interactions amongst players. We show that thresholding has a large effect on which social structures are revealed by the mappings, and can be used in these graphs as a crude but fast community detection scheme. Also in chapter 7, we investigate how to adapt networks to make them less vulnerable to viruses. The underlying idea is that by reducing the largest eigenvalue of the adjacency matrix, which is an approximation of the epidemic threshold, the real epidemic threshold is increased. As the minimisation problem is an NP-hard problem, we compare various strategies in a greedy optimisation approach. In appendix A we describe most of the software that we developed during the course of our research, while in appendix B we provide an overview of the SIS simulator we developed to perform all the simulations in the first five chapters.

Files

Proefschrift.pdf
(pdf | 4.19 Mb)
Unknown license