A Genetic Approach to Programming Large-Scale Systems

More Info
expand_more

Abstract

In recent years, there is an increasing interest in the scientific community for development of algorithms targeting large-scale networks. In addition to their sheer size, these networks also exhibit various topology dynamics -- nodes join and exit at high rates (churn), are mobile and are not always reliable. Such extreme properties make centralized algorithms unsuitable in a lot of practical situations. %Moreover, the majority of distributed algorithms require a coordinated starting condition. When execution ends (e.g., leader is elected), the entire procedure needs to be restarted thus leading to a lack of responsiveness to internal and external dynamics. Our goal is to find algorithms that run continuously and are based on local rules only (i.e., decisions based on information exchanged through 1-hop interactions). If designed correctly, these algorithms are highly scalable and robust. However, identifying the local rules is a very difficult task with no clear discovery methodology. The heuristics we introduce in this thesis represent a so-called "global-to-local compiler" -- designed to search for a mapping between global system requirements and local interaction rules. We test search methods such as Hill-Climbing, Simulated Annealing and Tabu Search. Additionally, we develop custom heuristics such as GG (Generation Gap) Tabu Hill-Climbing, GG Tabu Simulated Annealing and an implementation of GOMEA, named GOMGP. We perform thorough benchmarking in order to find the optimal parameter configuration. For the identified parameter combinations, the heuristics vastly outperform state-of-the-art heuristics. We discover new algorithms for the problems: "distributed maximal independent set formation'' and "event localization using graph spectral theory'' with good results and insights. Finally, we improve existing algorithms for the "node failure rate estimation'' (FailDetect) and "churn rate estimation'' (ChurnDetect). The improvements consist of lower estimation variance at network nodes.