Dense Reachable Subgraphs
Hardness, Algorithms and Experiments
S.P. van Krieken (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Leo van Iersel – Mentor (TU Delft - Discrete Mathematics and Optimization)
JLA Dubbeldam – Graduation committee member (TU Delft - Mathematical Physics)
Solon Pissis – Mentor (Centrum Wiskunde & Informatica (CWI))
Hilde Verbeek – Mentor (Centrum Wiskunde & Informatica (CWI))
More Info
expand_more
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
We introduce two novel optimization problems on vertex-weighted multilayer graphs: the Predecessor Dense Reachable Subgraph (PDRS) problem and the Neighborhood Dense Reachable Subgraph (NDRS) problem. In PDRS, the aim is to find a subset of vertices that maximizes the ratio of the total vertex weight to the size of its predecessor neighborhood, under reachability constraints. A subset is considered reachable when it is a union of paths from the first layer to the last. The NDRS variant is equivalent to the PDRS problem, but the size of the entire neighborhood is used for the denominator of the objective function. The motivation for these problems comes from money laundering detection. We prove NP-hardness by reduction from the Vertex Cover problem and formulate MILPs that give exact solutions to the problems. Additionally, we show that the single path variant, where the solution space is restricted to single paths from the first layer to the last, can be solved in polynomial time using dynamic programming. We introduce a heuristic algorithm called Greedy Single Paths, which starts with the empty graph and uses an exact algorithm for the single path variant to iteratively grow the solution. We also introduce a heuristic algorithm called Greedy Peeling. This algorithm starts with the whole graph and iteratively peels vertices from the allowed neighbor set. The algorithms are implemented and tested on synthetic data sets to assess their performance in terms of running time and solution quality.