Streaming Graph Embeddings via Incremental Neighborhood Sketching
Dingqi Yang (University of Macau)
Bingqing Qu (United International College)
Jie Yang (TU Delft - Web Information Systems)
Liang Wang (Northwestern Polytechnical University)
Philipe Cudre-Mauroux (University of Fribourg)
More Info
expand_more
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Graph embeddings have become a key paradigm to learn node representations and facilitate downstream graph analysis tasks. Many real-world scenarios such as online social networks and communication networks involve streaming graphs, where edges connecting nodes are continuously received in a streaming manner, making the underlying graph structures evolve over time. Such a streaming graph raises great challenges for graph embedding techniques not only in capturing the structural dynamics of the graph, but also in efficiently accommodating high-speed edge streams. Against this background, we propose SGSketch, a highly-efficient streaming graph embedding technique via incremental neighborhood sketching. SGSketch cannot only generate high-quality node embeddings from a streaming graph by gradually forgetting outdated streaming edges, but also efficiently update the generated node embeddings via an incremental embedding updating mechanism. Our extensive evaluation compares SGSketch against a sizable collection of state-of-the-art techniques using both synthetic and real-world streaming graphs. The results show that SGSketch achieves superior performance on different graph analysis tasks, showing 31.9% and 21.9% improvement on average over the best-performing static and dynamic graph embedding baselines, respectively. Moreover, SGSketch is significantly more efficient in both embedding learning and incremental embedding updating processes, showing 54x-1813x and 118x-1955x speedup over the baseline techniques, respectively.