Efficient and scalable trie-based algorithms for computing set containment relations

Conference Paper (2015)
Author(s)

Yongming Luo (Eindhoven University of Technology)

George H. L. Fletcher (Eindhoven University of Technology)

A.J.H. Hidders (TU Delft - Web Information Systems)

Paul De Bra (Eindhoven University of Technology)

Research Group
Web Information Systems
DOI related publication
https://doi.org/10.1109/ICDE.2015.7113293
More Info
expand_more
Publication Year
2015
Language
English
Research Group
Web Information Systems
Pages (from-to)
303-314

Abstract

Computing containment relations between massive collections of sets is a fundamental operation in data management, for example in graph analytics and data mining applications. Motivated by recent hardware trends, in this paper we present two novel solutions for computing set-containment joins over massive sets: the Patricia Trie-based Signature Join (PTSJ) and PRETTI+, a Patricia trie enhanced extension of the state-of-the-art PRETTI join. The compact trie structure not only enables efficient use of main-memory, but also significantly boosts the performance of both approaches. By carefully analyzing the algorithms and conducting extensive experiments with various synthetic and real-world datasets, we show that, in many practical cases, our algorithms are an order of magnitude faster than the state-of-the-art.

No files available

Metadata only record. There are no files for this record.