Efficient execution of top-K SPARQL queries

Conference Paper (2012)
Author(s)

Sara Magliacane (Politecnico di Milano, Vrije Universiteit Amsterdam)

Alessandro Bozzon (Politecnico di Milano)

Emanuele Della Valle (Politecnico di Milano)

Affiliation
External organisation
DOI related publication
https://doi.org/10.1007/978-3-642-35176-1_22
Publication Year
2012
Language
English
Pages (from-to)
344-360
ISBN (print)
978-3-642-35175-4

Abstract

Top-k queries, i.e., queries returning the top k results ordered by a user-defined scoring function, are an important category of queries. Order is an important property of data that can be exploited to speed up query processing. State-of-the-art SPARQL engines underuse order, and top-k queries are mostly managed with a materialize-then-sort processing scheme that computes all the matching solutions (e.g., thousands) even if only a limited number k (e.g., ten) are requested. The SPARQL-RANK algebra is an extended SPARQL algebra that treats order as a first-class citizen, enabling efficient split-and-interleave processing schemes that can be adopted to improve the performance of top-k SPARQL queries. In this paper we propose an incremental execution model for SPARQL-RANK queries, we compare the performance of alternative physical operators, and we propose a rank-aware join algorithm optimized for native RDF stores. Experiments conducted with an open-source implementation of a SPARQL-RANK query engine based on ARQ show that the evaluation of top-k queries can be sped up by orders of magnitude.
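The contrast between materialize-then-sort and rank-aware processing can be illustrated with a small sketch. The abstract does not describe the paper's own join algorithm, so the code below is not the authors' method: it is a generic threshold-based rank join in the style of the rank-join literature, written under the assumptions that each input is a list of `(key, score)` pairs already sorted by descending score and that the scoring function is the sum of the two scores. The names `materialize_then_sort` and `rank_join` are hypothetical.

```python
import heapq
from collections import defaultdict
from math import inf


def materialize_then_sort(left, right, k):
    """Baseline: compute every join result, sort by score, keep the top k."""
    right_by_key = defaultdict(list)
    for key, score in right:
        right_by_key[key].append(score)
    joined = [(key, ls + rs)
              for key, ls in left
              for rs in right_by_key.get(key, [])]
    joined.sort(key=lambda row: row[1], reverse=True)
    return joined[:k]


def rank_join(left, right, k):
    """Illustrative rank join (assumption: inputs sorted by descending score,
    combined score = sum). Returns (top-k results, number of tuples read)."""
    inputs = [iter(left), iter(right)]
    seen = [defaultdict(list), defaultdict(list)]  # scores read so far, per key
    top = [None, None]    # first (highest) score read on each side
    last = [None, None]   # most recent (lowest) score read on each side
    done = [False, False]
    heap = []             # candidate results as (-score, key)
    results, pulled, side = [], 0, 0

    def threshold():
        """Upper bound on the score of any result not yet generated."""
        terms = []
        for s in (0, 1):
            o = 1 - s
            if done[s]:
                continue              # no unread tuples remain on side s
            if done[o] and top[o] is None:
                continue              # other side is empty: nothing can join
            if last[s] is None or top[o] is None:
                return inf            # not enough information to bound yet
            terms.append(last[s] + top[o])
        return max(terms, default=-inf)

    while len(results) < k:
        if done[0] and done[1]:
            break
        if done[side]:
            side = 1 - side
        item = next(inputs[side], None)
        if item is None:
            done[side] = True
        else:
            pulled += 1
            key, score = item
            if top[side] is None:
                top[side] = score
            last[side] = score
            seen[side][key].append(score)
            # Join the new tuple with matching tuples already read on the other side.
            for other_score in seen[1 - side][key]:
                heapq.heappush(heap, (-(score + other_score), key))
        side = 1 - side  # interleave reads between the two inputs
        bound = threshold()
        # A candidate is safe to emit once no unseen result can beat it.
        while heap and len(results) < k and -heap[0][0] >= bound:
            neg_score, key = heapq.heappop(heap)
            results.append((key, -neg_score))
    return results, pulled


left = [("a", 90), ("b", 80), ("c", 30), ("d", 10)]
right = [("b", 95), ("a", 70), ("d", 60), ("c", 20)]
top2, tuples_read = rank_join(left, right, 2)
print(top2, tuples_read)
```

On this sample input both functions return the same two answers, but the rank join stops after reading 5 of the 8 input tuples, whereas the baseline joins and sorts everything before discarding all but k rows.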
