Finding and materializing common subexpressions among queries in a query workload

Master Thesis (2020)
Author(s)

Mark Pasterkamp (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Asterios Katsifodimos – Mentor (TU Delft - Web Information Systems)

Faculty
Electrical Engineering, Mathematics and Computer Science
Publication Year
2020
Language
English
Graduation Date
07-02-2020
Awarding Institution
Delft University of Technology
Programme
Computer Science | Software Technology
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Most queries in a collection of queries, also called a query workload, share some of their intermediate execution steps. These intermediate execution steps, also called subexpressions, offer an opportunity to optimize query workload execution beyond the query optimization already performed by the DBMS. A lot of research has been done on this topic; however, most of it is either proprietary or not applicable to large query workloads. In this thesis I developed a simple yet effective heuristic algorithm that quickly finds common subexpressions and materializes them to disk using open source software, with results showing a significant increase in performance.

Files

Thesis.pdf
(pdf | 10.5 Mb)
License info not available