Finding and materializing common subexpressions among queries in a query workload
Mark Pasterkamp (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Asterios Katsifodimos – Mentor (TU Delft - Web Information Systems)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Most queries in a collection of queries, also called a query workload, share some of their intermediate execution steps. These intermediate execution steps, also called subexpressions, offer an opportunity to optimize query workload execution beyond the query optimization already performed by the DBMS. Much research has been done on this topic; however, most of it is either proprietary or not applicable to large query workloads. In this thesis I developed a simple yet effective heuristic algorithm that quickly finds common subexpressions and materializes them to disk using open source software. The results show a significant increase in performance.
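To make the notion of a common subexpression concrete, the following is a minimal sketch and not the heuristic algorithm developed in the thesis. It assumes query plans are represented as nested tuples of the form (operator, argument, children...), and the plans, table name, and function names are hypothetical. It counts how many queries in a toy workload contain each plan subtree and keeps those that occur in at least two queries.

```python
from collections import Counter

# Assumption: a plan node is a tuple (operator, argument, child, child, ...).
# The toy plans below are hypothetical and not taken from the thesis.

def subexpressions(plan):
    """Yield the plan itself and every subtree below it."""
    yield plan
    for child in plan[2:]:
        yield from subexpressions(child)

def common_subexpressions(workload):
    """Return the subtrees that occur in at least two different queries."""
    counts = Counter()
    for plan in workload:
        # Use a set so a subtree repeated inside one query is counted once.
        for sub in set(subexpressions(plan)):
            counts[sub] += 1
    return {sub: n for sub, n in counts.items() if n >= 2}

scan_orders = ("scan", "orders")
recent = ("filter", "year >= 2020", scan_orders)

q1 = ("aggregate", "sum(total)", recent)
q2 = ("project", "customer_id", recent)
q3 = ("aggregate", "count(*)", scan_orders)

shared = common_subexpressions([q1, q2, q3])
for sub, n in shared.items():
    print(n, sub)
```

In this toy workload the scan of `orders` is shared by all three queries and the filtered scan by two of them. Either could be a candidate for materialization; which one is worth writing to disk depends on cost considerations that this sketch does not model.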