Multi-way Hash Join Based on FPGAs

Master thesis (2018)

Authors

K. Huang Electrical Engineering, Mathematics and Computer Science

Contributors

H.P. Hofstee (mentor)

J. Fang (coach)

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use:

http://resolver.tudelft.nl/uuid:17f9df3a-df17-43f3-b92e-eeaf06a6903a

Published Date

30-01-2018

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The multi-way hash join is a commonly used and time-consuming database operation. Many algorithms have been developed to accelerate it, some of which use accelerators such as field-programmable gate arrays (FPGAs). However, most previous work focused on computation-intensive operations such as (de)compression, because the interface between the FPGA and the host could provide only relatively low bandwidth.

A new generation of high-bandwidth, low-latency interfaces between host processors and accelerators, such as the Open Coherent Accelerator Processor Interface (OpenCAPI), gives FPGAs new opportunities to accelerate database operations. In this thesis, we explore the potential of using OpenCAPI-attached FPGAs to accelerate multi-way joins. Via OpenCAPI, the FPGA can communicate with the CPUs and main memory at a bandwidth of 25.6 GB/s. We first survey previous research on software-based multi-way joins and observe that this operation is limited by main-memory bandwidth. The main challenge in designing the accelerator is therefore to avoid unnecessary memory accesses. We partition the build relations into pieces small enough that the hash table for each piece fits in block RAMs (BRAMs), which avoids multiple passes over memory. In our design, the intermediate join phase is pipelined with a partition phase to reduce the size of the intermediate results. The proposed design is configurable for the attached bandwidth, and it can achieve a throughput of 5 GB/s when 25.6 GB/s of bandwidth is provided.
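
As a rough software illustration of the partition-then-join idea described in the abstract, the following Python sketch joins a probe relation R(a, b) with two build relations S(a, x) and T(b, y). It is not the thesis's hardware design: the function names, the relation layout, and the partition count are assumptions made for illustration, and the steps run one after another here rather than being pipelined as in the proposed accelerator. Each partition's hash table stands in for a table small enough to fit in on-chip BRAM.

```python
from collections import defaultdict


def partition(rows, key_index, num_partitions):
    """Split rows into buckets by the hash of the join key."""
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[hash(row[key_index]) % num_partitions].append(row)
    return parts


def hash_join(build_rows, probe_rows, probe_key):
    """Build a hash table on column 0 of build_rows, then probe it."""
    table = defaultdict(list)
    for b in build_rows:
        table[b[0]].append(b)
    for p in probe_rows:
        for b in table.get(p[probe_key], ()):
            # Append the build row's payload (everything after its key).
            yield p + b[1:]


def partitioned_join(build, probe, probe_key, num_partitions):
    """Partition both sides on the join key, then join bucket by bucket."""
    build_parts = partition(build, 0, num_partitions)
    probe_parts = partition(probe, probe_key, num_partitions)
    out = []
    for bp, pp in zip(build_parts, probe_parts):
        out.extend(hash_join(bp, pp, probe_key))
    return out


def multi_way_join(r, s, t, num_partitions=4):
    """Join R(a, b) with S(a, x) on a, then with T(b, y) on b."""
    # The intermediate result is re-partitioned on b before the second join.
    intermediate = partitioned_join(s, r, 0, num_partitions)
    return partitioned_join(t, intermediate, 1, num_partitions)


if __name__ == "__main__":
    R = [(1, 10), (2, 20), (3, 30)]
    S = [(1, "s1"), (2, "s2")]
    T = [(10, "t10"), (30, "t30")]
    print(multi_way_join(R, S, T))
```

Because matching keys always hash to the same bucket, each bucket can be joined independently with a small hash table, so each relation is read once for partitioning and once for joining instead of being probed repeatedly against a table that does not fit on chip.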

Files

Master_Thesis_Kangli_Huang.pdf

(pdf | 2.03 Mb)

Unknown license