Title: High-Throughput Big Data Analytics Through Accelerated Parquet to Arrow Conversion
Author: van Leeuwen, Lars (TU Delft Electrical Engineering, Mathematics and Computer Science; TU Delft Quantum & Computer Engineering)
Contributors: Al-Ars, Z. (mentor); Peltenburg, J.W. (graduation committee); Rellermeyer, Jan S. (graduation committee); Hofstee, H.P. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Computer Engineering
Date: 2019-08-27
Abstract: With the advent of high-bandwidth non-volatile storage devices, the classical assumption that database analytics applications are bottlenecked by CPUs waiting for slow I/O devices is being flipped around. Instead, CPUs are no longer able to decompress and deserialize data stored in storage-focused file formats fast enough to keep up with the speed at which compressed data is read from storage. To better utilize the increasing I/O bandwidth, this work proposes a hardware-accelerated approach to converting storage-focused file formats into in-memory data structures. To that end, an FPGA-based Apache Parquet reading engine is developed that uses existing FPGA and memory-interfacing hardware to write data to memory in Apache Arrow's in-memory format. A modular and expandable hardware architecture called the ParquetReader, with out-of-the-box support for the DELTA_BINARY_PACKED and DELTA_LENGTH_BYTE_ARRAY encodings, is proposed and implemented on an Amazon EC2 F1 instance with an XCVU9P FPGA. The ParquetReader is highly area-efficient: a single ParquetReader requires only between 1.18% and 2.79% of LUTs, between 1.27% and 2.92% of registers, and between 2.13% and 4.47% of BRAM, depending on the targeted input data type and encoding. This area efficiency allows a large number of (possibly different) ParquetReaders to be instantiated for parallel workloads.
Multiple Parquet files of varying types and encodings were generated to measure the performance of the ParquetReaders. Compared to CPU-only Parquet reading implementations, a single engine achieved up to 2.81x speedup for DELTA_LENGTH_BYTE_ARRAY-encoded strings and 2.79x speedup for DELTA_BINARY_PACKED integers, attaining a throughput between 2.3 GB/s and 7.2 GB/s (limited by the interface bandwidth of the testing system) depending on the input data. The high throughput and low resource utilization of the ParquetReader allow the interface bandwidth to be saturated by multiple ParquetReaders while using only a small fraction of the FPGA's resources.
Subject: FPGA; Apache Parquet; Apache Arrow; Big Data; accelerator
To reference this document use: http://resolver.tudelft.nl/uuid:e64b56b7-ecdc-4f47-8aed-3dfbf7e269ac
Part of collection: Student theses
Document type: master thesis
Rights: © 2019 Lars van Leeuwen
Files: LTJvanLeeuwen_thesis.pdf (PDF, 3.58 MB)
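The abstract names two Parquet encodings the ParquetReader decodes in hardware. As a rough software illustration of the underlying ideas, the sketch below shows simplified delta encoding for integers and the lengths-plus-bytes layout used for strings. This is not the thesis's hardware design, and real DELTA_BINARY_PACKED additionally uses block-wise min-deltas with bit-packing; all function names here are illustrative.

```python
# Simplified sketch of the two Parquet encoding ideas (illustrative only):
# DELTA_BINARY_PACKED stores successive differences between integers;
# DELTA_LENGTH_BYTE_ARRAY stores delta-encoded string lengths followed by
# the concatenated string bytes.
from itertools import accumulate

def delta_encode(values):
    """Keep the first value, then store successive differences."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(encoded):
    """Prefix-sum the differences back into the original values."""
    return list(accumulate(encoded))

def delta_length_encode(strings):
    """Delta-encode the lengths; concatenate the raw bytes."""
    lengths = [len(s) for s in strings]
    return delta_encode(lengths), "".join(strings).encode()

def delta_length_decode(enc_lengths, data):
    """Recover lengths, then slice the byte blob back into strings."""
    out, pos = [], 0
    for n in delta_decode(enc_lengths):
        out.append(data[pos:pos + n].decode())
        pos += n
    return out
```

Because decoding is a prefix sum over small deltas plus sequential byte slicing, both encodings map naturally onto a streaming hardware pipeline, which is what makes them attractive targets for an FPGA reader.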