Supporting Columnar In-memory Formats on FPGA

The Hardware Design of Fletcher for Apache Arrow

Conference Paper (2019)
Author(s)

Johan Peltenburg (TU Delft - Computer Engineering)

Jeroen van Straten (TU Delft - Computer Engineering)

M. Brobbel (TU Delft - Computer Engineering)

H. Peter Hofstee (IBM)

Z. Al-Ars (TU Delft - Computer Engineering)

Research Group
Computer Engineering
Copyright
© 2019 J.W. Peltenburg, J. van Straten, M. Brobbel, H.P. Hofstee, Z. Al-Ars
DOI related publication
https://doi.org/10.1007/978-3-030-17227-5_3
Publication Year
2019
Language
English
Pages (from-to)
32-47
ISBN (print)
978-3-030-17226-8
ISBN (electronic)
978-3-030-17227-5
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

As a columnar in-memory format, Apache Arrow has seen increased interest from the data analytics community. Fletcher is a framework that generates hardware interfaces based on this format for use in FPGA accelerators. This allows efficient integration of FPGA accelerators with various high-level software languages, while providing an easy-to-use hardware interface for the FPGA developer. The abstract descriptions of data sets stored in the Arrow format, which form the input of the interface generation step, can be complex, and generating efficient interfaces from them is challenging. In this paper, we introduce the hardware components of Fletcher that help solve this challenge. These components allow FPGA developers to express access to complex Arrow data records through row indices of tabular data sets, rather than through byte addresses. The data records are delivered as streams of the same abstract types as found in the data set, rather than as memory bus words. The generated interfaces can utilize the full system bandwidth and have a low area profile. All components are open source and available for other researchers and developers to use in their projects.
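
To make the distinction between row indices and byte addresses concrete, the following is a minimal software sketch, not Fletcher's hardware or API. It assumes a simplified model of an Arrow variable-length string column: an offsets buffer of little-endian 32-bit integers with one more entry than there are rows, and a values buffer holding the concatenated UTF-8 bytes. All function names are hypothetical and chosen for illustration only.

```python
import struct


def build_string_column(strings):
    """Pack strings into an Arrow-style (offsets buffer, values buffer) pair."""
    encoded = [s.encode("utf-8") for s in strings]
    offsets = [0]
    for item in encoded:
        offsets.append(offsets[-1] + len(item))
    # One int32 per row boundary, little-endian, as laid out in memory.
    offsets_buf = struct.pack(f"<{len(offsets)}i", *offsets)
    return offsets_buf, b"".join(encoded)


def read_offset(offsets_buf, index):
    """Read the int32 offset at a given index of the offsets buffer."""
    return struct.unpack_from("<i", offsets_buf, index * 4)[0]


def rows_to_byte_range(offsets_buf, first_row, last_row):
    """Translate the row range [first_row, last_row) into a byte range
    of the values buffer. This is the bookkeeping a developer would do
    by hand when working with byte addresses."""
    return read_offset(offsets_buf, first_row), read_offset(offsets_buf, last_row)


def stream_rows(offsets_buf, values_buf, first_row, last_row):
    """Yield rows as typed elements (strings) instead of raw bytes."""
    for row in range(first_row, last_row):
        start = read_offset(offsets_buf, row)
        end = read_offset(offsets_buf, row + 1)
        yield values_buf[start:end].decode("utf-8")


if __name__ == "__main__":
    offsets_buf, values_buf = build_string_column(["arrow", "", "fpga", "fletcher"])
    print(rows_to_byte_range(offsets_buf, 1, 3))
    print(list(stream_rows(offsets_buf, values_buf, 1, 4)))
```

In this model the caller supplies only a row range. The address arithmetic and the conversion from raw bytes to typed elements sit behind `stream_rows`, which mirrors, in software, the kind of interface the abstract describes.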
