Title
Fletcher: A framework to efficiently integrate FPGA accelerators with Apache Arrow
Author
Peltenburg, J.W. (TU Delft Computer Engineering; ORCID 0000-0002-7043-7131)
van Straten, J. (TU Delft FTQC/Bertels Lab; ORCID 0000-0002-5610-2511)
Wijtemans, L. (TU Delft Education and Research Support)
Van Leeuwen, Lars (Student TU Delft)
Al-Ars, Z. (TU Delft Computer Engineering; ORCID 0000-0001-7670-8572)
Hofstee, H.P. (TU Delft Computer Engineering; IBM Austin)
Contributor
Sourdis, Ioannis (editor)
Bouganis, Christos-Savvas (editor)
Alvarez, Carlos (editor)
Toledo Diaz, Leonel Antonio (editor)
Valero, Pedro (editor)
Martorell, Xavier (editor)
Date
2019-09-01
Abstract
Modern big data systems are highly heterogeneous. The components found in their many layers of abstraction are often implemented in a wide variety of programming languages and frameworks. Due to language implementation differences, interfaces between these components, including hardware-accelerated components, are often burdened by serialization overhead. The serialization bandwidth of many high-level language frameworks is an order of magnitude lower than contemporary FPGA accelerator interface bandwidth, especially when objects are small but numerous. Therefore, serialization bounds the effective end-to-end performance of FPGA-accelerated solutions integrated with applications written in high-level languages. The Apache Arrow project defines a language-agnostic columnar in-memory format optimized for big data applications, removing the need to serialize or even make copies during communication between components. To enable FPGA accelerators to benefit from the approach of Arrow, we first investigate the properties of its format in relation to hardware interfaces and establish that the format is usable. Second, we present the Fletcher framework, which automatically generates highly efficient hardware interfaces to access data of potentially complex, nested Arrow data types. Our approach allows 11 of the languages supported by Apache Arrow libraries to efficiently communicate large data sets with FPGA accelerators at system bandwidth. Furthermore, on the hardware side, the generated interfaces deliver any data type that Arrow can represent as groups of streams, providing a better starting point for data-flow-oriented kernel development than manually creating custom interfaces to address issues related to pointer arithmetic, bus word misalignment and latency.
For example applications, as measured on an AWS EC2 F1 instance and a CAPI2-enabled POWER9 system, accelerated end-to-end application performance improves by 1.3x-49x compared to a hardware-accelerated solution that still requires serialization.
Subject
Accelerator bandwidth
Apache Arrow
Big data systems
FPGA acceleration
Serialization
To reference this document use:
http://resolver.tudelft.nl/uuid:8074f97a-013e-4ab7-819d-bff6f56c35dc
DOI
https://doi.org/10.1109/FPL.2019.00051
Publisher
IEEE
ISBN
9781728148847
Source
Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019
Event
29th International Conference on Field-Programmable Logic and Applications, FPL 2019, 2019-09-09 → 2019-09-13, Barcelona, Spain
Series
Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019
Part of collection
Institutional Repository
Document type
conference paper
Rights
© 2019 J.W. Peltenburg, J. van Straten, L. Wijtemans, Lars Van Leeuwen, Z. Al-Ars, H.P. Hofstee