Search results | TU Delft Repositories

document: Battling the CPU Bottleneck in Apache Parquet to Arrow Conversion Using FPGA
Peltenburg, J.W. (author), Van Leeuwen, Lars T.J. (author), Hoozemans, J.J. (author), Fang, J. (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
In the domain of big data analytics, the bottleneck of converting storage-focused file formats to in-memory data structures has shifted from the bandwidth of storage to the performance of decoding and decompression software. Two widely used formats for big data storage and in-memory data are Apache Parquet and Apache Arrow, respectively. In...
conference paper 2021

document: An Efficient High-Throughput LZ77-Based Decompressor in Reconfigurable Logic
Fang, J. (author), Chen, Jianyu (author), Lee, Jinho (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
Making the best use of high-bandwidth storage and network technologies requires an improvement in the speed at which we can decompress data. We present a “refine and recycle” method applicable to LZ77-type decompressors that enables efficient high-bandwidth designs and present an implementation in reconfigurable logic. The method refines the write...
journal article 2020

document: A High-Bandwidth Snappy Decompressor in Reconfigurable Logic
Fang, J. (author), Chen, Jianyu (author), Al-Ars, Z. (author), Hofstee, H.P. (author), Hidders, Jan (author)
While in-memory databases have largely removed I/O as a bottleneck for database operations, loading the data from storage into memory remains a significant limiter to end-to-end performance. Snappy is a widely used compression algorithm in the Hadoop ecosystem and in database systems and is an option in often-used file formats such as Parquet...
conference paper 2018