Searched for: +
(1 - 20 of 26)

Pages

document
Ji, M. (author), Al-Ars, Z. (author), Hofstee, H.P. (author), Chang, Yuchun (author), Zhang, Baolin (author)
Convolutional neural networks (CNNs) are known to be effective in many application domains, especially in the computer vision area. To achieve lower-latency CNN processing and reduce power consumption, developers are experimenting with using FPGAs to accelerate CNN processing in several applications. Current FPGA CNN accelerators usually use...
journal article 2023
document
Cromjongh, Casper (author), Tian, Y. (author), Hofstee, H.P. (author), Al-Ars, Z. (author)
In spite of progress on hardware design languages, the design of high-performance hardware accelerators forces many design decisions specializing the interfaces of these accelerators in ways that complicate the understanding of the design and hinder modularity and collaboration. In response to this challenge, Tydi is presented as an open...
conference paper 2023
document
Reukers, Matthijs A. (author), Tian, Y. (author), Al-Ars, Z. (author), Hofstee, H.P. (author), Brobbel, M. (author), Peltenburg, J.W. (author), van Straten, J. (author)
Tydi is an open specification for streaming dataflow designs in digital circuits, allowing designers to express how composite and variable-length data structures are transferred over streams using clear, data-centric types. These data types are extensively used in many application domains, such as big data and SQL applications. This way,...
journal article 2023
document
Ahmad, T. (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
Moving structured data between different big data frameworks and/or data warehouses/storage systems often causes significant overhead. Most of the time, more than 80% of the total time spent accessing data goes to the serialization/de-serialization step. Columnar data formats are gaining popularity in both analytics and transactional...
conference paper 2022
document
Ahmad, T. (author), Ma, Chengxin (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
Current cluster scaled genomics data processing solutions rely on big data frameworks like Apache Spark, Hadoop and HDFS for data scheduling, processing and storage. These frameworks come with additional computation and memory overheads by default. It has been observed that scaling genomics dataset processing beyond 32 nodes is not efficient on...
conference paper 2022
document
Park, Seongyeon (author), Kim, Hajin (author), Ahmad, T. (author), Ahmed, N. (author), Al-Ars, Z. (author), Hofstee, H.P. (author), Kim, Youngsok (author), Lee, Jinho (author)
Sequence alignment forms an important backbone in many sequencing applications. A commonly used strategy for sequence alignment is an approximate string matching with a two-dimensional dynamic programming approach. Although some prior work has been conducted on GPU acceleration of a sequence alignment, we identify several shortcomings that limit...
conference paper 2022
document
Peltenburg, J.W. (author), Van Leeuwen, Lars T.J. (author), Hoozemans, J.J. (author), Fang, J. (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
In the domain of big data analytics, the bottleneck of converting storage-focused file formats to in-memory data structures has shifted from the bandwidth of storage to the performance of decoding and decompression software. Two widely used formats for big data storage and in-memory data are Apache Parquet and Apache Arrow, respectively. In...
conference paper 2021
document
Zhu, B. (author), Hofstee, H.P. (author), Lee, Jinho (author), Al-Ars, Z. (author)
The attention mechanism has been regarded as an advanced technique to capture long-range feature interactions and to boost the representation capability of convolutional neural networks. However, we found two ignored problems in current attentional activations-based models: the approximation problem and the insufficient capacity problem of the...
conference paper 2021
document
Ahmad, T. (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
Background: Recently, many new deep learning–based variant-calling methods like DeepVariant have emerged as more accurate than conventional variant-calling algorithms such as GATK HaplotypeCaller, Strelka2, and Freebayes, albeit at higher computational costs. Therefore, there is a need for more scalable and higher performance workflows of...
review 2021
document
Hoozemans, J.J. (author), Peltenburg, J.W. (author), Nonnenmacher, Fabian (author), Hadnagy, A. (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
The big data revolution has ushered in an era with ever increasing volumes and complexity of data requiring ever faster computational analysis. During this very same era, CPU performance growth has been stagnating, pushing the industry to either scale their computation horizontally using multiple nodes in datacenters, or to scale vertically using...
journal article 2021
document
Peltenburg, J.W. (author), van Straten, J. (author), Brobbel, M. (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
As big data analytics systems are squeezing out the last bits of performance of CPUs and GPUs, the next near-term and widely available alternative that industry is considering for higher performance in the data center and cloud is the FPGA accelerator. We discuss several challenges a developer has to face when designing and integrating FPGA...
journal article 2021
document
Zhu, B. (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
Binary Convolutional Neural Networks (CNNs) have significantly reduced the number of arithmetic operations and the size of memory storage needed for CNNs, which makes their deployment on mobile and embedded systems more feasible. However, after binarization, the CNN architecture has to be redesigned and refined significantly due to two reasons:...
conference paper 2020
document
Peltenburg, J.W. (author), Brobbel, M. (author), van Straten, J. (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
Streaming dataflow designs describe hardware by connecting components through streams that transport data structures. We introduce a stream-oriented specification and type system that provides a clear and intuitive way to map complex, dynamically-sized data structures onto hardware streams. This helps designers to lift the abstraction of...
journal article 2020
document
Fang, J. (author), Chen, Jianyu (author), Lee, Jinho (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
To best leverage high-bandwidth storage and network technologies requires an improvement in the speed at which we can decompress data. We present a “refine and recycle” method applicable to LZ77-type decompressors that enables efficient high-bandwidth designs and present an implementation in reconfigurable logic. The method refines the write...
journal article 2020
document
Ahmad, T. (author), Ahmed, N. (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
Background: Immense improvements in sequencing technologies enable producing large amounts of high-throughput and cost-effective next-generation sequencing (NGS) data. This data needs to be processed efficiently for further downstream analyses. Computing systems need these large amounts of data closer to the processor (with low latency) for...
journal article 2020
document
Zhu, B. (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
High-level feature maps of Convolutional Neural Networks are computed by reusing their corresponding low-level feature maps, which brings into full play feature reuse to improve the computational efficiency. This form of feature reuse is referred to as feature reuse between convolutional layers. The second type of feature reuse is referred to...
journal article 2020
document
Peltenburg, J.W. (author), van Straten, J. (author), Wijtemans, L. (author), Van Leeuwen, Lars (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
Modern big data systems are highly heterogeneous. The components found in their many layers of abstraction are often implemented in a wide variety of programming languages and frameworks. Due to language implementation differences, interfaces between these components, including hardware accelerated components, are often burdened by...
conference paper 2019
document
van Dam, Laurens (author), Peltenburg, J.W. (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
The newly proposed posit number format uses a significantly different approach to represent floating point numbers. This paper introduces a framework for posit arithmetic in reconfigurable logic that maintains full precision in intermediate results. We present the design and implementation of an L1 BLAS arithmetic accelerator on posit vectors...
conference paper 2019
document
Peltenburg, J.W. (author), van Straten, J. (author), Brobbel, M. (author), Hofstee, H.P. (author), Al-Ars, Z. (author)
As a columnar in-memory format, Apache Arrow has seen increased interest from the data analytics community. Fletcher is a framework that generates hardware interfaces based on this format, to be used in FPGA accelerators. This allows efficient integration of FPGA accelerators with various high-level software languages, while providing an easy-to...
conference paper 2019
document
Fang, J. (author), Mulder, Yvo T.B. (author), Hidders, Jan (author), Lee, Jinho (author), Hofstee, H.P. (author)
While FPGAs have seen prior use in database systems, in recent years interest in using FPGAs to accelerate databases has declined in both industry and academia for the following three reasons. First, specifically for in-memory databases, FPGAs integrated with conventional I/O provide insufficient bandwidth, limiting performance. Second, GPUs,...
journal article 2019