Searched for: +
(1 - 20 of 30)

document
Procaccini, Marco (author), Sahebi, Amin (author), Barbone, Marco (author), Luk, Wayne (author), Gaydadjiev, G. (author), Giorgi, Roberto (author)
Processing graphs on a large scale presents a range of difficulties, including irregular memory access patterns, device memory limitations, and the need for effective partitioning in distributed systems, all of which can lead to performance problems on traditional architectures such as CPUs and GPUs. To address these challenges, recent...
conference paper 2024
document
Miedema, Rene (author), Strydis, C. (author)
Introduction: In-silico simulations are a powerful tool in modern neuroscience for enhancing our understanding of complex brain systems at various physiological levels. To model biologically realistic and detailed systems, an ideal simulation platform must possess: (1) high performance and performance scalability, (2) flexibility, and (3) ease of...
journal article 2024
document
Sahebi, Amin (author), Barbone, Marco (author), Procaccini, Marco (author), Luk, Wayne (author), Gaydadjiev, G. (author), Giorgi, Roberto (author)
Processing large-scale graphs is challenging due to the nature of the computation that causes irregular memory access patterns. Managing such irregular accesses may cause significant performance degradation on both CPUs and GPUs. Thus, recent research trends propose graph processing acceleration with Field-Programmable Gate Arrays (FPGA)....
journal article 2023
document
Jiang, Longxing (author), Aledo Ortega, D. (author), van Leuken, T.G.R.M. (author)
Logarithmic quantization for Convolutional Neural Networks (CNN): a) fits well typical weights and activation distributions, and b) allows the replacement of the multiplication operation by a shift operation that can be implemented with fewer hardware resources. We propose a new quantization method named Jumping Log Quantization (JLQ). The key...
conference paper 2023
document
Kalali, E. (author), van Leuken, T.G.R.M. (author)
DSP blocks are one of the efficient solutions to implement multiply-accumulate (MAC) operations on FPGAs. However, since the DSP blocks have wide multiplier and adder blocks, MAC operations using low bit-length parameters lead to underutilization. Hence, an efficient approximation technique is introduced. The technique includes...
journal article 2022
document
Hogervorst, T.A. (author), Nane, R. (author), Marchiori, Giacomo (author), Qiu, Tong Dong (author), Blatt, Markus (author), Rustad, Alf Birger (author)
Scientific computing is at the core of many High-Performance Computing applications, including computational flow dynamics. Because of the utmost importance to simulate increasingly larger computational models, hardware acceleration is receiving increased attention due to its potential to maximize the performance of scientific computing....
journal article 2022
document
Peltenburg, J.W. (author), van Straten, J. (author), Brobbel, M. (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
As big data analytics systems are squeezing out the last bits of performance of CPUs and GPUs, the next near-term and widely available alternative industry is considering for higher performance in the data center and cloud is the FPGA accelerator. We discuss several challenges a developer has to face when designing and integrating FPGA...
journal article 2021
document
Hoozemans, J.J. (author), Tervo, Kati (author), Jaaskelainen, Pekka (author), Al-Ars, Z. (author)
Many applications make extensive use of various forms of compression techniques for storing and communicating data. As decompression is highly regular and repetitive, it is a suitable candidate for acceleration. Examples are offloading (de)compression to a dedicated circuit on a heterogeneous System-on-Chip, or attaching FPGAs or ASICs...
conference paper 2021
document
Peltenburg, J.W. (author), Van Leeuwen, Lars T.J. (author), Hoozemans, J.J. (author), Fang, J. (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
In the domain of big data analytics, the bottleneck of converting storage-focused file formats to in-memory data structures has shifted from the bandwidth of storage to the performance of decoding and decompression software. Two widely used formats for big data storage and in-memory data are Apache Parquet and Apache Arrow, respectively. In...
conference paper 2021
document
Peltenburg, J.W. (author), Hadnagy, A. (author), Brobbel, M. (author), Morrow, Robert (author), Al-Ars, Z. (author)
JSON is a popular data interchange format for many web, cloud, and IoT systems due to its simplicity, human readability, and widespread support. However, applications must first parse and convert the data to a native in-memory format before being able to perform useful computations. Many big data applications with high performance requirements...
conference paper 2021
document
Chen, Jianyu (author), Daverveldt, Maurice (author), Al-Ars, Z. (author)
With the continued increase in the amount of big data generated and stored in various application domains, such as high-frequency trading, compression techniques are becoming ever more important to reduce the requirements on communication bandwidth and storage capacity. Zstandard (Zstd) is emerging as an important compression algorithm for big...
conference paper 2021
document
Peltenburg, J.W. (author)
Because of fundamental limitations of CMOS technology, computing researchers and the computing industry are focusing on using transistors in integrated circuits more efficiently towards obtaining a computational goal. At the architectural level, this has led to an era of heterogeneous computing, where various types of computational components...
doctoral thesis 2020
document
Castro do Amaral, G. (author), Calliari, Felipe (author), Lunglmayr, Michael (author)
Trend break detection is a fundamental problem that materializes in many areas of applied science, where being able to identify correctly, and in a timely manner, trend breaks in a noisy signal plays a central role in the success of the application. The linearized Bregman iterations algorithm is one of the methodologies that can solve such a...
journal article 2020
document
Fang, J. (author), Chen, Jianyu (author), Lee, Jinho (author), Al-Ars, Z. (author), Hofstee, H.P. (author)
To best leverage high-bandwidth storage and network technologies requires an improvement in the speed at which we can decompress data. We present a “refine and recycle” method applicable to LZ77-type decompressors that enables efficient high-bandwidth designs and present an implementation in reconfigurable logic. The method refines the write...
journal article 2020
document
Calliari, Felipe (author), Castro do Amaral, G. (author), Lunglmayr, Michael (author)
Detection of level shifts in a noisy signal, or trend break detection, is a problem that appears in several research fields, from biophysics to optics and economics. Although many algorithms have been developed to deal with such a problem, accurate and low-complexity trend break detection is still an active topic of research. The Linearized...
journal article 2020
document
Fang, J. (author)
Though field-programmable gate arrays (FPGAs) have been used to accelerate database systems, they have not been widely adopted for the following reasons. As databases have transitioned to higher bandwidth technology such as in-memory and NVMe, the communication overhead associated with accelerators has become more of a burden. Also, FPGAs are...
doctoral thesis 2019
document
Houtgast, E.J. (author)
Developments in sequencing technology have drastically reduced the cost of DNA sequencing. The raw sequencing data being generated requires processing through computationally demanding suites of bioinformatics algorithms called genomics pipelines. The greatly decreased cost of sequencing has resulted in its widespread adoption, and the amount of...
doctoral thesis 2019
document
Homulle, Harald (author)
Quantum computing promises an exponential speed-up of computation compared to what is nowadays achievable with classical computers. In this way, it enables the evaluation of more complex models and the breaching of current security algorithms. For the operation of a quantum system, many questions remain to be answered. Currently, there are...
doctoral thesis 2019
document
Peltenburg, J.W. (author), van Straten, J. (author), Brobbel, M. (author), Hofstee, H.P. (author), Al-Ars, Z. (author)
As a columnar in-memory format, Apache Arrow has seen increased interest from the data analytics community. Fletcher is a framework that generates hardware interfaces based on this format, to be used in FPGA accelerators. This allows efficient integration of FPGA accelerators with various high-level software languages, while providing an easy-to...
conference paper 2019
document
Hoozemans, J.J. (author), de Jong, Rob (author), van der Vlugt, Steven (author), van Straten, J. (author), Elango, Uttam Kumar (author), Al-Ars, Z. (author)
This paper presents and evaluates an approach to deploy image and video processing pipelines that are developed frame-oriented on a hardware platform that is stream-oriented, such as an FPGA. First, this calls for a specialized streaming memory hierarchy and accompanying software framework that transparently moves image segments between...
journal article 2019