Z. Al-Ars

118 records found

While modern HDLs such as Chisel (Constructing Hardware In a Scala Embedded Language) significantly improve the process of design entry, debugging these designs is often problematic, because the tools that aid debugging operate on translated code rather than the original HDL. Fur ...

GSST

Parallel string decompression at 191 GB/s on GPU

Most of the commonly used compression standards make use of some form of the LZ algorithm. Decompressing this type of data is not a good match for the Single-Instruction, Multiple-Thread (SIMT) model of computation used by GPUs, resulting in low throughput and poor utilization of ...
This paper introduces SENMap, a mapping and synthesis tool for scalable, energy-efficient neuromorphic computing architecture frameworks. SENECA is a flexible architectural design optimized for executing edge AI SNN/ANN inference applications efficiently. To speed up the silicon ta ...
Synthetic image generation involves the creation of artificially generated images that are indistinguishable from real ones. Conventional simulation-based image synthesis approaches suffer from intensive computational and memory throughput demands associated with physically accur ...
Rapid technological advancements in sequencing technologies allow producing cost effective and high volume sequencing data. Processing this data for real-time clinical diagnosis is potentially time-consuming if done on a single computing node. This work presents a complete varian ...

SENSIM

An Event-driven Parallel Simulator for Multi-core Neuromorphic Systems

In this paper, we present SENSIM, which is an open-source simulator designed specifically for the SENECA neuromorphic processor. This simulator is unique in that it combines features from both hardware-specific and hardware-agnostic spiking neural network simulators, resulting in ...

Hardware-Accelerator Design by Composition

Dataflow Component Interfaces with Tydi-Chisel

As dedicated hardware is becoming more prevalent in accelerating complex applications, methods are needed to enable easy integration of multiple hardware components into a single accelerator system. However, this vision of composable hardware is hindered by the lack of standards ...

Beyond quantum Shannon decomposition

Circuit construction for n-qubit gates based on block-ZXZ decomposition

This paper proposes an optimized quantum block-ZXZ decomposition method that results in more optimal quantum circuits than the quantum Shannon decomposition, which was presented in 2005 by M. Möttönen and J. J. Vartiainen [in Trends in quantum computing research, edited by S. Sh ...
This paper introduces TINA, a novel framework for implementing non Neural Network (NN) signal processing algorithms on NN accelerators such as GPUs, TPUs or FPGAs. The key to this approach is the concept of mapping mathematical and logic functions as a series of convolutional and ...
High accuracy nanopore basecalling uses large deep neural networks, requiring powerful GPUs, which is undesirable for sequencing experiments outside the lab. Research has shown that this can be circumvented by using smaller models to increase efficiency as well as basecalling spe ...
In this paper, we present a fully pipelined and semi-parallel channel convolutional neural network hardware accelerator structure. This structure can trade off the compute time and the hardware utilization, allowing the accelerator to be layer pipelined without the need for fully ...
Convolutional neural networks (CNNs) are effective in many application domains, especially in the computer vision area. To achieve lower-latency CNN processing and reduce power consumption, developers are experimenting with using FPGAs to accelerate CNN processing ...
In spite of progress on hardware design languages, the design of high-performance hardware accelerators forces many design decisions specializing the interfaces of these accelerators in ways that complicate the understanding of the design and hinder modularity and collaboration. ...
Tydi is an open specification for streaming dataflow designs in digital circuits, allowing designers to express how composite and variable-length data structures are transferred over streams using clear, data-centric types. These data types are extensively used in many applicat ...

QKSA

Quantum Knowledge Seeking Agent

In this research, we extend the universal reinforcement learning agent models of artificial general intelligence to quantum environments. The utility function of a classical exploratory stochastic Knowledge Seeking Agent, KL-KSA, is generalized to distance measures from quantum i ...
Motion prediction is a key factor towards the full deployment of autonomous vehicles. It is fundamental in order to assure safety while navigating through highly interactive complex scenarios. In this work, the framework IAMP (Interaction-Aware Motion Prediction), producing multi ...
Background

Non-Invasive Prenatal Testing is often performed by utilizing read coverage-based profiles obtained from shallow whole genome sequencing to detect fetal copy number variations. Such screening typically operates on a discretized binned representation of the gen ...
Many researchers have proposed replacing the aggregation server in federated learning with a blockchain system to improve privacy, robustness, and scalability. In this approach, clients would upload their updated models to the blockchain ledger and use a smart contract to perform ...