Parallel Matrix Multiplication on Memristor-Based Computation-in-Memory Architecture

Abstract

One of the most important constraints of today's architectures for data-intensive applications is the limited bandwidth caused by the memory-processor communication bottleneck, which significantly impacts both performance and energy; for instance, the energy share of communication and memory access may exceed 80%. Recently, the concept of Computation-in-Memory (CIM) was proposed, which integrates storage and computation in the same physical location using a crossbar topology and non-volatile resistive-switching memristor technology. To illustrate the potential of the CIM architecture to exploit massively parallel computation while reducing communication overhead, we present a communication-efficient mapping of a large-scale matrix multiplication algorithm onto the CIM architecture. The experimental results show that, depending on the matrix size, the CIM architecture achieves several orders of magnitude improvement in total execution time and two orders of magnitude lower total energy consumption than a multicore architecture based on shared memory.
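The abstract does not spell out the mapping itself. As a rough, hedged illustration of the underlying idea only (a memristor crossbar producing a matrix-vector product in a single analog step via Ohm's and Kirchhoff's laws, with blocks of one operand programmed as conductances and partial results accumulated across tiles), the Python sketch below models the data flow numerically. The tile size, function names, and digital accumulation step are assumptions for illustration, not the paper's actual scheme.

```python
# Minimal sketch (not from the paper): an idealized model of a memristor
# crossbar evaluating a matrix-vector product in one analog step, tiled to
# form a full matrix product. Names (TILE, crossbar_mvm, cim_matmul) are
# illustrative assumptions.

import numpy as np

TILE = 4  # assumed crossbar dimension (rows x columns of memristor cells)

def crossbar_mvm(G, v):
    """Idealized crossbar: cell conductances G (TILE x TILE) driven by input
    voltages v (TILE) yield column currents i = G^T v (Ohm's law per cell,
    Kirchhoff's current law per column). Modeled here as a dot product."""
    return G.T @ v

def cim_matmul(A, B):
    """Compute C = A @ B by programming TILE x TILE blocks of B as crossbar
    conductances and streaming row segments of A as input voltages; per-tile
    partial results are accumulated. Assumes dimensions are multiples of TILE."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2 and n % TILE == 0 and k % TILE == 0 and m % TILE == 0
    C = np.zeros((n, m))
    for bi in range(0, k, TILE):           # tiles along the shared dimension
        for bj in range(0, m, TILE):       # tiles along B's columns
            G = B[bi:bi+TILE, bj:bj+TILE]  # block programmed as conductances
            for r in range(n):             # each row segment of A drives the crossbar
                C[r, bj:bj+TILE] += crossbar_mvm(G, A[r, bi:bi+TILE])
    return C

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.random((8, 8))
    B = rng.random((8, 8))
    assert np.allclose(cim_matmul(A, B), A @ B)
    print("tiled crossbar-style matmul matches A @ B")
```

In this toy model every crossbar tile could, in principle, operate concurrently and each analog evaluation replaces TILE x TILE multiply-accumulate operations without moving the programmed operand out of memory, which is the kind of parallelism and communication saving the abstract refers to.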
