Parallel Matrix Multiplication on Memristor-Based Computation-in-Memory Architecture

Conference Paper (2016)
Author(s)

Adib Haron (TU Delft - Computer Engineering)

Jintao Yu (TU Delft - Computer Engineering)

R. Nane (TU Delft - Computer Engineering)

Mottaqiallah Taouil (TU Delft - Computer Engineering)

Said Hamdioui (TU Delft - Computer Engineering)

Koen L.M. Bertels (TU Delft - Quantum & Computer Engineering, TU Delft - FTQC/Bertels Lab)

Research Group
Computer Engineering
Copyright
© 2016 M.A.B. Haron, J. Yu, R. Nane, M. Taouil, S. Hamdioui, K.L.M. Bertels
DOI related publication
https://doi.org/10.1109/HPCSim.2016.7568411
More Info
Publication Year
2016
Language
English
Pages (from-to)
759-766
ISBN (print)
978-1-5090-2088-1
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

One of the most important constraints of today’s architectures for data-intensive applications is the limited bandwidth due to the memory-processor communication bottleneck. This significantly impacts performance and energy. For instance, communication and memory access may account for more than 80% of the energy consumption. Recently, the concept of Computation-in-Memory (CIM) was proposed, which is based on the integration of storage and computation in the same physical location using a crossbar topology and non-volatile resistive-switching memristor technology. To illustrate the tremendous potential of the CIM architecture in exploiting massively parallel computation while reducing the communication overhead, we present a communication-efficient mapping of a large-scale matrix multiplication algorithm on the CIM architecture. The experimental results show that, depending on the matrix size, the CIM architecture achieves several orders of magnitude higher performance in total execution time and two orders of magnitude lower total energy consumption than a multicore architecture based on shared memory.

Files

10408497.pdf
(pdf | 0.95 Mb)
License info not available