Maximizing the Potential of Custom RISC-V Vector Extensions for Speeding up SHA-3 Hash Functions

Conference Paper (2023)
Author(s)

H. Li (TU Delft - Cyber Security)

Nele Mentens (Katholieke Universiteit Leuven, Universiteit Leiden)

Stjepan Picek (Radboud Universiteit Nijmegen, TU Delft - Cyber Security)

Research Group
Cyber Security
Copyright
© 2023 H. Li, Nele Mentens, S. Picek
DOI related publication
https://doi.org/10.23919/DATE56975.2023.10137009
Publication Year
2023
Language
English
Bibliographical Note
Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
Pages (from-to)
1-6
ISBN (print)
979-8-3503-9624-9
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

SHA-3 is considered to be one of the most secure standardized hash functions. It relies on the Keccak-f[1 600] permutation, which operates on an internal state of 1 600 bits, usually represented as a 5 x 5 x 64-bit matrix. While existing implementations process the state sequentially in chunks of typically 32 or 64 bits, the Keccak-f[1 600] permutation can be sped up considerably through parallelization. This paper is the first to explore the full potential of parallelizing Keccak-f[1 600] in RISC-V based processors through custom vector extensions on 32-bit and 64-bit architectures. We analyze the Keccak-f[1 600] permutation, which is composed of five different step mappings, and propose ten custom vector instructions to speed up the computation. We realize these extensions in a SIMD processor described in SystemVerilog. We compare the performance of our designs to existing architectures based on vectorized application-specific instruction set processors (ASIPs). We show that our designs outperform all related work in throughput due to our carefully selected custom vector instructions.
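
For readers unfamiliar with the permutation, the sketch below shows a plain scalar Keccak-f[1 600] in Python, following the public FIPS 202 definition of the five step mappings (theta, rho, pi, chi, iota) on a 5 x 5 array of 64-bit lanes. It is not the paper's vectorized design and does not model any of the ten proposed custom instructions; the function names `rotl`, `keccak_f1600`, and `sha3_256` are illustrative choices made here. The small SHA3-256 sponge wrapper is included only so that the permutation can be checked against Python's `hashlib`.

```python
import hashlib

MASK64 = (1 << 64) - 1


def rotl(lane, n):
    """Rotate a 64-bit lane left by n bits."""
    n %= 64
    return ((lane << n) | (lane >> (64 - n))) & MASK64


def keccak_f1600(A):
    """Apply 24 rounds of Keccak-f[1600] to a 5x5 list of 64-bit lanes A[x][y]."""
    lfsr = 1  # LFSR state used to generate the iota round constants
    for _ in range(24):
        # theta: XOR each lane with the parities of two neighbouring columns
        C = [A[x][0] ^ A[x][1] ^ A[x][2] ^ A[x][3] ^ A[x][4] for x in range(5)]
        D = [C[(x - 1) % 5] ^ rotl(C[(x + 1) % 5], 1) for x in range(5)]
        A = [[A[x][y] ^ D[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to a new (x, y) position
        x, y = 1, 0
        current = A[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, A[x][y] = A[x][y], rotl(current, (t + 1) * (t + 2) // 2)
        # chi: the only non-linear step, applied independently to each row
        A = [[A[x][y] ^ (~A[(x + 1) % 5][y] & A[(x + 2) % 5][y])
              for y in range(5)] for x in range(5)]
        # iota: XOR a round constant into lane (0, 0)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                A[0][0] ^= 1 << ((1 << j) - 1)
    return A


def sha3_256(msg):
    """SHA3-256 built on keccak_f1600 (rate 136 bytes, domain suffix 0x06)."""
    rate = 136
    padded = bytearray(msg) + b"\x06"
    padded += b"\x00" * (-len(padded) % rate)
    padded[-1] |= 0x80
    A = [[0] * 5 for _ in range(5)]
    for off in range(0, len(padded), rate):
        block = padded[off:off + rate]
        for i in range(rate // 8):  # lane i sits at x = i % 5, y = i // 5
            A[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        A = keccak_f1600(A)
    return b"".join(A[i % 5][i // 5].to_bytes(8, "little") for i in range(4))


# Self-check against the standard library implementation.
assert sha3_256(b"abc") == hashlib.sha3_256(b"abc").digest()
```

In this scalar form, theta and chi each touch all 25 lanes one at a time per round; the five lanes of a row or column are computed with the same operation, which is the kind of regularity a SIMD datapath can exploit.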

Files

Maximizing_the_Potential_of_Cu... (pdf)
(pdf | 0.731 Mb)
- Embargo expired on 02-12-2023
License info not available