Towards Understanding of Deep Learning in Profiled Side-Channel Analysis

Similarity of predictors measured and explained

Master Thesis (2019)
Author(s)

D. van der Valk (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Stjepan Picek – Mentor (TU Delft - Cyber Security)

P.H. Hartel – Graduation committee member (TU Delft - Cyber Security)

Julián Urbano – Graduation committee member (TU Delft - Multimedia Computing)

Lejla Batina – Graduation committee member (Radboud Universiteit Nijmegen)

Faculty
Electrical Engineering, Mathematics and Computer Science
Copyright
© 2019 Daan van der Valk
Publication Year
2019
Language
English
Graduation Date
22-08-2019
Awarding Institution
Delft University of Technology
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Side-channel attacks (SCA) aim to extract a secret cryptographic key from a device, based on unintended leakage. Profiled attacks are the most powerful SCAs, as they assume the attacker has a perfect copy of the target device under their control. In recent years, machine learning (ML) and deep learning (DL) techniques have become popular as profiling tools in SCA. Still, there are many settings in which their performance falls short of expectations. In such cases, it is very important to understand the difficulty of the problem and the behaviour of the learning algorithm. To that end, one needs not only to investigate the performance of machine learning but also to provide insights into its explainability.

In this work, we look at various ways to explain the behaviour of ML and DL techniques. We study the bias-variance decomposition, where the predictive error in various scenarios is split into bias, variance, and noise. While the results shed some light on the underlying difficulty of the problem, existing decompositions are not tuned for SCA. We propose the Guessing Entropy (GE) bias-variance decomposition, incorporating the domain-specific GE metric in a tool to analyse attack characteristics. Additionally, we show the relation between the mean squared error and guessing entropy. Our experiments show this decomposition is a useful tool for analysing trade-offs such as model complexity.
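For readers unfamiliar with the decomposition, a minimal numpy sketch of the textbook squared-error version is given below: it estimates the bias and variance terms from an ensemble of predictors trained on independently resampled profiling sets. The function name, the array shapes, and the omission of a separate noise estimate are illustrative assumptions; this is not the GE-specific decomposition proposed in the thesis.

    import numpy as np

    def bias_variance_decomposition(predictions, y_true):
        """Textbook squared-error decomposition over an ensemble of predictors.

        predictions: array of shape (n_models, n_samples), predictions of
                     models trained on independently resampled profiling sets.
        y_true:      array of shape (n_samples,), the target values.

        Returns (error, bias_sq, variance); the noise term cannot be separated
        here without access to the true conditional mean of the targets.
        """
        mean_pred = predictions.mean(axis=0)              # main prediction per sample
        bias_sq = np.mean((mean_pred - y_true) ** 2)      # systematic error
        variance = np.mean(predictions.var(axis=0))       # sensitivity to the training set
        error = np.mean((predictions - y_true) ** 2)      # expected squared loss
        return error, bias_sq, variance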

To dive deeper into the inner representations of neural networks (NNs), we use Singular Vector Canonical Correlation Analysis (SVCCA) to compare models used in SCA. We find that different datasets, or even different leakage models, are represented very differently by neural networks. We apply SVCCA to a recent portability study, which shows that one should be careful not to overtrain their networks with too much data.
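As an illustration of the comparison technique, below is a minimal numpy sketch of SVCCA as introduced by Raghu et al.: each layer's activation matrix is reduced with an SVD that keeps the directions explaining most of the variance, and the reduced subspaces are then compared with canonical correlation analysis. The function name, the 99% variance threshold, and the QR-based CCA step are illustrative assumptions, not the exact implementation used in the thesis.

    import numpy as np

    def svcca_similarity(acts1, acts2, var_kept=0.99):
        """SVCCA similarity between two sets of layer activations.

        acts1, acts2: arrays of shape (n_neurons, n_examples), activations of
                      a layer recorded over the same inputs (e.g. SCA traces).
        Returns the mean canonical correlation of the SVD-reduced subspaces.
        """
        def svd_reduce(acts):
            acts = acts - acts.mean(axis=1, keepdims=True)          # centre each neuron
            u, s, vt = np.linalg.svd(acts, full_matrices=False)
            k = np.searchsorted(np.cumsum(s**2) / np.sum(s**2), var_kept) + 1
            return s[:k, None] * vt[:k]                             # top-k directions x examples

        x, y = svd_reduce(acts1), svd_reduce(acts2)
        qx, _ = np.linalg.qr(x.T)                                   # orthonormal bases over examples
        qy, _ = np.linalg.qr(y.T)
        rho = np.linalg.svd(qx.T @ qy, compute_uv=False)            # canonical correlations
        return float(np.mean(np.clip(rho, 0.0, 1.0)))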

Finally, do we even need complicated neural networks to conduct an efficient attack? We demonstrate that a small network performs much better when it mimics the outputs of a large network than when it learns from the original dataset directly.
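The abstract does not spell out how the small network mimics the large one; a common formulation is knowledge distillation in the style of Hinton et al., sketched below in PyTorch under that assumption. The function name and the temperature and alpha hyperparameters are illustrative, not values taken from the thesis.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, targets,
                          temperature=4.0, alpha=0.5):
        """Distillation loss: the small (student) network mimics the softened
        output distribution of the large (teacher) network, mixed with the
        usual cross-entropy on the true key-dependent labels."""
        soft_teacher = F.softmax(teacher_logits / temperature, dim=1)
        log_soft_student = F.log_softmax(student_logits / temperature, dim=1)
        # The KL term is scaled by T^2 so its gradient magnitude matches the
        # hard-label term.
        soft_loss = F.kl_div(log_soft_student, soft_teacher,
                             reduction="batchmean") * temperature ** 2
        hard_loss = F.cross_entropy(student_logits, targets)
        return alpha * soft_loss + (1.0 - alpha) * hard_loss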
