Towards Understanding of Deep Learning in Profiled Side-Channel Analysis

Similarity of predictors measured and explained

Master thesis (2019)

Authors

D. van der Valk Electrical Engineering, Mathematics and Computer Science

Contributors

S. Picek Cyber Security - (mentor)

P.H. Hartel Cyber Security - (graduation committee member)

Julián Urbano (graduation committee member)

Lejla Batina Radboud Universiteit Nijmegen (graduation committee member)

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use:

http://resolver.tudelft.nl/uuid:b2ccf849-e6f7-452a-a2d5-9ff63f85efe7

Published Date

22-08-2019

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Side-channel attacks (SCA) aim to extract a secret cryptographic key from a device, based on unintended leakage. Profiled attacks are the most powerful SCAs, as they assume the attacker has a perfect copy of the target device under his control. In recent years, machine learning (ML) and deep learning (DL) techniques have become popular as profiling tools in SCA. Still, there are many settings in which their performance is far from what is expected. In such cases, it is very important to understand the difficulty of the problem and the behaviour of the learning algorithm. To that end, one needs not only to investigate the performance of machine learning but also to provide insights into its explainability.

In this work, we look at various ways to explain the behaviour of ML and DL techniques. We study the bias-variance decomposition, in which the predictive error in various scenarios is split into bias, variance, and noise. While the results shed some light on the underlying difficulty of the problem, existing decompositions are not tuned for SCA. We propose the Guessing Entropy (GE) bias-variance decomposition, incorporating the domain-specific GE metric in a tool to analyse attack characteristics. Additionally, we show the relation between the mean squared error and guessing entropy. Our experiments show that this decomposition is a useful tool for reasoning about trade-offs such as the choice of model complexity.
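
As an illustration of the classical decomposition mentioned above (not the GE decomposition proposed in the thesis, which is not reproduced here), the following minimal sketch estimates squared bias, variance, and noise for the mean squared error. The sine target, the polynomial models, and every function and parameter name are assumptions chosen for the example and do not come from the thesis.

```python
import numpy as np

def bias_variance_mse(degree, n_models=200, n_train=30, noise_std=0.3, seed=0):
    """Estimate the squared bias, variance and noise of polynomial regression.

    Toy setting (an assumption for illustration): the target is sin(x) on
    [0, pi], observed with Gaussian noise of standard deviation noise_std.
    """
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.0, np.pi, 50)
    f_test = np.sin(x_test)  # noise-free target at the test points

    # Train one model per freshly drawn training set and record its predictions.
    preds = np.empty((n_models, x_test.size))
    for m in range(n_models):
        x_train = rng.uniform(0.0, np.pi, n_train)
        y_train = np.sin(x_train) + rng.normal(0.0, noise_std, n_train)
        coeffs = np.polyfit(x_train, y_train, degree)
        preds[m] = np.polyval(coeffs, x_test)

    mean_pred = preds.mean(axis=0)
    bias_sq = float(np.mean((mean_pred - f_test) ** 2))  # systematic error
    variance = float(np.mean(preds.var(axis=0)))         # spread across training sets
    noise = noise_std ** 2                               # irreducible error
    return bias_sq, variance, noise

if __name__ == "__main__":
    for degree in (0, 1, 3, 5):
        b, v, n = bias_variance_mse(degree)
        print(f"degree={degree}: bias^2={b:.4f} variance={v:.4f} noise={n:.4f}")
```

In this toy setting, a more complex model typically trades lower bias for higher variance, which is the kind of trade-off the decomposition is meant to expose.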

To dive deeper into the inner representations of neural networks (NNs), we use Singular Vector Canonical Correlation Analysis (SVCCA) to compare models used in SCA. We find that different datasets, or even leakage models, are represented very differently by neural networks. We apply SVCCA to a recent portability study, which shows one should be careful not to overtrain their networks with too much data.
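
The general idea of SVCCA can be sketched as follows: each layer's activation matrix is first reduced with an SVD to the directions that explain most of its variance, and canonical correlations are then computed between the two reduced subspaces. This is a simplified sketch, not the thesis implementation; the function name `svcca`, the `keep` threshold, and the choice to report the mean correlation are assumptions made for the example.

```python
import numpy as np

def svcca(acts1, acts2, keep=0.99):
    """Mean canonical correlation between two activation matrices.

    acts1, acts2: arrays of shape (n_samples, n_neurons), recorded on the
    same inputs. keep is the fraction of variance retained by the SVD step.
    """
    def top_subspace(acts):
        centered = acts - acts.mean(axis=0)
        u, s, _ = np.linalg.svd(centered, full_matrices=False)
        explained = np.cumsum(s ** 2) / np.sum(s ** 2)
        k = int(np.searchsorted(explained, keep)) + 1
        # Columns of u are orthonormal, so they form a basis of the kept subspace.
        return u[:, :k]

    basis1 = top_subspace(acts1)
    basis2 = top_subspace(acts2)
    # Canonical correlations are the singular values of the product of the
    # two orthonormal bases (the cosines of the principal angles).
    rho = np.linalg.svd(basis1.T @ basis2, compute_uv=False)
    return float(np.mean(np.clip(rho, 0.0, 1.0)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=(2000, 10))
    rotation, _ = np.linalg.qr(rng.normal(size=(10, 10)))
    print("rotated copy:", svcca(a, a @ rotation))
    print("independent :", svcca(a, rng.normal(size=(2000, 10))))
```

A score close to 1 indicates that the two layers span nearly the same subspace on the given inputs, while unrelated representations score much lower.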

Finally, do we even need complicated neural networks to conduct an efficient attack? We demonstrate that a small network can perform much better by mimicking the outputs of a large network, compared to learning from the original dataset.
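
One common way to let a small network mimic a large one is to train it on the large network's softened output distribution rather than on the original labels. The sketch below shows only such a loss function. Whether the thesis uses this exact formulation is not stated in the abstract, so the temperature-scaled cross-entropy, the names `softmax` and `mimic_loss`, and the 256-class example are all assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax with temperature, computed in a numerically stable way."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mimic_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = np.log(softmax(student_logits, temperature) + 1e-12)
    return float(-np.mean(np.sum(p_teacher * log_p_student, axis=1)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical batch of 8 traces with 256 classes (one per byte value).
    teacher = rng.normal(size=(8, 256))
    student = rng.normal(size=(8, 256))
    print("untrained student:", mimic_loss(student, teacher))
    print("perfect mimic    :", mimic_loss(teacher, teacher))
```

The loss is smallest when the student reproduces the teacher's output distribution, so minimising it pushes the small network towards the large network's behaviour.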

Files

TowardUnderstandingofDeepLearn... (.pdf)

(.pdf | 27.6 Mb)