A side-channel attack analyzes unintended physical leakage to mount a more effective attack on a cryptographic key. An attacker performs a profiled attack when he has an identical physical copy of the target device that is fully under his control. For this reason, profiled attacks are regarded as the most powerful attacks in side-channel analysis. The physical leakage is analyzed with machine learning and, in recent years, mostly with deep learning, both of which serve as profiling tools for a side-channel attack. The best-known deep learning technique for side-channel analysis at this moment is the convolutional neural network (CNN). However, this thesis investigates a well-known class of deep learning models that has not been used before in side-channel analysis. The deep learning models RNN, LSTM, and GRU are tested and evaluated to find the best hyperparameters. We show the influence of different models, number of layers, dropout, activation function, number of units, recurrent dropout, and batch size in the experiments. We also show that a shorter sequence length speeds up training. To reduce the sequence length, we use a linear regression technique. We then show that sequential data models are a suitable alternative for side-channel analysis; however, their results do not surpass those of CNNs. Next, we experiment with an autoencoder as a preprocessing step to "clean" noisy traces. We show that the LSTM autoencoder easily removes a noise-based hiding countermeasure. A delay-based hiding countermeasure, however, is more challenging for the LSTM autoencoder, and a combination of both countermeasures seems impossible for it to remove. The cleaning performance is also reflected in the guessing entropy. Lastly, we use an embedding layer as the first layer of an MLP, a CNN, and a sequential data model in side-channel analysis. 
We experiment with different output dimensions and conclude that an embedding layer is a valid alternative for changing the data dimension when using an MLP or a sequential data model.
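
For readers unfamiliar with the guessing entropy metric mentioned above, it is commonly computed as the average rank of the correct key over several independent attacks. The following is a minimal sketch under that assumption, not the evaluation code of this thesis. The function names `key_rank` and `guessing_entropy`, the four-candidate toy data, and the assumption that model outputs are already mapped to per-key probabilities are all hypothetical and chosen for illustration only.

```python
import math


def key_rank(scores, correct_key):
    """Rank of the correct key (0 means it is the best guess)."""
    correct_score = scores[correct_key]
    # Count the candidates that score strictly higher than the correct key.
    return sum(1 for score in scores if score > correct_score)


def guessing_entropy(experiments, correct_key, num_keys=256):
    """Average rank of the correct key over several independent attacks.

    `experiments` is a list of attacks. Each attack is a list of per-trace
    probability vectors with one probability per key candidate (assumed to
    be model outputs already mapped from labels to key guesses).
    """
    ranks = []
    for per_trace_probs in experiments:
        scores = [0.0] * num_keys
        for probs in per_trace_probs:
            for k in range(num_keys):
                # Accumulate log-likelihoods; the small constant avoids log(0).
                scores[k] += math.log(probs[k] + 1e-36)
        ranks.append(key_rank(scores, correct_key))
    return sum(ranks) / len(ranks)


# Toy data (hypothetical): two attacks, two traces each, four key candidates.
toy_experiments = [
    [[0.1, 0.6, 0.2, 0.1], [0.2, 0.5, 0.2, 0.1]],  # correct key ranked first
    [[0.4, 0.3, 0.2, 0.1], [0.4, 0.3, 0.2, 0.1]],  # correct key ranked second
]
toy_ge = guessing_entropy(toy_experiments, correct_key=1, num_keys=4)
print(toy_ge)
```

A lower value means the attack needs fewer key guesses on average; a value of 0 means the correct key is always the top-ranked candidate.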