A. Rezaeezade | TU Delft Repository

Jump, It Is Easy

JumpReLU Activation Function in Deep Learning-Based Side-Channel Analysis

Conference paper (2026) - Abraham Basurto-Becerra, Azade Rezaeezade, Stjepan Picek

Deep learning-based side-channel analysis has become a popular and powerful option for side-channel attacks in recent years. One of the main directions that the side-channel community explores is how to design efficient architectures that can break the targets with as few attack traces as possible, but also how to consistently build such architectures. In this work, we explore the usage of the JumpReLU activation function, which was designed to improve the robustness of neural networks. Intuitively speaking, improving the robustness seems a natural requirement for side-channel analysis, as hiding countermeasures could be considered adversarial attacks. In our experiments, we explore three strategies: 1) exchanging the activation functions with JumpReLU at the inference phase, 2) training common side-channel architectures with JumpReLU, and 3) conducting hyperparameter search with JumpReLU as the activation function. While the first two options do not yield improvements in results (but also do not show worse performance), the third option brings advantages, especially considering the number of neural networks that break the target. As such, we conclude that using JumpReLU is a good option to improve the stability of attack results. ...
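
As a rough illustration of the activation function discussed above, the sketch below uses the commonly cited form of JumpReLU, which passes an input through unchanged only when it exceeds a jump threshold and outputs zero otherwise. The threshold name kappa, its value of 0.5, and the sample inputs are assumptions made for illustration; the abstract does not state which thresholds the authors used.

```python
def relu(x: float) -> float:
    """Standard ReLU: pass positive inputs, zero out the rest."""
    return x if x > 0.0 else 0.0


def jump_relu(x: float, kappa: float = 0.5) -> float:
    """JumpReLU: pass x only when it exceeds the jump threshold kappa.

    kappa = 0.5 is an illustrative default, not a value from the paper.
    With kappa = 0 this reduces to the standard ReLU.
    """
    return x if x > kappa else 0.0


# Hypothetical pre-activation values of one layer.
pre_activations = [-1.0, 0.2, 0.5, 0.8]

relu_out = [relu(v) for v in pre_activations]
jump_out = [jump_relu(v, kappa=0.5) for v in pre_activations]

print(relu_out)  # [0.0, 0.2, 0.5, 0.8]
print(jump_out)  # [0.0, 0.0, 0.0, 0.8]
```

Small positive activations that ReLU would keep are suppressed by JumpReLU, which is the intuition behind using it against small perturbations of the input.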

Strive to Fail

Deep Learning-based Side-channel Analysis for Evaluators

Doctoral thesis (2026) - A. Rezaeezade, R.L. Lagendijk, Lejla Batina, S. Picek

Digital devices are now deeply embedded in modern life. These devices process sensitive information, including personal data, financial records, and data related to critical infrastructure. Cryptography is therefore a fundamental component of digital security, providing confidentiality, integrity, authentication, key exchange, and digital signatures.

Although cryptographic algorithms are designed to be mathematically secure, their physical implementations can introduce vulnerabilities. When cryptographic algorithms run on hardware, devices unintentionally leak information through side channels such as power consumption, electromagnetic radiation, and timing behavior. These leakages can be exploited through side-channel analysis to recover secret information, including cryptographic keys.

Security evaluation laboratories assess the resistance of cryptographic implementations against such attacks. However, this process is costly and must strike a balance between thoroughness and practical limitations on time, budget, data, and computational resources. Deep learning-based side-channel analysis (DL-SCA) is attractive in this context because neural networks can learn leakage characteristics directly from traces, reducing the need for manual preprocessing and explicit statistical assumptions. At the same time, deep learning introduces new costs, caused by sensitivity to neural network hyperparameter selection, instability, and overfitting in its training process.

The central problem addressed in this thesis is the tension between the benefits and costs of deep learning in side-channel evaluation. On the one hand, deep learning can reduce evaluation effort by relaxing assumptions about leakage models and reducing dependence on known data. On the other hand, it can make evaluation more expensive due to model selection, hyperparameter tuning, and the risk of overfitting. This thesis investigates how DL-SCA can be made more practical, reliable, and cost-effective for security evaluation workflows.

To this end, the thesis studies several strategies for improving DL-SCA without relying on excessive hyperparameter tuning. It examines the impact of increasing the amount of training data, regularization techniques, and ensemble learning. These approaches aim to improve generalization, robustness, and attack stability under realistic evaluation constraints. The thesis also investigates two deep learning approaches that relax major assumptions in classical SCA: leakage model-flexible DL-SCA, which avoids relying on fixed leakage models such as Hamming weight or identity, and deep learning-based blind SCA, which reduces dependence on plaintext or ciphertext by learning from noisy labels. ...
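
To make the two fixed leakage models named above concrete, here is a minimal sketch of how an 8-bit intermediate value could be turned into a training label under each of them. The function names and the example byte are hypothetical; how the thesis derives its intermediate values is not described in the abstract.

```python
def identity_label(value: int) -> int:
    """Identity leakage model: the label is the byte itself (256 classes)."""
    return value & 0xFF


def hamming_weight_label(value: int) -> int:
    """Hamming weight leakage model: the label is the number of set bits (9 classes)."""
    return bin(value & 0xFF).count("1")


# Hypothetical intermediate byte, chosen only for illustration.
intermediate = 0x3C  # binary 0011 1100

id_label = identity_label(intermediate)
hw_label = hamming_weight_label(intermediate)

print(id_label)  # 60
print(hw_label)  # 4
```

The identity model keeps all 256 classes, whereas the Hamming weight model collapses them into 9 unevenly populated classes, so the choice changes both the classification task and the class balance.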

Breaking the Blindfold

Deep Learning-based Blind Side-channel Analysis

Conference paper (2025) - Azade Rezaeezade, Trevor Yap, Dirmanto Jap, Shivam Bhasin, Stjepan Picek

Physical side-channel analysis (SCA) operates on the foundational assumption of access to known plaintext or ciphertext. However, this assumption can be easily invalidated in various scenarios, ranging from common encryption modes like Offset CodeBook (OCB) to complex hardware implementations, where such data may be inaccessible. Blind SCA addresses this challenge by operating without knowledge of the plaintext or ciphertext. Unfortunately, prior approaches of this kind have shown limited success in practical settings. This paper introduces the Deep Learning-based Blind Side-channel Analysis (DL-BSCA) framework, which leverages deep neural networks to recover secret keys in blind SCA settings. In addition, we propose a novel labeling method, Multi-point Cluster-based (MC) labeling, which accounts for dependencies between leakage variables by exploiting multiple sample points for each variable, improving the accuracy of trace labeling. We validate our approach across four datasets, including symmetric key algorithms (AES and ASCON) and a post-quantum cryptography algorithm, Kyber, with platforms ranging from high-leakage 8-bit AVR XMEGA to noisy 32-bit ARM STM32F4. Notably, previous methods failed to recover the key on the same datasets. Using DL-BSCA and MC labeling, we demonstrate the first successful blind SCA against a desynchronization countermeasure. All experiments are validated with real-world SCA measurements, highlighting the practicality and effectiveness of our approach. ...

Regularizers to the rescue

Fighting overfitting in deep learning-based side-channel analysis

Journal article (2024) - Azade Rezaeezade, Lejla Batina

Despite the considerable achievements of deep learning-based side-channel analysis, overfitting represents a significant obstacle in finding optimized neural network models. This issue is not unique to the side-channel domain. Regularization techniques are popular solutions to overfitting and have long been used in various domains. Yet works in the side-channel domain use regularization techniques only sporadically. Moreover, no systematic study has investigated the effectiveness of these techniques. In this paper, we investigate the effectiveness of regularization on a randomly selected model by applying four powerful and easy-to-use regularization techniques to eight combinations of datasets, leakage models, and deep learning topologies. The investigated techniques are L₁, L₂, dropout, and early stopping. Our results show that while all these techniques can improve performance in many cases, L₁ and L₂ are the most effective. Finally, if training time matters, early stopping is the best technique. ...
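
Three of the four techniques named above can be sketched in a few lines. The snippet below shows the standard L₁ and L₂ penalty terms that are added to a training loss, and a simple patience-based early-stopping rule. The function names, the regularization strength lam, the patience value, and the loss values are all assumptions for illustration and are not taken from the paper.

```python
import math


def l1_penalty(weights, lam):
    """L1 penalty: lam times the sum of absolute weights (encourages sparsity)."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 penalty: lam times the sum of squared weights (shrinks large weights)."""
    return lam * sum(w * w for w in weights)


def early_stopping_epoch(val_losses, patience):
    """Return the epoch with the best validation loss, stopping the scan
    once the loss has failed to improve for `patience` consecutive epochs."""
    best_loss, best_epoch, waited = math.inf, -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch


# Hypothetical weights and validation losses.
weights = [0.5, -1.0, 2.0]
l1 = l1_penalty(weights, lam=0.1)
l2 = l2_penalty(weights, lam=0.1)
stop_at = early_stopping_epoch([1.0, 0.8, 0.7, 0.75, 0.72, 0.9, 0.6], patience=2)

print(stop_at)  # 2
```

Early stopping ends training at epoch 4 here and keeps the weights from epoch 2, so the later improvement at epoch 6 is never seen; that is the trade-off between training time and the best reachable loss.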

To Overfit, or Not to Overfit

Improving the Performance of Deep Learning-Based SCA

Conference paper (2022) - Azade Rezaeezade, Guilherme Perin, Stjepan Picek

Profiling side-channel analysis allows evaluators to estimate the worst-case security of a target. When security evaluations relax the assumptions about the adversary’s knowledge, profiling models may easily be sub-optimal due to the inability to extract the most informative points of interest from the side-channel measurements. When used for profiling attacks, deep neural networks can learn strong models without feature selection, at the cost of expensive hyperparameter tuning. Unfortunately, due to very large search spaces, one usually finds very different model behaviors, and overfitting with poor generalization capacity is common. Usually, overfitting or poor generalization would be mitigated by adding more measurements to the profiling phase to reduce estimation errors. This paper provides a detailed analysis of different deep learning model behaviors and shows that adding more profiling traces as the sole remedy does not necessarily improve generalization. We identify the main problem as the sub-optimal selection of hyperparameters, which is difficult to resolve by simply adding more measurements. Instead, we propose to use small hyperparameter tweaks or regularization to resolve the problem. ...