Exploiting neuron activation values for creating adversarial examples

Utilization of intermediate network information in genetic algorithms

Master Thesis (2021)

Author(s)

I.C. van der Blij (TU Delft - Electrical Engineering, Mathematics and Computer Science)

Contributor(s)

Sicco Verwer – Mentor (TU Delft - Cyber Security)

Faculty

Electrical Engineering, Mathematics and Computer Science

To reference this document use:

https://resolver.tudelft.nl/uuid:0cc33f03-ba87-4873-be90-30965aec12cc

Publication Year

2021

Language

English

Graduation Date

25-03-2021

Awarding Institution

Delft University of Technology

Programme

Computer Science | Cyber Security

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

The increasing use of neural networks poses a threat to the cyber security of the systems that rely on them, as adversaries can exploit this vulnerability and attack such systems with adversarial examples. Because neural networks can have complex structures with tens of thousands of parameters, they are hard for humans to understand. Hence, existing white-box methods use very limited network information, and most state-of-the-art methods are based on gradient descent. In this work we further investigate the inner workings of a neural network and consider using intermediate network information for the creation of adversarial examples. We show that neuron activation values can be distinguished by the class of the data point and contain meaningful information about the prediction of the network. Based on this information, we propose a new, gradient-free method for creating adversarial examples based on a genetic algorithm. By covering a larger part of the search space and manipulating the neuron activation values, our method achieves a success rate that exceeds that of most state-of-the-art methods, such as DeepFool and RFGSM. We also find that the trade-off between success rate and distance strongly affects the results of a method. We therefore recommend balancing this trade-off carefully by formulating an optimization objective with separate loss and distance components.
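To make the last recommendation concrete, the following is a minimal sketch of a gradient-free genetic search whose fitness has a separate loss component and distance component. It is not the method from the thesis: the "network" is a toy linear classifier, neuron activation values are not used, and every name and parameter (`predict_score`, `fitness`, `genetic_attack`, `lam`, `margin`, `sigma`) is a hypothetical choice made only for illustration.

```python
import random


def predict_score(x, weights, bias):
    """Toy stand-in for a network: linear score, class 1 if score > 0."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def fitness(candidate, original, weights, bias, lam=0.1, margin=0.5):
    """Lower is better: loss component plus lam times distance component."""
    # Loss component: positive until the score drops below -margin,
    # i.e. until the candidate is misclassified with some safety margin.
    loss = max(0.0, predict_score(candidate, weights, bias) + margin)
    # Distance component: squared L2 distance to the original input.
    distance = sum((c - o) ** 2 for c, o in zip(candidate, original))
    return loss + lam * distance


def genetic_attack(original, weights, bias, lam=0.1, margin=0.5,
                   pop_size=30, generations=200, sigma=0.1, seed=0):
    """Search for a nearby input with a non-positive score, without gradients."""
    rng = random.Random(seed)

    def mutate(x):
        # Add Gaussian noise to every feature.
        return [xi + rng.gauss(0.0, sigma) for xi in x]

    def score(c):
        return fitness(c, original, weights, bias, lam, margin)

    # Start from noisy copies of the original input.
    population = [mutate(original) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half unchanged (elitism).
        population.sort(key=score)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover followed by mutation.
            child = [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]
            children.append(mutate(child))
        population = parents + children
    return min(population, key=score)


if __name__ == "__main__":
    x0, w, b = [1.0, 1.0], [1.0, 1.0], 0.0
    adv = genetic_attack(x0, w, b)
    print("original score:", predict_score(x0, w, b))
    print("adversarial score:", predict_score(adv, w, b))
```

In this sketch, `lam` is the knob that balances the trade-off: a larger value keeps candidates closer to the original input at the cost of a weaker push across the decision boundary, while a smaller value favours success over distance.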

Files

Thesis_Irene_van_der_Blij.pdf

(pdf | 6.38 Mb)

License info not available