C. Hong

5 records found

Bachelor thesis (2022) - K. Dwivedi, S. Roos, C. Hong, J. Huang, G. Lan
Adversarial training and its variants have become the standard defense against adversarial attacks, that is, perturbed inputs designed to fool the model. Boosting techniques such as AdaBoost have been successful for binary classification problems; however, there is limited research on applying them to provide adversarial robustness. In this work, we explore the question: How can AdaBoost ensemble learning provide adversarial robustness to white-box attacks when the "weak" learners are neural networks that do adversarial training? We design an extension of AdaBoost to support adversarial training in a multiclass setting, and name it Adven. To answer the question, we systematically study the effect of six variables of Adven’s training procedure on adversarial robustness. From a theoretical standpoint, our experiments show that known characteristics from adversarial training and ensemble learning apply in the combined context. Empirically, we demonstrate that an Adven ensemble is more robust than a single learner in every scenario. Using the best values found for the six tested variables, we derive an Adven ensemble that can defend against 91.88% of PGD attacks and obtain 96.72% accuracy on the MNIST dataset. ...

Effects of Data Distributions on Sample Transferability

Bachelor thesis (2022) - P.M. Vigilanza Lorenzo, S. Roos, J. Huang, C. Hong, G. Lan
Machine Learning (ML) models are vulnerable to adversarial samples: changes to regular input that are imperceptible to humans and elicit wrong output from a given model. Many adversarial attacks assume an attacker has access to the underlying model or to the data used to train the model. Instead, in this paper we focus on the effect that data distributions have on the transferability of adversarial samples under a "black-box" scenario. We assume an attacker has to train a separate model (the "substitute model") and generate adversaries using this independent model. The substitute models are trained with different data distributions: symmetric, cross-section or completely disjoint data relative to the data used to train the target model. The results demonstrate that an attacker only needs semantically similar data to execute an effective attack using a substitute model and well-known gradient-based adversarial generation techniques. Under ideal attack scenarios, target model accuracies can drop below 50%. Furthermore, our research shows that generating adversarial images from an ensemble increases average attack success. ...
Bachelor thesis (2022) - B.W.M.F. van Veen, S. Roos, C. Hong, J. Huang, G. Lan
Model extraction attacks generate a substitute model of a targeted victim neural network. It is possible to perform these attacks without a preexisting dataset, but doing so requires a very high number of queries to be sent to the victim model, often in the realm of several million. The more difficult the dataset, the more queries are required to obtain an accurate substitute model. Across state-of-the-art model extraction algorithms, the hyperparameters of the models are not thoroughly optimized, yet optimizing them has been found to have a strong impact on the accuracy of the substitute model. To attempt to reduce the number of queries required, this research studies the effects of optimizing hyperparameters for both the MNIST and Fashion-MNIST datasets. This is done through grid search and random search. The results show that proper hyperparameter tuning can reduce the number of queries required to perform model stealing if the hyperparameters are not already optimized. For example, some hyperparameter combinations require 125000+ queries to achieve 95% accuracy on the MNIST dataset, whereas others require only 15000. ...
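The two search strategies named in this abstract can be sketched as follows. This is a minimal illustration, not the thesis code: the search space, the function names, and the `score` callback (standing in for "substitute-model accuracy at a fixed query budget") are all assumptions made for the example.

```python
import itertools
import random

def grid_search(space, score):
    """Evaluate every combination in the grid and return the best one.

    `space` maps a hyperparameter name to its list of candidate values.
    `score` is a hypothetical callback returning a number to maximize.
    """
    names = sorted(space)
    candidates = [dict(zip(names, values))
                  for values in itertools.product(*(space[n] for n in names))]
    return max(candidates, key=score)

def random_search(space, score, n_trials, seed=0):
    """Evaluate n_trials randomly drawn combinations and return the best one."""
    rng = random.Random(seed)
    candidates = [{name: rng.choice(values) for name, values in space.items()}
                  for _ in range(n_trials)]
    return max(candidates, key=score)

# Hypothetical search space and a toy score, for illustration only.
space = {"learning_rate": [0.001, 0.01, 0.1], "batch_size": [32, 64]}
toy_score = lambda c: -abs(c["learning_rate"] - 0.01) - abs(c["batch_size"] - 64) / 1000

best_grid = grid_search(space, toy_score)
best_random = random_search(space, toy_score, n_trials=4)
```

Grid search costs one evaluation per combination, so its cost grows multiplicatively with each added hyperparameter; random search fixes the number of evaluations in advance but may miss the best combination.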
Bachelor thesis (2022) - S.G. Psathas, C. Hong, J. Huang, S. Roos, G. Lan
A machine learning classifier can be tricked using adversarial attacks, attacks that alter images slightly to make the target model misclassify the image. To create adversarial attacks on black-box classifiers, a substitute model can be created using model stealing. The research question this report addresses is how to perform model stealing while minimizing the number of queries the substitute model needs for training. The solution used in this report is a variant of the ActiveThief algorithm that makes use of active learning to determine which data is queried. The paper experiments with different subset selection strategies to find the most informative data points. Also, a seeding algorithm based on clustering is explored and finally, a stopping criterion for the ActiveThief algorithm is proposed. These variations are evaluated on their accuracy and the number of queries they take to achieve that accuracy. This paper shows that cluster seeding is an alternative to random seeding in ActiveThief. This paper also presents different subset selection strategies that outperform the random sampling strategy. Finally, a stopping criterion based on entropy is introduced that halts the algorithm when an uncertainty threshold is reached. ...
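Entropy-based subset selection and an entropy-based stopping rule of the kind this abstract describes might look like the sketch below. The function names, the use of the mean entropy over the pool, and the direction of the threshold comparison are assumptions for illustration; the abstract does not specify these details.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_most_uncertain(pool_probs, k):
    """Return the indices of the k pool samples with the highest entropy.

    `pool_probs` holds the substitute model's predicted class
    probabilities for each not-yet-queried sample.
    """
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:k]

def should_stop(pool_probs, threshold):
    """Assumed rule: stop once the mean entropy over the pool is below threshold."""
    mean_entropy = sum(entropy(p) for p in pool_probs) / len(pool_probs)
    return mean_entropy < threshold

# Toy predictions: sample 1 is the least certain, sample 0 the most certain.
pool = [[1.0, 0.0, 0.0], [0.34, 0.33, 0.33], [0.7, 0.2, 0.1]]
chosen = select_most_uncertain(pool, 2)
```

The samples returned by `select_most_uncertain` would be the next ones sent to the victim model, so the query budget is spent where the substitute model is least sure of its prediction.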
Bachelor thesis (2022) - J.J. Jansen, S. Roos, J. Huang, C. Hong, G. Lan
In recent years, there have been many studies on optimising the generation of adversarial examples for Deep Neural Networks (DNNs) in a black-box environment. The use of gradient-based techniques to obtain adversarial images with a minimal number of input-output queries to the attacked model has been extensively studied. However, existing studies have not coherently discussed the effect of different gradient estimation techniques. In this paper, a new one-point residual estimate is compared to the known two-point estimates. The findings in this paper show that the one-point residual estimate is not a viable option to decrease the number of queries to the attacked model. The accuracy of attacks that use a one-point residual estimate remains the same for weaker models. For stronger models, there is a slight decrease in accuracy at identical distortion levels. All estimates are tested on different PGD attacks on the MNIST and F-MNIST datasets using a 3-layer convolutional network.
...
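The difference between the two kinds of gradient estimate compared in this abstract can be sketched as follows, assuming the common zeroth-order formulations: a two-point estimate queries the loss at the current point and at a perturbed point, while a one-point residual estimate makes a single new query and reuses the value of the previous step's query. The function names and scaling are illustrative assumptions, not taken from the thesis.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def two_point_estimate(f, x, u, delta):
    """Two queries per step: f(x + delta*u) and f(x)."""
    d = len(x)
    shifted = [xi + delta * ui for xi, ui in zip(x, u)]
    scale = d * (f(shifted) - f(x)) / delta
    return [scale * ui for ui in u]

def one_point_residual_estimate(f, x, u, delta, prev_value):
    """One new query per step; prev_value is the previous step's query result.

    Returns the estimate and the new query value to pass to the next step.
    """
    d = len(x)
    shifted = [xi + delta * ui for xi, ui in zip(x, u)]
    value = f(shifted)
    scale = d * (value - prev_value) / delta
    return [scale * ui for ui in u], value

# Toy linear loss, for illustration only.
loss = lambda x: 3 * x[0] + 2 * x[1]
x0, u0 = [1.0, 1.0], [1.0, 0.0]
g_two = two_point_estimate(loss, x0, u0, delta=0.5)
g_one, last_value = one_point_residual_estimate(loss, x0, u0, 0.5, prev_value=loss(x0))
```

In an attack loop the direction would come from `random_unit_vector`, and the one-point variant would halve the queries per step at the cost of a noisier estimate, since the reused value was measured at a different point.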