Benchmarking the Robustness of Neuro-Symbolic Learning against Backdoor Attacks
Semantic Loss vs BadNets Poisoning Attack
D. Becerra Merodio (TU Delft - Electrical Engineering, Mathematics and Computer Science)
A. Agiollo – Mentor (TU Delft - Cyber Security)
Kaitai Liang – Mentor (TU Delft - Cyber Security)
A. Hanjalic – Graduation committee member (TU Delft - Intelligent Systems)
Abstract
Neuro-Symbolic (NeSy) models combine the generalization ability of neural networks with the interpretability of symbolic reasoning. While the vulnerability of neural networks to backdoor data poisoning attacks is well-documented, the implications of such attacks for NeSy models remain underexplored. This paper investigates whether adding a semantic loss component to a neural network improves its robustness against BadNets backdoor attacks. We evaluate multiple semantic loss models trained on the CelebA dataset with varying constraints, semantic loss weights, and backdoor trigger configurations. Our results show that incorporating a semantic loss component whose constraints involve the target label significantly reduces the attack success rate. Additionally, we find that increasing the weight of the semantic loss component can enhance robustness, although at the cost of balanced accuracy. Interestingly, changes in the size and placement of the trigger have minimal effect on attack performance. These findings suggest that while semantic loss can improve robustness to some extent, its effectiveness is highly dependent on the nature and relevance of the constraints used, as well as on the weight assigned to the semantic loss component.
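As a concrete illustration of the setup described in the abstract, the sketch below shows (i) a BadNets-style poisoning step that stamps a small trigger patch on a fraction of training images and forces the attacker-chosen target attribute, and (ii) a semantic loss term for a single implication constraint added to the task loss with a tunable weight. This is a minimal sketch under assumed choices: the function names, the implication constraint, the bottom-right trigger placement, and the default poisoning rate and weight are hypothetical placeholders, not the exact configuration evaluated in this work.

```python
# Hypothetical sketch (PyTorch); not the exact code used in this work.
import torch
import torch.nn.functional as F


def poison_badnets(images, labels, target_idx, rate=0.1, patch=4):
    """Stamp a small white square in the bottom-right corner of a random
    fraction of the images and force the target attribute label to 1."""
    images, labels = images.clone(), labels.clone()
    n_poison = int(rate * images.shape[0])
    idx = torch.randperm(images.shape[0])[:n_poison]
    images[idx, :, -patch:, -patch:] = 1.0   # the backdoor trigger patch
    labels[idx, target_idx] = 1.0            # attacker-chosen target label
    return images, labels


def semantic_loss_implication(p_a, p_b, eps=1e-12):
    """Semantic loss for the constraint A -> B: the negative log of the
    weighted model count 1 - p_a * (1 - p_b), where p_a and p_b are the
    predicted probabilities of attributes A and B."""
    wmc = 1.0 - p_a * (1.0 - p_b)
    return -torch.log(wmc.clamp_min(eps)).mean()


def total_loss(logits, targets, idx_a, idx_b, sl_weight=0.5):
    """Multi-label task loss plus a weighted semantic loss term."""
    task = F.binary_cross_entropy_with_logits(logits, targets)
    probs = torch.sigmoid(logits)
    sl = semantic_loss_implication(probs[:, idx_a], probs[:, idx_b])
    return task + sl_weight * sl
```

Varying sl_weight corresponds to the semantic loss weight sweep mentioned in the abstract, and making the constraint involve the target attribute corresponds to the constraint-relevance finding.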