How Robust Is Neural-Symbolic Model Logic Tensor Networks Against Clean-Label Data Poisoning Backdoor Attacks?
Benchmarking Benign Accuracy and Attack Success Rate
A. Chiru (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Kaitai Liang – Mentor (TU Delft - Cyber Security)
A. Agiollo – Mentor (TU Delft - Cyber Security)
A. Hanjalic – Graduation committee member (TU Delft - Intelligent Systems)
Abstract
Neuro-Symbolic (NeSy) models promise better interpretability and robustness than conventional neural networks, yet their resilience to data-poisoning backdoors is largely untested. This work investigates that gap by attacking a Logic Tensor Network (LTN) with clean-label triggers. Two attack strategies are benchmarked on MNIST addition and modulo tasks: (i) a targeted Projected Gradient Descent (PGD) variant that minimises the loss towards a target class, and (ii) a weighted pixel-blending (naïve) method. Three task-appropriate trigger placements (left image, right image, or both), poison rates (0.5%-20%), and blend ratios (10%-90%) are benchmarked, reporting benign accuracy and attack success rate (ASR). Results show that PGD can reach ≈ 15% ASR on the harder modulo task when both images are poisoned, but has negligible impact on the simpler addition task, while the naïve attack never exceeds 5% ASR unless the blend is large enough to be recognisable on visual inspection. Increasing the poison rate beyond 10% does not raise ASR further. Extending this work to dirty-label poisoning reveals a sharp trade-off: ASR rises to ≈ 75% on the modulo task at the cost of reduced stealth, without affecting benign accuracy, whereas clean-label poisoning reduced addition-task accuracy by roughly 35% while keeping ASR near 10%. Overall, clean-label backdoors remain low-yield yet stealthy against LTNs, whereas dirty-label strategies achieve higher efficacy but expose the attack to detection through accuracy degradation. Even modest attack success rates pose risks in safety-critical deployments, and the results demonstrate that backdoor potency and collateral effects are governed by task structure, underscoring the need for task-aware defence strategies.
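For illustration only, the two poisoning strategies can be sketched as below. This is a minimal PyTorch sketch, not the thesis implementation: the classifier (model), the fixed trigger image, and the function names and hyperparameter values are hypothetical placeholders.

    import torch
    import torch.nn.functional as F

    def blend_poison(image, trigger, alpha=0.3):
        # Naive attack: weighted pixel-blending of a fixed trigger into a
        # clean image; alpha is the blend ratio (10%-90% in the experiments).
        return (1.0 - alpha) * image + alpha * trigger

    def targeted_pgd_poison(model, image, target_class, eps=0.3, step=0.01, iters=40):
        # Targeted PGD variant: perturb the image so the loss towards
        # target_class decreases, staying in an L-infinity ball of radius eps.
        x = image.clone().detach()
        for _ in range(iters):
            x.requires_grad_(True)
            logits = model(x.unsqueeze(0))  # add a batch dimension
            loss = F.cross_entropy(logits, torch.tensor([target_class]))
            grad = torch.autograd.grad(loss, x)[0]
            x = (x - step * grad.sign()).detach()  # descend towards the target class
            x = torch.max(torch.min(x, image + eps), image - eps).clamp(0.0, 1.0)
        return x

In a clean-label setting, only images whose true label already equals the target class would be perturbed this way, so the labels themselves remain correct and the poison stays visually inconspicuous.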