How can Large Language Models for code be used to harm the privacy of users?
Red-Teaming Large Language Models
I. Moruz (TU Delft - Electrical Engineering, Mathematics and Computer Science)
A. van Deursen – Mentor (TU Delft - Software Engineering)
A. Al-Kaswan – Coach (TU Delft - Software Engineering)
M. Izadi – Graduation committee member (TU Delft - Software Engineering)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
In recent years, Large Language Models (LLMs) have advanced significantly, demonstrating impressive capabilities in generating human-like text. This paper explores the potential privacy risks associated with Large Language Models for Code (LLMs4Code), which are increasingly used in various sectors. These models, while beneficial for tasks such as code generation and understanding, may inadvertently expose sensitive information contained in their training datasets. We investigate the specific types of personally identifiable information (PII) that can be leaked and examine the targeted and untargeted attacks, each with diverse prompting styles, under which these leaks occur. Our analysis reveals that LLMs4Code can leak PII under targeted attacks, emphasizing the need for robust privacy-preserving measures. This research contributes to the ongoing discourse on AI ethics and privacy, providing insights into the safety of various prompting conditions under targeted and untargeted attacks. Future work should focus on running the experiment with more diverse parameters, implementing more advanced PII detection techniques, and testing a broader range of models to enhance the generalizability of the findings.