Towards Automatic Principles of Persuasion Detection Using Machine Learning Approach
Lázaro Bustio-Martínez (Universidad Iberoamericana)
Vitali Herrera-Semenets (Centro de Aplicaciones de Tecnologías de Avanzada)
Juan-Luis García-Mendoza (Université Sorbonne Paris Nord)
Jorge Ángel González-Ordiano (Universidad Iberoamericana)
Luis Zúñiga-Morales (Universidad Iberoamericana)
Rubén Sánchez Rivero (Centro de Aplicaciones de Tecnologías de Avanzada)
José Emilio Quiróz-Ibarra (Universidad Iberoamericana)
Pedro Antonio Santander-Molina (Pontificia Universidad Católica de Valparaíso)
Jan van den Berg (TU Delft - Cyber Security)
Davide Buscaldi (Université Sorbonne Paris Nord)
Abstract
Persuasion is a human activity of influence. In marketing, persuasion can help customers find solutions to their problems and make informed choices, or it can convince them to buy a useful (or useless) product or service. In cybercrime, persuasion can trick users into revealing sensitive information or performing actions that benefit attackers. Phishing is one of the most common and dangerous forms of persuasion-based attack because it exploits human vulnerabilities rather than technical ones. An intelligent system capable of detecting and classifying persuasion attempts could therefore help protect users. This work presents a Machine Learning approach that analyzes messages in terms of the principles of persuasion, using different data representations. The aim is to determine which data representation and which classification algorithm perform best at detecting each principle of persuasion, as a prior step to detecting phishing attacks. The results indicate that, among the combinations tested, a single combination of data representation and classification algorithm performs best. The resulting classification models detect the principles of persuasion with AUC-ROC scores between 0.78 and 0.86.
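As an illustration of the comparison described in the abstract, the following is a minimal sketch of how combinations of data representation and classifier could be ranked by AUC-ROC for a single principle of persuasion. It is not the authors' implementation: the function names, the combination labels ("representation_A", "classifier_X"), and the toy labels and scores are all hypothetical placeholders. AUC-ROC is computed here from its pairwise definition, the fraction of (positive, negative) pairs in which the positive message receives the higher score.

```python
from itertools import product


def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked
    correctly; tied scores count as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos) * len(neg))


def best_combination(labels, scores_by_combo):
    """Return the (representation, classifier) key with the highest
    AUC-ROC, together with the AUC-ROC of every combination."""
    aucs = {combo: auc_roc(labels, s) for combo, s in scores_by_combo.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Hypothetical toy data: 1 = message uses the principle, 0 = it does not.
labels = [1, 0, 1, 0, 1, 0]

# Hypothetical classifier scores for two placeholder combinations.
scores_by_combo = {
    ("representation_A", "classifier_X"): [0.9, 0.4, 0.35, 0.2, 0.8, 0.5],
    ("representation_B", "classifier_X"): [0.6, 0.7, 0.3, 0.4, 0.9, 0.2],
}

best, aucs = best_combination(labels, scores_by_combo)
print(best, round(aucs[best], 3))
```

In practice the scores would come from models evaluated on held-out messages, and the selection would be repeated once per principle of persuasion. The pairwise computation is quadratic in the number of messages, so a rank-based implementation from an established library would be preferable for larger datasets.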