Enhanced phishing detection using multimodal data
Lázaro Bustio-Martínez (Universidad Iberoamericana)
Vitali Herrera-Semenets (Advanced Technologies Application Center)
Jorge Ángel González-Ordiano (Universidad Iberoamericana Ciudad de México)
Yamel Pérez-Guadarramas (Universidad Iberoamericana Ciudad de México)
Luis Zúñiga-Morales (Universidad Iberoamericana Ciudad de México)
Daniela Montoya-Godínez (Universidad Iberoamericana Ciudad de México)
Miguel Ángel Álvarez-Carmona (Centro de Investigación en Matemáticas, CIMAT)
Jan van den Berg (TU Delft - Cyber Security)
Abstract
Phishing remains one of the most persistent cybersecurity threats, increasingly exploiting not only technical vulnerabilities but also human cognitive biases. Existing detection systems often rely on single-modality features and black-box models, which restrict both generalization and interpretability. This study presents an explainable multimodal framework that combines textual and technical cues, including message content, URL structure, and Principles of Persuasion, to capture both objective and subjective aspects of phishing. Several classifiers were evaluated using 10-fold stratified cross-validation, with Random Forest achieving the best balance between performance and transparency (ROC-AUC = 0.9840), supported by SHAP explanations that identify the most influential linguistic and structural features. Comparative analysis shows that the proposed framework outperforms unimodal baselines while preserving interpretability, enabling a clear rationale for classification outcomes. These results indicate that integrating multimodal representation with explainable learning strengthens phishing detection accuracy, improves user trust, and supports reliable deployment in real-world environments.
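The evaluation protocol named in the abstract (stratified 10-fold cross-validation scored by ROC-AUC) can be illustrated with a minimal, standard-library sketch. This is not the authors' pipeline: the data are synthetic, the two features (`url_length`, `urgency_cue`) are hypothetical examples of a technical cue and a persuasion cue, and `toy_score` is a fixed stand-in for a trained classifier such as Random Forest. In a real run the model would be fitted on the nine training folds before scoring the held-out fold.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, keeping class ratios roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's shuffled indices round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def toy_score(features):
    """Hypothetical stand-in for a trained classifier's phishing score."""
    url_length, urgency_cue = features
    return 0.01 * url_length + 0.5 * urgency_cue


# Synthetic data (assumption): 100 phishing (1) and 100 legitimate (0) messages.
data_rng = random.Random(42)
labels = [1] * 100 + [0] * 100
samples = [
    (data_rng.gauss(60 if y else 40, 10), data_rng.random() < (0.7 if y else 0.2))
    for y in labels
]

folds = stratified_folds(labels, k=10)
fold_aucs = []
for test_idx in folds:
    # A real pipeline would fit the model on the other nine folds here.
    y_test = [labels[i] for i in test_idx]
    s_test = [toy_score(samples[i]) for i in test_idx]
    fold_aucs.append(roc_auc(y_test, s_test))

mean_auc = sum(fold_aucs) / len(fold_aucs)
print(f"mean ROC-AUC over {len(folds)} folds: {mean_auc:.4f}")
```

Stratification matters here because each held-out fold must contain both classes for ROC-AUC to be defined, and equal class ratios keep the per-fold scores comparable before averaging.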