Human Insight vs. Artificial Intelligence: A Thematic Analysis
Comparing Manual and LLM Approaches to Understanding How Smokers Experience Preparatory Activities in a Digital Cessation Intervention
K. Nair (TU Delft - Electrical Engineering, Mathematics and Computer Science)
W.P. Brinkman – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)
R.L. Lagendijk – Mentor (TU Delft - Electrical Engineering, Mathematics and Computer Science)
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.
Abstract
Smoking remains a leading cause of preventable death, making effective cessation support a global health priority. While conversational agents (chatbots) offer a scalable solution, their success depends on understanding the user’s experience. This study addresses two interconnected challenges: first, understanding the subjective experience of smokers with preparatory activities proposed by a chatbot, and second, evaluating how well Large Language Models (LLMs) can analyze this qualitative feedback. The research employs a comparative design. A manual thematic analysis of smokers’ written reflections first established a baseline coding scheme. This scheme was then compared against the outputs of three LLMs, which were tasked with both generating themes independently and applying the predefined manual scheme. The accuracy of the LLMs’ application was measured against the human baseline using Cohen’s Kappa. The manual analysis revealed that smokers’ experiences were predominantly positive, showing strong motivation and a sense that the activities helped reinforce their quitting goals. These positive experiences were tempered by expressions of skepticism about the activities’ effectiveness and mentions of personal barriers to quitting. The comparative analysis demonstrated that while LLMs could identify these broad positive and negative topics, they failed to capture more subtle, attitude-based concepts, such as a user’s willingness to engage with an activity despite their personal doubts. Furthermore, the models’ accuracy in applying a predefined coding scheme was substantially lower than the human baseline. This work makes two primary contributions. For digital health, the findings show that cessation aids must be designed to personalize activities to address specific user barriers and skepticism. Methodologically, the study provides a clear verdict on the current role of LLMs in this context: while LLMs show potential as an exploratory aid in theme generation, they are not yet a viable tool for applying a predefined coding scheme, making human analytical oversight essential for ensuring the depth and validity of qualitative research.
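The abstract names Cohen’s Kappa as the agreement measure between the human baseline and each LLM. As a minimal illustration of how that statistic is computed, the Python sketch below corrects the observed agreement for the agreement expected by chance. It is not the thesis’s actual analysis code: the function name `cohens_kappa`, the code labels, and the six example items are all hypothetical, and it assumes each reflection receives exactly one code from each coder.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: derived from each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if p_e == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes for six reflections (illustrative only, not study data).
human = ["motivation", "skepticism", "motivation", "barrier", "motivation", "skepticism"]
llm = ["motivation", "motivation", "motivation", "barrier", "skepticism", "skepticism"]

kappa = cohens_kappa(human, llm)
print(round(kappa, 3))  # 0.455
```

In this made-up example the coders agree on four of six items (observed agreement 0.667), but because both use "motivation" frequently, chance agreement is already 0.389, so kappa is only about 0.455. This is why kappa, rather than raw percent agreement, is the usual choice when comparing coders.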