Objective: The rapid expansion of digital healthcare has increased the volume of patient communication and, with it, the workload of healthcare professionals. Large Language Models (LLMs) hold promise for automating responses to patient questions relayed through eHealth platforms, yet concerns persist regarding their effectiveness, accuracy, and limitations in healthcare settings. This study aims to evaluate the current evidence on the performance and perceived suitability of LLMs in healthcare, focusing on their role in supporting clinical decision-making and patient communication. Materials and methods: A systematic search of PubMed and Embase up to June 11, 2025 identified 330 studies, of which 20 met the inclusion criteria, namely that they compared the accuracy and adequacy of medical information provided by LLMs with that of healthcare professionals and guidelines. The search strategy combined terms related to LLMs, healthcare professionals, and patient questions. Risk of bias was assessed with the ROBINS-I tool. Results: Nineteen studies focused on medical specialties and one on the primary care setting. Twelve studies favored the responses generated by LLMs, six reported mixed results, and two favored the healthcare professionals’ responses. Bias components generally scored moderate to low, indicating a low risk of bias. Discussion and conclusions: This review summarizes current evidence on the accuracy and adequacy of medical information provided by LLMs in response to patient questions, compared with healthcare professionals and clinical guidelines. While LLMs show potential as supportive tools in healthcare, their integration should be approached cautiously because of inconsistent performance and possible risks. Further research is essential before widespread adoption.