This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Comparative Evaluation of Chatbot Responses on Coronary Artery Disease
Citations: 3
Authors: 1
Year: 2025
Abstract
OBJECTIVE: Coronary artery disease (CAD) is the leading cause of morbidity and mortality globally. The growing interest in natural language processing chatbots (NLPCs) has driven their inevitable widespread adoption in healthcare. The purpose of this study was to evaluate the accuracy and reproducibility of responses provided by NLPCs, such as ChatGPT, Gemini, and Bing, to frequently asked questions about CAD.
METHODS: Fifty frequently asked questions about CAD were asked twice, with a one-week interval, on ChatGPT, Gemini, and Bing. Two cardiologists independently scored the answers into four categories: comprehensive/correct (1), incomplete/partially correct (2), a mix of accurate and inaccurate/misleading (3), and completely inaccurate/irrelevant (4). The accuracy and reproducibility of each NLPC's responses were assessed.
RESULTS: ChatGPT's responses were scored as 14% incomplete/partially correct and 86% comprehensive/correct. In contrast, Gemini provided 68% comprehensive/correct responses, 30% incomplete/partially correct responses, and 2% a mix of accurate and inaccurate/misleading information. Bing delivered 60% comprehensive/correct responses, 26% incomplete/partially correct responses, and 8% a mix of accurate and inaccurate/misleading information. Reproducibility scores were 88% for ChatGPT, 84% for Gemini, and 70% for Bing.
CONCLUSION: ChatGPT demonstrates significant potential to improve patient education about coronary artery disease by providing more sensitive and accurate answers compared to Bing and Gemini.
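The scoring scheme described in the abstract can be illustrated with a minimal Python sketch. It tabulates the share of responses per category and a reproducibility percentage for two rounds of answers. The abstract does not define how reproducibility was computed, so the sketch assumes it is the percentage of questions that received the same category in both rounds. The function names and the five-question toy data are hypothetical and are not taken from the study.

```python
from collections import Counter

# Category labels as listed in the abstract (scores 1-4).
CATEGORIES = {
    1: "comprehensive/correct",
    2: "incomplete/partially correct",
    3: "mix of accurate and inaccurate/misleading",
    4: "completely inaccurate/irrelevant",
}


def category_shares(scores):
    """Return the percentage of responses that fall in each category."""
    counts = Counter(scores)
    total = len(scores)
    return {cat: 100.0 * counts.get(cat, 0) / total for cat in CATEGORIES}


def reproducibility(first_round, second_round):
    """Percentage of questions scored in the same category in both rounds.

    Assumption: this definition is illustrative; the abstract does not
    state the exact metric the authors used.
    """
    if len(first_round) != len(second_round):
        raise ValueError("both rounds must cover the same questions")
    same = sum(a == b for a, b in zip(first_round, second_round))
    return 100.0 * same / len(first_round)


# Hypothetical toy data for five questions (not from the study).
round_1 = [1, 1, 2, 1, 3]
round_2 = [1, 2, 2, 1, 3]

shares = category_shares(round_1)
repro = reproducibility(round_1, round_2)

for cat, label in CATEGORIES.items():
    print(f"{label}: {shares[cat]:.0f}%")
print(f"reproducibility: {repro:.0f}%")
```

With the study's design, each list would hold 50 scores per chatbot per round, one for each question.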
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,773 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,682 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,242 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,898 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations