This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Comparing generative and retrieval-based chatbots in answering patient questions regarding age-related macular degeneration and diabetic retinopathy
Citations: 27
Authors: 16
Year: 2024
Abstract
BACKGROUND/AIMS: To compare the performance of generative versus retrieval-based chatbots in answering patient inquiries regarding age-related macular degeneration (AMD) and diabetic retinopathy (DR). METHODS: We evaluated four chatbots: generative models (ChatGPT-4, ChatGPT-3.5 and Google Bard) and a retrieval-based model (OcularBERT) in a cross-sectional study. Their response accuracy to 45 questions (15 AMD, 15 DR and 15 others) was evaluated and compared. Three masked retinal specialists graded the responses using a three-point Likert scale: either 2 (good, error-free), 1 (borderline) or 0 (poor with significant inaccuracies). The scores were aggregated, ranging from 0 to 6. Based on majority consensus among the graders, the responses were also classified as 'Good', 'Borderline' or 'Poor' quality. RESULTS: ChatGPT-4 and ChatGPT-3.5 had no 'Poor' rated responses. Google Bard produced 6.7% 'Poor' responses, and OcularBERT produced 20%. Across question types, ChatGPT-4 outperformed Google Bard only for AMD, and ChatGPT-3.5 outperformed Google Bard for DR and others. CONCLUSION: ChatGPT-4 and ChatGPT-3.5 demonstrated superior performance, followed by Google Bard and OcularBERT. Generative chatbots are potentially capable of answering domain-specific questions outside their original training. Further validation studies are still required prior to real-world implementation.
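The grading scheme in the METHODS part of the abstract (three graders, each giving 0, 1 or 2, summed to a 0 to 6 total and also reduced to a majority label) can be illustrated with a minimal Python sketch. The function names and the example grades are hypothetical. The abstract does not say how a three-way disagreement was resolved, so returning `None` in that case is an assumption of this sketch, not the study's procedure.

```python
from collections import Counter

# Label for each point on the three-point scale described in the abstract.
LABELS = {2: "Good", 1: "Borderline", 0: "Poor"}


def aggregate_score(grades):
    """Sum three graders' scores (each 0, 1 or 2) into a 0-6 total."""
    if len(grades) != 3 or any(g not in LABELS for g in grades):
        raise ValueError("expected three grades, each 0, 1 or 2")
    return sum(grades)


def consensus_label(grades):
    """Return the label that at least two of the three graders agree on.

    Assumption: when all three graders disagree (0, 1, 2) there is no
    majority; this sketch returns None rather than guessing how the
    study handled such cases.
    """
    aggregate_score(grades)  # reuse the validation above
    grade, votes = Counter(grades).most_common(1)[0]
    return LABELS[grade] if votes >= 2 else None


example = (2, 2, 1)  # hypothetical grades for one chatbot response
print(aggregate_score(example), consensus_label(example))  # prints: 5 Good
```

The two outputs carry different information: the summed score keeps the spread between graders, while the consensus label collapses it to a single category.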
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,674 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,583 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,105 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,862 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Authors
Institutions
- Singapore National Eye Center (SG)
- Singapore Eye Research Institute (SG)
- Chinese Academy of Medical Sciences & Peking Union Medical College (CN)
- Peking Union Medical College Hospital (CN)
- Duke-NUS Medical School (SG)
- National University of Singapore (SG)
- National University Hospital (SG)
- Beijing Tongren Hospital (CN)
- Moorfields Eye Hospital NHS Foundation Trust (GB)
- Moorfields Eye Hospital (GB)
- University College London (GB)
- University of Washington (US)
- Beijing Tsinghua Chang Gung Hospital (CN)
- Tsinghua University (CN)
- Sungkyunkwan University (KR)
- Kangbuk Samsung Hospital (KR)