This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
The performance of artificial intelligence-based large language models on ophthalmology-related questions in Swedish proficiency test for medicine: ChatGPT-4 omni vs Gemini 1.5 Pro
Citations: 15
Authors: 6
Year: 2024
Abstract
• The AI-based LLMs ChatGPT-4o and Gemini 1.5 Pro demonstrate robust performance on ophthalmology-related MCQs, which are occasionally challenging even for healthcare professionals.
• They have the potential to contribute to ophthalmological medical education, not only by selecting correct answers to MCQs but also by providing explanations.
• While both AI platforms are beneficial, ChatGPT-4o is one step ahead.

Purpose: To compare the interpretation and response context of two commonly used artificial intelligence (AI)-based large language model (LLM) platforms on ophthalmology-related multiple-choice questions (MCQs) from the Swedish proficiency test for medicine ("kunskapsprov för läkare").

Design: Observational study.

Methods: The questions of a total of 29 exams held between 2016 and 2024 were reviewed. All ophthalmology-related questions were included in this study and categorized into ophthalmology sections. Questions were posed to the ChatGPT-4o and Gemini 1.5 Pro AI-based LLM chatbots in Swedish and English with specific commands. Second, all MCQs were asked again without feedback. As the final step, feedback was given for questions that were still answered incorrectly, and all questions were subsequently re-asked.

Results: A total of 134 ophthalmology-related questions out of 4876 MCQs were evaluated with both AI-based LLMs. The mean number of ophthalmology MCQs per exam across the 29 exams was 4.62 ± 2.21 (range: 0–8). After the final step, ChatGPT-4o achieved higher accuracy in Swedish (94%) and English (95.5%) than Gemini 1.5 Pro (88.1% in both languages) (p = 0.13 and p = 0.04, respectively). Moreover, ChatGPT-4o provided more correct answers than Gemini 1.5 Pro in the neuro-ophthalmology section (n = 47) across all three attempts in English (p < 0.05). There was no statistically significant difference in the inter-AI comparison of the other ophthalmology sections or in the interlingual comparison within each AI.
Conclusions: Both AI-based LLMs, and especially ChatGPT-4o, appear to perform well on ophthalmology-related MCQs. AI-based LLMs can contribute to ophthalmological medical education not only by selecting correct answers to MCQs but also by providing explanations.
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,324 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,189 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,588 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,470 citations
Authors
Institutions
- Kastamonu University (TR)
- Sahlgrenska University Hospital (SE)
- Region Västra Götaland (SE)
- Södra Älvsborg Hospital (SE)
- University of Copenhagen (DK)
- Copenhagen University Hospital (DK)
- Rigshospitalet (DK)
- Institute of Clinical Research (US)
- University of Southern Denmark (DK)
- University of Health Sciences Antigua (AG)