This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Evaluating the Accuracy of Gemini 2.0 Advanced and ChatGPT 4o in Cataract Knowledge: A Performance Analysis Using Brazilian Council of Ophthalmology Board Exam Questions
Citations: 4 · Authors: 2 · Year: 2025
Abstract
INTRODUCTION: Large language models (LLMs) like Gemini 2.0 Advanced and ChatGPT-4o are increasingly applied in medical contexts. This study assesses their accuracy in answering cataract-related questions from Brazilian ophthalmology board exams, evaluating their potential for clinical decision support.
METHODS: A retrospective analysis was conducted using 221 multiple-choice questions. Responses from both LLMs were evaluated by two independent ophthalmologists against the official answer key. Accuracy rates and inter-evaluator agreement (Cohen's kappa) were analyzed.
RESULTS: Gemini 2.0 Advanced achieved 85.45% and 80.91% accuracy, while ChatGPT-4o scored 80.00% and 84.09%. Inter-evaluator agreement was moderate (κ = 0.514 and 0.431, respectively). Performance varied across exam years.
CONCLUSION: Both models demonstrated high accuracy in cataract-related board exam questions, supporting their potential as educational tools. However, moderate agreement and performance variability indicate the need for further refinement and validation.
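The inter-evaluator agreement statistic reported above, Cohen's kappa, corrects raw agreement for agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The following is a minimal pure-Python sketch of that formula for illustration only; it is not the authors' analysis code, and the function name and inputs are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    proportion of agreement and p_e the chance-expected proportion
    derived from each rater's marginal label frequencies.
    """
    n = len(rater_a)
    # Observed agreement: fraction of items both raters labeled identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement under independence of the two raters.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Example: two raters judging four LLM answers as correct (1) or incorrect (0).
print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))  # → 0.5
```

Values in the 0.41–0.60 range, like the κ = 0.514 and 0.431 reported here, are conventionally interpreted as "moderate" agreement.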
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,693 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,598 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,124 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,871 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations