OpenAlex · Updated hourly · Last updated: 24.05.2026, 10:42

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Assessing GPT-4o and GPT-4 in answering and explaining ophthalmology examination questions from Taiwan’s medical licensing test

2025 · 0 citations · Taiwan Journal of Ophthalmology · Open Access

Citations: 0
Authors: 5
Year: 2025

Abstract

PURPOSE: This study aims to evaluate and compare the performance of generative pretrained transformer (GPT)-4o and GPT-4 in answering Taiwan’s National Medical Licensing Examination (NMLE) ophthalmology questions from 2014 to 2023, focusing on both answer accuracy and explanation quality.

MATERIALS AND METHODS: A total of 169 ophthalmology questions from Taiwan’s NMLE over the past decade were selected. GPT-4o and GPT-4 were tested on each question, and their performance was measured by correct answers and explanations. The results were categorized by ophthalmologic subspecialty and analyzed using statistical methods to determine significant differences between the two models.

RESULTS: GPT-4o achieved a significantly higher overall correct answer rate (92.9%) compared to GPT-4 (69.2%) across all ophthalmology questions from 2014 to 2023 (P < 0.01). GPT-4o outperformed GPT-4 in most subspecialties, including retina (95.8% vs. 58.3%, P < 0.01), external disease and cornea (96.3% vs. 77.8%, P = 0.04), and neuro-ophthalmology (87.5% vs. 50%, P = 0.02). GPT-4o and GPT-4 performed similarly in glaucoma and uveitis, with no significant differences observed. In terms of explanation quality, GPT-4o provided accurate explanations for 90.7% of the questions, with the highest accuracy in pediatric ophthalmology and strabismus (100%) and the lowest in uveitis (83.3%).

CONCLUSION: GPT-4o exhibited superior performance in both answering and explaining ophthalmology questions from Taiwan’s NMLE compared to GPT-4. These results suggest that GPT-4o may be a more reliable tool for educational and diagnostic purposes in ophthalmology.
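The overall comparison reported in the abstract (92.9% vs. 69.2% on 169 questions) can be roughly sanity-checked with a short calculation. The sketch below is illustrative only: the abstract does not name the statistical test the authors used, and the per-question results are not given here. The correct-answer counts are reconstructed by rounding the reported percentages, and the two models are treated as independent samples in a pooled two-proportion z-test. Because both models answered the same questions, a paired test such as McNemar's would be the more natural choice if per-question data were available. All function and variable names are hypothetical.

```python
import math


def two_proportion_z_test(correct_a, correct_b, n):
    """Two-sided pooled z-test for two proportions, each measured on n items.

    Assumption: the two samples are independent. This is a simplification,
    since both models were tested on the same questions.
    """
    p_a = correct_a / n
    p_b = correct_b / n
    pooled = (correct_a + correct_b) / (2 * n)
    standard_error = math.sqrt(pooled * (1 - pooled) * (2 / n))
    z = (p_a - p_b) / standard_error
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


N_QUESTIONS = 169  # total number of questions, from the abstract

# Counts reconstructed from the reported percentages (an assumption,
# not figures stated in the abstract).
gpt4o_correct = round(0.929 * N_QUESTIONS)
gpt4_correct = round(0.692 * N_QUESTIONS)

z, p_value = two_proportion_z_test(gpt4o_correct, gpt4_correct, N_QUESTIONS)
print(f"GPT-4o correct: {gpt4o_correct}/{N_QUESTIONS}")
print(f"GPT-4 correct:  {gpt4_correct}/{N_QUESTIONS}")
print(f"z = {z:.2f}, two-sided p = {p_value:.2e}")
```

Under these assumptions the resulting p-value falls well below 0.01, which is consistent with the significance level reported for the overall comparison. It is not a reproduction of the authors' analysis.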

Similar works