OpenAlex · Updated hourly · Last updated: 27 Mar 2026, 21:41

This is an overview page with metadata for this scientific work. The full article is available from the publisher.

Assessing ChatGPT 4.0’s Capabilities in the United Kingdom Medical Licensing Examination (UKMLA): A Robust Categorical Analysis

2025 · 6 citations · Scientific Reports · Open Access

6 citations · 8 authors · 2025

Abstract

Advances in the various applications of artificial intelligence will have important implications for medical training and practice. The advances in ChatGPT-4, alongside the introduction of the medical licensing assessment (MLA), provide an opportunity to compare GPT-4's medical competence against the expected level of a United Kingdom junior doctor and to discuss its potential in clinical practice. Using 191 freely available questions in MLA style, we assessed GPT-4's accuracy with and without offering multiple-choice options. We compared single-step and multi-step questions, which targeted different points in the clinical process, from diagnosis to management. A chi-squared test was used to assess statistical significance. GPT-4 scored 86.3% and 89.6% in papers one and two respectively. Without the multiple-choice options, GPT-4's performance was 61.5% and 74.7% in papers one and two respectively. There was no significant difference between single-step and multi-step questions, but GPT-4 answered 'management' questions significantly worse than 'diagnosis' questions when no multiple-choice options were offered (p = 0.015). GPT-4's accuracy across categories and question structures suggests that LLMs can competently process clinical scenarios but remain incapable of truly understanding them. Large language models incorporated into practice alongside a trained practitioner may balance risk and benefit while the necessary robust testing of these evolving tools is conducted.
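The category comparison in the abstract rests on a chi-squared test of answer counts. As a minimal sketch of how such a test works on a 2×2 contingency table, the Pearson statistic and its p-value (df = 1) can be computed with the standard library alone; the counts below are hypothetical and are not the study's data:

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) on the
    2x2 contingency table [[a, b], [c, d]].

    Returns (statistic, p_value). For a 2x2 table the statistic has
    one degree of freedom, and the chi-squared survival function with
    df = 1 reduces to erfc(sqrt(x / 2))."""
    n = a + b + c + d
    # Closed-form Pearson statistic from the table's marginals.
    stat = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p

# Hypothetical counts (NOT the paper's data): correct vs. incorrect
# answers for 'diagnosis' and 'management' questions without
# multiple-choice options.
stat, p = chi2_2x2(40, 10, 28, 22)
```

With these illustrative counts the statistic is about 6.62 and p ≈ 0.010, i.e. a significant difference at the 5% level, mirroring the kind of comparison reported in the abstract.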
