This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Accuracy of LLMs in medical education: evidence from a concordance test with medical teacher
16 citations · 3 authors · 2025
Abstract
The study provides an approach for assessing the accuracy of different LLMs. The study concludes that ChatGPT is far superior (70%) to other LLMs when asked medical questions across different specialties, while contrary to expectations, Gemini (50%) performed poorly. When compared with medical teachers, the low accuracy of LLMs suggests that general-purpose LLMs should be used with caution in medical education.
Similar works
The Measurement of Observer Agreement for Categorical Data
1977 · 77,129 citations
STATISTICAL METHODS FOR ASSESSING AGREEMENT BETWEEN TWO METHODS OF CLINICAL MEASUREMENT
1986 · 47,144 citations
A Coefficient of Agreement for Nominal Scales
1960 · 40,476 citations
A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research
2016 · 26,117 citations
Intraclass correlations: Uses in assessing rater reliability.
1979 · 22,736 citations