This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Do they learn when they read? A two-stage evaluation of AI models’ orthopedic knowledge using Orthobullets and Miller’s review
Citations: 0
Authors: 4
Year: 2025
Abstract
Aims: Large language models (LLMs) such as ChatGPT, Gemini, Claude, and Perplexity are increasingly incorporated into medical education; however, their baseline orthopedic knowledge and their ability to utilize structured reference materials remain insufficiently characterized. This study aimed to compare the performance of four advanced LLMs before and after exposure to a standardized orthopedic textbook and to determine whether domain-specific educational content enhances inference-time accuracy. Methods: A two-stage evaluation was conducted using 110 multiple-choice questions from the Orthobullets platform. Each model first completed the question set under identical prompting conditions. A new chat session was then initiated, and the full PDF of Miller’s Review of Orthopaedics (9th edition) was uploaded using native document-processing functions. Models were subsequently retested with the same questions. Pre–post accuracy differences were analyzed using the Wilcoxon signed-rank test (effect size r calculated as Z/√N). Between-model differences were assessed using the Kruskal–Wallis test with Bonferroni-adjusted pairwise comparisons. The primary outcome was the change in accuracy (%) after textbook exposure. Results: All four models demonstrated significant improvement following access to the textbook (p
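The pre–post analysis described in the Methods can be sketched in Python. This is a minimal illustration with hypothetical accuracy values (not the study's data), assuming paired pre/post scores per question block; the effect size r = Z/√N is computed from the normal approximation of the Wilcoxon signed-rank statistic, as the abstract describes.

```python
import numpy as np
from scipy import stats

# Hypothetical paired accuracies (pre vs. post textbook exposure) -- illustrative only
pre = np.array([0.68, 0.72, 0.65, 0.70, 0.74, 0.66, 0.71, 0.69])
post = np.array([0.80, 0.78, 0.75, 0.82, 0.79, 0.77, 0.81, 0.76])

# Wilcoxon signed-rank test, normal approximation (two-sided by default)
res = stats.wilcoxon(post, pre, method="approx")

# Effect size r = Z / sqrt(N), with Z from the standard normal
# approximation of the signed-rank statistic W (no tie correction)
n = len(pre)
W = res.statistic                       # min of positive/negative rank sums
mu = n * (n + 1) / 4                    # mean of W under H0
sigma = np.sqrt(n * (n + 1) * (2 * n + 1) / 24)  # SD of W under H0
z = (W - mu) / sigma
r = abs(z) / np.sqrt(n)

print(f"p = {res.pvalue:.4f}, effect size r = {r:.2f}")
```

A Kruskal–Wallis test across the four models' accuracy distributions would follow the same pattern via `stats.kruskal`, with Bonferroni-adjusted pairwise Wilcoxon comparisons afterward.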
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,479 cit.
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,364 cit.
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,814 cit.
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 cit.
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,543 cit.