This is an overview page with metadata about this scientific paper. The full article is available from the publisher.
Education Research: Can Large Language Models Match MS Specialist Training?
Citations: 5
Authors: 7
Year: 2025
Abstract
Background and Objectives: Artificial intelligence (AI), particularly large language models (LLMs), is increasingly explored for clinical decision support and medical education. While the general proficiency of LLMs on broad medical examinations has been demonstrated, their application of domain-specific knowledge in neurology remains underexplored. This study addresses that gap using multiple sclerosis (MS) as an exemplar, evaluating how LLM information access strategies affect accuracy in a specialized postgraduate curriculum and exploring possible roles of LLMs in neurology education. Methods: tests. Results: = 0.119), performance varied by question type and difficulty. For MCQs with a single correct answer, domain-specific LLMs outperformed GPT-4o, although differences remained nonsignificant. By contrast, students showed stronger performance on single-wrong-answer formats. Stratified by difficulty, students outperformed LLMs on "easy" questions, while LLMs tended to achieve higher accuracy on "medium" and "hard" items. For open-ended questions, students reached 77.8% accuracy, while GPT-4o, MS RAG, and Prof. Valmed scored 66.7%-85.0%. Discussion: These findings indicate that LLMs can perform at levels broadly comparable to postgraduate students and that they may be particularly useful on more difficult tasks, where their consistency may complement human reasoning in a neurology subspecialty curriculum. While results should be interpreted cautiously given the limited sample size, this study illustrates possible roles for LLMs in neurology education, for example as AI tutors for complex topics, as support for formative assessments, or as targeted review resources. Further research should assess integration into educational workflows and decision support.
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,700 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,605 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,133 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,873 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations