Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Fine-Tuning Large Language Models to Enhance Programmatic Assessment in Graduate Medical Education

2024·6 Zitationen·Journal of Education in Perioperative MedicineOpen Access

Volltext beim Verlag öffnen

Zitationen

Autoren

2024

Jahr

Abstract

Background: Natural language processing is a collection of techniques designed to empower computer systems to comprehend and/or produce human language. The purpose of this investigation was to train several large language models (LLMs) to explore the tradeoff between model complexity and performance while classifying narrative feedback on trainees into the Accreditation Council for Graduate Medical Education subcompetencies. We hypothesized that classification accuracy would increase with model complexity. Methods: The authors fine-tuned several transformer-based LLMs (Bidirectional Encoder Representations from Transformers [BERT]-base, BERT-medium, BERT-small, BERT-mini, BERT-tiny, and SciBERT) to predict Accreditation Council for Graduate Medical Education subcompetencies on a curated dataset of 10 218 feedback comments. Performance was compared with the authors' previous work, which trained a FastText model on the same dataset. Performance metrics included F1 score for global model performance and area under the receiver operating characteristic curve for each competency. Results: No models were superior to FastText. Only BERT-tiny performed worse than FastText. The smallest model with comparable performance to FastText, BERT-mini, was 94% smaller. Area under the receiver operating characteristic curve for each competency was similar on BERT-mini and FastText with the exceptions of Patient Care 7 (Situational Awareness and Crisis Management) and Systems-Based Practice. Discussion: Transformer-based LLMs were fine-tuned to understand anesthesiology graduate medical education language. Complex LLMs did not outperform FastText. However, equivalent performance was achieved with a model that was 94% smaller, which may allow model deployment on personal devices to enhance speed and data privacy. This work advances our understanding of best practices when integrating LLMs into graduate medical education.

Autoren

Institutionen

Naval Medical Center Portsmouth(US)

Themen

Artificial Intelligence in Healthcare and EducationMachine Learning in HealthcareSimulation-Based Education in Healthcare

Volltext beim Verlag öffnen

Fine-Tuning Large Language Models to Enhance Programmatic Assessment in Graduate Medical Education

Abstract

Ähnliche Arbeiten

Autoren

Institutionen

Themen