Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Assessing ChatGPT-4’s performance on the US prosthodontic exam: impact of fine-tuning and contextual prompting vs. base knowledge, a cross-sectional study

2025·11 Zitationen·BMC Medical EducationOpen Access

Volltext beim Verlag öffnen

Zitationen

Autoren

2025

Jahr

Abstract

BACKGROUND: Artificial intelligence (AI), such as ChatGPT-4 from OpenAI, has the potential to transform medical education and assessment. However, its effectiveness in specialized fields like prosthodontics, especially when comparing base to fine-tuned models, remains underexplored. This study evaluates the performance of ChatGPT-4 on the US National Prosthodontic Resident Mock Exam in its base form and after fine-tuning. The aim is to determine whether fine-tuning improves the AI's accuracy in answering specialized questions. METHODS: An official sample questions from the 2021 US National Prosthodontic Resident Mock Exam was used, obtained from the American College of Prosthodontists. A total of 150 questions were initially considered, and resources were available for 106 questions. Both the base and fine-tuned models of ChatGPT-4 were tested under simulated exam conditions. Performance was assessed by comparing correct and incorrect responses. The Chi-square test was used to analyze accuracy, with significance set at p < 0.05. The Kappa coefficient was calculated to measure agreement between the models' responses. RESULTS: The base model of ChatGPT-4 correctly answered 62.7% of the 150 questions. For the 106 questions with resources, the fine-tuned model answered 73.6% correctly. The Chi-square test showed a significant improvement in performance after fine-tuning (p < 0.001). The Kappa coefficient was 0.39, indicating moderate agreement between the models (p < 0.001). Performance varied by topic, with lower accuracy in areas such as Implant Prosthodontics, Removable Prosthodontics, and Occlusion, though the fine-tuned model consistently outperformed the base model. CONCLUSIONS: Fine-tuning ChatGPT-4 with specific resources significantly enhances its accuracy in answering specialized prosthodontic exam questions. While the base model provides a solid baseline, fine-tuning is essential for improving AI performance in specialized fields. However, certain topics may require more targeted training to achieve higher accuracy.

Autoren

Institutionen

Themen

Artificial Intelligence in Healthcare and EducationAcademic integrity and plagiarismDental Research and COVID-19

Volltext beim Verlag öffnen

Assessing ChatGPT-4’s performance on the US prosthodontic exam: impact of fine-tuning and contextual prompting vs. base knowledge, a cross-sectional study

Abstract

Ähnliche Arbeiten

Autoren

Institutionen

Themen