This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Evaluation of ChatGPT-4’s performance on pediatric dentistry questions: accuracy and completeness analysis

2025 · 4 citations · BMC Oral Health · Open Access

Abstract

BACKGROUND: This study aimed to evaluate the accuracy and completeness of Chat Generative Pre-trained Transformer-4 (ChatGPT-4) responses to frequently asked questions (FAQs) posed by patients and parents, as well as to curricular questions related to pediatric dentistry. It also sought to determine whether ChatGPT-4's performance varied across question topics.

METHODS: Thirty pediatric dentists evaluated ChatGPT-4's responses to 30 FAQs from patients and parents and 30 curricular questions covering six pediatric dentistry topics (fissure sealants, fluoride, early childhood caries, oral hygiene practices, development of dentition and occlusion, and pulpal therapy). Accuracy was rated on a five-point Likert scale and completeness on a three-point scale, capturing distinct aspects of response quality. Statistical analyses included Fisher's exact test, the Mann-Whitney U test, the Kruskal-Wallis test, and Bonferroni-adjusted post hoc comparisons.

RESULTS: ChatGPT-4's responses demonstrated high overall accuracy across all question types. Mean accuracy scores were 4.21 ± 0.55 for FAQs and 4.16 ± 0.70 for curricular questions, indicating that pediatric dentists generally rated the responses as "good" to "excellent", with no statistically significant difference between the two groups (p = 0.942). Completeness scores were moderate overall, with means of 2.51 ± 0.40 (median: 3) for FAQs and 2.61 ± 1.53 (median: 3) for curricular questions (p = 0.563), reflecting generally acceptable response coverage. Accuracy scores for curricular questions varied significantly by topic (p = 0.007), with the highest score for fissure sealants (4.45 ± 0.62; median: 5) and the lowest for pulpal therapy (3.93 ± 0.93; median: 4).

CONCLUSION: From a clinical perspective, ChatGPT-4 demonstrates promising accuracy and acceptable completeness in pediatric dental communication. However, its performance in certain curricular areas, particularly fluoride and pulpal therapy, warrants cautious interpretation and requires professional oversight.

Topics

Artificial Intelligence in Healthcare and Education · Clinical Reasoning and Diagnostic Skills · Explainable Artificial Intelligence (XAI)
