This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Evaluation of ChatGPT-4’s performance on pediatric dentistry questions: accuracy and completeness analysis
Citations: 4
Authors: 2
Year: 2025
Abstract
BACKGROUND: This study aimed to evaluate the accuracy and completeness of Chat Generative Pre-trained Transformer-4 (ChatGPT-4) responses to frequently asked questions (FAQs) posed by patients and parents, as well as to curricular questions related to pediatric dentistry. It also sought to determine whether ChatGPT-4's performance varied across question topics.
METHODS: Responses from ChatGPT-4 to 30 FAQs from patients and parents and 30 curricular questions covering six pediatric dentistry topics (fissure sealants, fluoride, early childhood caries, oral hygiene practices, development of dentition and occlusion, and pulpal therapy) were evaluated by 30 pediatric dentists. Accuracy was rated on a five-point Likert scale and completeness on a three-point scale, capturing distinct aspects of response quality. Statistical analyses included Fisher's exact test, the Mann-Whitney U test, the Kruskal-Wallis test, and Bonferroni-adjusted post hoc comparisons.
RESULTS: ChatGPT-4's responses demonstrated high overall accuracy across all question types. Mean accuracy scores were 4.21 ± 0.55 for FAQs and 4.16 ± 0.70 for curricular questions, indicating that pediatric dentists generally rated responses as "good" to "excellent", with no statistically significant difference between the two groups (p = 0.942). Completeness scores were moderate overall, with means of 2.51 ± 0.40 (median: 3) for FAQs and 2.61 ± 1.53 (median: 3) for curricular questions (p = 0.563), reflecting generally acceptable response coverage. Accuracy scores for curricular questions varied significantly by topic (p = 0.007), with the highest score for fissure sealants (4.45 ± 0.62; median: 5) and the lowest for pulpal therapy (3.93 ± 0.93; median: 4).
CONCLUSION: From a clinical perspective, ChatGPT-4 demonstrates promising accuracy and acceptable completeness in pediatric dental communication. However, its performance in certain curricular areas, particularly fluoride and pulpal therapy, warrants cautious interpretation and requires professional oversight.
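The abstract mentions Bonferroni-adjusted post hoc comparisons across the six topics but does not say which comparisons were run or what significance level was used. The sketch below only illustrates how such an adjustment works in general. It assumes all pairwise topic comparisons and a family-wise alpha of 0.05; both are assumptions for illustration, not details from the paper, and `bonferroni_adjust` is a hypothetical helper.

```python
from itertools import combinations

# The six topics named in the abstract.
TOPICS = [
    "fissure sealants",
    "fluoride",
    "early childhood caries",
    "oral hygiene practices",
    "development of dentition and occlusion",
    "pulpal therapy",
]


def bonferroni_adjust(p_values):
    """Multiply each raw p-value by the number of comparisons, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Assumption: every pair of topics is compared once.
pairs = list(combinations(TOPICS, 2))
n_comparisons = len(pairs)  # 6 choose 2 = 15

# Assumption: family-wise significance level of 0.05 (not stated in the abstract).
alpha = 0.05
per_test_alpha = alpha / n_comparisons

print(f"{n_comparisons} pairwise comparisons")
print(f"per-comparison threshold: {per_test_alpha:.4f}")
```

Under these assumptions, a raw p-value from a single pairwise comparison would need to fall below roughly 0.0033 to count as significant, which is equivalent to multiplying each raw p-value by 15 and comparing it against 0.05.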