This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Performance of 4 artificial intelligence chatbots in responding to multiple choice questions in operative dentistry
Citations: 0
Authors: 8
Year: 2026
Abstract
The accuracy and consistency of artificial intelligence (AI)-based chatbots, and therefore their dependability in dental education, remain questionable. This study aimed to evaluate the performance of four chatbots in answering multiple-choice questions (MCQs) in operative dentistry. Drawing on operative dentistry textbooks, a three-member expert panel developed 150 MCQs, which a fourth expert screened to yield a final set of 110 MCQs. These questions were entered into GPT-4o, Grok 3, Gemini Advanced and Claude 3.7 Sonnet in two rounds one week apart. Performance was measured as the proportion of correct answers. Inter- and intra-chatbot consistency was analysed using the McNemar test and Cohen’s Kappa. In the first round, Grok 3 and Gemini Advanced each answered 86.4% of the MCQs correctly, while GPT-4o and Claude 3.7 Sonnet each answered 85.5% correctly. In the second round, the performance of GPT-4o and Claude 3.7 Sonnet improved to 87.3% and 91.8%, respectively. Intra-chatbot consistency ranged from fair (Kappa = 0.33) for Claude 3.7 Sonnet to substantial for GPT-4o. Inter-chatbot consistency ranged from 0.34 to 0.54 in the first round and from 0.44 to 0.66 in the second round. The assessed chatbots showed promising performance in answering MCQs in operative dentistry and improved over time. They can be used as adjuncts in operative dentistry education, provided their inherent limitations are carefully considered. Determining the accuracy, and consequently the dependability, of the most widely used AI-based chatbots in responding to dental queries is essential for dental students. Dental students must interpret chatbot responses with caution and use them as supplementary tools alongside standard resources such as textbooks and guidance from mentors.
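The statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names are hypothetical, the ten-item answer vectors are made-up toy data rather than study results, and the exact (binomial) form of the McNemar test is an assumption, since the abstract does not say which variant was used. The implied counts assume that all 110 MCQs were scored in each round.

```python
from math import comb

N_QUESTIONS = 110  # final number of MCQs reported in the abstract


def implied_correct(percent, n=N_QUESTIONS):
    """Number of correct answers implied by a reported percentage (inferred)."""
    return round(percent / 100 * n)


def proportion_correct(flags):
    """Share of questions answered correctly (flags are 1 = correct, 0 = wrong)."""
    return sum(flags) / len(flags)


def cohen_kappa(a, b):
    """Cohen's Kappa for two paired binary sequences of equal length."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)  # agreement expected by chance
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


def mcnemar_exact_p(a, b):
    """Two-sided exact McNemar p-value based on the discordant pairs only."""
    only_a = sum(1 for x, y in zip(a, b) if x and not y)
    only_b = sum(1 for x, y in zip(a, b) if y and not x)
    n = only_a + only_b
    if n == 0:
        return 1.0
    k = min(only_a, only_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy data (hypothetical): one chatbot's correctness on 10 items in two rounds.
round1 = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
round2 = [1, 1, 1, 1, 1, 0, 1, 0, 1, 1]

print(proportion_correct(round1), proportion_correct(round2))
print(cohen_kappa(round1, round2))
print(mcnemar_exact_p(round1, round2))

# Counts implied by the reported percentages, assuming 110 scored items.
print([implied_correct(p) for p in (86.4, 85.5, 87.3, 91.8)])
```

Under that assumption, the reported percentages correspond to 95, 94, 96 and 101 correct answers out of 110. The same functions apply to inter-chatbot comparisons by passing the correctness vectors of two different chatbots from the same round.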
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,773 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,682 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,242 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,898 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations