This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Quality and Reliability of AI Information on Dental Implant Failure: A Comparative Multi-Model Analysis
Citations: 0
Authors: 6
Year: 2026
Abstract
OBJECTIVE: This study aimed to develop a consensus-based set of patient questions on dental implant failure and to compare the clarity, quality, accuracy, reliability, and readability of responses generated by 4 widely used AI chatbots: ChatGPT-4, DeepSeek-R1, Microsoft Copilot, and Google Gemini.

METHODS: Twenty-three expert-validated questions were derived from the EAO 2021 and ICOI Pisa Consensus reports and independently submitted to each AI model under standardized, non-personalized conditions. Responses were assessed using CLEAR criteria, mGQS, a 5-point accuracy scale, the first 8 DISCERN items, and Flesch-based readability indices. Nonparametric tests were used for intermodel comparisons.

RESULTS: AI models demonstrated significant variability in performance. Gemini achieved the highest accuracy (P < 0.001), whereas ChatGPT-4 exhibited the highest reliability based on DISCERN scores. Copilot generated the most structurally fluent responses, whereas DeepSeek-R1 offered the best readability. Although CLEAR and mGQS scores were high across all systems, readability and linguistic complexity varied markedly. Accuracy, clarity, and reliability were strongly correlated, whereas readability displayed the expected inverse association with grade-level demand.

CONCLUSIONS: AI chatbots hold potential as adjunct tools for patient education on implant failure; however, their performance characteristics differ substantially. Gemini excels in accuracy, ChatGPT-4 in reliability, Copilot in fluency, and DeepSeek-R1 in readability. Model-specific guidance and continued refinement are needed to enhance the clinical usefulness and accessibility of AI-generated patient information.
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,578 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,470 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,984 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,814 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations