This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Benchmarking Different Natural Language Processing Models for Their Responses to Queries on Tooth-supported Fixed Dental Prostheses in Terms of Accuracy and Consistency
0
Citations
2
Authors
2025
Year
Abstract
Aim: This study aimed to evaluate the accuracy and repeatability of responses generated by four different natural language processing (NLP) models to questions about tooth-supported fixed dental prostheses.
Materials and Method: Twelve open-ended questions in Turkish were created and posed, with pre-prompts, to four models in the morning, afternoon, and evening: OpenAI o3 (LRM-O), OpenAI GPT 4.5 (LLM-G), DeepSeek R1 (LRM-R), and DeepSeek V3 (LLM-V). The responses were evaluated with a holistic rubric. Accuracy was compared using the Kruskal–Wallis H test. Consistency between the graders was assessed using the Brennan and Prediger coefficient and the Cohen kappa coefficient. Repeatability was assessed using the Fleiss kappa and Krippendorff alpha coefficients (p < 0.05).
Results: There was no statistically significant difference in accuracy between the LRM-O, LLM-G, LRM-R, and LLM-V groups (p = 0.298). The respective accuracies of LRM-O, LLM-G, LRM-R, and LLM-V were 77.7%, 50%, 66.6%, and 77.7%. In addition, the repeatability of the LLMs was almost perfect, whereas that of the LRMs was substantial.
Conclusion: Within the limitations of the study, LRMs and LLMs exhibited similar accuracy. However, the repeatability of LLMs was higher than that of LRMs.
Keywords: Artificial intelligence, Dental prostheses, Treatment protocols
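As a rough illustration of three of the agreement coefficients named in the abstract, the following Python sketch computes the Cohen kappa and the Brennan and Prediger coefficient for two graders, and the Fleiss kappa for repeated runs of one model. It is not the authors' analysis code. The function names, the three rating categories ("correct", "partial", "incorrect"), and all ratings are hypothetical, since the abstract does not describe the rubric levels or report raw scores.

```python
from collections import Counter


def observed_agreement(a, b):
    """Share of items on which two raters gave the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen kappa: chance agreement is taken from each rater's own marginals.

    Undefined (division by zero) if both raters use one identical label throughout.
    """
    n = len(a)
    p_o = observed_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def brennan_prediger(a, b, num_categories):
    """Brennan and Prediger coefficient: chance agreement is fixed at 1/q."""
    p_o = observed_agreement(a, b)
    p_e = 1 / num_categories
    return (p_o - p_e) / (1 - p_e)


def fleiss_kappa(ratings):
    """Fleiss kappa for items that each received the same number of ratings.

    `ratings` is a list of items; each item is the list of labels it received
    (here, hypothetically, one label per morning/afternoon/evening run).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    totals = Counter()
    p_bar = 0.0
    for item in ratings:
        counts = Counter(item)
        totals.update(counts)
        # Proportion of agreeing rating pairs for this item.
        p_bar += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    p_bar /= n_items
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical ratings of 12 responses by two graders.
grader_a = ["correct", "correct", "partial", "incorrect", "correct", "partial",
            "correct", "correct", "incorrect", "partial", "correct", "correct"]
grader_b = ["correct", "correct", "partial", "incorrect", "correct", "correct",
            "correct", "correct", "incorrect", "partial", "correct", "partial"]

# Hypothetical labels for 4 questions, each asked in three sessions.
sessions = [["correct"] * 3, ["partial"] * 3, ["incorrect"] * 3,
            ["correct", "correct", "partial"]]

print("Cohen kappa:", cohen_kappa(grader_a, grader_b))
print("Brennan-Prediger:", brennan_prediger(grader_a, grader_b, num_categories=3))
print("Fleiss kappa:", fleiss_kappa(sessions))
```

The two-grader coefficients differ only in how chance agreement is estimated, which is why a study may report both: the Cohen kappa depends on how unevenly each grader uses the categories, while the Brennan and Prediger coefficient does not.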
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,336 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,207 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,607 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,476 citations