This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Who Knows Anatomy Best? A Comparative Study of ChatGPT-4o, DeepSeek, Gemini, and Claude
6
Citations
1
Authors
2025
Year
Abstract
This study evaluates the performance of ChatGPT-4o (OpenAI), DeepSeek-v3 (DeepSeek), Gemini 2.0 (Google DeepMind), and Claude 3.7 Sonnet (Anthropic) in answering anatomy questions from the Turkish Dental Specialty Admission Exam (DUS), comparing their accuracy, response times, and answer lengths. A total of 74 text-based multiple-choice anatomy questions from DUS exams administered between 2012 and 2021 were analyzed. The questions varied in difficulty and included both basic anatomical identification and clinically oriented scenarios; most focused on head and neck anatomy, followed by the thorax, neuroanatomy, and musculoskeletal regions, which are particularly relevant to dental education. Answers were scored against official sources, and response times and word counts were recorded. Statistical analyses, including the Kruskal-Wallis and Cochran's Q tests, were used to compare performance. ChatGPT-4o achieved the highest accuracy (98.6%), while each of the other three models achieved 89.2%. Gemini produced the fastest responses (mean: 4.47 s); DeepSeek generated the shortest answers and Gemini the longest (p < 0.001). The differences in accuracy, response time, and word count were statistically significant (p < 0.05). ChatGPT-4o outperformed the other models in accuracy on DUS anatomy questions, suggesting its superior potential as a tool for dental education. Future research should explore the integration of LLMs into structured learning programs.
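As a rough illustration of the figures in the abstract, the sketch below converts the reported accuracy percentages into counts of correct answers out of 74, assuming the percentages are rounded fractions of the full question set. It also shows how a Cochran's Q statistic can be computed for paired correct/incorrect outcomes. The per-question results are not given on this page, so the small outcome matrix is purely hypothetical, and the function name `cochrans_q` is an illustrative choice rather than the authors' code.

```python
TOTAL_QUESTIONS = 74

# Accuracy percentages as reported in the abstract.
reported_accuracy = {
    "ChatGPT-4o": 98.6,
    "DeepSeek-v3": 89.2,
    "Gemini 2.0": 89.2,
    "Claude 3.7 Sonnet": 89.2,
}

# Assumption: each percentage is a rounded fraction of the 74 questions.
correct_counts = {
    model: round(pct / 100 * TOTAL_QUESTIONS)
    for model, pct in reported_accuracy.items()
}


def cochrans_q(rows):
    """Cochran's Q statistic for paired binary outcomes.

    rows: one list per question, holding 1 (correct) or 0 (incorrect)
    for each model, in the same model order for every question.
    """
    k = len(rows[0])  # number of models being compared
    col_totals = [sum(row[j] for row in rows) for j in range(k)]
    row_totals = [sum(row) for row in rows]
    grand_total = sum(row_totals)
    denominator = k * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when every row is all 0s or all 1s")
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - grand_total ** 2)
    return numerator / denominator


# Hypothetical toy data (4 questions x 3 models), NOT the study's results.
toy_outcomes = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
toy_q = cochrans_q(toy_outcomes)

print(correct_counts)
print(toy_q)
```

Under the null hypothesis of equal accuracy, Q approximately follows a chi-squared distribution with k - 1 degrees of freedom, so a p-value would come from comparing the statistic against that distribution.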
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,393 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,259 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,688 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,502 citations