This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Evaluating the Performance of ChatGPT and DeepSeek in Bilingual Responses to Questions Regarding Craniofacial Microsomia
Citations: 0
Authors: 5
Year: 2026
Abstract
BACKGROUND: Craniofacial microsomia (CFM) is the second most common congenital craniofacial anomaly. As patients increasingly seek health information online, large language models (LLMs) like ChatGPT and DeepSeek have emerged as potential sources of medical information. This study evaluates the performance of ChatGPT-5 and DeepSeek-V3.2 in providing bilingual responses to CFM-related questions.

METHODS: Twenty-two questions covering CFM definition, etiology, diagnosis, treatment, and prognosis were developed. Each question was submitted in English and Chinese to both LLMs using a zero-prompt approach. Responses were evaluated for accuracy using a predefined 4-point scale, with readability assessed using the Flesch Reading Ease score for English and the Chinese Readability Platform for Chinese. Safety statement frequency was also recorded.

RESULTS: DeepSeek demonstrated significantly higher accuracy than ChatGPT in both English (score 1: 86.4% versus 45.5%, P=0.004) and Chinese (77.3% versus 40.9%, P=0.014). However, only DeepSeek produced responses with inaccurate or misleading content (score 3). For English readability, DeepSeek scored significantly higher (39.4±5.5 versus 35.1±8.4, P=0.031), while Chinese readability was comparable. DeepSeek also included safety statements more frequently (54.5%-72.7% versus 4.5%-18.2%).

CONCLUSIONS: Both LLMs show potential for CFM patient education, with DeepSeek offering superior accuracy and readability in English, though it occasionally produced misleading information. ChatGPT provided safer but less detailed responses. These findings highlight the need for model-specific optimization and clinician oversight when integrating LLMs into patient education for complex craniofacial conditions.
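
For readers unfamiliar with the English readability metric named in the abstract, the sketch below shows the standard Flesch Reading Ease formula, 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word), where higher scores mean easier text. It is not the authors' tooling: the function names are hypothetical, and the sentence splitter and vowel-group syllable counter are crude assumptions. Published tools use more careful syllable rules, so scores from this sketch will differ somewhat from theirs.

```python
import re


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula computed from raw counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def naive_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per vowel group, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def estimate_flesch(text: str) -> float:
    """Estimate the score of English text using the naive counters above."""
    # Sentences: non-empty chunks between runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words: runs of ASCII letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(naive_syllables(w) for w in words)
    return flesch_reading_ease(len(words), len(sentences), syllables)


if __name__ == "__main__":
    sample = "The cat sat. The dog ran."
    print(round(estimate_flesch(sample), 2))
```

On the commonly cited interpretation bands for this metric, scores between 30 and 50, the range of the means reported above (39.4 and 35.1), are usually described as difficult, college-level text.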