This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Multi‐model Artificial Intelligence Evaluation in Sudden Sensorineural Hearing Loss
Citations: 2
Authors: 4
Year: 2026
Abstract
Objective: To compare the diagnostic accuracy, linguistic clarity, and user satisfaction of three large language models (ChatGPT‐4.0, Claude 3.7 Sonnet, and OpenAI Mini 3) in managing sudden sensorineural hearing loss.
Study Design: Prospective, multi‐domain comparative analysis using blinded expert evaluation.
Setting: Online artificial intelligence (AI) platforms accessed under standardized conditions.
Methods: Twenty‐seven sudden sensorineural hearing loss‐related questions, covering general knowledge, audiometric interpretation, and clinical case scenarios, were submitted to the three AI models. Responses were evaluated by 10 board‐certified otolaryngologists using three validated tools: Quality Assessment of Medical Artificial Intelligence (QAMAI), Artificial Intelligence Performance Instrument (AIPI), and Artificial Intelligence Satisfaction and Performance Evaluation Questionnaire (AISPE‐Q). Linguistic complexity was assessed using metrics such as word count, sentence length, lexical diversity, and clinical verb use.
Results: ChatGPT‐4.0 demonstrated the highest scores in clinical accuracy (QAMAI: 4.57), completeness (4.53), and evaluator satisfaction (AISPE‐Q: 94%). Claude 3.7 outperformed the other models in clarity and sentence complexity, while OpenAI Mini 3 exhibited the highest lexical diversity and the most directive tone but scored lower overall. Inter‐rater reliability was strong (intraclass correlation coefficient [ICC] > 0.85). Correlation analysis revealed a significant relationship between objective quality and subjective satisfaction (r > 0.76).
Conclusion: ChatGPT‐4.0 delivered the most clinically aligned and satisfactory responses, whereas Claude 3.7 provided linguistically refined outputs. Our findings support the context‐specific application of hybrid large language model approaches in otolaryngology, particularly for patient education, diagnosis, and AI‐driven triage.
Level of Evidence: 2 (prospective comparative diagnostic accuracy study).
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,707 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,613 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,159 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,875 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations