This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Empowering Radiologists With ChatGPT-4o
Citations: 3
Authors: 4
Year: 2025
Abstract
Purpose: This study evaluated the diagnostic accuracy and differential diagnostic capabilities of 12 large language models (LLMs), one cardiac radiologist, and 3 general radiologists in cardiac radiology. It also investigated the impact of ChatGPT-4o assistance on radiologist performance.
Materials and Methods: We collected 80 publicly available “Cardiac Case of the Month” cases from the Society of Thoracic Radiology website. The LLMs and general radiologist-III were provided with text-based information, whereas the other radiologists assessed the cases visually, with and without ChatGPT-4o assistance. Diagnostic accuracy and differential diagnosis scores (DDx scores) were analyzed using the χ², Kruskal-Wallis, Wilcoxon, McNemar, and Mann-Whitney U tests.
Results: The unassisted diagnostic accuracy was 72.5% for the cardiac radiologist, 53.8% for general radiologist-I, and 51.3% for general radiologist-II. With ChatGPT-4o assistance, accuracy improved to 78.8%, 70.0%, and 63.8%, respectively. The improvements for general radiologists I and II were statistically significant (P≤0.006). All radiologists’ DDx scores improved significantly with ChatGPT-4o assistance (P≤0.05). Remarkably, the ChatGPT-4o-assisted diagnostic accuracy and DDx score of general radiologist-I were not significantly different from the cardiac radiologist’s unassisted performance (P>0.05). Among the LLMs, Claude 3 Opus and Claude 3.5 Sonnet had the highest accuracy (81.3%), followed by Claude 3 Sonnet (70.0%). Regarding the DDx score, Claude 3 Opus outperformed all other models and general radiologist-III (P<0.05). The accuracy of general radiologist-III improved significantly from 48.8% to 63.8% with ChatGPT-4o assistance (P<0.001).
Conclusions: ChatGPT-4o may enhance the diagnostic performance of general radiologists in cardiac imaging, suggesting its potential as a diagnostic support tool. Further studies are required to assess its clinical integration.
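To make the reported percentages easier to read, the sketch below converts them back to case counts and shows how an exact McNemar test works on paired correct/incorrect outcomes. It assumes that every reader was scored on all 80 cases, which the abstract implies but does not state. The abstract does not report the discordant pair counts that McNemar's test needs, so `mcnemar_exact` is only a generic illustration with hypothetical inputs `b` and `c`. It is not the authors' analysis code.

```python
from math import comb

N_CASES = 80  # number of cases stated in the abstract

# Accuracies (%) as reported in the abstract: (unassisted, with ChatGPT-4o)
reported = {
    "cardiac radiologist": (72.5, 78.8),
    "general radiologist-I": (53.8, 70.0),
    "general radiologist-II": (51.3, 63.8),
    "general radiologist-III": (48.8, 63.8),
}


def to_count(percent, n=N_CASES):
    """Convert a rounded percentage back to a whole number of cases.

    Assumption: each reader was scored on all n cases.
    """
    return round(percent * n / 100)


counts = {
    reader: (to_count(before), to_count(after))
    for reader, (before, after) in reported.items()
}


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b = cases correct only without assistance (hypothetical input)
    c = cases correct only with assistance (hypothetical input)
    Under the null hypothesis, each discordant pair falls on either
    side with probability 0.5, so the test is a binomial tail sum.
    """
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


for reader, (before, after) in counts.items():
    gain = after - before
    print(f"{reader}: {before}/{N_CASES} -> {after}/{N_CASES} (net gain {gain})")
```

The net gain for each reader equals `c - b`, but the individual values of `b` and `c` cannot be recovered from the abstract, so the reported P values cannot be reproduced from this page alone.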
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,719 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,628 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,176 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,880 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations