This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Performance of ChatGPT and Microsoft Copilot in Bing in answering obstetric ultrasound questions and analyzing obstetric ultrasound reports
Citations: 3
Authors: 7
Year: 2025
Abstract
To evaluate and compare the performance of the publicly available ChatGPT-3.5, ChatGPT-4.0 and Microsoft Copilot in Bing (Copilot) in answering obstetric ultrasound questions and analyzing obstetric ultrasound reports. ChatGPT-3.5, ChatGPT-4.0 and Copilot answered twenty questions related to obstetric ultrasound and analyzed 110 obstetric ultrasound reports; each question and report was posed to each model three times, at different times. The accuracy and consistency of each response to the twenty questions and of each report analysis were evaluated and compared. In answering the twenty questions, ChatGPT-3.5 and ChatGPT-4.0 both scored higher than Copilot in accuracy (95.0% for both vs. 80.0%) and in consistency (90.0% and 85.0% vs. 75.0%), but the differences among the three were not statistically significant. When analyzing obstetric ultrasound reports, ChatGPT-3.5 and ChatGPT-4.0 were significantly more accurate than Copilot (P < 0.05), and all three showed high consistency and the ability to provide recommendations. Overall accuracy was 83.86% for ChatGPT-3.5, 84.13% for ChatGPT-4.0 and 77.51% for Copilot; overall consistency was 87.30%, 93.65% and 90.48%, respectively. These large language models (ChatGPT-3.5, ChatGPT-4.0 and Copilot) have the potential to assist clinical workflows by enhancing patient education and patient clinical communication around common obstetric ultrasound issues. Because their responses are inconsistent and sometimes inaccurate, and because of cybersecurity concerns, physician supervision is crucial when these models are used.
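The abstract does not state how accuracy and consistency were computed. The following is a minimal sketch under assumed definitions, not the authors' method: accuracy is taken as the share of individual responses graded correct, and consistency as the share of items whose three repeated responses all received the same grade. The function names and the example grades are hypothetical.

```python
from typing import Dict, List


def accuracy(grades: Dict[str, List[bool]]) -> float:
    """Share of all individual responses graded correct (assumed definition)."""
    flat = [g for repeats in grades.values() for g in repeats]
    return sum(flat) / len(flat)


def consistency(grades: Dict[str, List[bool]]) -> float:
    """Share of items whose repeated responses all got the same grade
    (assumed definition)."""
    same = [len(set(repeats)) == 1 for repeats in grades.values()]
    return sum(same) / len(same)


# Hypothetical grades for three questions, each posed three times.
example = {
    "q1": [True, True, True],
    "q2": [True, False, True],
    "q3": [False, False, False],
}

print(f"accuracy:    {accuracy(example):.2%}")
print(f"consistency: {consistency(example):.2%}")
```

Under these assumptions a model can be highly consistent while being inaccurate (as with the hypothetical `q3`), which is why the two figures are reported separately.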