Abstract 4364513: Comparison of ChatGPT-4o and OpenEvidence for Clinical Decision Support in Transcatheter Tricuspid Valve Interventions
Citations: 0
Authors: 5
Year: 2025
Abstract
Background: Artificial intelligence (AI) tools are increasingly used to support clinical decision-making. However, data on their performance in complex, subspecialty areas such as structural heart disease remain limited, particularly in light of recent FDA approvals of the TriClip (Abbott) and EVOQUE (Edwards) systems for transcatheter tricuspid interventions.

Hypothesis: We hypothesized that ChatGPT-4o and OpenEvidence would differ in clinical accuracy, response time, and consistency when addressing clinician-facing questions on transcatheter tricuspid valve repair (T-TEER) and replacement (TTVR).

Methods: Fifteen clinical questions related to T-TEER and TTVR were submitted to both ChatGPT-4o and OpenEvidence in April 2025. Two board-certified cardiac imaging specialists independently graded responses as accurate, partially accurate, or inaccurate. Disagreements were resolved through consensus. Inter-rater agreement was measured using the free-marginal Fleiss kappa (κ). Response times were recorded and compared using Welch’s t-test.

Results: ChatGPT-4o responses were rated as fully accurate in 10/15 cases (66.7%), partially accurate in 3/15 (20.0%), and inaccurate in 2/15 (13.3%) (Figure 1). OpenEvidence produced 4/15 fully accurate answers (26.7%), 3/15 partially accurate (20.0%), and 8/15 inaccurate (53.3%). ChatGPT-4o had significantly faster response times (mean 2.11 ± 0.84 s vs. 14.15 ± 2.91 s; p < 0.001) (Figure 2). Inter-rater agreement was higher for ChatGPT-4o (κ = 0.87) than for OpenEvidence (κ = 0.69). The likelihood of a fully accurate response was significantly greater with ChatGPT-4o (RR 2.5; 95% CI 1.02–6.14; p = 0.043).

Conclusion: ChatGPT-4o outperformed OpenEvidence in accuracy, speed, and consistency when answering complex questions about tricuspid valve therapies, suggesting potential utility in real-time clinical support. However, both platforms showed limitations, emphasizing the need for expert oversight. ChatGPT-4o’s contextual flexibility may be especially beneficial in busy clinical settings, while OpenEvidence’s citation-based outputs may better serve academic tasks.
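The reported risk ratio of 2.5 follows directly from the fully-accurate counts (10/15 vs. 4/15). A minimal sketch of that calculation is shown below, using the standard Katz log-normal confidence interval; the abstract does not state which CI method was used, so the bounds produced here differ slightly from the reported 1.02–6.14:

```python
import math

def risk_ratio(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio of group A vs. group B with a Katz log-normal 95% CI.

    events_a/n_a: fully accurate responses out of total questions, group A.
    events_b/n_b: same for group B.
    """
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of ln(RR) under the Katz method
    se = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lo = math.exp(math.log(rr) - z * se)
    hi = math.exp(math.log(rr) + z * se)
    return rr, lo, hi

# Counts reported in the abstract: ChatGPT-4o 10/15, OpenEvidence 4/15
rr, lo, hi = risk_ratio(10, 15, 4, 15)
print(f"RR = {rr:.2f} (95% CI {lo:.2f}-{hi:.2f})")  # RR = 2.50
```

The point estimate reproduces the abstract's RR of 2.5 exactly; only the interval depends on the chosen method.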
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,485 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,371 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,827 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,549 citations