This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
ChatGPT-4 versus emergency physicians for walk-in ED patients: history, differential diagnosis, testing, and disposition—a prospective feasibility study
Citations: 0
Authors: 6
Year: 2026
Abstract
As generative artificial intelligence (GenAI) increasingly intersects with healthcare, evaluating model performance for clinical decision support is gaining relevance. This exploratory, prospective feasibility study compared ChatGPT-4 with emergency physicians across four predefined emergency-care tasks: history capture, diagnostic reasoning (differential diagnosis versus discharge diagnosis), recommended diagnostic testing, and disposition. The study enrolled 32 consecutive adult walk-in ED patients between January and April 2024. Clinical teams provided usual care during the ED visit. After the initial evaluation by the ED team, while waiting for test results and further decisions, patients were invited to provide their medical history to ChatGPT-4. In parallel, an external expert physician (not involved in the patient's direct care) entered de-identified clinical details from each patient's ED medical file into ChatGPT-4, allowing the GenAI assistant to generate a simulated assessment and care recommendation for each patient, independent of the actual care provided by the ED physicians. Finally, a blinded ED expert reviewed and compared the assessments and recommendations of the ED physicians with those of the GenAI assistant. Patients were followed for 30 days to track any unplanned medical utilization after the initial ED visit. ChatGPT-4 identified additional medical history details not recorded by the ED physician in 21.2% of cases. Agreement between ChatGPT-4 differential diagnoses and final discharge diagnoses was moderate (κ = 0.54, 95% CI 0.24–0.90). The model tended to recommend more diagnostic tests and more hospital admissions than the treating physicians, reflecting a safety-forward but higher-resource pattern of decision-making. This feasibility study provides preliminary evidence for the potential use of a GenAI assistant in walk-in emergency care.
ChatGPT-4 demonstrated strength in structured history capture and moderate diagnostic alignment with clinicians, yet showed a conservative bias favoring admission and expanded testing. Future larger studies should assess whether this safety-oriented, high-resource utilization pattern can be calibrated to balance patient safety, efficiency, and clinical appropriateness.
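The diagnostic agreement reported above (κ = 0.54) is Cohen's kappa, which corrects raw agreement between two raters for agreement expected by chance. A minimal sketch of the computation is shown below; the function name and the example labels are illustrative, not data from the study.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters.

    rater_a, rater_b: equal-length sequences of categorical labels
    (e.g. diagnoses assigned by the model and by the physician).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label sequences")
    n = len(rater_a)
    # Observed agreement: fraction of cases where both raters match.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement under independence, from each rater's marginals.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    # kappa = 1 means perfect agreement; 0 means chance-level agreement.
    return (p_o - p_e) / (1 - p_e)

# Hypothetical example: agreement on 4 binary admit/discharge decisions.
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))  # → 0.5
```

Raw agreement here is 3/4, but because chance agreement is 0.5, kappa drops to 0.5, illustrating why κ is preferred over simple percent agreement for diagnostic concordance.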
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,393 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,259 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,688 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,502 citations