This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Advancing conversational diagnostic AI with multimodal reasoning
Citations: 1 · Authors: 36 · Year: 2025
Abstract
Real-world clinical practice is inherently multimodal, relying on the synthesis of patient history with visual information such as medical imagery and clinical documents. Although large language models (LLMs) have shown promise in diagnostic dialogue, their evaluation has been largely restricted to text-only interactions, failing to capture the complexity of modern remote care delivery. Here we introduce a multimodal extension of the Articulate Medical Intelligence Explorer (multimodal AMIE), capable of gathering, interpreting and reasoning about multimodal data within a diagnostic conversation. To achieve this, we developed a state-aware dialogue framework that dynamically guides history-taking based on diagnostic uncertainty and evolving patient states, emulating the structured reasoning of experienced clinicians. We evaluated this updated, state-aware version of multimodal AMIE against primary care physicians (PCPs) in a randomized, blinded exploratory study comprising 105 simulated telehealth consultations, which included dermatology photographs, electrocardiograms and clinical documents. As assessed by 18 specialist physicians, multimodal AMIE outperformed PCPs not only in diagnostic accuracy but also in conversation quality, including history-taking and empathy. Specifically, multimodal AMIE demonstrated superior performance on 29 of 32 evaluation axes, including seven of nine metrics that assess multimodal reasoning. These results validate the efficacy of state-aware reasoning in bridging the gap between text and visual information and demonstrate the potential for artificial intelligence (AI) systems to augment clinicians in complex, multimodal diagnostic settings.

Improvements in the Articulate Medical Intelligence Explorer, a large language model designed for diagnostic dialogue, enable the model to request, interpret and reason about multimodal medical data.
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,687 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,591 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,114 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,867 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Authors
- Khaled Saab
- Jan Freyberg
- Chunjong Park
- Tim Strother
- Yong Cheng
- Wei‐Hung Weng
- David G. T. Barrett
- David Stutz
- Nenad Tomašev
- Anil Palepu
- Valentin Liévin
- Yash Sharma
- Roma Ruparel
- Abdullah Ahmed Ali Ahmed
- Elahe Vedadi
- Kimberly Kanada
- Cían Hughes
- Yun Liu
- Geoff Brown
- Yang Gao
- Xiang Li
- S. Sara Mahdavi
- James Manyika
- Katherine Chou
- Yossi Matias
- Avinatan Hassidim
- Dale R. Webster
- Pushmeet Kohli
- S. M. Ali Eslami
- Joëlle Barral
- Adam Rodman
- Vivek Natarajan
- Mike Schaekermann
- Tao Tu
- Alan Karthikesalingam
- Ryutaro Tanno