OpenAlex · Updated hourly · Last updated: 16.05.2026, 18:23

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Advancing conversational diagnostic AI with multimodal reasoning

2025 · 1 citation · Nature Medicine · Open Access

Citations: 1
Authors: 36
Year: 2025

Abstract

Real-world clinical practice is inherently multimodal, relying on the synthesis of patient history with visual information such as medical imagery and clinical documents. Although large language models (LLMs) have shown promise in diagnostic dialogue, their evaluation has been largely restricted to text-only interactions, failing to capture the complexity of modern remote care delivery. Here we introduce a multimodal extension of the Articulate Medical Intelligence Explorer (multimodal AMIE), capable of gathering, interpreting and reasoning about multimodal data within a diagnostic conversation. To achieve this, we developed a state-aware dialogue framework that dynamically guides history-taking based on diagnostic uncertainty and evolving patient states, emulating the structured reasoning of experienced clinicians. We evaluated this updated, state-aware version of multimodal AMIE against primary care physicians (PCPs) in a randomized, blinded exploratory study comprising 105 simulated telehealth consultations, which included dermatology photographs, electrocardiograms and clinical documents. As assessed by 18 specialist physicians, multimodal AMIE outperformed PCPs not only in diagnostic accuracy but also in conversation quality, including history-taking and empathy. Specifically, multimodal AMIE demonstrated superior performance on 29 of 32 evaluation axes, including seven of nine metrics that assess multimodal reasoning. These results validate the efficacy of state-aware reasoning in bridging the gap between text and visual information and demonstrate the potential for artificial intelligence (AI) systems to augment clinicians in complex, multimodal diagnostic settings.

Improvements in the Articulate Medical Intelligence Explorer, a large language model designed for diagnostic dialogue, enable the model to request, interpret and reason about multimodal medical data.
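The abstract describes a dialogue controller that chooses its next step (keep taking history, request an image or document, or conclude) based on diagnostic uncertainty over an evolving differential. The paper's actual framework is not reproduced here; the following is a minimal illustrative sketch of that control idea, where the function names, action labels, and entropy thresholds are all assumptions, not AMIE's implementation.

```python
import math

def entropy(differential):
    """Shannon entropy (bits) of the current differential diagnosis.

    differential: dict mapping candidate diagnosis -> probability.
    """
    return -sum(p * math.log2(p) for p in differential.values() if p > 0)

def next_action(differential, asked_multimodal, high=1.5, low=0.5):
    """Hypothetical state-aware step selection driven by uncertainty.

    Thresholds `high` and `low` are illustrative, not from the paper.
    """
    h = entropy(differential)
    if h > high:
        # Broad uncertainty: continue structured history-taking.
        return "ask_history_question"
    if h > low and not asked_multimodal:
        # Moderate uncertainty: request an artifact such as a
        # dermatology photograph or an electrocardiogram.
        return "request_multimodal_artifact"
    # Uncertainty sufficiently resolved: present the differential.
    return "conclude_with_differential"

# Usage: a fairly even three-way differential keeps the dialogue
# in history-taking; a dominant hypothesis lets it conclude.
print(next_action({"eczema": 0.4, "psoriasis": 0.35, "tinea": 0.25},
                  asked_multimodal=False))  # ask_history_question
print(next_action({"eczema": 0.9, "psoriasis": 0.1},
                  asked_multimodal=False))  # conclude_with_differential
```

The point of the sketch is only the control structure: the dialogue state (here, a probability-weighted differential) is updated turn by turn, and the uncertainty it encodes, rather than a fixed script, decides what the system asks for next.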
