This is an overview page with metadata about this scientific paper. The full article is available from the publisher.

Visual-language foundation models and artificial intelligence agents in healthcare: Bridging from technological innovation to clinical impact

2026 · 0 citations · Computational Visual Media · Open Access

Abstract

Recent advances in foundation models (FMs) have enabled artificial intelligence systems to acquire generalizable capabilities, with language and visual models demonstrating adaptability across diverse healthcare applications and multimodal vision-language FMs (VLFMs) supporting complex tasks such as report generation and question answering. Building on these developments, autonomous agents, particularly vision-language agents (VLAs), represent the next frontier, extending AI from perception and recognition to cognition, decision-making, and action. By integrating multimodal understanding with planning, interaction, and tool use, VLAs introduce autonomous intelligence capable of managing multi-step clinical workflows, adapting over time, and collaborating with clinicians. This survey provides a comprehensive perspective on VLAs in healthcare. We examine their current progress, analyze the major challenges hindering their integration, and highlight promising directions for future development. Unlike prior surveys that primarily focus on diverse FMs, our work prioritizes emerging VLFMs and incorporates agentic concepts, emphasizing their transformative potential beyond narrow research prototypes. Specifically, we (i) evaluate VLAs in terms of VLFM architectures and adaptation strategies and agentic design, (ii) assess their potential impact across diverse clinical workflows, and (iii) discuss pathways towards next-generation clinical decision support, considering technical and clinical challenges.

Topics

Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Electronic Health Records Systems
