This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Clinical feasibility of AI Doctors: Evaluating the replacement potential of large language models in outpatient settings for central nervous system tumors
Citations: 11
Authors: 6
Year: 2025
Abstract
BACKGROUND AND OBJECTIVES: The treatment of central nervous system (CNS) tumors is complex and resource-intensive, with higher mortality in underserved regions. Large language models (LLMs) show promise in medical support, but their real-world performance in CNS tumor outpatient care remains unclear. This study aims to assess the diagnostic and treatment capabilities of LLMs in bilingual clinical settings.

METHODS: This retrospective study evaluated three LLMs (ChatGPT-4o, DeepSeek-R1, and Doubao) in assisting neuro-oncology outpatient decision-making within bilingual (Chinese/English) clinical environments. A total of 338 outpatient cases were included, with each model assigned three clinical tasks: differential diagnosis, main diagnosis, and treatment advice. Model outputs were compared against assessments by experienced neurosurgeons. Statistical analysis employed McNemar tests (P < 0.05).

RESULTS: ChatGPT-4o and DeepSeek-R1 achieved over 90% accuracy in differential diagnosis, showing no significant difference from the doctors (P > 0.05), while Doubao performed significantly worse (Chinese: P = 0.02, English: P = 0.01). In main diagnosis, both ChatGPT-4o and DeepSeek-R1 showed no significant deviation from the doctors' performance (P > 0.05), whereas Doubao underperformed (Chinese: P = 0.019, English: P = 0.011). For treatment recommendations, all models showed reduced accuracy (ChatGPT-4o: 80.5%; DeepSeek-R1: 79%; Doubao: 71.3%), significantly lower than the doctors (both Chinese and English: P < 0.05). No performance difference was observed between Chinese and English cases.

CONCLUSION: LLMs show strong potential for preliminary diagnosis and decision support in CNS tumors, and their cross-lingual adaptability underscores their clinical feasibility.
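The comparison method named in the abstract, the McNemar test, applies to paired binary outcomes (e.g. model correct/incorrect vs. doctor correct/incorrect on the same cases). A minimal sketch of the chi-square version of the test is below; the counts `b` and `c` (the two kinds of discordant pairs) are hypothetical, not taken from the study, and the actual analysis may have used an exact or continuity-corrected variant.

```python
import math

def mcnemar_test(b: int, c: int) -> tuple[float, float]:
    """McNemar chi-square test (no continuity correction).

    b: cases where method A was correct and method B was wrong.
    c: cases where method B was correct and method A was wrong.
    Returns (chi-square statistic, two-sided p-value with 1 df).
    """
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs: no evidence of a difference
    stat = (b - c) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree
    # of freedom: P(X >= stat) = erfc(sqrt(stat / 2)).
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p

# Hypothetical example: 10 cases model-right/doctor-wrong,
# 3 cases doctor-right/model-wrong.
stat, p = mcnemar_test(10, 3)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```

Only the discordant pairs enter the statistic; cases where both raters agree carry no information about which rater is better, which is why the test suits paired designs like this study's model-vs.-doctor comparison.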
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,693 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,598 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,124 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,871 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations