This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

AI decision-making performance in Maternal-Fetal Medicine compared with human specialists: a cross-sectional study

2025 · 0 citations · Open Access

Abstract

Background: Large Language Models (LLMs), such as ChatGPT-4 and Gemini, are increasingly used in clinical care; however, their reliability in Maternal–Fetal Medicine (MFM) remains uncertain.

Objective: To evaluate how closely AI case-management recommendations align with those of MFM specialists, focusing on accuracy, agreement, and clinical relevance.

Study Design & Setting: Cross-sectional study with online blinded evaluation, November–December 2024.

Sample & Methods: Twenty hypothetical MFM cases were developed. Responses were generated by ChatGPT-4, Gemini, and three MFM specialists, then rated by 22 blinded, board-certified MFM evaluators on a 10-point Likert scale. Agreement was assessed with Spearman’s rho (ρ) and Cohen’s kappa (κ); differences in accuracy were assessed with Wilcoxon signed-rank tests.

Outcomes: ChatGPT-4 showed moderate alignment (mean 6.6 ± 2.95; ρ = 0.408; κ = 0.232, p < 0.001) and performed well in routine cases. Gemini scored 7.0 ± 2.64 with negligible inter-rater reliability (κ = −0.024, p = 0.352). No significant difference was found between ChatGPT-4 and the clinicians (p = 0.18), whereas Gemini was less accurate (p < 0.01).

Conclusions: AI shows potential in routine MFM decision-making but remains limited in complex scenarios and should be used with caution.
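
For readers unfamiliar with the agreement statistic named in the abstract, Cohen's kappa compares the observed agreement between two raters with the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The following is a minimal sketch of that formula only. The abstract does not state how the 22 evaluators' ratings were paired or categorized, so the function name `cohens_kappa` and the ratings below are hypothetical, synthetic illustrations and not the study's data or analysis code.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items (hypothetical helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies, summed over labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Synthetic labels for 50 items (NOT data from the study).
ratings_a = ["agree"] * 25 + ["disagree"] * 25
ratings_b = ["agree"] * 20 + ["disagree"] * 5 + ["agree"] * 10 + ["disagree"] * 15

kappa = cohens_kappa(ratings_a, ratings_b)
print(round(kappa, 3))
```

In this synthetic example the raters agree on 35 of 50 items (p_o = 0.7) while chance agreement is p_e = 0.5, so κ = 0.4. Values near 0, such as the κ = −0.024 reported for Gemini, indicate agreement no better than chance.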

Institutions

Wolfson Medical Center (IL)

Topics

Artificial Intelligence in Healthcare and Education