This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Impact of authoritative and subjective cues on large language model reliability for clinical inquiries: an experimental study

2026 · 0 citations · Scientific Reports · Open Access

Abstract

This study aimed to determine how subjective or authoritative misinformation embedded in user prompts affects large language model (LLM) accuracy on a clinical question with a known gold-standard answer (the treatment line of aripiprazole). Five leading LLMs answered the clinical question under three prompt conditions: (1) neutral, (2) an incorrect “self-recalled” memory, and (3) an incorrect statement attributed to an authority. Each model–scenario pair was repeated ten times (250 total responses). Accuracy differences were tested with χ² tests, with Cramér’s V as the effect size, and score shifts were analyzed with van Elteren tests. All models were correct under the neutral prompt (100% accuracy). Accuracy dropped to 45% with self-recall prompts and to 1% with authoritative prompts, indicating a strong prompt–accuracy association (Cramér’s V = 0.75, P < 0.001). Efficacy and tolerability ratings fell in parallel, yet the models’ self-rated confidence under authoritative prompting stayed high and was statistically indistinguishable from baseline. LLMs are highly susceptible to misleading cues, especially those invoking authority, while remaining overconfident. These findings call for stronger validation standards, user education, and design safeguards before LLMs are deployed in healthcare.

Topics

Artificial Intelligence in Healthcare and Education · Electronic Health Records Systems · Patient-Provider Communication in Healthcare
