OpenAlex · Updated hourly · Last updated: 17.05.2026, 12:44

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Two Sources of Conviction: A Two-Probe Empirical Study of Bias Mechanisms in Large Language Models

2026 · 0 citations · Zenodo (CERN European Organization for Nuclear Research) · Open Access
Open full text at the publisher

Citations: 0 · Authors: 1 · Year: 2026

Abstract

We present a two-probe empirical battery designed to isolate and distinguish conviction biases in large language models (LLMs). Probe 1 — a three-stage geometric construction sequence culminating in a seeded mathematical error — tests source-trust bias: the tendency of context-rich AI systems to retrofit coherence onto implausible inputs from trusted interlocutors. Probe 2 — the three-character command 0>1, asserted as the shortest command to create a file named "1" containing the character "0" — tests documentation bias: the tendency to privilege training-derived authoritative sources over direct empirical evidence. Six frontier AI systems were evaluated across both probes: ChatGPT (OpenAI), Gemini (Google), Claude (Anthropic), Grok (xAI), Mistral, and Meta AI. Results demonstrate two structurally distinct failure modes operating in opposite directions. Source-trust bias caused systems to accept a mathematically implausible value (13, versus a domain ceiling of π ≈ 3.14) without question. Documentation bias caused one system to reject a correct empirical result across six turns, multiple screenshots, and live PowerShell output, until raw hexadecimal byte data was provided. A third pattern — sycophantic reversal — was observed when systems initially gave correct answers before abandoning them under user pressure without evidence. We propose a unified taxonomy of conviction biases in LLMs and argue that the two probes together constitute a minimal reproducible test battery for adaptive reasoning under epistemic pressure.
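
As context for Probe 2: what 0>1 actually does depends entirely on which shell interprets it, which is why direct observation, rather than recalled documentation, is decisive. Below is a minimal reproduction sketch, assuming Python 3.8+ and whatever default shell the host provides; the script and its names are illustrative and not taken from the paper.

    import os
    import subprocess
    import tempfile

    # Run the three-character command from Probe 2 in a scratch directory
    # and inspect the raw bytes of any file it creates. Under a POSIX sh,
    # 0>1 redirects file descriptor 0 to a file named "1", creating it
    # empty; other shells may parse the same three characters differently.
    with tempfile.TemporaryDirectory() as workdir:
        subprocess.run("0>1", shell=True, cwd=workdir)
        path = os.path.join(workdir, "1")
        if os.path.exists(path):
            with open(path, "rb") as fh:
                data = fh.read()
            # Raw hexadecimal bytes are the form of direct empirical
            # evidence that, per the abstract, finally overrode the
            # documentation-biased system.
            print(f"file '1' created, {len(data)} byte(s): {data.hex(' ')}")
        else:
            print("file '1' was not created by this shell")

Whether the resulting file is empty or contains the character "0" is precisely the shell-dependent fact the probe forces a system to verify rather than recall.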

Topics

Topic Modeling · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)