This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Two Sources of Conviction: A Two-Probe Empirical Study of Bias Mechanisms in Large Language Models
Citations: 0
Authors: 1
Year: 2026
Abstract
We present a two-probe empirical battery designed to isolate and distinguish conviction biases in large language models (LLMs). Probe 1 — a three-stage geometric construction sequence culminating in a seeded mathematical error — tests source-trust bias: the tendency of context-rich AI systems to retrofit coherence onto implausible inputs from trusted interlocutors. Probe 2 — the three-character command 0>1, asserted as the shortest command to create a file named "1" containing the character "0" — tests documentation bias: the tendency to privilege training-derived authoritative sources over direct empirical evidence. Six frontier AI systems were evaluated across both probes: ChatGPT (OpenAI), Gemini (Google), Claude (Anthropic), Grok (xAI), Mistral, and Meta AI. Results demonstrate two structurally distinct failure modes operating in opposite directions. Source-trust bias caused systems to accept a mathematically implausible value (13, versus a domain ceiling of π ≈ 3.14) without question. Documentation bias caused one system to reject a correct empirical result across six turns, multiple screenshots, and live PowerShell output, until raw hexadecimal byte data was provided. A third pattern — sycophantic reversal — was observed when systems initially gave correct answers before abandoning them under user pressure without evidence. We propose a unified taxonomy of conviction biases in LLMs and argue that the two probes together constitute a minimal reproducible test battery for adaptive reasoning under epistemic pressure.
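The documentation-bias probe turns on a point that is checkable by direct inspection: whether the file named "1" actually contains the character "0". The sketch below is not from the paper; it illustrates the kind of byte-level evidence (a raw hex dump) that the abstract says finally resolved the dispute. For portability it writes the file directly in Python rather than invoking PowerShell, where the actual `0>1` redirection behavior (including encoding and trailing newline) is shell-specific.

```python
# Minimal illustration (assumption: simulated here, not via PowerShell) of the
# empirical check described for Probe 2 -- trust the bytes on disk over
# documentation-derived expectations.
from pathlib import Path

path = Path("1")
path.write_text("0")  # stand-in for the disputed shell command `0>1`

raw = path.read_bytes()
print(raw.hex())  # prints "30": the ASCII code for "0", the ground truth
assert raw == b"0"
```

A hex dump like this is decisive precisely because it bypasses both the model's prior (what redirection "should" do per its training data) and any rendering layer that could mislead a screenshot-based argument.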