OpenAlex · Updated hourly · Last updated: 9 May 2026, 16:46

This is an overview page with metadata about this scholarly work. The full article is available from the publisher.

KS-Probe: Benchmarking Context Fidelity Dynamics in Frontier Language Models Across Length, Position, and Format

2026 · 0 citations · Zenodo (CERN European Organization for Nuclear Research) · Open Access

Citations: 0 · Authors: 2 · Year: 2026

Abstract

Large language models (LLMs) are increasingly deployed with extended context windows, yet their ability to reliably utilize information across long contexts remains poorly characterized. In particular, it is unclear how recall fidelity varies with context length, token position, and conversational depth.

We introduce KS-Probe (Kangaroo Shift Probe), a benchmarking framework designed to systematically evaluate information retention and recall in long-context LLMs. KS-Probe operates by injecting synthetic probe facts, defined as discrete and verifiable information units, into controlled filler contexts. Models are subsequently queried to assess recall accuracy. Performance is quantified using the Probe Recall Accuracy (PRA) metric, defined as the proportion of correctly retrieved probe facts under varying experimental conditions.

KS-Probe evaluates recall behavior across five dimensions:

- Context Fidelity: baseline recall as a function of context length
- Positional Recall Bias: dependence of recall on token position within context
- Multi-Turn Degradation: decay in recall across sequential interaction turns
- Silent Truncation: failure modes where context is dropped without explicit indication
- Tokenizer Divergence: variation in recall induced by tokenization differences across model families

We benchmark frontier models including Claude Sonnet 4.6, GPT-5.2, Grok 4.1, and DeepSeek V3.2, providing a comparative analysis of long-context reliability across architectures.
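
The abstract describes the probe-injection procedure and the PRA metric only in words. The following is a minimal Python sketch of how the two could fit together, not the authors' implementation. Every name in it (`ProbeFact`, `inject_probe`, `is_correct`, `probe_recall_accuracy`) is hypothetical, and the scoring rule (a case-insensitive substring match) is an assumption, since the abstract does not say how a retrieved fact is judged correct.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeFact:
    """A discrete, verifiable fact (hypothetical structure, not from the paper)."""
    statement: str  # sentence injected into the filler context
    question: str   # query posed to the model afterwards
    answer: str     # expected answer string


def inject_probe(filler: list[str], probe: ProbeFact, rel_pos: float) -> str:
    """Insert the probe statement among the filler sentences.

    rel_pos is a fraction in [0, 1]: 0.0 places the probe first, 1.0 last.
    """
    if not 0.0 <= rel_pos <= 1.0:
        raise ValueError("rel_pos must be in [0, 1]")
    idx = round(rel_pos * len(filler))
    return " ".join(filler[:idx] + [probe.statement] + filler[idx:])


def is_correct(response: str, probe: ProbeFact) -> bool:
    """Assumed scoring rule: case-insensitive substring match on the answer."""
    return probe.answer.lower() in response.lower()


def probe_recall_accuracy(results: list[tuple[ProbeFact, str]]) -> float:
    """PRA = number of correctly retrieved probe facts / total probe facts."""
    if not results:
        raise ValueError("need at least one probe result")
    hits = sum(is_correct(response, probe) for probe, response in results)
    return hits / len(results)


# Usage: place one probe in the middle of a short filler context, then score
# two made-up model responses (one correct, one not).
probe = ProbeFact("The vault code is 7391.", "What is the vault code?", "7391")
context = inject_probe(["Filler sentence."] * 4, probe, rel_pos=0.5)
pra = probe_recall_accuracy([(probe, "The code is 7391."), (probe, "I don't know.")])
```

Sweeping `rel_pos` and the number of filler sentences, and computing PRA separately for each setting, would correspond to the positional and length dimensions listed in the abstract. The multi-turn, truncation, and tokenizer dimensions would need additional machinery that this sketch does not attempt.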

Topics

Topic Modeling · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education