This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
A Unified Evaluation and Governance Framework for Trustworthy LLM Agents
Citations: 0
Authors: 2
Year: 2026
Abstract
Large Language Model (LLM) agents are increasingly deployed as autonomous decision-support and workflow-automation components in enterprise, healthcare, finance, and public-sector environments. Despite advances in model capability, current LLM agents still exhibit persistent weaknesses, including hallucinated reasoning steps, unverifiable factual assertions, opaque decision traces, and inconsistent adherence to operational policies. Existing evaluation methodologies emphasize offline benchmarks and text-generation metrics, whereas governance mechanisms typically rely on post-hoc filtering. This separation creates a structural gap between agent capability and agent controllability, hindering safe deployment in mission-critical settings. This paper introduces a unified evaluation and governance framework that embeds multi-layer verification, retrieval-augmented grounding assessment, and policy-aligned decision controls directly within the LLM agent inference loop. We define four quantitative trust metrics: the Agent Reliability Score (ARS), RAG Grounding Confidence (RGC), Attribution Completeness Ratio (ACR), and Policy-Aligned Action Score (PAAS). Together they measure end-to-end correctness, evidential grounding, transparency, and policy compliance. We further propose a modular architecture consisting of a structured Reasoning Engine, a RAG Grounding Optimizer, a Multi-Layer Verification Module, a Governance and Audit Layer, and a Continuous Evaluation Loop that adaptively regulates autonomy. We evaluate the proposed framework through two real-world case studies: (i) enterprise workflow automation using Salesforce Agentforce, and (ii) causal evidence synthesis for global development using the World Bank DIME ImpactAI platform. Results show substantial gains over baseline agents, including up to an 88% reduction in hallucinations, a 20–30% improvement in grounding confidence, and near-perfect policy-alignment scores. These findings indicate that the proposed framework provides a practical and principled foundation for deploying trustworthy, auditable, and governable LLM agents in high-stakes operational environments.