OpenAlex · Updated hourly · Last updated: 30.03.2026, 08:27

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Applied Explainability for Large Language Models: A Comparative Study

2026 · 0 citations · Zenodo (CERN European Organization for Nuclear Research) · Open Access

0 citations · 1 author · Year: 2026

Abstract

Large Language Models (LLMs) achieve strong performance across natural language processing tasks, yet their internal decision processes remain difficult to interpret. This lack of transparency creates challenges in real-world deployments requiring trust, debugging, and accountability. This study presents a comparative analysis of three explainability techniques—Integrated Gradients, Attention Rollout, and SHAP—applied to a fine-tuned DistilBERT model on the SST-2 sentiment classification task. The methods are evaluated under a consistent experimental setup using qualitative criteria such as faithfulness, stability, and interpretability. The results show that gradient-based attribution methods provide the most stable and intuitive explanations, while attention-based approaches are computationally efficient but less aligned with prediction-relevant features. Model-agnostic methods offer flexibility but introduce computational overhead and variability. This work highlights practical trade-offs in explainability techniques and emphasizes the importance of evaluating them in realistic scenarios. The findings provide actionable insights for machine learning practitioners working with transformer-based NLP systems.
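The gradient-based attribution method the abstract singles out as most stable, Integrated Gradients, can be sketched on a toy differentiable classifier. This is not the paper's DistilBERT/SST-2 setup; the logistic model, its weights, and the midpoint-rule step count below are illustrative assumptions. The sketch also checks the method's completeness axiom: attributions sum to the difference between the model's output at the input and at the baseline.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(x, w):
    # Toy "sentiment" score: logistic regression over a feature vector x.
    return sigmoid(np.dot(w, x))

def grad(x, w):
    # Analytic gradient of the model output with respect to the inputs.
    s = model(x, w)
    return s * (1.0 - s) * w

def integrated_gradients(x, baseline, w, steps=200):
    # Midpoint Riemann-sum approximation of the path integral
    # (x - baseline) * ∫₀¹ ∂f/∂x (baseline + α(x - baseline)) dα.
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.array([grad(baseline + a * (x - baseline), w) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical weights and input, for illustration only.
w = np.array([2.0, -1.0, 0.5])
x = np.array([1.0, 1.0, 1.0])
baseline = np.zeros_like(x)

attr = integrated_gradients(x, baseline, w)
# Completeness axiom: sum of attributions ≈ f(x) - f(baseline).
gap = abs(attr.sum() - (model(x, w) - model(baseline, w)))
print(gap)
```

With a real transformer the gradient would come from backpropagation through the embedding layer (e.g. via a library such as Captum) rather than a closed form, but the path-integral approximation is the same.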


Topics

Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education · Multimodal Machine Learning Applications