This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Promptexp: Multi-Granularity Prompt Explanation of Large Language Models
Citations: 1
Authors: 5
Year: 2025
Abstract
Large Language Models (LLMs) excel at tasks such as natural language understanding and text generation. Prompt engineering plays a critical role in leveraging LLMs effectively, but their black-box nature hinders interpretability and, in turn, effective prompt engineering. A wide range of model explanation approaches have been developed for deep learning models (e.g., feature attribution-based and attention-based techniques). However, these local explanations are designed for single-output tasks such as classification and regression and cannot be directly applied to LLMs, which generate sequences of tokens. Recent efforts in LLM explanation focus on natural language explanations, but these are prone to hallucinations and inaccuracies. To address this, we introduce PromptExp, a framework for multi-granularity prompt explanation that aggregates token-level insights. PromptExp introduces two token-level explanation approaches: (1) an aggregation-based approach that combines local explanation techniques (e.g., Integrated Gradients), and (2) a perturbation-based approach with novel techniques to evaluate the impact of masking individual tokens. PromptExp supports both white-box and black-box explanation and extends explanations to higher granularity levels (e.g., sentences and components), enabling flexible analysis. We evaluate PromptExp in case studies such as sentiment analysis, showing that the perturbation-based approach performs best when semantic similarity is used to assess perturbation impact. Furthermore, we conducted a user study at our industrial partner's company, which confirms PromptExp's accuracy and practical value and demonstrates its potential to enhance LLM interpretability.
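The abstract only sketches the perturbation-based approach, so the following is a minimal illustrative sketch of the idea: mask one prompt token at a time, re-run the LLM, and score that token by the semantic-similarity drop between the original and perturbed outputs, then average token scores to reach sentence granularity. This is not the paper's exact implementation; the `generate` callable is a hypothetical stand-in for any LLM call, and the whitespace tokenization, `[MASK]` placeholder, and embedding model choice are illustrative assumptions.

```python
# Hedged sketch of perturbation-based token-level prompt explanation.
# Assumptions (not from the paper): whitespace tokenization, a "[MASK]"
# placeholder, and sentence-transformers embeddings for semantic similarity.
from typing import Callable, List
from sentence_transformers import SentenceTransformer, util

_embedder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative choice

def token_importance(prompt_tokens: List[str],
                     generate: Callable[[str], str],
                     mask: str = "[MASK]") -> List[float]:
    """Score each prompt token by how much masking it changes the output."""
    original_output = generate(" ".join(prompt_tokens))
    original_emb = _embedder.encode(original_output, convert_to_tensor=True)
    scores = []
    for i in range(len(prompt_tokens)):
        perturbed = prompt_tokens[:i] + [mask] + prompt_tokens[i + 1:]
        perturbed_output = generate(" ".join(perturbed))
        perturbed_emb = _embedder.encode(perturbed_output, convert_to_tensor=True)
        similarity = util.cos_sim(original_emb, perturbed_emb).item()
        scores.append(1.0 - similarity)  # larger similarity drop => more important
    return scores

def sentence_importance(sentences: List[List[str]],
                        generate: Callable[[str], str]) -> List[float]:
    """Lift token scores to sentence granularity by averaging within each sentence."""
    flat = [tok for sent in sentences for tok in sent]
    scores = token_importance(flat, generate)
    out, pos = [], 0
    for sent in sentences:
        out.append(sum(scores[pos:pos + len(sent)]) / len(sent))
        pos += len(sent)
    return out
```

The same averaging step extends to coarser granularities (e.g., prompt components), which is the aggregation idea the abstract describes.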
Related Works
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
2017 · 20,748 citations
Generative Adversarial Nets
2023 · 19,896 citations
Visualizing and Understanding Convolutional Networks
2014 · 15,325 citations
"Why Should I Trust You?"
2016 · 14.570 Zit.
On a Method to Measure Supervised Multiclass Model’s Interpretability: Application to Degradation Diagnosis (Short Paper)
2024 · 13,201 citations