Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
What is Interpretability?
73
Zitationen
3
Autoren
2020
Jahr
Abstract
We argue that artificial networks are explainable and offer a novel theory of interpretability. Two sets of conceptual questions are prominent in theoretical engagements with artificial neural networks, especially in the context of medical artificial intelligence: (1) Are networks <i>explainable</i>, and if so, what does it mean to explain the output of a network? And (2) what does it mean for a network to be <i>interpretable</i>? We argue that accounts of "explanation" tailored specifically to neural networks have ineffectively reinvented the wheel. In response to (1), we show how four familiar accounts of explanation apply to neural networks as they would to any scientific phenomenon. We diagnose the confusion about explaining neural networks within the machine learning literature as an equivocation on "explainability," "understandability" and "interpretability." To remedy this, we distinguish between these notions, and answer (2) by offering a theory and typology of interpretation in machine learning. Interpretation is something one does to an explanation with the aim of producing another, more understandable, explanation. As with explanation, there are various concepts and methods involved in interpretation: <i>Total</i> or <i>Partial</i>, <i>Global</i> or <i>Local</i>, and <i>Approximative</i> or <i>Isomorphic</i>. Our account of "interpretability" is consistent with uses in the machine learning literature, in keeping with the philosophy of explanation and understanding, and pays special attention to medical artificial intelligence systems.
Ähnliche Arbeiten
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
2017 · 20.464 Zit.
Generative Adversarial Nets
2023 · 19.843 Zit.
Visualizing and Understanding Convolutional Networks
2014 · 15.259 Zit.
"Why Should I Trust You?"
2016 · 14.315 Zit.
On a Method to Measure Supervised Multiclass Model’s Interpretability: Application to Degradation Diagnosis (Short Paper)
2024 · 13.138 Zit.