Interpretability


Methods and techniques for applying artificial intelligence (AI) so that the results of a solution can be understood by human experts.

Overview

Interpretable Machine Learning
Extracting human understandable insights from any Machine Learning model.
interpretability permutation-importance partial-dependence-plots shap-values
Explainable Deep Learning: A Field Guide for the Uninitiated
A field guide to deep learning explainability for those uninitiated in the field.
interpretability explainability deep-learning survey
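
Permutation importance, one of the techniques tagged in the overview above, can be sketched in a few lines. The example below is only an illustration under stated assumptions: the toy model, the synthetic data, and the helper name permutation_importance are all hypothetical and are not taken from any of the listed resources. It shuffles one feature column at a time and reports how much the mean squared error grows.

```python
import random

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in squared error when each feature column is shuffled."""
    rng = random.Random(seed)

    def mse(rows):
        return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    baseline = mse(X)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other column untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(shuffled) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical toy model: depends on feature 0 and ignores feature 1.
def toy_model(row):
    return 3.0 * row[0]

data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(50)]
y = [toy_model(row) for row in X]
scores = permutation_importance(toy_model, X, y)
```

Because the toy model ignores feature 1, shuffling that column leaves the error unchanged, while shuffling feature 0 increases it. A real workflow would score a fitted model on held-out data.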

Tutorials

Interpretable Machine Learning for Computer Vision
Recent progress on visualization, interpretation, and explanation methodologies for analyzing both the data and the models in computer vision.
computer-vision interpretability cvpr-2020 article
Integrated Gradients for Interpretability
Integrated Gradients is a technique for attributing a classification model's prediction to its input features.
interpretability gradients convolutional-neural-networks deep-learning
Interpretable Machine Learning
A guide for making black box models explainable.
lime shapley shap interpretability
Interpretability and Analysis of Models for NLP
An in-depth look at interpretability and analysis of models for NLP (ACL 2020).
interpretability natural-language-processing acl-2020 article
WT5?! Training Text-to-Text Models to Explain their Predictions
We leverage the text-to-text framework proposed by Raffel et al. (2019) to train language models to output a natural text explanation alongside their ...
t5 transformers interpretability natural-language-processing
Neural Additive Models: Interpretable ML with Neural Nets
Neural Additive Models (NAMs) combine some of the expressivity of DNNs with the inherent intelligibility of generalized additive models.
neural-additive-models interpretability feed-forward-neural-networks additive-models
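
As a rough illustration of the Integrated Gradients idea listed above (attributing a classifier's prediction to its input features), here is a minimal sketch. It assumes a hypothetical logistic-regression "model" with fixed weights and an all-zeros baseline, and it approximates the path integral with a midpoint Riemann sum. None of these names or values come from the tutorial itself.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy classifier: logistic regression with fixed weights.
WEIGHTS = [2.0, -1.0, 0.5]

def model(x):
    return sigmoid(sum(w * v for w, v in zip(WEIGHTS, x)))

def gradient(x):
    # Analytic gradient of sigmoid(w . x) with respect to x.
    p = model(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]

def integrated_gradients(x, baseline, steps=200):
    """Approximate IG_i = (x_i - b_i) * integral of dF/dx_i along the straight path."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        point = [b + alpha * (v - b) for v, b in zip(x, baseline)]
        total = [t + g for t, g in zip(total, gradient(point))]
    return [(v - b) * t / steps for v, b, t in zip(x, baseline, total)]

x = [1.0, 2.0, -1.0]
baseline = [0.0, 0.0, 0.0]  # assumed reference input
attributions = integrated_gradients(x, baseline)
```

A useful sanity check is the completeness property: the attributions should sum to approximately model(x) - model(baseline). For deep networks the gradient would come from automatic differentiation rather than a hand-written formula.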

Libraries

General
Lime: Local Interpretable Model-Agnostic Explanations
Explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction.
interpretability lime code paper
SHAP: SHapley Additive exPlanations
A game theoretic approach to explain the output of any machine learning model.
interpretability shap explainability gradient-boosting
ELI5
A library for debugging/inspecting machine learning classifiers and explaining their predictions.
interpretability eli5 debugging inspection
Path Explain
A toolkit for explaining feature attributions and feature interactions in deep neural networks.
interpretability integrated-gradients expected-gradients path-explain
Tf-explain
Interpretability methods for tf.keras models with TensorFlow 2.0
interpretability tensorflow tensorflow-2-0 article
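
The game-theoretic idea behind Shapley-value explanations, mentioned in the SHAP entry above, can be shown with a brute-force computation on a tiny model. This is a sketch under stated assumptions, not the SHAP library's API: the function shapley_values, the toy model, and the choice to replace "absent" features with baseline values are all illustrative. Exact enumeration is exponential in the number of features, so it only works for very small inputs.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, x, baseline):
    """Exact Shapley values; absent features take their baseline value."""
    n = len(x)

    def v(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return value_fn(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (v(s | {i}) - v(s))
    return phi

# Hypothetical toy model: features 0 and 1 interact, feature 2 is unused.
def toy_model(f):
    return 2.0 * f[0] + f[0] * f[1] + 0.0 * f[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

Here phi comes out as approximately [3.0, 1.0, 0.0]: the interaction term is split evenly between features 0 and 1, the unused feature gets zero, and the values sum to toy_model(x) - toy_model(baseline).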
NLP Libraries
Language Interpretability Tool (LIT)
The Language Interpretability Tool (LIT) is a visual, interactive model-understanding tool for NLP models.
natural-language-processing interpretability library code
AllenNLP Interpret
A Framework for Explaining Predictions of NLP Models
interpretability explainability natural-language-processing api
BertViz
Tool for visualizing attention in Transformer models (BERT, GPT-2, ALBERT, XLNet, RoBERTa, CTRL, etc.)
interpretability visualization bert attention
CV Libraries
FlashTorch
Visualization toolkit for neural networks in PyTorch
interpretability computer-vision pytorch flashtorch
CNN Explainer
CNN Explainer uses TensorFlow.js, an in-browser GPU-accelerated deep learning library, to load the pretrained model for visualization.
convolutional-neural-networks tensorflow-js interactive interpretability
Other Libraries
Neural-Backed Decision Trees
Combine the interpretability of a decision tree with the accuracy of a neural network.
decision-trees neural-networks deep-learning interpretability
InterpretML
Fit interpretable machine learning models. Explain black-box machine learning models.
interpretability explainability lime shap
ExplainX
ExplainX is an explainable AI framework for data scientists to explain any black-box model behavior to business stakeholders.
interpretability explainx video code