Evaluating faithful interpretability
Guidelines we should keep in mind when working on faithful interpretations from Jacovi & Goldberg 2020 (talk’s slides, number 16–24) 1. Faithfulness is not Plausibility. A plausible but unfaithful interpretation is akin to lying, and can be dangerous. 2. A model decision process is not a human decision process. Humans cannot judge if an interpretation is faithful. Evaluating interpretation using human input is evaluating plausibility, not faithfulness. 3. Claims are just claims until tested. A model which is believed to be “inherently interpretable” should be rigorously tested in just the same way as post-hoc methods.
Interpretability via different methods
and many other techniques for interpretability in NLP!
Don't forget to tag @carolinlawrence in your comment, otherwise they may not be notified.