An attention mechanism weighs, for each input element, the relevance of every input element and draws information from them in proportion to those weights when producing the output.
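The definition above can be made concrete with a minimal sketch of dot-product self-attention in plain Python. It assumes the simplest possible setting: each input vector serves as its own query, key, and value, with no learned projection matrices, whereas practical models such as the Transformer learn separate projections for each role. The names `softmax`, `self_attention`, and `tokens` are illustrative and not taken from any of the resources listed here.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(inputs):
    """Dot-product self-attention without learned projections (an assumption).

    inputs: list of n vectors (lists of floats), all of length d.
    Returns (outputs, weights): n output vectors and the n x n weight matrix.
    """
    d = len(inputs[0])
    outputs, weights = [], []
    for query in inputs:
        # Relevance of every input to the current one, scaled by sqrt(d).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in inputs
        ]
        row = softmax(scores)
        # The output is the weighted average of all input vectors.
        out = [sum(w * vec[j] for w, vec in zip(row, inputs)) for j in range(d)]
        weights.append(row)
        outputs.append(out)
    return outputs, weights


# Three toy 2-dimensional inputs (hypothetical data).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Each row of `weights` shows how strongly one input attends to every input, and the matching row of `outputs` is the resulting blend. With these toy vectors, the first input puts equal weight on itself and the third input, and less on the second, because its dot product with the second input is zero.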


Illustrated: Self-Attention
Step-by-step guide to self-attention with illustrations and code.
self-attention attention pytorch transformers
Attention? Attention!
This post looks at how attention was invented and surveys various attention mechanisms and models, such as the Transformer and SNAIL.
attention self-attention pointer-network recurrent-neural-networks


Attention Mechanism
Main concepts behind Attention, including an implementation of a sequence-to-sequence Attention model, followed by the application of Attention in ...
attention self-attention article transformers
All about attention in neural networks. Soft attention, attention maps, local and global attention and multi-head attention.
attention tensorflow attention-maps multi-head-attention


Tool for visualizing attention in the Transformer model (BERT, GPT-2, Albert, XLNet, RoBERTa, CTRL, etc.)
interpretability visualization bert attention
