A Survey of Long-Term Context in Transformers
Over the past two years the NLP community has developed a veritable zoo of methods to combat expensive multi-head self-attention.
Tags: transformers, multi-head-attention, attention, natural-language-processing
Talking-Heads Attention
A variation on multi-head attention which includes linear projections across the attention-heads dimension, immediately before and after the softmax ...
Tags: multi-head-attention, talking-heads-attention, attention, transformers
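The description above summarizes the mechanism: mixing matrices applied across the heads axis of the attention logits before the softmax, and of the attention weights after it. A minimal NumPy sketch of that idea (the function name, shapes, and mixing-matrix names are illustrative assumptions, not taken from the linked project):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def talking_heads_attention(q, k, v, P_logits, P_weights):
    """q, k, v: [heads, seq, dim]; P_logits, P_weights: [heads, heads]."""
    # Standard scaled dot-product attention logits, per head.
    logits = np.einsum('hqd,hkd->hqk', q, k) / np.sqrt(q.shape[-1])
    # Linear projection across the heads dimension, before the softmax.
    logits = np.einsum('hqk,hj->jqk', logits, P_logits)
    weights = softmax(logits, axis=-1)
    # A second projection across heads, after the softmax.
    weights = np.einsum('hqk,hj->jqk', weights, P_weights)
    return np.einsum('hqk,hkd->hqd', weights, v)
```

With identity mixing matrices this reduces exactly to ordinary multi-head attention; the learned off-diagonal entries are what let information flow between heads.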
All about attention in neural networks: soft attention, attention maps, local and global attention, and multi-head attention.
Tags: attention, tensorflow, attention-maps, multi-head-attention