The Illustrated Transformer
In this post, we will look at the Transformer, a model that uses attention to speed up training.
transformers positional-encoding encoder decoder
Using Different Decoding Methods for LM with Transformers
A look at different decoding methods for generating subsequent tokens in language modeling.
language-modeling decoder transformers huggingface
