The Transformer … “Explained”?
An intuitive explanation of the Transformer, motivated through the lens of CNNs, RNNs, etc.
transformers natural-language-processing tutorial article
Details

  • I’m going to take a “historical” route where I go through some other, mostly older architectural patterns first, to put it in context; hopefully it’ll be useful to people who are new to this stuff, while also not too tiresome to those who aren’t.
  • The closest thing to an intuitive explainer that I know of is “The Illustrated Transformer,” but IMO it’s too light on intuition and too heavy on near-pseudocode (including stuff like “now you divide by 8,” as the third of six enumerated “steps” which themselves only cover part of the whole computation!).
  • This is a shame, because once you hack through all the surrounding weeds, the basic idea of the Transformer is really simple. This post is my attempt at an explainer.


Don't forget to tag @nostalgebraist in your comment.

Similar projects
Tips for Successfully Training Transformers on Small Datasets
It turns out that you can easily train transformers on small datasets when you use tricks (and have the patience to train for a very long time).
PyTorch Transformers Tutorials
A set of annotated Jupyter notebooks that give users a template for fine-tuning transformer models on downstream NLP tasks such as classification, NER, etc.
IntelliCode Compose: Code Generation Using Transformer
Code completion tool which is capable of predicting sequences of code tokens of arbitrary types, generating up to entire lines of syntactically correct ...
Multi-task Training with Hugging Face Transformers and NLP
A recipe for multi-task training with Transformers' Trainer and NLP datasets.