Finetuning Transformers with JAX + Haiku
Walking through a port of the RoBERTa pre-trained model to JAX + Haiku, then fine-tuning the model to solve a downstream task.
jax haiku roberta transformers fine-tuning natural-language-processing pretraining tutorial article code notebook

* This post is code-oriented and will usually show code examples before providing commentary.
* We're going to work in a top-down fashion, so we'll lay out our Transformer model in broad strokes and then fill in the details.
* I'll introduce Haiku's features as they're needed for our Transformer fine-tuning project.

Machine Learning Architect at @IndicoDataSolutions
