Top Down Introduction to BERT with HuggingFace and PyTorch
I will also provide some intuition into how BERT works with a top-down approach (from applications to the algorithm).
Tips for Successfully Training Transformers on Small Datasets
It turns out that you can easily train transformers on small datasets when you use a few tricks (and have the patience to train for a very long time).
How to Steal Modern NLP Systems with Gibberish?
It’s possible to steal BERT-based models without any real training data, even using gibberish word sequences.
A Colab notebook showcasing how to fine-tune the T5 model on various NLP tasks (especially non-text-2-text tasks, using a text-2-text approach).
The Transformer Family
This post presents how the vanilla Transformer can be improved for longer-term attention span, less memory and computation consumption, RL task solving, ...
Transformers - Hugging Face
🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.
Finetuning Transformers with JAX + Haiku
Walking through a port of the RoBERTa pre-trained model to JAX + Haiku, then fine-tuning the model to solve a downstream task.
Finetune: Scikit-learn Style Model Finetuning for NLP
Finetune is a library that allows users to leverage state-of-the-art pretrained NLP models for a wide variety of downstream tasks.
IntelliCode Compose: Code Generation Using Transformer
Code completion tool which is capable of predicting sequences of code tokens of arbitrary types, generating up to entire lines of syntactically correct ...
Synthesizer: Rethinking Self-Attention in Transformer Models
The dot product self-attention is known to be central and indispensable to state-of-the-art Transformer models. But is it really required?