Data Augmentation


Data augmentation is a strategy that enables practitioners to significantly increase the diversity of data available for training models, without actually collecting new data. Data augmentation techniques such as cropping, padding, and horizontal flipping are commonly used to train large neural networks.
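As a rough illustration of the three techniques just named, here is a minimal NumPy sketch. The function names (`horizontal_flip`, `pad`, `random_crop`, `augment`), the 32 x 32 image size, and the padding width of 4 are assumptions chosen for the example and are not taken from any of the resources listed on this page.

```python
import numpy as np

def horizontal_flip(image):
    """Mirror an H x W x C image left to right."""
    return image[:, ::-1, :]

def pad(image, pad_width):
    """Zero-pad height and width by pad_width pixels on every side."""
    return np.pad(
        image,
        ((pad_width, pad_width), (pad_width, pad_width), (0, 0)),
        mode="constant",
    )

def random_crop(image, crop_height, crop_width, rng):
    """Cut a randomly placed crop_height x crop_width window out of the image."""
    height, width = image.shape[:2]
    top = rng.integers(0, height - crop_height + 1)
    left = rng.integers(0, width - crop_width + 1)
    return image[top:top + crop_height, left:left + crop_width, :]

def augment(image, rng, pad_width=4):
    """Pad, crop back to the original size, then flip with probability 0.5."""
    height, width = image.shape[:2]
    out = random_crop(pad(image, pad_width), height, width, rng)
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    return out

# Hypothetical input: a random 32 x 32 RGB image standing in for real data.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
augmented = augment(image, rng)
```

Each call to `augment` returns a slightly different view of the same image, which is how a fixed dataset can yield more varied training examples without new data collection.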

Overview

Data Augmentation | How to use Deep Learning With Limited Data
This article is a comprehensive review of Data Augmentation techniques for Deep Learning, specific to images.
data-augmentation pretraining article tutorial
A Visual Survey of Data Augmentation in NLP
An extensive overview of text data augmentation techniques for Natural Language Processing.
natural-language-processing data-augmentation tutorial article

Tutorials

Automating the Art of Data Augmentation
Learning to Compose Domain-Specific Transformations for Data Augmentation
data-augmentation tanda tutorial article
Data augmentation recipes in tf.keras image-based models
Learn about different ways of doing data augmentation when training an image classifier in tf.keras.
image-classification deep-learning data-augmentation computer-vision
Automating Data Augmentation: Practice, Theory and New Direction
A new framework for exploiting data augmentation to patch a flawed model and improve performance on crucial subpopulations of data.
data-augmentation tutorial article

Libraries

Natural Language Processing (NLP)
TextAttack
A Python framework for building adversarial attacks on NLP models.
data-augmentation natural-language-processing adversarial-attacks adversarial-text
TextAugment
Improving Short Text Classification through Global Augmentation Methods
data-augmentation natural-language-processing library code
Niacin
A Python library for replacing the missing variation in your text data.
data-augmentation natural-language-processing enrichment negative-sampling
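For readers new to text augmentation, the sketch below shows two simple token-level transformations, random swap and random deletion, in plain Python. It is not the API of TextAttack, TextAugment, or Niacin; the function names, the example sentence, and the parameter values are assumptions made for illustration only.

```python
import random

def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps random pairs of positions exchanged."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def random_deletion(tokens, p, rng):
    """Drop each token independently with probability p, keeping at least one."""
    if not tokens:
        return []
    kept = [token for token in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]

# Hypothetical input sentence, split on whitespace.
rng = random.Random(0)
sentence = "data augmentation increases the diversity of training data".split()
swapped = random_swap(sentence, 2, rng)
shortened = random_deletion(sentence, 0.2, rng)
```

Transformations like these change word order or length and can change meaning, so in practice they are applied sparingly and only where the label is expected to survive the perturbation.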
Computer Vision (CV)
Augmentor
Image augmentation library in Python for machine learning.
data-augmentation computer-vision library code
Albumentations
Fast image augmentation library and easy-to-use wrapper around other libraries.
data-augmentation computer-vision demo notebook
SOLT: Data Augmentation for Deep Learning
Data augmentation library for Deep Learning, which supports images, segmentation masks, labels and key points.
data-augmentation image-segmentation deep-learning pytorch
TF Sprinkles
Fast and efficient sprinkles augmentation implemented in TensorFlow.
data-augmentation computer-vision tensorflow library
Kornia: Differentiable Computer Vision Library for PyTorch
Set of routines and differentiable modules to solve generic computer vision problems.
computer-vision pytorch data-augmentation edge-detection
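To give a sense of the masking-style augmentation that TF Sprinkles above is named for, here is a minimal NumPy sketch that zeroes out several small square patches at random positions. This reflects an assumption about what "sprinkles" means (many small cutout regions); it is not the TF Sprinkles API, and the function name `sprinkles` and its parameters are hypothetical.

```python
import numpy as np

def sprinkles(image, num_patches, patch_size, rng):
    """Return a copy of image with num_patches random square patches set to 0."""
    out = image.copy()
    height, width = out.shape[:2]
    for _ in range(num_patches):
        # Choose the top-left corner so the patch stays inside the image.
        top = rng.integers(0, max(height - patch_size, 0) + 1)
        left = rng.integers(0, max(width - patch_size, 0) + 1)
        out[top:top + patch_size, left:left + patch_size] = 0
    return out

# Hypothetical input: an all-white 64 x 64 RGB image.
rng = np.random.default_rng(0)
image = np.full((64, 64, 3), 255, dtype=np.uint8)
masked = sprinkles(image, num_patches=10, patch_size=4, rng=rng)
```

The input array is left unchanged, and patches may overlap, so the number of masked pixels is at most `num_patches * patch_size ** 2` per channel.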
Other
DeltaPy
Tabular Data Augmentation & Feature Engineering.
data-augmentation tabular-data tabular table
Audiomentations
A Python library for audio data augmentation. Inspired by albumentations.
data-augmentation audio library code
Snorkel
A system for quickly generating training data with weak supervision.
weak-supervision rules data-augmentation snorkel
Image Augmentations for GAN Training
We systematically study the effectiveness of various existing augmentation techniques for GAN training in a variety of settings.
data-augmentation image-augmentation generative-adversarial-networks research
Automatic Data Augmentation for Generalization in Deep RL
We compare three approaches for automatically finding an appropriate augmentation combined with two novel regularization terms for the policy and value ...
data-augmentation reinforcement-learning kornia pytorch
Training Generative Adversarial Networks with Limited Data
An adaptive discriminator augmentation mechanism that significantly stabilizes training in limited data regimes.
generative-adversarial-networks data-augmentation limited-data research