Data Augmentation


Data augmentation is a strategy that enables practitioners to significantly increase the diversity of data available for training models, without actually collecting new data. Data augmentation techniques such as cropping, padding, and horizontal flipping are commonly used to train large neural networks.
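As a rough illustration of the three techniques named above, here is a minimal sketch in plain Python that treats an image as a list of pixel rows. The helper names (`horizontal_flip`, `pad`, `random_crop`, `augment`) are made up for this example and do not come from any of the libraries listed on this page. In practice these operations would run on arrays or tensors.

```python
import random

def horizontal_flip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]

def pad(image, amount, fill=0):
    """Surround the image with `amount` pixels of `fill` on every side."""
    width = len(image[0]) + 2 * amount
    top = [[fill] * width for _ in range(amount)]
    middle = [[fill] * amount + list(row) + [fill] * amount for row in image]
    bottom = [[fill] * width for _ in range(amount)]
    return top + middle + bottom

def random_crop(image, height, width, rng):
    """Cut a height x width window at a random position."""
    top = rng.randint(0, len(image) - height)
    left = rng.randint(0, len(image[0]) - width)
    return [row[left:left + width] for row in image[top:top + height]]

def augment(image, rng):
    """Pad by one pixel, crop back to the original size, flip half the time."""
    h, w = len(image), len(image[0])
    out = random_crop(pad(image, 1), h, w, rng)
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
rng = random.Random(0)  # seeded so the sketch is reproducible
variants = [augment(image, rng) for _ in range(4)]
```

Each call to `augment` returns a slightly shifted, possibly mirrored copy with the same shape as the input, which is how a single labelled image can yield many distinct training examples.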

Overview

Data Augmentation | How to use Deep Learning With Limited Data
This article is a comprehensive review of Data Augmentation techniques for Deep Learning, specific to images.
data-augmentation pretraining article tutorial
A Visual Survey of Data Augmentation in NLP
An extensive overview of text data augmentation techniques for Natural Language Processing.
natural-language-processing data-augmentation tutorial article

Tutorials

Automating the Art of Data Augmentation
Learning to Compose Domain-Specific Transformations for Data Augmentation
data-augmentation tanda tutorial article
Data augmentation recipes in tf.keras image-based models
Learn about different ways of doing data augmentation when training an image classifier in tf.keras.
image-classification deep-learning data-augmentation computer-vision
Automating Data Augmentation: Practice, Theory and New Direction
A new framework for exploiting data augmentation to patch a flawed model and improve performance on crucial subpopulations of data.
data-augmentation tutorial article
Image Augmentations for GAN Training
We systematically study the effectiveness of various existing augmentation techniques for GAN training in a variety of settings.
data-augmentation image-augmentation generative-adversarial-networks research
Automatic Data Augmentation for Generalization in Deep RL
We compare three approaches for automatically finding an appropriate augmentation combined with two novel regularization terms for the policy and value ...
data-augmentation reinforcement-learning kornia pytorch
Training Generative Adversarial Networks with Limited Data
An adaptive discriminator augmentation mechanism that significantly stabilizes training in limited data regimes.
generative-adversarial-networks data-augmentation limited-data research
Multi-target in Albumentations
Many images, many masks, bounding boxes, and key points. How to transform them in sync?
data-augmentation image-augmentation computer-vision albumentations
Test-Time Data Augmentation
Tutorial on how to properly implement test-time image data augmentation in a production environment with limited computational resources.
data-augmentation keras production tensorflow
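The idea behind test-time augmentation can be sketched without any framework: run the model on the original input and on a few transformed copies, then average the predictions. The snippet below is an assumption-laden toy, not the tutorial's implementation. `toy_model` is a hypothetical stand-in classifier that only looks at the left half of the image, and `predict_with_tta` is a name invented for this example.

```python
def horizontal_flip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]

def toy_model(image):
    """Hypothetical classifier returning [p_dark, p_bright].

    It is deliberately flawed: it estimates brightness from the
    left half of the image only (pixel values are assumed in 0..1).
    """
    half = len(image[0]) // 2
    pixels = [p for row in image for p in row[:half]]
    p_bright = sum(pixels) / len(pixels)
    return [1.0 - p_bright, p_bright]

def predict_with_tta(model, image, transforms):
    """Average class scores over the original image and each transformed copy."""
    views = [image] + [t(image) for t in transforms]
    scores = [model(view) for view in views]
    return [sum(column) / len(scores) for column in zip(*scores)]

image = [[0.0, 0.2, 0.8, 1.0],
         [0.0, 0.2, 0.8, 1.0]]

single = toy_model(image)                                    # sees only the dark half
averaged = predict_with_tta(toy_model, image, [horizontal_flip])
```

Here the single prediction is dominated by the dark left half, while the averaged prediction also accounts for the bright right half seen through the flipped view. The cost is one extra forward pass per transform, which is the trade-off to weigh when compute is limited.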

Libraries

General
Snorkel
A system for quickly generating training data with weak supervision.
weak-supervision rules data-augmentation snorkel
DeltaPy
Tabular Data Augmentation & Feature Engineering.
data-augmentation tabular-data tabular table
NLP Libraries
TextAttack
A Python framework for building adversarial attacks on NLP models.
data-augmentation natural-language-processing adversarial-attacks adversarial-text
TextAugment
Improving Short Text Classification through Global Augmentation Methods
data-augmentation natural-language-processing library code
Niacin
A Python library for replacing the missing variation in your text data.
data-augmentation natural-language-processing enrichment negative-sampling
CV Libraries
SOLT: Data Augmentation for Deep Learning
Data augmentation library for Deep Learning, which supports images, segmentation masks, labels and key points.
data-augmentation deep-learning pytorch computer-vision
Albumentations
Fast image augmentation library and easy-to-use wrapper around other libraries.
data-augmentation computer-vision demo notebook
Augmentor
Image augmentation library in Python for machine learning.
data-augmentation computer-vision library code
CLoDSA: A Tool for Augmentation in Computer Vision tasks
CLoDSA is an open-source image augmentation library for object classification, localization, detection, semantic segmentation and instance segmentation. It ...
data-augmentation object-classification object-detection computer-vision
TF Sprinkles
Fast and efficient sprinkles augmentation implemented in TensorFlow.
data-augmentation computer-vision tensorflow library
Other Libraries
Audiomentations
A Python library for audio data augmentation. Inspired by albumentations.
data-augmentation audio library code
Tsaug
A Python package for time series augmentation.
time-series data-augmentation tsaug code