Cut, Paste and Learn: Surprisingly Easy Synthesis for Detection
Generate synthetic scenes and bounding box annotations for object detection.
computer-vision data-augmentation object-detection instance-segmentation segmentation code paper arxiv:1708.01642 research

A major impediment in rapidly deploying object detection models for instance detection is the lack of large annotated datasets. For example, finding a large labeled dataset containing instances in a particular kitchen is unlikely. Each new environment with new instances requires expensive data collection and annotation. In this paper, we propose a simple approach to generate large annotated instance datasets with minimal effort. Our key insight is that ensuring only patch-level realism provides enough training signal for current object detector models. We automatically 'cut' object instances and 'paste' them on random backgrounds. A naive way of doing this results in pixel artifacts that cause trained models to perform poorly. We show how to make detectors ignore these artifacts during training and generate data that gives competitive performance on real data. Our method outperforms existing synthesis approaches and, when combined with real images, improves relative performance by more than 21% on benchmark datasets. In a cross-domain setting, our synthetic data combined with just 10% real data outperforms models trained on all real data.
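
To illustrate the core cut-and-paste idea, here is a minimal sketch (not the authors' implementation) of compositing an object cutout onto a random background while recording a bounding box label. The function name `paste_instance`, the file paths, and the specific blur radius are assumptions for illustration; the paper's pipeline uses several blending strategies to suppress boundary artifacts, and Gaussian smoothing of the mask edge shown here is just one simple variant.

```python
import numpy as np
from PIL import Image, ImageFilter

def paste_instance(background, instance_rgba, top_left, blur_radius=2):
    """Paste an RGBA object cutout onto a background and return the composited
    image plus the pasted instance's bounding box (x1, y1, x2, y2)."""
    bg = background.convert("RGB").copy()
    obj = instance_rgba.convert("RGBA")
    # Soften the alpha-mask edge so the detector is not trained on sharp
    # paste boundaries (a simple stand-in for the paper's blending step).
    alpha = obj.split()[-1].filter(ImageFilter.GaussianBlur(blur_radius))
    bg.paste(obj, top_left, mask=alpha)
    x, y = top_left
    w, h = obj.size
    return bg, (x, y, x + w, y + h)

# Hypothetical usage (paths are placeholders):
# background = Image.open("kitchen_scene.jpg")
# cutout = Image.open("mug_cutout.png")   # RGBA cutout, object mask in alpha
# rng = np.random.default_rng(0)
# x = int(rng.integers(0, background.width - cutout.width))
# y = int(rng.integers(0, background.height - cutout.height))
# scene, bbox = paste_instance(background, cutout, (x, y))
# scene.save("synthetic_scene.jpg")       # bbox becomes the detection label
```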


Similar projects
CLoDSA: A Tool for Augmentation in Computer Vision tasks
CLoDSA is an open-source image augmentation library for object classification, localization, detection, semantic segmentation and instance segmentation. It ...
Augmentor
Image augmentation library in Python for machine learning.
TF Sprinkles
Fast and efficient sprinkles augmentation implemented in TensorFlow.
Diverse Image Generation via Self-Conditioned GANs
A simple but effective unsupervised method for generating realistic & diverse images using a class-conditional GAN model without using manually annotated ...