How to Detect Data-Copying in Generative Models
I propose new definitions and test statistics for conceptualizing and measuring overfitting by generative models.
generative-modeling data-copying generative-adversarial-networks variational-autoencoders over-representation autoencoders code research article paper arxiv:2004.05675

Detecting overfitting in generative models is an important challenge in machine learning. In this work, we formalize a form of overfitting that we call "data-copying", where the generative model memorizes and outputs training samples or small variations thereof. We provide a three-sample non-parametric test for detecting data-copying that uses the training set, a separate sample from the target distribution, and a generated sample from the model, and we study the performance of our test on several canonical models and datasets.
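The core idea of the test is to compare how close generated samples sit to the training data against how close a genuinely held-out sample sits to it; a data-copying model is systematically too close. Below is a minimal Python sketch of that idea under stated assumptions: it runs a single global one-sided Mann-Whitney U comparison of nearest-neighbor distances, whereas the test in the paper (arXiv:2004.05675) additionally partitions the instance space into cells and aggregates per-cell statistics. The function name `data_copying_test` and the toy data are illustrative, not the authors' code.

```python
# Hedged sketch of a three-sample data-copying check (simplified, global version).
import numpy as np
from sklearn.neighbors import NearestNeighbors
from scipy.stats import mannwhitneyu

def data_copying_test(train, held_out, generated):
    """Return the Mann-Whitney U statistic and p-value for the hypothesis that
    generated samples are systematically closer to the training set than
    held-out samples are (a symptom of data-copying)."""
    nn = NearestNeighbors(n_neighbors=1).fit(train)
    d_held_out, _ = nn.kneighbors(held_out)    # held-out-to-train distances
    d_generated, _ = nn.kneighbors(generated)  # generated-to-train distances
    # One-sided test: are generated-to-train distances stochastically smaller
    # than held-out-to-train distances?
    return mannwhitneyu(d_generated.ravel(), d_held_out.ravel(),
                        alternative="less")

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = rng.normal(size=(1000, 2))
    held_out = rng.normal(size=(500, 2))
    # A "copying" model: training points plus a tiny perturbation.
    generated = train[rng.integers(0, 1000, size=500)] + 0.01 * rng.normal(size=(500, 2))
    print(data_copying_test(train, held_out, generated))
```

In this toy example the synthetic "copying" model yields a very small p-value, while a model that genuinely samples from the target distribution would not.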


Author: Casey Meehan, PhD student at UCSD CSE
Similar projects
Generative Adversarial Networks: A Crash Course in GANs
This course covers GAN basics and how to use the TF-GAN library to create GANs.
“Reparameterization” trick in Variational Autoencoders
In this article, we are going to learn about the “reparameterization” trick that makes Variational Autoencoders (VAE) an eligible candidate for ...
Towards Deep Generative Modeling with W&B
In this report, we will learn about the evolution of generative modeling.
GenForce Lib for Generative Modeling
GenForce: an efficient PyTorch library for deep generative modeling (StyleGAN v1/v2, PGGAN, etc.).