In the realm of computer vision, deep neural networks perform better in a supervised setting with a large amount of labeled data. The representations learned with supervision are not only of high quality but also help the model enhance its accuracy. However, collecting and annotating a large dataset is costly and time-consuming. To avoid this cost, a lot of research has gone into unsupervised visual representation learning, especially in the self-supervised setting. Among the recent advances in self-supervised methods for visual recognition, Chen et al. show with SimCLR that good-quality representations can indeed be learned without explicit supervision. In SimCLR, the authors maximize the similarity between augmentations of the same image and minimize the similarity between augmentations of different images. A linear classifier trained on the representations learned with this approach yields 76.5% top-1 accuracy on the ImageNet ILSVRC-2012 dataset. In this work, we propose that, with the normalized temperature-scaled cross-entropy (NT-Xent) loss function (as used in SimCLR), it is beneficial not to have images of the same category in the same batch. In an unsupervised setting, the information about which images belong to the same category is missing. We therefore use the latent-space representations of a denoising autoencoder trained on the unlabeled dataset and cluster them with k-means to obtain pseudo labels. With this a priori information, we form batches in which no two images come from the same category. We report comparable performance enhancements on the CIFAR10 dataset and a subset of the ImageNet dataset. We refer to our method as G-SimCLR.
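To make the batching idea concrete, here is a minimal sketch, assuming a denoising autoencoder has already been trained on the unlabeled data. The names `dae_encoder`, `images`, and `augment_twice`, as well as all hyperparameter values, are hypothetical placeholders rather than the authors' exact implementation. The sketch clusters the autoencoder latents with k-means and then forms batches in which every image carries a distinct pseudo label, which requires the batch size to be at most the number of clusters.

```python
import numpy as np
from sklearn.cluster import KMeans


def make_pseudo_labels(latents, n_clusters=10, seed=0):
    """Cluster denoising-autoencoder latents with k-means into pseudo labels."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return kmeans.fit_predict(latents)


def category_disjoint_batches(pseudo_labels, batch_size, seed=0):
    """Yield index batches in which no two images share a pseudo label.

    Requires batch_size <= number of distinct pseudo labels.
    """
    rng = np.random.default_rng(seed)
    buckets = [list(np.flatnonzero(pseudo_labels == c))
               for c in np.unique(pseudo_labels)]
    for bucket in buckets:
        rng.shuffle(bucket)
    while True:
        # Prefer the clusters with the most unused images left so the
        # buckets drain evenly.
        order = sorted(range(len(buckets)),
                       key=lambda i: -len(buckets[i]))[:batch_size]
        if len(order) < batch_size or len(buckets[order[-1]]) == 0:
            break  # fewer than batch_size non-empty clusters remain
        yield [buckets[i].pop() for i in order]


# Hypothetical usage: `dae_encoder` maps images to latent vectors.
# latents = dae_encoder.predict(images)
# labels = make_pseudo_labels(latents, n_clusters=10)
# for batch in category_disjoint_batches(labels, batch_size=10):
#     views = augment_twice(images[batch])  # two views per image for NT-Xent
```

The point of this constraint is that NT-Xent treats every other image in a batch as a negative, so two images of the same (pseudo) category landing in one batch would be pushed apart incorrectly.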

The paper was accepted at the ICDM 2020 Deep Learning for Knowledge Transfer (DLKT) workshop.
