Shape and Viewpoint without Keypoints
Tags: 3d, unsupervised-learning, pascal-3d, computer-vision, research, paper, code, article, arxiv:2007.10982

We present a framework that learns to recover the 3D shape, pose, and texture of an object from a single image. It is trained on an image collection without any ground-truth 3D shape, multi-view, camera-viewpoint, or keypoint supervision. We approach this highly under-constrained problem in an "analysis-by-synthesis" framework, where the goal is to predict the likely shape, texture, and camera viewpoint that could produce the image, using various learned category-specific priors. Our particular contribution in this paper is a representation of the distribution over cameras, which we call the "camera-multiplex". Instead of picking a point estimate, we maintain a set of camera hypotheses that are optimized during training to best explain the image given the current shape and texture. We call our approach Unsupervised Category-Specific Mesh Reconstruction (U-CMR), and present qualitative and quantitative results on CUB, Pascal 3D, and new web-scraped datasets. We obtain state-of-the-art camera prediction results and show that we can learn to predict diverse shapes and textures across objects using an image collection without any keypoint annotations or 3D ground truth.
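To make the camera-multiplex idea concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation and rests on several assumptions: the "camera" is a single 2D rotation angle, the "image" is a set of rotated 2D points with known correspondences, and each hypothesis is refined by a simple hill-climbing search rather than by gradients through a differentiable renderer. The names `optimize_multiplex`, `camera_weights`, and `temperature` are hypothetical, and the softmin weighting is only one plausible way to turn per-camera losses into a distribution over cameras.

```python
import math


def rotate(points, theta):
    """Rotate 2D points by angle theta (the toy stand-in for a camera)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def reprojection_loss(shape, observed, theta):
    """Squared error between the shape viewed at angle theta and the observation."""
    return sum((px - qx) ** 2 + (py - qy) ** 2
               for (px, py), (qx, qy) in zip(rotate(shape, theta), observed))


def optimize_multiplex(shape, observed, num_cameras=8, iters=60, step=0.5):
    """Keep several camera hypotheses and refine each one independently."""
    # Assumption: hypotheses start evenly spread over the circle.
    cameras = [2 * math.pi * k / num_cameras for k in range(num_cameras)]
    steps = [step] * num_cameras
    for _ in range(iters):
        for k, theta in enumerate(cameras):
            current = reprojection_loss(shape, observed, theta)
            # Hill climbing: try a step in each direction, keep it only if it helps.
            candidates = [theta + steps[k], theta - steps[k]]
            best = min(candidates,
                       key=lambda t: reprojection_loss(shape, observed, t))
            if reprojection_loss(shape, observed, best) < current:
                cameras[k] = best
            else:
                steps[k] /= 2  # no improvement: search more finely
    losses = [reprojection_loss(shape, observed, t) for t in cameras]
    return cameras, losses


def camera_weights(losses, temperature=0.1):
    """Softmin over losses: lower loss gives higher weight (illustrative assumption)."""
    lowest = min(losses)
    exps = [math.exp(-(loss - lowest) / temperature) for loss in losses]
    total = sum(exps)
    return [e / total for e in exps]


# Toy data: a fixed "shape" observed under an unknown camera angle.
shape = [(1.0, 0.0), (0.0, 0.5), (-0.3, -0.2)]
true_theta = 2.0
observed = rotate(shape, true_theta)

cameras, losses = optimize_multiplex(shape, observed)
weights = camera_weights(losses)
best = min(range(len(cameras)), key=losses.__getitem__)
print("best camera angle:", cameras[best] % (2 * math.pi))
```

The toy loss here has a single basin, so every hypothesis ends up near the true angle and the sketch only illustrates the bookkeeping. With an image-based loss that has many local minima, keeping several hypotheses per image is what lets the best-explaining camera be found without committing to a single early estimate.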


Similar projects
Unsupervised Learning of Probably Symmetric Deformable 3D Objects
A method to learn 3D deformable object categories from raw single-view images, without external supervision.
OpenMMLab Computer Vision
MMCV is a Python library for CV research and supports many research projects such as object detection, segmentation, pose estimation, action ...
MediaPipe
Simplest way for researchers and developers to build world-class ML solutions and applications for mobile, edge, cloud and the web.
Face Alignment in Full Pose Range: A 3D Total Solution