In this technical report, we introduce our winning solution "HorizonLiDAR3D" for the 3D detection track and the domain adaptation track of the Waymo Open Dataset Challenge at CVPR 2020. Many existing 3D object detectors rely on prior-based anchor boxes to account for the different scales, aspect ratios, and classes of objects. This design limits their ability to generalize to a different dataset or domain, and it requires post-processing such as Non-Maximum Suppression (NMS). We propose AFDet, a one-stage, anchor-free, and NMS-free 3D point cloud object detector that uses object key-points to encode the 3D attributes and learns point cloud object detection end to end, without hand-engineering or learning anchors. AFDet serves as a strong baseline in our winning solution, and we make significant improvements over this baseline during the challenges. Specifically, we design stronger networks and enhance the point cloud data using densification and point painting. To leverage camera information, we append (paint) additional attributes to each point by projecting the points into camera space and gathering image-based perception information. The final detection performance also benefits from model ensembling and Test-Time Augmentation (TTA) in both tracks. Our solution achieves 1st place on both tracks, with 77.11% mAPH/L2 on the 3D detection track and 69.49% mAPH/L2 on the domain adaptation track.
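
The point painting step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a pinhole camera model, a camera frame whose z axis points forward, and a dense per-pixel score map as the "image-based perception information". The function name `paint_points` and the parameters `image_scores`, `cam_from_lidar`, and `intrinsics` are hypothetical names chosen for this example.

```python
import numpy as np


def paint_points(points, image_scores, cam_from_lidar, intrinsics):
    """Append per-pixel image scores to each LiDAR point (hypothetical helper).

    points:         (N, D) array, columns 0..2 are x, y, z in the LiDAR frame.
    image_scores:   (H, W, C) per-pixel scores from some image model (assumed).
    cam_from_lidar: (4, 4) rigid transform from LiDAR frame to camera frame.
    intrinsics:     (3, 3) pinhole camera matrix (assumed model).
    Returns an (N, D + C) array; points that miss the image get zeros.
    """
    n = points.shape[0]
    h, w, c = image_scores.shape

    # Move the points into the camera frame using homogeneous coordinates.
    xyz1 = np.concatenate([points[:, :3], np.ones((n, 1))], axis=1)
    cam = (cam_from_lidar @ xyz1.T).T[:, :3]

    # Only points in front of the camera can be seen.
    in_front = cam[:, 2] > 1e-6

    # Perspective projection to pixel coordinates (u = column, v = row).
    uvw = (intrinsics @ cam.T).T
    depth = np.where(in_front, uvw[:, 2], 1.0)  # avoid dividing by <= 0
    u = np.floor(uvw[:, 0] / depth).astype(np.int64)
    v = np.floor(uvw[:, 1] / depth).astype(np.int64)

    # Keep points that land inside the image bounds.
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Gather the image scores for valid points; the rest stay zero.
    painted = np.zeros((n, c), dtype=points.dtype)
    painted[valid] = image_scores[v[valid], u[valid]]
    return np.concatenate([points, painted], axis=1)
```

The painted array keeps the original point attributes and adds `C` extra columns, so it can be fed to a point cloud detector whose input layer accepts the wider feature vector. With several cameras, one would repeat the projection per camera and merge the results; how the report handles overlapping views is not stated here.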
