QuadricSLAM

QuadricSLAM uses constrained dual quadrics as 3D landmark representations, exploiting their ability to compactly represent the size, position and orientation of an object.

Our paper shows how 2D object detections can directly constrain the quadric parameters via a novel geometric error formulation. We develop a sensor model for object detectors that addresses the challenge of partially visible objects, and demonstrate how to jointly estimate camera poses and constrained dual quadric parameters in factor-graph-based SLAM with a general perspective camera.
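
To make this geometry concrete, here is a minimal numpy sketch of how a constrained dual quadric is built from an ellipsoid pose and radii, projected to a dual conic, and compared against a detected bounding box. The function names are illustrative, not the project's API:

```python
import numpy as np

def constrained_dual_quadric(Z, radii):
    """Dual quadric Q* = Z diag(r1^2, r2^2, r3^2, -1) Z^T for an ellipsoid
    with the given radii (3,) at SE(3) pose Z (4x4 homogeneous matrix)."""
    return Z @ np.diag([radii[0]**2, radii[1]**2, radii[2]**2, -1.0]) @ Z.T

def predicted_box(Q_star, K, camera_pose):
    """Project Q* into the image and return the axis-aligned bounding box
    of the resulting dual conic C* = P Q* P^T, with P = K [R | t]."""
    P = K @ np.linalg.inv(camera_pose)[:3, :]   # 3x4 projection matrix
    C = P @ Q_star @ P.T                        # dual conic (3x3, symmetric)
    # Tangent lines (1, 0, -x) and (0, 1, -y) of the conic give the extremes.
    x = (C[0, 2] + np.array([-1.0, 1.0]) * np.sqrt(C[0, 2]**2 - C[0, 0] * C[2, 2])) / C[2, 2]
    y = (C[1, 2] + np.array([-1.0, 1.0]) * np.sqrt(C[1, 2]**2 - C[1, 1] * C[2, 2])) / C[2, 2]
    return np.array([x.min(), y.min(), x.max(), y.max()])

def bbox_error(Q_star, K, camera_pose, detected_box):
    """Geometric error: predicted box vs. detected box (xmin, ymin, xmax, ymax).
    The paper's sensor model additionally truncates the predicted box to the
    image boundaries so that partially visible objects are handled correctly."""
    return predicted_box(Q_star, K, camera_pose) - np.asarray(detected_box)
```

In the full system, this per-detection error is minimized jointly over all camera poses and quadric parameters.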

Using objects as landmarks, represented as constrained dual quadrics in 3D space, QuadricSLAM jointly estimates camera poses and quadric parameters from odometry measurements and object detections, implicitly performing loop closures based on repeated object observations.
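
As a schematic sketch of that factor graph, the snippet below builds the odometry pose chain with the GTSAM library using hypothetical measurement values; the quadric landmark variables and bounding-box factors (provided in practice by an add-on such as gtsam_quadrics) are only indicated by a comment:

```python
import numpy as np
import gtsam
from gtsam.symbol_shorthand import X  # keys for camera poses

graph = gtsam.NonlinearFactorGraph()
initial = gtsam.Values()

# Anchor the first camera pose to fix the gauge freedom.
prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 0.01))
graph.add(gtsam.PriorFactorPose3(X(0), gtsam.Pose3(), prior_noise))
initial.insert(X(0), gtsam.Pose3())

# Odometry factors between consecutive camera poses (a constant forward
# step stands in for real odometry measurements here).
odom = gtsam.Pose3(gtsam.Rot3(), gtsam.Point3(0.5, 0.0, 0.0))
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 0.05))
for i in range(3):
    graph.add(gtsam.BetweenFactorPose3(X(i), X(i + 1), odom, odom_noise))
    initial.insert(X(i + 1), initial.atPose3(X(i)).compose(odom))

# QuadricSLAM additionally inserts one quadric variable per object and one
# bounding-box factor per detection; because detections of the same object
# from revisited viewpoints constrain the same landmark, loop closures
# arise implicitly during optimization.

result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
print(result.atPose3(X(3)))
```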

Contributions

With this research, we make the following contributions:

- We show how 2D object detections can directly constrain the parameters of a dual quadric via a novel geometric error formulation.
- We develop a sensor model for object detectors that addresses the challenge of partially visible objects.
- We demonstrate how to jointly estimate camera poses and constrained dual quadric parameters in factor-graph-based SLAM with a general perspective camera.

Impressions

This video illustrates the quality of QuadricSLAM on a real-world image sequence by projecting the estimated quadric landmarks into the video, using the estimated camera poses.

We use a pretrained YOLOv3 network to generate 2D object detections, which are manually associated with distinct physical objects and then introduced as factors in our factor graph formulation. Odometry measurements for this sequence (fr2_desk from the TUM RGB-D dataset) were obtained using an implementation of ORB-SLAM2 with loop closures disabled.
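
As an illustration of the bookkeeping this involves, here is a small, hypothetical sketch of how per-frame detections with manually assigned object identities can be grouped into per-object tracks before being turned into factors. The data layout and values are assumptions, not the project's actual format:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Detection:
    frame: int        # image index in the sequence
    object_id: int    # manually assigned physical-object label
    box: tuple        # (xmin, ymin, xmax, ymax) from the detector

# Hypothetical detections; in practice these come from YOLOv3 plus the
# manual association step described above.
detections = [
    Detection(0, 1, (120, 80, 260, 210)),
    Detection(1, 1, (130, 85, 270, 215)),
    Detection(1, 2, (400, 150, 480, 300)),
]

# Group measurements per physical object: each group constrains a single
# quadric landmark, and each detection becomes one factor in the graph.
tracks = defaultdict(list)
for det in detections:
    tracks[det.object_id].append(det)
```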

These pictures illustrate how well the estimated quadrics fit the true objects when projected into the camera images from different viewpoints. (Scene from the TUM RGB-D Dataset)
