SimTrack relies on the Robot Operating System (ROS) to obtain sensor data and to connect its pose detection and pose tracking components. SimTrack in turn publishes the estimated poses on tf. It makes extensive use of the compute and graphics capabilities of the graphics processing unit (GPU).
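Because the poses are published on tf, any ROS node can read them with a standard tf lookup. The following is a minimal sketch of such a consumer, not part of SimTrack itself. It assumes ROS 1 with `rospy` and `tf2_ros` installed; the function names and the frame names passed in are hypothetical, since the frame names SimTrack uses depend on its configuration.

```python
import math


def quaternion_to_matrix(t, q):
    """Build a 4x4 homogeneous transform from a translation t = (x, y, z)
    and a quaternion q = (x, y, z, w), the component order tf uses."""
    x, y, z, w = q
    n = math.sqrt(x * x + y * y + z * z + w * w)
    if n == 0.0:
        raise ValueError("zero-norm quaternion")
    x, y, z, w = x / n, y / n, z / n, w / n  # normalize defensively
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w), t[0]],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w), t[1]],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y), t[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def lookup_object_pose(camera_frame, object_frame, timeout_s=1.0):
    """Query tf once for the pose of object_frame expressed in camera_frame.

    Both frame names are assumptions supplied by the caller. Requires a
    running ROS master and a prior call to rospy.init_node().
    """
    import rospy      # imported lazily so the helper above works without ROS
    import tf2_ros

    buf = tf2_ros.Buffer()
    listener = tf2_ros.TransformListener(buf)  # keep a reference alive
    ts = buf.lookup_transform(camera_frame, object_frame,
                              rospy.Time(0), rospy.Duration(timeout_s))
    tr, rot = ts.transform.translation, ts.transform.rotation
    return quaternion_to_matrix((tr.x, tr.y, tr.z),
                                (rot.x, rot.y, rot.z, rot.w))
```

Passing `rospy.Time(0)` asks tf for the most recent transform available, which is usually what a consumer of a live tracker wants.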

A pose detection module uses SIFT keypoints to initialize pose tracking and to recover from tracking failures. Depending on the number of available GPUs, this detection module continuously assists a pose tracking module. The tracking module uses a large number of dense optical flow and depth measurements to continuously refine the estimated poses. A detailed discussion of how tracking and detection interact can be found in:

K. Pauwels, L. Rubio, and E. Ros, “Real-time Pose Detection and Tracking of Hundreds of Objects,” IEEE Transactions on Circuits and Systems for Video Technology, Jan. 2015.
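The division of labor between the two modules can be illustrated with a deliberately simplified, sequential sketch. This is not SimTrack's implementation: the function `run_pipeline` and its `detect`, `track`, and `reliable` callbacks are hypothetical names, and detection is called here only when no valid pose exists, whereas SimTrack can run detection continuously alongside tracking. The paper above describes the actual interaction.

```python
def run_pipeline(frames, detect, track, reliable):
    """Alternate between detection and tracking over a frame sequence.

    detect(frame) -> pose or None       (slower, keypoint-based estimate)
    track(frame, pose) -> refined pose  (fast refinement of a prior pose)
    reliable(frame, pose) -> bool       (hypothetical tracking-quality check)
    """
    pose = None
    history = []
    for frame in frames:
        if pose is not None:
            pose = track(frame, pose)
            if not reliable(frame, pose):
                pose = None  # tracking failure: discard the estimate
        if pose is None:
            pose = detect(frame)  # initialize, or recover after a failure
        history.append(pose)
    return history
```

With toy one-dimensional "poses", a tracker that can only move a limited distance per frame loses the object when it jumps, and the detector re-initializes it on that same frame.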

To achieve robustness, SimTrack exploits as much model information as possible. Its models consist of textured meshes supplemented with SIFT keypoints. Such models can be obtained in various ways, and we discuss a relatively straightforward approach in the modeling section.