Here you can find the benchmark dataset from the paper:
- K. Pauwels, L. Rubio, J. Diaz Alonso, and E. Ros, “Real-time model-based rigid object pose estimation and tracking combining dense and sparse visual cues,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Portland, 2013, pp. 2347–2354.
The following video gives some indication of the dataset contents.
Six different objects are available under three conditions: noise-free, noisy, and occluded.
Due to the size of the dataset, compressed versions of the sequences are provided (about 140 MB for each object):
These sequences were compressed using high-quality (but lossy) H.264 video encoding. The raw dataset is also available (about 2 GB for each object):
The following calibration info corresponds to the rectified sequences:
focal_length = 500.6795; % (in pixels)
baseline = 70.7722; % (in mm)
nodal_point_x = 352.1633; % column (in pixels)
nodal_point_y = 260.3113; % row (in pixels)
Pixels are square, and the focal length and nodal point are identical in both (rectified) images.
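As a rough illustration of how these calibration values might be used, the Python sketch below converts a disparity to metric depth and back-projects a pixel to a 3D point in the left camera frame. It assumes a standard rectified pinhole stereo model (Z = f·B/d) with disparity measured in pixels as left column minus right column; that convention, and the function names `disparity_to_depth` and `backproject`, are assumptions for illustration and are not specified by the dataset.

```python
# Calibration values copied from the rectified-sequence info above.
FOCAL_LENGTH = 500.6795    # pixels (same for x and y: pixels are square)
BASELINE = 70.7722         # mm
NODAL_POINT_X = 352.1633   # column, pixels
NODAL_POINT_Y = 260.3113   # row, pixels


def disparity_to_depth(disparity):
    """Return depth in mm for a disparity in pixels (assumed left minus right).

    Uses the standard rectified-stereo relation Z = f * B / d.
    Non-positive disparities have no finite depth, so None is returned.
    """
    if disparity <= 0:
        return None
    return FOCAL_LENGTH * BASELINE / disparity


def backproject(col, row, depth):
    """Back-project pixel (col, row) at the given depth (mm) to a 3D point.

    The result (X, Y, Z) is in mm, in the left camera frame, under the
    pinhole model with the nodal point as principal point.
    """
    x = (col - NODAL_POINT_X) * depth / FOCAL_LENGTH
    y = (row - NODAL_POINT_Y) * depth / FOCAL_LENGTH
    return (x, y, depth)


if __name__ == "__main__":
    z = disparity_to_depth(50.0)  # hypothetical disparity value
    print("depth (mm):", z)
    print("3D point (mm):", backproject(400.0, 300.0, z))
```

Because the baseline is given in mm, depths and 3D coordinates come out in mm as well. A pixel at the nodal point back-projects onto the optical axis, which is a quick sanity check on the sign conventions.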