The Middlebury Stereo dataset consists of high-resolution stereo sequences with complex geometry and pixel-accurate ground-truth disparity data. The ground-truth disparities are acquired using a novel technique that employs structured lighting and does not require the calibration of the light projectors.
These datasets were created by Daniel Scharstein, Alexander Vandenberg-Rodes, and Rick Szeliski; see our CVPR 2003 paper for more details.

Quarter-size (450 x 375) versions of our new data sets "Cones" and "Teddy" are available for download below. Each data set contains 9 color images (im0..im8) and 2 disparity maps (disp2 and disp6). The 9 color images form a multi-baseline stereo sequence, i.e., they are taken from equally spaced viewpoints along the x-axis, from left to right. The images are rectified so that all image motion is purely horizontal. To test a two-view stereo algorithm, use the two reference views im2 (left) and im6 (right). Ground-truth disparities with quarter-pixel accuracy are provided for these two views. Disparities are encoded with a scale factor of 4 for gray levels 1..255, while gray level 0 means "unknown disparity"; the encoded disparity range is therefore 0.25..63.75 pixels.
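The decoding rule above is simple to apply in practice. Below is a minimal sketch of how one might convert an encoded disparity map into floating-point disparities, assuming the map is stored as an 8-bit grayscale image (the file name "disp2.png" and the use of NumPy/Pillow are illustrative assumptions, not part of the dataset specification).

```python
# Minimal sketch: decoding a Middlebury "Cones"/"Teddy" disparity map.
# Assumes an 8-bit grayscale image; the file name below is illustrative.
import numpy as np
from PIL import Image

def decode_disparity(path):
    """Convert an encoded disparity image to disparities in pixels.

    Gray levels 1..255 encode disparities with a scale factor of 4
    (disparity = gray / 4.0); gray level 0 means "unknown disparity".
    """
    gray = np.asarray(Image.open(path).convert("L"), dtype=np.float32)
    disparity = gray / 4.0            # quarter-pixel accuracy
    disparity[gray == 0] = np.nan     # mark unknown disparities
    return disparity

# Example usage for the left reference view (im2):
# disp_left = decode_disparity("disp2.png")  # values in 0.25..63.75 px, NaN = unknown
```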