The NYU-Depth V2 dataset comprises video sequences from a variety of indoor scenes, recorded by both the RGB and depth cameras of the Microsoft Kinect. It features:
- 1449 densely labeled pairs of aligned RGB and depth images
- 464 new scenes taken from 3 cities
- 407,024 new unlabeled frames

Each object is labeled with a class and an instance number. The dataset has several components:
- Labeled: a subset of the video data accompanied by dense multi-class labels. This data has also been preprocessed to fill in missing depth labels.
- Raw: the raw RGB, depth, and accelerometer data as provided by the Kinect.
- Toolbox: useful functions for manipulating the data and labels.
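The Labeled component is distributed as a single MATLAB .mat file. Below is a minimal reading sketch with h5py, assuming the field names ('images', 'depths', 'labels', 'instances') documented by the dataset authors; verify them against your copy.

import h5py
import numpy as np

# Minimal sketch: read one frame from the Labeled component (nyu_depth_v2_labeled.mat).
# Field names are taken from the official documentation; adjust if your copy differs.
with h5py.File('nyu_depth_v2_labeled.mat', 'r') as f:
    image = np.array(f['images'][0])        # one RGB frame
    depth = np.array(f['depths'][0])        # corresponding depth map (meters)
    label = np.array(f['labels'][0])        # per-pixel class labels
    instance = np.array(f['instances'][0])  # per-pixel instance numbers

print(image.shape, depth.shape, label.shape, instance.shape)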
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('nyu_depth_v2', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/nyu_depth_v2-0.0.1.png
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
Source: https://huggingface.co/datasets/sayakpaul/nyu_depth_v2 (origin: fast-depth)
Train: 48k; Test: 654; Image dtype: uint8; Depth dtype: uint16
import cv2

def image2depth(path):
    # Read the 16-bit depth PNG without any dtype conversion.
    depth = cv2.imread(path, cv2.IMREAD_UNCHANGED)
    depth = depth.astype('float32')
    # Map the uint16 range [0, 65535] to meters in [0, 10].
    depth /= (2**16 - 1)
    depth *= 10.0
    return depth
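A quick sanity check of the conversion above (the filename is illustrative, not part of the dataset):

depth_m = image2depth('depth_0001.png')  # illustrative path to one 16-bit depth PNG
print(depth_m.dtype, depth_m.min(), depth_m.max())  # float32, values in meters within [0, 10]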
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
NYUv2
This is an unofficial and preprocessed version of NYU Depth Dataset V2 made available for easier integration with modern ML workflows. The dataset was converted from the original .mat format into a split structure with embedded RGB images, depth maps, semantic masks, and instance masks in Hugging Face-compatible format.
📸 Sample Visualization
RGB · Depth (Jet colormap) · Semantic Mask… See the full description on the dataset page: https://huggingface.co/datasets/jagennath-hari/nyuv2.
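A small plotting sketch for reproducing such a panel with matplotlib; the split name and the 'image'/'depth' column names are assumptions, so check the dataset card before use.

from datasets import load_dataset
import matplotlib.pyplot as plt

# Assumed split and column names ('train', 'image', 'depth'); verify on the dataset card.
ds = load_dataset('jagennath-hari/nyuv2', split='train')
sample = ds[0]

fig, axes = plt.subplots(1, 2)
axes[0].imshow(sample['image'])
axes[0].set_title('RGB')
axes[1].imshow(sample['depth'], cmap='jet')  # Jet colormap, as in the sample panel
axes[1].set_title('Depth (Jet colormap)')
plt.show()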
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Quantitative comparison on the NYU Depth V2 dataset.
The RMRC 2014 indoor dataset targets indoor semantic segmentation. Its training set is drawn from the NYU Depth V2 and SUN3D datasets; the test data consists of newly acquired images.
This is the NYUv2 dataset for scene understanding tasks. I downloaded the original data from the Tsinghua Cloud and transformed it into a Hugging Face dataset. Credit to ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning.
Dataset Information
This dataset contains two splits: 'train' and 'val' (used as the test set). Each sample has 5 items: 'image', 'segmentation', 'depth', 'normal', and 'noise'. The noise is generated using torch.rand().
Usage… See the full description on the dataset page: https://huggingface.co/datasets/tanganke/nyuv2.
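A minimal loading sketch with the Hugging Face datasets library, based on the split and item names described above (not an official example from the dataset page):

from datasets import load_dataset

# Load the preprocessed NYUv2 data from the Hub; split names follow the description above.
ds = load_dataset('tanganke/nyuv2')

sample = ds['train'][0]
print(sample.keys())  # expected: image, segmentation, depth, normal, noise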
Processed versions of some open-source datasets for evaluation of monocular geometry estimation.
| Dataset | Source | Publication | Num. images | Storage size | Note |
|---|---|---|---|---|---|
| NYUv2 | NYU Depth Dataset V2 | [1] | 654 | 243 MB | Official test split. Mirror, glass, and window manually removed. Depth beyond 5 m truncated. |
| KITTI | KITTI Vision Benchmark Suite | [2, 3] | 652 | 246 MB | Eigen's test split. |
| ETH3D | ETH3D SLAM & Stereo Benchmarks | [4] | 454 | 1.3 GB | Downsized from 6202×4135 to 2048×1365. |
| iBims-1 | iBims-1 (independent… | | | | |

See the full description on the dataset page: https://huggingface.co/datasets/Ruicheng/monocular-geometry-evaluation.
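For context, monocular depth benchmarks like these are typically scored with RMSE, absolute relative error, and δ-threshold accuracy over valid ground-truth pixels. Below is a generic sketch of those metrics, not the official evaluation code for this dataset.

import numpy as np

def depth_metrics(pred, gt, max_depth=10.0):
    # Generic monocular-depth metrics; the depth cap and 1.25 threshold are
    # common defaults in the literature, not settings taken from this dataset.
    valid = (gt > 0) & (gt <= max_depth)
    pred, gt = pred[valid], gt[valid]
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = float(np.mean(ratio < 1.25))
    return {'rmse': rmse, 'abs_rel': abs_rel, 'delta1': delta1}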