Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
The depth prediction evaluation is related to the work published in Sparsity Invariant CNNs (THREEDV 2017). It contains over 93 thousand depth maps with corresponding raw LiDAR scans and RGB images, aligned with the "raw data" of the KITTI dataset. Given the large amount of training data, this dataset allows the training of complex deep learning models for depth completion and single-image depth prediction. In addition, manually selected images with unpublished depth maps are provided here to serve as a benchmark for these two challenging tasks.
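For reference, depth maps in this benchmark are typically distributed as 16-bit PNGs in which the pixel value divided by 256 gives the depth in meters and a value of 0 marks pixels without a measurement, as described in the official development kit. The sketch below reads one such file under that assumption; the function name is illustrative.

```python
# Minimal sketch for reading a KITTI depth map, assuming the devkit's encoding:
# 16-bit PNG, depth_in_meters = pixel_value / 256, value 0 = no measurement.
import numpy as np
from PIL import Image

def load_kitti_depth(png_path):
    """Return (depth_in_meters, valid_mask) for a KITTI depth map PNG."""
    depth_png = np.array(Image.open(png_path), dtype=np.uint16)
    # Sanity check: a correctly exported depth map uses the full 16-bit range.
    assert depth_png.max() > 255, "expected a 16-bit depth PNG"
    valid = depth_png > 0
    depth = depth_png.astype(np.float32) / 256.0
    depth[~valid] = 0.0
    return depth, valid
```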
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
This dataset consists of four ZIP files containing annotated images used for experiments in the research of formal specification and specification-based testing for image recognition in autonomous driving systems. The dataset has been derived and modified from the KITTI dataset.
image1.zip: Contains 349 images. These images are part of the first subset used in the experiments.
label1.zip: Contains the 2D bounding box annotations for vehicles corresponding to the images in image1.zip. There are 349 annotation files, and in total, 2,736 vehicles are annotated.
image2.zip: Contains 1,300 images. These images are part of the second subset used in the experiments.
label2.zip: Contains the 2D bounding box annotations for vehicles corresponding to the images in image2.zip. There are 1,300 annotation files, and in total, 5,644 vehicles are annotated.
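The annotation layout itself is not documented above; if the label files keep the original KITTI object label format (one object per line, the class name first, and the 2D bounding box given as left, top, right, bottom in pixel coordinates in fields 5-8), they can be parsed with a short sketch like the following. Both the format assumption and the helper name are illustrative.

```python
# Hypothetical parser assuming the annotations follow the original KITTI object
# label format: one object per line, class name in field 1, 2D box (left, top,
# right, bottom, in pixels) in fields 5-8.
def parse_kitti_labels(label_path):
    """Return a list of (class_name, (left, top, right, bottom)) tuples."""
    boxes = []
    with open(label_path) as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue
            name = fields[0]                            # e.g. 'Car', 'Van', 'Truck'
            left, top, right, bottom = map(float, fields[4:8])
            boxes.append((name, (left, top, right, bottom)))
    return boxes
```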
The dataset was utilized in the research project focused on Bounding Box Specification Language (BBSL), a formal specification language designed for image recognition in autonomous driving systems. This research explores specification-based testing methodologies for object detection systems.
The BBSL project and related testing tools can be accessed on GitHub: https://github.com/IOKENTOI/BBSL-test.
The original KITTI dataset used for modification can be found at [KITTI dataset source link].
If you use this dataset, please cite the original KITTI dataset:
@inproceedings{Geiger2012CVPR,
author = {Andreas Geiger and Philip Lenz and Raquel Urtasun},
title = {Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite},
booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2012}
}
This dataset was created by Sumanyu Ghoshal
The Segmenting and Tracking Every Pixel (STEP) benchmark consists of 21 training sequences and 29 test sequences. It is based on the KITTI Tracking Evaluation and the Multi-Object Tracking and Segmentation (MOTS) benchmark. This benchmark extends the annotations to the Segmenting and Tracking Every Pixel (STEP) task. [Copy-pasted from http://www.cvlibs.net/datasets/kitti/eval_step.php]
The odometry benchmark consists of 22 stereo sequences, saved in lossless PNG format: we provide 11 sequences (00-10) with ground truth trajectories for training and 11 sequences (11-21) without ground truth for evaluation. For this benchmark you may provide results using monocular or stereo visual odometry, laser-based SLAM, or algorithms that combine visual and LiDAR information. The only restriction we impose is that your method is fully automatic (e.g., no manual loop-closure tagging is allowed) and that the same parameter set is used for all sequences. A development kit provides details about the data format. More details are available at: https://www.cvlibs.net/datasets/kitti/eval_odometry.php.
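As an illustration, the ground-truth trajectories can be loaded with a short sketch like the one below, assuming the plain-text pose format used by the odometry development kit: one pose per line, twelve space-separated values forming a row-major 3x4 matrix that maps points from the current left-camera frame into the coordinate system of the first frame. The function name is illustrative.

```python
# Minimal sketch for reading KITTI odometry ground-truth poses (poses/NN.txt),
# assuming one row-major 3x4 camera pose per line.
import numpy as np

def load_odometry_poses(pose_file):
    """Return an (N, 4, 4) array of homogeneous camera poses."""
    rows = np.loadtxt(pose_file).reshape(-1, 3, 4)      # N x 3 x 4
    poses = np.tile(np.eye(4), (rows.shape[0], 1, 1))   # start from identity
    poses[:, :3, :4] = rows                             # fill rotation + translation
    return poses
```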
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation, and is based on KITTI and nuScenes. See the paper for more details: https://www.arxiv.org/abs/2503.11122, and access the code at: https://github.com/Hongbin98/DriveGEN. Please review the terms of KITTI and nuScenes before downloading this dataset and comply with their rules.
https://www.cvlibs.net/datasets/kitti/ https://www.nuscenes.org/
CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
Light Detection and Ranging (LiDAR) is a sensor used to measure distances between the sensor and its surroundings. It works by emitting multiple laser beams and sensing them after they are reflected, and from this it calculates the distance between the sensor and the objects they were reflected from. Since the rise of research in the field of self-driving cars, LiDAR has been widely used and has even been developed at lower cost than before.
The KITTI dataset is one of the most famous datasets targeting the field of self-driving cars. It contains data recorded from cameras, LiDAR, and other sensors mounted on top of a car driving through many streets, covering many different scenes and scenarios.
This dataset contains the LiDAR frames of the KITTI dataset converted to 2D depth images; the conversion was done using this code. These 2D depth images represent the same scenes as the corresponding LiDAR frames, but in an easier-to-process format.
This dataset contains 2D depth images, like the one shown below. The 360° LiDAR frames in the KITTI dataset are in a cylindrical format around the sensor itself. The 2D depth images in this dataset can be pictured as if you had made a cut in the cylinder of the LiDAR frame and straightened it out onto a 2D plane. The pixels of these 2D depth images represent the distance of the reflecting object from the LiDAR sensor. The vertical resolution of the 2D depth image (64 in our case) corresponds to the number of laser beams the LiDAR sensor uses to scan the surroundings. These 2D depth images can be used for tasks such as segmentation, detection, and recognition, and can take advantage of the huge computer vision literature on 2D images.
Example 2D depth image: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F3283916%2F71fcde75b3e94ab78896aa75d7efea09%2F0000000077.png
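The conversion code linked above is not reproduced here; the sketch below shows one common way to build such a depth image, a spherical projection of a KITTI Velodyne scan onto a 64-row image. The image width and the vertical field-of-view bounds are assumptions (typical values for the HDL-64E sensor), not necessarily the settings used by the dataset's own conversion code.

```python
# Illustrative sketch: project a KITTI Velodyne scan (.bin file of float32
# x, y, z, reflectance rows) onto a 64 x 1024 range image. Width and vertical
# field of view (+2 deg to -24.8 deg) are assumed values.
import numpy as np

def scan_to_depth_image(bin_path, height=64, width=1024,
                        fov_up_deg=2.0, fov_down_deg=-24.8):
    points = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)[:, :3]
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points, axis=1)

    yaw = np.arctan2(y, x)                           # horizontal angle in [-pi, pi]
    pitch = np.arcsin(z / np.maximum(depth, 1e-8))   # vertical angle

    fov_up, fov_down = np.radians(fov_up_deg), np.radians(fov_down_deg)
    cols = 0.5 * (1.0 - yaw / np.pi) * width                 # unroll the cylinder
    rows = (fov_up - pitch) / (fov_up - fov_down) * height   # beam index from pitch

    cols = np.clip(np.floor(cols), 0, width - 1).astype(np.int32)
    rows = np.clip(np.floor(rows), 0, height - 1).astype(np.int32)

    image = np.zeros((height, width), dtype=np.float32)
    image[rows, cols] = depth        # later points simply overwrite earlier ones
    return image
```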
KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) is one of the most popular datasets for use in mobile robotics and autonomous driving. It consists of hours of traffic scenarios recorded with a variety of sensor modalities, including high-resolution RGB, grayscale stereo cameras, and a 3D laser scanner. Despite its popularity, the dataset itself does not contain ground truth for semantic segmentation. However, various researchers have manually annotated parts of the dataset to fit their necessities. Álvarez et al. generated ground truth for 323 images from the road detection challenge with three classes: road, vertical, and sky. Zhang et al. annotated 252 (140 for training and 112 for testing) acquisitions – RGB and Velodyne scans – from the tracking challenge for ten object categories: building, sky, road, vegetation, sidewalk, car, pedestrian, cyclist, sign/pole, and fence. Ros et al. labeled 170 training images and 46 testing images (from the visual odometry challenge) with 11 classes: building, tree, sky, car, sign, road, pedestrian, fence, pole, sidewalk, and bicyclist.
This dataset is from the KITTI Road/Lane Detection Evaluation 2013.
This benchmark has been created in collaboration with Jannik Fritsch and Tobias Kuehnl from Honda Research Institute Europe GmbH. The road and lane estimation benchmark consists of 289 training and 290 test images. It contains three different categories of road scenes: urban unmarked (uu), urban marked (um), and urban multiple marked lanes (umm).