Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
License information was derived automatically
Datasets for automotive applications require human annotators to label objects such as traffic lights, cars, and pedestrians. Many are available today (e.g. image datasets and infrared images), as well as sensor fusion datasets (e.g. image/RADAR/LiDAR, images with athermalized lenses, and images with event-based sensor data). UDayton24Automotive differs from other datasets in that it is specifically designed for developing, training, and benchmarking object detection algorithms using raw sensor data. Multiple automotive cameras are involved, as described below.
RGGB Camera Data (Baseline Training Set) We collected a new dataset of raw/demosaicked image pairs using an automotive camera (a Sony IMX390 sensor with an RGGB color filter array and a 174-degree fisheye lens), yielding 438 images for training and 88 images for testing. The dataset was annotated by human annotators for cars (3089), pedestrians (687), stop signs (110), and traffic lights (848). This dataset is used to train the raw sensor data-based object detection algorithm for the RGGB camera module, which we may regard as the "teacher" algorithm in knowledge distillation.
RCCB Camera Data (Test Set) We collected this dataset using the RCCB camera module with a 169-degree fisheye lens to test and evaluate the performance of the proposed object detection algorithm. There are a total of 474 raw/demosaicked image pairs captured by this automotive camera. The dataset was annotated by human annotators for cars (2506), pedestrians (406), stop signs (109), and traffic lights (784).
Joint RGGB-RCCB Camera Data (Cross-Camera Training Set) We collected 90 RGGB-RCCB image pairs using the dual-camera configuration shown in Figure 2, captured by Sony IMX390 cameras with RGGB and RCCB color filter arrays. As this dataset is intended to support the unsupervised learning of raw RCCB sensor data-based object detection, the image pairs in this dataset are not annotated. The two cameras are externally triggered by two separate laptops (again, a limitation of the hardware/software environment we were given). Although not perfectly synchronized, the captures are manually triggered together so that they occur within a fraction of a second. Unlike the RGGB Camera Data (Baseline Training Set) or the RCCB Camera Data (Test Set), the RGGB-RCCB Camera Data does not need to contain moving targets such as pedestrians and cars, and therefore strict synchronization is not necessary.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
This dataset can be used to evaluate methods that detect changed objects when comparing two recordings of the same environment at different time instances. Based on the labeled ground-truth objects, it is possible to differentiate between static, moved, removed, and novel objects.
The dataset was recorded with an Asus Xtion PRO Live mounted on the HSR robot. We provide scenes from five different rooms or parts of rooms, namely a big room, a small room, a living area, a kitchen counter, and an office desk. Each room is visited by the robot at least five times, and between runs a subset of objects from the YCB Object and Model Set (YCB) [1] is re-arranged in the room. In total we generated 26 recordings. For each recording, between three and 17 objects are placed (219 in total). Furthermore, furniture and permanent background objects are slightly rearranged. These changes are not labeled because they are not relevant for most service robot tasks.
Assuming most objects are placed on horizontal surfaces, we extracted planes in each room in a pre-processing step (excluding the floor). For each surface, all frames from the recording where it is visible are extracted and used as the input for ElasticFusion[2]. This results in a total of 34 reconstructed surfaces.
We provide pointwise annotation of the YCB objects for each surface reconstruction from each recording.
Images of exemplary surface reconstructions can be found here: https://www.acin.tuwien.ac.at/vision-for-robotics/software-tools/obchange/
The file structure of ObChange.zip is the following:
Room
  - scene2
    - planes
      - 0
        - merged_plane_clouds_ds002.pcd
        - merged_plane_clouds_ds002.anno
        - merged_plane_clouds_ds002_GT.anno
      - 1
        - merged_plane_clouds_ds002.pcd
        - merged_plane_clouds_ds002.anno
        - merged_plane_clouds_ds002_GT.anno
      - ...
    - table.txt
  - scene3
The pcd-file contains the reconstruction of the surface. The merged_plane_clouds_ds002.anno lists the YCB objects visible in the reconstruction and merged_plane_clouds_ds002_GT.anno contains the point indices of the reconstruction corresponding to the YCB objects together with the corresponding object name. The last element for each object is a bool value indicating if the object is on the floor (and was reconstructed by chance). The table.txt lists for each detected plane the centroid, height, convex hull points and plane coefficients.
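For orientation, below is a minimal Python sketch of how one surface reconstruction and its ground-truth annotation could be loaded. It assumes Open3D for reading the .pcd file, and it assumes each line of the _GT.anno file describes one object (label, point indices, trailing on-floor flag); that layout is an interpretation of the description above, not a documented specification.

```python
import numpy as np
import open3d as o3d  # one option for reading .pcd files

# Path of one reconstructed surface, following the file structure above.
plane_dir = "Room/scene2/planes/0"
cloud = o3d.io.read_point_cloud(f"{plane_dir}/merged_plane_clouds_ds002.pcd")
points = np.asarray(cloud.points)

# Assumed per-line layout of the GT annotation: <label> <idx> <idx> ... <on_floor flag>
objects = {}
with open(f"{plane_dir}/merged_plane_clouds_ds002_GT.anno") as f:
    for line in f:
        tokens = line.split()
        if not tokens:
            continue
        label = tokens[0]                     # YCB object name (assumed to come first)
        on_floor = bool(int(tokens[-1]))      # last element: on-floor flag
        indices = np.array(tokens[1:-1], dtype=int)
        objects[label] = {"points": points[indices], "on_floor": on_floor}

for name, obj in objects.items():
    print(name, obj["points"].shape[0], "points, on floor:", obj["on_floor"])
```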
We provide the original input data for each room. The zip files contain the rosbag file for each recording. Each rosbag contains the tf-tree and the RGB and depth streams, as well as the camera intrinsics. Additionally, the semantically annotated Voxblox [3] reconstruction created with SparseConvNet [4] is provided for each recording.
You may also be interested in Object Change Detection Dataset of Indoor Environments. It uses the same input data, but the ground truth annotation is based on a full room reconstruction instead of individual planes.
The research leading to these results has received funding from the Austrian Science Fund (FWF) under grant agreement Nos. I3969-N30 (InDex), I3967-N30 (BURG) and from the Austrian Research Promotion Agency (FFG) under grant agreement 879878 (K4R).
[1] B. Calli, A. Singh, J. Bruce, A. Walsman, K. Konolige, S. Srinivasa, P. Abbeel, A. M. Dollar, Yale-CMU-Berkeley dataset for robotic manipulation research, The International Journal of Robotics Research, vol. 36, Issue 3, pp. 261–268, April 2017.
[2] T. Whelan, S. Leutenegger, R. Salas-Moreno, B. Glocker, A. Davison, ElasticFusion: Dense SLAM without a pose graph, Proceedings of Robotics: Science and Systems, July 2015.
[3] H. Oleynikova, Z. Taylor, M. Fehr, R. Siegwart, J. Nieto, Voxblox: Incremental 3D Euclidean Signed Distance Fields for On-Board MAV Planning, in Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), pp. 1366-1373, 2017.
[4] B. Graham, M. Engelcke, L. van der Maaten, 3D Semantic Segmentation with Submanifold Sparse Convolutional Networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9224–9232, 2018.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
The ORBIT-India (Object Recognition for Blind Image Training) dataset is a collection of 105,243 images of 76 commonly used objects, collected by 12 individuals in India who are blind or have low vision. This dataset is an "Indian subset" of the original ORBIT dataset [1, 2], which was collected in the UK and Canada. In contrast to the ORBIT dataset, which was created in a Global North, Western, and English-speaking context, the ORBIT-India dataset features images taken in a low-resource, non-English-speaking, Global South context, home to 90% of the world's population of people with blindness. Since it is easier for blind or low-vision individuals to gather high-quality data by recording videos, this dataset, like the ORBIT dataset, contains images (each sized 224x224) derived from 587 videos. These videos were taken by our data collectors from various parts of India using the Find My Things [3] Android app. Each data collector was asked to record eight videos of at least 10 objects of their choice.
Collected between July and November 2023, this dataset represents a set of objects commonly used by people who are blind or have low vision in India, including earphones, talking watches, toothbrushes, and typical Indian household items like a belan (rolling pin) and a steel glass. These videos were taken in various settings of the data collectors' homes and workspaces using the Find My Things Android app.
The image dataset is stored in the "Dataset" folder, organized into folders assigned to each data collector (P1, P2, ..., P12). Each collector's folder includes sub-folders named with the object labels provided by our data collectors. Within each object folder, there are two subfolders: "clean" for images taken on clean surfaces and "clutter" for images taken in cluttered environments where the objects are typically found. The annotations are saved inside an "Annotations" folder containing a JSON file per video (e.g., P1--coffee mug--clean--231220_084852_coffee mug_224.json) with keys corresponding to all frames/images in that video (e.g., "P1--coffee mug--clean--231220_084852_coffee mug_224--000001.jpeg": {"object_not_present_issue": false, "pii_present_issue": false}, "P1--coffee mug--clean--231220_084852_coffee mug_224--000002.jpeg": {"object_not_present_issue": false, "pii_present_issue": false}, ...). The "object_not_present_issue" key is true if the object is not present in the image, and the "pii_present_issue" key is true if personally identifiable information (PII) is present in the image. Note that all PII present in the images has been blurred to protect the identity and privacy of our data collectors. This dataset version was created by cropping images originally sized at 1080x1920; an unscaled version of the dataset will follow soon.
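As an illustration, here is a small Python sketch that loads one of the per-video annotation JSON files and keeps only the frames without flagged issues. The file and folder names follow the description above; the reconstructed image path is an assumption about the exact on-disk layout.

```python
import json
from pathlib import Path

# Folder names follow the description above; adjust to the local extraction path.
dataset_root = Path("Dataset")
annotations_root = Path("Annotations")

anno_file = annotations_root / "P1--coffee mug--clean--231220_084852_coffee mug_224.json"
frames = json.loads(anno_file.read_text())

# Keep only frames where the object is visible and no PII issue is flagged.
usable = [
    name for name, flags in frames.items()
    if not flags["object_not_present_issue"] and not flags["pii_present_issue"]
]
print(f"{len(usable)} of {len(frames)} frames are usable")

# Reconstruct an image location from the frame name (assumed layout:
# Dataset/<collector>/<object label>/<clean|clutter>/<frame>.jpeg).
collector, label, condition = usable[0].split("--")[:3]
print(dataset_root / collector / label / condition / usable[0])
```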
This project was funded by the Engineering and Physical Sciences Research Council (EPSRC) Industrial ICASE Award with Microsoft Research UK Ltd. as the Industrial Project Partner. We would like to acknowledge and express our gratitude to our data collectors for their efforts and time invested in carefully collecting videos to build this dataset for their community. The dataset is designed for developing few-shot learning algorithms, aiming to support researchers and developers in advancing object-recognition systems. We are excited to share this dataset and would love to hear from you if and how you use this dataset. Please feel free to reach out if you have any questions, comments or suggestions.
REFERENCES:
[1] Daniela Massiceti, Lida Theodorou, Luisa Zintgraf, Matthew Tobias Harris, Simone Stumpf, Cecily Morrison, Edward Cutrell, and Katja Hofmann. 2021. ORBIT: A real-world few-shot dataset for teachable object recognition collected from people who are blind or low vision. DOI: https://doi.org/10.25383/city.14294597
[2] microsoft/ORBIT-Dataset. https://github.com/microsoft/ORBIT-Dataset
[3] Linda Yilin Wen, Cecily Morrison, Martin Grayson, Rita Faia Marques, Daniela Massiceti, Camilla Longden, and Edward Cutrell. 2024. Find My Things: Personalized Accessibility through Teachable AI for People who are Blind or Low Vision. In Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems (CHI EA '24). Association for Computing Machinery, New York, NY, USA, Article 403, 1–6. https://doi.org/10.1145/3613905.3648641
Area exposed to one or more hazards represented on the hazard map used for the risk analysis of the RPP. The hazard map is the result of the study of hazards, the objective of which is to assess the intensity of each hazard at any point in the study area. The evaluation method is specific to each hazard type. It leads to the delimitation of a set of areas on the study perimeter constituting a zoning graduated according to the level of the hazard. The assignment of a hazard level at a given point in the territory takes into account the probability of occurrence of the dangerous phenomenon and its degree of intensity. For multi-hazard PPRNs, each zone is usually identified on the hazard map by a code for each hazard to which it is exposed.
All hazard areas shown on the hazard map are included. Areas protected by protective structures must be represented (possibly in a specific way), as they are always considered subject to hazard (in case of breakage or inadequacy of the structure). Hazard zones can be regarded as derived data insofar as they result from a synthesis of multiple sources of calculated, modelled, or observed hazard data. These source data are not covered by this object class but by another standard dealing with the knowledge of hazards. Some areas within the study perimeter are considered "no or insignificant hazard zones". These are areas where the hazard has been studied and is nil. These areas are not included in the object class and do not have to be represented as hazard zones. However, in the case of natural RPPs, regulatory zoning may classify certain areas not exposed to hazard as prescribing areas (see the definition of the PPR class).
Area exposed to one or more hazards represented on the hazard map used for the risk analysis of the RPP. The hazard map is the result of the study of hazards, the objective of which is to assess the intensity of each hazard at any point in the study area. The evaluation method is specific to each hazard type. It leads to the delimitation of a set of areas on the study perimeter constituting a zoning graduated according to the level of the hazard. The assignment of a hazard level at a given point in the territory takes into account the probability of occurrence of the dangerous phenomenon and its degree of intensity. For multi-hazard PPRNs, each zone is usually identified on the hazard map by a code for each hazard to which it is exposed.
All hazard areas shown on the hazard map are included. Areas protected by protective structures must be represented (possibly in a specific way), as they are always considered to be subject to hazard (in case of breakage or inadequacy of the structure). The hazard zones may be regarded as derived data insofar as they result from a synthesis of several sources of calculated, modelled, or observed hazard data. These source data are not covered by this object class but by another standard dealing with the knowledge of hazards. Some areas of the study perimeter are considered "zero or insignificant hazard zones". These are areas where the hazard has been studied and is nil. These areas are not included in the object class and do not have to be represented as hazard zones. However, in the case of natural RPPs, regulatory zoning may classify certain areas not exposed to hazard as prescribing areas (see the definition of the PPR class).
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) (https://creativecommons.org/licenses/by-nc-sa/3.0/)
License information was derived automatically
Tsinghua-Daimler Cyclist Detection Benchmark Dataset in YOLO format for object detection
I'm not the owner of this dataset; all the credit goes to X. Li, F. Flohr, Y. Yang, H. Xiong, M. Braun, S. Pan, K. Li and D. M. Gavrila, the creators of this dataset.
Label format: id center_x center_y width height (relative to image width and height)
Example: 0 0.41015625 0.44140625 0.0341796875 0.11328125
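For reference, a small Python sketch that converts one such normalized label line to pixel coordinates; the 2048x1024 image size is only an example value.

```python
# Convert a normalized YOLO label line to an integer pixel bounding box.
def yolo_to_pixels(line: str, img_w: int, img_h: int):
    cls, cx, cy, w, h = line.split()
    cx, cy = float(cx) * img_w, float(cy) * img_h      # box center in pixels
    w, h = float(w) * img_w, float(h) * img_h          # box size in pixels
    x_min, y_min = cx - w / 2, cy - h / 2
    return int(cls), (round(x_min), round(y_min), round(x_min + w), round(y_min + h))

print(yolo_to_pixels("0 0.41015625 0.44140625 0.0341796875 0.11328125", 2048, 1024))
```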
This dataset is made freely available for non-commercial purposes such as academic research, teaching, scientific publications, or personal experimentation. Permission is granted to use, copy, and distribute the data given that you agree to cite:
X. Li, F. Flohr, Y. Yang, H. Xiong, M. Braun, S. Pan, K. Li and D. M. Gavrila. A New Benchmark for Vision-Based Cyclist Detection. In Proc. of the IEEE Intelligent Vehicles Symposium (IV), Gothenburg, Sweden, pp.1028-1033, 2016.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Current captioning datasets focus on object-centric captions, describing the visible objects in the image, often ending up stating the obvious (for humans), e.g. "people eating food in a park". Although these datasets are useful to evaluate the ability of Vision & Language models to recognize and describe visual content, they do not support controlled experiments involving model testing or fine-tuning with more high-level captions, which humans find easy and natural to produce. For example, people often describe images based on the type of scene they depict ("people at a holiday resort") and the actions they perform ("people having a picnic"). Such concepts are based on personal experience and contribute to forming common sense assumptions. We present the High-Level Dataset, a dataset extending 14,997 images from the COCO dataset, aligned with a new set of 134,973 human-annotated (high-level) captions collected along three axes: scenes, actions, and rationales. We further extend this dataset with confidence scores collected from an independent set of readers, as well as a set of narrative captions generated synthetically by combining each of the three axes. We describe this dataset and analyse it extensively. We also present baseline results for the High-Level Captioning task.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Developing robot perception systems for handling objects in the real world requires computer vision algorithms to be carefully scrutinized with respect to the expected operating domain. This demands large quantities of ground-truth data to rigorously evaluate the performance of algorithms.
The Object Cluttered Indoor Dataset (OCID) is an RGB-D dataset containing point-wise labeled point clouds for each object. The data was captured using two ASUS Xtion PRO cameras positioned at different heights. It captures diverse settings of objects, background, context, sensor-to-scene distance, viewpoint angle, and lighting conditions. The main purpose of OCID is to allow systematic comparison of existing object segmentation methods in scenes with an increasing amount of clutter. In addition, OCID also provides ground-truth data for other vision tasks such as object classification and recognition.
OCID comprises 96 fully built-up cluttered scenes. Each scene is a sequence of labeled point clouds created by incrementally building an increasingly cluttered scene, adding one object after another. The first item in a sequence contains no objects, the second one object, and so on up to the final count of added objects.
The dataset uses 89 different objects, chosen as representatives from the Autonomous Robot Indoor Dataset (ARID) [1] classes and the YCB Object and Model Set (YCB) [2] objects.
The ARID20 subset contains scenes including up to 20 objects from ARID. The ARID10 and YCB10 subsets include cluttered scenes with up to 10 objects from ARID and the YCB objects respectively. The scenes in each subset are composed of objects from only one set at a time to maintain separation between datasets. Scene variation includes different floor (plastic, wood, carpet) and table textures (wood, orange striped sheet, green patterned sheet). The complete set of data provides 2346 labeled point-clouds.
OCID subsets are structured so that specific real-world factors can be individually assessed.
You can find all labeled pointclouds of the ARID20 dataset for the first sequence on a table recorded with the lower mounted camera in this directory:
./ARID20/table/bottom/seq01/pcd/
In addition to labeled organized point-cloud files, corresponding depth images, RGB images, and 2D label masks are available.
OCID was created using EasyLabel, a semi-automatic annotation tool for RGB-D data. EasyLabel processes recorded sequences of organized point-cloud files and exploits incrementally built-up scenes, where in each take one additional object is placed. The recorded point-cloud data is then accumulated, and the depth difference between two consecutive recordings is used to label new objects. The code is available here.
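The following toy Python snippet illustrates the depth-difference idea behind this labeling scheme on synthetic data; it is not the EasyLabel implementation, and the 1 cm threshold is an arbitrary choice.

```python
import numpy as np

# Toy illustration (not the EasyLabel implementation): given two aligned depth
# images of the same scene before and after one object was added, pixels whose
# depth decreased by more than a noise threshold are labeled as the new object.
def label_new_object(depth_before: np.ndarray, depth_after: np.ndarray,
                     threshold_m: float = 0.01) -> np.ndarray:
    valid = (depth_before > 0) & (depth_after > 0)        # ignore missing depth
    closer = (depth_before - depth_after) > threshold_m   # surface moved toward the camera
    return valid & closer                                 # boolean mask of new-object pixels

# Synthetic example: a flat surface at 1.0 m with a 5 cm tall box added on top.
before = np.full((480, 640), 1.0, dtype=np.float32)
after = before.copy()
after[200:260, 300:380] = 0.95
mask = label_new_object(before, after)
print("labeled pixels:", int(mask.sum()))  # 60 * 80 = 4800
```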
OCID data for instance recognition/classification
For ARID10 and ARID20 there is additional data available usable for object recognition and classification tasks. It contains semantically annotated RGB and depth image crops extracted from the OCID dataset.
The structure is as follows:
The data is provided by Mohammad Reza Loghmani.
If you found our dataset useful, please cite the following paper:
@inproceedings{DBLP:conf/icra/SuchiPFV19,
author = {Markus Suchi and
Timothy Patten and
David Fischinger and
Markus Vincze},
title = {EasyLabel: {A} Semi-Automatic Pixel-wise Object Annotation Tool for
Creating Robotic {RGB-D} Datasets},
booktitle = {International Conference on Robotics and Automation, {ICRA} 2019,
Montreal, QC, Canada, May 20-24, 2019},
pages = {6678--6684},
year = {2019},
crossref = {DBLP:conf/icra/2019},
url = {https://doi.org/10.1109/ICRA.2019.8793917},
doi = {10.1109/ICRA.2019.8793917},
timestamp = {Tue, 13 Aug 2019 20:25:20 +0200},
biburl = {https://dblp.org/rec/bib/conf/icra/SuchiPFV19},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
@proceedings{DBLP:conf/icra/2019,
title = {International Conference on Robotics and Automation, {ICRA} 2019,
Montreal, QC, Canada, May 20-24, 2019},
publisher = {{IEEE}},
year = {2019},
url = {http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=8780387},
isbn = {978-1-5386-6027-0},
timestamp = {Tue, 13 Aug 2019 20:23:21 +0200},
biburl = {https://dblp.org/rec/bib/conf/icra/2019},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
For any questions or issues with the OCID-dataset, feel free to contact the author:
For specific questions about the OCID-semantic crops data please contact:
[1] Loghmani, Mohammad Reza et al. "Recognizing Objects in-the-Wild: Where do we Stand?" 2018 IEEE International Conference on Robotics and Automation (ICRA) (2018): 2170-2177.
[2] Berk Calli, Arjun Singh, James Bruce, Aaron Walsman, Kurt Konolige, Siddhartha Srinivasa, Pieter Abbeel, Aaron M Dollar, Yale-CMU-Berkeley dataset for robotic manipulation research, The International Journal of Robotics Research, vol. 36, Issue 3, pp. 261–268, April 2017.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Vehicle Detection Dataset
This dataset is designed for vehicle detection tasks, featuring a comprehensive collection of images annotated for object detection. This dataset, originally sourced from Roboflow (https://universe.roboflow.com/object-detection-sn8ac/ai-traffic-system), was exported on May 29, 2025, at 4:59 PM GMT and is now publicly available on Kaggle under the CC BY 4.0 license.
The dataset is split into ../train/images, ../valid/images, and ../test/images. This dataset was created and exported via Roboflow, an end-to-end computer vision platform that facilitates collaboration, image collection, annotation, dataset creation, model training, and deployment. The dataset is part of the ai-traffic-system project (version 1) under the workspace object-detection-sn8ac. For more details, visit: https://universe.roboflow.com/object-detection-sn8ac/ai-traffic-system/dataset/1.
This dataset is ideal for researchers, data scientists, and developers working on vehicle detection and traffic monitoring systems. It can be used to: - Train and evaluate deep learning models for object detection, particularly using the YOLOv11 framework. - Develop AI-powered traffic management systems, autonomous driving applications, or urban mobility solutions. - Explore computer vision techniques for real-world traffic scenarios.
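As a starting point, here is a minimal training sketch using the Ultralytics package (which provides the YOLO11 models mentioned above). It assumes the Roboflow export includes a data.yaml describing the train/valid/test splits; the checkpoint name, epoch count, image size, and example image path are illustrative choices rather than part of the dataset.

```python
from ultralytics import YOLO  # pip install ultralytics

# Fine-tune a small pretrained YOLO11 model on this export; data.yaml is the
# dataset description file that Roboflow exports typically include.
model = YOLO("yolo11n.pt")
model.train(data="data.yaml", epochs=50, imgsz=640)

# Evaluate on the validation split and run inference on one test image.
metrics = model.val()
results = model.predict("test/images/example.jpg", conf=0.25)
print("mAP@0.5:", metrics.box.map50, "detections:", len(results[0].boxes))
```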
For advanced training notebooks compatible with this dataset, check out: https://github.com/roboflow/notebooks. To explore additional datasets and pre-trained models, visit: https://universe.roboflow.com.
The dataset is licensed under CC BY 4.0, allowing for flexible use, sharing, and adaptation, provided appropriate credit is given to the original source.
This dataset is a valuable resource for building robust vehicle detection models and advancing computer vision applications in traffic systems.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
The Brain Tumor MRI dataset, curated by Roboflow Universe, is a comprehensive dataset designed for the detection and classification of brain tumors using advanced computer vision techniques. It comprises 3,903 MRI images categorized into four distinct classes.
Each image in the dataset is annotated with bounding boxes to indicate tumor locations, facilitating precise object detection. The dataset is structured into training (70%), validation (20%), and test (10%) sets, ensuring a robust framework for model development and evaluation.
The primary goal of this dataset is to aid in the early detection and diagnosis of brain tumors, contributing to improved treatment planning and patient outcomes. By offering a diverse range of annotated MRI images, this dataset enables researchers and practitioners to develop and fine-tune computer vision models with high accuracy in identifying and localizing brain tumors.
This dataset supports multiple annotation formats, including YOLOv8, YOLOv9, and YOLOv11, making it versatile and compatible with various machine-learning frameworks. Its integration with these formats ensures real-time and efficient object detection, ideal for applications requiring timely and precise results.
By leveraging this dataset, researchers and healthcare professionals can make significant strides in developing cutting-edge AI solutions for medical imaging, ultimately supporting more effective and accurate diagnoses in clinical settings.
Jurisdictional Unit, 2022-05-21. For use with WFDSS, IFTDSS, IRWIN, and InFORM. This is a feature service which provides Identify and Copy Feature capabilities. If fast drawing at coarse zoom levels is a requirement, consider using the tile (map) service layer located at https://nifc.maps.arcgis.com/home/item.html?id=3b2c5daad00742cd9f9b676c09d03d13.
Overview
The Jurisdictional Agencies dataset is developed as a national land management geospatial layer, focused on representing wildland fire jurisdictional responsibility, for interagency wildland fire applications, including WFDSS (Wildland Fire Decision Support System), IFTDSS (Interagency Fuels Treatment Decision Support System), IRWIN (Interagency Reporting of Wildland Fire Information), and InFORM (Interagency Fire Occurrence Reporting Modules). It is intended to provide federal wildland fire jurisdictional boundaries on a national scale. The agency and unit names are an indication of the primary manager name and unit name, respectively, recognizing that:
- There may be multiple owner names.
- Jurisdiction may be held jointly by agencies at different levels of government (i.e. State and Local), especially on private lands.
- Some owner names may be blocked for security reasons.
- Some jurisdictions may not allow the distribution of owner names.
Private ownerships are shown in this layer with JurisdictionalUnitIdentifier=null, JurisdictionalUnitAgency=null, JurisdictionalUnitKind=null, and LandownerKind="Private", LandownerCategory="Private". All land inside the US country boundary is covered by a polygon. Jurisdiction for privately owned land varies widely depending on state, county, or local laws and ordinances, fire workload, and other factors, and is not available in a national dataset in most cases. For publicly held lands the agency name is the surface managing agency, such as Bureau of Land Management, United States Forest Service, etc. The unit name refers to the descriptive name of the polygon (e.g. Northern California District, Boise National Forest). These data are used to automatically populate fields on the WFDSS Incident Information page. This data layer implements the NWCG Jurisdictional Unit Polygon Geospatial Data Layer Standard.
Relevant NWCG Definitions and Standards
Unit: 2. A generic term that represents an organizational entity that only has meaning when it is contextualized by a descriptor, e.g. jurisdictional. Definition extension: when referring to an organizational entity, a unit refers to the smallest area or lowest level. Higher levels of an organization (region, agency, department, etc.) can be derived from a unit based on organization hierarchy.
Unit, Jurisdictional: The governmental entity having overall land and resource management responsibility for a specific geographical area as provided by law. Definition extension: 1) ultimately responsible for the fire report to account for statistical fire occurrence; 2) responsible for setting fire management objectives; 3) jurisdiction cannot be re-assigned by agreement; 4) the nature and extent of the incident determines jurisdiction (for example, Wildfire vs. All Hazard); 5) responsible for signing a Delegation of Authority to the Incident Commander. See also: Unit, Protecting; Landowner.
Unit Identifier: This data standard specifies the standard format and rules for Unit Identifier, a code used within the wildland fire community to uniquely identify a particular government organizational unit.
Landowner Kind & Category: This data standard provides a two-tier classification (kind and category) of landownership.
Attribute Fields
- JurisdictionalAgencyKind: Describes the type of unit jurisdiction using the NWCG Landowner Kind data standard. There are two valid values: Federal, and Other. A value may not be populated for all polygons.
- JurisdictionalAgencyCategory: Describes the type of unit jurisdiction using the NWCG Landowner Category data standard. Valid values include: ANCSA, BIA, BLM, BOR, DOD, DOE, NPS, USFS, USFWS, Foreign, Tribal, City, County, OtherLoc (other local, not in the standard), State. A value may not be populated for all polygons.
- JurisdictionalUnitName: The name of the Jurisdictional Unit. Where an NWCG Unit ID exists for a polygon, this is the name used in the Name field from the NWCG Unit ID database. Where no NWCG Unit ID exists, this is the "Unit Name" or other specific, descriptive unit name field from the source dataset. A value is populated for all polygons.
- JurisdictionalUnitID: Where it could be determined, this is the NWCG Standard Unit Identifier (Unit ID). Where it is unknown, the value is "Null". Null Unit IDs can occur because a unit may not have a Unit ID, or because one could not be reliably determined from the source data. Not every land ownership has an NWCG Unit ID. Unit ID assignment rules are available from the Unit ID standard, linked above.
- LandownerKind: The landowner kind value associated with the polygon. May be inferred from the jurisdictional agency, or by lack of a jurisdictional agency. A value is populated for all polygons. There are three valid values: Federal, Private, or Other.
- LandownerCategory: The landowner category value associated with the polygon. May be inferred from the jurisdictional agency, or by lack of a jurisdictional agency. A value is populated for all polygons. Valid values include: ANCSA, BIA, BLM, BOR, DOD, DOE, NPS, USFS, USFWS, Foreign, Tribal, City, County, OtherLoc (other local, not in the standard), State, Private.
- DataSource: The database from which the polygon originated. Be as specific as possible; identify the geodatabase name and feature class in which the polygon originated.
- SecondaryDataSource: If the Data Source is an aggregation from other sources, use this field to specify the source that supplied data to the aggregation. For example, if Data Source is "PAD-US 2.1", then for a USDA Forest Service polygon the Secondary Data Source would be "USDA FS Automated Lands Program (ALP)". For a BLM polygon in the same dataset, Secondary Source would be "Surface Management Agency (SMA)".
- SourceUniqueID: Identifier (GUID or ObjectID) in the data source. Used to trace the polygon back to its authoritative source.
- MapMethod: Controlled vocabulary to define how the geospatial feature was derived. Map method may help define data quality. MapMethod will be Mixed Methods by default for this layer as the data are from mixed sources. Valid values include: GPS-Driven; GPS-Flight; GPS-Walked; GPS-Walked/Driven; GPS-Unknown Travel Method; Hand Sketch; Digitized-Image; Digitized-Topo; Digitized-Other; Image Interpretation; Infrared Image; Modeled; Mixed Methods; Remote Sensing Derived; Survey/GCDB/Cadastral; Vector; Phone/Tablet; Other.
- DateCurrent: The last edit or update of this GIS record. Date should follow the assigned NWCG Date Time data standard, using a 24-hour clock, YYYY-MM-DDhh.mm.ssZ, ISO 8601 standard.
- Comments: Additional information describing the feature.
- GeometryID: Primary key for linking geospatial objects with other database systems. Required for every feature. This field may be renamed for each standard to fit the feature.
- JurisdictionalUnitID_sansUS: NWCG Unit ID with the "US" characters removed from the beginning. Provided for backwards compatibility.
- JoinMethod: Additional information on how the polygon was matched to information in the NWCG Unit ID database.
- LocalName: Local name for the polygon provided from PAD-US or other source.
- LegendJurisdictionalAgency: Jurisdictional Agency, but smaller landholding agencies or agencies of indeterminate status are grouped for more intuitive use in a map legend or summary table.
- LegendLandownerAgency: Landowner Agency, but smaller landholding agencies or agencies of indeterminate status are grouped for more intuitive use in a map legend or summary table.
- DataSourceYear: Year that the source data for the polygon were acquired.
Data Input
This dataset is based on an aggregation of 4 spatial data sources: Protected Areas Database US (PAD-US 2.1), data from Bureau of Indian Affairs regional offices, the BLM Alaska Fire Service/State of Alaska, and Census Block-Group geometry. NWCG Unit ID and Agency Kind/Category data are tabular and sourced from UnitIDActive.txt in the WFMI Unit ID application (https://wfmi.nifc.gov/unit_id/Publish.html). Areas with unknown Landowner Kind/Category and Jurisdictional Agency Kind/Category are assigned LandownerKind and LandownerCategory values of "Private" by use of the non-water polygons from the Census Block-Group geometry.
PAD-US 2.1: This dataset is based in large part on the USGS Protected Areas Database of the United States (PAD-US 2.1). PAD-US is a compilation of authoritative protected areas data between agencies and organizations that ultimately results in a comprehensive and accurate inventory of protected areas for the United States to meet a variety of needs (e.g. conservation, recreation, public health, transportation, energy siting, ecological, or watershed assessments and planning). Extensive documentation on PAD-US processes and data sources is available.
How these data were aggregated: Boundaries, and their descriptors, available in spatial databases (i.e. shapefiles or geodatabase feature classes) from land management agencies are the desired and primary data sources in PAD-US. If these authoritative sources are unavailable, or the agency recommends another source, data may be incorporated by other aggregators such as non-governmental organizations. Data sources are tracked for each record in the PAD-US geodatabase (see below).
BIA and Tribal Data: BIA and Tribal land management data are not available in PAD-US. As such, data were aggregated from BIA regional offices. These data date from 2012 and were substantially updated in 2022. Indian Trust Land affiliated with Tribes, Reservations, or BIA Agencies: these data are not considered the system of record and are not intended to be used as such.
The Bureau of Indian Affairs (BIA), Branch of Wildland Fire Management (BWFM) is not the originator of these data. The
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
We propose the Safe Human dataset, consisting of 17 different objects, referred to as the SH17 dataset. We scraped images from the Pexels website, which offers clear usage rights for all its images (https://www.pexels.com/license/), showcasing a range of human activities across diverse industrial operations.
To extract relevant images, we used multiple queries such as manufacturing worker, industrial worker, human worker, labor, etc. The tags associated with Pexels images proved reasonably accurate. After removing duplicate samples, we obtained a dataset of 8,099 images. The dataset exhibits significant diversity, representing manufacturing environments globally, thus minimizing potential regional or racial biases. Samples of the dataset are shown below.
The data consists of three folders and two file lists:
- images: contains all images
- labels: contains labels in YOLO format for all images
- voc_labels: contains labels in VOC format for all images
- train_files.txt: the list of all images we used for training
- val_files.txt: the list of all images we used for validation
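A minimal Python sketch for pairing training images with their YOLO label files follows; it assumes train_files.txt lists one image file name per line and that label files share the image's base name with a .txt extension, which is an interpretation of the layout above.

```python
from pathlib import Path

root = Path("SH17")  # assumed extraction directory of the dataset

# Read the training file list (assumed: one image name or relative path per line).
with open(root / "train_files.txt") as f:
    train_images = [line.strip() for line in f if line.strip()]

# Pair each image with its YOLO label file of the same base name.
pairs = []
for name in train_images:
    image_path = root / "images" / Path(name).name
    label_path = root / "labels" / (Path(name).stem + ".txt")
    pairs.append((image_path, label_path))

print(len(pairs), "training samples")
print(pairs[0])
```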
This dataset, scraped from the Pexels website, is intended for educational, research, and analysis purposes only. The data may be used for training machine learning models only. Users are urged to use this data responsibly, ethically, and within the bounds of legal stipulations.
Legal Simplicity: All photos and videos on Pexels can be downloaded and used for free.
The dataset is provided "as is," without warranty, and the creator disclaims any legal liability for its use by others.
Users are encouraged to consider the ethical implications of their analyses and the potential impact on the broader community.
https://github.com/ahmadmughees/SH17dataset
@misc{ahmad2024sh17datasethumansafety,
title={SH17: A Dataset for Human Safety and Personal Protective Equipment Detection in Manufacturing Industry},
author={Hafiz Mughees Ahmad and Afshin Rahimi},
year={2024},
eprint={2407.04590},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2407.04590},
}
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Area exposed to one or more hazards represented on the hazard map used for the risk analysis of the RPP. The hazard map is the result of the study of hazards, the objective of which is to assess the intensity of each hazard at any point in the study area. The evaluation method is specific to each hazard type. It leads to the delimitation of a set of areas on the study perimeter constituting a zoning graduated according to the level of the hazard. The assignment of a hazard level at a given point in the territory takes into account the probability of occurrence of the dangerous phenomenon and its degree of intensity. For multi-hazard PPRNs, each zone is usually identified on the hazard map by a code for each hazard to which it is exposed.
All hazard areas shown on the hazard map are included. Areas protected by protective structures must be represented (possibly in a specific way), as they are always considered to be subject to hazard (in case of breakage or inadequacy of the structure). The hazard zones may be regarded as derived data insofar as they result from a synthesis of several sources of calculated, modelled, or observed hazard data. These source data are not covered by this object class but by another standard dealing with the knowledge of hazards. Some areas of the study perimeter are considered "zero or insignificant hazard zones". These are areas where the hazard has been studied and is nil. These areas are not included in the object class and do not have to be represented as hazard zones. However, in the case of natural RPPs, regulatory zoning may classify certain areas not exposed to hazard as prescribing areas (see the definition of the PPR class).
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
The Market_Basket_Optimisation dataset is a classic transactional dataset often used in association rule mining and market basket analysis.
It consists of multiple transactions where each transaction represents the collection of items purchased together by a customer in a single shopping trip.
Market_Basket_Optimisation.csv: example transaction rows (simplified):

| Item 1 | Item 2 | Item 3 | Item 4 | ... |
|---|---|---|---|---|
| Bread | Butter | Jam | | |
| Mineral water | Chocolate | Eggs | Milk | |
| Spaghetti | Tomato sauce | Parmesan | | |
Here, empty cells mean no item was purchased in that slot.
This dataset is frequently used in data mining, analytics, and recommendation systems. Common applications include:
- Association rule mining (Apriori, FP-Growth), e.g. {Bread, Butter} → {Jam} with high support and confidence (see the sketch after this list)
- Product affinity analysis
- Recommendation engines
- Marketing campaigns
- Inventory management

Limitations of the data:
- No customer identifiers
- No timestamps
- No quantities or prices
- Sparse and noisy
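As an example of the association-rule use case, here is a short sketch using the mlxtend implementation of Apriori; the file name comes from the description above, while the support and confidence thresholds are arbitrary illustrative values.

```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Read the CSV: one transaction per row, a variable number of items per row.
transactions = []
with open("Market_Basket_Optimisation.csv") as f:
    for line in f:
        items = [item.strip() for item in line.split(",") if item.strip()]
        if items:
            transactions.append(items)

# One-hot encode the transactions and mine frequent itemsets.
encoder = TransactionEncoder()
onehot = encoder.fit(transactions).transform(transactions)
baskets = pd.DataFrame(onehot, columns=encoder.columns_)

frequent_itemsets = apriori(baskets, min_support=0.02, use_colnames=True)  # thresholds are illustrative
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.3)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]].head())
```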
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
The Cartographic Sign Detection Dataset (CaSiDD) comprises 796 manually annotated historical map samples, corresponding to 18,750 cartographic signs, such as icons and symbols. Moreover, the signs are categorized into 24 distinct classes, such as tree, mill, hill, religious edifice, or grave. The original images are part of the Semap dataset [1].
The dataset is published in the context of R. Petitpierre's PhD thesis: Studying Maps at Scale: A Digital Investigation of Cartography and the Evolution of Figuration [2]. Details on the annotation process and statistics on the annotated cartographic signs are provided in the manuscript.
The data is organized following the COCO dataset format.
project_root/
├── classes.txt
├── images/
│   ├── train/
│   │   ├── image1.png
│   │   └── image2.png
│   └── val/
│       ├── image3.png
│       └── image4.png
└── labels/
    ├── train/
    │   ├── image1.txt
    │   └── image2.txt
    └── val/
        ├── image3.txt
        └── image4.txt
The labels are stored in separate text files, one for each image. In the text files, object classes and coordinates are stored line by line, using the following syntax:
class_id x_center y_center width height
Where x is the horizontal axis. The dimensions are expressed relative to the size of the labeled image. Example:
13 0.095339 0.271003 0.061719 0.027161
1 0.154258 0.490052 0.017370 0.019010
8 0.317982 0.556484 0.017370 0.014063
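For convenience, a small Python sketch that reads one label file and prints the detected signs by name; it assumes classes.txt contains one "class_id name" pair per line, matching the class list below, and the file paths are taken from the example tree above.

```python
from pathlib import Path

root = Path("project_root")  # extraction directory, name taken from the tree above

# Map class ids to names; classes.txt is assumed to hold one "class_id name" pair per line.
class_names = {}
for line in (root / "classes.txt").read_text().splitlines():
    if line.strip():
        idx, name = line.split(maxsplit=1)
        class_names[int(idx)] = name.strip()

# Print the annotated signs of one training image.
for line in (root / "labels" / "train" / "image1.txt").read_text().splitlines():
    cls, cx, cy, w, h = line.split()
    print(class_names[int(cls)], cx, cy, w, h)
```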
0 battlefield
1 tree
2 train (e.g. wagon)
3 mill (watermill or windmill)
4 bridge
5 settlement or building
6 army
7 grave
8 bush
9 marsh
10 grass
11 vine
12 religious monument
13 hill/mountain
14 cannon
15 rock
16 tower
17 signal or survey point
18 gate (e.g. city gate)
19 ship/boat/shipwreck
20 station (e.g. metro/tram/train station)
21 dam/lock
22 harbor
23 well/basin/reservoir
24 miscellaneous (e.g. post office, spring, hospital, school, etc.)
A YOLOv10 model yolov10_single_class_model.pt, trained as described in [2], is provided for convenience and reproducibility. The model does not support multi-class object detection. The YOLOv10 implementation used is distributed by Ultralytics [3].
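Below is a minimal inference sketch with the provided model, using the Ultralytics package mentioned above; the image path and confidence threshold are illustrative, and the printed fields are standard Ultralytics result attributes rather than anything specific to CaSiDD.

```python
from ultralytics import YOLO  # pip install ultralytics

# Load the single-class sign detector provided with the dataset.
model = YOLO("yolov10_single_class_model.pt")

# Detect signs on one validation image (path and threshold are illustrative).
results = model.predict("images/val/image3.png", conf=0.25)
for box in results[0].boxes:
    x1, y1, x2, y2 = box.xyxy[0].tolist()
    print(f"sign at ({x1:.0f}, {y1:.0f})-({x2:.0f}, {y2:.0f}), score {float(box.conf):.2f}")
```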
Number of distinct classes: 24 + misc
Number of image samples: 796
Number of annotations: 18,750
Study period: 1492โ1948.
For any mention of this dataset, please cite:
@misc{casidd_petitpierre_2025,
  author = {Petitpierre, R{\'{e}}mi and Jiang, Jiaming},
  title = {{Cartographic Sign Detection Dataset (CaSiDD)}},
  year = {2025},
  publisher = {EPFL},
  url = {https://doi.org/10.5281/zenodo.16278380}
}

@phdthesis{studying_maps_petitpierre_2025,
  author = {Petitpierre, R{\'{e}}mi},
  title = {{Studying Maps at Scale: A Digital Investigation of Cartography and the Evolution of Figuration}},
  year = {2025},
  school = {EPFL}
}
Rémi PETITPIERRE - remi.petitpierre@epfl.ch - ORCID - Github - Scholar - ResearchGate
85% of the data were annotated by RP. The remainder was annotated by JJ, a master's student from EPFL, Switzerland.
This project is licensed under the CC BY 4.0 License. See the license_images file for details about the respective reuse policy of digitized map images.
We do not assume any liability for the use of this dataset.
InternData-A1
InternData-A1 is a hybrid synthetic-real manipulation dataset containing over 630k trajectories and 7,433 hours across 4 embodiments, 18 skills, 70 tasks, and 227 scenes, covering rigid, articulated, deformable, and fluid-object manipulation.
… See the full description on the dataset page: https://huggingface.co/datasets/InternRobotics/InternData-A1.
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
The UFBA-425 dataset is designed to support object detection tasks with a variety of unique classes. The dataset contains 15 images and 32 distinct classes identified by numeric codes. Each class corresponds to specific objects that are visually identifiable. The goal is to annotate each object class based on its visual characteristics and ensure precision in object detection.
Class 11 represents objects characterized by their upright, elongated shape, often found in specific environments such as industrial or outdoor landscapes.
Annotate the entire elongated structure, ensuring to include any visible base or fixture connecting it to the ground. Do not annotate partial or obscured sections unless identifiable.
Class 12 objects are distinguished by their flat, rectangular surfaces and sharp, distinct edges. Often used in man-made structures.
Outline the boundaries of the flat surfaces, paying attention to capture all four corners precisely. Avoid annotating if these objects are stacked, unless clear separation is visible.
Objects in class 13 include spheres or rounded shapes that maintain symmetry from multiple perspectives.
Focus on capturing the outer contour of the spherical shape. Ensure to capture the entirety of the outline even if it extends partially behind another object.
Class 14 consists of objects with complex, irregular outlines, often with a textured surface.
Detail the contour of these complex objects accurately, including any protrusions. Avoid over-simplifying the shape and ensure internal segments remain unannotated unless distinct.
Class 15 covers objects with multiple geometric components arranged in a symmetrical pattern.
Annotate each geometric component, ensuring alignment is consistent with the overall pattern. Do not separate annotations unless components differ from the pattern.
Class 16 objects feature prominently in vertical settings with a consistent width throughout.
Capture the full height of the object, including its base connection. Avoid annotating if the object is severely obstructed or if identification is uncertain.
This class includes objects that are commonly found in pairs or groups, exhibiting symmetry.
Annotate each individual component in the pair or group, ensuring each is distinctly identified. Do not join annotations unless the components are physically connected.
Objects with class 18 are identified by their bright surfaces and reflective properties.
Highlight the reflective surfaces, ensuring boundaries are clearly defined. Exclude reflections not originating from the object itself.
Class 21 is dedicated to static objects that have a fixed presence in their environment.
Identify the static object's position, from ground level to visible extent. Do not include dynamic objects in close proximity unless physically connected.
These objects are characterized by dynamic shapes, often fluctuating in form while maintaining a recognizable profile.
Document the entirety of the object in its current shape, focusing on its most defined features. Avoid annotating incomplete forms or shapes without definitive boundaries.
Class 23 involves horizontally extended objects with a shallow vertical profile.
Delineate the horizontal length meticulously, ensuring the full span is captured. Ignore vertical deviations that do not contribute to the primary horizontal feature.
Objects defined by a central core with surrounding features that taper or extend outward.
The annotation should include the core and tapering features while ensuring the central portion maintains prominence. Avoid isolating peripheral elements unless completely detached.
Objects in class 25 consist of layered elements, oriented either vertically or horizontally.
Each layer should be defined distinctly, with annotat
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Area exposed to one or more hazards represented on the hazard map used for the risk analysis of the RPP. The hazard map is the result of the study of hazards, the objective of which is to assess the intensity of each hazard at any point in the study area. The evaluation method is specific to each hazard type. It leads to the delimitation of a set of areas on the study perimeter constituting a zoning graduated according to the level of the hazard. The assignment of a hazard level at a given point in the territory takes into account the probability of occurrence of the dangerous phenomenon and its degree of intensity. For PPRTs, the hazard levels are determined effect by effect on maps by type of effect, and overall at an aggregated level on a synthesis map. All hazard areas shown on the hazard map are included. Areas protected by protective structures must be represented (possibly in a specific way), as they are always considered subject to hazard (in case of breakage or inadequacy of the structure). Hazard zones can be regarded as derived data insofar as they result from a synthesis of multiple sources of calculated, modelled, or observed hazard data. These source data are not covered by this object class but by another standard dealing with the knowledge of hazards. Some areas within the study area are considered "no or insignificant hazard zones". These are areas where the hazard has been studied and is nil. These areas are not included in the object class and do not have to be represented as hazard zones.
The Highway Capacity Manual (HCM) historically has been among the most important reference guides used by transportation professionals seeking a systematic basis for evaluating the capacity, level of service, and performance measures for elements of the surface transportation system, particularly highways but also other modes. The objective of this project was to determine how data and information on the impacts of differing causes of nonrecurrent congestion (incidents, weather, work zones, special events, etc.) in the context of highway capacity can be incorporated into the performance measure estimation procedures contained in the HCM. The methodologies contained in the HCM for predicting delay, speed, queuing, and other performance measures for alternative highway designs are not currently sensitive to traffic management techniques and other operation/design measures for reducing nonrecurrent congestion. A further objective was to develop methodologies to predict travel time reliability on selected types of facilities and within corridors. This project developed new analytical procedures and prepared chapters about freeway facilities and urban streets for potential incorporation of travel-time reliability into the HCM. The methods are embodied in two computational engines, and a final report documents the research. This zip file contains comma separated value (.csv) files of data to support SHRP 2 report S2-L08-RW-1, Incorporating travel time reliability into the Highway Capacity Manual. Zip size is 1.83 MB. Files were accessed in Microsoft Excel 2016. Data will be preserved as is. To view publication see: https://rosap.ntl.bts.gov/view/dot/3606
These data were automated to provide an accurate high-resolution historical shoreline of Klamath River, CA suitable as a geographic information system (GIS) data layer. These data are derived from shoreline maps that were produced by the NOAA National Ocean Service including its predecessor agencies which were based on an office interpretation of imagery and/or field survey. The NGS attribution scheme 'Coastal Cartographic Object Attribute Source Table (C-COAST)' was developed to conform the attribution of various sources of shoreline data into one attribution catalog. C-COAST is not a recognized standard, but was influenced by the International Hydrographic Organization's S-57 Object-Attribute standard so the data would be more accurately translated into S-57. This resource is a member of https://inport.nmfs.noaa.gov/inport/item/39808