100+ datasets found
  1. Gryphon Urban dataset

    Semantic Segmentation dataset for autonomous driving

    • kaggle.com
    zip
    Updated Jan 30, 2023
    Cite
    evdoteo (2023). Gryphon Urban dataset [Dataset]. https://www.kaggle.com/datasets/evdoteo/gryphon-dataset
    Explore at:
    zip (786664524 bytes)
    Dataset updated
    Jan 30, 2023
    Authors
    evdoteo
    Description

    Context

    The Gryphon Urban dataset contains raw and labeled images taken with a Bumblebee stereo camera mounted on a car's hood and driven through the city of Xanthi, Greece. The dataset includes labeled semantic segmentation images along with geolocation details for each image.

    Content

    This dataset has 8868 unlabeled images, captured across 3 different recording hours, and 291 semantically labeled images randomly selected from the dataset. Each image is 640 x 480 pixels and is linked to geolocation details stored in the accompanying .csv file. A .txt file with the camera calibration is included, along with a PDF containing the bibliography needed if a depth map is to be extracted.
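
    As a rough illustration, a minimal sketch for pairing each image with its geolocation row might look like the following; the directory layout and column names are hypothetical, since the exact CSV schema is only described informally above.

    # Hypothetical sketch: pair each image with its geolocation row from the CSV.
    # File paths and column names are assumptions, not taken from the dataset itself.
    import os
    import pandas as pd

    geo = pd.read_csv("gryphon/geolocation.csv")   # assumed path
    image_dir = "gryphon/images"                   # assumed path

    records = []
    for _, row in geo.iterrows():
        img_path = os.path.join(image_dir, str(row["image_name"]))  # assumed column
        if os.path.exists(img_path):
            # latitude/longitude are assumed column names
            records.append((img_path, row.get("latitude"), row.get("longitude")))

    print(f"{len(records)} images matched with geolocation entries")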

    Acknowledgements

    This dataset is part of my diploma dissertation on autonomous vehicles, and it would not have been feasible without the help of Vasia Balaska, PhD candidate at the Laboratory of Robotics and Automation, DUTH.

    License

    This dataset is made freely available to academic and non-academic entities for non-commercial purposes such as academic research, teaching, scientific publications, or personal experimentation. Permission is granted to use the data provided that you agree:

    1) That you do not distribute this dataset or modified versions of it. It is permissible to distribute derivative works insofar as they are abstract representations of this dataset (such as models trained on it, or additional annotations that do not directly include any of our data) and do not allow the dataset, or anything similar in character, to be recovered.

    2) That you may not use the dataset or any derivative work for commercial purposes, such as licensing or selling the data, or using the data with the intent of procuring commercial gain.

    Dataset Citation:

    Balaska, V., Theodoridis, E., Papapetros, I. T., Tsompanoglou, C., Bampis, L., & Gasteratos, A. (2023). Semantic communities from graph-inspired visual representations of cityscapes. Automation, 4(1), 110-122.

    Inspiration

    Can we build a truly autonomous vehicle with the current state of technology?

  2. 3.1_OALGreece_ Area of Forest Cover - Datasets - OPERANDUM

    • data-catalogue.operandum-project.eu
    Updated Oct 24, 2023
    Cite
    (2023). 3.1_OALGreece_ Area of Forest Cover - Datasets - OPERANDUM [Dataset]. https://data-catalogue.operandum-project.eu/dataset/3-1_oalgreece_-area-of-forest-cover
    Explore at:
    Dataset updated
    Oct 24, 2023
    Description

    This dataset contains geographical information and a processed table showing the area of forest land, according to Corine categorization, for the basin of Spercheios. You are not authorized to view this dataset. You may email the responsible party ITC to request access.

  3. Current Landuse in OALs - Datasets - OPERANDUM

    • data-catalogue.operandum-project.eu
    Updated Feb 2, 2022
    Cite
    (2022). Current Landuse in OALs - Datasets - OPERANDUM [Dataset]. https://data-catalogue.operandum-project.eu/dataset/current-landuse-in-oals
    Explore at:
    Dataset updated
    Feb 2, 2022
    Description

    This dataset contains the current status of land use in the OALs, as reported in deliverable D3.1 (WP3) of the OPERANDUM project. The dataset is in MS Excel format. You are not authorized to view this dataset. You may email the responsible party OPERANDUM to request access.

  4. Published Public Datasets

    • mydata.iowa.gov
    csv, xlsx, xml
    Updated Nov 23, 2025
    Cite
    (2025). Published Public Datasets [Dataset]. https://mydata.iowa.gov/dataset/Published-Public-Datasets/dwf6-q2v7
    Explore at:
    xlsx, xml, csv
    Dataset updated
    Nov 23, 2025
    Description

    This filtered view provides a list of datasets that have been published and where a public audience has been authorized. Not all public datasets may be federated to the public portal.

    This filtered view is used to identify such datasets that have been recently created or updated.

  5. AUTH-OpenDR ACelebA Dataset

    • data.europa.eu
    unknown
    Updated Dec 28, 2021
    Cite
    Zenodo (2021). AUTH-OpenDR ACelebA Dataset [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-5809273?locale=bg
    Explore at:
    unknown (120777)
    Dataset updated
    Dec 28, 2021
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Multi-view Facial Image Dataset Based on CelebA: A dataset of facial images from several viewing angles was created by Aristotle University of Thessaloniki based on the CelebA image dataset, using software developed in the OpenDR H2020 research project based on this paper and the respective code provided by the authors. CelebA is a large-scale facial dataset consisting of 202,599 facial images of 10,177 celebrities captured in the wild. The new dataset, namely AUTH-OpenDR Augmented CelebA (AUTH-OpenDR ACelebA), was generated from 140,000 facial images corresponding to 9,161 persons, i.e. a subset of CelebA. For each CelebA image used, 13 synthetic images were generated by yaw-axis camera rotation in the interval [0°, +60°] with step +5°. Moreover, 10 synthetic images generated by pitch-axis camera rotation in the interval [0°, +45°] with step +5° were also created for each facial image of the aforementioned dataset. Since the CelebA license does not allow distribution of derivative works, we do not make ACelebA directly available, but instead provide instructions and scripts for recreating it.

  6. Crops Disease Dataset

    • kaggle.com
    zip
    Updated Jul 13, 2025
    Cite
    Vishesh2395 (2025). Crops Disease Dataset [Dataset]. https://www.kaggle.com/datasets/vishesh2395/crops-disease-dataset
    Explore at:
    zip (2134609420 bytes)
    Dataset updated
    Jul 13, 2025
    Authors
    Vishesh2395
    Description

    Dataset Composition

    The dataset is meticulously organized by crop and condition, ensuring a clear and balanced structure for training. The classes are categorized as follows:

    Wheat (15 Classes):

    Wheat_Aphid, Wheat_BlackRust, Wheat_Blast, Wheat_BrownRust, Wheat_CommonRootRot, Wheat_FusariumHeadBlight, Wheat_LeafBlight, Wheat_Mildew, Wheat_Mite, Wheat_Septoria, Wheat_Smut, Wheat_Stemfly, Wheat_Tanspot, Wheat_YellowRust, and a baseline Wheat_Healthy.

    Rice (4 Classes):

    Rice_BrownSpot, Rice_Hispa, Rice_LeafBlast, and a baseline Rice_Healthy.

    Potato (3 Classes):

    Potato_Early_Blight, Potato_Late_Blight, and a baseline Potato_Healthy.

    Corn / Maize (4 Classes):

    Corn_Common_Rust, Corn_Gray_Leaf_Spot, Corn_Northern_Leaf_Blight, and a baseline Corn_Healthy.

    Methodology

    The dataset was created by aggregating images from various agricultural research repositories and public databases. Each image was manually verified for accuracy to ensure it correctly represents the visual symptoms of the specific disease. The collection was then structured into class-specific folders. During the training pipeline, all images are resized to a uniform 224x224 pixel dimension to provide a consistent input for the neural network. The scale and diversity of this dataset allow the resulting model to be a far more powerful and useful tool, capable of serving a broader community of farmers with different crops.
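
    For illustration, a minimal sketch of that resizing step, assuming the class-specific folder layout described above and using torchvision; the paths are hypothetical.

    # Sketch of the 224x224 resizing pipeline described above.
    # Assumes an ImageFolder-style layout (one folder per class); paths are hypothetical.
    from torchvision import datasets, transforms
    from torch.utils.data import DataLoader

    preprocess = transforms.Compose([
        transforms.Resize((224, 224)),   # uniform input size for the network
        transforms.ToTensor(),
    ])

    train_set = datasets.ImageFolder("crops_disease/train", transform=preprocess)
    train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

    print(f"{len(train_set.classes)} classes, {len(train_set)} images")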

  7. Gold Insights Dataset (2020–2023)

    • kaggle.com
    zip
    Updated Mar 9, 2025
    Cite
    Vesela Gencheva (2025). Gold Insights Dataset (2020–2023) [Dataset]. https://www.kaggle.com/datasets/veselagencheva/gold-insights-dataset-20202023
    Explore at:
    zip (5748 bytes)
    Dataset updated
    Mar 9, 2025
    Authors
    Vesela Gencheva
    Description

    This dataset offers a comprehensive view of global gold-related trends and metrics for the period 2020–2023. It is organized into several interrelated components, making it highly valuable for analyzing the role of gold in different sectors and its relationship with broader economic trends. The dataset contains the following files:

    1. Gold Demand:
    • Demand per Sector: Tracks gold demand in key sectors, including jewelry, investment, central banks, and technology, over time.
    • Demand per Quarter: Provides quarterly demand data globally for greater temporal granularity.
    • Yearly Demand by Country: Breaks down annual gold demand by individual countries.

    2. Gold Reserves:
    • Gold Reserves in Tonnes per Country: Highlights gold holdings by central banks, expressed in tonnes, for countries worldwide during the period.

    3. Gold Jewelry:
    • Gold Jewelry Demand by Country: Focuses on country-specific demand for gold jewelry, providing insights into cultural and economic patterns.

    The dataset is sourced from reliable and recognized industry databases and is designed to support a wide range of analyses, including demand trends, international comparisons, and the relationship between gold reserves and other economic indicators.

    Licensing: This dataset is sourced from the World Gold Council's website (Gold Demand & Supply by Country | World Gold Council). The data is provided for general informational and educational purposes only. You are permitted to save, display, or print out this dataset strictly for personal, non-commercial use. Modifying, copying, scraping, distributing, reproducing, or using this dataset for commercial purposes is prohibited without prior written authorization from the WGC. To request authorization, please contact the WGC at info@gold.org.

  8. Historical Medallion Vehicles - Authorized

    • catalog.data.gov
    • data.cityofnewyork.us
    • +2more
    Updated Sep 2, 2023
    + more versions
    Cite
    data.cityofnewyork.us (2023). Historical Medallion Vehicles - Authorized [Dataset]. https://catalog.data.gov/dataset/historical-medallion-vehicles-authorized
    Explore at:
    Dataset updated
    Sep 2, 2023
    Dataset provided by
    data.cityofnewyork.us
    Description

    This list contains historical information on the status of current medallion vehicles authorized to operate in New York City. This list is accurate to the date and time represented in the Last Date Updated and Last Time Updated fields. For inquiries about the contents of this dataset, please email licensinginquiries@tlc.nyc.gov. To view the latest list please visit https://data.cityofnewyork.us/Transportation/Medallion-Vehicles-Authorized/rhe8-mgbb/data.

  9. Satellite Electric Vehicle Dataset (TESLA, LUCID, RIVIAN)

    • datarade.ai
    .csv
    Updated Jan 21, 2023
    Cite
    Space Know (2023). Satellite Electric Vehicle Dataset (TESLA,LUCID, RIVIAN [Dataset]. https://datarade.ai/data-products/satellite-electric-vehicle-dataset-tesla-lucid-rivian-space-know
    Explore at:
    .csv
    Dataset updated
    Jan 21, 2023
    Dataset authored and provided by
    Space Know
    Area covered
    United States of America, China
    Description

    SpaceKnow uses satellite synthetic aperture radar (SAR) data to capture activity at electric vehicle and automotive factories.

    Data is updated daily, has an average lag of 4-6 days, and history back to 2017.

    The insights provide you with level and change data that monitors the area which is covered with assembled light vehicles in square meters.

    We offer 3 delivery options: CSV, API, and Insights Dashboard

    Available companies:

    • Rivian (NASDAQ: RIVN): employee parking, logistics, logistics centers, product distribution & product in the US (see the use-case write-up on page 4)
    • TESLA (NASDAQ: TSLA): indices for product, logistics & employee parking for Fremont, Nevada, Shanghai, Texas, Berlin, and at the global level
    • Lucid Motors (NASDAQ: LCID): employee parking, logistics & product in the US

    Why get SpaceKnow's EV datasets?

    Monitor the company’s business activity: Near-real-time insights into the business activities of Rivian allow users to better understand and anticipate the company’s performance.

    Assess Risk: Use satellite activity data to assess the risks associated with investing in the company.

    Types of Indices Available

    The Continuous Feed Index (CFI) is a daily aggregation of the area of metallic objects in square meters. There are two types of CFI indices. The first is CFI-R, which gives you level data: it shows how many square meters are covered by metallic objects (for example, assembled cars). The second is CFI-S, which gives you change data: it shows how many square meters have changed within the locations between two consecutive satellite images.

    How to interpret the data

    SpaceKnow indices can be compared with related economic indicators or KPIs. If the economic indicator is reported monthly, perform a 30-day rolling sum and pick the last day of each month to compare with the indicator; each data point will then reflect approximately the sum for that month. If the economic indicator is reported quarterly, perform a 90-day rolling sum and pick the last day of each 90-day window; each data point will then reflect approximately the sum for that quarter. A minimal sketch of the monthly case is shown below.
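
    For example, a minimal pandas sketch of the monthly comparison, assuming a daily index series stored in a CSV with hypothetical file and column names:

    # Sketch: 30-day rolling sum of a daily index, sampled at each month end,
    # for comparison against a monthly economic indicator. File and column names are assumptions.
    import pandas as pd

    df = pd.read_csv("spaceknow_cfi.csv", parse_dates=["date"]).set_index("date")
    daily = df["cfi_s"].sort_index()            # hypothetical change-index column

    rolling_30d = daily.rolling("30D").sum()    # approximate monthly aggregate
    monthly = rolling_30d.resample("M").last()  # last day of each month ("ME" on newer pandas)

    print(monthly.tail())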

    Product index: Monitors the area covered by manufactured cars. The larger the area covered by assembled cars, the larger and faster the production of a particular facility. The index rises as production increases.

    Product distribution index: Monitors the area covered by assembled cars that are ready for distribution. The index covers locations in the Rivian factory. The distribution is done via trucks and trains.

    Employee parking index: Like the previous index, this one indicates the area covered by cars, but those belonging to factory employees. This index is a good indicator of factory construction, closures, and capacity utilization. The index rises as more employees work in the factory.

    Logistics index: Monitors the movement of materials supply trucks at particular car factories.

    Logistics Centers index: Monitors the movement of supply trucks at warehouses.

    Where the data comes from: SpaceKnow brings you information advantages by applying machine learning and AI algorithms to synthetic aperture radar and optical satellite imagery. The company’s infrastructure searches and downloads new imagery every day, and the computations of the data take place within less than 24 hours.

    In contrast to traditional economic data, which are released in monthly and quarterly terms, SpaceKnow data is high-frequency and available daily. It is possible to observe the latest movements in the EV industry with just a 4-6 day lag, on average.

    The EV data help you to estimate the performance of the EV sector and the business activity of the selected companies.

    The backbone of SpaceKnow’s high-quality data is the locations from which data is extracted. All locations are thoroughly researched and validated by an in-house team of annotators and data analysts.

    Each individual location is precisely defined so that the resulting data does not contain noise such as surrounding traffic or changing vegetation with the season.

    We use radar imagery and our own algorithms, so the final indices are not devalued by weather conditions such as rain or heavy clouds.

    → Reach out to get a free trial

    Use Case - Rivian:

    SpaceKnow uses Rivian's quarterly production and delivery data as a benchmark. Rivian targeted production of 25,000 cars in 2022. To achieve this target, the company had to increase production by 45%, producing 10,683 cars in Q4. However, actual Q4 production was 10,020 cars, so the target was narrowly missed, with total production reaching 24,337 cars for FY22.

    SpaceKnow indices help us to observe the company’s operations, and we are able to monitor if the company is set to meet its forecasts or not. We deliver five different indices for Rivian, and these indices observe logistic centers, employee parking lot, logistics, product, and prod...

  10. Mosquito 2022 DINS Public View El Dorado County

    • catalog.data.gov
    • data.ca.gov
    • +6more
    Updated Jul 24, 2025
    + more versions
    Cite
    CAL FIRE (2025). Mosquito 2022 DINS Public View El Dorado County [Dataset]. https://catalog.data.gov/dataset/mosquito-2022-dins-public-view-el-dorado-county-1564b
    Explore at:
    Dataset updated
    Jul 24, 2025
    Dataset provided by
    CAL FIRE
    Area covered
    El Dorado County
    Description

    This database was designed in response to the Director Memorandum: "Effective January 1, 2019 all structure greater than 120 square feet in the State Responsibility Area (SRA) damaged by wildfire will be inspected and documented in the DINS Collector App."

    To document a structure damaged or destroyed by the Mosquito wildland fire, open the associated Field Map app.

    NOTE: this feature service is configured to not allow record deletion. If a record needs to be deleted, contact the program manager below.

    This is the schema developed and used by the CAL FIRE Office of State Fire Marshal to assess and record structure damage on wildland fire incidents. The schema is designed to be configured in the Esri Collector/Field Maps app for data collection during or after an incident.

  11. DAQUAR Dataset (Processed) for VQA

    • kaggle.com
    zip
    Updated Jan 19, 2022
    Cite
    Tezan Sahu (2022). DAQUAR Dataset (Processed) for VQA [Dataset]. https://www.kaggle.com/datasets/tezansahu/processed-daquar-dataset/code
    Explore at:
    zip (430733804 bytes)
    Dataset updated
    Jan 19, 2022
    Authors
    Tezan Sahu
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    The first significant Visual Question Answering (VQA) dataset was the DAtaset for QUestion Answering on Real-world images (DAQUAR). It contains 6794 training and 5674 test question-answer pairs, based on images from the NYU-Depth V2 Dataset. That means about 9 pairs per image on average.

    This dataset is a processed version of the Full DAQUAR Dataset where the questions have been normalized (for easier consumption by tokenizers) & the image IDs, questions & answers are stored in a tabular (CSV) format, which can be loaded & used as-is for training VQA models.

    Content

    This dataset contains the processed DAQUAR Dataset (full), along with some of the raw files from the original dataset.

    Processed data (a minimal loading sketch follows the raw-file list below):

    • data.csv: the processed dataset after normalizing all the questions & converting the {question, answer, image_id} data into a tabular format for easier consumption.
    • data_train.csv: the records from data.csv which correspond to images present in train_images_list.txt.
    • data_eval.csv: the records from data.csv which correspond to images present in test_images_list.txt.
    • answer_space.txt: a list of all possible answers extracted from all_qa_pairs.txt (this allows the VQA task to be modelled as a multi-class classification problem).

    Raw files:

    • all_qa_pairs.txt
    • train_images_list.txt
    • test_images_list.txt
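
    Given the processed files above, a minimal loading sketch might look like the following; the paths are hypothetical, and the column names follow the {question, answer, image_id} description but should be checked against the actual CSV header.

    # Sketch: load the processed DAQUAR CSVs and the answer space for
    # multi-class classification. Paths and exact column names are assumptions.
    import pandas as pd

    train_df = pd.read_csv("processed-daquar/data_train.csv")
    eval_df = pd.read_csv("processed-daquar/data_eval.csv")

    with open("processed-daquar/answer_space.txt") as f:
        answer_space = [line.strip() for line in f if line.strip()]

    # Map each answer string to a class index for a classification head.
    answer_to_idx = {ans: i for i, ans in enumerate(answer_space)}
    train_df["label"] = train_df["answer"].map(answer_to_idx)   # assumed column name

    print(len(train_df), len(eval_df), len(answer_space))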

    Acknowledgements

    Malinowski, Mateusz, and Mario Fritz. "A multi-world approach to question answering about real-world scenes based on uncertain input." Advances in neural information processing systems 27 (2014): 1682-1690.

  12. Mill Mountain 2022 DINS Public View

    • datasets.ai
    • data.ca.gov
    • +6more
    15, 21, 25, 3, 57, 8
    Updated Mar 29, 2024
    + more versions
    Cite
    State of California (2024). Mill Mountain 2022 DINS Public View [Dataset]. https://datasets.ai/datasets/mill-mountain-2022-dins-public-view-aa2f2
    Explore at:
    15, 57, 8, 21, 25, 3
    Dataset updated
    Mar 29, 2024
    Dataset authored and provided by
    State of California
    Description

    This database was designed in response to the Director Memorandum - "Effective January 1, 2019 all structure greater than 120 square feet in the State Responsibility Area (SRA) damaged by wildfire will be inspected and documented in the DINS Collector App."


    To document a structure damaged or destroyed by the Mill wildland fire, open the associated Field Map app.

    NOTE: this feature service is configured to not allow record deletion. If a record needs to be deleted, contact the program manager below.

    This is the schema developed and used by the CAL FIRE Office of State Fire Marshal to assess and record structure damage on wildland fire incidents. The schema is designed to be configured in the Esri Collector/Field Maps app for data collection during or after an incident.

  13. OCID – Object Clutter Indoor Dataset

    • researchdata.tuwien.at
    application/gzip
    Updated Jul 3, 2025
    + more versions
    Cite
    Jean-Baptiste Nicolas Weibel; Markus Suchi (2025). OCID – Object Clutter Indoor Dataset [Dataset]. http://doi.org/10.48436/pcbjd-4wa12
    Explore at:
    application/gzip
    Dataset updated
    Jul 3, 2025
    Dataset provided by
    TU Wien
    Authors
    Jean-Baptiste Nicolas Weibel; Markus Suchi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    May 20, 2019
    Description

    OCID – Object Clutter Indoor Dataset

    Developing robot perception systems for handling objects in the real world requires computer vision algorithms to be carefully scrutinized with respect to the expected operating domain. This demands large quantities of ground-truth data to rigorously evaluate the performance of algorithms.

    The Object Cluttered Indoor Dataset is an RGB-D dataset containing point-wise labeled point clouds for each object. The data was captured using two ASUS-PRO Xtion cameras positioned at different heights. It captures diverse settings of objects, background, context, sensor-to-scene distance, viewpoint angle and lighting conditions. The main purpose of OCID is to allow systematic comparison of existing object segmentation methods in scenes with an increasing amount of clutter. In addition, OCID also provides ground-truth data for other vision tasks such as object classification and recognition.

    OCID comprises 96 fully built-up cluttered scenes. Each scene is a sequence of labeled point clouds created by building up an increasingly cluttered scene incrementally, adding one object after the other. The first item in a sequence contains no objects, the second one object, and so on up to the final count of added objects.

    Dataset

    The dataset uses 89 different objects chosen as representatives of the Autonomous Robot Indoor Dataset (ARID) [1] classes and the YCB Object and Model Set (YCB) [2] objects.

    The ARID20 subset contains scenes including up to 20 objects from ARID. The ARID10 and YCB10 subsets include cluttered scenes with up to 10 objects from ARID and the YCB objects respectively. The scenes in each subset are composed of objects from only one set at a time to maintain separation between datasets. Scene variation includes different floor (plastic, wood, carpet) and table textures (wood, orange striped sheet, green patterned sheet). The complete set of data provides 2346 labeled point-clouds.

    OCID subsets are structured so that specific real-world factors can be individually assessed.

    ARID20-structure

    • location: floor, table
    • view: bottom, top
    • scene: sequence-id
    • free: clearly separated (objects 1-9 in corresponding sequence)
    • touching: physically touching (objects 10-16 in corresponding sequence)
    • stacked: on top of each other (objects 17-20 in corresponding sequence)

    ARID10-structure

    • location: floor, table
    • view: bottom, top
    • box: objects with sharp edges (e.g. cereal-boxes)
    • curved: objects with smooth curved surfaces (e.g. ball)
    • mixed: objects from both the box and curved categories
    • fruits: fruit and vegetables
    • non-fruits: mixed objects without fruits
    • scene: sequence-id

    YCB10-structure

    • location: floor, table
    • view: bottom, top
    • box: objects with sharp edges (e.g. cereal-boxes)
    • curved: objects with smooth curved surfaces (e.g. ball)
    • mixed: objects from both the box and curved categories
    • scene: sequence-id

    Structure:

    You can find all labeled pointclouds of the ARID20 dataset for the first sequence on a table recorded with the lower mounted camera in this directory:

    ./ARID20/table/bottom/seq01/pcd/

    In addition to labeled organized point-cloud files, corresponding depth, RGB and 2d-label-masks are available:

    • pcd: 640×480 organized XYZRGBL-pointcloud file with ground truth
    • rgb: 640×480 RGB png-image
    • depth: 640×480 16-bit png-image with depth in mm
    • label: 640×480 16-bit png-image with unique integer-label for each object at each pixel
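
    For illustration, a minimal sketch for reading one frame's RGB, depth, and label files with OpenCV (IMREAD_UNCHANGED preserves the 16-bit depth and label values); the concrete file name is hypothetical.

    # Sketch: load one OCID frame. 16-bit PNGs must be read with IMREAD_UNCHANGED
    # so depth (mm) and integer object labels are not truncated to 8 bit.
    import cv2
    import numpy as np

    stem = "ARID20/table/bottom/seq01"
    name = "frame_0001.png"                                               # hypothetical file name

    rgb = cv2.imread(f"{stem}/rgb/{name}", cv2.IMREAD_COLOR)
    depth_mm = cv2.imread(f"{stem}/depth/{name}", cv2.IMREAD_UNCHANGED)   # uint16, depth in mm
    labels = cv2.imread(f"{stem}/label/{name}", cv2.IMREAD_UNCHANGED)     # uint16 object ids

    object_ids = np.unique(labels)
    print(rgb.shape, depth_mm.dtype, object_ids)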

    Dataset creation using EasyLabel:

    OCID was created using EasyLabel – a semi-automatic annotation tool for RGB-D data. EasyLabel processes recorded sequences of organized point-cloud files and exploits incrementally built-up scenes, where in each take one additional object is placed. The recorded point-cloud data is then accumulated, and the depth difference between two consecutive recordings is used to label new objects. The code is available here.

    OCID data for instance recognition/classification

    For ARID10 and ARID20 there is additional data available usable for object recognition and classification tasks. It contains semantically annotated RGB and depth image crops extracted from the OCID dataset.

    The structure is as follows:

    • type: depth, RGB
    • class name: e.g. banana, kleenex, …
    • class instance: e.g. banana_1, banana_2, kleenex_1, kleenex_2, …

    The data is provided by Mohammad Reza Loghmani.

    Research paper

    If you found our dataset useful, please cite the following paper:

    @inproceedings{DBLP:conf/icra/SuchiPFV19,
      author    = {Markus Suchi and Timothy Patten and David Fischinger and Markus Vincze},
      title     = {EasyLabel: {A} Semi-Automatic Pixel-wise Object Annotation Tool for Creating Robotic {RGB-D} Datasets},
      booktitle = {International Conference on Robotics and Automation, {ICRA} 2019, Montreal, QC, Canada, May 20-24, 2019},
      pages     = {6678--6684},
      year      = {2019},
      crossref  = {DBLP:conf/icra/2019},
      url       = {https://doi.org/10.1109/ICRA.2019.8793917},
      doi       = {10.1109/ICRA.2019.8793917},
      timestamp = {Tue, 13 Aug 2019 20:25:20 +0200},
      biburl    = {https://dblp.org/rec/bib/conf/icra/SuchiPFV19},
      bibsource = {dblp computer science bibliography, https://dblp.org}
    }

    @proceedings{DBLP:conf/icra/2019,
      title     = {International Conference on Robotics and Automation, {ICRA} 2019, Montreal, QC, Canada, May 20-24, 2019},
      publisher = {{IEEE}},
      year      = {2019},
      url       = {http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=8780387},
      isbn      = {978-1-5386-6027-0},
      timestamp = {Tue, 13 Aug 2019 20:23:21 +0200},
      biburl    = {https://dblp.org/rec/bib/conf/icra/2019},
      bibsource = {dblp computer science bibliography, https://dblp.org}
    }

    Contact & credits

    For any questions or issues with the OCID-dataset, feel free to contact the author:

    • Markus Suchi – email: suchi@acin.tuwien.ac.at
    • Tim Patten – email: patten@acin.tuwien.ac.at

    For specific questions about the OCID-semantic crops data please contact:

    • Mohammad Reza Loghmani – email: loghmani@acin.tuwien.ac.at

    References

    [1] Loghmani, Mohammad Reza et al. "Recognizing Objects in-the-Wild: Where do we Stand?" 2018 IEEE International Conference on Robotics and Automation (ICRA) (2018): 2170-2177.

    [2] Berk Calli, Arjun Singh, James Bruce, Aaron Walsman, Kurt Konolige, Siddhartha Srinivasa, Pieter Abbeel, Aaron M Dollar, Yale-CMU-Berkeley dataset for robotic manipulation research, The International Journal of Robotics Research, vol. 36, Issue 3, pp. 261 – 268, April 2017.

  14. LUMPI: The Leibniz University Multi-Perspective Intersection Dataset

    • service.tib.eu
    • data.uni-hannover.de
    Updated May 16, 2025
    Cite
    (2025). LUMPI: The Leibniz University Multi-Perspective Intersection Dataset [Dataset]. https://service.tib.eu/ldmservice/dataset/luh-lumpi
    Explore at:
    Dataset updated
    May 16, 2025
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Description

    Increasing improvements in sensor technologies as well as machine learning methods allow efficient collection, processing and analysis of the dynamic environment, which can be used for detection and tracking of traffic participants. Current datasets in this domain mostly present a single view, which prevents highly accurate pose estimation because of occlusions. The integration of different, simultaneously acquired data allows collaboration principles to be exploited and developed to increase the quality, reliability and integrity of the derived information. This work addresses this problem by providing a multi-view dataset, including 2D image information (videos) and 3D point clouds with labels of the traffic participants in the scene. The dataset was recorded during different weather and light conditions on several days at a large junction in Hanover, Germany.

    Paper
    Dataset teaser video: https://youtu.be/elwFdCu5IFo
    Dataset download path: https://data.uni-hannover.de/vault/ikg/busch/LUMPI/
    Labeling process pipeline video: https://youtu.be/Ns6qsHsb06E
    Python-SDK: https://github.com/St3ff3nBusch/LUMPI-SDK-Python
    Labeling Tool / C++ SDK: https://github.com/St3ff3nBusch/LUMPI-Labeling

  15. Green Roof data - Datasets - OPERANDUM

    • data-catalogue.operandum-project.eu
    Updated Sep 8, 2023
    Cite
    (2023). Green Roof data - Datasets - OPERANDUM [Dataset]. https://data-catalogue.operandum-project.eu/dataset/green-roof-data
    Explore at:
    Dataset updated
    Sep 8, 2023
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Green roof data along with the climatic parameters. You are not authorized to view this dataset. You may email the responsible party University College Dublin to request access.

  16. Oak 2022 DINS Public View

    • datasets.ai
    • data.cnra.ca.gov
    • +8more
    15, 21, 25, 3, 57, 8
    Updated Mar 29, 2024
    + more versions
    Cite
    State of California (2024). Oak 2022 DINS Public View [Dataset]. https://datasets.ai/datasets/oak-2022-dins-public-view-591fc
    Explore at:
    8, 57, 3, 25, 21, 15
    Dataset updated
    Mar 29, 2024
    Dataset authored and provided by
    State of California
    Description

    This database was designed in response to the Director Memorandum - "Effective January 1, 2019 all structure greater than 120 square feet in the State Responsibility Area (SRA) damaged by wildfire will be inspected and documented in the DINS Collector App."


    To document a structure damaged or destroyed by the Oak wildland fire, open the associated Field Map app.

    NOTE: this feature service is configured to not allow record deletion. If a record needs to be deleted, contact the program manager below.

    This is the schema developed and used by the CAL FIRE Office of State Fire Marshal to assess and record structure damage on wildland fire incidents. The schema is designed to be configured in the Esri Collector/Field Maps app for data collection during or after an incident.

  17. ebay E-Commerce Scrapped Datasets with Selenium

    • kaggle.com
    zip
    Updated Jun 6, 2023
    Cite
    Belayet HossainDS (2023). ebay E-Commerce Scrapped Datasets with Selenium [Dataset]. https://www.kaggle.com/datasets/belayethossainds/ebay-e-commerce-scrapped-datasets-with-selenium
    Explore at:
    zip (73664 bytes)
    Dataset updated
    Jun 6, 2023
    Authors
    Belayet HossainDS
    Description

    Welcome to the eBay E-Commerce Scraped Datasets with Selenium! This dataset is a treasure trove of valuable information, providing comprehensive data scraped from the popular e-commerce platform eBay. With the power of the Selenium web scraping tool, millions of product listings have been extracted, offering researchers, data scientists, and e-commerce enthusiasts an invaluable resource to explore and analyze.

    Key Features:

    Extensive Coverage: This dataset encompasses a vast array of product listings from various categories on eBay, ensuring a diverse range of products to analyze.

    Rich Attributes: Each listing is accompanied by a variety of attributes, including the brand, item name, rating, comments, all price, shipping cost, shipping from, sold quantity, view quantity, notice, and image URL.

    User-generated Insights: The rating and comments columns provide authentic feedback from previous buyers, enabling researchers to gain valuable insights into customer satisfaction and opinions.

    Pricing Transparency: The all price column includes the total cost of each item, giving researchers a comprehensive view of pricing strategies on eBay. Additionally, the shipping cost column provides transparency on the additional charges associated with shipping.

    Geographical Analysis: The shipping from column provides valuable information on the location from where the item will be shipped, facilitating geographical analysis and enabling researchers to study regional preferences and trends.

    Popularity Metrics: The sold quantity and view quantity columns offer valuable metrics to gauge the popularity and demand for different products, empowering researchers to identify trending items and consumer preferences.

    Potential Applications:

    >Market Research: Researchers can analyze pricing patterns, customer ratings, and comments to gain insights into market trends, competition, and consumer preferences.

    >Predictive Modeling: By leveraging the dataset's attributes, data scientists can develop predictive models to forecast product demand, pricing trends, and customer behavior.

    >Recommender Systems: The rich attributes in the dataset can be used to build recommender systems, helping e-commerce platforms personalize recommendations based on user preferences and historical data.

    >Sentiment Analysis: The rating and comments columns allow for sentiment analysis to understand customer satisfaction levels and sentiment trends associated with different product categories.

    By uploading this eBay E-Commerce Scraped Datasets with Selenium to Kaggle, we aim to foster innovation and provide a valuable resource for researchers and data enthusiasts alike. We encourage you to explore this dataset, unlock its hidden insights, and contribute to the advancement of e-commerce analytics. Happy analyzing!

    Note: The dataset has been collected using Selenium, a powerful tool for web scraping, ensuring the authenticity and reliability of the reviews while adhering to ethical data collection practices.

  18. HBO and HBO Max Content Dataset

    • kaggle.com
    zip
    Updated Dec 3, 2023
    Cite
    The Devastator (2023). HBO and HBO Max Content Dataset [Dataset]. https://www.kaggle.com/datasets/thedevastator/hbo-and-hbo-max-content-dataset
    Explore at:
    zip (100874 bytes)
    Dataset updated
    Dec 3, 2023
    Authors
    The Devastator
    Description

    HBO and HBO Max Content Dataset

    Genres, Ratings, and Platforms of HBO and HBO Max TV Shows and Movies

    By Hunter Kempf [source]

    About this dataset

    The dataset HBO Content provides comprehensive information on the TV shows and movies available on HBO and HBO Max. It contains details about various aspects of the content, such as the title, type (whether it's a TV show or a movie), year of release, rating (indicating appropriate age group), IMDb score (a measure of popularity and quality), rotten_score (if available), decade (the decade in which the content was released), and IMDb score bucket (categorizing popularity range).

    Additionally, it includes binary values indicating whether the content belongs to specific genres such as Action/Adventure, Animation, Biography, Children, Comedy, Crime, Cult, Documentary, Drama, Family, Fantasy, Food, Game Show, History, Horror, Independent, LGBTQ, Musical, Mystery, Reality, Romance, Science Fiction, Sport, Stand-up/Talk, Thriller, and Travel. These genre indicators allow users to filter content based on their preferences.

    The dataset also provides information about the various platforms where the content can be accessed. These platforms include Acorn TV, Amazon Prime, Cinemax, Epix, Fandor, Free, Fubo TV, HBO, HBO Max, Hoopla, Hulu Plus, Kanopy, Netflix, Shout Factory TV, Sundance Now, Syfy TV Everywhere, TLC Go, Viceland TV Everywhere, Adult Swim TV Everywhere, AMC, AMC Premiere, BBC America TVE, BritBox, Cartoon Network, CBS All Access, Comedy Central TVE, Criterion Channel, Crunchyroll Premium, CuriosityStream, DC Universe, Funimation, NBC TV Everywhere, Showtime, Shudder, Starz, TNT, truTV TV Everywhere, Urban Movie Channel, Velocity Go, Watch TCM, and TBS.

    The availability of each platform is indicated by binary values for each platform column. If a value is 1 (true) for a particular platform column, it means that the content is available on that platform.

    This comprehensive dataset captures vital information about HBO's extensive library of TV shows and movies. It not only helps users discover content according to their preferred genres but also allows them to determine which platforms offer access to their desired titles. The IMDb score and rating further aid in making informed decisions about the popularity and appropriateness of the content.

    How to use the dataset

    • Understanding the Columns: Familiarize yourself with the columns in the dataset to comprehend what each column represents. The column names are self-explanatory and provide information about various aspects of the content like title, type (TV show or movie), year of release, rating, IMDb score, rotten score (if available), decade it belongs to, genres it falls into (like Action/Adventure or Drama), and platforms where it can be accessed.

    • Exploring Genres: The dataset includes several genre-related columns such as genres_Action_Adventure, genres_Drama, genres_Thriller, etc. You can analyze these columns to identify trends in popular genres among HBO and HBO Max content.

    • Genre-based Filtering: If you're interested in a specific genre, such as Action/Adventure or Documentary content available on these platforms, you can use boolean filtering by selecting rows that have values set for the corresponding genre columns (see the pandas sketch after this list).

    • Platform Availability: The dataset provides information about which platforms offer access to each content item through various platform-related columns (like platforms_hbo_max, platforms_netflix, etc.). You can filter data based on platform availability if you only want to explore shows or movies accessible through certain platforms.

    • Ratings Analysis: Use the rating column for analyzing content suitable for specific age groups or audience preferences.

    • IMDb Scores: The imdb_score column contains IMDb ratings ranging from 0-10 for each TV show/movie included in this dataset. You can analyze this field across different dimensions, like average scores per genre/platform/year, or identify highly rated titles within specific categories using boolean filtering.

    • Data Visualization: Visualize the dataset using charts or graphs to gain insights visually and interpret trends more effectively. You can create bar charts, pie charts, scatter plots, line graphs, or any other visualization technique that suits your analysis requirements.

    • Combining Datasets: If you have similar datasets from other platforms or services like Netflix or Amazon Prime Video, you can combine them with this dataset to perform comparative analyses across different streaming platforms.

    • Predictive Analysis: Use various machine learning algorithms such as regression models, classification models, or clustering algorithms to explore patterns and pred...
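
    As a concrete illustration of the genre- and platform-based filtering described above, a minimal pandas sketch follows; the file name is hypothetical, and the column names (genres_*, platforms_*, imdb_score, title, year) are taken from the description but should be verified against the actual header.

    # Sketch of the genre/platform boolean filtering described above.
    # The CSV file name is hypothetical; verify column names against the actual data.
    import pandas as pd

    hbo = pd.read_csv("hbo_content.csv")

    # Action/Adventure titles available on HBO Max, sorted by IMDb score.
    action_on_max = hbo[
        (hbo["genres_Action_Adventure"] == 1) & (hbo["platforms_hbo_max"] == 1)
    ].sort_values("imdb_score", ascending=False)

    print(action_on_max[["title", "year", "imdb_score"]].head(10))   # assumed columns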

  19. Data from: URDD: An open dataset for urban roadway disease detection and...

    • data.mendeley.com
    Updated Feb 20, 2025
    + more versions
    Cite
    Shuaiqi Liu (2025). URDD: An open dataset for urban roadway disease detection and classification [Dataset]. http://doi.org/10.17632/r7pnxpr2bb.2
    Explore at:
    Dataset updated
    Feb 20, 2025
    Authors
    Shuaiqi Liu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We present two urban road disease datasets: DURDD for road disease detection and CURDD for road disease classification. DURDD includes four main types of underground road diseases: cavity, detachment, water-rich, and looseness. It also contains disease detection datasets in three base formats: COCO, Pascal VOC, and YOLO. In CURDD, the dataset is divided into two levels: level 0 and level 1, corresponding to the "Cls0" and "Cls1" catalogs, respectively. Level 1 includes cavity, detachment, water-rich, looseness, and background. Level 0 categories combine the four main disease types mentioned earlier into a single "diseases" category, with the other category being "background." This dataset was jointly published by Hebei University and the 519 Team of North China Geological Exploration Bureau. We support individuals or teams using the data for research purposes. We also welcome collaboration for commercial use. For commercial inquiries, please contact us for authorization.

  20. Crimes data--gang

    • data.cityofchicago.org
    Updated Nov 15, 2025
    + more versions
    Cite
    Chicago Police Department (2025). Crimes data--gang [Dataset]. https://data.cityofchicago.org/d/iynt-rqgy
    Explore at:
    csv, kml, kmz, xml, application/geo+json, xlsx
    Dataset updated
    Nov 15, 2025
    Authors
    Chicago Police Department
    Description

    This dataset reflects reported incidents of crime (with the exception of murders where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. Data is extracted from the Chicago Police Department's CLEAR (Citizen Law Enforcement Analysis and Reporting) system. In order to protect the privacy of crime victims, addresses are shown at the block level only and specific locations are not identified. Should you have questions about this dataset, you may contact the Research & Development Division of the Chicago Police Department at 312.745.6071 or RandD@chicagopolice.org. To view the latest list please visit https://data.cityofnewyork.us/Transportation/Medallion-Vehicles-Authorized/rhe8-mgbb/data.

    Disclaimer: These crimes may be based upon preliminary information supplied to the Police Department by the reporting parties that have not been verified. The preliminary crime classifications may be changed at a later date based upon additional investigation and there is always the possibility of mechanical or human error. Therefore, the Chicago Police Department does not guarantee (either expressed or implied) the accuracy, completeness, timeliness, or correct sequencing of the information and the information should not be used for comparison purposes over time. The Chicago Police Department will not be responsible for any error or omission, or for the use of, or the results obtained from the use of this information. All data visualizations on maps should be considered approximate and attempts to derive specific addresses are strictly prohibited. The Chicago Police Department is not responsible for the content of any off-site pages that are referenced by or that reference this web page other than an official City of Chicago or Chicago Police Department web page. The user specifically acknowledges that the Chicago Police Department is not responsible for any defamatory, offensive, misleading, or illegal conduct of other users, links, or third parties and that the risk of injury from the foregoing rests entirely with the user. The unauthorized use of the words "Chicago Police Department," "Chicago Police," or any colorable imitation of these words or the unauthorized use of the Chicago Police Department logo is unlawful. This web page does not, in any way, authorize such use.

    Data is updated daily Tuesday through Sunday. The dataset contains more than 65,000 records/rows of data and cannot be viewed in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as Wordpad, to view and search. To access a list of Chicago Police Department - Illinois Uniform Crime Reporting (IUCR) codes, go to http://data.cityofchicago.org/Public-Safety/Chicago-Police-Department-Illinois-Uniform-Crime-R/c7ck-438e
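
    If the exported CSV is processed programmatically rather than in a text editor, a minimal pandas sketch (with a hypothetical export file name) might look like this:

    # Sketch: read the exported crimes CSV in chunks, since the full extract is large,
    # and count incidents per IUCR code. The export file name is hypothetical.
    import pandas as pd

    counts = {}
    for chunk in pd.read_csv("crimes_gang_export.csv", chunksize=100_000, dtype=str):
        for code, n in chunk["IUCR"].value_counts().items():   # IUCR column per the portal schema
            counts[code] = counts.get(code, 0) + int(n)

    top = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:10]
    print(top)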
