100+ datasets found
  1. unstructured data

    • kaggle.com
    zip
    Updated Dec 11, 2023
    Cite
    ajacks (2023). unstructured data [Dataset]. https://www.kaggle.com/datasets/ajacks/unstructured-data
    Explore at:
    Available download formats: zip (1050 bytes)
    Dataset updated
    Dec 11, 2023
    Authors
    ajacks
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by ajacks

    Released under Apache 2.0

    Contents

  2. unstructured

    • huggingface.co
    Updated Oct 6, 2024
    + more versions
    Cite
    world (2024). unstructured [Dataset]. https://huggingface.co/datasets/halloween90/unstructured
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 6, 2024
    Authors
    world
    Description

    Dataset Card for Dataset Name

    This dataset card aims to be a base template for new datasets. It has been generated using this raw template.

    Dataset Details

    Dataset Description

    • Curated by: [More Information Needed]
    • Funded by [optional]: [More Information Needed]
    • Shared by [optional]: [More Information Needed]
    • Language(s) (NLP): [More Information Needed]
    • License: [More Information Needed]

    Dataset Sources [optional]

    Repository: [More… See the full description on the dataset page: https://huggingface.co/datasets/halloween90/unstructured.

  3. driving unstructured traffic dataset

    • kaggle.com
    Updated May 16, 2025
    Cite
    praney (2025). driving unstructured traffic dataset [Dataset]. https://www.kaggle.com/datasets/praneydubey/driving-unstructured-traffic-dataset
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 16, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    praney
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Robustness is critical for the safe deployment of autonomous driving systems, and it depends on the environment in which a system is evaluated. In this work we focus on unstructured driving environments where, in addition to weather and road conditions, traffic conditions are difficult to identify and analyze in order to make the right driving decision. We have created a dataset in which traffic is highly congested, roads are uneven, lane divisions and dividers are vague or absent, and the behavior of pedestrians, bikes, and other vehicles is less predictable. The dataset comprises more than 100,000 images captured under a variety of conditions. Each image is segmented using the Segment-Anything Model and contains, on average, more than 50 segments, whose annotations (>50 label classes) were created using LLMs and re-verified by a human annotator for quality assessment. We have also recorded inertial sensor data along with vehicle speed, to derive safe limits for acceleration, braking, and speed maintenance in each scenario.

  4. Keypoint decription for planetary environments

    • kaggle.com
    zip
    Updated Sep 23, 2023
    Cite
    George Petrakis (2023). Keypoint decription for planetary environments [Dataset]. https://www.kaggle.com/datasets/georgepetrakis/unstructured-and-planetary-environments
    Explore at:
    Available download formats: zip (1577493688 bytes)
    Dataset updated
    Sep 23, 2023
    Authors
    George Petrakis
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This training dataset targets deep neural networks for keypoint detection and description, such as SuperPoint (DeTone et al. 2018). It was designed for unstructured and planetary scenes, with the aim of training learning-based keypoint detectors and descriptors. The dataset contains about 48,000 images from Earth, Mars and the Moon; all images have been converted to grayscale with a size of 320x240.

    The original images from Earth were captured in a quarry and at construction sites in Greece, while the images from Mars were taken from a publicly available NASA dataset. The Moon images are artificial rover-based images which were generated and released under a CC (Creative Commons) license by Keio University in Japan.

    The dataset will be further enriched soon.

    References: DeTone D, Malisiewicz T, Rabinovich A, SuperPoint: Self-Supervised Interest Point Detection and Description, 2018, arXiv:1712.07629

  5. Data cleaning using unstructured data

    • zenodo.org
    zip
    Updated Jul 30, 2024
    Cite
    Rihem Nasfi; Antoon Bronselaer (2024). Data cleaning using unstructured data [Dataset]. http://doi.org/10.5281/zenodo.13135983
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 30, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Rihem Nasfi; Antoon Bronselaer
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    In this project, we work on repairing three datasets:

    • Trials design: This dataset was obtained from the European Union Drug Regulating Authorities Clinical Trials Database (EudraCT) register, and the ground truth was created from external registries. In the dataset, multiple countries, identified by the attribute country_protocol_code, conduct the same clinical trial, which is identified by eudract_number. Each clinical trial has a title that can help find informative details about the design of the trial.
    • Trials population: This dataset delineates the demographic origins of participants in clinical trials primarily conducted across European countries. It includes structured attributes indicating whether the trial pertains to a specific gender, age group or healthy volunteers. Each of these categories is labeled '1' or '0', denoting whether or not it is included in the trial. Note that the population category should remain consistent across all countries conducting the same clinical trial, as identified by eudract_number. The ground truth samples in the dataset were established by aligning information about the trial populations provided by external registries, specifically the CT.gov database and the German Trials database. Additionally, the dataset comprises other unstructured attributes that categorize the inclusion criteria for trial participants, such as inclusion.
    • Allergens: This dataset contains information about products and their allergens. The data was collected from the German version of 'Alnatura' (access date: 24 November 2020), from 'Open Food Facts', a free database of food products from around the world, and from the websites 'Migipedia', 'Piccantino', and 'Das Ist Drin'. There may be overlapping products across these websites. Each product in the dataset is identified by a unique code. Samples with the same code represent the same product but are extracted from a different source. An allergen is marked '2' if it is present, '1' if there are traces of it, and '0' if it is absent from a product. The dataset also includes information on the ingredients of the products. Overall, the dataset comprises categorical structured data describing the presence, trace, or absence of specific allergens, and unstructured text describing ingredients.

    N.B.: Each '.zip' file contains a set of 5 '.csv' files which are part of the aforementioned datasets:

    • "{dataset_name}_train.csv": samples used for the ML-model training (e.g. "allergens_train.csv").
    • "{dataset_name}_test.csv": samples used to test the ML-model performance (e.g. "allergens_test.csv").
    • "{dataset_name}_golden_standard.csv": samples that represent the ground truth of the test samples (e.g. "allergens_golden_standard.csv").
    • "{dataset_name}_parker_train.csv": samples repaired using Parker Engine, used for the ML-model training (e.g. "allergens_parker_train.csv").
    • "{dataset_name}_parker_test.csv": samples repaired using Parker Engine, used to test the ML-model performance (e.g. "allergens_parker_test.csv").
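
    The file-naming convention and the allergen coding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `expected_csv_names` and `decode_allergen` are hypothetical, and the `parker_test` suffix is inferred from the example file name "allergens_parker_test.csv" rather than stated explicitly by the authors.

    ```python
    # Suffixes of the five CSV files said to be in each '.zip' archive.
    # "parker_test" is an assumption inferred from "allergens_parker_test.csv".
    SPLITS = ("train", "test", "golden_standard", "parker_train", "parker_test")

    # Allergen coding as described for the Allergens dataset.
    ALLERGEN_CODES = {2: "present", 1: "traces", 0: "absent"}


    def expected_csv_names(dataset_name: str) -> list[str]:
        """Return the CSV file names expected for one dataset (hypothetical helper)."""
        return [f"{dataset_name}_{split}.csv" for split in SPLITS]


    def decode_allergen(code: int) -> str:
        """Map a 2/1/0 allergen code to its meaning; raises KeyError on other values."""
        return ALLERGEN_CODES[int(code)]


    if __name__ == "__main__":
        for name in expected_csv_names("allergens"):
            print(name)
        print(decode_allergen(1))
    ```

    Checking the archive contents against `expected_csv_names(...)` before loading is one way to catch a missing or misnamed split early.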
  6. Respiratory Diseases Unstructured Data

    • kaggle.com
    zip
    Updated Jun 12, 2025
    Cite
    Luai Waleed (2025). Respiratory Diseases Unstructured Data [Dataset]. https://www.kaggle.com/datasets/mistaluai/respiratory-diseases-unstructured-data
    Explore at:
    Available download formats: zip (459253202 bytes)
    Dataset updated
    Jun 12, 2025
    Authors
    Luai Waleed
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Luai Waleed

    Released under CC0: Public Domain

    Contents

  7. HIRENASD coarse unstructured - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Mar 31, 2025
    Cite
    nasa.gov (2025). HIRENASD coarse unstructured - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/hirenasd-coarse-unstructured
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Unstructured HIRENASD mesh:
    • coarse size (5.7 million nodes, 14.4 million elements)
    • for node centered solvers
    • 01.06.2011
    • caution: dimensions in mm

  8. risk-dataset-unstructured

    • huggingface.co
    + more versions
    Cite
    tan de shao, risk-dataset-unstructured [Dataset]. https://huggingface.co/datasets/tandeshao/risk-dataset-unstructured
    Explore at:
    Authors
    tan de shao
    Description

    tandeshao/risk-dataset-unstructured dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. Replication Data for: "A Topic-based Segmentation Model for Identifying...

    • search.dataone.org
    Updated Sep 25, 2024
    Cite
    Kim, Sunghoon; Lee, Sanghak; McCulloch, Robert (2024). Replication Data for: "A Topic-based Segmentation Model for Identifying Segment-Level Drivers of Star Ratings from Unstructured Text Reviews" [Dataset]. http://doi.org/10.7910/DVN/EE3DE2
    Explore at:
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Kim, Sunghoon; Lee, Sanghak; McCulloch, Robert
    Description

    We provide instructions, code and datasets for replicating the article by Kim, Lee and McCulloch (2024), "A Topic-based Segmentation Model for Identifying Segment-Level Drivers of Star Ratings from Unstructured Text Reviews." This repository provides a user-friendly R package that lets researchers and practitioners apply A Topic-based Segmentation Model with Unstructured Texts (latent class regression with group variable selection) to their own datasets.

    • First, we provide R code to replicate the illustrative simulation study: see file 1.
    • Second, we provide the user-friendly R package with a very simple example code to help apply the model to real-world datasets: see file 2, Package_MixtureRegression_GroupVariableSelection.R and Dendrogram.R.
    • Third, we provide a set of code and instructions to replicate the empirical studies of customer-level segmentation and restaurant-level segmentation with Yelp reviews data: see files 3-a, 3-b, 4-a, 4-b. Note that, due to Yelp's dataset terms of use and the restriction on data size, we provide the link to download the same Yelp datasets (https://www.kaggle.com/datasets/yelp-dataset/yelp-dataset/versions/6).
    • Fourth, we provide a set of code and datasets to replicate the empirical study with professor ratings reviews data: see file 5.

    Please see more details in the description text and comments of each file.

    [A guide on how to use the code to reproduce each study in the paper]

    1. Full codes for replicating Illustrative simulation study.txt -- [see Table 2 and Figure 2 in main text]: This is R source code to replicate the illustrative simulation study. Please run it from beginning to end in R. In addition to estimated coefficients (posterior means of coefficients), indicators of variable selections, and segment memberships, you will get the dendrograms of selected groups of variables shown in Figure 2. Computing time is approximately 20 to 30 minutes.

    3-a. Preprocessing raw Yelp Reviews for Customer-level Segmentation.txt: Code for preprocessing the downloaded unstructured Yelp review data and preparing the DV and IVs matrix for the customer-level segmentation study.

    3-b. Instruction for replicating Customer-level Segmentation analysis.txt -- [see Table 10 in main text; Tables F-1, F-2, and F-3 and Figure F-1 in Web Appendix]: Code for replicating the customer-level segmentation study with Yelp data. You will get estimated coefficients (posterior means of coefficients), indicators of variable selections, and segment memberships. Computing time is approximately 3 to 4 hours.

    4-a. Preprocessing raw Yelp reviews_Restaruant Segmentation (1).txt: R code for preprocessing the downloaded unstructured Yelp data and preparing the DV and IVs matrix for the restaurant-level segmentation study.

    4-b. Instructions for replicating restaurant-level segmentation analysis.txt -- [see Tables 5, 6 and 7 in main text; Tables E-4 and E-5 and Figure H-1 in Web Appendix]: Code for replicating the restaurant-level segmentation study with Yelp data. You will get estimated coefficients (posterior means of coefficients), indicators of variable selections, and segment memberships. Computing time is approximately 10 to 12 hours.

    [Guidelines for running Benchmark models in Table 6]

    • Unsupervised topic model: 'topicmodels' package in R -- after determining the number of topics (e.g., with the 'ldatuning' R package), run the 'LDA' function in the 'topicmodels' package. Then compute topic probabilities per restaurant (with the 'posterior' function in the package), which can be used as predictors, and conduct prediction with regression.
    • Hierarchical topic model (HDP): 'gensimr' R package -- 'model_hdp' function for identifying topics in the package (see https://radimrehurek.com/gensim/models/hdpmodel.html or https://gensimr.news-r.org/).
    • Supervised topic model: 'lda' R package -- 'slda.em' function for training and 'slda.predict' for prediction.
    • Aggregate regression: 'lm' default function in R.
    • Latent class regression without variable selection: 'flexmix' function in the 'flexmix' R package. Run flexmix with a certain number of segments (e.g., 3 segments in this study). Then, with the estimated coefficients and memberships, predict the dependent variable for each segment.
    • Latent class regression with variable selection: 'Unconstraind_Bayes_Mixture' function in the package of Kim, Fong and DeSarbo (2012). Run the Kim et al. (2012) model with a certain number of segments (e.g., 3 segments in this study). Then, with the estimated coefficients and memberships, predict the dependent variable for each segment. The same R package ('KimFongDeSarbo2012.zip') can be downloaded at: https://sites.google.com/scarletmail.rutgers.edu/r-code-packages/home

    5. Instructions for replicating Professor ratings review study.txt -- [see Tables G-1, G-2, G-4 and G-5, and Figures G-1 and H-2 in Web Appendix]: Code to replicate the Professor ratings reviews study. Computing time is approximately 10 hours.

    [A list of the versions of R, packages, and computer...

  10. unstructured-data-multilingual

    • huggingface.co
    Updated Aug 18, 2023
    Cite
    jingwora (2023). unstructured-data-multilingual [Dataset]. https://huggingface.co/datasets/jingwora/unstructured-data-multilingual
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 18, 2023
    Authors
    jingwora
    Description

    Dataset Card for "unstructured-data-multilingual"

    More Information needed

  11. Unstructured Environment Dataset

    • universe.roboflow.com
    zip
    Updated Oct 13, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    farhanii draft dataset (2025). Unstructured Environment Dataset [Dataset]. https://universe.roboflow.com/farhanii-draft-dataset/unstructured-environment-spmkd/model/2
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 13, 2025
    Dataset authored and provided by
    farhanii draft dataset
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    Unstructured_Environment ABCJ Polygons
    Description

    Unstructured Environment

    ## Overview

    Unstructured Environment is a dataset for instance segmentation tasks. It contains Unstructured_Environment ABCJ annotations for 1,490 images.

    ## Getting Started

    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  12. M100 dataset 8: 22-05

    • data.niaid.nih.gov
    • zenodo.org
    • +1 more
    Updated May 22, 2023
    + more versions
    Cite
    Andrea Borghesi; Carmine Di Santi; Martin Molan; Mohsen Seyedkazemi Ardebili; Alessio Mauri; Massimiliano Guarrasi; Daniela Galetti; Mirko Cestari; Francesco Barchi; Luca Benini; Francesco Beneventi; Andrea Bartolini (2023). M100 dataset 8: 22-05 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7590546
    Explore at:
    Dataset updated
    May 22, 2023
    Authors
    Andrea Borghesi; Carmine Di Santi; Martin Molan; Mohsen Seyedkazemi Ardebili; Alessio Mauri; Massimiliano Guarrasi; Daniela Galetti; Mirko Cestari; Francesco Barchi; Luca Benini; Francesco Beneventi; Andrea Bartolini
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This entry is part of a larger data set collected from the most recent Tier-0 supercomputer hosted at CINECA (Marconi100, https://www.hpc.cineca.it/hardware/marconi100). The data covers the entirety of the system. It ranges from internal information on the 980+ computing nodes, such as core loads, temperatures, frequencies, memory write/read operations, CPU power consumption, fan speed, and GPU usage details, to system-wide information, including the liquid cooling infrastructure, the air conditioning system, the power supply units, workload manager statistics, job-related information, system status alerts, and weather forecasts. It comprises hundreds of metrics measured on each computing node, in addition to hundreds of other metrics gathered from sensors monitoring all system components. The whole data set is stored as a collection of Zenodo entries; this particular entry corresponds to the period 22-05.

    The dataset is stored as a partitioned Parquet dataset, with this partitioning hierarchy: year_month ("YY-MM"), plugin, metric. The data is distributed as tarball files, each corresponding to one month of data (first-level partitioning, year_month). The collected data is generated by a monitoring infrastructure working on unstructured data (to improve efficiency and scalability); however, this data has been organized in a structured manner to facilitate its use. The simplest way to understand how to access the data is to refer to the companion software modules released together with the dataset itself, which can be found at: https://gitlab.com/ecs-lab/exadata.
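
    To make the three-level partitioning hierarchy concrete, here is a minimal Python sketch that builds the directory for one (year_month, plugin, metric) combination. It assumes Hive-style `key=value` directory names, which the description does not confirm; the function name `partition_dir`, the root folder `m100`, and the plugin and metric names are all hypothetical placeholders. The companion modules at the GitLab link above remain the authoritative way to read the data.

    ```python
    import re
    from pathlib import PurePosixPath

    # "YY-MM" format from the dataset description, e.g. "22-05".
    _YEAR_MONTH = re.compile(r"\d{2}-(0[1-9]|1[0-2])")


    def partition_dir(root: str, year_month: str, plugin: str, metric: str) -> PurePosixPath:
        """Build a partition directory, assuming Hive-style key=value folder names."""
        if not _YEAR_MONTH.fullmatch(year_month):
            raise ValueError(f"year_month must look like 'YY-MM', got {year_month!r}")
        return (
            PurePosixPath(root)
            / f"year_month={year_month}"
            / f"plugin={plugin}"
            / f"metric={metric}"
        )


    if __name__ == "__main__":
        # Placeholder plugin/metric names, not taken from the dataset.
        print(partition_dir("m100", "22-05", "example_plugin", "example_metric"))
    ```

    If the extracted tarball uses plain folder names instead of `key=value` pairs, only the three f-strings need to change.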

  13. Multimodal Student Retention Dataset

    • kaggle.com
    zip
    Updated Aug 7, 2025
    Cite
    Ziya (2025). Multimodal Student Retention Dataset [Dataset]. https://www.kaggle.com/datasets/ziya07/multimodal-student-retention-dataset
    Explore at:
    Available download formats: zip (78893 bytes)
    Dataset updated
    Aug 7, 2025
    Authors
    Ziya
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset, titled Multimodal Student Retention Dataset for Intelligent University Management, contains 2134 records of college students collected for the purpose of analyzing and predicting student retention risks using a multimodal information fusion approach.

    It simulates a realistic university campus service environment, integrating both structured (e.g., GPA, demographics, attendance) and unstructured (e.g., advising and counseling notes) data, along with temporal academic performance trends.

    The dataset is designed to support research and development of deep learning models—particularly those using multimodal fusion and graph-based neural networks—to improve the efficiency and intelligence of education management systems.

    📊 Features

    🔹 Structured Data (Tabular):
    • Academic metrics: GPA (current and historical), credits, academic warnings
    • Demographics: Age, gender, ethnicity, major
    • Administrative: Financial aid status, enrollment gaps, advisor meetings

    🔹 Unstructured Data (Textual):
    • Advising notes: Summaries from academic advisor interactions
    • Counseling notes: Mental health or well-being insights (if available)

    🔹 Temporal Data:
    • GPA and attendance history over past 4 semesters

    🔹 Target Variables (Labels):
    • dropout_risk: Whether the student is at risk of dropping out
    • next_semester_dropout: Likelihood of dropout in the next semester
    • dropout_type: Type of dropout (Temporary, Permanent, Transfer)
    • dropout_duration: Predicted duration of dropout in semesters
    • dropout_cause: Reason for dropout (Academic, Financial, etc.)

  14. HIRENASD medium unstructured

    • data.nasa.gov
    • datasets.ai
    • +2 more
    Updated Mar 31, 2025
    Cite
    nasa.gov (2025). HIRENASD medium unstructured [Dataset]. https://data.nasa.gov/dataset/hirenasd-medium-unstructured
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Unstructured HIRENASD mesh:
    • medium size (16 million nodes, 39 million elements)
    • for node centered solvers
    • 31.05.2011
    • caution: dimensions in mm

  15. T-Shirts Dataset

    • gts.ai
    json
    Updated Sep 18, 2024
    Cite
    Globose Technology Solutions Private Limited (2024). T-Shirts Dataset [Dataset]. https://gts.ai/dataset-download/page/71/
    Explore at:
    Available download formats: json
    Dataset updated
    Sep 18, 2024
    Dataset authored and provided by
    Globose Technology Solutions Private Limited
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Variables measured
    T-Shirt views, Fashion classification, E-commerce product images, Raw and unfiltered retail data
    Description

    The T-Shirts Dataset contains raw and unfiltered e-commerce images of T-shirts, including multiple views per frame, offering real-world challenges for AI, computer vision, and retail analytics projects.

  16. HIRENASD fine unstructured grid - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Mar 31, 2025
    Cite
    nasa.gov (2025). HIRENASD fine unstructured grid - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/hirenasd-fine-unstructured-grid
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Unstructured HIRENASD mesh:
    • fine size (46.4 million nodes, 104 million elements)
    • for node centered solvers
    • 31.05.2011

  17. risk-dataset-unstructured-paraphrased-humarin

    • huggingface.co
    + more versions
    Cite
    tan de shao, risk-dataset-unstructured-paraphrased-humarin [Dataset]. https://huggingface.co/datasets/tandeshao/risk-dataset-unstructured-paraphrased-humarin
    Explore at:
    Authors
    tan de shao
    Description

    tandeshao/risk-dataset-unstructured-paraphrased-humarin dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. F1-score of extracting relational triples from sentences that contain...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Xuan Liu; Wanru Du; Xiaoyin Wang; Ruiqun Li; Pengcheng Sun; Xiaochuan Jing (2023). F1-score of extracting relational triples from sentences that contain different numbers of triples. [Dataset]. http://doi.org/10.1371/journal.pone.0260426.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Xuan Liu; Wanru Du; Xiaoyin Wang; Ruiqun Li; Pengcheng Sun; Xiaochuan Jing
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    F1-score of extracting relational triples from sentences that contain different numbers of triples.

  19. Unstructured Road Obstacles Dataset

    • universe.roboflow.com
    zip
    Updated Feb 28, 2025
    Cite
    unstructuredroadobstacles (2025). Unstructured Road Obstacles Dataset [Dataset]. https://universe.roboflow.com/unstructuredroadobstacles/unstructured-road-obstacles
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 28, 2025
    Dataset authored and provided by
    unstructuredroadobstacles
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    Road Objects Bounding Boxes
    Description

    Unstructured Road Obstacles

    ## Overview

    Unstructured Road Obstacles is a dataset for object detection tasks. It contains Road Objects annotations for 6,156 images.

    ## Getting Started

    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  20. IDD-X

    • india-data.org
    Updated Jul 4, 2025
    Cite
    IIIT Hyderabad, IHUB (2025). IDD-X [Dataset]. https://india-data.org/googleSEO-list-dataset-search
    Explore at:
    Available download formats: videos with text annotations
    Dataset updated
    Jul 4, 2025
    Dataset authored and provided by
    IIIT Hyderabad, IHUB
    License

    https://india-data.org/terms-conditions

    Area covered
    India
    Description

    IDD-X is a large-scale, dual-view driving video dataset designed to advance understanding of how road conditions and surrounding traffic influence the ego vehicle’s driving behavior, particularly in dense, heterogeneous, and unstructured traffic scenarios common in developing countries. Unlike existing datasets focused on structured environments, IDD-X captures the complexity of real-world traffic through 3634 annotated driving scenarios across 1140 untrimmed videos. It features 697K bounding boxes, 9K object tracks, and annotations for 1–12 important road objects per scenario, spanning 10 object categories and 19 explanation label categories. The dataset includes both front and rear views, enabling a more complete and explainable analysis of driving decisions. To support this, custom-designed deep networks are introduced for localizing important objects and predicting per-object behavioral explanations, forming a foundation for safe and efficient navigation in intelligent vehicle systems.
