100+ datasets found
  1. Fire Data Annotations Dataset

    • universe.roboflow.com
    zip
    Updated May 6, 2022
    + more versions
    Cite
    dataset (2022). Fire Data Annotations Dataset [Dataset]. https://universe.roboflow.com/dataset-9xayt/fire-data-annotations-zyvds
    Explore at:
    Available download formats: zip
    Dataset updated
    May 6, 2022
    Dataset authored and provided by
    dataset
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Fire Bounding Boxes
    Description

    Fire Data Annotations

    ## Overview
    
    Fire Data Annotations is a dataset for object detection tasks - it contains Fire annotations for 1,942 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  2. Data from: X-ray CT data with semantic annotations for the paper "A workflow...

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Jun 5, 2025
    Cite
    Agricultural Research Service (2025). X-ray CT data with semantic annotations for the paper "A workflow for segmenting soil and plant X-ray CT images with deep learning in Google’s Colaboratory" [Dataset]. https://catalog.data.gov/dataset/x-ray-ct-data-with-semantic-annotations-for-the-paper-a-workflow-for-segmenting-soil-and-p-d195a
    Explore at:
    Dataset updated
    Jun 5, 2025
    Dataset provided by
    Agricultural Research Service: https://www.ars.usda.gov/
    Description

    Leaves from genetically unique Juglans regia plants were scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA. Soil samples were collected in Fall 2017 from the riparian oak forest located at the Russell Ranch Sustainable Agricultural Institute at the University of California, Davis. The soil was sieved through a 2 mm mesh and air dried before imaging. A single soil aggregate was scanned at 23 keV using the 10x objective lens with a pixel resolution of 650 nanometers on beamline 8.3.2 at the ALS. Additionally, a drought-stressed almond flower bud (Prunus dulcis) from a plant housed at the University of California, Davis, was scanned using a 4x lens with a pixel resolution of 1.72 µm on beamline 8.3.2 at the ALS.

    Raw tomographic image data were reconstructed using TomoPy. Reconstructions were converted to 8-bit tif or png format using ImageJ or the PIL package in Python before further processing. Images were annotated using Intel’s Computer Vision Annotation Tool (CVAT) and ImageJ; both CVAT and ImageJ are free to use and open source. Leaf images were annotated following Théroux-Rancourt et al. (2020): hand labeling was done directly in ImageJ by drawing around each tissue, with 5 images annotated per leaf. Care was taken to cover a range of anatomical variation to help improve the generalizability of the models to other leaves. All slices were labeled by Dr. Mina Momayyezi and Fiona Duong.

    To annotate the flower bud and soil aggregate, images were imported into CVAT. The exterior border of the bud (i.e. bud scales) and flower were annotated in CVAT and exported as masks. Similarly, the exterior of the soil aggregate and particulate organic matter identified by eye were annotated in CVAT and exported as masks. To annotate air spaces in both the bud and soil aggregate, images were imported into ImageJ. A Gaussian blur was applied to the image to decrease noise, and the air space was then segmented using thresholding. After applying the threshold, the selected air space region was converted to a binary image with white representing the air space and black representing everything else. This binary image was overlaid upon the original image and the air space within the flower bud and aggregate was selected using the “free hand” tool. Air space outside of the region of interest for both image sets was eliminated. The quality of the air space annotation was then visually inspected for accuracy against the underlying original image; incomplete annotations were corrected using the brush or pencil tool to paint missing air space white and incorrectly identified air space black. Once the annotation was satisfactorily corrected, the binary image of the air space was saved. Finally, the annotations of the bud and flower or aggregate and organic matter were opened in ImageJ and the associated air space mask was overlaid on top of them, forming a three-layer mask suitable for training the fully convolutional network. All labeling of the soil aggregate and soil aggregate images was done by Dr. Devin Rippner. These images and annotations are for training deep learning models to identify different constituents in leaves, almond buds, and soil aggregates.

    Limitations: For the walnut leaves, some tissues (stomata, etc.) are not labeled and only represent a small portion of a full leaf. Similarly, both the almond bud and the aggregate represent just one single sample of each. The bud tissues are only divided into bud scales, flower, and air space; many other tissues remain unlabeled. For the soil aggregate, labels were assigned by eye with no actual chemical information, so particulate organic matter identification may be incorrect.

    Resources in this dataset:

    Resource Title: Annotated X-ray CT images and masks of a Forest Soil Aggregate. File Name: forest_soil_images_masks_for_testing_training.zip. Resource Description: This aggregate was collected from the riparian oak forest at the Russell Ranch Sustainable Agricultural Facility. The aggregate was scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA, using the 10x objective lens with a pixel resolution of 650 nanometers. For masks, the background has a value of 0,0,0; pore spaces have a value of 250,250,250; mineral solids have a value of 128,0,0; and particulate organic matter has a value of 0,128,0. These files were used for training a model to segment the forest soil aggregate and for testing the accuracy, precision, recall, and F1 score of the model.

    Resource Title: Annotated X-ray CT images and masks of an Almond bud (P. dulcis). File Name: Almond_bud_tube_D_P6_training_testing_images_and_masks.zip. Resource Description: A drought-stressed almond flower bud (Prunus dulcis) from a plant housed at the University of California, Davis, was scanned by X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA, using the 4x lens with a pixel resolution of 1.72 µm. For masks, the background has a value of 0,0,0; air spaces have a value of 255,255,255; bud scales have a value of 128,0,0; and flower tissues have a value of 0,128,0. These files were used for training a model to segment the almond bud and for testing the accuracy, precision, recall, and F1 score of the model. Resource Software Recommended: Fiji (ImageJ), url: https://imagej.net/software/fiji/downloads

    Resource Title: Annotated X-ray CT images and masks of Walnut leaves (J. regia). File Name: 6_leaf_training_testing_images_and_masks_for_paper.zip. Resource Description: Stems were collected from genetically unique J. regia accessions at the USDA-ARS-NCGR in Wolfskill Experimental Orchard, Winters, California, USA to use as scions, and were grafted by Sierra Gold Nursery onto a commonly used commercial rootstock, RX1 (J. microcarpa × J. regia). We used a common rootstock to eliminate any own-root effects and to simulate conditions for a commercial walnut orchard setting, where rootstocks are commonly used. The grafted saplings were repotted and transferred to the Armstrong lathe house facility at the University of California, Davis in June 2019, and kept under natural light and temperature. Leaves from each accession and treatment were scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA, using the 10x objective lens with a pixel resolution of 650 nanometers. For masks, the background has a value of 170,170,170; Epidermis value = 85,85,85; Mesophyll value = 0,0,0; Bundle Sheath Extension value = 152,152,152; Vein value = 220,220,220; Air value = 255,255,255. Resource Software Recommended: Fiji (ImageJ), url: https://imagej.net/software/fiji/downloads
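
    As a rough sketch (assuming NumPy and Pillow, and using the RGB mask values listed above for the forest soil aggregate; the other image sets use different colour codes), an RGB mask can be converted to an integer class map for training like this:

      import numpy as np
      from PIL import Image

      # Colour-to-class mapping taken from the forest soil aggregate description above;
      # adjust it for the almond bud or walnut leaf masks, which use different values.
      CLASSES = {
          (0, 0, 0): 0,        # background
          (250, 250, 250): 1,  # pore space
          (128, 0, 0): 2,      # mineral solids
          (0, 128, 0): 3,      # particulate organic matter
      }

      def mask_to_labels(path):
          rgb = np.array(Image.open(path).convert("RGB"))
          labels = np.zeros(rgb.shape[:2], dtype=np.uint8)
          for colour, class_id in CLASSES.items():
              labels[np.all(rgb == colour, axis=-1)] = class_id
          return labels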

  3. Training and development dataset for information extraction in plant...

    • entrepot.recherche.data.gouv.fr
    zip
    Updated Feb 20, 2025
    + more versions
    Cite
    MaIAGE; Plateforme ESV; MaIAGE; Plateforme ESV (2025). Training and development dataset for information extraction in plant epidemiomonitoring [Dataset]. http://doi.org/10.57745/ZDNOGF
    Explore at:
    Available download formats: zip (479001)
    Dataset updated
    Feb 20, 2025
    Dataset provided by
    Recherche Data Gouv
    Authors
    MaIAGE; Plateforme ESV; MaIAGE; Plateforme ESV
    License

    https://entrepot.recherche.data.gouv.fr/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.57745/ZDNOGF

    Dataset funded by
    INRAE
    Agence nationale de la recherche
    PIA DATAIA
    Description

    The “Training and development dataset for information extraction in plant epidemiomonitoring” is the annotation set of the “Corpus for the epidemiomonitoring of plant”. The annotations include seven entity types (e.g. species, locations, diseases), their normalisation against the NCBI taxonomy and GeoNames, and binary (seven types) and ternary relationships. The annotations refer to character positions within the documents of the corpus. The annotation guidelines give their definitions and representative examples. Both datasets are intended for the training and validation of information extraction methods.
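
    The exact file layout of the annotations is not described here, but since they refer to character positions within the corpus documents, applying them amounts to slicing the document text. A purely illustrative sketch (the offsets, entity types, and tuple layout below are invented, not the dataset's actual format):

      # Hypothetical (start, end, entity_type) offsets over a document string.
      doc = "Xylella fastidiosa was detected in olive trees near Lecce."
      annotations = [(0, 18, "species"), (52, 57, "location")]

      for start, end, entity_type in annotations:
          print(entity_type, "->", doc[start:end])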

  4. Car Highway Dataset

    • universe.roboflow.com
    zip
    Updated Sep 13, 2023
    Cite
    Sallar (2023). Car Highway Dataset [Dataset]. https://universe.roboflow.com/sallar/car-highway
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 13, 2023
    Dataset authored and provided by
    Sallar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Vehicles Bounding Boxes
    Description

    Car-Highway Data Annotation Project

    Introduction

    In this project, we aim to annotate car images captured on highways. The annotated data will be used to train machine learning models for various computer vision tasks, such as object detection and classification.

    Project Goals

    • Collect a diverse dataset of car images from highway scenes.
    • Annotate the dataset to identify and label cars within each image.
    • Organize and format the annotated data for machine learning model training.

    Tools and Technologies

    For this project, we will be using Roboflow, a powerful platform for data annotation and preprocessing. Roboflow simplifies the annotation process and provides tools for data augmentation and transformation.

    Annotation Process

    1. Upload the raw car images to the Roboflow platform.
    2. Use the annotation tools in Roboflow to draw bounding boxes around each car in the images.
    3. Label each bounding box with the corresponding class (e.g., car).
    4. Review and validate the annotations for accuracy.

    Data Augmentation

    Roboflow offers data augmentation capabilities, such as rotation, flipping, and resizing. These augmentations can help improve the model's robustness.

    Data Export

    Once the data is annotated and augmented, Roboflow allows us to export the dataset in various formats suitable for training machine learning models, such as YOLO, COCO, or TensorFlow Record.

    Milestones

    1. Data Collection and Preprocessing
    2. Annotation of Car Images
    3. Data Augmentation
    4. Data Export
    5. Model Training

    Conclusion

    By completing this project, we will have a well-annotated dataset ready for training machine learning models. This dataset can be used for a wide range of applications in computer vision, including car detection and tracking on highways.

  5. St Bees acoustic sensor data annotations

    • researchdatafinder.qut.edu.au
    • researchdata.edu.au
    Updated Dec 6, 2010
    Cite
    Paul Roe (2010). St Bees acoustic sensor data annotations [Dataset]. https://researchdatafinder.qut.edu.au/individual/q82
    Explore at:
    Dataset updated
    Dec 6, 2010
    Dataset provided by
    Queensland University of Technology (QUT)
    Authors
    Paul Roe
    Description

    This dataset is the tagged CSV file resulting from a study investigating the vocalisations of koala populations on St Bees Island. Audio data can be retrieved by date and time period and by searching annotation tags which have been applied to the audio recordings (for example, it is possible to search for all audio samples tagged with Kookaburra). Researchers can download audio files and CSV files containing information about the tags specified in the search. The 'tag' file includes: Tag Name, Start Time, End Time, Max Frequency (Hz), Min Frequency (Hz), Project Site, Sensor Name, Score, and a link to the specific audio sample associated with the individual tag.
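
    A small sketch (assuming pandas; the file name is hypothetical, the column names are those listed above) of filtering the tag file for a particular annotation:

      import pandas as pd

      tags = pd.read_csv("st_bees_tags.csv")  # hypothetical file name
      koala = tags[tags["Tag Name"].str.contains("Koala", case=False, na=False)]
      print(koala[["Start Time", "End Time", "Sensor Name", "Score"]])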

  6. Example of a sentence from the dataset, annotated by 5 independent...

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Andrey Rzhetsky; Hagit Shatkay; W. John Wilbur (2023). Example of a sentence from the dataset, annotated by 5 independent annotators (sentence 10835394_70). [Dataset]. http://doi.org/10.1371/journal.pcbi.1000391.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS Computational Biology
    Authors
    Andrey Rzhetsky; Hagit Shatkay; W. John Wilbur
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Annotations in the context of the real sentence are as follows: The phenotypes of mxp19 (Fig 1B) |A2:**1SP3E3| and mxp170 (data not shown) homozygotes and hemizygotes (data not shown) are identical, |A3:**1SP3E3| |A4:**1SP3E3| |A5:**1GP3E3| suggesting that mxp19 and mxp170 are null alleles. |A1:**1SP3E3| |A2:**2SP3E1| |A3:**1SP2E0| |A4:**2SP2E0| |A5:**2GP2E3|

    The minimum number of sentence fragments required to represent these annotations is three:

    A = “The phenotypes of mxp19 (Fig 1B)”
    B = “and mxp170 (data not shown) homozygotes and hemizygotes (data not shown) are identical,”
    C = “suggesting that mxp19 and mxp170 are null alleles.”

    Annotators' identities are concealed with codes A1, A2, A3, A4, and A5.

  7. EmoVisual Data

    • kaggle.com
    Updated Oct 18, 2024
    Cite
    Arya Shah (2024). EmoVisual Data [Dataset]. https://www.kaggle.com/datasets/aryashah2k/emovisual-data/data
    Explore at:
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 18, 2024
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    Arya Shah
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Emo Visual Data

    Introduction

    This is an emoticon visual annotation dataset: it collects 5,329 emoticons and uses the glm-4v API and the step-free-api project to complete the visual annotation with multimodal large models.

    Example:

    0f20b31d-e019-4565-9286-fdf29cc8e144.jpg

    Original 这个表情包中的内容和笑点在于它展示了一只卡通兔子,兔子的表情看起来既无奈又有些生气,配文是“活着已经够累了,上网你还要刁难我”。这句话以一种幽默的方式表达了许多人在上网时可能会遇到的挫折感或烦恼,尤其是当遇到困难或不顺心的事情时。这种对现代生活压力的轻松吐槽使得这个表情包在社交媒体上很受欢迎,人们用它来表达自己在网络世界中的疲惫感或面对困难时的幽默态度。

    Translated: The content and laughter of this emoticon package is that it shows a cartoon rabbit. The rabbit's expression looks helpless and a little angry. The caption is "I am tired of living, but you still make things difficult for me online." This quote expresses in a humorous way the frustration or annoyance that many people may experience when surfing the Internet, especially when something difficult or doesn't go their way. This lighthearted take on the pressures of modern life has made the meme popular on social media, where people use it to express their feelings of exhaustion in the online world or to use humor in the face of difficulties.

  8. Guidelines for Data Annotation

    • dataverse.tdl.org
    pdf
    Updated Sep 15, 2020
    Cite
    Kate Mesh; Kate Mesh (2020). Guidelines for Data Annotation [Dataset]. http://doi.org/10.18738/T8/FWOOJQ
    Explore at:
    Available download formats: pdf (167426), pdf (2472574)
    Dataset updated
    Sep 15, 2020
    Dataset provided by
    Texas Data Repository
    Authors
    Kate Mesh; Kate Mesh
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Included here are a coding manual and supplementary examples of gesture forms (in still images and video recordings) that informed the coding of the first author (Kate Mesh) and four project reliability coders.

  9. Self-Annotated Wearable Activity Data

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Sep 18, 2024
    Cite
    Alexander Hölzemann; Alexander Hölzemann; Kristof Van Laerhoven; Kristof Van Laerhoven (2024). Self-Annotated Wearable Activity Data [Dataset]. http://doi.org/10.3389/fcomp.2024.1379788
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 18, 2024
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Alexander Hölzemann; Alexander Hölzemann; Kristof Van Laerhoven; Kristof Van Laerhoven
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Our dataset contains 2 weeks of approx. 8-9 hours of acceleration data per day from 11 participants wearing a Bangle.js Version 1 smartwatch with our firmware installed.

    The dataset contains annotations from 4 different commonly used annotation methods utilized in user studies that focus on in-the-wild data. These methods can be grouped into user-driven, in situ annotations, which are performed before or during the recorded activity, and recall methods, where participants annotate their data in hindsight at the end of the day.

    The participants were asked to label their activities using (1) a button located on the smartwatch, (2) the activity tracking app Strava, (3) a (hand)written diary and (4) a tool to visually inspect and label activity data, called MAD-GUI. Methods (1)-(3) were used in both weeks; method (4) was introduced at the beginning of the second study week.

    The accelerometer data were recorded at 25 Hz with a sensitivity of ±8 g and are stored in CSV format. Labels and raw data are not yet combined. You can either write your own script to label the data or follow the instructions in our corresponding Github repository.

    The following unique classes are included in our dataset:

    laying, sitting, walking, running, cycling, bus_driving, car_driving, vacuum_cleaning, laundry, cooking, eating, shopping, showering, yoga, sport, playing_games, desk_work, guitar_playing, gardening, table_tennis, badminton, horse_riding.

    However, many activities are very participant specific and therefore only performed by one of the participants.

    The labels are also stored as a .csv file and have the following columns:

    week_day, start, stop, activity, layer

    Example:

    week2_day2,10:30:00,11:00:00,vacuum_cleaning,d

    The layer column specifies which annotation method was used to set this label.

    The following identifiers can be found in the column:

    b: in situ button

    a: in situ app

    d: self-recall diary

    g: time-series recall, labelled with the MAD-GUI
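
    A minimal labelling sketch (assuming pandas; the file name and the lookup logic are assumptions, while the label columns and layer codes are taken from the description above):

      import pandas as pd

      labels = pd.read_csv("labels.csv")  # columns: week_day, start, stop, activity, layer

      def label_for(week_day, timestamp, layer="d"):
          """Return the annotated activity for a given day and HH:MM:SS time, if any."""
          rows = labels[(labels["week_day"] == week_day) & (labels["layer"] == layer)]
          hit = rows[(rows["start"] <= timestamp) & (timestamp <= rows["stop"])]
          return hit["activity"].iloc[0] if len(hit) else None

      print(label_for("week2_day2", "10:45:00"))  # -> "vacuum_cleaning"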

    The corresponding publication is currently under review.

  10. Annotated_NER_PDF_Resumes

    • huggingface.co
    Updated Jul 22, 2024
    Cite
    MehyarMlaweh (2024). Annotated_NER_PDF_Resumes [Dataset]. https://huggingface.co/datasets/Mehyaar/Annotated_NER_PDF_Resumes
    Explore at:
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 22, 2024
    Authors
    MehyarMlaweh
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    IT Skills Named Entity Recognition (NER) Dataset

      Description:

    This dataset includes 5,029 curriculum vitae (CV) samples, each annotated with IT skills using Named Entity Recognition (NER). The skills are manually labeled and extracted from PDFs, and the data is provided in JSON format. This dataset is ideal for training and evaluating NER models, especially for extracting IT skills from CVs.

      Highlights:

    5,029 CV samples with annotated IT skills. Manual annotations for… See the full description on the dataset page: https://huggingface.co/datasets/Mehyaar/Annotated_NER_PDF_Resumes.
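
    A quick way to pull the data (assuming the Hugging Face datasets library; the split name may differ):

      from datasets import load_dataset

      ds = load_dataset("Mehyaar/Annotated_NER_PDF_Resumes")
      print(ds)              # inspect available splits and columns
      print(ds["train"][0])  # first annotated CV sample, if a "train" split exists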

  11. HED schema library for SCORE annotations example

    • openneuro.org
    Updated Jun 25, 2025
    Cite
    Tal Pal Attia; Kay Robbins; Dora Hermes (2025). HED schema library for SCORE annotations example [Dataset]. http://doi.org/10.18112/openneuro.ds006392.v1.0.1
    Explore at:
    Dataset updated
    Jun 25, 2025
    Dataset provided by
    OpenNeuro: https://openneuro.org/
    Authors
    Tal Pal Attia; Kay Robbins; Dora Hermes
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    BIDS example with HED-SCORE schema library annotations

    The HED schema library for the Standardized Computer-based Organized Reporting of EEG (SCORE) can be used to add annotations for BIDS datasets. The annotations are machine readable and validated with the BIDS and HED validators.

    This example is related to the following preprint: Dora Hermes, Tal Pal Attia, Sándor Beniczky, Jorge Bosch-Bayard, Arnaud Delorme, Brian Nils Lundstrom, Christine Rogers, Stefan Rampp, Seyed Yahya Shirazi, Dung Truong, Pedro Valdes-Sosa, Greg Worrell, Scott Makeig, Kay Robbins. Hierarchical Event Descriptor library schema for EEG data annotation. arXiv preprint arXiv:2310.15173. 2024 Oct 27.

    General information

    This BIDS example dataset includes iEEG data from one subject that were measured during clinical photic stimulation. Intracranial EEG data were collected at Mayo Clinic Rochester, MN under IRB#: 15-006530.

    Events

    The events are annotated according to the HED-SCORE schema library. Data are annotated by adding a column for annotations in the _events.tsv. The levels and annotations in this column are defined in the _events.json sidecar as HED tags.
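
    As a rough illustration only (the column name, level, and HED tags below are invented, not taken from this dataset), annotating a categorical _events.tsv column through its _events.json sidecar can look roughly like this:

      import json

      # Illustrative sidecar entry: column name, level, and HED strings are made up.
      sidecar = {
          "event_type": {
              "Levels": {"photic_stim": "Photic stimulation block"},
              "HED": {"photic_stim": "Sensory-event, Visual-presentation"},
          }
      }
      with open("sub-01_task-photic_events.json", "w") as f:
          json.dump(sidecar, f, indent=2)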

    More information

    HED: https://www.hedtags.org/
    HED schema library for SCORE: https://github.com/hed-standard/hed-schema-library

    Contact

    Dora Hermes: hermes.dora@mayo.edu

  12. Data from: Dataset with condition monitoring vibration data annotated with...

    • researchdata.se
    Updated Jun 17, 2025
    + more versions
    Cite
    Karl Löwenmark; Fredrik Sandin; Marcus Liwicki; Stephan Schnabel (2025). Dataset with condition monitoring vibration data annotated with technical language, from paper machine industries in northern Sweden [Dataset]. http://doi.org/10.5878/hxc0-bd07
    Explore at:
    Available download formats: (200308), (124)
    Dataset updated
    Jun 17, 2025
    Dataset provided by
    Luleå University of Technology
    Authors
    Karl Löwenmark; Fredrik Sandin; Marcus Liwicki; Stephan Schnabel
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Area covered
    Sweden
    Description

    Labelled industry datasets are one of the most valuable assets in prognostics and health management (PHM) research. However, creating labelled industry datasets is both difficult and expensive, making publicly available industry datasets rare at best, in particular labelled datasets. Recent studies have showcased that industry annotations can be used to train artificial intelligence models directly on industry data (https://doi.org/10.36001/ijphm.2022.v13i2.3137, https://doi.org/10.36001/phmconf.2023.v15i1.3507), but while many industry datasets also contain text descriptions or logbooks in the form of annotations and maintenance work orders, few, if any, are publicly available. Therefore, we release a dataset consisting of annotated signal data from two large (80 m x 10 m x 10 m) paper machines, from a Kraftliner production company in northern Sweden. The data consist of 21 090 pairs of signals and annotations from one year of production. The annotations are written in Swedish by on-site Swedish experts, and the signals consist primarily of accelerometer vibration measurements from the two machines.

    The dataset is structured as a Pandas dataframe and serialized as a pickle (.pkl) file and a JSON (.json) file. The first column (‘id’) is the ID of the samples; the second column (‘Spectra’) contains the fast Fourier transform and envelope-transformed vibration signals; the third column (‘Notes’) contains the associated annotations, mapped so that each annotation is associated with all signals from ten days before the annotation date up to the annotation date; and the fourth column (‘Embeddings’) contains pre-computed embeddings using Swedish SentenceBERT. Each row corresponds to a vibration measurement sample, though there is no distinction in this data between which sensor or machine part each measurement is from.
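
    A minimal loading sketch (assuming pandas; the file name is an assumption, the column names are those described above):

      import pandas as pd

      df = pd.read_pickle("annotated_vibration_data.pkl")  # hypothetical file name
      print(df.columns)               # expected: id, Spectra, Notes, Embeddings
      sample = df.iloc[0]
      print(sample["Notes"])          # Swedish annotation text
      print(len(sample["Spectra"]))   # FFT/envelope-transformed vibration signal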

  13. Dataset for "Information Correspondence between Types of Documentation for...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 19, 2024
    Cite
    Deeksha M. Arya (2024). Dataset for "Information Correspondence between Types of Documentation for APIs" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3959239
    Explore at:
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Deeksha M. Arya
    Martin P. Robillard
    Jin L.C. Guo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This online appendix contains the coding guide and the data used in the paper "Information Correspondence between Types of Documentation for APIs", accepted for publication in the Empirical Software Engineering (EMSE) journal. The tutorial data was retrieved in October 2018.

    It contains the following files:

    1. CodingGuide.pdf: the coding guide to classify a sentence as API Information or Supporting Text.

    2. annotated_sampled_sentences.csv: the set of 332 sampled sentences and two columns of corresponding annotations – one by the first author of this work and the second by an external annotator. This data was used to calculate the agreement score reported in the paper.

    3. <language>-<topic>.csv: the data set of annotated sentences from the tutorial on <topic> in <language>. For example, Python-REGEX.csv is the file containing sentences from the Python tutorial on regular expressions. Each file contains the preprocessed sentences from the tutorial, their source files, and their annotation of sentence correspondence with reference documentation.

    For licensing reasons, we are unable to upload the original API reference documentation and tutorials, however these are available on request.
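
    Given the two annotation columns in annotated_sampled_sentences.csv, an agreement score can be recomputed along these lines (the column names are assumptions, and Cohen's kappa via scikit-learn is just one common choice of metric, not necessarily the one reported in the paper):

      import pandas as pd
      from sklearn.metrics import cohen_kappa_score

      df = pd.read_csv("annotated_sampled_sentences.csv")
      # Column names are assumed; check the actual header of the file.
      kappa = cohen_kappa_score(df["annotator_1"], df["annotator_2"])
      print(f"Cohen's kappa: {kappa:.3f}")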

  14. Virtual Annotated Cooking Environment Dataset - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Jul 4, 2024
    + more versions
    Cite
    (2024). Virtual Annotated Cooking Environment Dataset - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/88e320c0-604a-519d-ab41-a65aeab14dad
    Explore at:
    Dataset updated
    Jul 4, 2024
    Description

    This dataset was recorded in the Virtual Annotated Cooking Environment (VACE), a new open-source virtual reality dataset (https://sites.google.com/view/vacedataset) and simulator (https://github.com/michaelkoller/vacesimulator) for object interaction tasks in a rich kitchen environment. We use the Unity-based VR simulator to create thoroughly annotated video sequences of a virtual human avatar performing food preparation activities. Based on the MPII Cooking 2 dataset, it enables the recreation of recipes for meals such as sandwiches, pizzas, fruit salads and smaller activity sequences such as cutting vegetables. For complex recipes, multiple samples are present, following different orderings of valid partially ordered plans. The dataset includes an RGB and depth camera view, bounding boxes, object segmentation masks, human joint poses and object poses, as well as ground truth interaction data in the form of temporally labeled semantic predicates (holding, on, in, colliding, moving, cutting). In our effort to make the simulator accessible as an open-source tool, researchers are able to expand the setting and annotation to create additional data samples. The research leading to these results has received funding from the Austrian Science Fund (FWF) under grant agreement No. I3969-N30 InDex and the project Doctorate College TrustRobots by TU Wien. Thanks go out to Simon Schreiberhuber for sharing his Unity expertise and to the colleagues at the TU Wien Center for Research Data Management for data hosting and support.

  15. Data from: The Distributed Annotation System

    • data.virginia.gov
    • catalog.data.gov
    html
    Updated Jul 23, 2025
    Cite
    National Institutes of Health (2025). The Distributed Annotation System [Dataset]. https://data.virginia.gov/dataset/the-distributed-annotation-system
    Explore at:
    Available download formats: html
    Dataset updated
    Jul 23, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background
    Currently, most genome annotation is curated by centralized groups with limited resources. Efforts to share annotations transparently among multiple groups have not yet been satisfactory.

    Results
    Here we introduce a concept called the Distributed Annotation System (DAS). DAS allows sequence annotations to be decentralized among multiple third-party annotators and integrated on an as-needed basis by client-side software. The communication between client and servers in DAS is defined by the DAS XML specification. Annotations are displayed in layers, one per server. Any client or server adhering to the DAS XML specification can participate in the system; we describe a simple prototype client and server example.

    Conclusions
    The DAS specification is being used experimentally by Ensembl, WormBase, and the Berkeley Drosophila Genome Project. Continued success will depend on the readiness of the research community to adopt DAS and provide annotations. All components are freely available from the project website.
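
    A toy client sketch (the server URL and data source name are placeholders; the /das/<source>/features?segment=... request follows the DAS 1.x convention described in the specification, so element names may need adjusting):

      import urllib.request
      import xml.etree.ElementTree as ET

      # Placeholder server and data source name.
      url = "http://example.org/das/mygenome/features?segment=chr1:1,10000"
      with urllib.request.urlopen(url) as response:
          tree = ET.parse(response)

      # FEATURE elements carry the per-server annotation layer.
      for feature in tree.iter("FEATURE"):
          print(feature.get("id"), feature.get("label"))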
    
  16. Cattle (Cows) Object Detection Dataset

    • kaggle.com
    Updated Aug 14, 2023
    Cite
    Training Data (2023). Cattle (Cows) Object Detection Dataset [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/cows-detection-dataset
    Explore at:
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 14, 2023
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    Training Data
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Cows Object Detection Dataset

    The dataset is a collection of images along with corresponding bounding box annotations that are specifically curated for detecting cows in images. The dataset covers different cow breeds, sizes, and orientations, providing a comprehensive representation of cow appearances and positions. Additionally, the visibility of each cow is presented in the .xml file.

    💴 For Commercial Usage: To discuss your requirements, learn about the price, and buy the dataset, leave a request on TrainingData.

    The cow detection dataset offers a diverse collection of annotated images, allowing for comprehensive algorithm development, evaluation, and benchmarking, ultimately aiding in the development of accurate and robust models.


    Dataset structure

    • images - contains the original images of cows
    • boxes - includes bounding box labeling for the original images
    • annotations.xml - contains the coordinates of the bounding boxes and labels created for the original photos

    Data Format

    Each image from the images folder is accompanied by an XML annotation in the annotations.xml file indicating the coordinates of the bounding boxes for cow detection. For each point, the x and y coordinates are provided. Visibility of the cow is also provided by the is_visible label (true, false).

    Example of XML file structure

    (Screenshot of the annotations.xml structure, not reproduced here.)
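
    A rough parsing sketch, assuming a CVAT-style export (the tag and attribute names below are assumptions and should be checked against the actual annotations.xml):

      import xml.etree.ElementTree as ET

      root = ET.parse("annotations.xml").getroot()
      for image in root.iter("image"):
          for box in image.findall("box"):
              coords = {k: box.get(k) for k in ("xtl", "ytl", "xbr", "ybr")}
              visible = box.findtext("attribute[@name='is_visible']")
              print(image.get("name"), box.get("label"), coords, visible)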

    Cow detection datasets can be produced in accordance with your requirements.

    💴 Buy the Dataset: This is just an example of the data. Leave a request on https://trainingdata.pro/datasets to discuss your requirements, learn about the price and buy the dataset

    TrainingData provides high-quality data annotation tailored to your needs

    keywords: farm animal, animal recognition, farm animal detection, image-based recognition, farmers, “on-farm” data, cows detection, cow images dataset, object detection, deep learning, computer vision, animal contacts, images dataset, agriculture, multiple animal pose estimation, cattle detection, identification, posture recognition, cattle images, individual beef cattle, cattle ranch, dairy cattle, farming, bounding boxes

  17. Dataset for the paper: "Monant Medical Misinformation Dataset: Mapping...

    • zenodo.org
    Updated Apr 22, 2022
    Cite
    Ivan Srba; Ivan Srba; Branislav Pecher; Branislav Pecher; Matus Tomlein; Matus Tomlein; Robert Moro; Robert Moro; Elena Stefancova; Elena Stefancova; Jakub Simko; Jakub Simko; Maria Bielikova; Maria Bielikova (2022). Dataset for the paper: "Monant Medical Misinformation Dataset: Mapping Articles to Fact-Checked Claims" [Dataset]. http://doi.org/10.5281/zenodo.5996864
    Explore at:
    Dataset updated
    Apr 22, 2022
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Ivan Srba; Ivan Srba; Branislav Pecher; Branislav Pecher; Matus Tomlein; Matus Tomlein; Robert Moro; Robert Moro; Elena Stefancova; Elena Stefancova; Jakub Simko; Jakub Simko; Maria Bielikova; Maria Bielikova
    Description

    Overview

    This dataset of medical misinformation was collected and is published by Kempelen Institute of Intelligent Technologies (KInIT). It consists of approx. 317k news articles and blog posts on medical topics published between January 1, 1998 and February 1, 2022 from a total of 207 reliable and unreliable sources. The dataset contains full-texts of the articles, their original source URL and other extracted metadata. If a source has a credibility score available (e.g., from Media Bias/Fact Check), it is also included in the form of annotation. Besides the articles, the dataset contains around 3.5k fact-checks and extracted verified medical claims with their unified veracity ratings published by fact-checking organisations such as Snopes or FullFact. Lastly and most importantly, the dataset contains 573 manually and more than 51k automatically labelled mappings between previously verified claims and the articles; mappings consist of two values: claim presence (i.e., whether a claim is contained in the given article) and article stance (i.e., whether the given article supports or rejects the claim or provides both sides of the argument).

    The dataset is primarily intended to be used as a training and evaluation set for machine learning methods for claim presence detection and article stance classification, but it enables a range of other misinformation related tasks, such as misinformation characterisation or analyses of misinformation spreading.

    Its novelty and our main contributions lie in (1) focus on medical news articles and blog posts as opposed to social media posts or political discussions; (2) providing multiple modalities (besides full-texts of the articles, there are also images and videos), thus enabling research of multimodal approaches; (3) mapping of the articles to the fact-checked claims (with manual as well as predicted labels); (4) providing source credibility labels for 95% of all articles and other potential sources of weak labels that can be mined from the articles' content and metadata.

    The dataset is associated with the research paper "Monant Medical Misinformation Dataset: Mapping Articles to Fact-Checked Claims" accepted and presented at ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '22).

    The accompanying Github repository provides a small static sample of the dataset and a descriptive analysis of the dataset in the form of Jupyter notebooks.

    Options to access the dataset

    There are two ways to get access to the dataset:

    1. Static dump of the dataset available in the CSV format
    2. Continuously updated dataset available via REST API

    To obtain access to the dataset (either the full static dump or the REST API), please request access by following the instructions provided below.

    References

    If you use this dataset in any publication, project, tool or in any other form, please cite the following papers:

    @inproceedings{SrbaMonantPlatform,
      author = {Srba, Ivan and Moro, Robert and Simko, Jakub and Sevcech, Jakub and Chuda, Daniela and Navrat, Pavol and Bielikova, Maria},
      booktitle = {Proceedings of Workshop on Reducing Online Misinformation Exposure (ROME 2019)},
      pages = {1--7},
      title = {Monant: Universal and Extensible Platform for Monitoring, Detection and Mitigation of Antisocial Behavior},
      year = {2019}
    }
    @inproceedings{SrbaMonantMedicalDataset,
      author = {Srba, Ivan and Pecher, Branislav and Tomlein, Matus and Moro, Robert and Stefancova, Elena and Simko, Jakub and Bielikova, Maria},
      booktitle = {Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '22)},
      numpages = {11},
      title = {Monant Medical Misinformation Dataset: Mapping Articles to Fact-Checked Claims},
      year = {2022},
      doi = {10.1145/3477495.3531726},
      publisher = {Association for Computing Machinery},
      address = {New York, NY, USA},
      url = {https://doi.org/10.1145/3477495.3531726},
    }
    


    Dataset creation process

    In order to create this dataset (and to continuously obtain new data), we used our research platform Monant. The Monant platform provides so-called data providers to extract news articles/blogs from news/blog sites as well as fact-checking articles from fact-checking sites. General parsers (from RSS feeds, Wordpress sites, Google Fact Check Tool, etc.) as well as custom crawlers and parsers were implemented (e.g., for the fact-checking site Snopes.com). All data are stored in a unified format in a central data storage.


    Ethical considerations

    The dataset was collected and is published for research purposes only. We collected only publicly available content of news/blog articles. The dataset contains identities of authors of the articles if they were stated in the original source; we left this information, since the presence of an author's name can be a strong credibility indicator. However, we anonymised the identities of the authors of discussion posts included in the dataset.

    The main identified ethical issue related to the presented dataset lies in the risk of mislabelling of an article as supporting a false fact-checked claim and, to a lesser extent, in mislabelling an article as not containing a false claim or not supporting it when it actually does. To minimise these risks, we developed a labelling methodology and require an agreement of at least two independent annotators to assign a claim presence or article stance label to an article. It is also worth noting that we do not label an article as a whole as false or true. Nevertheless, we provide partial article-claim pair veracities based on the combination of claim presence and article stance labels.

    As to the veracity labels of the fact-checked claims and the credibility (reliability) labels of the articles' sources, we take these from the fact-checking sites and external listings such as Media Bias/Fact Check as they are and refer to their methodologies for more details on how they were established.

    Lastly, the dataset also contains automatically predicted labels of claim presence and article stance using our baselines described in the next section. These methods have their limitations and work with certain accuracy as reported in this paper. This should be taken into account when interpreting them.


    Reporting mistakes in the dataset

    The means to report considerable mistakes in the raw collected data or in the manual annotations is to create a new issue in the accompanying Github repository. Alternatively, general enquiries or requests can be sent to info [at] kinit.sk.


    Dataset structure

    Raw data

    First, the dataset contains so-called raw data (i.e., data extracted by the Web monitoring module of the Monant platform and stored in exactly the same form as they appear at the original websites). Raw data consist of articles from news sites and blogs (e.g. naturalnews.com), discussions attached to such articles, and fact-checking articles from fact-checking portals (e.g. snopes.com). In addition, the dataset contains feedback (number of likes, shares, comments) provided by users on the social network Facebook, which is regularly extracted for all news/blog articles.

    Raw data are contained in these CSV files (and corresponding REST API endpoints):

    • sources.csv
    • articles.csv
    • article_media.csv
    • article_authors.csv
    • discussion_posts.csv
    • discussion_post_authors.csv
    • fact_checking_articles.csv
    • fact_checking_article_media.csv
    • claims.csv
    • feedback_facebook.csv

    Note: Personal information about discussion posts' authors (name, website, gravatar) is anonymised.


    Annotations

    Second, the dataset contains so-called annotations. Entity annotations describe individual raw-data entities (e.g., article, source). Relation annotations describe relations between two such entities.

    Each annotation is described by the following attributes:

    1. category of annotation (`annotation_category`). Possible values: label (annotation corresponds to ground truth, determined by human experts) and prediction (annotation was created by means of an AI method).
    2. type of annotation (`annotation_type_id`). Example values: Source reliability (binary), Claim presence. The list of possible values can be obtained from the enumeration in annotation_types.csv.
    3. method which created the annotation (`method_id`). Example values: Expert-based source reliability evaluation, Fact-checking article to claim transformation method. The list of possible values can be obtained from the enumeration in methods.csv.
    4. its value (`value`). The value is stored in JSON format and its structure differs according to the particular annotation type.


    At the same time, annotations are associated with a particular object identified by:

    1. entity type (parameter entity_type in case of entity annotations, or source_entity_type and target_entity_type in case of relation annotations). Possible values: sources, articles, fact-checking-articles.
    2. entity id (parameter entity_id in case of entity annotations, or source_entity_id and target_entity_id in case of relation annotations).
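
    A rough sketch of working with an entity-annotation export (the file name and exact column set are assumptions; the JSON-encoded value column is described above):

      import json
      import pandas as pd

      ann = pd.read_csv("entity_annotations.csv")  # hypothetical file name
      ann["value"] = ann["value"].apply(json.loads)

      # e.g. keep only human-labelled (ground-truth) annotations
      labels = ann[ann["annotation_category"] == "label"]
      print(labels[["entity_type", "entity_id", "annotation_type_id", "value"]].head())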

  18. fava-data-processed

    • huggingface.co
    Updated Dec 2, 2024
    Cite
    Weights and Biases (2024). fava-data-processed [Dataset]. https://huggingface.co/datasets/wandb/fava-data-processed
    Explore at:
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 2, 2024
    Dataset authored and provided by
    Weights and Biases
    Description

    FAVA Dataset (Processed)

      Dataset Description

      Dataset Summary

    The FAVA (Factual Association and Verification Annotations) dataset is designed for evaluating hallucinations in language model outputs. This processed version contains binary hallucination labels derived from detailed span-level annotations in the original dataset.

      Dataset Structure

    Each example contains:

    Required columns: query (the prompt given to the model); context (empty field, for…). See the full description on the dataset page: https://huggingface.co/datasets/wandb/fava-data-processed.
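
    To load the processed data (assuming the Hugging Face datasets library; the split name may differ):

      from datasets import load_dataset

      ds = load_dataset("wandb/fava-data-processed")
      print(ds)              # inspect splits and columns (query, context, ...)
      print(ds["train"][0])  # first example, if a "train" split exists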

  19. The SEE-AI Project Dataset

    • kaggle.com
    Updated May 16, 2023
    Cite
    capsule yolo (2023). The SEE-AI Project Dataset [Dataset]. http://doi.org/10.34740/kaggle/ds/1516536
    Explore at:
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 16, 2023
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    capsule yolo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Context

    The SEE-AI Project Dataset is a collection of small bowel capsule endoscopy (CE) images obtained using the PillCam™ SB 3 (Medtronic, Minneapolis, MN, USA), which is the subject of the present paper (Small Bowel Capsule Endoscopy Examination with Object Detection Artificial Intelligence Model: The SEE-AI Project; the paper is currently in submission). This dataset comprises 18,481 images extracted from 523 small bowel capsule endoscopy videos. We annotated 12,320 images with 23,033 disease lesions and combined them with 6,161 normal mucosa images. The annotations are provided in YOLO format. While automated or assisted reading techniques for small bowel CE are highly desired, current AI models have not yet been able to accurately identify multiple types of clinically relevant lesions from CE images to the same extent as expert physicians. One major reason for this is the presence of a certain number of images that are difficult to annotate and label, and the lack of adequately constructed data sets. In the aforementioned paper, we tested an object detection model using YOLO v5. The annotations were created by us, and we believe that more effective methods for annotation should be further investigated. We hope that this dataset will be useful for future small bowel CE object detection research.

    License

    We have presented the dataset of the SEE-AI project at Kaggle (https://www.kaggle.com/), the world’s largest data science online community. Our data are licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) License. The material is free to copy and redistribute in any medium or format and can be remixed, transformed, and built upon for any purpose if appropriate credit is given.

    Credit

    More details on this data set can be found in the following paper. Please cite this paper when using this dataset. Yokote, A., Umeno, J., Kawasaki, K., Fujioka, S., Fuyuno, Y., Matsuno, Y., Yoshida, Y., Imazu, N., Miyazono, S., Moriyama, T., Kitazono, T. and Torisu, T. (2024), Small bowel capsule endoscopy examination and open access database with artificial intelligence: The SEE-artificial intelligence project. DEN Open, 4: e258. https://doi.org/10.1002/deo2.258

    Content

    The SEE-AI Project Dataset consists of image data and annotation data: 18,481 images with annotations in YOLO format. The annotations are written in .txt files whose filenames match the image files; images without annotations have empty .txt files.
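
    Since the labels are in YOLO format, each non-empty .txt file holds one line per lesion: a class index followed by the normalised box centre and size. A minimal parsing sketch (the example file name is hypothetical):

      from pathlib import Path

      def read_yolo_labels(txt_path):
          """Return (class_id, x_center, y_center, width, height), all normalised to [0, 1]."""
          boxes = []
          for line in Path(txt_path).read_text().splitlines():
              if line.strip():
                  cls, xc, yc, w, h = line.split()
                  boxes.append((int(cls), float(xc), float(yc), float(w), float(h)))
          return boxes

      print(read_yolo_labels("example_image.txt"))  # empty list for normal mucosa images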

    Acknowledgements

    We want to thank the Department of Medicine and Clinical Science, Kyushu University, for their cooperation in data collection. We also thank Ultralytics for making YOLO v5 available. The project name of this dataset was changed due to a name duplication; the previous project name was The AICE Project. This was changed on May 14, 2023.

    Example usage on google colab with sample model weights

    https://colab.research.google.com/drive/1mEE5zXq1U9vC01P-qjxHR2kvxr_3Imz0?usp=sharing

    Inspiration

    We would be grateful if you could consider setting up better annotation and collecting more small intestine CE images. We hope that many more facilities will collect CE images in the future and that datasets will become larger.

  20. Image Tagging and Annotation Services Market Report | Global Forecast From...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Image Tagging and Annotation Services Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-image-tagging-and-annotation-services-market
    Explore at:
    Available download formats: pdf, pptx, csv
    Dataset updated
    Jan 7, 2025
    Dataset provided by
    Authors
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Image Tagging and Annotation Services Market Outlook



    The global image tagging and annotation services market size was valued at approximately USD 1.5 billion in 2023 and is projected to reach around USD 4.8 billion by 2032, growing at a compound annual growth rate (CAGR) of about 14%. This robust growth is driven by the exponential rise in demand for machine learning and artificial intelligence applications, which heavily rely on annotated datasets to train algorithms effectively. The surge in digital content creation and the increasing need for organized data for analytical purposes are also significant contributors to the market expansion.



    One of the primary growth factors for the image tagging and annotation services market is the increasing adoption of AI and machine learning technologies across various industries. These technologies require large volumes of accurately labeled data to function optimally, making image tagging and annotation services crucial. Specifically, sectors such as healthcare, automotive, and retail are investing in AI-driven solutions that necessitate high-quality annotated images to enhance machine learning models' efficiency. For example, in healthcare, annotated medical images are essential for developing tools that can aid in diagnostics and treatment decisions. Similarly, in the automotive industry, annotated images are pivotal for the development of autonomous vehicles.



    Another significant driver is the growing emphasis on improving customer experience through personalized solutions. Companies are leveraging image tagging and annotation services to better understand consumer behavior and preferences by analyzing visual content. In retail, for instance, businesses analyze customer-generated images to tailor marketing strategies and improve product offerings. Additionally, the integration of augmented reality (AR) and virtual reality (VR) in various applications has escalated the need for precise image tagging and annotation, as these technologies rely on accurately labeled datasets to deliver immersive experiences.



    Data Collection and Labeling are foundational components in the realm of image tagging and annotation services. The process of collecting and labeling data involves gathering vast amounts of raw data and meticulously annotating it to create structured datasets. These datasets are crucial for training machine learning models, enabling them to recognize patterns and make informed decisions. The accuracy of data labeling directly impacts the performance of AI systems, making it a critical step in the development of reliable AI applications. As industries increasingly rely on AI-driven solutions, the demand for high-quality data collection and labeling services continues to rise, underscoring their importance in the broader market landscape.



    The rising trend of digital transformation across industries has also significantly bolstered the demand for image tagging and annotation services. Organizations are increasingly investing in digital tools that can automate processes and enhance productivity. Image annotation plays a critical role in enabling technologies such as computer vision, which is instrumental in automating tasks ranging from quality control to inventory management. Moreover, the proliferation of smart devices and the Internet of Things (IoT) has led to an unprecedented amount of image data generation, further fueling the need for efficient image tagging and annotation services to make sense of the vast data deluge.



    From a regional perspective, North America is currently the largest market for image tagging and annotation services, attributed to the early adoption of advanced technologies and the presence of numerous tech giants investing in AI and machine learning. The region is expected to maintain its dominance due to ongoing technological advancements and the growing demand for AI solutions across various sectors. Meanwhile, the Asia Pacific region is anticipated to experience the fastest growth during the forecast period, driven by rapid industrialization, increasing internet penetration, and the rising adoption of AI technologies in countries like China, India, and Japan. The European market is also witnessing steady growth, supported by government initiatives promoting digital innovation and the use of AI-driven applications.



    Service Type Analysis



    The service type segment in the image tagging and annotation services market is bifurcated into manual annotation and automated annotation.
