63 datasets found
  1. Automated Data Annotation Tools Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Apr 24, 2025
    Cite
    Data Insights Market (2025). Automated Data Annotation Tools Report [Dataset]. https://www.datainsightsmarket.com/reports/automated-data-annotation-tools-1947663
    Available download formats: pdf, ppt, doc
    Dataset updated
    Apr 24, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Discover the booming Automated Data Annotation Tools market! This comprehensive analysis reveals key trends, drivers, restraints, and forecasts for 2025-2033, covering major regions & applications. Learn about leading companies and unlock opportunities in this rapidly evolving AI landscape.

  2. Automated Data Annotation Tools Report

    • marketresearchforecast.com
    doc, pdf, ppt
    Updated Jul 9, 2025
    Cite
    Market Research Forecast (2025). Automated Data Annotation Tools Report [Dataset]. https://www.marketresearchforecast.com/reports/automated-data-annotation-tools-544649
    Available download formats: pdf, doc, ppt
    Dataset updated
    Jul 9, 2025
    Dataset authored and provided by
    Market Research Forecast
    License

    https://www.marketresearchforecast.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Automated Data Annotation Tools market is booming, projected to reach $3.2 Billion by 2033. Discover key market trends, growth drivers, and leading companies shaping this vital sector for AI development. Explore our in-depth analysis covering market segmentation, regional insights, and future forecasts.

  3. Manual Data Annotation Tools Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Apr 15, 2025
    Cite
    Data Insights Market (2025). Manual Data Annotation Tools Report [Dataset]. https://www.datainsightsmarket.com/reports/manual-data-annotation-tools-1450942
    Available download formats: pdf, doc, ppt
    Dataset updated
    Apr 15, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The booming manual data annotation tools market is projected to reach $1045.4 million by 2025, growing at a CAGR of 14.2% through 2033. Learn about key drivers, trends, regional insights, and leading companies shaping this crucial sector for AI development. Explore market segmentation by application (IT, BFSI, Healthcare, etc.) and annotation type (image/video, text, audio).
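
As a rough check of what the quoted figures imply, the sketch below compounds the 2025 base forward at the stated CAGR. It assumes the 14.2% rate applies once per year for the eight years from 2025 to 2033; this summary does not state a 2033 value, so the result is illustrative only, and the function name is made up for this example.

```python
def project_market_size(base_value: float, cagr: float, years: int) -> float:
    """Compound a base value forward: base * (1 + cagr) ** years."""
    return base_value * (1.0 + cagr) ** years


base_2025_musd = 1045.4   # USD millions, as quoted in the description
cagr = 0.142              # 14.2% compound annual growth rate, as quoted
years = 2033 - 2025       # assumption: eight annual compounding periods

projected_2033_musd = project_market_size(base_2025_musd, cagr, years)
print(f"Implied 2033 market size: about {projected_2033_musd:,.0f} million USD")
```

Under these assumptions the implied 2033 figure is roughly three times the 2025 base.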

  4. Automated Data Annotation Tool Report

    • marketresearchforecast.com
    doc, pdf, ppt
    Updated Mar 13, 2025
    Cite
    Market Research Forecast (2025). Automated Data Annotation Tool Report [Dataset]. https://www.marketresearchforecast.com/reports/automated-data-annotation-tool-33033
    Available download formats: doc, pdf, ppt
    Dataset updated
    Mar 13, 2025
    Dataset authored and provided by
    Market Research Forecast
    License

    https://www.marketresearchforecast.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The automated data annotation tool market is booming, projected to reach $10 billion by 2033. Learn about market trends, key players (Amazon, Google, etc.), and the driving forces behind this explosive growth in AI training data. Discover insights into regional market shares and segmentation data.

  5. Integrated Data Annotation

    • rrid.site
    • neuinfo.org
    • +2 more
    Updated Nov 15, 2025
    Cite
    (2025). Integrated Data Annotation [Dataset]. http://identifiers.org/RRID:SCR_010499
    Dataset updated
    Nov 15, 2025
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE. Documented September 15, 2017. A virtual database of annotations between databases.

  6. AI Training Data | Annotated Checkout Flows for Retail, Restaurant, and...

    • datarade.ai
    Updated Dec 18, 2024
    Cite
    MealMe (2024). AI Training Data | Annotated Checkout Flows for Retail, Restaurant, and Marketplace Websites [Dataset]. https://datarade.ai/data-products/ai-training-data-annotated-checkout-flows-for-retail-resta-mealme
    Dataset updated
    Dec 18, 2024
    Dataset authored and provided by
    MealMe
    Area covered
    United States of America
    Description

    AI Training Data | Annotated Checkout Flows for Retail, Restaurant, and Marketplace Websites Overview

    Unlock the next generation of agentic commerce and automated shopping experiences with this comprehensive dataset of meticulously annotated checkout flows, sourced directly from leading retail, restaurant, and marketplace websites. Designed for developers, researchers, and AI labs building large language models (LLMs) and agentic systems capable of online purchasing, this dataset captures the real-world complexity of digital transactions—from cart initiation to final payment.

    Key Features

    Breadth of Coverage: Over 10,000 unique checkout journeys across hundreds of top e-commerce, food delivery, and service platforms, including but not limited to Walmart, Target, Kroger, Whole Foods, Uber Eats, Instacart, Shopify-powered sites, and more.

    Actionable Annotation: Every flow is broken down into granular, step-by-step actions, complete with timestamped events, UI context, form field details, validation logic, and response feedback. Each step includes:

    Page state (URL, DOM snapshot, and metadata)

    User actions (clicks, taps, text input, dropdown selection, checkbox/radio interactions)

    System responses (AJAX calls, error/success messages, cart/price updates)

    Authentication and account linking steps where applicable

    Payment entry (card, wallet, alternative methods)

    Order review and confirmation

    Multi-Vertical, Real-World Data: Flows sourced from a wide variety of verticals and real consumer environments, not just demo stores or test accounts. Includes complex cases such as multi-item carts, promo codes, loyalty integration, and split payments.

    Structured for Machine Learning: Delivered in standard formats (JSONL, CSV, or your preferred schema), with every event mapped to action types, page features, and expected outcomes. Optional HAR files and raw network request logs provide an extra layer of technical fidelity for action modeling and RLHF pipelines.

    Rich Context for LLMs and Agents: Every annotation includes both human-readable and model-consumable descriptions:

    “What the user did” (natural language)

    “What the system did in response”

    “What a successful action should look like”

    Error/edge case coverage (invalid forms, OOS, address/payment errors)

    Privacy-Safe & Compliant: All flows are depersonalized and scrubbed of PII. Sensitive fields (like credit card numbers, user addresses, and login credentials) are replaced with realistic but synthetic data, ensuring compliance with privacy regulations.

    Each flow tracks the user journey from cart to payment to confirmation, including:

    Adding/removing items

    Applying coupons or promo codes

    Selecting shipping/delivery options

    Account creation, login, or guest checkout

    Inputting payment details (card, wallet, Buy Now Pay Later)

    Handling validation errors or OOS scenarios

    Order review and final placement

    Confirmation page capture (including order summary details)

    Why This Dataset?

    Building LLMs, agentic shopping bots, or e-commerce automation tools demands more than just page screenshots or API logs. You need deeply contextualized, action-oriented data that reflects how real users interact with the complex, ever-changing UIs of digital commerce. Our dataset uniquely captures:

    The full intent-action-outcome loop

    Dynamic UI changes, modals, validation, and error handling

    Nuances of cart modification, bundle pricing, delivery constraints, and multi-vendor checkouts

    Mobile vs. desktop variations

    Diverse merchant tech stacks (custom, Shopify, Magento, BigCommerce, native apps, etc.)

    Use Cases

    LLM Fine-Tuning: Teach models to reason through step-by-step transaction flows, infer next-best-actions, and generate robust, context-sensitive prompts for real-world ordering.

    Agentic Shopping Bots: Train agents to navigate web/mobile checkouts autonomously, handle edge cases, and complete real purchases on behalf of users.

    Action Model & RLHF Training: Provide reinforcement learning pipelines with ground truth “what happens if I do X?” data across hundreds of real merchants.

    UI/UX Research & Synthetic User Studies: Identify friction points, bottlenecks, and drop-offs in modern checkout design by replaying flows and testing interventions.

    Automated QA & Regression Testing: Use realistic flows as test cases for new features or third-party integrations.

    What’s Included

    10,000+ annotated checkout flows (retail, restaurant, marketplace)

    Step-by-step event logs with metadata, DOM, and network context

    Natural language explanations for each step and transition

    All flows are depersonalized and privacy-compliant

    Example scripts for ingesting, parsing, and analyzing the dataset

    Flexible licensing for research or commercial use

    Sample Categories Covered

    Grocery delivery (Instacart, Walmart, Kroger, Target, etc.)

    Restaurant takeout/delivery (Ub...

  7. Taxonomies for Semantic Research Data Annotation

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jul 23, 2024
    Cite
    Christoph Göpfert; Jan Ingo Haas; Lucas Schröder; Martin Gaedke (2024). Taxonomies for Semantic Research Data Annotation [Dataset]. http://doi.org/10.5281/zenodo.7908855
    Available download formats: zip
    Dataset updated
    Jul 23, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Christoph Göpfert; Jan Ingo Haas; Lucas Schröder; Martin Gaedke
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains 35 of 39 taxonomies that were the result of a systematic review. The systematic review was conducted with the goal of identifying taxonomies suitable for semantically annotating research data. A special focus was set on research data from the hybrid societies domain.

    The following taxonomies were identified as part of the systematic review:

    Filename          Taxonomy Title
    acm_ccs           ACM Computing Classification System [1]
    amec              A Taxonomy of Evaluation Towards Standards [2]
    bibo              A BIBO Ontology Extension for Evaluation of Scientific Research Results [3]
    cdt               Cross-Device Taxonomy [4]
    cso               Computer Science Ontology [5]
    ddbm              What Makes a Data-driven Business Model? A Consolidated Taxonomy [6]
    ddi_am            DDI Aggregation Method [7]
    ddi_moc           DDI Mode of Collection [8]
    n/a               DemoVoc [9]
    discretization    Building a New Taxonomy for Data Discretization Techniques [10]
    dp                Demopaedia [11]
    dsg               Data Science Glossary [12]
    ease              A Taxonomy of Evaluation Approaches in Software Engineering [13]
    eco               Evidence & Conclusion Ontology [14]
    edam              EDAM: The Bioscientific Data Analysis Ontology [15]
    n/a               European Language Social Science Thesaurus [16]
    et                Evaluation Thesaurus [17]
    glos_hci          The Glossary of Human Computer Interaction [18]
    n/a               Humanities and Social Science Electronic Thesaurus [19]
    hcio              A Core Ontology on the Human-Computer Interaction Phenomenon [20]
    hft               Human-Factors Taxonomy [21]
    hri               A Taxonomy to Structure and Analyze Human–Robot Interaction [22]
    iim               A Taxonomy of Interaction for Instructional Multimedia [23]
    interrogation     A Taxonomy of Interrogation Methods [24]
    iot               Design Vocabulary for Human–IoT Systems Communication [25]
    kinect            Understanding Movement and Interaction: An Ontology for Kinect-Based 3D Depth Sensors [26]
    maco              Thesaurus Mass Communication [27]
    n/a               Thesaurus Cognitive Psychology of Human Memory [28]
    mixed_initiative  Mixed-Initiative Human-Robot Interaction: Definition, Taxonomy, and Survey [29]
    qos_qoe           A Taxonomy of Quality of Service and Quality of Experience of Multimodal Human-Machine Interaction [30]
    ro                The Research Object Ontology [31]
    senses_sensors    A Human-Centered Taxonomy of Interaction Modalities and Devices [32]
    sipat             A Taxonomy of Spatial Interaction Patterns and Techniques [33]
    social_errors     A Taxonomy of Social Errors in Human-Robot Interaction [34]
    sosa              Semantic Sensor Network Ontology [35]
    swo               The Software Ontology [36]
    tadirah           Taxonomy of Digital Research Activities in the Humanities [37]
    vrs               Virtual Reality and the CAVE: Taxonomy, Interaction Challenges and Research Directions [38]
    xdi               Cross-Device Interaction [39]


    We converted the taxonomies into SKOS (Simple Knowledge Organisation System) representation. The following 4 taxonomies were already available in SKOS; they were therefore not converted and are excluded from this dataset:

    1) DemoVoc, cf. http://thesaurus.web.ined.fr/navigateur/
    available at https://thesaurus.web.ined.fr/exports/demovoc/demovoc.rdf

    2) European Language Social Science Thesaurus, cf. https://thesauri.cessda.eu/elsst/en/
    available at https://zenodo.org/record/5506929

    3) Humanities and Social Science Electronic Thesaurus, cf. https://hasset.ukdataservice.ac.uk/hasset/en/
    available at https://zenodo.org/record/7568355

    4) Thesaurus Cognitive Psychology of Human Memory, cf. https://www.loterre.fr/presentation/
    available at https://skosmos.loterre.fr/P66/en/
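
To make the SKOS representation mentioned above more concrete, here is a minimal sketch of how a small hierarchical taxonomy could be written out as SKOS in Turtle syntax. It is not the authors' conversion pipeline: the function name, the `ex:` namespace, and the two sample concepts are hypothetical, and only standard SKOS terms (`skos:ConceptScheme`, `skos:Concept`, `skos:prefLabel`, `skos:inScheme`, `skos:broader`, `skos:topConceptOf`) are used.

```python
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_skos_turtle(base_iri: str, scheme_id: str, concepts: dict) -> str:
    """Serialize {concept_id: (label, parent_id or None)} as SKOS Turtle."""
    lines = [SKOS_PREFIX, f"@prefix ex: <{base_iri}> .", ""]
    lines.append(f"ex:{scheme_id} a skos:ConceptScheme .")
    for concept_id, (label, parent_id) in concepts.items():
        parts = [
            f"ex:{concept_id} a skos:Concept",
            f'skos:prefLabel "{escape_literal(label)}"@en',
            f"skos:inScheme ex:{scheme_id}",
        ]
        if parent_id is None:
            # Concepts without a parent become top concepts of the scheme.
            parts.append(f"skos:topConceptOf ex:{scheme_id}")
        else:
            # Child concepts point to their parent with skos:broader.
            parts.append(f"skos:broader ex:{parent_id}")
        lines.append(" ;\n    ".join(parts) + " .")
    return "\n".join(lines)


# Hypothetical two-concept taxonomy, not taken from the dataset.
sample_concepts = {
    "input_modality": ("Input modality", None),
    "touch_input": ("Touch input", "input_modality"),
}
turtle_text = to_skos_turtle("http://example.org/taxonomy/", "demo_scheme", sample_concepts)
print(turtle_text)
```

A dedicated RDF library would normally handle IRI validation and serialization; plain string formatting is used here only to keep the sketch dependency-free.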

    References

    [1] “The 2012 ACM Computing Classification System,” ACM Digital Library, 2012. https://dl.acm.org/ccs (accessed May 08, 2023).

    [2] AMEC, “A Taxonomy of Evaluation Towards Standards.” Aug. 31, 2016. Accessed: May 08, 2023. [Online]. Available: https://amecorg.com/amecframework/home/supporting-material/taxonomy/

    [3] B. Dimić Surla, M. Segedinac, and D. Ivanović, “A BIBO ontology extension for evaluation of scientific research results,” in Proceedings of the Fifth Balkan Conference in Informatics, in BCI ’12. New York, NY, USA: Association for Computing Machinery, Sep. 2012, pp. 275–278. doi: 10.1145/2371316.2371376.

    [4] F. Brudy et al., “Cross-Device Taxonomy: Survey, Opportunities and Challenges of Interactions Spanning Across Multiple Devices,” in Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, in CHI ’19. New York, NY, USA: Association for Computing Machinery, May 2019, pp. 1–28. doi: 10.1145/3290605.3300792.

    [5] A. A. Salatino, T. Thanapalasingam, A. Mannocci, F. Osborne, and E. Motta, “The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas,” in Lecture Notes in Computer Science 1137, D. Vrandečić, K. Bontcheva, M. C. Suárez-Figueroa, V. Presutti, I. Celino, M. Sabou, L.-A. Kaffee, and E. Simperl, Eds., Monterey, California, USA: Springer, Oct. 2018, pp. 187–205. Accessed: May 08, 2023. [Online]. Available: http://oro.open.ac.uk/55484/

    [6] M. Dehnert, A. Gleiss, and F. Reiss, “What makes a data-driven business model? A consolidated taxonomy,” presented at the European Conference on Information Systems, 2021.

    [7] DDI Alliance, “DDI Controlled Vocabulary for Aggregation Method,” 2014. https://ddialliance.org/Specification/DDI-CV/AggregationMethod_1.0.html (accessed May 08, 2023).

    [8] DDI Alliance, “DDI Controlled Vocabulary for Mode Of Collection,” 2015. https://ddialliance.org/Specification/DDI-CV/ModeOfCollection_2.0.html (accessed May 08, 2023).

    [9] INED - French Institute for Demographic Studies, “Thésaurus DemoVoc,” Feb. 26, 2020. https://thesaurus.web.ined.fr/navigateur/en/about (accessed May 08, 2023).

    [10] A. A. Bakar, Z. A. Othman, and N. L. M. Shuib, “Building a new taxonomy for data discretization techniques,” in 2009 2nd Conference on Data Mining and Optimization, Oct. 2009, pp. 132–140. doi: 10.1109/DMO.2009.5341896.

    [11] N. Brouard and C. Giudici, “Unified second edition of the Multilingual Demographic Dictionary

  8. PEARC20 submitted paper: "Scientific Data Annotation and Dissemination:...

    • hydroshare.org
    • beta.hydroshare.org
    zip
    Updated Jul 29, 2020
    + more versions
    Cite
    Sean Cleveland; Gwen Jacobs; Jennifer Geis (2020). PEARC20 submitted paper: "Scientific Data Annotation and Dissemination: Using the ‘Ike Wai Gateway to Manage Research Data" [Dataset]. http://doi.org/10.4211/hs.d66ef2686787403698bac5368a29b056
    Available download formats: zip (873 bytes)
    Dataset updated
    Jul 29, 2020
    Dataset provided by
    HydroShare
    Authors
    Sean Cleveland; Gwen Jacobs; Jennifer Geis
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Time period covered
    Jul 29, 2020
    Description

    Abstract: Granting agencies invest millions of dollars on the generation and analysis of data, making these products extremely valuable. However, without sufficient annotation of the methods used to collect and analyze the data, the ability to reproduce and reuse those products suffers. This lack of assurance of the quality and credibility of the data at the different stages in the research process essentially wastes much of the investment of time and funding and fails to drive research forward to the level of potential possible if everything was effectively annotated and disseminated to the wider research community. In order to address this issue for the Hawai’i Established Program to Stimulate Competitive Research (EPSCoR) project, a water science gateway was developed at the University of Hawai‘i (UH), called the ‘Ike Wai Gateway. In Hawaiian, ‘Ike means knowledge and Wai means water. The gateway supports research in hydrology and water management by providing tools to address questions of water sustainability in Hawai‘i. The gateway provides a framework for data acquisition, analysis, model integration, and display of data products. The gateway is intended to complement and integrate with the capabilities of the Consortium of Universities for the Advancement of Hydrologic Science’s (CUAHSI) Hydroshare by providing sound data and metadata management capabilities for multi-domain field observations, analytical lab actions, and modeling outputs. Functionality provided by the gateway is supported by a subset of the CUAHSI’s Observations Data Model (ODM) delivered as centralized web based user interfaces and APIs supporting multi-domain data management, computation, analysis, and visualization tools to support reproducible science, modeling, data discovery, and decision support for the Hawai’i EPSCoR ‘Ike Wai research team and wider Hawai‘i hydrology community. 
    By leveraging the Tapis platform, UH has constructed a gateway that ties data and advanced computing resources together to support diverse research domains including microbiology, geochemistry, geophysics, economics, and humanities, coupled with computational and modeling workflows delivered in a user-friendly web interface with workflows for effectively annotating the project data and products. Disseminating results for the ‘Ike Wai project through the ‘Ike Wai data gateway and Hydroshare makes the research products accessible and reusable.

  9. Metaphor Web Annotation Data from Subproject C03 of CRC 1475 version 1.0

    • rdms.rd.ruhr-uni-bochum.de
    Updated Oct 30, 2025
    + more versions
    Cite
    (2025). Metaphor Web Annotation Data from Subproject C03 of CRC 1475 version 1.0 [Dataset]. https://rdms.rd.ruhr-uni-bochum.de/concern/datasets/6969z376g?locale=en
    Dataset updated
    Oct 30, 2025
    Description

    This dataset consists of web annotation data that originates from the CRC 1475 "Metaphors of Religion" (https://w3id.org/MoRe-SFB1475/) in a collaborative effort by researchers from Ruhr University Bochum (RUB) and Karlsruhe Institute of Technology (KIT). It includes metaphor analyses that have been created by scholars from the subproject C03 "Metaphors of Everyday Life" during the CRC's first funding period from 2022 to late 2025.

  10. Metaphor Web Annotation Data from Subproject C04 of CRC 1475 version 1.0

    • rdms.rd.ruhr-uni-bochum.de
    Updated Oct 30, 2025
    + more versions
    Cite
    (2025). Metaphor Web Annotation Data from Subproject C04 of CRC 1475 version 1.0 [Dataset]. https://rdms.rd.ruhr-uni-bochum.de/concern/datasets/bc386n217?locale=en
    Dataset updated
    Oct 30, 2025
    Description

    This dataset consists of web annotation data that originates from the CRC 1475 "Metaphors of Religion" (https://w3id.org/MoRe-SFB1475/) in a collaborative effort by researchers from Ruhr University Bochum (RUB) and Karlsruhe Institute of Technology (KIT). It includes metaphor analyses that have been created by scholars from the subproject C04 "Metaphor and Social Positioning in Religious Online Forums" during the CRC's first funding period from 2022 to late 2025.

  11. Data from: Model Predictions, Observations, and Annotation Data for Deep...

    • catalog.data.gov
    • data.usgs.gov
    Updated Sep 13, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Model Predictions, Observations, and Annotation Data for Deep Learning Models Developed to Estimate Relative Flow at 11 Massachusetts Streamflow Sites, 2017-2024 [Dataset]. https://catalog.data.gov/dataset/model-predictions-observations-and-annotation-data-for-deep-learning-models-developed-2017
    Dataset updated
    Sep 13, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    This dataset consists of tabular data of observed streamflow, URL links to timelapse images, and deep learning model predictions for 11 sites in western Massachusetts. The dataset also includes a record of annotation data used to train the deep learning models. This data release is supporting information for an associated journal article describing the data collection, development of the deep learning models, and the interpretation of estimated relative streamflow produced by the models.

  12. Premium Annotation Tools Report

    • marketresearchforecast.com
    doc, pdf, ppt
    Updated Mar 15, 2025
    Cite
    Market Research Forecast (2025). Premium Annotation Tools Report [Dataset]. https://www.marketresearchforecast.com/reports/premium-annotation-tools-34887
    Available download formats: ppt, doc, pdf
    Dataset updated
    Mar 15, 2025
    Dataset authored and provided by
    Market Research Forecast
    License

    https://www.marketresearchforecast.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Discover the booming premium annotation tools market! Explore a comprehensive analysis revealing a $1115.9 million market size in 2025, projected to grow at a 7.8% CAGR. Learn about key drivers, trends, and regional insights impacting this crucial sector for AI and machine learning development.

  13. Metaphor Web Annotation Data from Subproject A02 of CRC 1475 version 1.0

    • rdms.rd.ruhr-uni-bochum.de
    Updated Oct 30, 2025
    Cite
    (2025). Metaphor Web Annotation Data from Subproject A02 of CRC 1475 version 1.0 [Dataset]. https://rdms.rd.ruhr-uni-bochum.de/concern/datasets/mg74qq188?locale=en
    Dataset updated
    Oct 30, 2025
    Description

    This dataset consists of web annotation data that originates from the CRC 1475 "Metaphors of Religion" (https://w3id.org/MoRe-SFB1475/) in a collaborative effort by researchers from Ruhr University Bochum (RUB) and Karlsruhe Institute of Technology (KIT). It includes metaphor analyses that have been created by scholars from the subproject A02 "The Kinesis of Immortality. Spatial-kinetic Metaphors and Daoist Salvation" during the CRC's first funding period from 2022 to late 2025.

  14. Annotating Software Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated May 7, 2025
    Cite
    Data Insights Market (2025). Annotating Software Report [Dataset]. https://www.datainsightsmarket.com/reports/annotating-software-1447731
    Available download formats: ppt, doc, pdf
    Dataset updated
    May 7, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The annotating software market is booming, projected to reach over $1 billion by 2033. Discover key trends, regional insights, and leading companies driving this growth in our comprehensive market analysis. Explore web-based vs. on-premise solutions and their applications in education, business, and machine learning.

  15. People - Segmentation

    • kaggle.com
    zip
    Updated Apr 18, 2023
    Cite
    Quantigo AI Inc (2023). People - Segmentation [Dataset]. https://www.kaggle.com/datasets/quantigoai/people-segmentation/data
    Available download formats: zip (34784209 bytes)
    Dataset updated
    Apr 18, 2023
    Authors
    Quantigo AI Inc
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    The "People - Segmentation" dataset is a high-quality polygon annotation dataset containing 1000 publicly available images of people in various settings and environments. The dataset comprises a total of 1035 labels across one class, capturing people in different poses, expressions, and backgrounds. It is released under the CC BY-SA 4.0 license, providing researchers, data scientists, and enthusiasts with the ability to gain valuable insights into human activities and enabling object-level understanding. This makes it an indispensable tool for a range of applications, including but not limited to object detection, facial recognition, and human-computer interaction systems. With annotations, researchers can analyze and gain insights into the development of accurate person detection algorithms.

    Dataset Name: People - Segmentation
    Data Asset Type: Image
    Data Asset Volume: 1000 images
    Data Asset Content: People in various settings and environments
    Data Asset Source: Publicly available on the web
    Annotation Type: Polygon
    Annotation Format: COCO
    Platform Used: Supervisely
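
Since the annotations are polygons in COCO format, a short sketch of how such a file might be summarized is given here. It assumes the standard COCO layout (`images`, `categories`, `annotations`, with each `segmentation` stored as a list of flat `[x1, y1, x2, y2, ...]` polygons) and does not handle RLE masks. The in-memory sample record and the function names are hypothetical; they are not taken from this dataset.

```python
from collections import Counter


def polygon_area(flat_xy):
    """Shoelace area of one polygon given as [x1, y1, x2, y2, ...]."""
    xs, ys = flat_xy[0::2], flat_xy[1::2]
    n = len(xs)
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    return abs(twice_area) / 2.0


def summarize_coco(coco):
    """Return (labels per category name, total polygon area per annotation)."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(names[ann["category_id"]] for ann in coco["annotations"])
    # Assumption: every segmentation is a list of polygons, not an RLE dict.
    areas = [
        sum(polygon_area(poly) for poly in ann["segmentation"])
        for ann in coco["annotations"]
    ]
    return counts, areas


# Hypothetical one-image example: a 100 x 50 pixel rectangle labeled "person".
sample_coco = {
    "images": [{"id": 1, "file_name": "example.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "person"}],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "segmentation": [[10, 10, 110, 10, 110, 60, 10, 60]],
        }
    ],
}

label_counts, polygon_areas = summarize_coco(sample_coco)
print(label_counts, polygon_areas)
```

For the real file, the same function could be applied to the dictionary returned by `json.load` on the annotation JSON, whose file name is not given in this listing.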

    This dataset is created by Quantigo AI, as a part of our commitment towards advancing the fields of AI and machine learning. If you have any queries about our datasets, please contact us at datasets@quantigo.ai.

    Visit our website at https://quantigo.ai/ to learn more about our services and commitment to advancing the fields of AI and machine learning.

  16. 142-Birds-Species-Object-Detection-V1

    • kaggle.com
    zip
    Updated Oct 17, 2024
    Cite
    Sai Sanjay Kottakota (2024). 142-Birds-Species-Object-Detection-V1 [Dataset]. https://www.kaggle.com/datasets/saisanjaykottakota/142-birds-species-object-detection-v1
    Available download formats: zip (1081589024 bytes)
    Dataset updated
    Oct 17, 2024
    Authors
    Sai Sanjay Kottakota
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Data Annotation for Computer Vision using Web Scraping and CVAT

    Introduction

    This project demonstrates the process of creating a labeled dataset for computer vision tasks using web scraping and the CVAT annotation tool. Web scraping was employed to gather images from the web, and CVAT was utilized to annotate these images with bounding boxes around objects of interest. This dataset can then be used to train object detection models.

    Dataset Creation

    1. Web Scraping: Images of 142 bird species were collected using web scraping techniques. Libraries such as requests and Beautiful Soup were likely used for this task.
    2. CVAT Annotation: The collected images were uploaded to CVAT, where bounding boxes were manually drawn around each bird instance in the images. This created a labeled dataset ready for training computer vision models.

    Usage

    This dataset can be used to train object detection models for bird species identification. It can also be used to evaluate the performance of existing object detection models on a specific dataset.

    Code

    The code used for this project is available in the attached notebook. It demonstrates how to perform the following tasks:

    • Download the dataset.
    • Install necessary libraries.
    • Upload the dataset to Kaggle.
    • Create a dataset in Kaggle and upload the data.

    Conclusion

    This project provides a comprehensive guide to data annotation for computer vision tasks. By combining web scraping and CVAT, we were able to create a high-quality labeled dataset for training object detection models.

    Sources: github.com/cvat-ai/cvat, opencv.org/blog/data-annotation/

    Sample manifest.jsonl metadata

    {"version":"1.1"}
    {"type":"images"}
    {"name":"Spot-billed_Pelican_-_Pelecanus_philippensis_-_Media_Search_-_Macaulay_Library_and_eBirdMacaulay_Library_logoMacaulay_Library_lo/10001","extension":".jpg","width":480,"height":360,"meta":{"related_images":[]}}
    {"name":"Spot-billed_Pelican_-_Pelecanus_philippensis_-_Media_Search_-_Macaulay_Library_and_eBirdMacaulay_Library_logoMacaulay_Library_lo/10002","extension":".jpg","width":480,"height":320,"meta":{"related_images":[]}}
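
    The sample above is in JSON Lines form: one JSON object per line, with header records followed by one record per image. A minimal sketch of reading such a file is shown below. It assumes that every record with a "name" key describes an image and that all other records are header fields; the function name `read_manifest` and the shortened `example_dir/...` names are hypothetical and are not taken from the dataset's notebook.

```python
import json


def read_manifest(lines):
    """Split manifest lines into header fields and per-image records."""
    header, images = {}, []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        record = json.loads(raw)
        if "name" in record:
            # Assumption: records with a "name" key describe one image.
            images.append(record)
        else:
            # Assumption: everything else is a header field.
            header.update(record)
    return header, images


# Shortened, hypothetical image names; the field layout follows the sample above.
sample = [
    '{"version":"1.1"}',
    '{"type":"images"}',
    '{"name":"example_dir/10001","extension":".jpg","width":480,"height":360,"meta":{"related_images":[]}}',
    '{"name":"example_dir/10002","extension":".jpg","width":480,"height":320,"meta":{"related_images":[]}}',
]

header, images = read_manifest(sample)
for img in images:
    print(img["name"] + img["extension"], img["width"], img["height"])
```

    To read a real file, pass an open file handle instead of the `sample` list, since iterating over a text file also yields one line at a time.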
    
  17. Z

    Curlie Enhanced with LLM Annotations: Two Datasets for Advancing...

    • data-staging.niaid.nih.gov
    Updated Dec 21, 2023
    Cite
    Nutter, Peter; Senghaas, Mika; Cizinsky, Ludek (2023). Curlie Enhanced with LLM Annotations: Two Datasets for Advancing Homepage2Vec's Multilingual Website Classification [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_10413067
    Dataset updated
    Dec 21, 2023
    Dataset provided by
    École Polytechnique Fédérale de Lausanne
    Czech Technical University in Prague
    Authors
    Nutter, Peter; Senghaas, Mika; Cizinsky, Ludek
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Advancing Homepage2Vec with LLM-Generated Datasets for Multilingual Website Classification

    This dataset contains two subsets of labeled website data, specifically created to enhance the performance of Homepage2Vec, a multi-label model for website classification. The datasets were generated using Large Language Models (LLMs) to provide more accurate and diverse topic annotations for websites, addressing a limitation of existing Homepage2Vec training data.

    Key Features:

    LLM-generated annotations: Both datasets feature website topic labels generated using LLMs, a novel approach to creating high-quality training data for website classification models.

    Improved multi-label classification: Fine-tuning Homepage2Vec with these datasets has been shown to improve its macro F1 score from 38% to 43%, as evaluated on a human-labeled dataset, demonstrating their effectiveness in capturing a broader range of website topics.

    Multilingual applicability: The datasets facilitate classification of websites in multiple languages, reflecting the inherent multilingual nature of Homepage2Vec.
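
    For readers unfamiliar with the macro F1 score cited above, the sketch below shows how it is commonly computed in a multi-label setting: F1 is calculated separately for each label and the per-label scores are averaged with equal weight. This is a generic illustration, not the project's evaluation code. The function name `macro_f1`, the toy label sets, and the choice to score a label with no true or predicted instances as 0 are all assumptions.

```python
def macro_f1(y_true, y_pred, n_labels):
    """Average the per-label F1 over all labels (multi-label setting).

    y_true and y_pred are lists of sets of label indices, one set per sample.
    """
    scores = []
    for k in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if k in t and k in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if k not in t and k in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if k in t and k not in p)
        denom = 2 * tp + fp + fn
        # Assumption: a label that never occurs and is never predicted scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels


# Toy example with three samples and three labels (hypothetical data).
y_true = [{0, 1}, {1}, {2}]
y_pred = [{0}, {1, 2}, {2}]
score = macro_f1(y_true, y_pred, n_labels=3)
print(round(score, 4))
```

    Because every label contributes equally, rare topics weigh as much as common ones, which is why macro F1 is a common choice when label frequencies are unbalanced.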

    Dataset Composition:

    curlie-gpt3.5-10k: 10,000 websites labeled using GPT-3.5, context 2 and 1-shot

    curlie-gpt4-10k: 10,000 websites labeled using GPT-4, context 2 and zero-shot

    Intended Use:

    Fine-tuning and advancing Homepage2Vec or similar website classification models

    Research on LLM-generated datasets for text classification tasks

    Exploration of multilingual website classification

    Additional Information:

    Project and report repository: https://github.com/CS-433/ml-project-2-mlp

    Acknowledgments:

    This dataset was created as part of a project at EPFL's Data Science Lab (DLab) in collaboration with Prof. Robert West and Tiziano Piccardi.

  18. A

    AI Training Data Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Apr 26, 2025
    Cite
    Data Insights Market (2025). AI Training Data Report [Dataset]. https://www.datainsightsmarket.com/reports/ai-training-data-1501657
    Explore at:
    Available download formats: ppt, doc, pdf
    Dataset updated
    Apr 26, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The booming AI training data market is projected for explosive growth, reaching significant value by 2033. Learn about key market drivers, trends, restraints, and leading companies shaping this rapidly expanding sector. Explore regional breakdowns and application segments in this comprehensive market analysis.

  19. Audio Tags for TAU Urban Scenes

    • kaggle.com
    zip
    Updated Feb 12, 2023
    Cite
    The Devastator (2023). Audio Tags for TAU Urban Scenes [Dataset]. https://www.kaggle.com/datasets/thedevastator/audio-tags-for-tau-urban-scenes
    Explore at:
    Available download formats: zip (73595 bytes)
    Dataset updated
    Feb 12, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Audio Tags for TAU Urban Scenes

    Web-Based Annotations for Airport, Public Square, and Park Scenes

    By [source]

    About this dataset

    This Multi-Annotator Tagged Soundscapes (MATS) dataset provides thoughtful audio tags describing a unique collection of airport, public square, and park scenes from TAU Urban Acoustic Scenes 2019. Annotations are provided in both raw and processed formats, with 133 annotators providing their opinions on each audio file. From providing an understanding of current sound levels to assisting in the development of noise-reduction algorithms, this dataset has something for everyone who wants to explore soundscapes from around the world. So whether you're looking for insight into urban sounds or are just interested in what you'll hear when visiting different locations around the world, this is your perfect resource!

    How to use the dataset

    This dataset provides metadata for the TAU Urban Scenes 2019 development dataset. It can be used to explore and analyze soundscapes from urban environments, in order to better understand the acoustic environment of an urban setting.

    To use this dataset, start by exploring the audio tags associated with each audio file. This gives an overview of the types of sounds present in each scene. Then use the provided annotation files (MATS_labels_mace100_competence06 and MATS_labels_majority_vote) to study how annotations differ between annotators and how different methods handle multi-annotator data. You can also examine individual audio files by downloading them from Zenodo and opening them with audio tools such as Audacity or SoX.

    You can then use this information to develop analyses of various aspects of city soundscapes, such as sound sources, noise levels, and temporal trends across different sites within these cities. The dataset is suited to researchers who wish to identify features that distinguish one scene from another or to identify changes between time periods for specific locations.

    Research Ideas

    • Using a majority vote to determine the 'consensus' tags of an audio file, or to measure the agreement between multiple annotators on specific labels.
    • Training a machine learning model on the MACE100 processed annotations and using it to accurately detect audio tags for new sets of audio files.
    • Combining different annotation methods (MACE100/Competence06) for more robust analysis and comparison of results from multiple annotators
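
    The first idea above, consensus tags by majority vote, can be sketched as follows. This is an illustrative sketch only: it assumes the raw annotations for one audio file are available as one list of tags per annotator, which may not match the layout of the raw files in this dataset. The function name `majority_vote`, the 0.5 threshold, and the tag names are hypothetical.

```python
from collections import Counter


def majority_vote(annotations, threshold=0.5):
    """Return the tags chosen by more than `threshold` of the annotators."""
    # Count each tag at most once per annotator.
    counts = Counter(tag for tags in annotations for tag in set(tags))
    n_annotators = len(annotations)
    return sorted(tag for tag, c in counts.items() if c / n_annotators > threshold)


# Hypothetical annotations from five annotators for a single audio file.
votes = [
    ["birds", "traffic"],
    ["birds"],
    ["birds", "speech"],
    ["traffic", "speech"],
    ["birds"],
]

consensus = majority_vote(votes)
print(consensus)
```

    The per-tag share of annotators (the value compared against `threshold`) can also serve as a simple agreement measure for that tag.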

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: MATS_labels_mace100_competence06.csv

    | Column name | Description                                             |
    |:------------|:--------------------------------------------------------|
    | filename    | The name of the audio file. (String)                    |
    | tags        | The audio tags associated with the audio file. (String) |

    File: MATS_labels_majority_vote.csv

    | Column name | Description                                             |
    |:------------|:--------------------------------------------------------|
    | filename    | The name of the audio file. (String)                    |
    | tags        | The audio tags associated with the audio file. (String) |
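
    Since both files share the same two columns, they can be compared file by file. The sketch below loads each CSV into a mapping from filename to a set of tags and computes the Jaccard overlap between the two label sets. The column names come from the tables above; everything else is an assumption, in particular that tags are stored as a comma-separated string. The inline CSV text, the file names `a.wav` and `b.wav`, and the helper names are hypothetical.

```python
import csv
import io


def load_tags(csv_text, sep=","):
    """Map each filename to its set of tags (assumes `sep`-separated tags)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["filename"]: {t.strip() for t in row["tags"].split(sep) if t.strip()}
        for row in reader
    }


def jaccard(a, b):
    """Jaccard overlap of two tag sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical stand-ins for the two label files.
mace_csv = 'filename,tags\na.wav,"birds,traffic"\nb.wav,"speech"\n'
vote_csv = 'filename,tags\na.wav,"birds"\nb.wav,"speech"\n'

mace = load_tags(mace_csv)
vote = load_tags(vote_csv)
overlap = {name: jaccard(mace[name], vote[name]) for name in mace.keys() & vote.keys()}
for name in sorted(overlap):
    print(name, overlap[name])
```

    To run this against the real files, read each file's contents into a string first and check how the tags column is actually delimited before relying on the `sep` default.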

  20. Z

    SemTab 2024: Semantic Web Challenge on Tabular Data to Knowledge Graph...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Nov 22, 2024
    Cite
    Hassanzadeh, Oktie; Efthymiou, Vasilis; Chen, Jiaoyan (2024). SemTab 2024: Semantic Web Challenge on Tabular Data to Knowledge Graph Matching Data Sets - WikidataTables2024R1 and WikidataTables2024R2 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3518530
    Dataset updated
    Nov 22, 2024
    Dataset provided by
    University of Oxford
    IBM Research
    Authors
    Hassanzadeh, Oktie; Efthymiou, Vasilis; Chen, Jiaoyan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data Sets from the ISWC 2024 Semantic Web Challenge on Tabular Data to Knowledge Graph Matching, Round 1, Wikidata Tables. Links to other datasets can be found on the challenge website: https://sem-tab-challenge.github.io/2024/ as well as the proceedings of the challenge published on CEUR.

    For details about the challenge, see: http://www.cs.ox.ac.uk/isg/challenges/sem-tab/

    For 2024 edition, see: https://sem-tab-challenge.github.io/2024/

    Note on License: This data includes data from the following sources. Refer to each source for license details: Wikidata (https://www.wikidata.org/)

    THIS DATA IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
