19 datasets found
  1. AI Training Data | Annotated Checkout Flows for Retail, Restaurant, and...

    • datarade.ai
    Updated Dec 18, 2024
    Cite
    MealMe (2024). AI Training Data | Annotated Checkout Flows for Retail, Restaurant, and Marketplace Websites [Dataset]. https://datarade.ai/data-products/ai-training-data-annotated-checkout-flows-for-retail-resta-mealme
    Dataset authored and provided by
    MealMe
    Area covered
    United States of America
    Description

    AI Training Data | Annotated Checkout Flows for Retail, Restaurant, and Marketplace Websites Overview

    Unlock the next generation of agentic commerce and automated shopping experiences with this comprehensive dataset of meticulously annotated checkout flows, sourced directly from leading retail, restaurant, and marketplace websites. Designed for developers, researchers, and AI labs building large language models (LLMs) and agentic systems capable of online purchasing, this dataset captures the real-world complexity of digital transactions—from cart initiation to final payment.

    Key Features

    Breadth of Coverage: Over 10,000 unique checkout journeys across hundreds of top e-commerce, food delivery, and service platforms, including but not limited to Walmart, Target, Kroger, Whole Foods, Uber Eats, Instacart, Shopify-powered sites, and more.

    Actionable Annotation: Every flow is broken down into granular, step-by-step actions, complete with timestamped events, UI context, form field details, validation logic, and response feedback. Each step includes:

    Page state (URL, DOM snapshot, and metadata)

    User actions (clicks, taps, text input, dropdown selection, checkbox/radio interactions)

    System responses (AJAX calls, error/success messages, cart/price updates)

    Authentication and account linking steps where applicable

    Payment entry (card, wallet, alternative methods)

    Order review and confirmation

    Multi-Vertical, Real-World Data: Flows sourced from a wide variety of verticals and real consumer environments, not just demo stores or test accounts. Includes complex cases such as multi-item carts, promo codes, loyalty integration, and split payments.

    Structured for Machine Learning: Delivered in standard formats (JSONL, CSV, or your preferred schema), with every event mapped to action types, page features, and expected outcomes. Optional HAR files and raw network request logs provide an extra layer of technical fidelity for action modeling and RLHF pipelines.

    Rich Context for LLMs and Agents: Every annotation includes both human-readable and model-consumable descriptions:

    “What the user did” (natural language)

    “What the system did in response”

    “What a successful action should look like”

    Error/edge case coverage (invalid forms, OOS, address/payment errors)

    Privacy-Safe & Compliant: All flows are depersonalized and scrubbed of PII. Sensitive fields (like credit card numbers, user addresses, and login credentials) are replaced with realistic but synthetic data, ensuring compliance with privacy regulations.

    Each flow tracks the user journey from cart to payment to confirmation, including:

    Adding/removing items

    Applying coupons or promo codes

    Selecting shipping/delivery options

    Account creation, login, or guest checkout

    Inputting payment details (card, wallet, Buy Now Pay Later)

    Handling validation errors or OOS scenarios

    Order review and final placement

    Confirmation page capture (including order summary details)
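The step-by-step structure described above can be sketched as a small JSONL reader. The listing does not publish a schema, so every field name here (`flow_id`, `step`, `action_type`, `outcome`) and every sample record is a hypothetical stand-in, not the vendor's actual format.

```python
import json
from collections import Counter, defaultdict

# Hypothetical records: the field names and values are assumptions made
# for illustration; the real dataset schema is not shown in the listing.
SAMPLE_JSONL = """\
{"flow_id": "f1", "step": 1, "action_type": "click", "outcome": "success"}
{"flow_id": "f1", "step": 2, "action_type": "text_input", "outcome": "validation_error"}
{"flow_id": "f1", "step": 3, "action_type": "text_input", "outcome": "success"}
{"flow_id": "f2", "step": 1, "action_type": "click", "outcome": "success"}
"""


def load_flows(jsonl_text):
    """Group step records by flow and order each flow by step number."""
    flows = defaultdict(list)
    for line in jsonl_text.splitlines():
        if line.strip():  # skip blank lines
            record = json.loads(line)
            flows[record["flow_id"]].append(record)
    for steps in flows.values():
        steps.sort(key=lambda r: r["step"])
    return dict(flows)


def error_steps(flows):
    """Return (flow_id, step) pairs whose outcome is not 'success'."""
    return [
        (flow_id, record["step"])
        for flow_id, steps in flows.items()
        for record in steps
        if record["outcome"] != "success"
    ]


flows = load_flows(SAMPLE_JSONL)
# Count how often each (assumed) action type occurs across all flows.
action_counts = Counter(r["action_type"] for steps in flows.values() for r in steps)
```

Grouping by flow first keeps each checkout journey intact, which is what a replay or next-action model would likely need; the error filter is one way to pull out the validation and edge cases the listing mentions.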

    Why This Dataset?

    Building LLMs, agentic shopping bots, or e-commerce automation tools demands more than just page screenshots or API logs. You need deeply contextualized, action-oriented data that reflects how real users interact with the complex, ever-changing UIs of digital commerce. Our dataset uniquely captures:

    The full intent-action-outcome loop

    Dynamic UI changes, modals, validation, and error handling

    Nuances of cart modification, bundle pricing, delivery constraints, and multi-vendor checkouts

    Mobile vs. desktop variations

    Diverse merchant tech stacks (custom, Shopify, Magento, BigCommerce, native apps, etc.)

    Use Cases

    LLM Fine-Tuning: Teach models to reason through step-by-step transaction flows, infer next-best-actions, and generate robust, context-sensitive prompts for real-world ordering.

    Agentic Shopping Bots: Train agents to navigate web/mobile checkouts autonomously, handle edge cases, and complete real purchases on behalf of users.

    Action Model & RLHF Training: Provide reinforcement learning pipelines with ground truth “what happens if I do X?” data across hundreds of real merchants.

    UI/UX Research & Synthetic User Studies: Identify friction points, bottlenecks, and drop-offs in modern checkout design by replaying flows and testing interventions.

    Automated QA & Regression Testing: Use realistic flows as test cases for new features or third-party integrations.

    What’s Included

    10,000+ annotated checkout flows (retail, restaurant, marketplace)

    Step-by-step event logs with metadata, DOM, and network context

    Natural language explanations for each step and transition

    All flows are depersonalized and privacy-compliant

    Example scripts for ingesting, parsing, and analyzing the dataset

    Flexible licensing for research or commercial use

    Sample Categories Covered

    Grocery delivery (Instacart, Walmart, Kroger, Target, etc.)

    Restaurant takeout/delivery (Ub...

  2. APIGen-MT-5k

    • huggingface.co
    Updated May 16, 2025
    Cite
    Salesforce (2025). APIGen-MT-5k [Dataset]. https://huggingface.co/datasets/Salesforce/APIGen-MT-5k
    Dataset provided by
    Salesforce Inc http://salesforce.com/
    Authors
    Salesforce
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Summary

    APIGen-MT is an automated agentic data generation pipeline designed to synthesize verifiable, high-quality, realistic datasets for agentic applications. This dataset was released as part of APIGen-MT: Agentic PIpeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay (code: https://github.com/apigen-mt/apigen-mt.github.io). The repo contains 5000 multi-turn trajectories collected by APIGen-MT. This dataset is a subset of the data used to train the xLAM-2 model… See the full description on the dataset page: https://huggingface.co/datasets/Salesforce/APIGen-MT-5k.

  3. uk_retail_store_synthetic_dataset

    • huggingface.co
    Cite
    Syncora.ai - Agentic Synthetic Data Platform, uk_retail_store_synthetic_dataset [Dataset]. https://huggingface.co/datasets/syncora/uk_retail_store_synthetic_dataset
    Authors
    Syncora.ai - Agentic Synthetic Data Platform
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    United Kingdom
    Description

    Synthetic Data Generation Demo — UK Retail Dataset

    Welcome to this synthetic data generation demo repository by Syncora.ai. This project showcases how to generate synthetic data using real-world tabular structures, demonstrated on a UK retail dataset with columns such as:

    Country
    CustomerID
    UnitPrice
    InvoiceDate
    Quantity
    StockCode

    This dataset is designed for LLM training and AI development, enabling developers to work with privacy-safe, high-quality… See the full description on the dataset page: https://huggingface.co/datasets/syncora/uk_retail_store_synthetic_dataset.
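As a rough illustration of working with the listed columns, the sketch below totals line revenue per country. Only the column names come from the listing; the sample rows, the value types, and the date format are invented for the example and may not match the published file.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows: only the header names are taken from the listing.
SAMPLE_CSV = """Country,CustomerID,UnitPrice,InvoiceDate,Quantity,StockCode
United Kingdom,1001,2.50,2024-01-05,4,A100
United Kingdom,1002,1.25,2024-01-06,8,B200
France,1003,3.00,2024-01-06,5,A100
"""

# Sum UnitPrice * Quantity for each country.
revenue_by_country = defaultdict(float)
for row in csv.DictReader(io.StringIO(SAMPLE_CSV)):
    line_total = float(row["UnitPrice"]) * int(row["Quantity"])
    revenue_by_country[row["Country"]] += line_total

for country, total in sorted(revenue_by_country.items()):
    print(f"{country}: {total:.2f}")
```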

  4. Example of Labeling.

    • plos.figshare.com
    xls
    Updated Aug 26, 2025
    + more versions
    Cite
    Libo Yang; Yuan Li; Junhua Tan; Libo Mao (2025). Example of Labeling. [Dataset]. http://doi.org/10.1371/journal.pone.0330258.t001
    Dataset provided by
    PLOS http://plos.org/
    Authors
    Libo Yang; Yuan Li; Junhua Tan; Libo Mao
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Traditional knowledge graphs of water conservancy project risks have supported risk decision-making. However, they are constrained by limited data modalities and low accuracy in information extraction. A multimodal water conservancy project risk knowledge graph is proposed in this study, along with a synergistic strategy involving multimodal large language models. Risk decision-making generation is facilitated through a multi-agent agentic retrieval-augmented generation framework. To enhance visual recognition, a DenseNet-based image classification model is improved by incorporating single-head self-attention and coordinate attention mechanisms. For textual data, risk entities such as locations, components, and events are extracted using a BERT-BiLSTM-CRF architecture. These extracted entities serve as the foundation for constructing the multimodal knowledge graph. To support generation, a multi-agent agentic retrieval-augmented generation mechanism is introduced. This mechanism enhances the reliability and interpretability of risk decision-making outputs. In experiments, the enhanced DenseNet model outperforms the original baseline in both precision and recall for image recognition tasks. In risk decision-making tasks, the proposed approach, combining a multimodal knowledge graph with a multi-agent agentic retrieval-augmented generation method, achieves strong performance on BERTScore and ROUGE-L metrics. This work presents a novel perspective for leveraging multimodal knowledge graphs in water conservancy project risk management.

  5. Results of module ablation on the validation set.

    • figshare.com
    xls
    Updated Aug 26, 2025
    Cite
    Libo Yang; Yuan Li; Junhua Tan; Libo Mao (2025). Results of module ablation on the validation set. [Dataset]. http://doi.org/10.1371/journal.pone.0330258.t005
    Dataset provided by
    PLOS http://plos.org/
    Authors
    Libo Yang; Yuan Li; Junhua Tan; Libo Mao
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Traditional knowledge graphs of water conservancy project risks have supported risk decision-making. However, they are constrained by limited data modalities and low accuracy in information extraction. A multimodal water conservancy project risk knowledge graph is proposed in this study, along with a synergistic strategy involving multimodal large language models. Risk decision-making generation is facilitated through a multi-agent agentic retrieval-augmented generation framework. To enhance visual recognition, a DenseNet-based image classification model is improved by incorporating single-head self-attention and coordinate attention mechanisms. For textual data, risk entities such as locations, components, and events are extracted using a BERT-BiLSTM-CRF architecture. These extracted entities serve as the foundation for constructing the multimodal knowledge graph. To support generation, a multi-agent agentic retrieval-augmented generation mechanism is introduced. This mechanism enhances the reliability and interpretability of risk decision-making outputs. In experiments, the enhanced DenseNet model outperforms the original baseline in both precision and recall for image recognition tasks. In risk decision-making tasks, the proposed approach, combining a multimodal knowledge graph with a multi-agent agentic retrieval-augmented generation method, achieves strong performance on BERTScore and ROUGE-L metrics. This work presents a novel perspective for leveraging multimodal knowledge graphs in water conservancy project risk management.

  6. Middle East & Africa Generative AI in Testing Market Size, Share, Analysis...

    • intelevoresearch.com
    Updated Aug 20, 2025
    Cite
    https://www.intelevoresearch.com/ (2025). Middle East & Africa Generative AI in Testing Market Size, Share, Analysis Report Component (Software, Services), Deployment (Cloud, On-premises, Hybrid), Application (Automated Test Case Generation, Intelligent Test Data Creation, AI-Powered Test Maintenance, Predictive Quality Analytics, Technology, NL-to-Test, Agentic Orchestration, Vision & Model-Based UI Understanding, Retrieval-Augmented Testing, Test-Data Generators), Organization Size (Large Enterprises, SMEs), End Use (IT & Telecom, BFSI, Healthcare & Life Sciences, Retail & eCommerce, Manufacturing & Industrial, Public Sector & Education) Region and Key Players - Industry Segment Overview, Market Dynamics, Competitive Strategies, Trends and Forecast 2025-2034 [Dataset]. https://www.intelevoresearch.com/reports/middle-east-africa-generative-ai-in-testing-market
    Dataset provided by
    https://www.intelevoresearch.com/
    License

    https://www.intelevoresearch.com/privacy-policy

    Area covered
    Africa, Middle East
    Description

    Middle East & Africa Generative AI in Testing market is set to grow from USD 221.08M in 2024 to USD 884.75M by 2034, at a CAGR of 15.35%. Explore trends, drivers, growth.

  7. Europe Generative AI in Testing Market Size, Share, Analysis Report...

    • intelevoresearch.com
    Updated Aug 20, 2025
    Cite
    https://www.intelevoresearch.com/ (2025). Europe Generative AI in Testing Market Size, Share, Analysis Report Component (Software, Services), Deployment (Cloud, On-premises, Hybrid), Application (Automated Test Case Generation, Intelligent Test Data Creation, AI-Powered Test Maintenance, Predictive Quality Analytics, Technology, NL-to-Test, Agentic Orchestration, Vision & Model-Based UI Understanding, Retrieval-Augmented Testing, Test-Data Generators), Organization Size (Large Enterprises, SMEs), End Use (IT & Telecom, BFSI, Healthcare & Life Sciences, Retail & eCommerce, Manufacturing & Industrial, Public Sector & Education) Region and Key Players - Industry Segment Overview, Market Dynamics, Competitive Strategies, Trends and Forecast 2025-2034 [Dataset]. https://www.intelevoresearch.com/reports/europe-generative-ai-in-testing-market
    Dataset provided by
    https://www.intelevoresearch.com/
    License

    https://www.intelevoresearch.com/privacy-policy

    Area covered
    Europe
    Description

    Europe Generative AI in Testing market is set to rise from USD 0.21B in 2024 to USD 3.75B by 2034, growing at a CAGR of 34.21%. Explore drivers, trends and opportunities.

  8. customer_support_conversations_dataset

    • huggingface.co
    Updated Oct 10, 2025
    Cite
    Syncora.ai - Agentic Synthetic Data Platform (2025). customer_support_conversations_dataset [Dataset]. https://huggingface.co/datasets/syncora/customer_support_conversations_dataset
    Authors
    Syncora.ai - Agentic Synthetic Data Platform
    Description

    💬 Customer Support Conversation Dataset — Powered by Syncora.ai

    A free synthetic dataset for chatbot training, LLM fine-tuning, and synthetic data generation research. Created using Syncora.ai’s privacy-safe synthetic data engine, this dataset is ideal for developing, testing, and benchmarking AI customer support systems. It serves as a dataset for chatbot training and a dataset for LLM training, offering rich, structured conversation data for real-world simulation.

      🌟… See the full description on the dataset page: https://huggingface.co/datasets/syncora/customer_support_conversations_dataset.
    
  9. fitness-tracker-dataset

    • huggingface.co
    Updated Oct 5, 2025
    Cite
    Syncora.ai - Agentic Synthetic Data Platform (2025). fitness-tracker-dataset [Dataset]. https://huggingface.co/datasets/syncora/fitness-tracker-dataset
    Authors
    Syncora.ai - Agentic Synthetic Data Platform
    Description

    🏃 Synthetic Wearable & Activity Dataset — Powered by Syncora.ai

    Free dataset for health analytics, activity recognition, synthetic data generation, and dataset for LLM training.

      🌟 About This Dataset
    

    This dataset contains synthetic wearable fitness records, modeled on signals from devices such as the Apple Watch. All entries are fully synthetic, generated with Syncora.ai’s synthetic data engine, ensuring privacy-safe and bias-aware data.
    The dataset provides rich… See the full description on the dataset page: https://huggingface.co/datasets/syncora/fitness-tracker-dataset.

  10. Global Generative AI in Testing Market Size, Share, Analysis Report...

    • intelevoresearch.com
    Updated Aug 20, 2025
    Cite
    https://www.intelevoresearch.com/ (2025). Global Generative AI in Testing Market Size, Share, Analysis Report Component (Software, Services), Deployment (Cloud, On-premises, Hybrid), Application (Automated Test Case Generation, Intelligent Test Data Creation, AI-Powered Test Maintenance, Predictive Quality Analytics, Technology, NL-to-Test, Agentic Orchestration, Vision & Model-Based UI Understanding, Retrieval-Augmented Testing, Test-Data Generators), Organization Size (Large Enterprises, SMEs), End Use (IT & Telecom, BFSI, Healthcare & Life Sciences, Retail & eCommerce, Manufacturing & Industrial, Public Sector & Education) Region and Key Players - Industry Segment Overview, Market Dynamics, Competitive Strategies, Trends and Forecast 2025-2034 [Dataset]. https://www.intelevoresearch.com/reports/generative-ai-in-testing-market
    Dataset provided by
    https://www.intelevoresearch.com/
    License

    https://www.intelevoresearch.com/privacy-policy

    Description

    Global Generative AI in Testing market is set to grow from USD 0.71B in 2024 to USD 14.15B by 2034, at a CAGR of 34.2% (2025–2034). Explore trends, opportunities and drivers.
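For readers who want to relate the endpoint figures to the growth rate, the standard compound annual growth rate formula is (end / start)^(1 / years) - 1. The sketch below applies it to the rounded figures quoted above. Treating 2024 to 2034 as ten compounding periods is an assumption; the report's exact base year and unrounded values are not given in this listing, so the result need not match the quoted rate exactly.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Rounded endpoints from the description; the ten-year span is an assumption.
global_rate = cagr(0.71, 14.15, 10)
print(f"Implied CAGR from rounded endpoints: {global_rate:.1%}")
```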

  11. mental_health_survey_dataset

    • huggingface.co
    Cite
    Syncora.ai - Agentic Synthetic Data Platform, mental_health_survey_dataset [Dataset]. https://huggingface.co/datasets/syncora/mental_health_survey_dataset
    Authors
    Syncora.ai - Agentic Synthetic Data Platform
    Description

    🧠 Mental Health Posting Dataset — Synthetic Dataset for LLM & Chatbot Training

    Free dataset for mental health research, LLM training, and chatbot development, generated using synthetic data generation techniques to ensure privacy and high fidelity.

      🌟 About This Dataset
    

    This dataset contains synthetic mental health survey responses across multiple demographics and occupations. It includes participant-reported stress levels, coping mechanisms, mood swings, and social… See the full description on the dataset page: https://huggingface.co/datasets/syncora/mental_health_survey_dataset.

  12. North America Generative AI in Testing Market Size, Share, Analysis Report...

    • intelevoresearch.com
    Updated Aug 20, 2025
    Cite
    https://www.intelevoresearch.com/ (2025). North America Generative AI in Testing Market Size, Share, Analysis Report Component (Software, Services), Deployment (Cloud, On-premises, Hybrid), Application (Automated Test Case Generation, Intelligent Test Data Creation, AI-Powered Test Maintenance, Predictive Quality Analytics, Technology, NL-to-Test, Agentic Orchestration, Vision & Model-Based UI Understanding, Retrieval-Augmented Testing, Test-Data Generators), Organization Size (Large Enterprises, SMEs), End Use (IT & Telecom, BFSI, Healthcare & Life Sciences, Retail & eCommerce, Manufacturing & Industrial, Public Sector & Education) Region and Key Players - Industry Segment Overview, Market Dynamics, Competitive Strategies, Trends and Forecast 2025-2034 [Dataset]. https://www.intelevoresearch.com/reports/north-america-generative-ai-in-testing-market
    Dataset provided by
    https://www.intelevoresearch.com/
    License

    https://www.intelevoresearch.com/privacy-policy

    Description

    North America Generative AI in Testing market is set to grow from USD 0.31B in 2024 to USD 5.8B by 2034, at a CAGR of 33.91%. Explore trends, drivers, and opportunities.

  13. AI Procurement Intelligence Market Analysis, Size, and Forecast 2025-2029 :...

    • technavio.com
    pdf
    Updated Oct 9, 2025
    Cite
    Technavio (2025). AI Procurement Intelligence Market Analysis, Size, and Forecast 2025-2029 : North America (US, Canada, and Mexico), Europe (Germany, UK, France, The Netherlands, Italy, and Spain), APAC (China, Japan, India, Australia, South Korea, and Indonesia), South America (Brazil, Argentina, and Colombia), Middle East and Africa (UAE, South Africa, and Turkey), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/ai-procurement-intelligence-market-industry-analysis
    Dataset provided by
    TechNavio
    Authors
    Technavio
    License

    https://www.technavio.com/content/privacy-notice

    Time period covered
    2025 - 2029
    Area covered
    United States, Canada
    Description

    AI Procurement Intelligence Market Size 2025-2029

    The AI procurement intelligence market size is forecast to increase by USD 14.5 billion, at a CAGR of 42.9% between 2024 and 2029.

    Enterprises are increasingly adopting AI procurement intelligence to enhance operational efficiency and achieve significant cost savings in response to persistent economic pressures. This drive for strategic cost management is met by the proliferation of generative AI and hyper-automation, which are being integrated into advanced procurement software. These technologies are enabling a shift toward predictive sourcing functions, allowing teams to forecast market conditions and automate complex decision-making processes. By leveraging natural language prompts and cognitive capabilities, these tools make sophisticated data analysis more accessible, empowering procurement professionals to focus on higher-value activities like negotiation and strategic supplier relationship management. The focus is on creating autonomous and strategic sourcing capabilities through industrial AI software.

    However, realizing the full potential of these advanced systems is often constrained by foundational issues related to data integrity and accessibility. Many organizations grapple with a fragmented data landscape, where procurement information is trapped in disparate silos with inconsistent taxonomies, making the creation of a unified data view a significant hurdle. Without meticulous data cleansing and normalization, the insights generated by AI algorithms can be skewed or misleading, which erodes user trust and undermines the business case for the technology. This highlights the importance of robust AI governance tools to manage data quality, security, and integration effectively within the framework of agentic AI for data engineering.

    What will be the Size of the AI Procurement Intelligence Market during the forecast period?

    Explore in-depth regional segment analysis with market size data - historical 2019 - 2023 and forecasts 2025-2029 - in the full report.
    The market is defined by a strategic shift toward proactive risk mitigation and enhanced supply chain resilience. Organizations are leveraging predictive analytics and real-time monitoring to anticipate disruptions from geopolitical or climate-related events. This move from a reactive to a proactive stance is enabled by AI-powered platforms that provide deep visibility into multi-tier supplier networks. The integration of predictive AI in supply chain systems is becoming standard practice for ensuring business continuity and managing complex global trade dynamics. This focus on foresight and preparedness underscores a fundamental change in procurement strategy.

    Operational efficiency is being transformed through procurement workflow automation and the adoption of hyper-automation. These technologies are streamlining routine tasks like invoice processing and purchase order generation, freeing up procurement professionals for more strategic activities. The use of generative AI is also changing user interaction via natural language prompts, making complex data analysis more accessible. This focus on intelligent automation and AI in project management helps organizations reduce sourcing cycle times and improve overall productivity.

    Supplier relationship management is evolving with the use of sophisticated AI tools for performance evaluation and strategic decision-making. AI-powered platforms assist in supplier discovery and vetting, ensuring that new partners meet rigorous standards for quality and compliance. These systems analyze supplier performance metrics to inform consolidation strategies and negotiation tactics. The ongoing development of AI for sales, from a procurement perspective, allows for more dynamic and data-driven interactions, fostering a collaborative and resilient supplier ecosystem.

    How is this AI Procurement Intelligence Industry segmented?

    The AI procurement intelligence industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in "USD million" for the period 2025-2029, as well as historical data from 2019 - 2023 for the following segments.

    Component: Software, Services

    Deployment: Cloud-based, On-premises

    End-user: Large enterprises, SMEs, Government and public sector

    Geography: North America (US, Canada, Mexico); Europe (Germany, UK, France, The Netherlands, Italy, Spain); APAC (China, Japan, India, Australia, South Korea, Indonesia); South America (Brazil, Argentina, Colombia); Middle East and Africa (UAE, South Africa, Turkey); Rest of World (ROW)

    By Component Insights

    The software segment is estimated to witness significant growth during the forecast period. The software segment forms the core of the market, comprising digital platforms and applications that enable data-driven procurement. These solutions, predominantly delivered via a Software-as-a-Service model, provide func

  14. developer-productivity-simulated-behavioral-data

    • huggingface.co
    Updated Aug 25, 2025
    Cite
    Syncora.ai - Agentic Synthetic Data Platform (2025). developer-productivity-simulated-behavioral-data [Dataset]. https://huggingface.co/datasets/syncora/developer-productivity-simulated-behavioral-data
    Authors
    Syncora.ai - Agentic Synthetic Data Platform
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Synthetic AI Developer Productivity Dataset — Behavioral + Cognitive Simulation

    A synthetic data generation resource for modeling behavioral and cognitive dynamics in developers.

      📘 About This Dataset
    

    This dataset simulates productivity data from AI-assisted software developers. It blends behavioral signals, physiological inputs, and productivity metrics to explore the nuanced relationships between deep work, distractions, caffeine, AI usage, and cognitive strain.… See the full description on the dataset page: https://huggingface.co/datasets/syncora/developer-productivity-simulated-behavioral-data.

  15. Risk Information Query and Decision Generation Workflow.

    • plos.figshare.com
    xls
    Updated Aug 26, 2025
    Cite
    Libo Yang; Yuan Li; Junhua Tan; Libo Mao (2025). Risk Information Query and Decision Generation Workflow. [Dataset]. http://doi.org/10.1371/journal.pone.0330258.t003
    Dataset provided by
    PLOS http://plos.org/
    Authors
    Libo Yang; Yuan Li; Junhua Tan; Libo Mao
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Risk Information Query and Decision Generation Workflow.

  16. DataScience-Instruct-500K

    • huggingface.co
    Updated Oct 21, 2025
    + more versions
    Cite
    RUC-DataLab (2025). DataScience-Instruct-500K [Dataset]. https://huggingface.co/datasets/RUC-DataLab/DataScience-Instruct-500K
    Dataset authored and provided by
    RUC-DataLab
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    DeepAnalyze: Agentic Large Language Models for Autonomous Data Science

    Authors: Shaolei Zhang, Ju Fan*, Meihao Fan, Guoliang Li, Xiaoyong Du

    DeepAnalyze is the first agentic LLM for autonomous data science. It can autonomously complete a wide range of data-centric tasks without human intervention, supporting: 🛠 Entire data science pipeline: Automatically perform any data science tasks such as data preparation, analysis, modeling, visualization, and report generation. 🔍… See the full description on the dataset page: https://huggingface.co/datasets/RUC-DataLab/DataScience-Instruct-500K.

  17. Data Sheet 1_On the potential of agentic workflows for animal training plan...

    • frontiersin.figshare.com
    pdf
    Updated May 20, 2025
    Cite
    Jörg Schultz (2025). Data Sheet 1_On the potential of agentic workflows for animal training plan generation.pdf [Dataset]. http://doi.org/10.3389/fvets.2025.1563233.s001
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 20, 2025
    Dataset provided by
    Frontiers
    Authors
    Jörg Schultz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Effective animal training depends on well-structured training plans that ensure consistent progress and measurable outcomes. However, the creation of such plans is often time-intensive and repetitive, and it detracts from hands-on training. Recent advancements in generative AI powered by large language models (LLMs) provide potential solutions but frequently fail to produce actionable, individualized plans tailored to specific contexts. This limitation is particularly significant given the diverse tasks performed by dogs, ranging from working roles in military and police operations to competitive sports, and the varying training philosophies among practitioners. To address these challenges, a modular agentic workflow framework is proposed, leveraging LLMs while mitigating their shortcomings. By decomposing the training plan generation process into specialized building blocks (autonomous agents that handle subtasks such as structuring progressions, ensuring welfare compliance, and adhering to team-specific standard operating procedures, or SOPs), this approach facilitates the creation of specific, actionable plans. The modular design further allows workflows to be tailored to the unique requirements of individual tasks and philosophies. As a proof of concept, a complete training plan generation workflow is presented, integrating these agents into a cohesive system. This framework prioritizes flexibility and adaptability, empowering trainers to create customized solutions while leveraging generative AI's capabilities. In summary, agentic workflows bridge the gap between cutting-edge technology and the practical, diverse needs of the animal training community. As such, they could form a crucial foundation for advancing computer-assisted animal training methodologies.

  18. h

    WorFBench_train

    • huggingface.co
    Updated Jul 21, 2025
    ZJUNLP (2025). WorFBench_train [Dataset]. https://huggingface.co/datasets/zjunlp/WorFBench_train
    Explore at:
    Dataset updated
    Jul 21, 2025
    Dataset authored and provided by
    ZJUNLP
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This repository contains the data presented in Benchmarking Agentic Workflow Generation. Code: https://github.com/zjunlp/WorfBench

  19. h

    graph-data-quantum-rl

    • huggingface.co
    Updated Oct 5, 2025
    Cong Yu (2025). graph-data-quantum-rl [Dataset]. https://huggingface.co/datasets/Benyucong/graph-data-quantum-rl
    Explore at:
    Dataset updated
    Oct 5, 2025
    Authors
    Cong Yu
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Citation

    If you use this dataset, please cite:

    @misc{yu2025quasarquantumassemblycode,
      title={QUASAR: Quantum Assembly Code Generation Using Tool-Augmented LLMs via Agentic RL},
      author={Cong Yu and Valter Uotila and Shilong Deng and Qingyuan Wu and Tuo Shi and Songlin Jiang and Lei You and Bo Zhao},
      year={2025},
      eprint={2510.00967},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2510.00967},
    }


MealMe (2024). AI Training Data | Annotated Checkout Flows for Retail, Restaurant, and Marketplace Websites [Dataset]. https://datarade.ai/data-products/ai-training-data-annotated-checkout-flows-for-retail-resta-mealme

AI Training Data | Annotated Checkout Flows for Retail, Restaurant, and Marketplace Websites

Explore at:
Dataset updated
Dec 18, 2024
Dataset authored and provided by
MealMe
Area covered
United States of America
Description

AI Training Data | Annotated Checkout Flows for Retail, Restaurant, and Marketplace Websites

Overview

Unlock the next generation of agentic commerce and automated shopping experiences with this comprehensive dataset of meticulously annotated checkout flows, sourced directly from leading retail, restaurant, and marketplace websites. Designed for developers, researchers, and AI labs building large language models (LLMs) and agentic systems capable of online purchasing, this dataset captures the real-world complexity of digital transactions—from cart initiation to final payment.

Key Features

Breadth of Coverage: Over 10,000 unique checkout journeys across hundreds of top e-commerce, food delivery, and service platforms, including but not limited to Walmart, Target, Kroger, Whole Foods, Uber Eats, Instacart, Shopify-powered sites, and more.

Actionable Annotation: Every flow is broken down into granular, step-by-step actions, complete with timestamped events, UI context, form field details, validation logic, and response feedback. Each step includes:

Page state (URL, DOM snapshot, and metadata)

User actions (clicks, taps, text input, dropdown selection, checkbox/radio interactions)

System responses (AJAX calls, error/success messages, cart/price updates)

Authentication and account linking steps where applicable

Payment entry (card, wallet, alternative methods)

Order review and confirmation

Multi-Vertical, Real-World Data: Flows sourced from a wide variety of verticals and real consumer environments, not just demo stores or test accounts. Includes complex cases such as multi-item carts, promo codes, loyalty integration, and split payments.

Structured for Machine Learning: Delivered in standard formats (JSONL, CSV, or your preferred schema), with every event mapped to action types, page features, and expected outcomes. Optional HAR files and raw network request logs provide an extra layer of technical fidelity for action modeling and RLHF pipelines.

Rich Context for LLMs and Agents: Every annotation includes both human-readable and model-consumable descriptions:

“What the user did” (natural language)

“What the system did in response”

“What a successful action should look like”

Error/edge case coverage (invalid forms, out-of-stock (OOS) items, address/payment errors)

Privacy-Safe & Compliant: All flows are depersonalized and scrubbed of PII. Sensitive fields (like credit card numbers, user addresses, and login credentials) are replaced with realistic but synthetic data, ensuring compliance with privacy regulations.

Each flow tracks the user journey from cart to payment to confirmation, including:

Adding/removing items

Applying coupons or promo codes

Selecting shipping/delivery options

Account creation, login, or guest checkout

Inputting payment details (card, wallet, Buy Now Pay Later)

Handling validation errors or OOS scenarios

Order review and final placement

Confirmation page capture (including order summary details)
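
As a rough illustration of how the JSONL delivery described under Key Features might be consumed, the sketch below parses a couple of flows and summarizes their steps. The listing does not publish a schema, so every field name here (flow_id, merchant, steps, action_type, user_action, system_response, success) and the sample records themselves are assumptions made up for this example.

```python
import json
from collections import Counter

# Hypothetical sample records. The field names (flow_id, merchant, steps,
# action_type, user_action, system_response, success) are assumptions for
# illustration only, not the vendor's documented schema.
SAMPLE_JSONL = "\n".join([
    json.dumps({
        "flow_id": "flow-001",
        "merchant": "example-grocer",
        "steps": [
            {"action_type": "add_item",
             "user_action": "Added milk to cart",
             "system_response": "Cart total updated",
             "success": True},
            {"action_type": "apply_promo",
             "user_action": "Entered promo code",
             "system_response": "Invalid code message shown",
             "success": False},
            {"action_type": "place_order",
             "user_action": "Clicked Place Order",
             "system_response": "Confirmation page shown",
             "success": True},
        ],
    }),
    json.dumps({
        "flow_id": "flow-002",
        "merchant": "example-restaurant",
        "steps": [
            {"action_type": "add_item",
             "user_action": "Added burrito to cart",
             "system_response": "Cart total updated",
             "success": True},
            {"action_type": "enter_payment",
             "user_action": "Entered synthetic card details",
             "system_response": "Payment form accepted",
             "success": True},
        ],
    }),
])


def load_flows(jsonl_text):
    """Parse JSONL text (one flow per line) into a list of dicts."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def action_counts(flows):
    """Count how often each action type occurs across all flows."""
    return Counter(step["action_type"] for flow in flows for step in flow["steps"])


def failed_steps(flows):
    """Return (flow_id, action_type) pairs for steps marked unsuccessful."""
    return [
        (flow["flow_id"], step["action_type"])
        for flow in flows
        for step in flow["steps"]
        if not step["success"]
    ]


flows = load_flows(SAMPLE_JSONL)
print(action_counts(flows))
print(failed_steps(flows))
```

With a real delivery, the same functions would read lines from a file instead of the inline sample, and the keys would need to be adjusted to whatever schema the provider actually ships.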

Why This Dataset?

Building LLMs, agentic shopping bots, or e-commerce automation tools demands more than just page screenshots or API logs. You need deeply contextualized, action-oriented data that reflects how real users interact with the complex, ever-changing UIs of digital commerce. Our dataset uniquely captures:

The full intent-action-outcome loop

Dynamic UI changes, modals, validation, and error handling

Nuances of cart modification, bundle pricing, delivery constraints, and multi-vendor checkouts

Mobile vs. desktop variations

Diverse merchant tech stacks (custom, Shopify, Magento, BigCommerce, native apps, etc.)

Use Cases

LLM Fine-Tuning: Teach models to reason through step-by-step transaction flows, infer next-best-actions, and generate robust, context-sensitive prompts for real-world ordering.

Agentic Shopping Bots: Train agents to navigate web/mobile checkouts autonomously, handle edge cases, and complete real purchases on behalf of users.

Action Model & RLHF Training: Provide reinforcement learning pipelines with ground truth “what happens if I do X?” data across hundreds of real merchants.

UI/UX Research & Synthetic User Studies: Identify friction points, bottlenecks, and drop-offs in modern checkout design by replaying flows and testing interventions.

Automated QA & Regression Testing: Use realistic flows as test cases for new features or third-party integrations.

What’s Included

10,000+ annotated checkout flows (retail, restaurant, marketplace)

Step-by-step event logs with metadata, DOM, and network context

Natural language explanations for each step and transition

All flows are depersonalized and privacy-compliant

Example scripts for ingesting, parsing, and analyzing the dataset

Flexible licensing for research or commercial use

Sample Categories Covered

Grocery delivery (Instacart, Walmart, Kroger, Target, etc.)

Restaurant takeout/delivery (Ub...
