100+ datasets found

f
Video-to-Model Data Set
figshare.com
commons.datacite.org
xml
Updated Mar 24, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz (2020). Video-to-Model Data Set [Dataset]. http://doi.org/10.6084/m9.figshare.12026850.v1
Explore at:
xmlAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.12026850.v1
Dataset updated
Mar 24, 2020
Dataset provided by
figshare
Authors
Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This data set belongs to the paper "Video-to-Model: Unsupervised Trace Extraction from Videos for Process Discovery and Conformance Checking in Manual Assembly", submitted on March 24, 2020, to the 18th International Conference on Business Process Management (BPM).Abstract: Manual activities are often hidden deep down in discrete manufacturing processes. For the elicitation and optimization of process behavior, complete information about the execution of Manual activities are required. Thus, an approach is presented on how execution level information can be extracted from videos in manual assembly. The goal is the generation of a log that can be used in state-of-the-art process mining tools. The test bed for the system was lightweight and scalable consisting of an assembly workstation equipped with a single RGB camera recording only the hand movements of the worker from top. A neural network based real-time object classifier was trained to detect the worker’s hands. The hand detector delivers the input for an algorithm, which generates trajectories reflecting the movement paths of the hands. Those trajectories are automatically assigned to work steps using the position of material boxes on the assembly shelf as reference points and hierarchical clustering of similar behaviors with dynamic time warping. The system has been evaluated in a task-based study with ten participants in a laboratory, but under realistic conditions. The generated logs have been loaded into the process mining toolkit ProM to discover the underlying process model and to detect deviations from both, instructions and ground truth, using conformance checking. The results show that process mining delivers insights about the assembly process and the system’s precision.The data set contains the generated and the annotated logs based on the video material gathered during the user study. In addition, the petri nets from the process discovery and conformance checking conducted with ProM (http://www.promtools.org) and the reference nets modeled with Yasper (http://www.yasper.org/) are provided.
B
Brain-like Computer Report
datainsightsmarket.com
doc, pdf, ppt
Updated May 2, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Data Insights Market (2025). Brain-like Computer Report [Dataset]. https://www.datainsightsmarket.com/reports/brain-like-computer-1642905
Explore at:
pdf, ppt, docAvailable download formats
Dataset updated
May 2, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policyhttps://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The brain-like computer market is experiencing significant growth, driven by advancements in neuromorphic computing and the increasing demand for high-performance, energy-efficient computing solutions in diverse sectors. The market, estimated at $2 billion in 2025, is projected to exhibit a robust Compound Annual Growth Rate (CAGR) of 25% from 2025 to 2033, reaching an estimated market value of $10 billion by 2033. Key application areas include data mining and research, with the segment utilizing neurons exceeding 100 million units expected to dominate due to their enhanced processing capabilities. Leading players like Intel, IBM, and prominent universities are heavily investing in R&D, fueling innovation and market expansion. The North American region currently holds a significant market share, attributed to substantial technological advancements and early adoption of brain-like computing technologies, followed by Europe and Asia Pacific. However, high development costs and the complexity involved in designing and implementing these systems present significant challenges to widespread adoption. Future growth is likely to be fueled by continuous technological advancements, decreasing production costs, and the growing demand for artificial intelligence and machine learning applications requiring superior processing speed and energy efficiency. The market segmentation reveals a strong correlation between the number of neurons and application type. Larger-scale neuronal systems (above 100 million units) are predominantly employed in demanding applications like data mining, where the capability to process vast datasets is crucial. Conversely, smaller-scale systems might be suitable for specific research tasks with less complex computational requirements. Regional disparities are expected to persist, with North America maintaining its leadership position while Asia Pacific demonstrates significant growth potential due to increasing investments in research and development and the expanding adoption of AI/ML technologies across various sectors. Continued advancements in materials science, allowing for greater miniaturization and energy efficiency, will further drive market growth in the coming years.
Coal Miners Detection
kaggle.com
Updated Sep 18, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Training Data (2023). Coal Miners Detection [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/miners-detection/data
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 18, 2023
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Training Data
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0)https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
Miners Object Detection dataset

The dataset consists of of photos captured within various mines, focusing on miners engaged in their work. Each photo is annotated with bounding box detection of the miners, an attribute highlights whether each miner is sitting or standing in the photo.

The dataset's diverse applications such as computer vision, safety assessment and others make it a valuable resource for researchers, employers, and policymakers in the mining industry.

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2Fdb3f193275f5206914a19b127e20138e%2FFrame%2013.png?generation=1695040375509674&alt=media" alt="">

Get the Dataset

This is just an example of the data

Leave a request on https://trainingdata.pro/datasets to discuss your requirements, learn about the price and buy the dataset

Dataset structure

images - contains of original images of miners

boxes - includes bounding box labeling for the original images

annotations.xml - contains coordinates of the bounding boxes and labels, created for the original photo

Data Format

Each image from images folder is accompanied by an XML-annotation in the annotations.xml file indicating the coordinates of the bounding boxes for miners detection. For each point, the x and y coordinates are provided. The position of the miner is also provided by the attribute is_sitting (true, false).

Example of XML file structure

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2Febb59bc7d91a28f4e10c3f3da4ce4488%2Fcarbon%20(1).png?generation=1695040600108833&alt=media" alt="">

Miners detection might be made in accordance with your requirements.

TrainingData provides high-quality data annotation tailored to your needs

keywords: coal mines, underground, safety monitoring system, safety dataset, manufacturing dataset, industrial safety database, health and safety dataset, quality control dataset, quality assurance dataset, annotations dataset, computer vision dataset, image dataset, object detection, human images, classification
A pre-trained sound event detection neural network
search.datacite.org
figshare.com
Updated Jul 26, 2017
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ian McLoughlin (2017). A pre-trained sound event detection neural network [Dataset]. http://doi.org/10.6084/m9.figshare.5245789
Explore at:
Unique identifier
https://doi.org/10.6084/m9.figshare.5245789
Dataset updated
Jul 26, 2017
Dataset provided by
Figsharehttp://figshare.com/
DataCitehttps://www.datacite.org/
Authors
Ian McLoughlin
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is a network trained earlier on clean data. It does not give very good results but is enough to show the system working. It can achieve above 90% on clean sounds and probably about 80% accuracy in 0dB SNR.
The network was saved manually (using the MATLAB 'save' command) after running the training code, and before running the testing code.
u
Human-Computer Interaction Logs
indigo.uic.edu
zip
Updated May 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Julian Theis; Houshang Darabi (2023). Human-Computer Interaction Logs [Dataset]. http://doi.org/10.25417/uic.11923386.v1
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.25417/uic.11923386.v1
Dataset updated
May 30, 2023
Dataset provided by
University of Illinois Chicago
Authors
Julian Theis; Houshang Darabi
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset comprises ten human-computer interaction logs of real participants who solved a given task in a Windows environment. The participants were allowed to use the standard notepad, calculator, and file explorer. All recordings are anonymized and do not contain any private information.Simple:Each of the five log files in the folder simple contains Human-Computer Interaction recordings of a participant solving a simple task. Participants were provided 30 raw text files where each one contained data about the revenue and expenses of a single product for a given time period. In total 15 summaries were asked to be created by summarizing the data of two files and calculating the combined revenue, expenses, and profit. Complex:Each of the five log files in the folder complex contains Human-Computer Interaction recordings of a participant solving a more advanced task. In particular, participants were given a folder of text documents and were asked to create summary documents that contain the total revenue and expenses of the quarter, profit, and, where applicable, profit improvement compared to the previous quarter and the same quarter of the previous year. Each quarter’s data comprised multiple text files.The logging application that has been used is the one described inJulian Theis and Houshang Darabi. 2019. Behavioral Petri Net Mining and Automated Analysis for Human-Computer Interaction Recommendations in Multi-Application Environments. Proc. ACM Hum.-Comput. Interact. 3, EICS, Article 13 (June 2019), 16 pages. DOI: https://doi.org/10.1145/3331155Please refer to Table 1 and Table 2 of this publication regarding the structure of the log files. The first column corresponds to the timestamp in milliseconds, the second column represents the event key, and the third column contains additional event-specific information.
m
Data Buffalo Toraja
data.mendeley.com
Updated May 16, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Abdul Rachman Manga (2025). Data Buffalo Toraja [Dataset]. http://doi.org/10.17632/kbft73pdkw.2
Explore at:
Unique identifier
https://doi.org/10.17632/kbft73pdkw.2
Dataset updated
May 16, 2025
Authors
Abdul Rachman Manga
License
Attribution-NonCommercial 3.0 (CC BY-NC 3.0)https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Description
This data was taken directly in the Toraja area using a digital camera, a minimum shooting distance of 3 m in video form, the results of the shooting are divided into frames
u
The TAWOS dataset
rdr.ucl.ac.uk
txt
Updated May 31, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Vali Tawosi; Afnan Abdulaziz A Alsubaihin; Rebecca Moussa; Federica Sarro (2023). The TAWOS dataset [Dataset]. http://doi.org/10.5522/04/19085834.v1
Explore at:
txtAvailable download formats
Unique identifier
https://doi.org/10.5522/04/19085834.v1
Dataset updated
May 31, 2023
Dataset provided by
University College London
Authors
Vali Tawosi; Afnan Abdulaziz A Alsubaihin; Rebecca Moussa; Federica Sarro
License
https://www.apache.org/licenses/LICENSE-2.0.htmlhttps://www.apache.org/licenses/LICENSE-2.0.html
Description
TAWOS (Peacock in Farsi and Arabic) is a dataset of agile open-source software project issues mined from Jira repositories including many descriptive features (raw and derived). The dataset aims to be all-inclusive, making it well-suited to several research avenues, and cross-analyses therein. This dataset is described and presented in the paper "A Versatile Dataset of Agile Open Source Software Projects" authored by Vali Tawosi, Afnan Al-Subaihin, Rebecca Moussa and Federica Sarro. The paper is accepted at the 2022 Mining Software Repositories (MSR) conference. Citation information will be available soon. For further information please refer to "https://github.com/SOLAR-group/TAWOS".
Global ML-ready dataset for mining areas in satellite images
zenodo.org
zip
Updated Nov 21, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Simon Jasansky; Simon Jasansky; Victor Maus; Mirela Popa; Anna Wilbik; Anna Wilbik; Victor Maus; Mirela Popa (2024). Global ML-ready dataset for mining areas in satellite images [Dataset]. http://doi.org/10.5281/zenodo.14195737
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.14195737
Dataset updated
Nov 21, 2024
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Simon Jasansky; Simon Jasansky; Victor Maus; Mirela Popa; Anna Wilbik; Anna Wilbik; Victor Maus; Mirela Popa
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
This dataset is a global resource for machine learning applications in mining area detection and semantic segmentation on satellite imagery. It contains Sentinel-2 satellite images and corresponding mining area masks + bounding boxes for 1,210 sites worldwide. Ground-truth masks are derived from Maus et al. (2022) and Tang et al. (2023), and validated through manual verification to ensure accurate alignment with Sentinel-2 imagery from specific timestamps.

The dataset includes three mask variants:

Masks exclusively from Maus et al. (n=1,090)

Masks exclusively from Tang et al. (n=817)

A preferred mask selected from either Maus or Tang based on alignment quality determined during manual review (n=1,210).

Each tile corresponds to a 2048x2048 pixel Sentinel-2 image, with metadata on mine type (surface, placer, underground, brine & evaporation) and scale (artisanal, industrial). For convenience, the preferred mask dataset is already split into training (75%), validation (15%), and test (10%) sets.

Furthermore, dataset quality was validated by re-validating test set tiles manually and correcting any mismatches between mining polygons and visually observed true mining area in the images, resulting in the following estimated quality metrics:

Combined Maus Tang
Accuracy 99.78 99.74 99.83
Precision 99.22 99.20 99.24
Recall 95.71 96.34 95.10

Note that the dataset does not contain the Sentinel-2 images themselves but contains a reference to specific Sentinel-2 images. Thus, for any ML applications, the images must be persisted first. For example, Sentinel-2 imagery is available from Microsoft's Planetary Computer and filterable via STAC API: https://planetarycomputer.microsoft.com/dataset/sentinel-2-l2a. Additionally, the temporal specificity of the data allows integration with other imagery sources from the indicated timestamp, such as Landsat or other high-resolution imagery.

Source code used to generate this dataset and to use it for ML model training is available at https://github.com/SimonJasansky/mine-segmentation. It includes useful Python scripts, e.g. to download Sentinel-2 images via STAC API, or to divide tile images (2048x2048px) into smaller chips (e.g. 512x512px).

A database schema, a schematic depiction of the dataset generation process, and a map of the global distribution of tiles are provided in the accompanying images.
f
Datasets for the paper: SDHRHA
figshare.com
txt
Updated Jul 8, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alejandro Sanchez Guinea (2018). Datasets for the paper: SDHRHA [Dataset]. http://doi.org/10.6084/m9.figshare.5572558.v1
Explore at:
txtAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.5572558.v1
Dataset updated
Jul 8, 2018
Dataset provided by
figshare
Authors
Alejandro Sanchez Guinea
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data of laboratory and in-the-wild studies for pattern mining approaches to discover periodic-frequent routines of people at home.
d
Replication Data for: Age and Gender Identification in Unbalanced Social...
search.dataone.org
Updated Nov 22, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Gomez, Juan Carlos (2023). Replication Data for: Age and Gender Identification in Unbalanced Social Media (CONIELECOMP 2019) [Dataset]. http://doi.org/10.7910/DVN/9B2JZD
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/9B2JZD
Dataset updated
Nov 22, 2023
Dataset provided by
Harvard Dataverse
Authors
Gomez, Juan Carlos
Description
This dataset was used to conduct the experiments in the paper "Age and Gender Identification in Unbalanced Social Media", presented at the 29th International Conference on Electronics, Communications and Computers (CONIELECOMP 2019).
Market Basket Analysis
kaggle.com
Updated Dec 9, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Aslan Ahmedov (2021). Market Basket Analysis [Dataset]. https://www.kaggle.com/datasets/aslanahmedov/market-basket-analysis
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 9, 2021
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Aslan Ahmedov
Description
Market Basket Analysis

Market basket analysis with Apriori algorithm

The retailer wants to target customers with suggestions on itemset that a customer is most likely to purchase .I was given dataset contains data of a retailer; the transaction data provides data around all the transactions that have happened over a period of time. Retailer will use result to grove in his industry and provide for customer suggestions on itemset, we be able increase customer engagement and improve customer experience and identify customer behavior. I will solve this problem with use Association Rules type of unsupervised learning technique that checks for the dependency of one data item on another data item.

Introduction

Association Rule is most used when you are planning to build association in different objects in a set. It works when you are planning to find frequent patterns in a transaction database. It can tell you what items do customers frequently buy together and it allows retailer to identify relationships between the items.

An Example of Association Rules

Assume there are 100 customers, 10 of them bought Computer Mouth, 9 bought Mat for Mouse and 8 bought both of them. - bought Computer Mouth => bought Mat for Mouse - support = P(Mouth & Mat) = 8/100 = 0.08 - confidence = support/P(Mat for Mouse) = 0.08/0.09 = 0.89 - lift = confidence/P(Computer Mouth) = 0.89/0.10 = 8.9 This just simple example. In practice, a rule needs the support of several hundred transactions, before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.

Strategy

Data Import

Data Understanding and Exploration

Transformation of the data – so that is ready to be consumed by the association rules algorithm

Running association rules

Exploring the rules generated

Filtering the generated rules

Visualization of Rule

Dataset Description

File name: Assignment-1_Data

List name: retaildata

File format: . xlsx

Number of Row: 522065

Number of Attributes: 7

BillNo: 6-digit number assigned to each transaction. Nominal.

Itemname: Product name. Nominal.

Quantity: The quantities of each product per transaction. Numeric.

Date: The day and time when each transaction was generated. Numeric.

Price: Product price. Numeric.

CustomerID: 5-digit number assigned to each customer. Nominal.

Country: Name of the country where each customer resides. Nominal.

https://user-images.githubusercontent.com/91852182/145270162-fc53e5a3-4ad1-4d06-b0e0-228aabcf6b70.png">

Libraries in R

First, we need to load required libraries. Shortly I describe all libraries.

arules - Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules).

arulesViz - Extends package 'arules' with various visualization. techniques for association rules and item-sets. The package also includes several interactive visualizations for rule exploration.

tidyverse - The tidyverse is an opinionated collection of R packages designed for data science.

readxl - Read Excel Files in R.

plyr - Tools for Splitting, Applying and Combining Data.

ggplot2 - A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

knitr - Dynamic Report generation in R.

magrittr- Provides a mechanism for chaining commands with a new forward-pipe operator, %>%. This operator will forward a value, or the result of an expression, into the next function call/expression. There is flexible support for the type of right-hand side expressions.

dplyr - A fast, consistent tool for working with data frame like objects, both in memory and out of memory.

tidyverse - This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step.

https://user-images.githubusercontent.com/91852182/145270210-49c8e1aa-9753-431b-a8d5-99601bc76cb5.png">

Data Pre-processing

Next, we need to upload Assignment-1_Data. xlsx to R to read the dataset.Now we can see our data in R.

https://user-images.githubusercontent.com/91852182/145270229-514f0983-3bbb-4cd3-be64-980e92656a02.png"> https://user-images.githubusercontent.com/91852182/145270251-6f6f6472-8817-435c-a995-9bc4bfef10d1.png">

After we will clear our data frame, will remove missing values.

https://user-images.githubusercontent.com/91852182/145270286-05854e1a-2b6c-490e-ab30-9e99e731eacb.png">

To apply Association Rule mining, we need to convert dataframe into transaction data to make all items that are bought together in one invoice will be in ...
r
International Journal of Engineering and Advanced Technology FAQ -...
researchhelpdesk.org
Updated May 28, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Research Help Desk (2022). International Journal of Engineering and Advanced Technology FAQ - ResearchHelpDesk [Dataset]. https://www.researchhelpdesk.org/journal/faq/552/international-journal-of-engineering-and-advanced-technology
Explore at:
Dataset updated
May 28, 2022
Dataset authored and provided by
Research Help Desk
Description
International Journal of Engineering and Advanced Technology FAQ - ResearchHelpDesk - International Journal of Engineering and Advanced Technology (IJEAT) is having Online-ISSN 2249-8958, bi-monthly international journal, being published in the months of February, April, June, August, October, and December by Blue Eyes Intelligence Engineering & Sciences Publication (BEIESP) Bhopal (M.P.), India since the year 2011. It is academic, online, open access, double-blind, peer-reviewed international journal. It aims to publish original, theoretical and practical advances in Computer Science & Engineering, Information Technology, Electrical and Electronics Engineering, Electronics and Telecommunication, Mechanical Engineering, Civil Engineering, Textile Engineering and all interdisciplinary streams of Engineering Sciences. All submitted papers will be reviewed by the board of committee of IJEAT. Aim of IJEAT Journal disseminate original, scientific, theoretical or applied research in the field of Engineering and allied fields. dispense a platform for publishing results and research with a strong empirical component. aqueduct the significant gap between research and practice by promoting the publication of original, novel, industry-relevant research. seek original and unpublished research papers based on theoretical or experimental works for the publication globally. publish original, theoretical and practical advances in Computer Science & Engineering, Information Technology, Electrical and Electronics Engineering, Electronics and Telecommunication, Mechanical Engineering, Civil Engineering, Textile Engineering and all interdisciplinary streams of Engineering Sciences. impart a platform for publishing results and research with a strong empirical component. create a bridge for a significant gap between research and practice by promoting the publication of original, novel, industry-relevant research. solicit original and unpublished research papers, based on theoretical or experimental works. Scope of IJEAT International Journal of Engineering and Advanced Technology (IJEAT) covers all topics of all engineering branches. Some of them are Computer Science & Engineering, Information Technology, Electronics & Communication, Electrical and Electronics, Electronics and Telecommunication, Civil Engineering, Mechanical Engineering, Textile Engineering and all interdisciplinary streams of Engineering Sciences. The main topic includes but not limited to: 1. Smart Computing and Information Processing Signal and Speech Processing Image Processing and Pattern Recognition WSN Artificial Intelligence and machine learning Data mining and warehousing Data Analytics Deep learning Bioinformatics High Performance computing Advanced Computer networking Cloud Computing IoT Parallel Computing on GPU Human Computer Interactions 2. Recent Trends in Microelectronics and VLSI Design Process & Device Technologies Low-power design Nanometer-scale integrated circuits Application specific ICs (ASICs) FPGAs Nanotechnology Nano electronics and Quantum Computing 3. Challenges of Industry and their Solutions, Communications Advanced Manufacturing Technologies Artificial Intelligence Autonomous Robots Augmented Reality Big Data Analytics and Business Intelligence Cyber Physical Systems (CPS) Digital Clone or Simulation Industrial Internet of Things (IIoT) Manufacturing IOT Plant Cyber security Smart Solutions – Wearable Sensors and Smart Glasses System Integration Small Batch Manufacturing Visual Analytics Virtual Reality 3D Printing 4. Internet of Things (IoT) Internet of Things (IoT) & IoE & Edge Computing Distributed Mobile Applications Utilizing IoT Security, Privacy and Trust in IoT & IoE Standards for IoT Applications Ubiquitous Computing Block Chain-enabled IoT Device and Data Security and Privacy Application of WSN in IoT Cloud Resources Utilization in IoT Wireless Access Technologies for IoT Mobile Applications and Services for IoT Machine/ Deep Learning with IoT & IoE Smart Sensors and Internet of Things for Smart City Logic, Functional programming and Microcontrollers for IoT Sensor Networks, Actuators for Internet of Things Data Visualization using IoT IoT Application and Communication Protocol Big Data Analytics for Social Networking using IoT IoT Applications for Smart Cities Emulation and Simulation Methodologies for IoT IoT Applied for Digital Contents 5. Microwaves and Photonics Microwave filter Micro Strip antenna Microwave Link design Microwave oscillator Frequency selective surface Microwave Antenna Microwave Photonics Radio over fiber Optical communication Optical oscillator Optical Link design Optical phase lock loop Optical devices 6. Computation Intelligence and Analytics Soft Computing Advance Ubiquitous Computing Parallel Computing Distributed Computing Machine Learning Information Retrieval Expert Systems Data Mining Text Mining Data Warehousing Predictive Analysis Data Management Big Data Analytics Big Data Security 7. Energy Harvesting and Wireless Power Transmission Energy harvesting and transfer for wireless sensor networks Economics of energy harvesting communications Waveform optimization for wireless power transfer RF Energy Harvesting Wireless Power Transmission Microstrip Antenna design and application Wearable Textile Antenna Luminescence Rectenna 8. Advance Concept of Networking and Database Computer Network Mobile Adhoc Network Image Security Application Artificial Intelligence and machine learning in the Field of Network and Database Data Analytic High performance computing Pattern Recognition 9. Machine Learning (ML) and Knowledge Mining (KM) Regression and prediction Problem solving and planning Clustering Classification Neural information processing Vision and speech perception Heterogeneous and streaming data Natural language processing Probabilistic Models and Methods Reasoning and inference Marketing and social sciences Data mining Knowledge Discovery Web mining Information retrieval Design and diagnosis Game playing Streaming data Music Modelling and Analysis Robotics and control Multi-agent systems Bioinformatics Social sciences Industrial, financial and scientific applications of all kind 10. Advanced Computer networking Computational Intelligence Data Management, Exploration, and Mining Robotics Artificial Intelligence and Machine Learning Computer Architecture and VLSI Computer Graphics, Simulation, and Modelling Digital System and Logic Design Natural Language Processing and Machine Translation Parallel and Distributed Algorithms Pattern Recognition and Analysis Systems and Software Engineering Nature Inspired Computing Signal and Image Processing Reconfigurable Computing Cloud, Cluster, Grid and P2P Computing Biomedical Computing Advanced Bioinformatics Green Computing Mobile Computing Nano Ubiquitous Computing Context Awareness and Personalization, Autonomic and Trusted Computing Cryptography and Applied Mathematics Security, Trust and Privacy Digital Rights Management Networked-Driven Multicourse Chips Internet Computing Agricultural Informatics and Communication Community Information Systems Computational Economics, Digital Photogrammetric Remote Sensing, GIS and GPS Disaster Management e-governance, e-Commerce, e-business, e-Learning Forest Genomics and Informatics Healthcare Informatics Information Ecology and Knowledge Management Irrigation Informatics Neuro-Informatics Open Source: Challenges and opportunities Web-Based Learning: Innovation and Challenges Soft computing Signal and Speech Processing Natural Language Processing 11. Communications Microstrip Antenna Microwave Radar and Satellite Smart Antenna MIMO Antenna Wireless Communication RFID Network and Applications 5G Communication 6G Communication 12. Algorithms and Complexity Sequential, Parallel And Distributed Algorithms And Data Structures Approximation And Randomized Algorithms Graph Algorithms And Graph Drawing On-Line And Streaming Algorithms Analysis Of Algorithms And Computational Complexity Algorithm Engineering Web Algorithms Exact And Parameterized Computation Algorithmic Game Theory Computational Biology Foundations Of Communication Networks Computational Geometry Discrete Optimization 13. Software Engineering and Knowledge Engineering Software Engineering Methodologies Agent-based software engineering Artificial intelligence approaches to software engineering Component-based software engineering Embedded and ubiquitous software engineering Aspect-based software engineering Empirical software engineering Search-Based Software engineering Automated software design and synthesis Computer-supported cooperative work Automated software specification Reverse engineering Software Engineering Techniques and Production Perspectives Requirements engineering Software analysis, design and modelling Software maintenance and evolution Software engineering tools and environments Software engineering decision support Software design patterns Software product lines Process and workflow management Reflection and metadata approaches Program understanding and system maintenance Software domain modelling and analysis Software economics Multimedia and hypermedia software engineering Software engineering case study and experience reports Enterprise software, middleware, and tools Artificial intelligent methods, models, techniques Artificial life and societies Swarm intelligence Smart Spaces Autonomic computing and agent-based systems Autonomic computing Adaptive Systems Agent architectures, ontologies, languages and protocols Multi-agent systems Agent-based learning and knowledge discovery Interface agents Agent-based auctions and marketplaces Secure mobile and multi-agent systems Mobile agents SOA and Service-Oriented Systems Service-centric software engineering Service oriented requirements engineering Service oriented architectures Middleware for service based systems Service discovery and composition Service level agreements (drafting,
m
Bias-Free Dataset of Food Delivery App Reviews with Data Poisoning Attacks
data.mendeley.com
Updated Apr 2, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Hyunggu Jung (2024). Bias-Free Dataset of Food Delivery App Reviews with Data Poisoning Attacks [Dataset]. http://doi.org/10.17632/rnyrpzyw3h.2
Explore at:
Unique identifier
https://doi.org/10.17632/rnyrpzyw3h.2
Dataset updated
Apr 2, 2024
Authors
Hyunggu Jung
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset consists of reviews collected from restaurants on a Korean delivery app platform running a review event. A total of 128,668 reviews were collected from 136 restaurants by crawling reviews using the Selenium library in Python. The dataset named as Korean Reviews.csv provides review data not translated to English, and the dataset named as English Reviews.csv provides review data translated to English. The 136 chosen restaurants run review events which demand customers to write reviews with 5 stars and photos. So the annotation of data was done by considering 1) whether the review gives five-star ratings, and 2) whether the review contains photo(s).
o
NLP Expert QA Dataset
opendatabay.com
.undefined
Updated Jul 7, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Datasimple (2025). NLP Expert QA Dataset [Dataset]. https://www.opendatabay.com/data/ai-ml/c030902d-7b02-48a2-b32f-8f7140dd1de7
Explore at:
.undefinedAvailable download formats
Dataset updated
Jul 7, 2025
Dataset authored and provided by
Datasimple
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
Data Science and Analytics
Description
This dataset, QASPER: NLP Questions and Evidence, is an exceptional collection of over 5,000 questions and answers focused on Natural Language Processing (NLP) papers. It has been crowdsourced from experienced NLP practitioners, with each question meticulously crafted based solely on the titles and abstracts of the respective papers. The answers provided are expertly enriched with evidence taken directly from the full text of each paper. QASPER features structured fields including 'qas' for questions and answers, 'evidence' for supporting information, paper titles, abstracts, figures and tables, and full text. This makes it a valuable resource for researchers aiming to understand how practitioners interpret NLP topics and to validate solutions for problems found in existing literature. The dataset contains 5,049 questions spanning 1,585 distinct papers.

Columns

title: The title of the paper. (String)

abstract: A summary of the paper. (String)

full_text: The full text of the paper. (String)

qas: Questions and answers about the paper. (Object)

figures_and_tables: Figures and tables from the paper. (Object)

id: Unique identifier for the paper.

Distribution

The QASPER dataset comprises 5,049 questions across 1,585 papers. It is distributed across five files in .csv format, with one additional .json file for figures and tables. These include two test datasets (test.csv and validation.csv), two train datasets (train-v2-0_lessons_only_.csv and trainv2-0_unsplit.csv), and a figures dataset (figures_and_tables_.json). Each CSV file contains distinct datasets with columns dedicated to titles, abstracts, full texts, and Q&A fields, along with evidence for each paper mentioned in the respective rows.

Usage

This dataset is ideal for various applications, including: * Developing AI models to automatically generate questions and answers from paper titles and abstracts. * Enhancing machine learning algorithms by combining answers with evidence to discover relationships between papers. * Creating online forums for NLP practitioners, using dataset questions to spark discussion within the community. * Conducting basic descriptive statistics or advanced predictive analytics, such as logistic regression or naive Bayes models. * Summarising basic crosstabs between any two variables, like titles and abstracts. * Correlating title lengths with the number of words in their corresponding abstracts to identify patterns. * Utilising text mining technologies like topic modelling, machine learning techniques, or automated processes to summarise underlying patterns. * Filtering terms relevant to specific research hypotheses and processing them via web crawlers, search engines, or document similarity algorithms.

Coverage

The dataset has a GLOBAL region scope. It focuses on papers within the field of Natural Language Processing. The questions and answers are crowdsourced from experienced NLP practitioners. The dataset was listed on 22/06/2025.

License

CC0

Who Can Use It

This dataset is highly suitable for: * Researchers seeking insights into how NLP practitioners interpret complex topics. * Those requiring effective validation for developing clear-cut solutions to problems encountered in existing NLP literature. * NLP practitioners looking for a resource to stimulate discussions within their community. * Data scientists and analysts interested in exploring NLP datasets through descriptive statistics or advanced predictive analytics. * Developers and researchers working with text mining, machine learning techniques, or automated text processing.

Dataset Name Suggestions

NLP Expert QA Dataset

QASPER: NLP Paper Questions and Evidence

Academic NLP Q&A Corpus

Natural Language Processing Research Questions

Attributes

Original Data Source: QASPER: NLP Questions and Evidence
s
Online Feature Selection and Its Applications
researchdata.smu.edu.sg
Updated May 31, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
HOI Steven; Jialei WANG; Peilin ZHAO; Rong JIN (2023). Online Feature Selection and Its Applications [Dataset]. http://doi.org/10.25440/smu.12062733.v1
Explore at:
Unique identifier
https://doi.org/10.25440/smu.12062733.v1
Dataset updated
May 31, 2023
Dataset provided by
SMU Research Data Repository (RDR)
Authors
HOI Steven; Jialei WANG; Peilin ZHAO; Rong JIN
License
https://www.gnu.org/licenses/gpl-3.0.htmlhttps://www.gnu.org/licenses/gpl-3.0.html
Description
Feature selection is an important technique for data mining before a machine learning algorithm is applied. Despite its importance, most studies of feature selection are restricted to batch learning. Unlike traditional batch learning methods, online learning represents a promising family of efficient and scalable machine learning algorithms for large-scale applications. Most existing studies of online learning require accessing all the attributes/features of training instances. Such a classical setting is not always appropriate for real-world applications when data instances are of high dimensionality or it is expensive to acquire the full set of attributes/features. To address this limitation, we investigate the problem of Online Feature Selection (OFS) in which an online learner is only allowed to maintain a classifier involved only a small and fixed number of features. The key challenge of Online Feature Selection is how to make accurate prediction using a small and fixed number of active features. This is in contrast to the classical setup of online learning where all the features can be used for prediction. We attempt to tackle this challenge by studying sparsity regularization and truncation techniques. Specifically, this article addresses two different tasks of online feature selection: (1) learning with full input where an learner is allowed to access all the features to decide the subset of active features, and (2) learning with partial input where only a limited number of features is allowed to be accessed for each instance by the learner. We present novel algorithms to solve each of the two problems and give their performance analysis. We evaluate the performance of the proposed algorithms for online feature selection on several public datasets, and demonstrate their applications to real-world problems including image classification in computer vision and microarray gene expression analysis in bioinformatics. The encouraging results of our experiments validate the efficacy and efficiency of the proposed techniques.Related Publication: Hoi, S. C., Wang, J., Zhao, P., & Jin, R. (2012). Online feature selection for mining big data. In Proceedings of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications (pp. 93-100). ACM. http://dx.doi.org/10.1145/2351316.2351329 Full text available in InK: http://ink.library.smu.edu.sg/sis_research/2402/ Wang, J., Zhao, P., Hoi, S. C., & Jin, R. (2014). Online feature selection and its applications. IEEE Transactions on Knowledge and Data Engineering, 26(3), 698-710. http://dx.doi.org/10.1109/TKDE.2013.32 Full text available in InK: http://ink.library.smu.edu.sg/sis_research/2277/
f
The Maven Dependency Dataset
figshare.com
data.4tu.nl
txt
Updated Jul 23, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Steven Raemaekers; A. (Arie) van Deursen; Joost Visser (2020). The Maven Dependency Dataset [Dataset]. http://doi.org/10.4121/uuid:68a0e837-4fda-407a-949e-a159546e67b6
Explore at:
txtAvailable download formats
Unique identifier
https://doi.org/10.4121/uuid:68a0e837-4fda-407a-949e-a159546e67b6
Dataset updated
Jul 23, 2020
Dataset provided by
4TU.ResearchData
Authors
Steven Raemaekers; A. (Arie) van Deursen; Joost Visser
License
https://doi.org/10.4121/resource:terms_of_usehttps://doi.org/10.4121/resource:terms_of_use
Description
The Maven Dependency Dataset contains the data as described in the paper "Mining Metrics, Changes and Dependencies from the Maven Dependency Dataset". NOTE: See the README.TXT file for more information on the data in this dataset. The dataset consists of multiple parts: A snapshot of the Maven repository dated July 30, 2011 (maven.tar.gz), a MySQL database (complete.tar.gz) containing information on individual methods, classes and packages of different library versions, a Berkeley DB database (berkeley.tar.gz) containing metrics on all methods, classes and packages in the repository, a Neo4j graph database (graphdb.tar.gz) containing a call graph of the entire repository, scripts and analysis files (scriptsAndData.tar.gz), Source code and a binary package of the analysis software (fullmaven.jar and fullmaven-sources.jar), and text dumps of data in these databases (graphdump.tar.gz, processed.tar.gz, calls.tar.gz and units.tar.gz).
m
Cars Substory Interaction Data
figshare.manchester.ac.uk
bin
Updated Jan 19, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jonathan Carlton (2022). Cars Substory Interaction Data [Dataset]. http://doi.org/10.48420/18702719.v1
Explore at:
binAvailable download formats
Unique identifier
https://doi.org/10.48420/18702719.v1
Dataset updated
Jan 19, 2022
Dataset provided by
University of Manchester
Authors
Jonathan Carlton
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The interaction events captured in self-driving cars sub-story of the BBC Click 1000th episode.
o
AI Question Answering Data
opendatabay.com
.undefined
Updated Jul 5, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Datasimple (2025). AI Question Answering Data [Dataset]. https://www.opendatabay.com/data/ai-ml/d3c37fed-f830-444b-a988-c893d3396fd7
Explore at:
.undefinedAvailable download formats
Dataset updated
Jul 5, 2025
Dataset authored and provided by
Datasimple
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
Data Science and Analytics
Description
This dataset provides essential information for entries related to question answering tasks using AI models. It is designed to offer valuable insights for researchers and practitioners, enabling them to effectively train and rigorously evaluate their machine learning models. The dataset serves as a valuable resource for building and assessing question-answering systems. It is available free of charge.

Columns

instruction: Contains the specific instructions given to a model to generate a response.

responses: Includes the responses generated by the model based on the given instructions.

next_response: Provides the subsequent response from the model, following a previous response, which facilitates a conversational interaction.

answer: Lists the correct answer for each question presented in the instruction, acting as a reference for assessing the model's accuracy.

is_human_response: A boolean column that indicates whether a particular response was created by a human or by a machine learning model, helping to differentiate between the two. Out of nearly 19,300 entries, 254 are human-generated responses, while 18,974 were generated by models.

Distribution

The data files are typically in CSV format, with a dedicated train.csv file for training data and a test.csv file for testing purposes. The training file contains a large number of examples. Specific dates are not included within this dataset description, focusing solely on providing accurate and informative details about its content and purpose. Specific numbers for rows or records are not detailed in the available information.

Usage

This dataset is ideal for a variety of applications and use cases: * Training and Testing: Utilise train.csv to train question-answering models or algorithms, and test.csv to evaluate their performance on unseen questions. * Machine Learning Model Creation: Develop machine learning models specifically for question-answering by leveraging the instructional components, including instructions, responses, next responses, and human-generated answers, along with their is_human_response labels. * Model Performance Evaluation: Assess model performance by comparing predicted responses with actual human-generated answers from the test.csv file. * Data Augmentation: Expand existing data by paraphrasing instructions or generating alternative responses within similar contexts. * Conversational Agents: Build conversational agents or chatbots by utilising the instruction-response pairs for training. * Language Understanding: Train models to understand language and generate responses based on instructions and previous responses. * Educational Materials: Develop interactive quizzes or study guides, with models providing instant feedback to students. * Information Retrieval Systems: Create systems that help users find specific answers from large datasets. * Customer Support: Train customer support chatbots to provide quick and accurate responses to inquiries. * Language Generation Research: Develop novel algorithms for generating coherent responses in question-answering scenarios. * Automatic Summarisation Systems: Train systems to generate concise summaries by understanding main content through question answering. * Dialogue Systems Evaluation: Use the instruction-response pairs as a benchmark for evaluating dialogue system performance. * NLP Algorithm Benchmarking: Establish baselines against which other NLP tools and methods can be measured.

Coverage

The dataset's geographic scope is global. There is no specific time range or demographic scope noted within the available details, as specific dates are not included.

License

CC0

Who Can Use It

This dataset is highly suitable for: * Researchers and Practitioners: To gain insights into question answering tasks using AI models. * Developers: To train models, create chatbots, and build conversational agents. * Students: For developing educational materials and enhancing their learning experience through interactive tools. * Individuals and teams working on Natural Language Processing (NLP) projects. * Those creating information retrieval systems or customer support solutions. * Experts in natural language generation (NLG) and automatic summarisation systems. * Anyone involved in the evaluation of dialogue systems and machine learning model training.

Dataset Name Suggestions

AI Question Answering Data

Conversational AI Training Data

NLP Question-Answering Dataset

Model Evaluation QA Data

Dialogue Response Dataset

Attributes

Original Data Source: Question-Answering Training and Testing Data
r
International Journal of Engineering and Advanced Technology Publication fee...
researchhelpdesk.org
Updated Jun 25, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Research Help Desk (2022). International Journal of Engineering and Advanced Technology Publication fee - ResearchHelpDesk [Dataset]. https://www.researchhelpdesk.org/journal/publication-fee/552/international-journal-of-engineering-and-advanced-technology
Explore at:
Dataset updated
Jun 25, 2022
Dataset authored and provided by
Research Help Desk
Description
International Journal of Engineering and Advanced Technology Publication fee - ResearchHelpDesk - International Journal of Engineering and Advanced Technology (IJEAT) is having Online-ISSN 2249-8958, bi-monthly international journal, being published in the months of February, April, June, August, October, and December by Blue Eyes Intelligence Engineering & Sciences Publication (BEIESP) Bhopal (M.P.), India since the year 2011. It is academic, online, open access, double-blind, peer-reviewed international journal. It aims to publish original, theoretical and practical advances in Computer Science & Engineering, Information Technology, Electrical and Electronics Engineering, Electronics and Telecommunication, Mechanical Engineering, Civil Engineering, Textile Engineering and all interdisciplinary streams of Engineering Sciences. All submitted papers will be reviewed by the board of committee of IJEAT. Aim of IJEAT Journal disseminate original, scientific, theoretical or applied research in the field of Engineering and allied fields. dispense a platform for publishing results and research with a strong empirical component. aqueduct the significant gap between research and practice by promoting the publication of original, novel, industry-relevant research. seek original and unpublished research papers based on theoretical or experimental works for the publication globally. publish original, theoretical and practical advances in Computer Science & Engineering, Information Technology, Electrical and Electronics Engineering, Electronics and Telecommunication, Mechanical Engineering, Civil Engineering, Textile Engineering and all interdisciplinary streams of Engineering Sciences. impart a platform for publishing results and research with a strong empirical component. create a bridge for a significant gap between research and practice by promoting the publication of original, novel, industry-relevant research. solicit original and unpublished research papers, based on theoretical or experimental works. Scope of IJEAT International Journal of Engineering and Advanced Technology (IJEAT) covers all topics of all engineering branches. Some of them are Computer Science & Engineering, Information Technology, Electronics & Communication, Electrical and Electronics, Electronics and Telecommunication, Civil Engineering, Mechanical Engineering, Textile Engineering and all interdisciplinary streams of Engineering Sciences. The main topic includes but not limited to: 1. Smart Computing and Information Processing Signal and Speech Processing Image Processing and Pattern Recognition WSN Artificial Intelligence and machine learning Data mining and warehousing Data Analytics Deep learning Bioinformatics High Performance computing Advanced Computer networking Cloud Computing IoT Parallel Computing on GPU Human Computer Interactions 2. Recent Trends in Microelectronics and VLSI Design Process & Device Technologies Low-power design Nanometer-scale integrated circuits Application specific ICs (ASICs) FPGAs Nanotechnology Nano electronics and Quantum Computing 3. Challenges of Industry and their Solutions, Communications Advanced Manufacturing Technologies Artificial Intelligence Autonomous Robots Augmented Reality Big Data Analytics and Business Intelligence Cyber Physical Systems (CPS) Digital Clone or Simulation Industrial Internet of Things (IIoT) Manufacturing IOT Plant Cyber security Smart Solutions – Wearable Sensors and Smart Glasses System Integration Small Batch Manufacturing Visual Analytics Virtual Reality 3D Printing 4. Internet of Things (IoT) Internet of Things (IoT) & IoE & Edge Computing Distributed Mobile Applications Utilizing IoT Security, Privacy and Trust in IoT & IoE Standards for IoT Applications Ubiquitous Computing Block Chain-enabled IoT Device and Data Security and Privacy Application of WSN in IoT Cloud Resources Utilization in IoT Wireless Access Technologies for IoT Mobile Applications and Services for IoT Machine/ Deep Learning with IoT & IoE Smart Sensors and Internet of Things for Smart City Logic, Functional programming and Microcontrollers for IoT Sensor Networks, Actuators for Internet of Things Data Visualization using IoT IoT Application and Communication Protocol Big Data Analytics for Social Networking using IoT IoT Applications for Smart Cities Emulation and Simulation Methodologies for IoT IoT Applied for Digital Contents 5. Microwaves and Photonics Microwave filter Micro Strip antenna Microwave Link design Microwave oscillator Frequency selective surface Microwave Antenna Microwave Photonics Radio over fiber Optical communication Optical oscillator Optical Link design Optical phase lock loop Optical devices 6. Computation Intelligence and Analytics Soft Computing Advance Ubiquitous Computing Parallel Computing Distributed Computing Machine Learning Information Retrieval Expert Systems Data Mining Text Mining Data Warehousing Predictive Analysis Data Management Big Data Analytics Big Data Security 7. Energy Harvesting and Wireless Power Transmission Energy harvesting and transfer for wireless sensor networks Economics of energy harvesting communications Waveform optimization for wireless power transfer RF Energy Harvesting Wireless Power Transmission Microstrip Antenna design and application Wearable Textile Antenna Luminescence Rectenna 8. Advance Concept of Networking and Database Computer Network Mobile Adhoc Network Image Security Application Artificial Intelligence and machine learning in the Field of Network and Database Data Analytic High performance computing Pattern Recognition 9. Machine Learning (ML) and Knowledge Mining (KM) Regression and prediction Problem solving and planning Clustering Classification Neural information processing Vision and speech perception Heterogeneous and streaming data Natural language processing Probabilistic Models and Methods Reasoning and inference Marketing and social sciences Data mining Knowledge Discovery Web mining Information retrieval Design and diagnosis Game playing Streaming data Music Modelling and Analysis Robotics and control Multi-agent systems Bioinformatics Social sciences Industrial, financial and scientific applications of all kind 10. Advanced Computer networking Computational Intelligence Data Management, Exploration, and Mining Robotics Artificial Intelligence and Machine Learning Computer Architecture and VLSI Computer Graphics, Simulation, and Modelling Digital System and Logic Design Natural Language Processing and Machine Translation Parallel and Distributed Algorithms Pattern Recognition and Analysis Systems and Software Engineering Nature Inspired Computing Signal and Image Processing Reconfigurable Computing Cloud, Cluster, Grid and P2P Computing Biomedical Computing Advanced Bioinformatics Green Computing Mobile Computing Nano Ubiquitous Computing Context Awareness and Personalization, Autonomic and Trusted Computing Cryptography and Applied Mathematics Security, Trust and Privacy Digital Rights Management Networked-Driven Multicourse Chips Internet Computing Agricultural Informatics and Communication Community Information Systems Computational Economics, Digital Photogrammetric Remote Sensing, GIS and GPS Disaster Management e-governance, e-Commerce, e-business, e-Learning Forest Genomics and Informatics Healthcare Informatics Information Ecology and Knowledge Management Irrigation Informatics Neuro-Informatics Open Source: Challenges and opportunities Web-Based Learning: Innovation and Challenges Soft computing Signal and Speech Processing Natural Language Processing 11. Communications Microstrip Antenna Microwave Radar and Satellite Smart Antenna MIMO Antenna Wireless Communication RFID Network and Applications 5G Communication 6G Communication 12. Algorithms and Complexity Sequential, Parallel And Distributed Algorithms And Data Structures Approximation And Randomized Algorithms Graph Algorithms And Graph Drawing On-Line And Streaming Algorithms Analysis Of Algorithms And Computational Complexity Algorithm Engineering Web Algorithms Exact And Parameterized Computation Algorithmic Game Theory Computational Biology Foundations Of Communication Networks Computational Geometry Discrete Optimization 13. Software Engineering and Knowledge Engineering Software Engineering Methodologies Agent-based software engineering Artificial intelligence approaches to software engineering Component-based software engineering Embedded and ubiquitous software engineering Aspect-based software engineering Empirical software engineering Search-Based Software engineering Automated software design and synthesis Computer-supported cooperative work Automated software specification Reverse engineering Software Engineering Techniques and Production Perspectives Requirements engineering Software analysis, design and modelling Software maintenance and evolution Software engineering tools and environments Software engineering decision support Software design patterns Software product lines Process and workflow management Reflection and metadata approaches Program understanding and system maintenance Software domain modelling and analysis Software economics Multimedia and hypermedia software engineering Software engineering case study and experience reports Enterprise software, middleware, and tools Artificial intelligent methods, models, techniques Artificial life and societies Swarm intelligence Smart Spaces Autonomic computing and agent-based systems Autonomic computing Adaptive Systems Agent architectures, ontologies, languages and protocols Multi-agent systems Agent-based learning and knowledge discovery Interface agents Agent-based auctions and marketplaces Secure mobile and multi-agent systems Mobile agents SOA and Service-Oriented Systems Service-centric software engineering Service oriented requirements engineering Service oriented architectures Middleware for service based systems Service discovery and composition Service level
Electronic Invoicing Event Logs
search.datacite.org
figshare.com
Updated Jun 27, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Almir Djedović (2018). Electronic Invoicing Event Logs [Dataset]. http://doi.org/10.4121/uuid:5a9039b8-794a-4ccd-a5ef-4671f0a258a4
Explore at:
Unique identifier
https://doi.org/10.4121/uuid:5a9039b8-794a-4ccd-a5ef-4671f0a258a4
Dataset updated
Jun 27, 2018
Dataset provided by
DataCitehttps://www.datacite.org/
4TU.Centre for Research Data
Authors
Almir Djedović
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This set of data contains information about the process execution of electronic invoicing. The process of electronic invoicing contains the following activities: invoice scanning, approve invoice, liquidation and so on. The data set contains information about the event name, event type, time of the event's execution and the participant whose execution the event is related to. The data is formatted in the MXML format in order to be used for the process mining analysis using tools such as ProM and so on.

Facebook

Twitter

Click to copy link

Link copied

Cite

Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz (2020). Video-to-Model Data Set [Dataset]. http://doi.org/10.6084/m9.figshare.12026850.v1

Video-to-Model Data Set

Explore at:

223 scholarly articles cite this dataset (View in Google Scholar)

xmlAvailable download formats

Unique identifier

https://doi.org/10.6084/m9.figshare.12026850.v1

Dataset updated

Mar 24, 2020

Dataset provided by

figshare

Authors

Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This data set belongs to the paper "Video-to-Model: Unsupervised Trace Extraction from Videos for Process Discovery and Conformance Checking in Manual Assembly", submitted on March 24, 2020, to the 18th International Conference on Business Process Management (BPM).Abstract: Manual activities are often hidden deep down in discrete manufacturing processes. For the elicitation and optimization of process behavior, complete information about the execution of Manual activities are required. Thus, an approach is presented on how execution level information can be extracted from videos in manual assembly. The goal is the generation of a log that can be used in state-of-the-art process mining tools. The test bed for the system was lightweight and scalable consisting of an assembly workstation equipped with a single RGB camera recording only the hand movements of the worker from top. A neural network based real-time object classifier was trained to detect the worker’s hands. The hand detector delivers the input for an algorithm, which generates trajectories reflecting the movement paths of the hands. Those trajectories are automatically assigned to work steps using the position of material boxes on the assembly shelf as reference points and hierarchical clustering of similar behaviors with dynamic time warping. The system has been evaluated in a task-based study with ten participants in a laboratory, but under realistic conditions. The generated logs have been loaded into the process mining toolkit ProM to discover the underlying process model and to detect deviations from both, instructions and ground truth, using conformance checking. The results show that process mining delivers insights about the assembly process and the system’s precision.The data set contains the generated and the annotated logs based on the video material gathered during the user study. In addition, the petri nets from the process discovery and conformance checking conducted with ProM (http://www.promtools.org) and the reference nets modeled with Yasper (http://www.yasper.org/) are provided.

Clear search

Close search

Google apps

Main menu

	Combined	Maus	Tang
Accuracy	99.78	99.74	99.83
Precision	99.22	99.20	99.24
Recall	95.71	96.34	95.10

Video-to-Model Data Set

Brain-like Computer Report

Coal Miners Detection

Miners Object Detection dataset

Get the Dataset

This is just an example of the data

Dataset structure

Data Format

Example of XML file structure

Miners detection might be made in accordance with your requirements.

TrainingData provides high-quality data annotation tailored to your needs

A pre-trained sound event detection neural network

Human-Computer Interaction Logs

Data Buffalo Toraja

The TAWOS dataset

Global ML-ready dataset for mining areas in satellite images

Datasets for the paper: SDHRHA

Replication Data for: Age and Gender Identification in Unbalanced Social...

Market Basket Analysis

Market Basket Analysis

Introduction

An Example of Association Rules

Strategy

Dataset Description

Libraries in R

Data Pre-processing

International Journal of Engineering and Advanced Technology FAQ -...

Bias-Free Dataset of Food Delivery App Reviews with Data Poisoning Attacks

NLP Expert QA Dataset

Columns

Distribution

Usage

Coverage

License

Who Can Use It

Dataset Name Suggestions

Attributes

Online Feature Selection and Its Applications

The Maven Dependency Dataset

Cars Substory Interaction Data

AI Question Answering Data

Columns

Distribution

Usage

Coverage

License

Who Can Use It

Dataset Name Suggestions

Attributes

International Journal of Engineering and Advanced Technology Publication fee...

Electronic Invoicing Event Logs

Video-to-Model Data Set