66 datasets found
  1. Data from: An open dataset for intelligent recognition and classification of...

    • springernature.figshare.com
    bin
    Updated Jul 8, 2024
    Cite
    Xuhui Zhang; Wenjuan Yang; Bing Ma; Yanqun Wang; Yujia Wu; Jianxin Yan; Yongwei Liu; Chao Zhang; Jicheng Wan; Yue Wang; Mengyao Huang; Yuyang Li; Dian Zhao (2024). An open dataset for intelligent recognition and classification of abnormal condition in longwall mining [Dataset]. http://doi.org/10.6084/m9.figshare.22654945.v1
    Explore at:
    Available download formats: bin
    Dataset updated
    Jul 8, 2024
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Xuhui Zhang; Wenjuan Yang; Bing Ma; Yanqun Wang; Yujia Wu; Jianxin Yan; Yongwei Liu; Chao Zhang; Jicheng Wan; Yue Wang; Mengyao Huang; Yuyang Li; Dian Zhao
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This work developed an image dataset of the underground longwall mining face (DsLMF+), which consists of 138,004 images annotated with 6 categories: mine personnel, hydraulic support guard plate, large coal, towline, miners' behaviour, and mine safety helmet. All labels are publicly available in YOLO and COCO formats. The dataset aims to support further research on the intelligent identification and classification of abnormal conditions in underground mining.
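Since the labels ship in COCO format, here is a minimal inspection sketch (assumptions: you downloaded the COCO-style labels and point the script at one JSON annotation file; the name `annotations.json` is a placeholder):

```python
import json
from collections import Counter

# placeholder filename; use the actual COCO annotation file from the download
with open("annotations.json", encoding="utf-8") as fh:
    coco = json.load(fh)

# map category ids to the six class names and count boxes per class
names = {c["id"]: c["name"] for c in coco["categories"]}
counts = Counter(names[a["category_id"]] for a in coco["annotations"])
print(len(coco["images"]), "images")
print(counts.most_common())
```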

  2. Data Mining Dataset

    • universe.roboflow.com
    zip
    Updated Aug 4, 2023
    Cite
    ilham project (2023). Data Mining Dataset [Dataset]. https://universe.roboflow.com/ilham-project/data-mining-n52lu/model/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 4, 2023
    Dataset authored and provided by
    ilham project
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Uangrupiah Bounding Boxes
    Description

    Data Mining

## Overview

Data Mining is a dataset for object detection tasks. It contains Uangrupiah annotations for 692 images.

## Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model (a download sketch follows below).

## License

This dataset is available under the [CC0 1.0 Public Domain license](https://creativecommons.org/publicdomain/zero/1.0/).
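A download sketch using the `roboflow` Python package (the workspace and project slugs are taken from the citation URL above; the API key is a placeholder):

```python
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")  # placeholder; use your own key
project = rf.workspace("ilham-project").project("data-mining-n52lu")
dataset = project.version(1).download("coco")  # or another supported format
print(dataset.location)  # local folder containing images and labels
```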
    
  3. Data Mining Test Dataset

    • universe.roboflow.com
    zip
    Updated Oct 20, 2025
    Cite
    ons (2025). Data Mining Test Dataset [Dataset]. https://universe.roboflow.com/ons-eykpy/data-mining-test-fjlw4/dataset/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 20, 2025
    Dataset authored and provided by
    ons
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Cars Damage Cars Bounding Boxes
    Description

    Data Mining Test

## Overview

Data Mining Test is a dataset for object detection tasks. It contains Cars Damage Cars annotations for 382 images.

## Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License

This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  4. Dataset for training classifiers of comparative sentences

    • live.european-language-grid.eu
    csv
    Updated Apr 19, 2024
    Cite
    (2024). Dataset for training classifiers of comparative sentences [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7607
    Explore at:
    Available download formats: csv
    Dataset updated
    Apr 19, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

As there was no large publicly available cross-domain dataset for comparative argument mining, we created one composed of sentences annotated with BETTER / WORSE markers (the first object is better / worse than the second object) or NONE (the sentence does not contain a comparison of the target objects). BETTER sentences stand for a pro-argument in favor of the first compared object; WORSE sentences represent a con-argument and favor the second object.

We aimed to minimize domain-specific biases in the dataset in order to capture the nature of comparison rather than the nature of the particular domains, and thus decided to control the specificity of domains through the selection of comparison targets. We hypothesized, and confirmed in preliminary experiments, that comparison targets usually have a common hypernym (i.e., are instances of the same class), which we utilized for selecting the compared object pairs.

The most specific domain we chose is computer science, with comparison targets such as programming languages, database products, and technology standards like Bluetooth or Ethernet. Many computer science concepts can be compared objectively (e.g., on transmission speed or suitability for certain applications). The objects for this domain were manually extracted from "List of"-articles on Wikipedia. In the annotation process, annotators were asked to label sentences from this domain only if they had some basic knowledge of computer science. The second, broader domain is brands. It contains objects of different types (e.g., cars, electronics, and food). As brands are present in everyday life, anyone should be able to label the majority of sentences containing well-known brands such as Coca-Cola or Mercedes. Again, targets for this domain were manually extracted from "List of"-articles on Wikipedia. The third domain, random, is not restricted to any topic: for each of 24 randomly selected seed words, 10 similar words were collected based on the distributional similarity API of JoBimText (http://www.jobimtext.org). The seed words were created using randomlists.com: book, car, carpenter, cellphone, Christmas, coffee, cork, Florida, hamster, hiking, Hoover, Metallica, NBC, Netflix, ninja, pencil, salad, soccer, Starbucks, sword, Tolkien, wine, wood, XBox, Yale.

Especially for brands and computer science, the resulting object lists were large (4493 objects for brands and 1339 for computer science). In a manual inspection, low-frequency and ambiguous objects were removed from all object lists (e.g., RAID (a hardware concept) and Unity (a game engine) are also regularly used nouns). The remaining objects were combined into pairs: for each object type (seed Wikipedia list page or seed word), all possible combinations were created. These pairs were then used to find sentences containing both objects. This approach to selecting compared object pairs tends to minimize the inclusion of domain-specific data but does not solve the problem fully; we leave extending the dataset with more diverse object pairs, including abstract concepts, for future work.

For sentence mining, we used the publicly available index of dependency-parsed sentences from the Common Crawl corpus, containing over 14 billion English sentences filtered for duplicates. This index was queried for sentences containing both objects of each pair.

For 90% of the pairs, we also added comparative cue words (better, easier, faster, nicer, wiser, cooler, decent, safer, superior, solid, terrific, worse, harder, slower, poorly, uglier, poorer, lousy, nastier, inferior, mediocre) to the query in order to bias the selection towards comparisons, while still admitting comparisons that do not contain any of the anticipated cues. This was necessary because random sampling would have yielded only a very tiny fraction of comparisons. Note that even sentences containing a cue word do not necessarily express a comparison between the desired targets (dog vs. cat: "He's the best pet that you can get, better than a dog or cat."). It is thus especially crucial to enable a classifier to learn not to rely on the existence of cue words alone (very likely in a random sample of sentences with very few comparisons). For our corpus, we kept pairs with at least 100 retrieved sentences.

From all sentences of those pairs, 2500 per category were randomly sampled as candidates for a crowdsourced annotation that we conducted on figure-eight.com in several small batches. Each sentence was annotated by at least five trusted workers. We ranked annotations by confidence, figure-eight's internal measure combining annotator trust and voting, and discarded annotations with a confidence below 50%. Of all annotated items, 71% received unanimous votes, and for over 85% at least 4 out of 5 workers agreed, rendering the collection procedure, which aimed at ease of annotation, successful.

The final dataset contains 7199 sentences with 271 distinct object pairs. The majority of sentences (over 72%) are non-comparative despite biasing the selection with cue words; in 70% of the comparative sentences, the favored target is named first.

You can browse through the data here: https://docs.google.com/spreadsheets/d/1U8i6EU9GUKmHdPnfwXEuBxi0h3aiRCLPRC-3c9ROiOE/edit?usp=sharing

A full description of the dataset is available in the workshop paper at the ACL 2019 conference. Please cite this paper if you use the data: Franzek, Mirco, Alexander Panchenko, and Chris Biemann. "Categorization of Comparative Sentences for Argument Mining." arXiv preprint arXiv:1809.06152 (2018).

@inproceedings{franzek2018categorization,
  title={Categorization of Comparative Sentences for Argument Mining},
  author={Panchenko, Alexander and Bondarenko, Alexander and Franzek, Mirco and Hagen, Matthias and Biemann, Chris},
  booktitle={Proceedings of the 6th Workshop on Argument Mining at ACL 2019},
  year={2019},
  address={Florence, Italy}
}
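As a quick-start sketch for working with the CSV release (the filename and column names here are assumptions; check the actual header before relying on them):

```python
import pandas as pd

# hypothetical filename/columns; the file ships as CSV per the catalog entry
df = pd.read_csv("comparative_sentences.csv")
print(df.shape)                                  # expect 7199 rows
print(df["label"].value_counts(normalize=True))  # BETTER / WORSE / NONE shares
```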

  5. ELKI Multi-View Clustering Data Sets Based on the Amsterdam Library of...

    • data.niaid.nih.gov
    • elki-project.github.io
    • +1more
    Updated May 2, 2024
    Cite
    Schubert, Erich; Zimek, Arthur (2024). ELKI Multi-View Clustering Data Sets Based on the Amsterdam Library of Object Images (ALOI) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6355683
    Explore at:
    Dataset updated
    May 2, 2024
    Dataset provided by
    Ludwig-Maximilians-Universität München
    Authors
    Schubert, Erich; Zimek, Arthur
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These data sets were originally created for the following publications:

    M. E. Houle, H.-P. Kriegel, P. Kröger, E. Schubert, A. Zimek Can Shared-Neighbor Distances Defeat the Curse of Dimensionality? In Proceedings of the 22nd International Conference on Scientific and Statistical Database Management (SSDBM), Heidelberg, Germany, 2010.

    H.-P. Kriegel, E. Schubert, A. Zimek Evaluation of Multiple Clustering Solutions In 2nd MultiClust Workshop: Discovering, Summarizing and Using Multiple Clusterings Held in Conjunction with ECML PKDD 2011, Athens, Greece, 2011.

    The outlier data set versions were introduced in:

    E. Schubert, R. Wojdanowski, A. Zimek, H.-P. Kriegel On Evaluation of Outlier Rankings and Outlier Scores In Proceedings of the 12th SIAM International Conference on Data Mining (SDM), Anaheim, CA, 2012.

    They are derived from the original image data available at https://aloi.science.uva.nl/

    The image acquisition process is documented in the original ALOI work: J. M. Geusebroek, G. J. Burghouts, and A. W. M. Smeulders, The Amsterdam library of object images, Int. J. Comput. Vision, 61(1), 103-112, January, 2005

    Additional information is available at: https://elki-project.github.io/datasets/multi_view

    The following views are currently available:

| Feature type | Description | Files |
| --- | --- | --- |
| Object number | Sparse 1000-dimensional vectors that give the true object assignment | objs.arff.gz |
| RGB color histograms | Standard RGB color histograms (uniform binning) | aloi-8d.csv.gz, aloi-27d.csv.gz, aloi-64d.csv.gz, aloi-125d.csv.gz, aloi-216d.csv.gz, aloi-343d.csv.gz, aloi-512d.csv.gz, aloi-729d.csv.gz, aloi-1000d.csv.gz |
| HSV color histograms | Standard HSV/HSB color histograms in various binnings | aloi-hsb-2x2x2.csv.gz, aloi-hsb-3x3x3.csv.gz, aloi-hsb-4x4x4.csv.gz, aloi-hsb-5x5x5.csv.gz, aloi-hsb-6x6x6.csv.gz, aloi-hsb-7x7x7.csv.gz, aloi-hsb-7x2x2.csv.gz, aloi-hsb-7x3x3.csv.gz, aloi-hsb-14x3x3.csv.gz, aloi-hsb-8x4x4.csv.gz, aloi-hsb-9x5x5.csv.gz, aloi-hsb-13x4x4.csv.gz, aloi-hsb-14x5x5.csv.gz, aloi-hsb-10x6x6.csv.gz, aloi-hsb-14x6x6.csv.gz |
| Color similarity | Average similarity to 77 reference colors (not histograms): 18 colors × 2 saturations × 2 brightnesses + 5 grey values (incl. white, black) | aloi-colorsim77.arff.gz (feature subsets are meaningful here, as these features are computed independently of each other) |
| Haralick features | First 13 Haralick features (radius 1 pixel) | aloi-haralick-1.csv.gz |
| Front to back | Vectors representing front faces vs. back faces of individual objects | front.arff.gz |
| Basic light | Vectors indicating basic light situations | light.arff.gz |
| Manual annotations | Manually annotated object groups of semantically related objects, such as cups | manual1.arff.gz |

    Outlier Detection Versions

    Additionally, we generated a number of subsets for outlier detection:

| Feature type | Description | Files |
| --- | --- | --- |
| RGB histograms | Downsampled to 100000 objects (553 outliers) | aloi-27d-100000-max10-tot553.csv.gz, aloi-64d-100000-max10-tot553.csv.gz |
| RGB histograms | Downsampled to 75000 objects (717 outliers) | aloi-27d-75000-max4-tot717.csv.gz, aloi-64d-75000-max4-tot717.csv.gz |
| RGB histograms | Downsampled to 50000 objects (1508 outliers) | aloi-27d-50000-max5-tot1508.csv.gz, aloi-64d-50000-max5-tot1508.csv.gz |
  6. Procure-To-Payment (P2P) Object-centric Event Log in OCEL 2.0 Standard

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    bin, json, xml
    Updated Oct 7, 2023
    Cite
    Gyunam Park; Leah Tacke genannt Unterberg (2023). Procure-To-Payment (P2P) Object-centric Event Log in OCEL 2.0 Standard [Dataset]. http://doi.org/10.5281/zenodo.8412920
    Explore at:
    Available download formats: json, xml, bin
    Dataset updated
    Oct 7, 2023
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Gyunam Park; Leah Tacke genannt Unterberg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Short Description

    This process describes the Procure-To-Pay (P2P) procedure within an organization, starting from the initiation of a purchase requirement up to the execution of payment. This simulation extensively uses genuine SAP transactions and object types to offer a realistic representation of the P2P process.

    Overview

    Within our simulated organization:

    • Procurement Initiatives: The procurement journey begins when a department or individual recognizes a need and creates a Purchase Requisition using transaction ME51N.

    • Approval Process: Before the purchase can proceed, the requisition must be approved. This is carried out using transaction ME54N. Given the nature of our simulation, there may be instances where the approval process takes an unusually long time, exemplifying the Lengthy Approval Process behavior.

    • Vendor Interactions:

      • Upon approval, a Request for Quotation is sent out to potential vendors using transaction ME41.
      • Vendors then submit their quotations, which are maintained in the system using transaction ME47.
    • Purchase Order Creation: Once a vendor's quotation is selected, a Purchase Order is created using transaction ME21N. The purchase order is then subjected to an internal approval process (ME29N). Occasionally, maverick buying—where purchases are made without proper authorization—can be observed.

    • Goods & Invoice Management:

      • When the goods are received, a Goods Receipt is recorded using transaction MIGO.
      • Invoices from vendors are then received and recorded. A three-way match, which checks the purchase order, goods receipt, and invoice for discrepancies, is performed using transaction MRBR.
    • Payment: Once everything is verified, payments are executed using transaction F110. However, there may be instances of Duplicate Payments in our simulation, where the system mistakenly pays the same invoice more than once.

    Special Behaviors:

    • Maverick Buying: Unauthorized purchases, bypassing the standard procedure.
    • Duplicate Payments: An error leading to the same invoice being paid multiple times.
    • Lengthy Approval Process: Delays in approving purchase requisitions or purchase orders, which might lead to operational inefficiencies.

    General Properties

    An overview of log properties is given below.

| Property | Value |
| --- | --- |
| Event Types | 10 |
| Object Types | 7 |
| Events | 14671 |
| Objects | 9543 |
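Because the log follows the OCEL 2.0 standard, it can be inspected with an OCEL-aware library. A sketch using pm4py (assumptions: a recent pm4py release that ships the `read_ocel2_xml` reader; the filename is a placeholder for the XML download):

```python
import pm4py

ocel = pm4py.read_ocel2_xml("p2p_ocel2.xml")  # placeholder filename
print(len(ocel.events), "events")    # expected: 14671
print(len(ocel.objects), "objects")  # expected: 9543
print(ocel.objects["ocel:type"].unique())  # the 7 object types
```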

    Authors

    Gyunam Park and Leah Tacke genannt Unterberg

    Contributing

    To contribute, drop us an email! We are happy to receive your feedback.

  7. Atlanta, Georgia - Aerial imagery object identification dataset for building...

    • figshare.com
    tiff
    Updated Jun 1, 2023
    + more versions
    Cite
    Kyle Bradbury; Benjamin Brigman; Leslie Collins; Timothy Johnson; Sebastian Lin; Richard Newell; Sophia Park; Sunith Suresh; Hoel Wiesner; Yue Xi (2023). Atlanta, Georgia - Aerial imagery object identification dataset for building and road detection, and building height estimation [Dataset]. http://doi.org/10.6084/m9.figshare.3504308.v1
    Explore at:
    Available download formats: tiff
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Kyle Bradbury; Benjamin Brigman; Leslie Collins; Timothy Johnson; Sebastian Lin; Richard Newell; Sophia Park; Sunith Suresh; Hoel Wiesner; Yue Xi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Georgia, Atlanta
    Description

    This dataset is part of the larger data collection, “Aerial imagery object identification dataset for building and road detection, and building height estimation”, linked to in the references below and can be accessed here: https://dx.doi.org/10.6084/m9.figshare.c.3290519. For a full description of the data, please see the metadata: https://dx.doi.org/10.6084/m9.figshare.3504413.

    Imagery data are from the United States Geological Survey (USGS); building and road shapefiles are from OpenStreetMap (OSM) (these OSM data are made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/); and the lidar data are from the U.S. National Oceanic and Atmospheric Administration (NOAA) and the Texas Natural Resources Information System (TNRIS).

  8. Repartition of part of visdrone2019 dataset

    • data.niaid.nih.gov
    Updated Nov 24, 2022
    Cite
    Gang Liu (2022). Repartition of part of visdrone2019 dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7355397
    Explore at:
    Dataset updated
    Nov 24, 2022
    Authors
    Gang Liu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The VisDrone2019 dataset was collected by the AISKYEYE team at the Lab of Machine Learning and Data Mining, Tianjin University, China. It contains a large number of objects in urban and rural road scenes (10 categories such as pedestrians, vehicles, and bicycles), covering a wide variety of scenes and containing many small objects. The original dataset is available at https://github.com/VisDrone/VisDrone-Dataset. We selected the training set of the object detection part as our data set and randomly divided it into new training, validation, and test sets at a ratio close to 7:2:1.
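A sketch of such a repartition (the author's exact shuffling procedure and seed are not stated, so this is only an illustrative reimplementation):

```python
import random

def split_7_2_1(items, seed=0):
    """Shuffle a copy of the item list and cut it into ~70/20/10% parts."""
    rng = random.Random(seed)  # fixed seed for reproducibility; the original is unknown
    items = list(items)
    rng.shuffle(items)
    n = len(items)
    a, b = int(0.7 * n), int(0.9 * n)
    return items[:a], items[a:b], items[b:]

# usage: train, val, test = split_7_2_1(image_paths)
```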

  9. Market Basket Analysis

    • kaggle.com
    zip
    Updated Dec 9, 2021
    Cite
    Aslan Ahmedov (2021). Market Basket Analysis [Dataset]. https://www.kaggle.com/datasets/aslanahmedov/market-basket-analysis
    Explore at:
    Available download formats: zip (23875170 bytes)
    Dataset updated
    Dec 9, 2021
    Authors
    Aslan Ahmedov
    Description

    Market Basket Analysis

    Market basket analysis with Apriori algorithm

    The retailer wants to target customers with suggestions for the itemsets they are most likely to purchase. I was given a retailer's dataset; the transaction data covers all transactions that occurred over a period of time. The retailer will use the results to grow its business by providing customers with itemset suggestions, increasing customer engagement, improving the customer experience, and identifying customer behavior. I will solve this problem using association rules, an unsupervised learning technique that checks for the dependency of one data item on another.

    Introduction

    Association rules are most useful when you want to discover associations between different objects in a set, such as frequent patterns in a transaction database. They can tell you which items customers frequently buy together, allowing the retailer to identify relationships between items.

    An Example of Association Rules

    Assume there are 100 customers: 10 of them bought a computer mouse, 9 bought a mouse mat, and 8 bought both. For the rule "bought computer mouse => bought mouse mat": support = P(mouse & mat) = 8/100 = 0.08; confidence = support / P(mouse) = 0.08/0.10 = 0.8; lift = confidence / P(mat) = 0.8/0.09 ≈ 8.9. This is just a simple example. In practice, a rule needs the support of several hundred transactions before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.

    Strategy

    • Data Import
    • Data Understanding and Exploration
    • Transformation of the data – so that is ready to be consumed by the association rules algorithm
    • Running association rules
    • Exploring the rules generated
    • Filtering the generated rules
    • Visualization of Rule

    Dataset Description

    • File name: Assignment-1_Data
    • List name: retaildata
    • File format: .xlsx
    • Number of rows: 522065
    • Number of Attributes: 7

      • BillNo: 6-digit number assigned to each transaction. Nominal.
      • Itemname: Product name. Nominal.
      • Quantity: The quantities of each product per transaction. Numeric.
      • Date: The day and time when each transaction was generated. Numeric.
      • Price: Product price. Numeric.
      • CustomerID: 5-digit number assigned to each customer. Nominal.
      • Country: Name of the country where each customer resides. Nominal.


    Libraries in R

    First, we need to load the required libraries; each is briefly described below.

    • arules - Provides the infrastructure for representing, manipulating, and analyzing transaction data and patterns (frequent itemsets and association rules).
    • arulesViz - Extends package 'arules' with various visualization techniques for association rules and itemsets, including several interactive visualizations for rule exploration.
    • tidyverse - An opinionated collection of R packages designed for data science; makes it easy to install and load multiple 'tidyverse' packages in a single step.
    • readxl - Read Excel files in R.
    • plyr - Tools for splitting, applying, and combining data.
    • ggplot2 - A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics and what graphical primitives to use, and it takes care of the details.
    • knitr - Dynamic report generation in R.
    • magrittr - Provides a mechanism for chaining commands with a forward-pipe operator, %>%, which forwards a value, or the result of an expression, into the next function call/expression; there is flexible support for the type of right-hand side expressions.
    • dplyr - A fast, consistent tool for working with data-frame-like objects, both in memory and out of memory.


    Data Pre-processing

    Next, we upload Assignment-1_Data.xlsx to R and read the dataset. Now we can see our data in R.


    Next, we clean the data frame by removing missing values.


    To apply association rule mining, we need to convert the data frame into transaction data, so that all items bought together in one invoice will be in ...
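The walkthrough above is truncated. As an illustrative Python equivalent of the described R arules pipeline (an assumption, not the author's code; it uses mlxtend, and the column names follow the attribute list above):

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# load and clean, mirroring the R steps above
df = pd.read_excel("Assignment-1_Data.xlsx").dropna(subset=["BillNo", "Itemname"])

# one row per invoice, one boolean column per item
basket = (df.groupby(["BillNo", "Itemname"])["Quantity"].sum()
            .unstack(fill_value=0) > 0)

itemsets = apriori(basket, min_support=0.01, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.5)
print(rules.sort_values("lift", ascending=False).head())
```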

  10. Comprehensive Ethereum Execution Data for Object-Centric Process Mining of...

    • cryptodata.center
    Updated Dec 4, 2024
    Cite
    (2024). Comprehensive Ethereum Execution Data for Object-Centric Process Mining of DApps - Dataset - CryptoData Hub [Dataset]. https://cryptodata.center/dataset/comprehensive-ethereum-execution-data-for-object-centric-process-mining-of-dapps
    Explore at:
    Dataset updated
    Dec 4, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

The dataset pertains to the collection and analysis of blockchain execution data, particularly from Ethereum-based Decentralized Applications (DApps). This data includes transactions, transaction receipts, and detailed transaction traces, documenting the execution steps performed by the Ethereum Virtual Machine (EVM). Such traces are essential for understanding the interaction between smart contracts and accounts, including Contract Accounts (CAs) and Externally Owned Accounts (EOAs).

A blockchain is an append-only ledger that chronologically records data in blocks. Each block contains transactions that signify state transitions, and transaction receipts that provide a hashed result of these transitions to ensure uniform results across different executions. The dataset includes a classification of Ethereum accounts, detailing the functions and interactions between EOAs and CAs, where CAs deploy and execute smart contract code.

The dataset captures the granular operational data of blockchain transactions, such as function calls, contract creations, and log entries generated by smart contracts. These details are crucial for creating object-centric event logs, aiding in process mining and analysis to bridge the gap between theoretical process models and actual execution.

Contract creations and function calls are fundamental components of the dataset. The former documents the deployment of smart contracts, including the mechanics of contract updates and additions through various design patterns. Function calls between accounts are also extensively logged, providing insights into the flow of Ethereum's native token, Ether, and other transactional data within the blockchain.

Delegated calls and log entries represent more specialized interactions within Ethereum: delegated calls allow contracts to use code from other contracts to manipulate their own state, supporting upgradeable contract designs, while log entries, specified within smart contract code, facilitate the communication of contract execution details to external systems.

To handle the diverse and dynamic nature of blockchain data, the dataset employs the Object-Centric Event Log (OCEL) format. This format accommodates multiple object types in a single log, addressing issues such as event divergence and convergence that are typical of traditional single-case logs. The latest version, OCEL 2.0, supports documenting dynamic object roles and relationships, improving the fidelity of logs in capturing blockchain operations.

In summary, the dataset is structured to support a comprehensive analysis of blockchain behaviors, particularly focusing on Ethereum DApps. It is tailored to assist researchers and practitioners in understanding and analyzing the decentralized execution of smart contracts and the associated data flows within the blockchain environment.

  11. VisDrone Dataset for Drone-Based Computer Vision

    • kaggle.com
    zip
    Updated Sep 24, 2024
    Cite
    Evil Spirit05 (2024). VisDrone Dataset for Drone-Based Computer Vision [Dataset]. https://www.kaggle.com/datasets/evilspirit05/visdrone/data
    Explore at:
    Available download formats: zip (1990878150 bytes)
    Dataset updated
    Sep 24, 2024
    Authors
    Evil Spirit05
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description
    The VisDrone Dataset is a comprehensive benchmark developed by the AISKYEYE team at the Lab of Machine Learning and Data Mining, Tianjin University, China. Designed for various computer vision tasks associated with drone-based image and video analysis, the dataset serves as an essential resource for researchers and practitioners in the field.
    

    Key Features

    • Extensive Collection: The dataset comprises 288 video clips containing 261,908 frames and 10,209 static images, all captured using different drone-mounted cameras. This extensive collection showcases a wide range of environments, objects, and scenarios.
    • Diverse Environments: VisDrone encompasses images and videos from 14 cities across China, covering both urban and rural settings. This diversity enhances the dataset's applicability to various real-world applications.
    • Varied Object Categories: The dataset features a rich array of object categories, including pedestrians, vehicles, bicycles, and tricycles. This variety allows for robust training and evaluation of models across multiple object detection tasks.
    • High-Quality Annotations: With over 2.6 million manually annotated bounding boxes, the VisDrone Dataset provides detailed ground truth data for object detection, tracking, and crowd counting tasks. Annotations also include attributes such as scene visibility, object class, and occlusion, enabling researchers to develop more effective models.

    Dataset Structure

    The VisDrone dataset is organized into five main subsets, each targeting a specific task:

    • Task 1: Object Detection in Images
    • Task 2: Object Detection in Videos
    • Task 3: Single-Object Tracking
    • Task 4: Multi-Object Tracking
    • Task 5: Crowd Counting

    This structured approach facilitates focused training and evaluation for distinct computer vision challenges.

    Applications

    The VisDrone Dataset is widely used for training and evaluating deep learning models in various drone-based computer vision tasks, including:
    
    • Object Detection: Identifying and localizing multiple object classes in images and videos.
    • Object Tracking: Following individual objects across frames in video sequences, enabling applications in surveillance and traffic monitoring.
    • Crowd Counting: Estimating the number of individuals in crowded scenes, which is valuable for urban planning and safety assessments.

    Conclusion

    The VisDrone Dataset stands out as a significant contribution to the field of drone-based computer vision. Its diverse sensor data, extensive annotations, and various task-focused subsets make it a valuable resource for advancing research and development in drone applications. Whether for academic research or practical implementations, the VisDrone Dataset is instrumental in fostering innovation in the rapidly evolving domain of drone technology.
    
  12. Asbest veins in the open pit conditions

    • data.mendeley.com
    Updated Dec 12, 2022
    Cite
    Mikhail Ronkin (2022). Asbest veins in the open pit conditions [Dataset]. http://doi.org/10.17632/y2jfk63tpd.1
    Explore at:
    Dataset updated
    Dec 12, 2022
    Authors
    Mikhail Ronkin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The database includes 1660 images of asbestos rock chunks with asbestos veins, taken under different weather and daytime conditions. All data were collected at the Bazhenovskoye field, Russia. All data are labeled for instance segmentation (as well as object detection and semantic segmentation) problems, with labels in the COCO format. The archive contains all data in the images folder and annotations in the annotations folder. The labeling was performed manually in the CVAT software. The image size is 2592 × 2048.

  13. Make Data Count Dataset - MinerU Extraction

    • kaggle.com
    zip
    Updated Aug 26, 2025
    Cite
    Omid Erfanmanesh (2025). Make Data Count Dataset - MinerU Extraction [Dataset]. https://www.kaggle.com/datasets/omiderfanmanesh/make-data-count-dataset-mineru-extraction
    Explore at:
    Available download formats: zip (4272989320 bytes)
    Dataset updated
    Aug 26, 2025
    Authors
    Omid Erfanmanesh
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset Description

    This dataset contains PDF-to-text conversions of scientific research articles, prepared for the task of data citation mining. The goal is to identify references to research datasets within full-text scientific papers and classify them as Primary (data generated in the study) or Secondary (data reused from external sources).

    The PDF articles were processed using MinerU, which converts scientific PDFs into structured machine-readable formats (JSON, Markdown, images). This ensures participants can access both the raw text and layout information needed for fine-grained information extraction.

    Files and Structure

    Each paper directory contains the following files:

    • *_origin.pdf The original PDF file of the scientific article.

    • *_content_list.json Structured extraction of the PDF content, where each object represents a text or figure element with metadata. Example entry:

      {
       "type": "text",
       "text": "10.1002/2017JC013030",
       "text_level": 1,
       "page_idx": 0
      }
      
    • full.md The complete article content in Markdown format (linearized for easier reading).

    • images/ Folder containing figures and extracted images from the article.

    • layout.json Page layout metadata, including positions of text blocks and images.

    Data Mining Task

    The aim is to detect dataset references in the article text and classify them:

    Each dataset mention must be labeled as:

    • Primary: Data generated by the paper (new experiments, field observations, sequencing runs, etc.).
    • Secondary: Data reused from external repositories or prior studies.

    Training and Test Splits

    • train/ → Articles with gold-standard labels (train_labels.csv).
    • test/ → Articles without labels, used for evaluation.
    • train_labels.csv → Ground truth with:

      • article_id: Research paper DOI.
      • dataset_id: Extracted dataset identifier.
      • type: Citation type (Primary / Secondary).
    • sample_submission.csv → Example submission format.

    Example

    Paper: https://doi.org/10.1098/rspb.2016.1151
    Data: https://doi.org/10.5061/dryad.6m3n9
    In-text span: "The data we used in this publication can be accessed from Dryad at doi:10.5061/dryad.6m3n9."
    Citation type: Primary

    This dataset enables participants to develop and test NLP systems for:

    • Information extraction (locating dataset mentions).
    • Identifier normalization (mapping mentions to persistent IDs).
    • Citation classification (distinguishing Primary vs Secondary data usage).
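As a starting point for the mention-extraction step, here is a minimal sketch (the directory layout follows the file list above; the DOI regex is a common approximation, and the helper name is hypothetical):

```python
import json
import re
from pathlib import Path

DOI_RE = re.compile(r"10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")

def dataset_mentions(paper_dir):
    """Yield (doi, page_idx) pairs found in a paper's MinerU content list."""
    for path in Path(paper_dir).glob("*_content_list.json"):
        for block in json.loads(path.read_text(encoding="utf-8")):
            if block.get("type") == "text":
                for doi in DOI_RE.findall(block.get("text", "")):
                    yield doi, block.get("page_idx")
```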
  14. Zenodo Open Metadata snapshot - Training dataset for records and communities...

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, bin
    Updated Dec 15, 2022
    + more versions
    Cite
    Zenodo team (2022). Zenodo Open Metadata snapshot - Training dataset for records and communities classifier building [Dataset]. http://doi.org/10.5281/zenodo.7438358
    Explore at:
    Available download formats: bin, application/gzip
    Dataset updated
    Dec 15, 2022
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Zenodo team
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains Zenodo's published open access records and communities metadata, including entries marked by the Zenodo staff as spam and deleted.

    The datasets are gzipped compressed JSON-lines files, where each line is a JSON object representation of a Zenodo record or community.

    Records dataset

    Filename: zenodo_open_metadata_{ date of export }.jsonl.gz

    Each object contains the terms: part_of, thesis, description, doi, meeting, imprint, references, recid, alternate_identifiers, resource_type, journal, related_identifiers, title, subjects, notes, creators, communities, access_right, keywords, contributors, publication_date

    which correspond to the fields with the same name available in Zenodo's record JSON Schema at https://zenodo.org/schemas/records/record-v1.0.0.json.

    In addition, some terms have been altered:

    • The term files contains a list of dictionaries containing filetype, size, and filename only.
    • The term license contains a short Zenodo ID of the license (e.g. "cc-by").

    Communities dataset

    Filename: zenodo_community_metadata_{ date of export }.jsonl.gz

    Each object contains the terms: id, title, description, curation_policy, page

    which correspond to the fields with the same name available in Zenodo's community creation form.

    Notes for all datasets

    For each object the term spam contains a boolean value, determining whether a given record/community was marked as spam content by Zenodo staff.

    Some top-level terms that were missing in the metadata may contain a null value.

    A smaller uncompressed random sample of 200 JSON lines is also included for each dataset to test and get familiar with the format without having to download the entire dataset.
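A minimal streaming sketch for either dump (the export date in the filename is an example; substitute the actual one):

```python
import gzip
import json

def iter_jsonl_gz(path):
    # one JSON object (record or community) per line; stream to avoid loading it all
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            yield json.loads(line)

# usage: count entries flagged as spam by Zenodo staff
n_spam = sum(1 for rec in iter_jsonl_gz("zenodo_open_metadata_2022-12-15.jsonl.gz")
             if rec.get("spam"))
```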

  15. Coal Miners Detection

    • kaggle.com
    zip
    Updated Sep 18, 2023
    Cite
    Unique Data (2023). Coal Miners Detection [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/miners-detection
    Explore at:
    Available download formats: zip (5795006 bytes)
    Dataset updated
    Sep 18, 2023
    Authors
    Unique Data
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Miners Object Detection dataset

    The dataset consists of photos captured within various mines, focusing on miners engaged in their work. Each photo is annotated with bounding boxes around the miners, and an attribute indicates whether each miner is sitting or standing.

    💴 For commercial usage: to discuss your requirements, learn about the price, and buy the dataset, leave a request on our website.

    The dataset's diverse applications such as computer vision, safety assessment and others make it a valuable resource for researchers, employers, and policymakers in the mining industry.


    Get the Dataset

    This is just an example of the data

    Leave a request on https://trainingdata.pro/datasets to discuss your requirements, learn about the price and buy the dataset

    Dataset structure

    • images - contains the original images of miners
    • boxes - includes bounding box labeling for the original images
    • annotations.xml - contains the coordinates of the bounding boxes and labels created for the original photos

    Data Format

    Each image from the images folder is accompanied by an XML annotation in the annotations.xml file indicating the coordinates of the bounding boxes for miner detection. For each point, the x and y coordinates are provided. The position of the miner is also provided by the attribute is_sitting (true, false).

    Example of XML file structure

    [screenshot: example of the annotations.xml structure]
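Since the screenshot may not render outside Kaggle, here is a parsing sketch (assumption: a CVAT-style annotations.xml, with box elements carrying xtl/ytl/xbr/ybr attributes and an is_sitting attribute child, as the description suggests):

```python
import xml.etree.ElementTree as ET

root = ET.parse("annotations.xml").getroot()
for image in root.iter("image"):
    for box in image.iter("box"):
        attr = box.find("attribute[@name='is_sitting']")
        print(image.get("name"),
              box.get("xtl"), box.get("ytl"), box.get("xbr"), box.get("ybr"),
              attr.text if attr is not None else "unknown")
```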

    Miners detection might be made in accordance with your requirements.

    🧩 This is just an example of the data. Leave a request here to learn more

    🚀 You can learn more about our high-quality unique datasets here

    keywords: coal mines, underground, safety monitoring system, safety dataset, manufacturing dataset, industrial safety database, health and safety dataset, quality control dataset, quality assurance dataset, annotations dataset, computer vision dataset, image dataset, object detection, human images, classification

  16. US Deep Learning Market Analysis, Size, and Forecast 2025-2029

    • technavio.com
    pdf
    Updated Jul 8, 2025
    Cite
    Technavio (2025). US Deep Learning Market Analysis, Size, and Forecast 2025-2029 [Dataset]. https://www.technavio.com/report/us-deep-learning-market-industry-analysis
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jul 8, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    License

    https://www.technavio.com/content/privacy-notice

    Time period covered
    2025 - 2029
    Description


    US Deep Learning Market Size 2025-2029

    The deep learning market size in US is forecast to increase by USD 5.02 billion at a CAGR of 30.1% between 2024 and 2029.

    The deep learning market is experiencing robust growth, driven by the increasing adoption of artificial intelligence (AI) across industries for advanced solutions. This trend is fueled by the availability of vast amounts of data, a key requirement for deep learning algorithms to function effectively. Industry-specific solutions are gaining traction as businesses seek to leverage deep learning for specific use cases such as image and speech recognition, fraud detection, and predictive maintenance. Alongside these, intuitive data visualization tools are simplifying complex neural network outputs, helping stakeholders understand and validate insights.

    However, challenges remain, including the need for powerful computing resources, data privacy concerns, and the high cost of implementing and maintaining deep learning systems. Despite these hurdles, the market's potential for innovation and disruption is immense, making it an exciting space for businesses to explore further. Semi-supervised learning, data labeling, and data cleaning facilitate efficient training of deep learning models. Cloud analytics is another significant trend, as companies seek to leverage cloud computing for cost savings and scalability.

    What will be the size of the market during the forecast period?


    Deep learning, a subset of machine learning, continues to shape industries by enabling advanced applications such as image and speech recognition, text generation, and pattern recognition. Reinforcement learning, a type of deep learning, gains traction, with deep reinforcement learning leading the charge. Anomaly detection, a crucial application of unsupervised learning, safeguards systems against security vulnerabilities. Ethical implications and fairness considerations are increasingly important in deep learning, with emphasis on explainable AI and model interpretability. Graph neural networks and attention mechanisms enhance data preprocessing for sequential data modeling and object detection. Time series forecasting and dataset creation further expand deep learning's reach, while privacy preservation and bias mitigation ensure responsible use.

    In summary, deep learning's market dynamics reflect a constant pursuit of innovation, efficiency, and ethical considerations. The Deep Learning Market in the US is flourishing as organizations embrace intelligent systems powered by supervised learning and emerging self-supervised learning techniques. These methods refine predictive capabilities and reduce reliance on labeled data, boosting scalability. BFSI firms utilize AI image recognition for various applications, including personalizing customer communication, maintaining a competitive edge, and automating repetitive tasks to boost productivity. Sophisticated feature extraction algorithms now enable models to isolate patterns with high precision, particularly in applications such as image classification for healthcare, security, and retail.

    How is this market segmented and which is the largest segment?

    The market research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    • Application
      • Image recognition
      • Voice recognition
      • Video surveillance and diagnostics
      • Data mining
    • Type
      • Software
      • Services
      • Hardware
    • End-user
      • Security
      • Automotive
      • Healthcare
      • Retail and commerce
      • Others
    • Geography
      • North America
        • US

    By Application Insights

    The Image recognition segment is estimated to witness significant growth during the forecast period. In the realm of artificial intelligence (AI) and machine learning, image recognition, a subset of computer vision, is gaining significant traction. This technology utilizes neural networks, deep learning models, and various machine learning algorithms to decipher visual data from images and videos. Image recognition is instrumental in numerous applications, including visual search, product recommendations, and inventory management. Consumers can take photographs of products to discover similar items, enhancing the online shopping experience. In the automotive sector, image recognition is indispensable for advanced driver assistance systems (ADAS) and autonomous vehicles, enabling the identification of pedestrians, other vehicles, road signs, and lane markings.

    Furthermore, image recognition plays a pivotal role in augmented reality (AR) and virtual reality (VR) applications, where it tracks physical objects and overlays digital content onto real-world scenarios. The model training process involves the backpropagation algorithm, which calculates the loss function …

  17. BI Analysis Software Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jul 15, 2025
    Cite
    Data Insights Market (2025). BI Analysis Software Report [Dataset]. https://www.datainsightsmarket.com/reports/bi-analysis-software-1963150
    Explore at:
    Available download formats: doc, pdf, ppt
    Dataset updated
    Jul 15, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Business Intelligence (BI) analysis software market is booming, driven by big data and cloud computing. Discover key trends, growth projections (2025-2033), leading companies (Microsoft, Tableau, SAP, etc.), and regional market shares in our comprehensive analysis. Learn how BI is transforming decision-making across industries.

  18. Simulated Object-Centric Event Logs (OCEL 2.0) for Order-to-Cash,...

    • zenodo.org
    xml
    Updated Oct 2, 2024
    Cite
    Alessandro Berti (2024). Simulated Object-Centric Event Logs (OCEL 2.0) for Order-to-Cash, Procure-to-Pay, Hiring, and Hospital Patient Lifecycle Processes [Dataset]. http://doi.org/10.5281/zenodo.13879980
    Explore at:
    Available download formats: xml
    Dataset updated
    Oct 2, 2024
    Dataset provided by
    Zenodo
    Authors
    Alessandro Berti
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains simulated object-centric event logs for four distinct business processes: Order-to-Cash (O2C), Procure-to-Pay (P2P), Hiring, and Hospital Patient Lifecycle. Each process is designed to reflect realistic workflows, encompassing multiple object types and capturing key activities, decision points, and process dynamics. The dataset is aimed at providing a rich source of data for process mining, analysis, and modeling activities.

    1. Order-to-Cash (O2C):
    The O2C process simulates an end-to-end business flow starting from customer order placement to payment receipt. It includes diverse activities such as order approval, fulfillment, invoice generation, and payment processing, involving object types like Customers, Orders, Products, and Invoices. The dataset captures variability through random decisions, synchronization between departments, and workarounds in credit checks and inventory adjustments. Attributes such as customer tiers, order values, and shipment statuses add further depth, allowing for detailed analysis of this complex process.

    2. Procure-to-Pay (P2P):
    The P2P process simulates the procurement lifecycle, from requisition creation to payment of suppliers. Key activities include purchase order creation, three-way matching, goods receipt, and payment processing. The event log records object types such as Purchase Requisitions, Purchase Orders, Suppliers, and Invoices. Variability is introduced through approval decisions, batching, and potential mismatches in the matching process. The dataset represents the inherent complexities of real-world procurement operations, including batching and synchronization issues between different process stages.

    3. Hiring Process:
    The hiring process log tracks the recruitment lifecycle, from job requisition creation to onboarding. It includes object types like Candidates, Job Requisitions, Recruiters, and Interviewers. The process covers activities such as resume screening, interviews, assessments, and offer management. Variability in the hiring process is introduced through random delays, candidate decisions, and background check durations. Batching occurs in stages like resume screening and onboarding, while synchronization challenges arise during interview scheduling.

    4. Hospital Patient Lifecycle:
    This log represents the lifecycle of patients within a hospital, capturing interactions with multiple resources such as physicians, beds, and medical equipment. The process begins with pre-admission activities, followed by diagnosis, treatment, and discharge. The dataset includes object types like Patients, Physicians, and Medical Equipment, with attributes related to patient demographics and event severity. The process reflects the dynamic nature of hospital operations, including synchronization of resources and the occurrence of workarounds in case of delays or resource unavailability.

    Each process simulation captures high variability, synchronization issues, and batching, making this dataset suitable for analyzing real-world operational challenges. The logs provide a comprehensive view of complex workflows, supporting advanced analysis, including object-centric process mining.

    This description provides the necessary details about the dataset, highlighting its structure, purpose, and potential uses for researchers and process analysts.

    Object-centric event logs conceived and simulated by the o1-preview-2024-09-12 LRM, using the https://github.com/fit-alessandro-berti/llm-ocel-simulator project.

  19. PLOS ONE publication and citation data

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    • +2more
    zip
    Updated May 15, 2023
    Cite
    Alexander Petersen (2023). PLOS ONE publication and citation data [Dataset]. http://doi.org/10.6071/M39W8V
    Explore at:
    Available download formats: zip
    Dataset updated
    May 15, 2023
    Dataset provided by
    University of California, Merced
    Authors
    Alexander Petersen
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Merged PLOS ONE and Web of Science data compiled in .dta files produced by STATA13. Included is a Do-file for reproducing the regression model estimates reported in the pre-print (Tables I and II) and published version (Table 1). Each observation (.dta line) corresponds to a given PLOS ONE article, with various article-level and editor-level characteristics used as explanatory and control variables. This summary provides a brief description of each variable and its source.

    If you use this data, please cite: A. M. Petersen. Megajournal mismanagement: Manuscript decision bias and anomalous editor activity at PLOS ONE. Journal of Informetrics 13, 100974 (2019). DOI: 10.1016/j.joi.2019.100974

    Methods

    We gathered the citation information for all PLOS ONE articles, indexed by A, from the Web of Science (WOS) Core Collection. From these data we obtained a master list of the unique digital object identifiers, DOI_A, and the number of citations, c_A, at the time of the data download (census) date:

    (a) For the pre-print this corresponds to December 3, 2016;

    (b) and for the final published article this corresponds to February 25, 2019.

    We then used each DOI_A to access the corresponding online XML version of each article at PLOS ONE by visiting the unique web address "http://journals.plos.org/plosone/article?id=" + "DOI_A". After parsing the full-text XML (primarily the author byline data and reference list), we merged the PLOS ONE publication information and WOS citation data by matching on DOI_A.

    allofplos: PLOS has since made all full-text XML data freely available: https://www.plos.org/text-and-data-mining ; this option was not available at the time of our data collection.
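A sketch of the retrieval step as described (the URL pattern is quoted verbatim from the text; whether the endpoint still serves the XML today is not guaranteed):

```python
import requests

def fetch_plos_article(doi):
    # URL pattern per the methods description above
    url = "http://journals.plos.org/plosone/article?id=" + doi
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    return resp.text  # full text to be parsed for the author byline and reference list

# usage: fetch_plos_article("10.1371/journal.pone.0000001")  # placeholder DOI
```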

  20. Dataset B

    • figshare.com
    xlsx
    Updated Jul 18, 2017
    Cite
    Suhailan Safei (2017). Dataset B [Dataset]. http://doi.org/10.6084/m9.figshare.5216377.v2
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jul 18, 2017
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Suhailan Safei
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    UEFA Championship ranking-based clustering output
