96 datasets found

Lisbon, Portugal, hotel’s customer dataset with three years of personal,...
data.mendeley.com
Updated Nov 18, 2020
+ more versions
Cite
Nuno Antonio (2020). Lisbon, Portugal, hotel’s customer dataset with three years of personal, behavioral, demographic, and geographic information [Dataset]. http://doi.org/10.17632/j83f5fsh6c.1
Explore at:
Unique identifier
https://doi.org/10.17632/j83f5fsh6c.1
Dataset updated
Nov 18, 2020
Authors
Nuno Antonio
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Portugal, Lisbon
Description
Hotel customer dataset with 31 variables describing a total of 83,590 instances (customers). It covers three full years of customer behavioral data. In addition to personal and behavioral information, the dataset also contains demographic and geographical information. This dataset helps address the lack of real-world business data available for educational and research purposes. It can be used for data mining, machine learning, and other analytical problems within data science. Because its unit of analysis is the customer, it is especially suitable for building customer segmentation models, including clustering and RFM (Recency, Frequency, and Monetary value) models, but it can also be used in classification and regression problems.
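
As a rough illustration of the RFM idea named in the description, the sketch below computes recency, frequency, and monetary value per customer from a few made-up bookings. The record layout (customer id, booking date, amount) and the reference date are assumptions for illustration only; they are not the dataset's actual variable names.

```python
from collections import defaultdict
from datetime import date

# Hypothetical bookings: (customer_id, booking_date, amount_spent).
bookings = [
    ("A", date(2018, 1, 10), 120.0),
    ("A", date(2018, 6, 5), 80.0),
    ("B", date(2017, 3, 1), 300.0),
]


def rfm(records, today):
    """Return {customer: (recency_days, frequency, monetary)}."""
    last_seen = {}
    count = defaultdict(int)
    total = defaultdict(float)
    for customer, day, amount in records:
        # Recency is measured from the most recent booking.
        last_seen[customer] = max(day, last_seen.get(customer, day))
        count[customer] += 1
        total[customer] += amount
    return {
        c: ((today - last_seen[c]).days, count[c], total[c]) for c in last_seen
    }


scores = rfm(bookings, today=date(2018, 12, 31))
```

Customers could then be binned on each of the three values (for example into quantiles) to form segments; the binning scheme is a modelling choice that the description does not specify.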
Data from: Joint Behavior-Topic Model for Microblogs
researchdata.smu.edu.sg
bin
Updated May 31, 2023
Cite
QIU Minghui; Feida ZHU; Jing JIANG (2023). Joint Behavior-Topic Model for Microblogs [Dataset]. http://doi.org/10.25440/smu.12062724.v1
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.25440/smu.12062724.v1
Dataset updated
May 31, 2023
Dataset provided by
SMU Research Data Repository (RDR)
Authors
QIU Minghui; Feida ZHU; Jing JIANG
License
http://rightsstatements.org/vocab/InC/1.0/
Description
We propose an LDA-based behavior-topic model (B-LDA) which jointly models user topic interests and behavioral patterns. We focus the study of the model on online social network settings such as microblogs like Twitter, where the textual content is relatively short but user interactions on them are rich.
Related Publication: Qiu, M., Zhu, F., & Jiang, J. (2013). It is not just what we say, but how we say them: LDA-based behavior-topic model. In 2013 SIAM International Conference on Data Mining (SDM’13): 2-4 May, Austin, Texas (pp. 794-802). Philadelphia: SIAM. http://doi.org/10.1137/1.9781611972832.88
Data from: A method for detecting characteristic patterns in social...
datadryad.org
dataone.org
+2 more
zip
Updated Dec 13, 2016
Cite
Nikolai W. F. Bode; Andrew Sutton; Lindsey Lacey; John G. Fennell; Ute Leonards (2016). A method for detecting characteristic patterns in social interactions with an application to handover interactions [Dataset]. http://doi.org/10.5061/dryad.8j27n
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.8j27n
Dataset updated
Dec 13, 2016
Dataset provided by
Dryad
Authors
Nikolai W. F. Bode; Andrew Sutton; Lindsey Lacey; John G. Fennell; Ute Leonards
Time period covered
Dec 6, 2016
Description
Data and algorithms: Data and algorithms for analysis associated with manuscript. See 'readme.txt' for further detail. (alldata.zip)
Appendix - Mining User Behaviour from Smartphone data: a literature review
data.dtu.dk
xlsx
Updated Jul 12, 2023
Cite
Valentino Servizi; Francisco Camara Pereira; Marie Karen Anderson; Otto Anker Nielsen (2023). Appendix - Mining User Behaviour from Smartphone data: a literature review [Dataset]. http://doi.org/10.11583/DTU.11989455
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.11583/DTU.11989455
Dataset updated
Jul 12, 2023
Dataset provided by
Technical University of Denmark
Authors
Valentino Servizi; Francisco Camara Pereira; Marie Karen Anderson; Otto Anker Nielsen
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Each study reviewed is catalogued as follows.
· Level of difficulty: Classification Task, Number and List of Classes.
· Approach: Method and Main Features.
· Performance: Score, Metric, Validation Method.
· Realism of dataset: Ground Truth, Person-day, Respondents, Observations, Collection Time, Area, Smartphone App.
· Sensors involved: AGPS, Inertial Navigation Systems (INS), Geographic Information Systems (GIS), Data Fusion.
Replication Data for: Do expectations towards Thai hospitality differ? The...
data.mendeley.com
Updated Feb 21, 2023
Cite
RAKSMEY SANN (2023). Replication Data for: Do expectations towards Thai hospitality differ? The views of English vs Chinese speaking travelers [Dataset]. http://doi.org/10.17632/v75j8yhpgy.1
Explore at:
Unique identifier
https://doi.org/10.17632/v75j8yhpgy.1
Dataset updated
Feb 21, 2023
Authors
RAKSMEY SANN
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset includes replication data for the paper: Sann, R. and Lai, P.-C. (2021), "Do expectations towards Thai hospitality differ? The views of English vs Chinese speaking travelers", International Journal of Culture, Tourism and Hospitality Research, Vol. 15 No. 1, pp. 43-58. https://doi.org/10.1108/IJCTHR-01-2020-0010
A New Data-Mining Method to Search for Behavioral Properties That Induce...
plos.figshare.com
tiff
Updated May 31, 2023
Cite
Takashi Ochiai; Yuji Suehiro; Katsuhiro Nishinari; Takeo Kubo; Hideaki Takeuchi (2023). A New Data-Mining Method to Search for Behavioral Properties That Induce Alignment and Their Involvement in Social Learning in Medaka Fish (Oryzias Latipes) [Dataset]. http://doi.org/10.1371/journal.pone.0071685
Explore at:
Available download formats: tiff
Unique identifier
https://doi.org/10.1371/journal.pone.0071685
Dataset updated
May 31, 2023
Dataset provided by
PLOS ONE
Authors
Takashi Ochiai; Yuji Suehiro; Katsuhiro Nishinari; Takeo Kubo; Hideaki Takeuchi
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background
Coordinated movement in social animal groups via social learning facilitates foraging activity. Few studies have examined the behavioral cause-and-effect between group members that mediates this social learning.
Methodology/Principal Findings
We first established a behavioral paradigm for visual food learning using medaka fish and demonstrated that a single fish can learn to associate a visual cue with a food reward. Grouped medaka fish (6 fish) learn to respond to the visual cue more rapidly than a single fish, indicating that medaka fish undergo social learning. We then established a data-mining method based on Kullback-Leibler divergence (KLD) to search for candidate behaviors that induce alignment and found that high-speed movement of a focal fish tended to induce alignment of the other members locally and transiently under free-swimming conditions without presentation of a visual cue. The high-speed movement of the informed and trained fish during visual cue presentation appeared to facilitate the alignment of naïve fish in response to some visual cues, thereby mediating social learning. Compared with naïve fish, the informed fish had a higher tendency to induce alignment of other naïve fish under free-swimming conditions without visual cue presentation, suggesting the involvement of individual recognition in social learning.
Conclusions/Significance
Behavioral cause-and-effect studies of the high-speed movement between fish group members will contribute to our understanding of the dynamics of social behaviors. The data-mining method used in the present study is a powerful method to search for candidate factors associated with inter-individual interactions using a dataset of time-series coordinate data of individuals.
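
The abstract does not spell out how the Kullback-Leibler divergence is applied to the fish trajectories. For orientation only, the sketch below computes the KLD between two discrete distributions; the two example histograms are invented and do not come from the dataset.

```python
import math


def kl_divergence(p, q):
    """D_KL(P || Q) in nats for two discrete distributions given as lists.

    Assumes both lists sum to 1 and that q[i] > 0 wherever p[i] > 0.
    Terms with p[i] == 0 contribute nothing to the sum.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical histograms of some behavioral measure under two conditions.
baseline = [0.5, 0.5]
after_event = [0.9, 0.1]

divergence = kl_divergence(baseline, after_event)
```

A larger value means the two distributions differ more. Note that the measure is asymmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.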
Data and code for: Physiological and behavioural resistance of malaria...
dataverse.ird.fr
pdf, text/x-r-source +2
Updated Dec 20, 2023
+ more versions
Cite
Paul Taconet; Diloma Dieudonné Soma; Barnabas Zogo; Karine Mouline; Frédéric Simard; Alphonsine Koffi Amanan; Roch Kounbobr Dabiré; Cédric Pennetier; Nicolas Moiroux (2023). Data and code for: Physiological and behavioural resistance of malaria vectors in rural West-Africa: a data mining study to adress their fine-scale spatiotemporal heterogeneity, drivers, and predictability [Dataset]. http://doi.org/10.23708/LV8GEW
Explore at:
Available download formats: zip(508869), text/x-r-source(3782), zip(156257727), pdf(33371), text/x-r-source(20243), txt(2304)
Unique identifier
https://doi.org/10.23708/LV8GEW
Dataset updated
Dec 20, 2023
Dataset provided by
DataSuds
Authors
Paul Taconet; Diloma Dieudonné Soma; Barnabas Zogo; Karine Mouline; Frédéric Simard; Alphonsine Koffi Amanan; Roch Kounbobr Dabiré; Cédric Pennetier; Nicolas Moiroux
License
https://dataverse.ird.fr/api/datasets/:persistentId/versions/5.0/customlicense?persistentId=doi:10.23708/LV8GEW
Time period covered
Oct 1, 2016 - Jun 1, 2018
Area covered
West Africa, Africa, Côte d'Ivoire, Savanes, Sud-Ouest, Burkina Faso
Dataset funded by
IRD
French Ministry for Europe and Foreign Affairs
Agence Nationale de la Recherche
Initiative 5% - Expertise France
Description
These data and scripts accompany the manuscript "Physiological and behavioural resistance of malaria vectors in rural West-Africa: a data mining study to adress their fine-scale spatiotemporal heterogeneity, drivers, and predictability" by Paul Taconet, Dieudonne Diloma Soma, Barnabas Zogo, Karine Mouline, Frederic Simard, Alphonsine Amanan Koffi, Roch Kounbobr Dabiré, Cedric Pennetier, and Nicolas Moiroux. The manuscript has been posted as a preprint on bioRxiv (https://doi.org/10.1101/2022.08.20.504631). In this data-mining work, we modeled a set of indicators of physiological resistance to insecticide (prevalence of three target-site mutations) and biting behaviours (early- and late-biting, exophagy) of Anopheles mosquitoes in two rural areas of West Africa, located in Burkina Faso and Côte d'Ivoire. To this end, we used mosquito field collections along with heterogeneous, multisource and multi-scale environmental data. The objectives were i) to assess the small-scale spatial and temporal heterogeneity of the indicators, ii) to better understand their drivers, and iii) to assess their spatio-temporal predictability, at scales that are consistent with operational action. The explanatory variables covered a wide range of potential environmental determinants of vector resistance to insecticide or feeding behaviour: vector control, human availability and nocturnal behaviour, macro- and micro-climatic conditions, landscape, etc.
Contents: Input datasets and the R script used for the data analyses are provided. Because the models may take a very long time to fit (due to the size of the raw data), they were pre-fit, saved as .rds files ('R Data Serialization' format), and made available in the "models" folder. The R script used to answer one of the reviewers' questions (reviewer no. 1, question no. 1) is also included.
Webis Simulation Data Mining Bridge Models Corpus 2012 (Webis-SDMbridge-12)
zenodo.org
live.european-language-grid.eu
zip
Updated Jan 24, 2020
Cite
Steven Burrows; Benno Stein; Jörg Frochte; Tim Gollub; Peter Hirsch; Jens Opolka; Tom Paschke; Michael Völske (2020). Webis Simulation Data Mining Bridge Models Corpus 2012 (Webis-SDMbridge-12) [Dataset]. http://doi.org/10.5281/zenodo.3259676
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.3259676
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo http://zenodo.org/
Authors
Steven Burrows; Benno Stein; Jörg Frochte; Tim Gollub; Peter Hirsch; Jens Opolka; Tom Paschke; Michael Völske
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This corpus provides the simulation data mining community with a collection of 14641 bridge models and simulated behavior.

1. Folder "1-designs"

The text files in this directory should contain all information for the independent variables of any machine learning experiment. For reference, all 14641 IFC models are supplied in subfolders 001 to 147.

2. Folder "2-simulation"

This folder contains samples of the simulation output that may be viewed in Paraview (http://www.paraview.org). The original model contains the "Org" filename fragment, and the maximum and minimum behaviors are indicated with "Max" and "Min" filename fragments. Displacement, strain, and stress behaviors are all given. Only three of the 14641 models are given, as the file sizes are around 1.4 to 2.2 megabytes each. The complete data (approximately 81 gigabytes) can be regenerated and provided on request (email webis@medien.uni-weimar.de).

3. Folder "3-aggregation"

Maximum displacement, strain, and stress measurements are given in the text files individually, and together in the files with the "vtk" filename fragment. This data should be sufficient for the dependent variables of any machine learning experiment.
Market Basket Analysis
kaggle.com
zip
Updated Dec 9, 2021
Cite
Aslan Ahmedov (2021). Market Basket Analysis [Dataset]. https://www.kaggle.com/datasets/aslanahmedov/market-basket-analysis
Explore at:
Available download formats: zip (23875170 bytes)
Dataset updated
Dec 9, 2021
Authors
Aslan Ahmedov
Description
Market Basket Analysis

Market basket analysis with Apriori algorithm

The retailer wants to target customers with suggestions on the itemsets that a customer is most likely to purchase. I was given a dataset containing a retailer's transaction data, which covers all the transactions that happened over a period of time. The retailer will use the results to grow its business and to offer customers itemset suggestions, which should increase customer engagement, improve customer experience, and help identify customer behavior. I will solve this problem using association rules, an unsupervised learning technique that checks for the dependency of one data item on another data item.

Introduction

Association rules are most often used when you want to find associations between different objects in a set, for example frequent patterns in a transaction database. They can tell you which items customers frequently buy together, and they allow the retailer to identify relationships between the items.

An Example of Association Rules

Assume there are 100 customers: 10 of them bought a Computer Mouse, 9 bought a Mat for Mouse, and 8 bought both. For the rule "bought Computer Mouse => bought Mat for Mouse":
- support = P(Mouse & Mat) = 8/100 = 0.08
- confidence = support/P(Computer Mouse) = 0.08/0.10 = 0.80
- lift = confidence/P(Mat for Mouse) = 0.80/0.09 = 8.89
This is just a simple example. In practice, a rule needs the support of several hundred transactions before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.
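
The three measures can be checked with a few lines of Python. This is a minimal sketch using the standard definitions for a rule A => B (confidence divides by the antecedent's frequency, lift divides confidence by the consequent's frequency); the function and argument names are made up for illustration.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Support, confidence, and lift for the rule antecedent => consequent."""
    support = n_both / n_total               # P(A and B)
    confidence = n_both / n_antecedent       # P(B | A) = P(A and B) / P(A)
    lift = confidence / (n_consequent / n_total)  # P(B | A) / P(B)
    return support, confidence, lift


# 100 customers, 10 bought a mouse, 9 bought a mat, 8 bought both.
support, confidence, lift = rule_metrics(
    n_total=100, n_antecedent=10, n_consequent=9, n_both=8
)
```

With these counts the rule has support 0.08, confidence 0.80, and lift of about 8.89. A lift well above 1 means the two items are bought together far more often than would be expected if the purchases were independent.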

Strategy

Data Import

Data Understanding and Exploration

Transformation of the data – so that is ready to be consumed by the association rules algorithm

Running association rules

Exploring the rules generated

Filtering the generated rules

Visualization of Rule

Dataset Description

File name: Assignment-1_Data

List name: retaildata

File format: .xlsx

Number of Rows: 522065

Number of Attributes: 7

BillNo: 6-digit number assigned to each transaction. Nominal.

Itemname: Product name. Nominal.

Quantity: The quantities of each product per transaction. Numeric.

Date: The day and time when each transaction was generated. Numeric.

Price: Product price. Numeric.

CustomerID: 5-digit number assigned to each customer. Nominal.

Country: Name of the country where each customer resides. Nominal.

Libraries in R

First, we need to load the required libraries. Each is described briefly below.

arules - Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules).

arulesViz - Extends package 'arules' with various visualization techniques for association rules and itemsets. The package also includes several interactive visualizations for rule exploration.

tidyverse - The tidyverse is an opinionated collection of R packages designed for data science.

readxl - Read Excel Files in R.

plyr - Tools for Splitting, Applying and Combining Data.

ggplot2 - A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

knitr - Dynamic Report generation in R.

magrittr - Provides a mechanism for chaining commands with a new forward-pipe operator, %>%. This operator will forward a value, or the result of an expression, into the next function call/expression. There is flexible support for the type of right-hand side expressions.

dplyr - A fast, consistent tool for working with data frame like objects, both in memory and out of memory.

tidyverse - This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step.

Data Pre-processing

Next, we need to load Assignment-1_Data.xlsx into R to read the dataset. Now we can see our data in R.

Next, we clean the data frame by removing missing values.

To apply association rule mining, we need to convert the data frame into transaction data so that all items that are bought together in one invoice will be in ...
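
The original walkthrough uses R and the arules package, and its code is not shown here. As a language-neutral illustration of the same two steps (dropping rows with missing values, then grouping items by invoice), here is a plain-Python sketch. The column names BillNo and Itemname come from the dataset description above; the rows and item names are invented.

```python
from collections import defaultdict

# Hypothetical rows shaped like the retail data (only two columns shown).
rows = [
    {"BillNo": "536365", "Itemname": "WHITE MUG"},
    {"BillNo": "536365", "Itemname": "RED LAMP"},
    {"BillNo": "536366", "Itemname": "WHITE MUG"},
    {"BillNo": "536366", "Itemname": None},  # missing value, dropped below
]


def to_transactions(records):
    """Group item names by invoice number, skipping incomplete rows."""
    baskets = defaultdict(set)
    for record in records:
        if record["BillNo"] is None or record["Itemname"] is None:
            continue  # remove rows with missing values
        baskets[record["BillNo"]].add(record["Itemname"])
    return dict(baskets)


transactions = to_transactions(rows)
```

Each value in `transactions` is the set of items bought on one invoice, which is the shape an association rule algorithm such as Apriori expects as input.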
Video-to-Model Data Set
figshare.com
commons.datacite.org
xml
Updated Mar 24, 2020
Cite
Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz (2020). Video-to-Model Data Set [Dataset]. http://doi.org/10.6084/m9.figshare.12026850.v1
Explore at:
Available download formats: xml
Unique identifier
https://doi.org/10.6084/m9.figshare.12026850.v1
Dataset updated
Mar 24, 2020
Dataset provided by
Figshare http://figshare.com/
Authors
Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This data set belongs to the paper "Video-to-Model: Unsupervised Trace Extraction from Videos for Process Discovery and Conformance Checking in Manual Assembly", submitted on March 24, 2020, to the 18th International Conference on Business Process Management (BPM).
Abstract: Manual activities are often hidden deep down in discrete manufacturing processes. For the elicitation and optimization of process behavior, complete information about the execution of manual activities is required. Thus, an approach is presented on how execution-level information can be extracted from videos in manual assembly. The goal is the generation of a log that can be used in state-of-the-art process mining tools. The test bed for the system was lightweight and scalable, consisting of an assembly workstation equipped with a single RGB camera recording only the hand movements of the worker from above. A neural network based real-time object classifier was trained to detect the worker’s hands. The hand detector delivers the input for an algorithm, which generates trajectories reflecting the movement paths of the hands. Those trajectories are automatically assigned to work steps using the position of material boxes on the assembly shelf as reference points and hierarchical clustering of similar behaviors with dynamic time warping. The system has been evaluated in a task-based study with ten participants in a laboratory, but under realistic conditions. The generated logs have been loaded into the process mining toolkit ProM to discover the underlying process model and to detect deviations from both instructions and ground truth, using conformance checking. The results show that process mining delivers insights about the assembly process and the system’s precision.
The data set contains the generated and the annotated logs based on the video material gathered during the user study. In addition, the Petri nets from the process discovery and conformance checking conducted with ProM (http://www.promtools.org) and the reference nets modeled with Yasper (http://www.yasper.org/) are provided.
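
The abstract mentions dynamic time warping (DTW) as the similarity measure behind the clustering but gives no implementation details. The sketch below is the textbook dynamic-programming form of DTW on one-dimensional sequences; it is an illustration only and not the authors' code. Real hand trajectories would be two-dimensional, so the point distance here (absolute difference) is a simplifying assumption.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The same movement performed at different speeds aligns with zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

Because DTW tolerates differences in timing, two workers performing the same step at different speeds end up close together, which is what makes it a plausible distance for clustering movement paths.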
Data: Understanding the Behavior of Process Mining Analysts: A Catalogue of...
zenodo.org
Updated Jul 9, 2025
Cite
Jessica Van Suetendael; Benoît Depaire; Mieke Jans; Niels Martin (2025). Data: Understanding the Behavior of Process Mining Analysts: A Catalogue of Exploratory Process Mining Behaviors [Dataset]. http://doi.org/10.5281/zenodo.15845469
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.15845469
Dataset updated
Jul 9, 2025
Dataset provided by
Zenodo http://zenodo.org/
Authors
Jessica Van Suetendael; Benoît Depaire; Mieke Jans; Niels Martin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The codings for the literature-based ethogram and the interview protocol of the paper titled "Understanding the Behavior of Process Mining Analysts: A Catalogue of Exploratory Process Mining Behaviors" can be found in this repository.
Financial-Behavior
kaggle.com
zip
Updated Nov 20, 2024
Cite
Ziya (2024). Financial-Behavior [Dataset]. https://www.kaggle.com/datasets/ziya07/financial-behavior
Explore at:
Available download formats: zip (30268 bytes)
Dataset updated
Nov 20, 2024
Authors
Ziya
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset contains 500 tweets related to financial literacy and consumer behavior, designed for tasks such as sentiment analysis, emotion classification, and behavior prediction. The dataset was generated to support research in financial literacy education and consumer behavior modeling, incorporating realistic tweet structures and metadata.

Dataset Features

tweet_content (string): The text of the tweets, reflecting various financial literacy topics and emotions.

emotion (categorical): The emotion expressed in the tweet, selected from: Positive, Fear, Anticipation, Disgust, Surprise.

sentiment_score (float): A numerical score representing the sentiment of the tweet, ranging from -1 (negative sentiment) to 1 (positive sentiment).

likes (integer): Number of likes the tweet received (simulated).

retweets (integer): Number of retweets the tweet received (simulated).

replies (integer): Number of replies the tweet received (simulated).

topic_tags (categorical): The main financial topic discussed in the tweet, selected from: Savings, Investment, Budgeting, Debt Management, Financial Planning, Credit Scores, Spending Habits.

financial_behavior (categorical): A classification of the financial behavior implied by the tweet, categorized as: Good behavior, Moderate behavior, Risky behavior.

Potential Use Cases

Sentiment analysis and emotion classification.

Behavioral modeling for financial decision-making.

Testing machine learning algorithms for financial literacy.

Educational applications for personalized financial learning platforms.

Simulating tweet analysis in social media mining studies.
Data underlying the paper: An agent-based process mining architecture for...
data.4tu.nl
figshare.com
zip
Updated Aug 27, 2019
Cite
Rob Bemthuis; M. (Martijn) Koot; M.R.K. (Martijn) Mes; F.A. (Faiza) Bukhsh; M.E. (Maria-Eugenia) Iacob; N. (Nirvana) Meratnia (2019). Data underlying the paper: An agent-based process mining architecture for emergent behavior analysis [Dataset]. http://doi.org/10.4121/uuid:9e430177-1dd0-40e9-b48a-8eb39124ef4c
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.4121/uuid:9e430177-1dd0-40e9-b48a-8eb39124ef4c
Dataset updated
Aug 27, 2019
Dataset provided by
4TU.Centre for Research Data
Authors
Rob Bemthuis; M. (Martijn) Koot; M.R.K. (Martijn) Mes; F.A. (Faiza) Bukhsh; M.E. (Maria-Eugenia) Iacob; N. (Nirvana) Meratnia
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The dataset contains a collection of experiment results and event logs generated. The experiment comprises a job-shop scheduling problem, implemented in a discrete-event simulation model. The raw experiment results are given from which event log files can be generated by following the steps as described in this data paper or the referred academic paper. A collection of event log files is given, as well as the raw files. The logs include the filtered part of the case study as presented in the paper "An agent-based process mining architecture for emergent behavior analysis" by Rob Bemthuis, Martijn Koot, Martijn Mes, Faiza Bukhsh, Maria-Eugenia Iacob, and Nirvana Meratnia.
Global Wildfire Database for GWIS (2021)
doi.pangaea.de
html, tsv
Updated May 11, 2022
Cite
Tomàs Artés Vivancos (2022). Global Wildfire Database for GWIS (2021) [Dataset]. http://doi.org/10.1594/PANGAEA.943975
Explore at:
Available download formats: html, tsv
Unique identifier
https://doi.org/10.1594/PANGAEA.943975
Dataset updated
May 11, 2022
Dataset provided by
PANGAEA
Authors
Tomàs Artés Vivancos
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 2000 - Jan 1, 2021
Variables measured
DATE/TIME, File content, Binary Object, Binary Object (File Size)
Description
Global Wildfire Database for GWIS (2021) is an individual fire event focused database. Post processing of MCD64A1 providing geometries of final fire perimeters including initial and final date and the corresponding daily active areas for each fire. This dataset is an update of the data related with GlobFire (https://doi.org/10.6084/m9.figshare.10284101). […]
Global Data Mining Software Market Report 2025 Edition, Market Size, Share,...
cognitivemarketresearch.com
pdf,excel,csv,ppt
Updated Jun 2, 2025
Cite
Cognitive Market Research (2025). Global Data Mining Software Market Report 2025 Edition, Market Size, Share, CAGR, Forecast, Revenue [Dataset]. https://www.cognitivemarketresearch.com/data-mining-software-market-report
Explore at:
Available download formats: pdf, excel, csv, ppt
Dataset updated
Jun 2, 2025
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Global
Description
According to Cognitive Market Research, the global Data Mining Software market size will be USD XX million in 2025. It will expand at a compound annual growth rate (CAGR) of XX% from 2025 to 2031.
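
The report's actual figures are withheld (shown as XX), so nothing can be computed from the listing itself. As a reminder of what a compound annual growth rate means, the sketch below derives a CAGR from a start and end value and projects a value forward; all numbers in it are invented placeholders, not market data.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value after growing `start_value` at `rate` per year for `years`."""
    return start_value * (1 + rate) ** years


# Hypothetical example: growing from 100 to 121 over 2 years.
rate = cagr(100.0, 121.0, 2)
future = project(100.0, 0.10, 2)
```

Here `rate` comes out at roughly 0.10 (10% per year), and projecting 100 forward at 10% for two years gives roughly 121, so the two functions are inverses of each other.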

North America held the major market share for more than XX% of the global revenue with a market size of USD XX million in 2025 and will grow at a CAGR of XX% from 2025 to 2031. Europe accounted for a market share of over XX% of the global revenue with a market size of USD XX million in 2025 and will grow at a CAGR of XX% from 2025 to 2031. Asia Pacific held a market share of around XX% of the global revenue with a market size of USD XX million in 2025 and will grow at a CAGR of XX% from 2025 to 2031. Latin America had a market share of more than XX% of the global revenue with a market size of USD XX million in 2025 and will grow at a CAGR of XX% from 2025 to 2031. Middle East and Africa had a market share of around XX% of the global revenue and was estimated at a market size of USD XX million in 2025 and will grow at a CAGR of XX% from 2025 to 2031.

KEY DRIVERS

Increasing Focus on Customer Satisfaction to Drive Data Mining Software Market Growth

In today’s hyper-competitive and digitally connected marketplace, customer satisfaction has emerged as a critical factor for business sustainability and growth. The growing focus on enhancing customer satisfaction is proving to be a significant driver in the expansion of the data mining software market. Organizations are increasingly leveraging data mining tools to sift through vast volumes of customer data (ranging from transactional records and website activity to social media engagement and call center logs) to uncover insights that directly influence customer experience strategies. Data mining software empowers companies to analyze customer behavior patterns, identify dissatisfaction triggers, and predict future preferences. Through techniques such as classification, clustering, and association rule mining, businesses can break down large datasets to understand what customers want, what they are likely to purchase next, and how they feel about the brand. These insights not only help in refining customer service but also in shaping product development, pricing strategies, and promotional campaigns. For instance, Netflix uses data mining to recommend personalized content by analyzing a user's viewing history, ratings, and preferences. This has led to increased user engagement and retention, highlighting how a deep understanding of customer preferences, made possible through data mining, can translate into competitive advantage. Moreover, companies are increasingly using these tools to create highly targeted and customer-specific marketing campaigns. By mining data from e-commerce transactions, browsing behavior, and demographic profiles, brands can tailor their offerings and communications to suit individual customer segments. For instance, Amazon continuously mines customer purchasing and browsing data to deliver personalized product recommendations, tailored promotions, and timely follow-ups.
This not only enhances customer satisfaction but also significantly boosts conversion rates and average order value. According to a report by McKinsey, personalization can deliver five to eight times the ROI on marketing spend and lift sales by 10% or more—a powerful incentive for companies to adopt data mining software as part of their customer experience toolkit. (Source: https://www.mckinsey.com/capabilities/growth-marketing-and-sales/our-insights/personalizing-at-scale#/) The utility of data mining tools extends beyond e-commerce and streaming platforms. In the banking and financial services industry, for example, institutions use data mining to analyze customer feedback, call center transcripts, and usage data to detect pain points and improve service delivery. Bank of America, for instance, utilizes data mining and predictive analytics to monitor customer interactions and provide proactive service suggestions or fraud alerts, significantly improving user satisfaction and trust. (Source: https://futuredigitalfinance.wbresearch.com/blog/bank-of-americas-erica-client-interactions-future-ai-in-banking) Similarly, telecom companies like Vodafone use data mining to understand customer churn behavior and implement retention strategies based on insights drawn from service usage patterns and complaint histories. In addition to p...
Student Performance and Learning Behavior Dataset
kaggle.com
zip
Updated Sep 4, 2025
Adil Shamim (2025). Student Performance and Learning Behavior Dataset [Dataset]. https://www.kaggle.com/datasets/adilshamim8/student-performance-and-learning-style
Available download formats: zip (78897 bytes)
Dataset updated
Sep 4, 2025
Authors
Adil Shamim
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset provides a comprehensive view of student performance and learning behavior, integrating academic, demographic, behavioral, and psychological factors.

It was created by merging two publicly available Kaggle datasets, resulting in a unified dataset of 14,003 student records with 16 attributes. All entries are anonymized, with no personally identifiable information.

Key Features

Study behaviors & engagement → StudyHours, Attendance, Extracurricular, AssignmentCompletion, OnlineCourses, Discussions

Resources & environment → Resources, Internet, EduTech

Motivation & psychology → Motivation, StressLevel

Demographics → Gender, Age (18–30 years)

Learning preference → LearningStyle

Performance indicators → ExamScore, FinalGrade

Objectives & Use Cases

The dataset can be used for:

Predictive modeling → Regression/classification of student performance (ExamScore, FinalGrade)

Clustering analysis → Identifying learning behavior groups with K-Means or other unsupervised methods

Educational analytics → Exploring how study habits, stress, and motivation affect outcomes

Adaptive learning research → Linking behavioral patterns to personalized learning pathways

Analysis Pipeline (from original study)

The dataset was analyzed in Python using:

Preprocessing → Encoding, normalization (z-score, Min–Max), deduplication

Clustering → K-Means, Elbow Method, Silhouette Score, Davies–Bouldin Index

Dimensionality Reduction → PCA (2D/3D visualizations)

Statistical Analysis → ANOVA, regression for group differences

Interpretation → Mapping clusters to LearningStyle categories & extracting insights for adaptive learning
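
The preprocessing and clustering steps listed above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the original study's code: the listing does not say which libraries were used, the helper names (z_score, min_max, k_means) and the farthest-point seeding are assumptions made here, and the six toy rows are invented rather than read from merged_dataset.csv.

```python
import math
import statistics


def z_score(values):
    """Standardize to mean 0 and population standard deviation 1.

    Assumes the column is not constant (otherwise the deviation is 0).
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def min_max(values):
    """Rescale linearly to the [0, 1] range (assumes max > min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def k_means(points, k, iterations=20):
    """Lloyd's algorithm with deterministic farthest-point seeding (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Next seed: the point whose nearest existing centroid is farthest away.
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [statistics.fmean(dim) for dim in zip(*members)]
    return labels, centroids


# Invented toy values for two of the listed attributes (hypothetical data).
study_hours = [2.0, 3.0, 2.5, 9.0, 10.0, 9.5]
stress_level = [8.0, 9.0, 8.5, 2.0, 1.0, 1.5]

features = list(zip(z_score(study_hours), z_score(stress_level)))
labels, centroids = k_means(features, k=2)
print(labels)
```

In practice a library such as scikit-learn would usually replace the hand-written loop, and the number of clusters would be chosen with the Elbow Method or Silhouette Score mentioned above; the sketch only makes the normalization and assignment/update steps concrete.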

File

merged_dataset.csv → 14,003 rows × 16 columns. Includes student demographics, behaviors, engagement, learning styles, and performance indicators.

Provenance

Source: Zenodo – Student Performance and Learning Behavior Dataset

Creator: Kamal Najem (2024)

License: CC BY 4.0 (per Zenodo terms)

This dataset is an excellent playground for educational data mining — from clustering and behavioral analytics to predictive modeling and personalized learning applications.
Defensive Assignment Identification
figshare.com
zip
Updated Aug 18, 2020
Fabiano Baldo; Yoran, Leichsenring (2020). Defensive Assignment Identification [Dataset]. http://doi.org/10.6084/m9.figshare.12678989.v1
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.12678989.v1
Dataset updated
Aug 18, 2020
Dataset provided by
Figshare, http://figshare.com/
Authors
Fabiano Baldo; Yoran, Leichsenring
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This file contains the source code, the dataset and result files used to validate the work entitled "A method to identify defensive assignments in team-based invasion sports using spatiotemporal trajectories" that is under publication at the International Journal of Geographical Information Science. The complete reference of the published paper will be posted when available.
Sample data (five types of features of one participant)
figshare.com
txt
Updated Mar 29, 2022
Hua Liao (2022). Sample data (five types of features of one participant) [Dataset]. http://doi.org/10.6084/m9.figshare.19443503.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.19443503.v1
Dataset updated
Mar 29, 2022
Dataset provided by
Figshare, http://figshare.com/
Authors
Hua Liao
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Sample data (five types of features of one participant)
Fraud Detection Software Developers in the US - Market Research Report...
ibisworld.com
Updated May 15, 2025
IBISWorld (2025). Fraud Detection Software Developers in the US - Market Research Report (2015-2030) [Dataset]. https://www.ibisworld.com/united-states/market-research-reports/fraud-detection-software-developers-industry/
Dataset updated
May 15, 2025
Dataset authored and provided by
IBISWorld
License
https://www.ibisworld.com/about/termsofuse/
Time period covered
2015 - 2030
Area covered
United States
Description
In the rapidly evolving US fraud detection software industry, developers invest significant capital in staying ahead of increasingly sophisticated cyber threats and fraud tactics. Over the past five years, accelerated digitalization, a surge in real-time payments and the adoption of e-commerce have fueled demand for industry solutions. Emerging trends such as behavioral biometrics, deepfake detection and real-time anomaly scoring have become essential, and developers now deliver cloud-based platforms able to address emerging threats. As businesses, banks, healthcare providers and public sector organizations face rising regulatory scrutiny and compliance demands, industry revenue has grown at a CAGR of 9.1% to an estimated $26.3 billion, including anticipated growth of 5.4% in 2025 alone.

The widespread adoption of contactless payment technologies, such as mobile wallets and tap-to-pay cards enabled by Near Field Communication (NFC), has introduced a fresh set of vulnerabilities. Cybercriminals are now leveraging advanced techniques to exploit weaknesses that legacy systems are not designed to detect. These threats have required fraud detection software developers to integrate novel security measures into their offerings.

Meanwhile, the rapid growth of e-commerce has been a significant driver of demand for fraud detection software among retail and wholesale companies. As more consumers migrate to online shopping platforms, transaction volumes have soared, exposing retailers and wholesalers to heightened risks. This has provided industry developers with a high-growth market where they often benefit from increased pricing power, which supports profit growth.

Moving forward, the industry is set for further transformation as regulatory mandates around AI-enabled fraud prevention, deepfake detection and real-time compliance reporting become widespread. Continuous M&A activity and increased demand from high-growth market segments will strengthen revenue streams. Despite ongoing competitive pressures and rapidly shifting threat landscapes, these factors are forecast to support a robust industry revenue CAGR of 5.2% through 2030, reaching an estimated $33.8 billion.
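
The forecast figures above can be cross-checked with the standard compound annual growth rate (CAGR) formula, end = start × (1 + rate)^years. The sketch below assumes that the $26.3 billion estimate refers to 2025 and that "through 2030" means five compounding years; neither assumption is stated explicitly in the report summary, and the function names are illustrative.

```python
def project(start_value, cagr, years):
    """Compound start_value forward at a constant annual growth rate."""
    return start_value * (1 + cagr) ** years


def implied_cagr(start_value, end_value, years):
    """Constant annual growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / years) - 1


# Assumed inputs: $26.3bn in 2025, 5.2% CAGR, five years to 2030.
projected_2030 = project(26.3, 0.052, 5)       # about 33.9 (billion USD)
rate_from_endpoints = implied_cagr(26.3, 33.8, 5)  # about 0.051

print(f"Projected 2030 revenue: ${projected_2030:.1f}bn")
print(f"CAGR implied by $26.3bn -> $33.8bn: {rate_from_endpoints:.1%}")
```

Under these assumptions the stated 5.2% rate and the $33.8 billion endpoint agree to within rounding.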
Tagged Flickr metadata
search.datacite.org
figshare.com
Updated Jan 26, 2017
Nataliya Tkachenko (2017). Tagged Flickr metadata [Dataset]. http://doi.org/10.6084/m9.figshare.4591009.v1
Unique identifier
https://doi.org/10.6084/m9.figshare.4591009.v1
Dataset updated
Jan 26, 2017
Dataset provided by
DataCite, https://www.datacite.org/
Figshare, http://figshare.com/
Authors
Nataliya Tkachenko
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Anonymised datafile, extracted from the YFCC100M archive, containing tags and keywords corresponding to risk-signalling and neutral environmental semantics.

Nuno Antonio (2020). Lisbon, Portugal, hotel’s customer dataset with three years of personal, behavioral, demographic, and geographic information [Dataset]. http://doi.org/10.17632/j83f5fsh6c.1

2 scholarly articles cite this dataset (View in Google Scholar)
