96 datasets found
  1. Lisbon, Portugal, hotel’s customer dataset with three years of personal,...

    • data.mendeley.com
    Updated Nov 18, 2020
    Cite
    Nuno Antonio (2020). Lisbon, Portugal, hotel’s customer dataset with three years of personal, behavioral, demographic, and geographic information [Dataset]. http://doi.org/10.17632/j83f5fsh6c.1
    Authors
    Nuno Antonio
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Portugal, Lisbon
    Description

    Hotel customer dataset with 31 variables describing a total of 83,590 instances (customers). It covers three full years of customer behavioral data. In addition to personal and behavioral information, the dataset also contains demographic and geographical information. This dataset contributes to reducing the lack of real-world business data that can be used for educational and research purposes. The dataset can be used in data mining, machine learning, and other analytical field problems in the scope of data science. Due to its unit of analysis, it is a dataset especially suitable for building customer segmentation models, including clustering and RFM (Recency, Frequency, and Monetary value) models, but it can also be used in classification and regression problems.
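
    As a rough illustration of the RFM idea mentioned in this description, the Python sketch below derives recency, frequency, and monetary value per customer from a list of bookings. The function name `rfm_table` and the tuple fields (customer ID, booking date, revenue) are hypothetical and are not the dataset's actual column names.

```python
from collections import defaultdict
from datetime import date


def rfm_table(bookings, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    bookings: iterable of (customer_id, booking_date, revenue) tuples.
    These fields are illustrative assumptions, not the dataset's columns.
    """
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, booking_date, revenue in bookings:
        # Recency depends only on the most recent booking date.
        if customer_id not in last_seen or booking_date > last_seen[customer_id]:
            last_seen[customer_id] = booking_date
        frequency[customer_id] += 1   # number of bookings
        monetary[customer_id] += revenue  # total amount spent
    return {
        cid: ((as_of - last_seen[cid]).days, frequency[cid], monetary[cid])
        for cid in last_seen
    }


bookings = [
    ("A", date(2018, 1, 10), 120.0),
    ("A", date(2018, 6, 1), 80.0),
    ("B", date(2017, 3, 5), 300.0),
]
table = rfm_table(bookings, as_of=date(2018, 12, 31))
```

    The resulting triples could then be binned into scores or passed to a clustering algorithm to form customer segments.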

  2. Data from: Joint Behavior-Topic Model for Microblogs

    • researchdata.smu.edu.sg
    bin
    Updated May 31, 2023
    Cite
    QIU Minghui; Feida ZHU; Jing JIANG (2023). Joint Behavior-Topic Model for Microblogs [Dataset]. http://doi.org/10.25440/smu.12062724.v1
    Dataset provided by
    SMU Research Data Repository (RDR)
    Authors
    QIU Minghui; Feida ZHU; Jing JIANG
    License

    http://rightsstatements.org/vocab/InC/1.0/

    Description

    We propose an LDA-based behavior-topic model (B-LDA) which jointly models user topic interests and behavioral patterns. We focus the study of the model on on-line social network settings such as microblogs like Twitter where the textual content is relatively short but user interactions on them are rich.

    Related Publication: Qiu, M., Zhu, F., & Jiang, J. (2013). It is not just what we say, but how we say them: LDA-based behavior-topic model. In 2013 SIAM International Conference on Data Mining (SDM’13): 2-4 May, Austin, Texas (pp. 794-802). Philadelphia: SIAM. http://doi.org/10.1137/1.9781611972832.88

  3. Data from: A method for detecting characteristic patterns in social...

    • datadryad.org
    • dataone.org
    • +2more
    zip
    Updated Dec 13, 2016
    Cite
    Nikolai W. F. Bode; Andrew Sutton; Lindsey Lacey; John G. Fennell; Ute Leonards (2016). A method for detecting characteristic patterns in social interactions with an application to handover interactions [Dataset]. http://doi.org/10.5061/dryad.8j27n
    Dataset provided by
    Dryad
    Authors
    Nikolai W. F. Bode; Andrew Sutton; Lindsey Lacey; John G. Fennell; Ute Leonards
    Time period covered
    Dec 6, 2016
    Description

    Data and algorithms: Data and algorithms for analysis associated with manuscript. See 'readme.txt' for further detail. File: alldata.zip

  4. Appendix - Mining User Behaviour from Smartphone data: a literature review

    • data.dtu.dk
    xlsx
    Updated Jul 12, 2023
    Cite
    Valentino Servizi; Francisco Camara Pereira; Marie Karen Anderson; Otto Anker Nielsen (2023). Appendix - Mining User Behaviour from Smartphone data: a literature review [Dataset]. http://doi.org/10.11583/DTU.11989455
    Dataset provided by
    Technical University of Denmark
    Authors
    Valentino Servizi; Francisco Camara Pereira; Marie Karen Anderson; Otto Anker Nielsen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Each study reviewed is here catalogued as follows.

    • Level of difficulty: Classification Task, Number and List of Classes.
    • Approach: Method and Main Features.
    • Performance: Score, Metric, Validation Method.
    • Realism of dataset: Ground Truth, Person-day, Respondents, Observations, Collection Time, Area, Smartphone App.
    • Sensors involved: AGPS, Inertial Navigation Systems (INS), Geographic Information Systems (GIS), Data Fusion.

  5. Replication Data for: Do expectations towards Thai hospitality differ? The...

    • data.mendeley.com
    Updated Feb 21, 2023
    Cite
    RAKSMEY SANN (2023). Replication Data for: Do expectations towards Thai hospitality differ? The views of English vs Chinese speaking travelers [Dataset]. http://doi.org/10.17632/v75j8yhpgy.1
    Authors
    RAKSMEY SANN
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset includes replication data for the paper: Sann, R. and Lai, P.-C. (2021), "Do expectations towards Thai hospitality differ? The views of English vs Chinese speaking travelers", International Journal of Culture, Tourism and Hospitality Research, Vol. 15 No. 1, pp. 43-58. https://doi.org/10.1108/IJCTHR-01-2020-0010

  6. A New Data-Mining Method to Search for Behavioral Properties That Induce...

    • plos.figshare.com
    tiff
    Updated May 31, 2023
    Cite
    Takashi Ochiai; Yuji Suehiro; Katsuhiro Nishinari; Takeo Kubo; Hideaki Takeuchi (2023). A New Data-Mining Method to Search for Behavioral Properties That Induce Alignment and Their Involvement in Social Learning in Medaka Fish (Oryzias Latipes) [Dataset]. http://doi.org/10.1371/journal.pone.0071685
    Dataset provided by
    PLOS ONE
    Authors
    Takashi Ochiai; Yuji Suehiro; Katsuhiro Nishinari; Takeo Kubo; Hideaki Takeuchi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background

    Coordinated movement in social animal groups via social learning facilitates foraging activity. Few studies have examined the behavioral cause-and-effect between group members that mediates this social learning.

    Methodology/Principal Findings

    We first established a behavioral paradigm for visual food learning using medaka fish and demonstrated that a single fish can learn to associate a visual cue with a food reward. Grouped medaka fish (6 fish) learn to respond to the visual cue more rapidly than a single fish, indicating that medaka fish undergo social learning. We then established a data-mining method based on Kullback-Leibler divergence (KLD) to search for candidate behaviors that induce alignment and found that high-speed movement of a focal fish tended to induce alignment of the other members locally and transiently under free-swimming conditions without presentation of a visual cue. The high-speed movement of the informed and trained fish during visual cue presentation appeared to facilitate the alignment of naïve fish in response to some visual cues, thereby mediating social learning. Compared with naïve fish, the informed fish had a higher tendency to induce alignment of other naïve fish under free-swimming conditions without visual cue presentation, suggesting the involvement of individual recognition in social learning.

    Conclusions/Significance

    Behavioral cause-and-effect studies of the high-speed movement between fish group members will contribute to our understanding of the dynamics of social behaviors. The data-mining method used in the present study is a powerful method to search for candidate factors associated with inter-individual interactions using a dataset for time-series coordinate data of individuals.

  7. Data and code for: Physiological and behavioural resistance of malaria...

    • dataverse.ird.fr
    pdf, text/x-r-source +2
    Updated Dec 20, 2023
    Cite
    Paul Taconet; Diloma Dieudonné Soma; Barnabas Zogo; Karine Mouline; Frédéric Simard; Alphonsine Koffi Amanan; Roch Kounbobr Dabiré; Cédric Pennetier; Nicolas Moiroux (2023). Data and code for: Physiological and behavioural resistance of malaria vectors in rural West-Africa: a data mining study to adress their fine-scale spatiotemporal heterogeneity, drivers, and predictability [Dataset]. http://doi.org/10.23708/LV8GEW
    Available download formats
    zip (508869), text/x-r-source (3782), zip (156257727), pdf (33371), text/x-r-source (20243), txt (2304)
    Dataset provided by
    DataSuds
    Authors
    Paul Taconet; Diloma Dieudonné Soma; Barnabas Zogo; Karine Mouline; Frédéric Simard; Alphonsine Koffi Amanan; Roch Kounbobr Dabiré; Cédric Pennetier; Nicolas Moiroux
    License

    https://dataverse.ird.fr/api/datasets/:persistentId/versions/5.0/customlicense?persistentId=doi:10.23708/LV8GEW

    Time period covered
    Oct 1, 2016 - Jun 1, 2018
    Area covered
    West Africa, Africa, Côte d'Ivoire, Savanes, Sud-Ouest, Burkina Faso
    Dataset funded by
    IRD
    French Ministry for Europe and Foreign Affairs
    Agence Nationale de la Recherche
    Initiative 5% - Expertise France
    Description

    These data and scripts accompany the manuscript "Physiological and behavioural resistance of malaria vectors in rural West-Africa: a data mining study to adress their fine-scale spatiotemporal heterogeneity, drivers, and predictability" by Paul Taconet, Dieudonne Diloma Soma, Barnabas Zogo, Karine Mouline, Frederic Simard, Alphonsine Amanan Koffi, Roch Kounbobr Dabiré, Cedric Pennetier, and Nicolas Moiroux. The manuscript has been posted as a preprint on bioRxiv (https://doi.org/10.1101/2022.08.20.504631). In this data-mining work, we modeled a set of indicators of physiological resistances to insecticide (prevalence of three target-site mutations) and biting behaviours (early- and late-biting, exophagy) of anopheles mosquitoes in two rural areas of West-Africa, located in Burkina Faso and Cote d'Ivoire. To this aim, we used mosquito field collections along with heterogeneous, multisource and multi-scale environmental data. The objectives were i) to assess the small-scale spatial and temporal heterogeneity of the indicators, ii) to better understand their drivers, and iii) to assess their spatio-temporal predictability, at scales that are consistent with operational action. The explanatory variables covered a wide range of potential environmental determinants of vector resistance to insecticide or feeding behaviour: vector control, human availability and nocturnal behaviour, macro and micro-climatic conditions, landscape, etc.

    Contents

    Input datasets and the R script used for the data analyses are provided. Because the models may take a very long time to fit (due to the size of the raw data), they were pre-fit, saved as .rds files ('R Data Serialization' format), and made available in the "models" folder. The R script used to answer one of the reviewers' questions (reviewer n°1, question n°1) is also included.

  8. Webis Simulation Data Mining Bridge Models Corpus 2012 (Webis-SDMbridge-12)

    • zenodo.org
    • live.european-language-grid.eu
    zip
    Updated Jan 24, 2020
    Cite
    Steven Burrows; Benno Stein; Jörg Frochte; Tim Gollub; Peter Hirsch; Jens Opolka; Tom Paschke; Michael Völske (2020). Webis Simulation Data Mining Bridge Models Corpus 2012 (Webis-SDMbridge-12) [Dataset]. http://doi.org/10.5281/zenodo.3259676
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Steven Burrows; Benno Stein; Jörg Frochte; Tim Gollub; Peter Hirsch; Jens Opolka; Tom Paschke; Michael Völske
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This corpus provides the simulation data mining community with a collection of 14641 bridge models and simulated behavior.

    1. Folder "1-designs"

    The text files in this directory should contain all information for the independent variables of any machine learning experiment. For reference, all 14641 IFC models are supplied in subfolders 001 to 147.

    2. Folder "2-simulation"

    This folder contains samples of the simulation output that may be viewed in Paraview (http://www.paraview.org). The original model contains the "Org" filename fragment, and the maximum and minimum behaviors are indicated with "Max" and "Min" filename fragments. Displacement, strain, and stress behaviors are all given. Only three of the 14641 models are given as the file sizes are around 1.4 to 2.2 megabytes each. The complete data (approximately 81 gigabytes) can be regenerated and provided if necessary on request (email webis@medien.uni-weimar.de).

    3. Folder "3-aggregation"

    Maximum displacement, strain, and stress measurements are given in the text files individually, and together in the files with the "vtk" filename fragment. This data should be sufficient for the dependent variables of any machine learning experiment.

  9. Market Basket Analysis

    • kaggle.com
    zip
    Updated Dec 9, 2021
    Cite
    Aslan Ahmedov (2021). Market Basket Analysis [Dataset]. https://www.kaggle.com/datasets/aslanahmedov/market-basket-analysis
    Available download formats
    zip (23875170 bytes)
    Authors
    Aslan Ahmedov
    Description

    Market Basket Analysis

    Market basket analysis with Apriori algorithm

    The retailer wants to target customers with suggestions of the itemsets that a customer is most likely to purchase. The dataset I was given contains a retailer's transaction data, covering all the transactions that happened over a period of time. The retailer will use the results to grow its business and to suggest itemsets to customers, which should increase customer engagement, improve customer experience, and help identify customer behavior. I will solve this problem using association rules, a type of unsupervised learning technique that checks for the dependency of one data item on another data item.

    Introduction

    Association rule mining is most often used to find associations between different objects in a set, for example frequent patterns in a transaction database. It can tell you which items customers frequently buy together, and it allows the retailer to identify relationships between the items.

    An Example of Association Rules

    Assume there are 100 customers: 10 of them bought a computer mouse, 9 bought a mouse mat, and 8 bought both.

    • Rule: bought computer mouse => bought mouse mat
    • support = P(mouse & mat) = 8/100 = 0.08
    • confidence = support / P(mouse) = 0.08/0.10 = 0.80
    • lift = confidence / P(mat) = 0.80/0.09 ≈ 8.9

    This is just a simple example. In practice, a rule needs the support of several hundred transactions before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.
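
    The three metrics for the mouse => mat rule can be checked with a short Python sketch (Python is used here for illustration; the original analysis is in R). It assumes the standard definitions, in which confidence divides the joint count by the count of the antecedent (the mouse) and lift divides confidence by the probability of the consequent (the mat). The function name `rule_metrics` is hypothetical.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Return (support, confidence, lift) for the rule antecedent => consequent."""
    support = n_both / n_total                    # P(antecedent and consequent)
    confidence = n_both / n_antecedent            # P(consequent | antecedent)
    lift = confidence / (n_consequent / n_total)  # confidence relative to P(consequent)
    return support, confidence, lift


# 100 customers: 10 bought a mouse, 9 bought a mat, 8 bought both.
support, confidence, lift = rule_metrics(
    n_total=100, n_antecedent=10, n_consequent=9, n_both=8
)
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

    A lift well above 1, as here, indicates that the two items are bought together far more often than would be expected if the purchases were independent.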

    Strategy

    • Data Import
    • Data Understanding and Exploration
    • Transformation of the data, so that it is ready to be consumed by the association rules algorithm
    • Running association rules
    • Exploring the rules generated
    • Filtering the generated rules
    • Visualization of the rules

    Dataset Description

    • File name: Assignment-1_Data
    • List name: retaildata
    • File format: .xlsx
    • Number of rows: 522065
    • Number of Attributes: 7

      • BillNo: 6-digit number assigned to each transaction. Nominal.
      • Itemname: Product name. Nominal.
      • Quantity: The quantities of each product per transaction. Numeric.
      • Date: The day and time when each transaction was generated. Numeric.
      • Price: Product price. Numeric.
      • CustomerID: 5-digit number assigned to each customer. Nominal.
      • Country: Name of the country where each customer resides. Nominal.

    Libraries in R

    First, we need to load the required libraries. Each library is briefly described below.

    • arules - Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules).
    • arulesViz - Extends package 'arules' with various visualization techniques for association rules and itemsets. The package also includes several interactive visualizations for rule exploration.
    • tidyverse - The tidyverse is an opinionated collection of R packages designed for data science.
    • readxl - Read Excel Files in R.
    • plyr - Tools for Splitting, Applying and Combining Data.
    • ggplot2 - A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.
    • knitr - Dynamic Report generation in R.
    • magrittr - Provides a mechanism for chaining commands with a new forward-pipe operator, %>%. This operator will forward a value, or the result of an expression, into the next function call/expression. There is flexible support for the type of right-hand side expressions.
    • dplyr - A fast, consistent tool for working with data frame like objects, both in memory and out of memory.
    • tidyverse - This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step.

    Data Pre-processing

    Next, we need to load Assignment-1_Data.xlsx into R to read the dataset. Now we can see our data in R.

    After that, we clean the data frame by removing missing values.

    To apply association rule mining, we need to convert the data frame into transaction data, so that all items that are bought together in one invoice will be in ...
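
    The conversion from row-per-item data to transaction data can be pictured with a minimal Python sketch (the original notebook uses R and the arules package; this is only an illustration of the grouping step). It assumes each row is a (BillNo, Itemname) pair; the function name `to_transactions` and the sample values are hypothetical.

```python
from collections import defaultdict


def to_transactions(rows):
    """Group (bill_no, item_name) rows into one set of items per invoice."""
    baskets = defaultdict(set)
    for bill_no, item_name in rows:
        # A set keeps each item once per invoice, even if it appears on several rows.
        baskets[bill_no].add(item_name)
    return dict(baskets)


rows = [
    ("536365", "mouse"),
    ("536365", "mouse mat"),
    ("536366", "keyboard"),
    ("536365", "mouse"),
]
transactions = to_transactions(rows)
```

    Each value in `transactions` is one basket, which is the unit an association rule algorithm such as Apriori counts when computing support.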

  10. Video-to-Model Data Set

    • figshare.com
    • commons.datacite.org
    xml
    Updated Mar 24, 2020
    Cite
    Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz (2020). Video-to-Model Data Set [Dataset]. http://doi.org/10.6084/m9.figshare.12026850.v1
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Sönke Knoch; Shreeraman Ponpathirkoottam; Tim Schwartz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set belongs to the paper "Video-to-Model: Unsupervised Trace Extraction from Videos for Process Discovery and Conformance Checking in Manual Assembly", submitted on March 24, 2020, to the 18th International Conference on Business Process Management (BPM).

    Abstract: Manual activities are often hidden deep down in discrete manufacturing processes. For the elicitation and optimization of process behavior, complete information about the execution of manual activities is required. Thus, an approach is presented on how execution level information can be extracted from videos in manual assembly. The goal is the generation of a log that can be used in state-of-the-art process mining tools. The test bed for the system was lightweight and scalable, consisting of an assembly workstation equipped with a single RGB camera recording only the hand movements of the worker from above. A neural network based real-time object classifier was trained to detect the worker’s hands. The hand detector delivers the input for an algorithm, which generates trajectories reflecting the movement paths of the hands. Those trajectories are automatically assigned to work steps using the position of material boxes on the assembly shelf as reference points and hierarchical clustering of similar behaviors with dynamic time warping. The system has been evaluated in a task-based study with ten participants in a laboratory, but under realistic conditions. The generated logs have been loaded into the process mining toolkit ProM to discover the underlying process model and to detect deviations from both instructions and ground truth, using conformance checking. The results show that process mining delivers insights about the assembly process and the system’s precision.

    The data set contains the generated and the annotated logs based on the video material gathered during the user study. In addition, the petri nets from the process discovery and conformance checking conducted with ProM (http://www.promtools.org) and the reference nets modeled with Yasper (http://www.yasper.org/) are provided.

  11. Data: Understanding the Behavior of Process Mining Analysts: A Catalogue of...

    • zenodo.org
    Updated Jul 9, 2025
    Cite
    Jessica Van Suetendael; Benoît Depaire; Mieke Jans; Niels Martin (2025). Data: Understanding the Behavior of Process Mining Analysts: A Catalogue of Exploratory Process Mining Behaviors [Dataset]. http://doi.org/10.5281/zenodo.15845469
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jessica Van Suetendael; Benoît Depaire; Mieke Jans; Niels Martin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The codings for the literature-based ethogram and the interview protocol of the paper titled "Understanding the Behavior of Process Mining Analysts: A Catalogue of Exploratory Process Mining Behaviors" can be found in this repository.

  12. Financial-Behavior

    • kaggle.com
    zip
    Updated Nov 20, 2024
    Cite
    Ziya (2024). Financial-Behavior [Dataset]. https://www.kaggle.com/datasets/ziya07/financial-behavior
    Available download formats
    zip (30268 bytes)
    Authors
    Ziya
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains 500 tweets related to financial literacy and consumer behavior, designed for tasks such as sentiment analysis, emotion classification, and behavior prediction. The dataset was generated to support research in financial literacy education and consumer behavior modeling, incorporating realistic tweet structures and metadata.

    Dataset Features

    • tweet_content (string): The text of the tweets, reflecting various financial literacy topics and emotions.
    • emotion (categorical): The emotion expressed in the tweet, selected from: Positive, Fear, Anticipation, Disgust, Surprise.
    • sentiment_score (float): A numerical score representing the sentiment of the tweet, ranging from -1 (negative sentiment) to 1 (positive sentiment).
    • likes (integer): Number of likes the tweet received (simulated).
    • retweets (integer): Number of retweets the tweet received (simulated).
    • replies (integer): Number of replies the tweet received (simulated).
    • topic_tags (categorical): The main financial topic discussed in the tweet, selected from: Savings, Investment, Budgeting, Debt Management, Financial Planning, Credit Scores, Spending Habits.
    • financial_behavior (categorical): A classification of the financial behavior implied by the tweet, categorized as: Good behavior, Moderate behavior, Risky behavior.

    Potential Use Cases

    • Sentiment analysis and emotion classification.
    • Behavioral modeling for financial decision-making.
    • Testing machine learning algorithms for financial literacy.
    • Educational applications for personalized financial learning platforms.
    • Simulating tweet analysis in social media mining studies.

  13. Data underlying the paper: An agent-based process mining architecture for...

    • data.4tu.nl
    • figshare.com
    zip
    Updated Aug 27, 2019
    Cite
    Rob Bemthuis; M. (Martijn) Koot; M.R.K. (Martijn) Mes; F.A. (Faiza) Bukhsh; M.E. (Maria-Eugenia) Iacob; N. (Nirvana) Meratnia (2019). Data underlying the paper: An agent-based process mining architecture for emergent behavior analysis [Dataset]. http://doi.org/10.4121/uuid:9e430177-1dd0-40e9-b48a-8eb39124ef4c
    Dataset provided by
    4TU.Centre for Research Data
    Authors
    Rob Bemthuis; M. (Martijn) Koot; M.R.K. (Martijn) Mes; F.A. (Faiza) Bukhsh; M.E. (Maria-Eugenia) Iacob; N. (Nirvana) Meratnia
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The dataset contains a collection of experiment results and event logs generated. The experiment comprises a job-shop scheduling problem, implemented in a discrete-event simulation model. The raw experiment results are given from which event log files can be generated by following the steps as described in this data paper or the referred academic paper. A collection of event log files is given, as well as the raw files. The logs include the filtered part of the case study as presented in the paper "An agent-based process mining architecture for emergent behavior analysis" by Rob Bemthuis, Martijn Koot, Martijn Mes, Faiza Bukhsh, Maria-Eugenia Iacob, and Nirvana Meratnia.

  14. Global Wildfire Database for GWIS (2021)

    • doi.pangaea.de
    html, tsv
    Updated May 11, 2022
    Cite
    Tomàs Artés Vivancos (2022). Global Wildfire Database for GWIS (2021) [Dataset]. http://doi.org/10.1594/PANGAEA.943975
    Dataset provided by
    PANGAEA
    Authors
    Tomàs Artés Vivancos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2000 - Jan 1, 2021
    Variables measured
    DATE/TIME, File content, Binary Object, Binary Object (File Size)
    Description

    Global Wildfire Database for GWIS (2021) is an individual fire event focused database. Post processing of MCD64A1 providing geometries of final fire perimeters including initial and final date and the corresponding daily active areas for each fire. This dataset is an update of the data related with GlobFire (https://doi.org/10.6084/m9.figshare.10284101). […]

  15. Global Data Mining Software Market Report 2025 Edition, Market Size, Share,...

    • cognitivemarketresearch.com
    pdf,excel,csv,ppt
    Updated Jun 2, 2025
    Cite
    Cognitive Market Research (2025). Global Data Mining Software Market Report 2025 Edition, Market Size, Share, CAGR, Forecast, Revenue [Dataset]. https://www.cognitivemarketresearch.com/data-mining-software-market-report
    Dataset authored and provided by
    Cognitive Market Research
    License

    https://www.cognitivemarketresearch.com/privacy-policy

    Time period covered
    2021 - 2033
    Area covered
    Global
    Description

    According to Cognitive Market Research, the global Data Mining Software market size will be USD XX million in 2025. It will expand at a compound annual growth rate (CAGR) of XX% from 2025 to 2031.

    North America held the major market share for more than XX% of the global revenue with a market size of USD XX million in 2025 and will grow at a CAGR of XX% from 2025 to 2031. Europe accounted for a market share of over XX% of the global revenue with a market size of USD XX million in 2025 and will grow at a CAGR of XX% from 2025 to 2031. Asia Pacific held a market share of around XX% of the global revenue with a market size of USD XX million in 2025 and will grow at a CAGR of XX% from 2025 to 2031. Latin America had a market share of more than XX% of the global revenue with a market size of USD XX million in 2025 and will grow at a CAGR of XX% from 2025 to 2031. Middle East and Africa had a market share of around XX% of the global revenue and was estimated at a market size of USD XX million in 2025 and will grow at a CAGR of XX% from 2025 to 2031.

    KEY DRIVERS

    Increasing Focus on Customer Satisfaction to Drive Data Mining Software Market Growth

    In today’s hyper-competitive and digitally connected marketplace, customer satisfaction has emerged as a critical factor for business sustainability and growth. The growing focus on enhancing customer satisfaction is proving to be a significant driver in the expansion of the data mining software market. Organizations are increasingly leveraging data mining tools to sift through vast volumes of customer data (ranging from transactional records and website activity to social media engagement and call center logs) to uncover insights that directly influence customer experience strategies. Data mining software empowers companies to analyze customer behavior patterns, identify dissatisfaction triggers, and predict future preferences. Through techniques such as classification, clustering, and association rule mining, businesses can break down large datasets to understand what customers want, what they are likely to purchase next, and how they feel about the brand. These insights not only help in refining customer service but also in shaping product development, pricing strategies, and promotional campaigns. For instance, Netflix uses data mining to recommend personalized content by analyzing a user's viewing history, ratings, and preferences. This has led to increased user engagement and retention, highlighting how a deep understanding of customer preferences, made possible through data mining, can translate into competitive advantage. Moreover, companies are increasingly using these tools to create highly targeted and customer-specific marketing campaigns. By mining data from e-commerce transactions, browsing behavior, and demographic profiles, brands can tailor their offerings and communications to suit individual customer segments. For instance, Amazon continuously mines customer purchasing and browsing data to deliver personalized product recommendations, tailored promotions, and timely follow-ups. This not only enhances customer satisfaction but also significantly boosts conversion rates and average order value. According to a report by McKinsey, personalization can deliver five to eight times the ROI on marketing spend and lift sales by 10% or more, a powerful incentive for companies to adopt data mining software as part of their customer experience toolkit. (Source: https://www.mckinsey.com/capabilities/growth-marketing-and-sales/our-insights/personalizing-at-scale#/) The utility of data mining tools extends beyond e-commerce and streaming platforms. In the banking and financial services industry, for example, institutions use data mining to analyze customer feedback, call center transcripts, and usage data to detect pain points and improve service delivery. Bank of America, for instance, utilizes data mining and predictive analytics to monitor customer interactions and provide proactive service suggestions or fraud alerts, significantly improving user satisfaction and trust. (Source: https://futuredigitalfinance.wbresearch.com/blog/bank-of-americas-erica-client-interactions-future-ai-in-banking) Similarly, telecom companies like Vodafone use data mining to understand customer churn behavior and implement retention strategies based on insights drawn from service usage patterns and complaint histories. In addition to p...

  16. Student Performance and Learning Behavior Dataset

    • kaggle.com
    zip
    Updated Sep 4, 2025
    Cite
    Adil Shamim (2025). Student Performance and Learning Behavior Dataset [Dataset]. https://www.kaggle.com/datasets/adilshamim8/student-performance-and-learning-style
    Available download formats: zip (78897 bytes)
    Dataset updated
    Sep 4, 2025
    Authors
    Adil Shamim
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset provides a comprehensive view of student performance and learning behavior, integrating academic, demographic, behavioral, and psychological factors.

    It was created by merging two publicly available Kaggle datasets, resulting in a unified dataset of 14,003 student records with 16 attributes. All entries are anonymized, with no personally identifiable information.

    Key Features

    • Study behaviors & engagement: StudyHours, Attendance, Extracurricular, AssignmentCompletion, OnlineCourses, Discussions
    • Resources & environment: Resources, Internet, EduTech
    • Motivation & psychology: Motivation, StressLevel
    • Demographics: Gender, Age (18–30 years)
    • Learning preference: LearningStyle
    • Performance indicators: ExamScore, FinalGrade

    Objectives & Use Cases

    The dataset can be used for:

    • Predictive modeling → Regression/classification of student performance (ExamScore, FinalGrade)
    • Clustering analysis → Identifying learning behavior groups with K-Means or other unsupervised methods
    • Educational analytics → Exploring how study habits, stress, and motivation affect outcomes
    • Adaptive learning research → Linking behavioral patterns to personalized learning pathways

    Analysis Pipeline (from original study)

    The dataset was analyzed in Python using:

    • Preprocessing → Encoding, normalization (z-score, Min–Max), deduplication
    • Clustering → K-Means, Elbow Method, Silhouette Score, Davies–Bouldin Index
    • Dimensionality Reduction → PCA (2D/3D visualizations)
    • Statistical Analysis → ANOVA, regression for group differences
    • Interpretation → Mapping clusters to LearningStyle categories & extracting insights for adaptive learning
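
The preprocessing and clustering steps listed above can be sketched in plain Python. This is a minimal, dependency-free illustration, not the original study's code (which is not shown on this page and may well have used a library such as scikit-learn). The toy rows, the helper names (`zscore`, `kmeans`, `silhouette`) and the choice of two columns are all assumptions made for the example.

```python
import math
import random


def zscore(rows):
    """Standardize every column to mean 0 and population std 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has std 0; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm. Returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia


def silhouette(points, labels):
    """Mean silhouette coefficient using Euclidean distance."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        # The score is undefined for a single cluster; 0.0 is a placeholder here.
        if len(own) == 1 or len(clusters) < 2:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy (StudyHours, Attendance) rows: invented values, NOT from merged_dataset.csv.
rows = [[2.0, 55.0], [3.0, 60.0], [2.5, 58.0],
        [9.0, 92.0], [10.0, 95.0], [9.5, 90.0]]
scaled = zscore(rows)

# Elbow-style sweep: record inertia and silhouette for several values of k.
results = {}
for k in (1, 2, 3):
    labels, _, inertia = kmeans(scaled, k)
    results[k] = (labels, inertia, silhouette(scaled, labels))
    print(k, round(inertia, 3), round(results[k][2], 3))
```

On real data, the sweep would cover a wider range of k, and the k at which inertia stops dropping sharply (the "elbow") would be compared against the silhouette score before settling on a cluster count.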

    File

    • merged_dataset.csv → 14,003 rows × 16 columns. Includes student demographics, behaviors, engagement, learning styles, and performance indicators.

    Provenance

    This dataset is an excellent playground for educational data mining — from clustering and behavioral analytics to predictive modeling and personalized learning applications.

  17. Defensive Assignment Identification

    • figshare.com
    zip
    Updated Aug 18, 2020
    Cite
    Fabiano Baldo; Yoran, Leichsenring (2020). Defensive Assignment Identification [Dataset]. http://doi.org/10.6084/m9.figshare.12678989.v1
    Available download formats: zip
    Dataset updated
    Aug 18, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Fabiano Baldo; Yoran, Leichsenring
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file contains the source code, the dataset and result files used to validate the work entitled "A method to identify defensive assignments in team-based invasion sports using spatiotemporal trajectories" that is under publication at the International Journal of Geographical Information Science. The complete reference of the published paper will be posted when available.

  18. Sample data (five types of features of one participant)

    • figshare.com
    txt
    Updated Mar 29, 2022
    Cite
    Hua Liao (2022). Sample data (five types of features of one participant) [Dataset]. http://doi.org/10.6084/m9.figshare.19443503.v1
    Available download formats: txt
    Dataset updated
    Mar 29, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Hua Liao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sample data (five types of features of one participant)

  19. Fraud Detection Software Developers in the US - Market Research Report...

    • ibisworld.com
    Updated May 15, 2025
    Cite
    IBISWorld (2025). Fraud Detection Software Developers in the US - Market Research Report (2015-2030) [Dataset]. https://www.ibisworld.com/united-states/market-research-reports/fraud-detection-software-developers-industry/
    Dataset updated
    May 15, 2025
    Dataset authored and provided by
    IBISWorld
    License

    https://www.ibisworld.com/about/termsofuse/

    Time period covered
    2015 - 2030
    Area covered
    United States
    Description

    In the rapidly evolving US fraud detection software industry, developers invest significant capital in staying ahead of increasingly sophisticated cyber threats and fraud tactics. Over the past five years, accelerated digitalization, a surge in real-time payments and the adoption of e-commerce have fueled demand for industry solutions. Emerging trends such as behavioral biometrics, deepfake detection and real-time anomaly scoring have become essential, and developers now deliver cloud-based platforms able to address emerging threats. As businesses, banks, healthcare providers and public sector organizations face rising regulatory scrutiny and compliance demands, industry revenue has grown at a CAGR of 9.1% to an estimated $26.3 billion, including anticipated growth of 5.4% in 2025 alone. The widespread adoption of contactless payment technologies, such as mobile wallets and tap-to-pay cards enabled by Near Field Communication (NFC), has introduced a fresh set of vulnerabilities. Cybercriminals are now leveraging advanced techniques to exploit weaknesses that legacy systems are not designed to detect. These threats have required fraud detection software developers to integrate novel security measures into their offerings. Meanwhile, the rapid growth of e-commerce has been a significant driver of demand for fraud detection software among retail and wholesale companies. As more consumers migrate to online shopping platforms, transaction volumes have soared, exposing retailers and wholesalers to heightened risks. This has provided industry developers with a high-growth market where they often benefit from increased pricing power, which supports profit growth. Moving forward, the industry is set for further transformation as regulatory mandates around AI-enabled fraud prevention, deepfake detection and real-time compliance reporting become widespread. Continuous M&A activity and increased demand from high-growth market segments will strengthen revenue streams. 
    Despite ongoing competitive pressures and rapidly shifting threat landscapes, these factors are forecast to support a robust industry revenue CAGR of 5.2% through 2030, reaching an estimated $33.8 billion.
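
As a rough arithmetic check on the forecast above, the quoted figures can be compounded directly. The sketch below assumes 2025 is the base year and that "through 2030" means five compounding periods; neither is stated explicitly in the description, and the function names are illustrative. Because the published figures are rounded, a small mismatch is expected.

```python
def project(value, rate, years):
    """Compound `value` forward at `rate` (a fraction) for `years` periods."""
    return value * (1 + rate) ** years


def cagr(start, end, years):
    """Compound annual growth rate implied by a start value and an end value."""
    return (end / start) ** (1 / years) - 1


revenue_2025 = 26.3   # USD billions, as quoted in the description
forecast_2030 = 33.8  # USD billions, as quoted in the description
periods = 5           # assumption: 2025 -> 2030

projected_2030 = project(revenue_2025, 0.052, periods)
implied_rate = cagr(revenue_2025, forecast_2030, periods)

print(f"projected 2030 revenue: {projected_2030:.2f} billion USD")
print(f"implied CAGR: {implied_rate:.2%}")
```

Under these assumptions the compounded value lands close to the quoted $33.8 billion, and the implied rate sits slightly below the quoted 5.2%, which is consistent with rounding in the published numbers.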

  20. Tagged Flickr metadata

    • search.datacite.org
    • figshare.com
    Updated Jan 26, 2017
    Cite
    Nataliya Tkachenko (2017). Tagged Flickr metadata [Dataset]. http://doi.org/10.6084/m9.figshare.4591009.v1
    Dataset updated
    Jan 26, 2017
    Dataset provided by
    DataCite (https://www.datacite.org/)
    Figshare (http://figshare.com/)
    Authors
    Nataliya Tkachenko
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Anonymised datafile, extracted from YFCC100M archive, contains tags and keywords corresponding to risk-signalling and neutral environmental semantics

Cite
Nuno Antonio (2020). Lisbon, Portugal, hotel’s customer dataset with three years of personal, behavioral, demographic, and geographic information [Dataset]. http://doi.org/10.17632/j83f5fsh6c.1

Lisbon, Portugal, hotel’s customer dataset with three years of personal, behavioral, demographic, and geographic information

2 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Nov 18, 2020
Authors
Nuno Antonio
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered
Portugal, Lisbon
Description

Hotel customer dataset with 31 variables describing a total of 83,590 instances (customers). It covers three full years of customer behavioral data. In addition to personal and behavioral information, the dataset also contains demographic and geographical information. This dataset helps address the lack of real-world business data available for educational and research purposes. It can be used in data mining, machine learning, and other analytical problems within data science. Because its unit of analysis is the customer, it is especially suitable for building customer segmentation models, including clustering and RFM (Recency, Frequency, and Monetary value) models, but it can also be used in classification and regression problems.
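
For readers unfamiliar with RFM, the three values can be derived from a transaction log as sketched below. The `(customer_id, date, amount)` tuple layout, the sample records, and the `rfm_table` helper are hypothetical and do not reflect this dataset's actual columns, which are not listed on this page; a customer-level dataset like this one may already store comparable aggregates.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, as_of):
    """Return {customer_id: {recency_days, frequency, monetary}}.

    recency_days: days between the customer's last transaction and `as_of`
    frequency:    number of transactions
    monetary:     total amount spent
    """
    stats = defaultdict(lambda: {"last": None, "frequency": 0, "monetary": 0.0})
    for customer_id, day, amount in transactions:
        s = stats[customer_id]
        s["last"] = day if s["last"] is None else max(s["last"], day)
        s["frequency"] += 1
        s["monetary"] += amount
    return {
        cid: {
            "recency_days": (as_of - s["last"]).days,
            "frequency": s["frequency"],
            "monetary": s["monetary"],
        }
        for cid, s in stats.items()
    }


# Hypothetical records, invented for illustration only.
transactions = [
    ("c1", date(2020, 1, 10), 120.0),
    ("c1", date(2020, 3, 5), 80.0),
    ("c2", date(2019, 12, 1), 300.0),
]
rfm = rfm_table(transactions, as_of=date(2020, 3, 31))
for cid, row in sorted(rfm.items()):
    print(cid, row)
```

A segmentation model would typically bin or standardize these three values before clustering, since recency, frequency, and monetary value sit on very different scales.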
