100+ datasets found
  1. Google Certificate BellaBeats Capstone Project

    • kaggle.com
    zip
    Updated Jan 5, 2023
    Jason Porzelius (2023). Google Certificate BellaBeats Capstone Project [Dataset]. https://www.kaggle.com/datasets/jasonporzelius/google-certificate-bellabeats-capstone-project
    Available download formats: zip (169161 bytes)
    Authors
    Jason Porzelius
    Description

    Introduction: I have chosen to complete a data analysis project for the second course option, Bellabeats, Inc., using a locally hosted program, Excel, for both my data analysis and my visualizations. This choice was made primarily because I live in a remote area with limited bandwidth and inconsistent internet access, so completing a capstone project with web-based tools such as RStudio, SQL Workbench, or Google Sheets was not feasible. I was further limited in which option to choose, as the datasets for the ride-share project option were larger than my version of Excel would accept.

    In the scenario provided, I will be acting as a junior data analyst in support of the Bellabeats, Inc. executive team and data analytics team. This combined team has decided to use an existing public dataset in the hope that findings from that dataset might reveal insights to assist in Bellabeat's marketing strategies for future growth. My task is to provide data-driven insights for the business tasks set by the Bellabeats, Inc. executive and data analytics team.

    To accomplish this, I will complete all parts of the Data Analysis Process (Ask, Prepare, Process, Analyze, Share, Act). In addition, I will break each part of the Data Analysis Process down into three sections to provide clarity and accountability: Guiding Questions, Key Tasks, and Deliverables. For the sake of space and to avoid repetition, I will record the deliverables for each Key Task directly under the numbered Key Task, using an asterisk (*) as an identifier.

    Section 1 - Ask:

    A. Guiding Questions:
    1. Who are the key stakeholders and what are their goals for the data analysis project?
    2. What is the business task that this data analysis project is attempting to solve?

    B. Key Tasks:
    1. Identify key stakeholders and their goals for the data analysis project.
    *The key stakeholders for this project are:
    -Urška Sršen and Sando Mur, co-founders of Bellabeats, Inc.
    -The Bellabeats marketing analytics team, of which I am a member.

    2. Identify the business task.
    *The business task is:
    -As provided by co-founder Urška Sršen, the business task for this project is to gain insight into how consumers use their non-Bellabeats smart devices in order to guide the company's upcoming marketing strategies and help drive future growth. Specifically, the researcher was tasked with applying insights from the data analysis process to one Bellabeats product and presenting those insights to Bellabeats stakeholders.

    Section 2 - Prepare:

    A. Guiding Questions:
    1. Where is the data stored and organized?
    2. Are there any problems with the data?
    3. How does the data help answer the business question?

    B. Key Tasks:

    1. Research and communicate to stakeholders the source of the data and how it is stored and organized.
    *The data source used for our case study is FitBit Fitness Tracker Data. The dataset is stored on Kaggle and was made available by the user Mobius in an open-source format; the data is therefore public and may be copied, modified, and distributed without asking the owner for permission. These datasets were generated by respondents to a survey distributed via Amazon Mechanical Turk, reportedly (see the credibility section directly below) between 03/12/2016 and 05/12/2016.
    *Reportedly (see the credibility section directly below), thirty eligible Fitbit users consented to the submission of personal tracker data, including output related to steps taken, calories burned, time spent sleeping, heart rate, and distance traveled, broken down into minute-, hour-, and day-level totals. The data is stored in 18 CSV files. I downloaded all 18 files to my laptop and chose to use 2 of them for this project, as they merged the activity and sleep data from the other files; all unused files were permanently deleted from the laptop. The 2 files used were:
    -sleepDay_merged.csv
    -dailyActivity_merged.csv

    2. Identify and communicate to stakeholders any problems found with the data related to credibility and bias.
    *As will be presented more specifically in the Process section, the data appears to have credibility issues related to the reported time frame of collection. The metadata indicates that the data covered roughly 2 months of FitBit tracking; however, upon my initial data processing, I found that only 1 month of data was reported.
    *As will be presented more specifically in the Process section, the data has credibility issues related to the number of individuals who reported FitBit data. Specifically, the metadata states that 30 individual users agreed to report their tracking data, but my initial data processing uncovered 33 individual ...
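    The two credibility checks described above can also be reproduced programmatically. Below is a minimal pandas sketch (the original analysis was done in Excel; this is only an equivalent check), assuming the column names shipped in the Kaggle CSVs (Id, ActivityDate, SleepDay):

    ```python
    # Re-check of the two credibility issues noted above, in pandas rather
    # than Excel. Column names (Id, ActivityDate, SleepDay) are assumed to
    # match the Kaggle CSVs.
    import pandas as pd

    activity = pd.read_csv("dailyActivity_merged.csv", parse_dates=["ActivityDate"])
    sleep = pd.read_csv("sleepDay_merged.csv", parse_dates=["SleepDay"])

    # Issue 1: metadata claims ~2 months of tracking; inspect the actual window.
    print(activity["ActivityDate"].min(), "to", activity["ActivityDate"].max())

    # Issue 2: metadata claims 30 consenting users; count distinct IDs.
    print(activity["Id"].nunique(), "unique users in activity data")
    print(sleep["Id"].nunique(), "unique users in sleep data")
    ```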

  2. UC_vs_US Statistic Analysis.xlsx

    • figshare.com
    xlsx
    Updated Jul 9, 2020
    F. (Fabiano) Dalpiaz (2020). UC_vs_US Statistic Analysis.xlsx [Dataset]. http://doi.org/10.23644/uu.12631628.v1
    Available download formats: xlsx
    Dataset provided by
    Utrecht University
    Authors
    F. (Fabiano) Dalpiaz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sheet 1 (Raw-Data): The raw data of the study is provided, presenting the tagging results for the measures described in the paper. For each subject, it includes multiple columns:
    A. a sequential student ID
    B. an ID that defines a random group label and the notation
    C. the notation used: User Stories or Use Cases
    D. the case they were assigned to: IFA, Sim, or Hos
    E. the subject's exam grade (total points out of 100); empty cells mean that the subject did not take the first exam
    F. a categorical representation of the grade as L/M/H, where H is greater than or equal to 80, M is between 65 (included) and 80 (excluded), and L otherwise
    G. the total number of classes in the student's conceptual model
    H. the total number of relationships in the student's conceptual model
    I. the total number of classes in the expert's conceptual model
    J. the total number of relationships in the expert's conceptual model
    K-O. the total number of encountered situations of alignment, wrong representation, system-oriented, omitted, missing (see tagging scheme below)
    P. the researchers' judgement of how well the student explained the derivation process: well explained (a systematic mapping that can be easily reproduced), partially explained (vague indication of the mapping), or not present

    Tagging scheme:
    Aligned (AL) - A concept is represented as a class in both models, either with the same name or using synonyms or clearly linkable names;
    Wrongly represented (WR) - A class in the domain expert model is incorrectly represented in the student model, either (i) via an attribute, method, or relationship rather than a class, or (ii) using a generic term (e.g., "user" instead of "urban planner");
    System-oriented (SO) - A class in CM-Stud that denotes a technical implementation aspect, e.g., access control. Classes that represent a legacy system or the system under design (portal, simulator) are legitimate;
    Omitted (OM) - A class in CM-Expert that does not appear in any way in CM-Stud;
    Missing (MI) - A class in CM-Stud that does not appear in any way in CM-Expert.

    All the calculations and information provided in the following sheets originate from that raw data.

    Sheet 2 (Descriptive-Stats): Shows a summary of statistics from the data collection, including the number of subjects per case, per notation, per process derivation rigor category, and per exam grade category.

    Sheet 3 (Size-Ratio): The number of classes within the student model divided by the number of classes within the expert model is calculated (describing the size ratio). We provide box plots to allow a visual comparison of the shape of the distribution, its central value, and its variability for each group (by case, notation, process, and exam grade). The primary focus in this study is on the number of classes; however, we also provide the size ratio for the number of relationships between student and expert model.

    Sheet 4 (Overall): Provides an overview of all subjects regarding the encountered situations, completeness, and correctness. Correctness is defined as the ratio of classes in a student model that are fully aligned with the classes in the corresponding expert model; it is calculated by dividing the number of aligned concepts (AL) by the sum of the aligned concepts (AL), omitted concepts (OM), system-oriented concepts (SO), and wrong representations (WR). Completeness, on the other hand, is defined as the ratio of classes in a student model that are correctly or incorrectly represented over the number of classes in the expert model; it is calculated by dividing the sum of aligned concepts (AL) and wrong representations (WR) by the sum of the aligned concepts (AL), wrong representations (WR), and omitted concepts (OM). The overview is complemented with general diverging stacked bar charts that illustrate correctness and completeness.
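    For concreteness, here is a small sketch of the two ratios as defined above, with illustrative tag counts rather than values from the dataset:

    ```python
    # Illustrative computation of correctness and completeness from the tag
    # counts defined above (AL, WR, SO, OM); the numbers are made up.
    AL, WR, SO, OM = 12, 3, 2, 4  # aligned, wrongly represented, system-oriented, omitted

    correctness = AL / (AL + OM + SO + WR)       # fully aligned share of tagged classes
    completeness = (AL + WR) / (AL + WR + OM)    # expert classes covered at all

    print(f"correctness  = {correctness:.2f}")   # 0.57
    print(f"completeness = {completeness:.2f}")  # 0.79
    ```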

    For sheet 4 as well as for the following four sheets, diverging stacked bar charts are provided to visualize the effect of each of the independent and mediated variables. The charts are based on the relative numbers of encountered situations for each student. In addition, a "Buffer" is calculated which solely serves the purpose of constructing the diverging stacked bar charts in Excel. Finally, at the bottom of each sheet, the significance (T-test) and effect size (Hedges' g) for both completeness and correctness are provided; Hedges' g was calculated with an online tool: https://www.psychometrica.de/effect_size.html (a sketch of these computations follows the sheet list below). The independent and moderating variables can be found as follows:

    Sheet 5 (By-Notation): Model correctness and model completeness are compared by notation - UC, US.

    Sheet 6 (By-Case): Model correctness and model completeness are compared by case - SIM, HOS, IFA.

    Sheet 7 (By-Process): Model correctness and model completeness are compared by how well the derivation process is explained - well explained, partially explained, not present.

    Sheet 8 (By-Grade): Model correctness and model completeness are compared by exam grade, converted to the categorical values High, Medium, and Low.
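    As referenced above, the per-sheet significance and effect-size statistics can be reproduced offline. A sketch using SciPy for the t-test and the textbook Hedges' g formula (the sheets themselves used the psychometrica.de tool), with illustrative score vectors:

    ```python
    # Offline equivalent of the sheet-level statistics: Student's t-test plus
    # Hedges' g (pooled-SD Cohen's d with the small-sample correction factor).
    import numpy as np
    from scipy import stats

    def hedges_g(a, b):
        a, b = np.asarray(a, float), np.asarray(b, float)
        n1, n2 = len(a), len(b)
        pooled_sd = np.sqrt(((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1))
                            / (n1 + n2 - 2))
        d = (a.mean() - b.mean()) / pooled_sd      # Cohen's d
        return d * (1 - 3 / (4 * (n1 + n2) - 9))   # small-sample correction

    uc = [0.61, 0.55, 0.70, 0.48, 0.66]  # illustrative correctness scores (Use Cases)
    us = [0.52, 0.44, 0.58, 0.40, 0.49]  # illustrative correctness scores (User Stories)

    t, p = stats.ttest_ind(uc, us)       # significance (T-test)
    print(f"t = {t:.2f}, p = {p:.3f}, g = {hedges_g(uc, us):.2f}")
    ```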

  3. Reliance on data & analysis for marketing decisions in Western Europe 2024

    • statista.com
    Updated May 15, 2024
    Statista (2024). Reliance on data & analysis for marketing decisions in Western Europe 2024 [Dataset]. https://www.statista.com/statistics/1465527/reliance-data-analysis-marketing-decisions-europe/
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    May 2024
    Area covered
    Europe
    Description

    During a survey carried out in 2024, roughly one in three marketing managers from France, Germany, and the United Kingdom stated that they based every marketing decision on data. Under ** percent of respondents in all five surveyed countries said they struggled to incorporate data analytics into their decision-making process.

  4. Data Analysis for the Systematic Literature Review of DL4SE

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    Updated Jul 19, 2024
    Cody Watson; Nathan Cooper; David Nader; Kevin Moran; Denys Poshyvanyk (2024). Data Analysis for the Systematic Literature Review of DL4SE [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_4768586
    Dataset provided by
    Washington and Lee University
    College of William and Mary
    Authors
    Cody Watson; Nathan Cooper; David Nader; Kevin Moran; Denys Poshyvanyk
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data Analysis is the process that supports decision-making and informs arguments in empirical studies. Descriptive statistics, Exploratory Data Analysis (EDA), and Confirmatory Data Analysis (CDA) are the approaches that compose Data Analysis (Xia & Gong, 2014). An EDA comprises a set of statistical and data mining procedures to describe data. We ran an EDA to provide statistical facts and inform conclusions; the mined facts support arguments that shape the Systematic Literature Review of DL4SE.

    The Systematic Literature Review of DL4SE requires formal statistical modeling to refine the answers for the proposed research questions and formulate new hypotheses to be addressed in the future. Hence, we introduce DL4SE-DA, a set of statistical processes and data mining pipelines that uncover hidden relationships among Deep Learning reported literature in Software Engineering. Such hidden relationships are collected and analyzed to illustrate the state-of-the-art of DL techniques employed in the software engineering context.

    Our DL4SE-DA is a simplified version of classical Knowledge Discovery in Databases, or KDD (Fayyad et al., 1996). The KDD process extracts knowledge from a DL4SE structured database, which was the product of multiple iterations of data gathering and collection from the inspected literature. The KDD process involves five stages:

    Selection. This stage was led by the taxonomy process explained in section xx of the paper. After collecting all the papers and creating the taxonomies, we organized the data into the 35 features or attributes found in the repository; in effect, we manually engineered features from the DL4SE papers. Some of the features are venue, year published, type of paper, metrics, data-scale, type of tuning, learning algorithm, SE data, and so on.

    Preprocessing. The preprocessing applied consisted of transforming the features into the correct (nominal) type, removing outliers (papers that do not belong to DL4SE), and re-inspecting the papers to extract missing information produced by the normalization process. For instance, we normalized the feature “metrics” into “MRR”, “ROC or AUC”, “BLEU Score”, “Accuracy”, “Precision”, “Recall”, “F1 Measure”, and “Other Metrics”, where “Other Metrics” refers to unconventional metrics found during extraction. The same normalization was applied to other features such as “SE Data” and “Reproducibility Types”. This separation into more detailed classes contributes to a better understanding and classification of the papers by the data mining tasks or methods.

    Transformation. In this stage, we applied no data transformation method except for the clustering analysis: we performed a Principal Component Analysis to reduce the 35 features into 2 components for visualization purposes. PCA also allowed us to identify the number of clusters that exhibits the maximum reduction in variance; in other words, it helped us identify the number of clusters to use when tuning the explainable models.
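    A hedged scikit-learn sketch of this stage, using a stand-in feature matrix rather than the real 35 engineered features:

    ```python
    # One-hot encode nominal paper features, project to 2 PCA components for
    # plotting, then scan cluster counts by within-cluster variance (elbow
    # heuristic), mirroring the transformation step described above.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import OneHotEncoder

    papers = np.array([["ICSE", "supervised", "code"],       # illustrative rows
                       ["FSE", "supervised", "text"],
                       ["ICSE", "reinforcement", "code"],
                       ["ASE", "unsupervised", "text"],
                       ["FSE", "supervised", "code"]])

    X = OneHotEncoder().fit_transform(papers).toarray()
    coords = PCA(n_components=2).fit_transform(X)  # 2-D view for visualization

    for k in range(1, 5):
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
        print(k, round(km.inertia_, 2))  # look for the elbow in the variance drop
    ```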

    Data Mining. In this stage, we used three distinct data mining tasks: Correlation Analysis, Association Rule Learning, and Clustering. We decided that the goal of the KDD process should be oriented to uncover hidden relationships among the extracted features (Correlations and Association Rules) and to categorize the DL4SE papers for a better segmentation of the state-of-the-art (Clustering). A clear explanation is provided in the subsection “Data Mining Tasks for the SLR of DL4SE”.

    Interpretation/Evaluation. We used Knowledge Discovery to automatically find patterns in our papers that resemble “actionable knowledge”. This actionable knowledge was generated by conducting a reasoning process on the data mining outcomes. This reasoning process produces an argument support analysis (see this link).

    We used RapidMiner as our software tool to conduct the data analysis. The procedures and pipelines were published in our repository.

    Overview of the most meaningful Association Rules. Rectangles are both Premises and Conclusions. An arrow connecting a Premise with a Conclusion implies that given some premise, the conclusion is associated. E.g., Given that an author used Supervised Learning, we can conclude that their approach is irreproducible with a certain Support and Confidence.

    Support = the number of occurrences in which the statement is true, divided by the total number of statements.
    Confidence = the support of the statement divided by the number of occurrences of the premise.
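    Applying the two definitions directly, with made-up stand-ins for the extracted papers:

    ```python
    # Support and confidence for the example rule from the figure:
    # "Supervised Learning => irreproducible". Rows are illustrative.
    rows = [
        {"learning": "supervised", "reproducible": False},
        {"learning": "supervised", "reproducible": False},
        {"learning": "supervised", "reproducible": True},
        {"learning": "unsupervised", "reproducible": False},
    ]

    premise = [r for r in rows if r["learning"] == "supervised"]
    rule_true = [r for r in premise if not r["reproducible"]]

    support = len(rule_true) / len(rows)        # statement true / all statements
    confidence = len(rule_true) / len(premise)  # statement true / premise occurrences
    print(support, confidence)                  # 0.5 0.666...
    ```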

  5. Datasets for manuscript "Integrating data engineering and process systems...

    • catalog.data.gov
    • gimi9.com
    Updated Oct 10, 2025
    U.S. EPA Office of Research and Development (ORD) (2025). Datasets for manuscript "Integrating data engineering and process systems engineering for end-of-life chemical flow analysis" [Dataset]. https://catalog.data.gov/dataset/datasets-for-manuscript-integrating-data-engineering-and-process-systems-engineering-for-e
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    The GitHub repository https://github.com/jodhernandezbe/TRI4PLADS/tree/v1.0.0 is publicly available and referenced in the supplementary information. It describes the computational framework overview, software requirements, model use, model output, and a disclaimer. The repository presents a multi-scale framework that combines data engineering with process systems engineering (PSE) to enhance the precision of chemical flow analysis (CFA) at the end-of-life (EoL) stage. The focus is on chemicals used in plastic manufacturing, tracing their flows through the supply chain and EoL pathways. Additionally, this study examines potential discharges from material recovery facilities to publicly owned treatment works (POTW) facilities, recognizing their relevance to human and environmental health. Tracking these discharges is critical, as industrial EoL material transfers to POTWs can interfere with biological treatment processes, leading to unintended environmental chemical releases. By integrating data-driven methodologies with mechanistic modeling, this framework supports the identification, quantification, and regulatory assessment of chemical discharges, providing a science-based foundation for industrial and policy decision-making in sustainable material and water management.

    The attached file "CoU - Metadata File.xlsx" contains the datasets used to build Figure 3 and describes a qualitative flow diagram of methyl methacrylate from manufacturing to potential consumer products, generated with the Chemical Conditions of Use Locator methodology (https://doi.org/10.1111/jiec.13626). The attached file "MMA POTW Dataset.xlsx" contains the datasets needed to run the Chemical Tracker and Exposure Assessor in Publicly Owned Treatment Works Model (ChemTEAPOTW), as described in the GitHub repository https://github.com/gruizmer/ChemTEAPOTW. The attached file "Plastic Data-Calculations-Assumptions.docx" contains all calculations and assumptions used to estimate the methyl methacrylate (MMA) releases from plastic recycling. Finally, users can generate Figures 4 and 5 by following the step-by-step process described in the main GitHub repository for the MMA case study.

    This dataset is associated with the following publication: Hernandez-Betancur, J.D., J.D. Chea, D. Perez, and G.J. Ruiz-Mercado. Integrating data engineering and process systems engineering for end-of-life chemical flow analysis. COMPUTERS AND CHEMICAL ENGINEERING. Elsevier Science Ltd, New York, NY, USA, 204: 109414, (2026).

  6. Data Package for Sustainability-Oriented Process Analysis and Re-Design

    • figshare.com
    pdf
    Updated Sep 1, 2024
    Finn Klessascheck (2024). Data Package for Sustainability-Oriented Process Analysis and Re-Design [Dataset]. http://doi.org/10.6084/m9.figshare.22591513.v2
    Available download formats: pdf
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Finn Klessascheck
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset containing process models, event logs, and LCA data for the evaluation of SOPA, as well as additional figures.

  7. Process Analytics Service Market Report

    • promarketreports.com
    doc, pdf, ppt
    Updated Feb 1, 2025
    Pro Market Reports (2025). Process Analytics Service Market Report [Dataset]. https://www.promarketreports.com/reports/process-analytics-service-market-16984
    Available download formats: doc, pdf, ppt
    Dataset authored and provided by
    Pro Market Reports
    License

    https://www.promarketreports.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global Process Analytics Service Market is expected to reach a value of USD 674.52 million by 2033, exhibiting a CAGR of 8.41% during the forecast period (2025-2033). The market growth is attributed to the increasing adoption of process analytics to optimize operations, reduce costs, and enhance decision-making. The rising demand for data-driven insights, growing adoption of cloud-based analytics solutions, and increasing investments in digital transformation initiatives are also driving market growth. North America is the largest market for process analytics services, followed by Europe and Asia Pacific; high adoption of process analytics in the manufacturing, financial services, and healthcare industries in these regions is fueling growth, and the Asia Pacific region is expected to see significant growth in the coming years due to the rapid adoption of digital technologies and increasing government initiatives to promote data analytics across industries. Key players in the market include Domo, IBM, Oracle, Cisco, SAP, Microsoft, and others; these companies offer a wide range of process analytics services, including consulting, implementation, support, and maintenance, to help organizations improve their operational efficiency and business outcomes. The global Process Analytics Service market is projected to reach USD 802.7 million by 2027, growing at a CAGR of 28.3%. Key drivers for this market are: AI and machine learning integration; growing demand for real-time analytics; increasing focus on process optimization; rising adoption of cloud-based solutions; expanding market in emerging economies. Potential restraints include: increasing demand for data-driven insights; adoption of AI and machine learning; need for operational efficiency; growing regulatory compliance requirements; shift towards cloud-based solutions.

  8. Artificial datasets for multi-perspective Declare analysis

    • nde-dev.biothings.io
    • data-staging.niaid.nih.gov
    • +1more
    Updated Jan 24, 2020
    Burattin, Andrea (2020). Artificial datasets for multi-perspective Declare analysis [Dataset]. https://nde-dev.biothings.io/resources?id=zenodo_20030
    Dataset authored and provided by
    Burattin, Andrea
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This file contains the dataset we used for the evaluation of the multi-perspective Declare analysis.

    Logs

    In particular, it contains logs with different sizes and different trace lengths. We generated traces with 10, 20, 30, 40, and 50 events and, for each of these lengths, logs with 25000, 50000, 75000, and 100000 traces. With 5 trace lengths and 4 log sizes, there are therefore 20 logs in total.

    Declare models

    In addition, the dataset contains 10 Declare models. In particular, we prepared two models with 10 constraints, one only containing constraints on the control-flow (without conditions on data and time), and another one including real multi-perspective constraints (with conditions on time and data). We followed the same procedure to create models with 20, 30, 40, and 50 constraints.

  9. Sentiment Analysis of Business Process Reviews

    • kaggle.com
    zip
    Updated Oct 5, 2021
    Dr. Khurram Shahzad (2021). Sentiment Analysis of Business Process Reviews [Dataset]. https://www.kaggle.com/drkhurramshahzad/sentiment-analysis-of-business-process-reviews
    Available download formats: zip (738227 bytes)
    Authors
    Dr. Khurram Shahzad
    Description

    Dataset

    This dataset was created by Dr. Khurram Shahzad


  10. cases study1 example for google data analytics

    • kaggle.com
    zip
    Updated Apr 22, 2023
    mohammed hatem (2023). cases study1 example for google data analytics [Dataset]. https://www.kaggle.com/datasets/mohammedhatem/cases-study1-example-for-google-data-analytics
    Available download formats: zip (25278847 bytes)
    Authors
    mohammed hatem
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    On my journey to earn the Google Data Analytics certificate, I will practice a real-world example by following the steps of the data analysis process - ask, prepare, process, analyze, share, and act - picking the Bellabeat example.

  11. process analytical technology Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Feb 9, 2025
    Data Insights Market (2025). process analytical technology Report [Dataset]. https://www.datainsightsmarket.com/reports/process-analytical-technology-1474614
    Available download formats: doc, pdf, ppt
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The process analytical technology market was valued at USD XXX million in 2024 and is projected to reach USD XXX million by 2033, at an expected CAGR of XX% during the forecast period.

  12. Process Oil Market - Size, Share & Industry Analysis

    • mordorintelligence.com
    pdf, excel, csv, ppt
    Updated Jan 28, 2025
    Mordor Intelligence (2025). Process Oil Market - Size, Share & Industry Analysis [Dataset]. https://www.mordorintelligence.com/industry-reports/process-oils-market
    Available download formats: pdf, excel, csv, ppt
    Dataset authored and provided by
    Mordor Intelligence
    License

    https://www.mordorintelligence.com/privacy-policy

    Time period covered
    2019 - 2030
    Area covered
    Global
    Description

    The Process Oil Market Report is segmented by type (aromatic, paraffinic, and naphthenic), application (rubber, polymers, personal care, and other applications), and geography (Asia-Pacific, North America, Europe, South America, and Middle East and Africa). The market size and forecast are provided in terms of volume (tons) for all the above segments.

  13. Data from: A simple method for statistical analysis of intensity differences...

    • catalog.data.gov
    • healthdata.gov
    • +1more
    Updated Sep 7, 2025
    National Institutes of Health (2025). A simple method for statistical analysis of intensity differences in microarray-derived gene expression data [Dataset]. https://catalog.data.gov/dataset/a-simple-method-for-statistical-analysis-of-intensity-differences-in-microarray-derived-ge
    Dataset provided by
    National Institutes of Health
    Description

    Background: Microarray experiments offer a potent solution to the problem of making and comparing large numbers of gene expression measurements, either in different cell types or in the same cell type under different conditions. Inferences about the biological relevance of observed changes in expression depend on the statistical significance of the changes. In lieu of many replicates with which to determine accurate intensity means and variances, reliable estimates of statistical significance remain problematic; without such estimates, overly conservative choices for significance must be enforced.

    Results: A simple statistical method for estimating variances from microarray control data, which does not require multiple replicates, is presented. Comparison of datasets from two commercial entities using this difference-averaging method demonstrates that the standard deviation of the signal scales at a level intermediate between the signal intensity and its square root. Application of the method to a dataset related to the β-catenin pathway yields a larger number of biologically reasonable genes whose expression is altered than the ratio method.

    Conclusions: The difference-averaging method enables determination of variances as a function of signal intensity by averaging over the entire dataset. The method also provides a platform-independent view of important statistical properties of microarray data.
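    The scaling claim in the Results can be illustrated numerically: fit sd ≈ a·I^b in log space and check that the exponent b falls between 0.5 (square-root scaling) and 1 (linear scaling). The sketch below uses synthetic control data and is not the paper's actual pipeline:

    ```python
    # Fit a power law sd = a * intensity**b to (synthetic) control-spot data
    # and confirm the exponent sits between 0.5 and 1, as the abstract states.
    import numpy as np

    rng = np.random.default_rng(0)
    intensity = np.logspace(1, 4, 200)                       # signal intensities
    sd = 0.3 * intensity**0.75 * rng.lognormal(0, 0.1, 200)  # fake control SDs

    b, log_a = np.polyfit(np.log(intensity), np.log(sd), 1)
    print(f"fitted exponent b = {b:.2f}")  # ~0.75: between sqrt (0.5) and linear (1)
    ```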

  14. Distributed Real-time Embedded Analysis Method (DREAM)

    • catalog.data.gov
    • datasets.ai
    Updated Apr 11, 2025
    Dashlink (2025). Distributed Real-time Embedded Analysis Method (DREAM) [Dataset]. https://catalog.data.gov/dataset/distributed-real-time-embedded-analysis-method-dream
    Dataset provided by
    Dashlink
    Description

    Models developed with this Open Source tool.

  15. Bellabeat Case Study: Smart Device Usage Analysis

    • kaggle.com
    Updated Sep 18, 2025
    Saviour John (2025). Bellabeat Case Study: Smart Device Usage Analysis [Dataset]. https://www.kaggle.com/datasets/saviourjohn/bellabeat-case-study-smart-device-usage-analysis
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Saviour John
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This notebook presents the Bellabeat Google Data Analytics Capstone case study. The analysis uses Fitbit smart device data to uncover patterns in steps, sleep, and calories, applying these insights to Bellabeat Time, a wellness-focused smartwatch.

    The work follows the six-step analysis process (Ask, Prepare, Process, Analyze, Share, Act), providing both insights and actionable recommendations.

    Key Insights:

    Users average ~7,000 steps/day (below the recommended 10,000).

    Average sleep is 6–7 hours/night (less than the recommended 8).

    Steps strongly correlate with calories burned.

    Activity and sleep vary by weekday vs weekend.

    Deliverables:

    Exploratory analysis in RMarkdown

    Visualizations of activity and sleep patterns

    Actionable marketing recommendations for Bellabeat

  16. Data from: Method for selecting the optimal technology in metal additive...

    • data.mendeley.com
    • observatorio-investigacion.unavarra.es
    Updated Feb 27, 2024
    Virginia Uralde (2024). Method for selecting the optimal technology in metal additive manufacturing using an analytical hierarchical process [Dataset]. http://doi.org/10.17632/wbsd9v2ztz.1
    Authors
    Virginia Uralde
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The research hypothesis of this study revolves around employing Multi-Criteria Decision Analysis (MCDA) techniques, particularly the Analytical Hierarchical Process (AHP), to optimize technology selection in metal additive manufacturing. The data collected and analyzed includes the results of the survey and criteria evaluation relevant to the decision-making process, such as reliability, finishing of the part after printing, complexity of post-processing, sustainability of the process, user preferences, machine price, manufacturing cost, and productivity. The AHP methodology involves constructing a hierarchy structure wherein the goal or objective, criteria, and alternatives are systematically organized. Pairwise comparisons are then made among criteria and alternatives, using a relative importance scale ranging from 1 to 9. These comparisons are recorded in a positive reciprocal matrix, which is then normalized to obtain numerical weights for decision-making. The priority vector or normalized principal eigenvector is computed, representing the relative importance of criteria, and the maximum eigenvalue is determined. Finally, a global ranking of decision alternatives is analyzed based on additive aggregation and normalization of the sum of local priorities of criteria and alternatives.
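    A minimal numpy sketch of the AHP mechanics described above (positive reciprocal pairwise matrix on the 1-9 scale, principal eigenvector as the priority vector, maximum eigenvalue); the criteria and judgments are illustrative:

    ```python
    # AHP priority computation for three illustrative criteria, e.g.
    # reliability, manufacturing cost, and productivity. A is a positive
    # reciprocal matrix of pairwise comparisons on the 1-9 importance scale.
    import numpy as np

    A = np.array([[1.0, 3.0, 5.0],
                  [1 / 3, 1.0, 2.0],
                  [1 / 5, 1 / 2, 1.0]])

    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)                # index of the maximum eigenvalue
    priorities = np.abs(eigvecs[:, k].real)
    priorities /= priorities.sum()             # normalized priority vector

    lambda_max = eigvals.real[k]
    ci = (lambda_max - len(A)) / (len(A) - 1)  # consistency index (standard AHP check)
    print(priorities, round(lambda_max, 3), round(ci, 4))
    ```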

  17. Process Data Historian Software Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jan 13, 2025
    Data Insights Market (2025). Process Data Historian Software Report [Dataset]. https://www.datainsightsmarket.com/reports/process-data-historian-software-1981775
    Available download formats: doc, pdf, ppt
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Process Data Historian Software market was valued at USD XXX million in 2023 and is projected to reach USD XXX million by 2032, at an expected CAGR of XX% during the forecast period.

  18. Data from: Application of statistical analysis to improve time management of...

    • scielo.figshare.com
    jpeg
    Updated May 31, 2023
    Mariana Mello Pereira; Luiza Lavocat Galvão de Almeida Coelho; Gladston Luiz da Silva; Yanne Souza Alves Cunha (2023). Application of statistical analysis to improve time management of a process modeling project [Dataset]. http://doi.org/10.6084/m9.figshare.10026032.v1
    Available download formats: jpeg
    Dataset provided by
    SciELO journals
    Authors
    Mariana Mello Pereira; Luiza Lavocat Galvão de Almeida Coelho; Gladston Luiz da Silva; Yanne Souza Alves Cunha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract: This work was the result of a cooperation agreement between the University of Brasilia (UnB) and a security organization. Its goal was to model the logistic processes of the company to assist in the modernization of a control system for the management of materials. The project was managed under a dynamic model, through the monitoring and control of activities executed during its various stages. The development of a control system enabled the detection of discrepancies between what was planned and what was performed, identifying their causes and which actions to take to ensure that the project got back on track according to the planned schedule and budget. The main objective of this article was to identify which elements controlled by the project affected its execution time; with that knowledge, it was possible to improve the planning of the next phases of the project. To this end, we performed an exploratory, quantitative case study to provide information on the object and guide the formulation of hypotheses. The qualitative analysis of the execution time of the modeling identified two dependent variables - systematic version and team - out of the four evaluated. The quantitative analysis studied two variables - number of modifications and number of elements - which did not show evidence of correlation with the aforementioned time.

  19. Software tools used for data collection and analysis.

    • plos.figshare.com
    xls
    Updated May 30, 2023
    John A. Borghi; Ana E. Van Gulick (2023). Software tools used for data collection and analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0252047.t003
    Available download formats: xls
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    John A. Borghi; Ana E. Van Gulick
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Software tools used to collect and analyze data. Parentheses for analysis software indicate the tools participants were taught to use as part of their education in research methods and statistics. “Other” responses for data collection software largely consisted of survey tools (e.g. Survey Monkey, LimeSurvey) and tools for building and running behavioral experiments (e.g. Gorilla, JsPsych). “Other” responses for data analysis software largely consisted of neuroimaging-related tools (e.g. SPM, AFNI).

  20. Business Process Management Platforms Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jan 16, 2025
    Data Insights Market (2025). Business Process Management Platforms Report [Dataset]. https://www.datainsightsmarket.com/reports/business-process-management-platforms-1398254
    Available download formats: doc, pdf, ppt
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Business Process Management Platforms market was valued at USD 2714 million in 2023 and is projected to reach USD 4897.92 million by 2032, at an expected CAGR of 8.8% during the forecast period.
