100+ datasets found
  1. Google Certificate BellaBeats Capstone Project

    • kaggle.com
    zip
    Updated Jan 5, 2023
    Cite
    Jason Porzelius (2023). Google Certificate BellaBeats Capstone Project [Dataset]. https://www.kaggle.com/datasets/jasonporzelius/google-certificate-bellabeats-capstone-project
    Available download formats: zip (169161 bytes)
    Dataset updated
    Jan 5, 2023
    Authors
    Jason Porzelius
    Description

    Introduction: I have chosen to complete a data analysis project for the second course option, Bellabeats, Inc., using a locally installed spreadsheet program, Excel, for both my data analysis and visualizations. This choice was made primarily because I live in a remote area and have limited bandwidth and inconsistent internet access. Therefore, completing a capstone project using web-based programs such as R Studio, SQL Workbench, or Google Sheets was not a feasible choice. I was further limited in which option to choose, as the datasets for the ride-share project option were larger than my version of Excel would accept. In the scenario provided, I will be acting as a Junior Data Analyst in support of the Bellabeats, Inc. executive team and data analytics team. This combined team has decided to use an existing public dataset in hopes that the findings from that dataset might reveal insights which will assist in Bellabeat's marketing strategies for future growth. My task is to provide data-driven insights in response to business tasks set by the Bellabeats, Inc. executive and data analysis teams. In order to accomplish this task, I will complete all parts of the Data Analysis Process (Ask, Prepare, Process, Analyze, Share, Act). In addition, I will break each part of the Data Analysis Process down into three sections to provide clarity and accountability. Those three sections are: Guiding Questions, Key Tasks, and Deliverables. For the sake of space and to avoid repetition, I will record the deliverables for each Key Task directly under the numbered Key Task using an asterisk (*) as an identifier.

    Section 1 - Ask:

    A. Guiding Questions:
    1. Who are the key stakeholders and what are their goals for the data analysis project? 2. What is the business task that this data analysis project is attempting to solve?

    B. Key Tasks:
    1. Identify key stakeholders and their goals for the data analysis project.
       *The key stakeholders for this project are as follows:
       -Urška Sršen and Sando Mur - co-founders of Bellabeats, Inc.
       -Bellabeats marketing analytics team. I am a member of this team.

    2. Identify the business task.
       *The business task is:
       -As provided by co-founder Urška Sršen, the business task for this project is to gain insight into how consumers are using their non-BellaBeats smart devices in order to guide upcoming marketing strategies for the company which will help drive future growth. Specifically, the researcher was tasked with applying insights driven by the data analysis process to 1 BellaBeats product and presenting those insights to BellaBeats stakeholders.

    Section 2 - Prepare:

    A. Guiding Questions: 1. Where is the data stored and organized? 2. Are there any problems with the data? 3. How does the data help answer the business question?

    B. Key Tasks:

    1. Research and communicate the source of the data, and how it is stored/organized, to stakeholders. *The data source used for our case study is FitBit Fitness Tracker Data. This dataset is stored on Kaggle and was made available by user Mobius in an open-source format. Therefore, the data is public and available to be copied, modified, and distributed, all without asking the user for permission. These datasets were generated by respondents to a survey distributed via Amazon Mechanical Turk, reportedly (see credibility section directly below) between 03/12/2016 and 05/12/2016.
      *Reportedly (see credibility section directly below), thirty eligible Fitbit users consented to the submission of personal tracker data, including output related to steps taken, calories burned, time spent sleeping, heart rate, and distance traveled. This data was broken down into minute-, hour-, and day-level totals. This data is stored in 18 CSV documents. I downloaded all 18 documents onto my laptop and decided to use 2 of them for this project, as they were files which had merged activity and sleep data from the other documents. All unused documents were permanently deleted from the laptop. The 2 files used were:
      -sleepDay_merged.csv
      -dailyActivity_merged.csv

    2. Identify and communicate to stakeholders any problems found with the data related to credibility and bias. *As will be more specifically presented in the Process section, the data seems to have credibility issues related to the reported time frame of the data collected. The metadata seems to indicate that the data collected covered roughly 2 months of FitBit tracking. However, upon my initial data processing, I found that only 1 month of data was reported. *As will be more specifically presented in the Process section, the data has credibility issues related to the number of individuals who reported FitBit data. Specifically, the metadata communicates that 30 individual users agreed to report their tracking data. My initial data processing uncovered 33 individual ...
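
The two credibility checks described above (how many distinct users reported data, and how many days the records actually span) can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's Excel workflow: the column names `Id` and `ActivityDate`, the date format, and the sample rows are all assumptions made for the example.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows standing in for dailyActivity_merged.csv.
# The column names "Id" and "ActivityDate" are assumptions for this sketch.
SAMPLE = """Id,ActivityDate
1001,4/12/2016
1001,4/13/2016
1002,4/12/2016
1003,5/12/2016
"""


def summarize(csv_text):
    """Return (distinct user count, earliest date, latest date)."""
    users = set()
    dates = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        users.add(row["Id"])
        dates.append(datetime.strptime(row["ActivityDate"], "%m/%d/%Y").date())
    return len(users), min(dates), max(dates)


n_users, first_day, last_day = summarize(SAMPLE)
span_days = (last_day - first_day).days + 1  # inclusive day count
print(f"{n_users} users, {first_day} to {last_day} ({span_days} days)")
```

Comparing the printed user count and date span against the figures stated in the dataset's metadata is what surfaces the kind of mismatch the author reports.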

  2. Google Data Analytics Capstone Project

    • kaggle.com
    zip
    Updated Jul 14, 2023
    Cite
    Ponomarliliia (2023). Google Data Analytics Capstone Project [Dataset]. https://www.kaggle.com/datasets/ponomarlili/google-data-analytics-capstone-project
    Available download formats: zip (214473433 bytes)
    Dataset updated
    Jul 14, 2023
    Authors
    Ponomarliliia
    Description

    Introduction
    After completing my Google Data Analytics Professional Certificate on Coursera, I completed a Capstone Project, recommended by Google, to improve and highlight my technical data analysis skills, such as R programming, SQL, and Tableau. In the Cyclistic Case Study, I performed many real-world tasks of a junior data analyst. To answer the critical business questions, I followed the steps of the data analysis process: ask, prepare, process, analyze, share, and act.

    Scenario
    You are a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, your team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, your team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve your recommendations, so they must be backed up with compelling data insights and professional data visualizations.

    Characters and teams
    Cyclistic: A bike-share program that has grown to a fleet of 5,824 bicycles that are tracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system at any time. Cyclistic sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use them to commute to work each day.

    Stakeholders
    Lily Moreno: The director of marketing and your manager. Moreno is responsible for the development of campaigns and initiatives to promote the bike-share program. These may include email, social media, and other channels.
    Cyclistic marketing analytics team: A team of data analysts responsible for collecting, analyzing, and reporting data that helps guide Cyclistic marketing strategy. You joined this team six months ago and have been busy learning about Cyclistic’s mission and business goals and how you, as a junior data analyst, can help Cyclistic achieve them.
    Cyclistic executive team: The notoriously detail-oriented executive team will decide whether to approve the recommended marketing program.

  3. Reliance on data & analysis for marketing decisions in Western Europe 2024

    • statista.com
    Updated May 15, 2024
    Cite
    Statista (2024). Reliance on data & analysis for marketing decisions in Western Europe 2024 [Dataset]. https://www.statista.com/statistics/1465527/reliance-data-analysis-marketing-decisions-europe/
    Dataset updated
    May 15, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    May 2024
    Area covered
    Europe
    Description

    During a survey carried out in 2024, roughly one in three marketing managers from France, Germany, and the United Kingdom stated that they based every marketing decision on data. Under ** percent of respondents in all five surveyed countries said they struggled to incorporate data analytics into their decision-making process.

  4. Process Analytics Service Market Report

    • promarketreports.com
    doc, pdf, ppt
    Updated Feb 1, 2025
    Cite
    Pro Market Reports (2025). Process Analytics Service Market Report [Dataset]. https://www.promarketreports.com/reports/process-analytics-service-market-16984
    Available download formats: doc, pdf, ppt
    Dataset updated
    Feb 1, 2025
    Dataset authored and provided by
    Pro Market Reports
    License

    https://www.promarketreports.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global Process Analytics Service Market is expected to reach a value of USD 674.52 million by 2033, exhibiting a CAGR of 8.41% during the forecast period (2025-2033). The market growth is attributed to increasing adoption of process analytics to optimize operations, reduce costs, and enhance decision-making. The rising demand for data-driven insights, growing adoption of cloud-based analytics solutions, and increasing investments in digital transformation initiatives are also driving the market growth.

    North America is the largest market for process analytics services, followed by Europe and Asia Pacific. The high adoption of process analytics in the manufacturing, financial services, and healthcare industries in these regions is fueling the market growth. The Asia Pacific region is expected to witness significant growth in the coming years due to the rapid adoption of digital technologies and increasing government initiatives to promote data analytics across various industries.

    Key players in the market include Domo, IBM, Oracle, Cisco, SAP, Microsoft, and others. These companies are offering a wide range of process analytics services, including consulting, implementation, support, and maintenance to help organizations leverage process analytics to improve their operational efficiency and business outcomes.

    The global Process Analytics Service market is projected to reach USD 802.7 Million by 2027, growing at a CAGR of 28.3%. Key drivers for this market are: AI and machine learning integration; Growing demand for real-time analytics; Increasing focus on process optimization; Rising adoption of cloud-based solutions; Expanding market in emerging economies. Potential restraints include: Increasing demand for data-driven insights; Adoption of AI and machine learning; Need for operational efficiency; Growing regulatory compliance requirements; Shift towards cloud-based solutions.

  5. Google Data Analytics Capstone Project

    • kaggle.com
    zip
    Updated Nov 13, 2021
    + more versions
    Cite
    NANCY CHAUHAN (2021). Google Data Analytics Capstone Project [Dataset]. https://www.kaggle.com/datasets/nancychauhan199/google-case-study-pdf
    Available download formats: zip (284279 bytes)
    Dataset updated
    Nov 13, 2021
    Authors
    NANCY CHAUHAN
    Description

    Case Study: How Does a Bike-Share Navigate Speedy Success?

    Introduction

    Welcome to the Cyclistic bike-share analysis case study! In this case study, you will perform many real-world tasks of a junior data analyst. You will work for a fictional company, Cyclistic, and meet different characters and team members. In order to answer the key business questions, you will follow the steps of the data analysis process: ask, prepare, process, analyze, share, and act. Along the way, the Case Study Roadmap tables — including guiding questions and key tasks — will help you stay on the right path. By the end of this lesson, you will have a portfolio-ready case study. Download the packet and reference the details of this case study anytime. Then, when you begin your job hunt, your case study will be a tangible way to demonstrate your knowledge and skills to potential employers.

    Scenario

    You are a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, your team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, your team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve your recommendations, so they must be backed up with compelling data insights and professional data visualizations.

    Characters and teams

    ● Cyclistic: A bike-share program that features more than 5,800 bicycles and 600 docking stations. Cyclistic sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use the assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use them to commute to work each day.
    ● Lily Moreno: The director of marketing and your manager. Moreno is responsible for the development of campaigns and initiatives to promote the bike-share program. These may include email, social media, and other channels.
    ● Cyclistic marketing analytics team: A team of data analysts who are responsible for collecting, analyzing, and reporting data that helps guide Cyclistic marketing strategy. You joined this team six months ago and have been busy learning about Cyclistic’s mission and business goals, as well as how you, as a junior data analyst, can help Cyclistic achieve them.
    ● Cyclistic executive team: The notoriously detail-oriented executive team will decide whether to approve the recommended marketing program.

    About the company

    In 2016, Cyclistic launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime. Until now, Cyclistic’s marketing strategy relied on building general awareness and appealing to broad consumer segments. One approach that helped make these things possible was the flexibility of its pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase single-ride or full-day passes are referred to as casual riders. Customers who purchase annual memberships are Cyclistic members. Cyclistic’s finance analysts have concluded that annual members are much more profitable than casual riders. Although the pricing flexibility helps Cyclistic attract more customers, Moreno believes that maximizing the number of annual members will be key to future growth. Rather than creating a marketing campaign that targets all-new customers, Moreno believes there is a very good chance to convert casual riders into members. She notes that casual riders are already aware of the Cyclistic program and have chosen Cyclistic for their mobility needs. Moreno has set a clear goal: Design marketing strategies aimed at converting casual riders into annual members. In order to do that, however, the marketing analyst team needs to better understand how annual members and casual riders differ, why casual riders would buy a membership, and how digital media could affect their marketing tactics. Moreno and her team are interested in analyzing the Cyclistic historical bike trip data to identify trends

    Three questions will guide the future marketing program:

    1. How do annual members and casual riders use Cyclistic bikes differently?
    2. Why would casual riders buy Cyclistic annual memberships?
    3. How can Cyclistic use digital media to influence casual riders to become members?

    Moreno has assigned you the first question to answer: How do annual members and casual rid...

  6. UC_vs_US Statistic Analysis.xlsx

    • figshare.com
    xlsx
    Updated Jul 9, 2020
    Cite
    F. (Fabiano) Dalpiaz (2020). UC_vs_US Statistic Analysis.xlsx [Dataset]. http://doi.org/10.23644/uu.12631628.v1
    Available download formats: xlsx
    Dataset updated
    Jul 9, 2020
    Dataset provided by
    Utrecht University
    Authors
    F. (Fabiano) Dalpiaz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sheet 1 (Raw-Data): The raw data of the study is provided, presenting the tagging results for the used measures described in the paper. For each subject, it includes multiple columns:
    A. a sequential student ID
    B. an ID that defines a random group label and the notation
    C. the notation used: User Stories or Use Cases
    D. the case they were assigned to: IFA, Sim, or Hos
    E. the subject's exam grade (total points out of 100); empty cells mean that the subject did not take the first exam
    F. a categorical representation of the grade (L/M/H), where H is greater than or equal to 80, M is between 65 (included) and 80 (excluded), and L otherwise
    G. the total number of classes in the student's conceptual model
    H. the total number of relationships in the student's conceptual model
    I. the total number of classes in the expert's conceptual model
    J. the total number of relationships in the expert's conceptual model
    K-O. the total number of encountered situations of alignment, wrong representation, system-oriented, omitted, missing (see tagging scheme below)
    P. the researchers' judgement on how well the derivation process was explained by the student: well explained (a systematic mapping that can be easily reproduced), partially explained (vague indication of the mapping), or not present.
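
The grade banding described for column F can be written as a small function. This is a minimal sketch of the stated thresholds only; the function name and the choice to return `None` for an empty cell are assumptions made for illustration, not part of the spreadsheet.

```python
def grade_category(points):
    """Map an exam grade (0-100) to the L/M/H category used in column F.

    H: points >= 80; M: 65 <= points < 80; L: otherwise.
    Returns None for an empty cell (subject did not take the exam);
    that handling is an assumption of this sketch.
    """
    if points is None:
        return None
    if points >= 80:
        return "H"
    if points >= 65:
        return "M"
    return "L"


# Boundary values show which side of each threshold is inclusive.
for value in (80, 79.5, 65, 64.9, None):
    print(value, grade_category(value))
```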

    Tagging scheme:
    Aligned (AL) - A concept is represented as a class in both models, either with the same name or using synonyms or clearly linkable names;
    Wrongly represented (WR) - A class in the domain expert model is incorrectly represented in the student model, either (i) via an attribute, method, or relationship rather than class, or (ii) using a generic term (e.g., "user" instead of "urban planner");
    System-oriented (SO) - A class in CM-Stud that denotes a technical implementation aspect, e.g., access control. Classes that represent legacy system or the system under design (portal, simulator) are legitimate;
    Omitted (OM) - A class in CM-Expert that does not appear in any way in CM-Stud;
    Missing (MI) - A class in CM-Stud that does not appear in any way in CM-Expert.

    All the calculations and information provided in the following sheets originate from that raw data.

    Sheet 2 (Descriptive-Stats): Shows a summary of statistics from the data collection, including the number of subjects per case, per notation, per process derivation rigor category, and per exam grade category.

    Sheet 3 (Size-Ratio): The number of classes within the student model divided by the number of classes within the expert model is calculated (describing the size ratio). We provide box plots to allow a visual comparison of the shape of the distribution, its central value, and its variability for each group (by case, notation, process, and exam grade). The primary focus in this study is on the number of classes. However, we also provided the size ratio for the number of relationships between student and expert model.

    Sheet 4 (Overall): Provides an overview of all subjects regarding the encountered situations, completeness, and correctness, respectively. Correctness is defined as the ratio of classes in a student model that are fully aligned with the classes in the corresponding expert model. It is calculated by dividing the number of aligned concepts (AL) by the sum of the number of aligned concepts (AL), omitted concepts (OM), system-oriented concepts (SO), and wrong representations (WR). Completeness, on the other hand, is defined as the ratio of classes in a student model that are correctly or incorrectly represented over the number of classes in the expert model. Completeness is calculated by dividing the sum of aligned concepts (AL) and wrong representations (WR) by the sum of the number of aligned concepts (AL), wrong representations (WR), and omitted concepts (OM). The overview is complemented with general diverging stacked bar charts that illustrate correctness and completeness.
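
The two Sheet 4 ratios translate directly into code. The sketch below follows the definitions as worded in the description; the function names and the example counts are hypothetical, and no guard is included for a zero denominator.

```python
def correctness(al, wr, so, om):
    """Correctness = AL / (AL + OM + SO + WR)."""
    return al / (al + om + so + wr)


def completeness(al, wr, om):
    """Completeness = (AL + WR) / (AL + WR + OM)."""
    return (al + wr) / (al + wr + om)


# Hypothetical tag counts for one student model (not taken from the dataset).
al, wr, so, om = 6, 2, 1, 1
print(f"correctness  = {correctness(al, wr, so, om):.3f}")
print(f"completeness = {completeness(al, wr, om):.3f}")
```

Note that Missing (MI) concepts enter neither ratio, and that system-oriented (SO) concepts lower correctness but do not affect completeness.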

    For sheet 4 as well as for the following four sheets, diverging stacked bar charts are provided to visualize the effect of each of the independent and mediated variables. The charts are based on the relative numbers of encountered situations for each student. In addition, a "Buffer" is calculated which solely serves the purpose of constructing the diverging stacked bar charts in Excel. Finally, at the bottom of each sheet, the significance (T-test) and effect size (Hedges' g) for both completeness and correctness are provided. Hedges' g was calculated with an online tool: https://www.psychometrica.de/effect_size.html. The independent and moderating variables can be found as follows:
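
For readers who want to reproduce an effect size without the online tool, a common textbook form of Hedges' g can be sketched as follows. This is an assumption-laden illustration: it uses the pooled standard deviation with the usual small-sample correction factor 1 - 3 / (4(n1 + n2) - 9), which may differ slightly from the variant the online tool implements, and the two sample groups are made up.

```python
import math
from statistics import mean, variance


def hedges_g(group_a, group_b):
    """Hedges' g for two independent groups (approximate bias correction)."""
    n1, n2 = len(group_a), len(group_b)
    # Pooled variance from the two sample variances (n - 1 denominators).
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / (n1 + n2 - 2)
    cohens_d = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
    # Small-sample correction factor (approximation).
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return cohens_d * correction


# Hypothetical scores for two groups (not taken from the dataset).
uc_scores = [1, 2, 3, 4, 5]
us_scores = [2, 3, 4, 5, 6]
print(round(hedges_g(uc_scores, us_scores), 4))
```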

    Sheet 5 (By-Notation): Model correctness and model completeness are compared by notation - UC, US.

    Sheet 6 (By-Case): Model correctness and model completeness are compared by case - SIM, HOS, IFA.

    Sheet 7 (By-Process): Model correctness and model completeness are compared by how well the derivation process is explained - well explained, partially explained, not present.

    Sheet 8 (By-Grade): Model correctness and model completeness are compared by the exam grades, converted to categorical values High, Low, and Medium.

  7. Google Data Analytics Capstone

    • kaggle.com
    zip
    Updated Aug 9, 2022
    Cite
    Reilly McCarthy (2022). Google Data Analytics Capstone [Dataset]. https://www.kaggle.com/datasets/reillymccarthy/google-data-analytics-capstone/discussion
    Available download formats: zip (67456 bytes)
    Dataset updated
    Aug 9, 2022
    Authors
    Reilly McCarthy
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Hello! Welcome to the Capstone project I have completed to earn my Data Analytics certificate through Google. I chose to complete this case study through RStudio desktop. The reason I did this is that R is the primary new concept I learned throughout this course. I wanted to embrace my curiosity and learn more about R through this project. In the beginning of this report I will provide the scenario of the case study I was given. After this I will walk you through my Data Analysis process based on the steps I learned in this course:

    1. Ask
    2. Prepare
    3. Process
    4. Analyze
    5. Share
    6. Act

    The data I used for this analysis comes from this FitBit data set: https://www.kaggle.com/datasets/arashnic/fitbit

    " This dataset generated by respondents to a distributed survey via Amazon Mechanical Turk between 03.12.2016-05.12.2016. Thirty eligible Fitbit users consented to the submission of personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring. "

  8. Process Analytics Market Report

    • marketresearchforecast.com
    doc, pdf, ppt
    Updated Jun 2, 2025
    Cite
    Market Research Forecast (2025). Process Analytics Market Report [Dataset]. https://www.marketresearchforecast.com/reports/process-analytics-market-5322
    Available download formats: pdf, doc, ppt
    Dataset updated
    Jun 2, 2025
    Dataset authored and provided by
    Market Research Forecast
    License

    https://www.marketresearchforecast.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Process Analytics Market size was valued at USD 2.52 Billion in 2023 and is projected to reach USD 46.34 Billion by 2032, exhibiting a CAGR of 38.2% during the forecast period. Key drivers for this market are: Increasing Adoption of Cloud-based Managed Services to Drive Market Growth. Potential restraints include: Adverse Health Effect May Hamper Market Growth. Notable trends are: Growing Implementation of Touch-based and Voice-based Infotainment Systems to Increase Adoption of Intelligent Cars.
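
The growth rate quoted above can be sanity-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below plugs in the 2023 and 2032 values from the description and treats the gap as nine compounding years, which is an assumption about how the report counts the period.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over `years` periods."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the description: USD 2.52 billion (2023) to USD 46.34 billion (2032).
rate = cagr(2.52, 46.34, 2032 - 2023)
print(f"{rate:.1%}")
```

With these inputs the result rounds to the 38.2% stated in the description, so the three figures are at least internally consistent.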

  9. Process Mining Solution Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Dec 23, 2024
    Cite
    Data Insights Market (2024). Process Mining Solution Report [Dataset]. https://www.datainsightsmarket.com/reports/process-mining-solution-536677
    Available download formats: ppt, pdf, doc
    Dataset updated
    Dec 23, 2024
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global process mining solutions market is expanding rapidly, with a market size valued at XXX million in 2025 and projected to grow at a CAGR of XX% during the forecast period of 2025-2033. Key drivers of this growth include increasing adoption of digital transformation initiatives, rising demand for operational efficiency, and growing need for regulatory compliance. Major market trends include the emergence of cloud-based solutions, the integration of artificial intelligence (AI) and machine learning (ML), and the adoption of process mining in new industries, such as healthcare and retail. The market is segmented into various application areas, including manufacturing, financial services, healthcare, retail, and logistics and supply chain management. Automated process discovery tools, process efficiency analytics software, and business process compliance monitoring tools are the main solution types driving the market. Top companies in the process mining domain include Celonis, SAP Signavio, IBM, ARIS, and Appian, among others. North America, Europe, Asia Pacific, and the Middle East & Africa are key regional markets for process mining solutions.

  10. Data from: A simple method for statistical analysis of intensity differences...

    • catalog.data.gov
    • healthdata.gov
    • +1more
    Updated Sep 7, 2025
    Cite
    National Institutes of Health (2025). A simple method for statistical analysis of intensity differences in microarray-derived gene expression data [Dataset]. https://catalog.data.gov/dataset/a-simple-method-for-statistical-analysis-of-intensity-differences-in-microarray-derived-ge
    Dataset updated
    Sep 7, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background
    Microarray experiments offer a potent solution to the problem of making and comparing large numbers of gene expression measurements either in different cell types or in the same cell type under different conditions. Inferences about the biological relevance of observed changes in expression depend on the statistical significance of the changes. In lieu of many replicates with which to determine accurate intensity means and variances, reliable estimates of statistical significance remain problematic. Without such estimates, overly conservative choices for significance must be enforced.

    Results
    A simple statistical method for estimating variances from microarray control data which does not require multiple replicates is presented. Comparison of datasets from two commercial entities using this difference-averaging method demonstrates that the standard deviation of the signal scales at a level intermediate between the signal intensity and its square root. Application of the method to a dataset related to the β-catenin pathway yields a larger number of biologically reasonable genes whose expression is altered than the ratio method.

    Conclusions
    The difference-averaging method enables determination of variances as a function of signal intensities by averaging over the entire dataset. The method also provides a platform-independent view of important statistical properties of microarray data.

  11. Process Analytical Instrumentation Market Report | Industry Analysis, Size &...

    • mordorintelligence.com
    pdf,excel,csv,ppt
    Updated Oct 6, 2025
    Cite
    Mordor Intelligence (2025). Process Analytical Instrumentation Market Report | Industry Analysis, Size & Forecast [Dataset]. https://www.mordorintelligence.com/industry-reports/global-process-analytical-instrumentation-market
    Available download formats: pdf, excel, csv, ppt
    Dataset updated
    Oct 6, 2025
    Dataset provided by
    Authors
    Mordor Intelligence
    License

    https://www.mordorintelligence.com/privacy-policy

    Time period covered
    2019 - 2030
    Area covered
    Global
    Description

    The Process Analytical Instrumentation Market Report is Segmented by Instrument Type (Gas Chromatographs, Gas Analyzers, Liquid Analyzers, Spectrometers), Component (Hardware, Software, and Services), End-User Industry (Oil and Gas, Chemicals and Petrochemicals, and More), Installation Type (In-line/On-line, At-Line, and Laboratory), and Geography. The Market Forecasts are Provided in Terms of Value (USD).

  12. process analytical technology Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Feb 9, 2025
    Data Insights Market (2025). process analytical technology Report [Dataset]. https://www.datainsightsmarket.com/reports/process-analytical-technology-1474614
    Available download formats: pdf, ppt, doc
    Dataset updated
    Feb 9, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The size of the process analytical technology market was valued at USD XXX million in 2024 and is projected to reach USD XXX million by 2033, with an expected CAGR of XX% during the forecast period.

  13. Datasets for manuscript "Integrating data engineering and process systems...

    • catalog.data.gov
    • gimi9.com
    Updated Oct 10, 2025
    U.S. EPA Office of Research and Development (ORD) (2025). Datasets for manuscript "Integrating data engineering and process systems engineering for end-of-life chemical flow analysis" [Dataset]. https://catalog.data.gov/dataset/datasets-for-manuscript-integrating-data-engineering-and-process-systems-engineering-for-e
    Dataset updated
    Oct 10, 2025
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    The GitHub repository, https://github.com/jodhernandezbe/TRI4PLADS/tree/v1.0.0, is publicly available and referenced in the supplementary information. It describes the computational framework overview, software requirements, model use, model output, and disclaimer. The repository presents a multi-scale framework that combines data engineering with process systems engineering (PSE) to enhance the precision of chemical flow analysis (CFA) at the end-of-life (EoL) stage. The focus is on chemicals used in plastic manufacturing, tracing their flows through the supply chain and EoL pathways. Additionally, this study examines potential discharges from material recovery facilities to publicly owned treatment works (POTW) facilities, recognizing their relevance to human and environmental health. Tracking these discharges is critical, as industrial EoL material transfers to POTWs can interfere with biological treatment processes, leading to unintended environmental chemical releases. By integrating data-driven methodologies with mechanistic modeling, this framework supports the identification, quantification, and regulatory assessment of chemical discharges, providing a science-based foundation for industrial and policy decision-making in sustainable material and water management.

    The attached file "CoU - Metadata File.xlsx" contains the datasets used to build Figure 3 and to describe a qualitative flow diagram of methyl methacrylate from manufacturing to potential consumer products, generated with the Chemical Conditions of Use Locator methodology (https://doi.org/10.1111/jiec.13626). The attached file "MMA POTW Dataset.xlsx" contains the datasets needed to run the Chemical Tracker and Exposure Assessor in Publicly Owned Treatment Works Model (ChemTEAPOTW) as described in the GitHub repository https://github.com/gruizmer/ChemTEAPOTW. The attached file "Plastic Data-Calculations-Assumptions.docx" contains all calculations and assumptions used to estimate the methyl methacrylate (MMA) releases from plastic recycling. Finally, users can generate Figures 4 and 5 by following the step-by-step process described in the main GitHub repository for the MMA case study.

    This dataset is associated with the following publication: Hernandez-Betancur, J.D., J.D. Chea, D. Perez, and G.J. Ruiz-Mercado. Integrating data engineering and process systems engineering for end-of-life chemical flow analysis. COMPUTERS AND CHEMICAL ENGINEERING. Elsevier Science Ltd, New York, NY, USA, 204: 109414, (2026).

  14. Artificial datasets for multi-perspective Declare analysis

    • data-staging.niaid.nih.gov
    • nde-dev.biothings.io
    • +1more
    Updated Jan 24, 2020
    Burattin, Andrea (2020). Artificial datasets for multi-perspective Declare analysis [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_20030
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    University of Innsbruck
    Authors
    Burattin, Andrea
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This file contains the dataset we used for the evaluation of the multi-perspective Declare analysis.

    Logs

    In particular, it contains logs with different sizes and different trace lengths. We generated traces with 10, 20, 30, 40, and 50 events and, for each of these lengths, we generated logs with 25000, 50000, 75000, and 100000 traces. Therefore, in total, there are 20 logs.

    Declare models

    In addition, the dataset contains 10 Declare models. In particular, we prepared two models with 10 constraints, one only containing constraints on the control-flow (without conditions on data and time), and another one including real multi-perspective constraints (with conditions on time and data). We followed the same procedure to create models with 20, 30, 40, and 50 constraints.
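    The layout described above can be checked by enumerating the combinations. The sketch below is only an illustration of the counts stated in the description. The dictionary keys and the labels "control-flow" and "multi-perspective" are names chosen here, not file names from the dataset.

```python
from itertools import product

# Values taken from the dataset description.
TRACE_LENGTHS = [10, 20, 30, 40, 50]          # events per trace
LOG_SIZES = [25000, 50000, 75000, 100000]     # traces per log
CONSTRAINT_COUNTS = [10, 20, 30, 40, 50]      # constraints per model
MODEL_KINDS = ["control-flow", "multi-perspective"]  # labels assumed here

# One log per (trace length, log size) pair.
logs = [
    {"events_per_trace": length, "traces": size}
    for length, size in product(TRACE_LENGTHS, LOG_SIZES)
]

# Two models (one per kind) for each constraint count.
models = [
    {"constraints": count, "kind": kind}
    for count, kind in product(CONSTRAINT_COUNTS, MODEL_KINDS)
]

print(len(logs), "logs and", len(models), "models")
```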

  15. Google Data Analytics Capstone Project: Netflix

    • kaggle.com
    zip
    Updated Jan 25, 2024
    Doga Celik (2024). Google Data Analytics Capstone Project: Netflix [Dataset]. https://www.kaggle.com/datasets/dogacelik/google-data-analytics-capstone-project-netflix
    Available download formats: zip (59851 bytes)
    Dataset updated
    Jan 25, 2024
    Authors
    Doga Celik
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Introduction:

    This case study demonstrates the skills I acquired in the Google Data Analytics Professional Certificate course. I will use these skills to complete an imagined task set by Netflix. The analysis consists of the following steps: Ask, Prepare, Process, Analyze, Share, and Act.

    Scenario:

    The Netflix Chief Content Officer, Bela Bajaria, believes that the company's success depends on giving customers what they want. Bajaria stated that the goal of this task is to identify the most wanted content among the movies that will be added to the portfolio. Most movie contracts are signed before the movies reach theaters, so it is hard to know whether customers really want to watch a movie and whether it will be successful. Therefore, my team wants to understand what type of content a movie's success depends on. From these insights, my team will design an investment strategy for choosing the most popular movies expected to be in theaters in the near future. But first, Netflix executives must approve our recommendations. To win that approval, we must provide convincing data insights along with professional data visualizations.

    About the Company:

    At Netflix, we want to entertain the world. Whatever your taste, and no matter where you live, we give you access to best-in-class TV series, documentaries, feature films and games. Our members control what they want to watch, when they want it, in one simple subscription. We’re streaming in more than 30 languages and 190 countries, because great stories can come from anywhere and be loved everywhere. We are the world’s biggest fans of entertainment, and we’re always looking to help you find your next favorite story.

    As a company, Netflix knows that it is important to acquire or produce movies that people want to watch.

    Therefore, Bajaria has set a clear goal: define an investment strategy that will allow Netflix to provide customers with the movies they want to watch, which will maximize sales.

    Ask:

    Business Task: To find out what kind of movies customers want to watch and whether content type really correlates with a movie's success.

    Stakeholders:

    Bela Bajaria: She joined Netflix in 2016 to oversee unscripted and scripted series. Bajaria is also responsible for content selection and strategy for different regions.

    Netflix content analytics team: A team of data analysts who are responsible for collecting, analyzing, and reporting data that helps guide Netflix content strategy.

    Netflix executive team: The notoriously detail-oriented executive team will decide whether to approve the recommended content program.

    Prepare:

    I start my preparation by downloading all the data I will need for the study. Top 1000 Highest-Grossing Movies of All Time.csv will be used. Additionally, 15 Lowest-Grossing Movies of All Time.csv was found during the data research, and this dataset will be analyzed as well. The data has been made available by IMDb and shared at the following two URLs: https://www.imdb.com/list/ls098063263/ and https://www.imdb.com/list/ls069238222/ .

    Process:

    Data Cleaning:

    SQL: To begin the data cleaning process, I opened both CSV files in SQL and performed the following operations:

    • Checked for and removed any duplicates.
    • Checked whether there were any null values.
    • Removed the columns that were not necessary.
    • Trimmed the Description column to contain only the gross profit. (This step was applied only to the 1000 Highest-Grossing Movies of All Time.csv dataset.)
    • Renamed the Description column to Gross_Profit. (This step was applied only to the 1000 Highest-Grossing Movies of All Time.csv dataset.)

    The following SQL code was used during the data cleaning:

    SQL code used for the Highest-Grossing Movies dataset:

    -- Keep 12 characters of Description, starting at position 34, as Gross_Profit
    SELECT Position, SUBSTR(Description, 34, 12) AS Gross_Profit, Title, IMDb_Rating, Runtime_mins_, Year, Genres, Num_Votes, Release_Date
    FROM `even-electron-400301.Highest_Gross_Movies.1`

    SQL code used for the Lowest-Grossing Movies dataset:

    -- Keep only the columns needed for the analysis, ordered by list position
    SELECT Position, Title, IMDb_Rating, Runtime_mins_, Year, Genres, Num_Votes, Release_Date
    FROM `even-electron-400301.Lowest_Grossing_Movies.2`
    ORDER BY Position
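    The SUBSTR call in the first query is position-based: SQL counts characters from 1, so SUBSTR(Description, 34, 12) keeps characters 34 through 45. A minimal Python sketch of the same extraction follows. The helper name sql_substr and the sample description string are hypothetical. The real layout of the Description column is not shown in the source.

```python
def sql_substr(text, start, length):
    """Mimic SQL SUBSTR(text, start, length) for a positive, 1-based start."""
    if start < 1:
        raise ValueError("this sketch only handles start >= 1")
    begin = start - 1  # convert the 1-based SQL position to a 0-based index
    return text[begin:begin + length]


# Hypothetical description: 33 filler characters, then a 12-character amount.
description = "x" * 33 + "$123,456,789" + " trailing text"

gross_profit = sql_substr(description, 34, 12)
print(gross_profit)
```

    A fixed offset only works if every row places the amount at the same position, so a row-by-row check of the extracted values is worth doing before relying on them.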

    Analyze:

    To start, I want to restate the business task: does content have a big impact on a movie's success?

    To answer this question, I identified several pieces of information that I could extract and use in my analysis:

    • Average gross profit
    • Number of genres
    • Total gross profit of the most popular genres
    • The distribution of gross income across genres

    I used Microsoft Excel to compute the items above. The operations were as follows:

    • AVERAGE function for the average gross profit in 1000 Highest-Grossing Movies of All Time.
    • Created a pivot table to work on Genres and Gross_Pr...
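    The author did this step in Excel. For readers who prefer a script, the same two operations (an overall average and a per-genre pivot) can be sketched in Python. The three rows below are invented placeholders, and the assumption that Genres holds a comma-separated list is not confirmed by the source.

```python
from collections import defaultdict

# Placeholder rows; values are invented for illustration only.
rows = [
    {"Title": "Movie A", "Gross_Profit": 900.0, "Genres": "Action, Adventure"},
    {"Title": "Movie B", "Gross_Profit": 600.0, "Genres": "Action"},
    {"Title": "Movie C", "Gross_Profit": 300.0, "Genres": "Comedy"},
]

# Equivalent of the AVERAGE function over the Gross_Profit column.
average_gross = sum(r["Gross_Profit"] for r in rows) / len(rows)

# Equivalent of a pivot table: total gross and movie count per genre.
# A movie with several genres contributes to each of them.
genre_totals = defaultdict(float)
genre_counts = defaultdict(int)
for r in rows:
    for genre in r["Genres"].split(","):
        genre = genre.strip()
        genre_totals[genre] += r["Gross_Profit"]
        genre_counts[genre] += 1

# Genres ranked by total gross, highest first.
ranked = sorted(genre_totals.items(), key=lambda item: item[1], reverse=True)

print(average_gross)
for genre, total in ranked:
    print(genre, genre_counts[genre], total)
```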

  16. Data from: MCnebula: Critical Chemical Classes for the Classification and...

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    • +1more
    xlsx
    Updated Jun 14, 2023
    Lichuang Huang; Qiyuan Shan; Qiang Lyu; Shuosheng Zhang; Lu Wang; Gang Cao (2023). MCnebula: Critical Chemical Classes for the Classification and Boost Identification by Visualization for Untargeted LC–MS/MS Data Analysis [Dataset]. http://doi.org/10.1021/acs.analchem.3c01072.s005
    Available download formats: xlsx
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    ACS Publications
    Authors
    Lichuang Huang; Qiyuan Shan; Qiang Lyu; Shuosheng Zhang; Lu Wang; Gang Cao
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Untargeted mass spectrometry is a robust tool for biology, but it usually requires a large amount of time on data analysis, especially for system biology. A framework called Multiple-Chemical nebula (MCnebula) was developed herein to facilitate the LC–MS data analysis process by focusing on critical chemical classes and visualization in multiple dimensions. This framework consists of three vital steps as follows: (1) abundance-based classes (ABC) selection algorithm, (2) critical chemical classes to classify “features” (corresponding to compounds), and (3) visualization as multiple Child-Nebulae (network graph) with annotation, chemical classification, and structure. Notably, MCnebula can be used to explore the classification and structural characteristic of unknown compounds beyond the limit of the spectral library. Moreover, it is intuitive and convenient for pathway analysis and biomarker discovery because of its function of ABC selection and visualization. MCnebula was implemented in the R language. A series of tools in R packages were provided to facilitate downstream analysis in an MCnebula-featured way, including feature selection, homology tracing of top features, pathway enrichment analysis, heat map clustering analysis, spectral visualization analysis, chemical information query, and output analysis reports. The broad utility of MCnebula was illustrated by a human-derived serum data set for metabolomics analysis. The results indicated that “Acyl carnitines” were screened out by tracing structural classes of biomarkers, which was consistent with the reference. A plant-derived data set was investigated to achieve a rapid annotation and discovery of compounds in E. ulmoides.

  17. Market Analysis for PROCESS CONTROLLER MODEL

    • partassist.com
    Updated Nov 21, 2025
    (2025). Market Analysis for PROCESS CONTROLLER MODEL [Dataset]. https://partassist.com/tag/112732/process-controller-model
    Dataset updated
    Nov 21, 2025
    Variables measured
    Countries, Price Range, Median Price, Average Price, Sold Listings, Total Listings, Active Listings, Unsold Listings, Number of Sellers, Sell-Through Rate
    Description

    Comprehensive market data and analytics for PROCESS CONTROLLER MODEL including pricing distribution, seller metrics, and market trends.

  18. Process Data Historian Software Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jan 13, 2025
    Data Insights Market (2025). Process Data Historian Software Report [Dataset]. https://www.datainsightsmarket.com/reports/process-data-historian-software-1981775
    Available download formats: pdf, ppt, doc
    Dataset updated
    Jan 13, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The size of the Process Data Historian Software market was valued at USD XXX million in 2023 and is projected to reach USD XXX million by 2032, with an expected CAGR of XX% during the forecast period.

  19. Process Twin Technology Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Feb 4, 2025
    Archive Market Research (2025). Process Twin Technology Report [Dataset]. https://www.archivemarketresearch.com/reports/process-twin-technology-13364
    Available download formats: pdf, ppt, doc
    Dataset updated
    Feb 4, 2025
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The market for process twin technology is expanding rapidly, with a projected Compound Annual Growth Rate (CAGR) of XX% from 2025 to 2033. This growth is being driven by the increasing demand for digital transformation and optimization across various industries, including manufacturing, construction, and automotive. Process twin technology provides real-time insights into complex industrial processes, enabling businesses to improve efficiency, reduce costs, and increase productivity. The market is highly competitive, with a number of established players such as Emerson Electric, IBM, and GE Digital. Key trends in the market include the growth of cloud-based solutions, the integration of artificial intelligence (AI) and machine learning (ML), and the increasing adoption of digital twins across multiple industries. Despite the high growth potential, the market is also facing some challenges, such as concerns about data security and the need for specialized expertise to implement and manage process twin technologies. Overall, the market for process twin technology is expected to continue its rapid growth in the coming years, as businesses seek innovative solutions to optimize their operations and drive growth.

  20. Big Data Services Market Analysis, Size, and Forecast 2025-2029: North...

    • technavio.com
    pdf
    Updated Feb 12, 2025
    Technavio (2025). Big Data Services Market Analysis, Size, and Forecast 2025-2029: North America (Mexico), Europe (France, Germany, Italy, and UK), Middle East and Africa (UAE), APAC (Australia, China, India, Japan, and South Korea), South America (Brazil), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/big-data-services-market-industry-analysis
    Available download formats: pdf
    Dataset updated
    Feb 12, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    License

    https://www.technavio.com/content/privacy-notice

    Time period covered
    2025 - 2029
    Description


    Big Data Services Market Size 2025-2029

    The big data services market size is forecast to increase by USD 604.2 billion, at a CAGR of 54.4% between 2024 and 2029.
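    The two figures in this sentence are linked by the standard compound-growth formula, end = start * (1 + CAGR) ** years. The sketch below works through that arithmetic. It assumes the 54.4% rate compounds over five annual periods from 2024 to 2029 and that the USD 604.2 billion increase is measured over the same span. The listing does not state the 2024 base, so the implied value computed here is a derived illustration, not a reported figure.

```python
def growth_factor(cagr, years):
    """Total multiplier after compounding at `cagr` for `years` periods."""
    return (1.0 + cagr) ** years


def implied_base(increase, cagr, years):
    """Starting value implied by an absolute increase and a CAGR.

    From end = start * factor and increase = end - start:
    start = increase / (factor - 1).
    """
    return increase / (growth_factor(cagr, years) - 1.0)


CAGR = 0.544          # 54.4%, from the listing
YEARS = 5             # 2024 to 2029, assumed to be five annual periods
INCREASE_BN = 604.2   # USD billion, from the listing

factor = growth_factor(CAGR, YEARS)
base_2024_bn = implied_base(INCREASE_BN, CAGR, YEARS)
end_2029_bn = base_2024_bn * factor

print(f"growth factor: {factor:.2f}")
print(f"implied 2024 base (USD bn): {base_2024_bn:.1f}")
print(f"implied 2029 size (USD bn): {end_2029_bn:.1f}")
```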

    The market is experiencing significant growth, driven by the increasing adoption of big data in various industries, particularly in blockchain technology. The ability to process and analyze vast amounts of data in real-time is revolutionizing business operations and decision-making processes. However, this market is not without challenges. One of the most pressing issues is the need to cater to diverse client requirements, each with unique data needs and expectations. This necessitates customized solutions and a deep understanding of various industries and their data requirements. Additionally, ensuring data security and privacy in an increasingly interconnected world poses a significant challenge. Companies must navigate these obstacles while maintaining compliance with regulations and adhering to ethical data handling practices. To capitalize on the opportunities presented by the market, organizations must focus on developing innovative solutions that address these challenges while delivering value to their clients. By staying abreast of industry trends and investing in advanced technologies, they can effectively meet client demands and differentiate themselves in a competitive landscape.

    What will be the Size of the Big Data Services Market during the forecast period?

    Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.

    The market continues to evolve, driven by the ever-increasing volume, velocity, and variety of data being generated across various sectors. Data extraction is a crucial component of this dynamic landscape, enabling entities to derive valuable insights from their data. Human resource management, for instance, benefits from data-driven decision making, operational efficiency, and data enrichment. Batch processing and data integration are essential for data warehousing and data pipeline management. Data governance and data federation ensure data accessibility, quality, and security. Data lineage and data monetization facilitate data sharing and collaboration, while data discovery and data mining uncover hidden patterns and trends. Real-time analytics and risk management provide operational agility and help mitigate potential threats. Machine learning and deep learning algorithms enable predictive analytics, enhancing business intelligence and customer insights. Data visualization and data transformation facilitate data usability and data loading into NoSQL databases. Government analytics, financial services analytics, supply chain optimization, and manufacturing analytics are just a few applications of big data services. Cloud computing and data streaming further expand the market's reach and capabilities. Data literacy and data collaboration are essential for effective data usage and collaboration. Data security and data cleansing are ongoing concerns, with the market continuously evolving to address these challenges. The integration of natural language processing, computer vision, and fraud detection further enhances the value proposition of big data services. The market's continuous dynamism underscores the importance of data cataloging, metadata management, and data modeling for effective data management and optimization.

    How is this Big Data Services Industry segmented?

    The big data services industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    • Component: Solution, Services
    • End-user: BFSI, Telecom, Retail, Others
    • Type: Data storage and management, Data analytics and visualization, Consulting services, Implementation and integration services, Support and maintenance services
    • Sector: Large enterprises, Small and medium enterprises (SMEs)
    • Geography: North America (US, Mexico); Europe (France, Germany, Italy, UK); Middle East and Africa (UAE); APAC (Australia, China, India, Japan, South Korea); South America (Brazil); Rest of World (ROW)

    By Component Insights

    The solution segment is estimated to witness significant growth during the forecast period.

    Big data services have become indispensable for businesses seeking operational efficiency and customer insight. The vast expanse of structured and unstructured data presents an opportunity for organizations to analyze consumer behaviors across multiple channels. Big data solutions facilitate the integration and processing of data from various sources, enabling businesses to gain a deeper understanding of customer sentiment towards their products or services. Data governance ensures data quality and security, while data federation and data lineage provide transparency and traceability. Artificial intelligence and machine learning algo

Jason Porzelius (2023). Google Certificate BellaBeats Capstone Project [Dataset]. https://www.kaggle.com/datasets/jasonporzelius/google-certificate-bellabeats-capstone-project

Google Certificate BellaBeats Capstone Project

Available download formats: zip (169161 bytes)
Dataset updated
Jan 5, 2023
Authors
Jason Porzelius
Description

Introduction: I have chosen to complete a data analysis project for the second course option, Bellabeats, Inc., using a locally hosted database program, Excel for both my data analysis and visualizations. This choice was made primarily because I live in a remote area and have limited bandwidth and inconsistent internet access. Therefore, completing a capstone project using web-based programs such as R Studio, SQL Workbench, or Google Sheets was not a feasible choice. I was further limited in which option to choose as the datasets for the ride-share project option were larger than my version of Excel would accept. In the scenario provided, I will be acting as a Junior Data Analyst in support of the Bellabeats, Inc. executive team and data analytics team. This combined team has decided to use an existing public dataset in hopes that the findings from that dataset might reveal insights which will assist in Bellabeat's marketing strategies for future growth. My task is to provide data driven insights to business tasks provided by the Bellabeats, Inc.'s executive and data analysis team. In order to accomplish this task, I will complete all parts of the Data Analysis Process (Ask, Prepare, Process, Analyze, Share, Act). In addition, I will break each part of the Data Analysis Process down into three sections to provide clarity and accountability. Those three sections are: Guiding Questions, Key Tasks, and Deliverables. For the sake of space and to avoid repetition, I will record the deliverables for each Key Task directly under the numbered Key Task using an asterisk (*) as an identifier.

Section 1 - Ask:

A. Guiding Questions:
1. Who are the key stakeholders and what are their goals for the data analysis project?
2. What is the business task that this data analysis project is attempting to solve?

B. Key Tasks:

1. Identify key stakeholders and their goals for the data analysis project.
*The key stakeholders for this project are as follows:
-Urška Sršen and Sando Mur - co-founders of Bellabeats, Inc.
-Bellabeats marketing analytics team. I am a member of this team.

2. Identify the business task.
*The business task is:
-As provided by co-founder Urška Sršen, the business task for this project is to gain insight into how consumers are using their non-BellaBeats smart devices in order to guide upcoming marketing strategies for the company which will help drive future growth. Specifically, the researcher was tasked with applying insights driven by the data analysis process to 1 BellaBeats product and presenting those insights to BellaBeats stakeholders.

Section 2 - Prepare:

A. Guiding Questions:
1. Where is the data stored and organized?
2. Are there any problems with the data?
3. How does the data help answer the business question?

B. Key Tasks:

1. Research and communicate the source of the data, and how it is stored/organized, to stakeholders.
*The data source used for our case study is FitBit Fitness Tracker Data. This dataset is stored on Kaggle and was made available by user Mobius in an open-source format. Therefore, the data is public and available to be copied, modified, and distributed, all without asking the user for permission. These datasets were generated by respondents to a survey distributed via Amazon Mechanical Turk, reportedly (see credibility section directly below) between 03/12/2016 and 05/12/2016.
*Reportedly (see credibility section directly below), thirty eligible Fitbit users consented to the submission of personal tracker data, including output related to steps taken, calories burned, time spent sleeping, heart rate, and distance traveled. This data was broken down into minute, hour, and day level totals. This data is stored in 18 CSV documents. I downloaded all 18 documents onto my local laptop and decided to use 2 of them for this project, as they were files that merged activity and sleep data from the other documents. All unused documents were permanently deleted from the laptop. The 2 files used were:
-sleepDay_merged.csv
-dailyActivity_merged.csv

2. Identify and communicate to stakeholders any problems found with the data related to credibility and bias.
*As will be more specifically presented in the Process section, the data seems to have credibility issues related to the reported time frame of the data collected. The metadata seems to indicate that the data collected covered roughly 2 months of FitBit tracking. However, upon my initial data processing, I found that only 1 month of data was reported.
*As will be more specifically presented in the Process section, the data has credibility issues related to the number of individuals who reported FitBit data. Specifically, the metadata communicates that 30 individual users agreed to report their tracking data. My initial data processing uncovered 33 individual ...
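The two credibility checks described above (how many distinct users appear, and what date range the records cover) were done by the author in Excel. A minimal Python sketch of the same checks follows. The column names Id and ActivityDate, the month/day/year date format, and the four sample rows are assumptions made for illustration and are not taken from the source.

```python
import csv
import io
from datetime import datetime

# Tiny stand-in for dailyActivity_merged.csv; rows and columns are assumed.
SAMPLE = """Id,ActivityDate,TotalSteps
1001,4/12/2016,12000
1001,4/13/2016,9000
1002,4/12/2016,4000
1003,5/12/2016,7000
"""


def summarize(csv_file, id_col="Id", date_col="ActivityDate", date_fmt="%m/%d/%Y"):
    """Count distinct users and report the first and last recorded day."""
    user_ids = set()
    days = []
    for row in csv.DictReader(csv_file):
        user_ids.add(row[id_col])
        days.append(datetime.strptime(row[date_col], date_fmt).date())
    first_day, last_day = min(days), max(days)
    return {
        "users": len(user_ids),
        "first_day": first_day,
        "last_day": last_day,
        "span_days": (last_day - first_day).days + 1,  # inclusive span
    }


summary = summarize(io.StringIO(SAMPLE))
print(summary)
```

With the real file, passing an open file handle instead of the in-memory sample would let the user count and date span be compared against the 30 users and roughly 2 months stated in the metadata.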
