This dataset is a practical SQL case study designed for learners who are looking to enhance their SQL skills in analyzing sales, products, and marketing data. It contains several SQL queries related to a simulated business database for product sales, marketing expenses, and location data. The database consists of three main tables: Fact, Product, and Location.
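As a rough illustration of the kind of query such a three-table database supports, the sketch below builds a tiny in-memory SQLite database and totals sales and marketing spend per product. The description names only the tables (Fact, Product, Location); every column name and row here is an assumption made up for the example, not the dataset's actual schema.

```python
import sqlite3

# Hypothetical schema: only the table names come from the description;
# all columns and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product  (ProductId INTEGER PRIMARY KEY, ProductName TEXT);
    CREATE TABLE Location (AreaCode INTEGER PRIMARY KEY, State TEXT);
    CREATE TABLE Fact (
        ProductId INTEGER REFERENCES Product(ProductId),
        AreaCode  INTEGER REFERENCES Location(AreaCode),
        Sales     REAL,
        Marketing REAL
    );
    INSERT INTO Product  VALUES (1, 'Coffee'), (2, 'Tea');
    INSERT INTO Location VALUES (203, 'Connecticut'), (206, 'Washington');
    INSERT INTO Fact VALUES (1, 203, 100.0, 20.0),
                            (1, 206, 150.0, 30.0),
                            (2, 203,  80.0, 10.0);
""")


def sales_by_product(connection):
    """Return (product, total sales, total marketing), highest sales first."""
    query = """
        SELECT p.ProductName,
               SUM(f.Sales)     AS total_sales,
               SUM(f.Marketing) AS total_marketing
        FROM Fact AS f
        JOIN Product AS p ON p.ProductId = f.ProductId
        GROUP BY p.ProductName
        ORDER BY total_sales DESC
    """
    return connection.execute(query).fetchall()


rows = sales_by_product(conn)
```

Joining Fact to Location on the assumed `AreaCode` column in the same way would give per-state totals.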
Objective of the Case Study: The purpose of this case study is to provide learners with a variety of practical SQL exercises that involve real-world business problems. The queries explore topics such as:
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Analyzing HR Data for Improved Workforce Management: A Case Study
INTRODUCTION
HR analytics, also known as people analytics, is a data-driven approach to managing human resources. It involves gathering and analyzing data related to employees, such as recruitment, performance, engagement, and retention, to derive insights and make informed decisions. This case study explores the application of HR analytics in a hypothetical organization and showcases its benefits in optimizing workforce management.
CASE STUDY OVERVIEW
Organization Description: Let's consider a medium-sized technology company called "TechSolutions Inc." The company specializes in software development and has a diverse workforce across different departments, including engineering, marketing, sales, and customer support.
Objectives: The main objectives of this case study are as follows:
1. Understand the factors influencing employee attrition and job satisfaction.
2. Identify key predictors of employee performance.
3. Develop strategies to improve employee engagement and retention.
DATA COLLECTION AND ANALYSIS
Data Sources: To conduct HR analytics, the following data sources can be utilized:
1. HRIS (Human Resource Information System): Employee demographic information, employment history, and compensation details.
2. Performance Management System: Employee performance ratings, goals, and achievements.
3. Employee Surveys: Feedback on job satisfaction, work-life balance, and engagement.
4. Exit Interviews: Reasons for employee departures and feedback on their experiences.
Data Analysis Steps:
1. Data Preprocessing: Clean and prepare the collected data, handle missing values, and ensure data quality.
2. Attrition Analysis: Analyze historical data to understand factors contributing to employee attrition, such as department, job level, salary, tenure, performance ratings, and employee demographics.
3. Job Satisfaction Analysis: Explore survey data to identify key drivers of job satisfaction, including work environment, career growth opportunities, compensation, and employee benefits.
4. Performance Prediction: Utilize machine learning techniques, such as regression or classification models, to identify predictors of employee performance based on historical performance data, employee characteristics, and other relevant variables.
5. Employee Engagement Analysis: Analyze survey data and feedback to assess employee engagement levels and identify areas of improvement, such as communication, recognition programs, or training opportunities.
6. Actionable Insights: Derive actionable insights from the analysis results to develop targeted strategies for improving employee retention, job satisfaction, and performance.
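To make the attrition analysis step more concrete, here is a minimal sketch that computes the share of leavers per department. The records, field names, and values are hypothetical, invented for illustration; a real analysis would draw on the HRIS and exit-interview sources listed above and would likely use a dataframe library.

```python
from collections import defaultdict

# Hypothetical employee records: field names and values are invented
# for illustration only.
employees = [
    {"department": "engineering", "left": True},
    {"department": "engineering", "left": False},
    {"department": "engineering", "left": False},
    {"department": "sales", "left": True},
    {"department": "sales", "left": True},
    {"department": "sales", "left": False},
    {"department": "marketing", "left": False},
    {"department": "marketing", "left": False},
    {"department": None, "left": True},  # missing value, dropped below
]


def attrition_rate_by(records, key):
    """Share of employees who left, grouped by the given field.

    Records whose grouping field is missing (None) are skipped, a simple
    stand-in for the preprocessing step.
    """
    totals = defaultdict(int)
    leavers = defaultdict(int)
    for record in records:
        group = record.get(key)
        if group is None:
            continue
        totals[group] += 1
        leavers[group] += int(record["left"])
    return {group: leavers[group] / totals[group] for group in totals}


rates = attrition_rate_by(employees, "department")
```

The same function could be called with another field, such as a job level or tenure band, if those were present in the records.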
RESULTS AND RECOMMENDATIONS
Based on the analysis conducted in the previous steps, let's assume the following findings and corresponding recommendations:
Attrition Analysis:
Job Satisfaction Analysis:
Performance Prediction:
Employee Engagement Analysis:
By implementing these recommendations, TechSolutions Inc. can enhance employee satisfaction, engagement, and retention, leading to a more productive and motivated workforce.
https://creativecommons.org/publicdomain/zero/1.0/
This table summarizes the insights from my first data analyst project, a Google Data Analytics Professional Certificate Programme case study.
It has nearly 5M rows and 20 columns.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Thorough knowledge of the structure of the analyzed data makes it possible to form detailed scientific hypotheses and research questions. The structure of data can be revealed with methods for exploratory data analysis. Due to the multitude of available methods, selecting those which will work together well and facilitate data interpretation is not an easy task. In this work we present a well-fitted set of tools for a complete exploratory analysis of a clinical dataset and perform a case study analysis on a set of 515 patients. The proposed procedure comprises several steps: 1) robust data normalization, 2) outlier detection with Mahalanobis (MD) and robust Mahalanobis distances (rMD), 3) hierarchical clustering with Ward’s algorithm, 4) Principal Component Analysis with biplot vectors. The analyzed set comprised elderly patients who participated in the PolSenior project. Each patient was characterized by over 40 biochemical and socio-geographical attributes. Introductory analysis showed that the case-study dataset comprises two clusters separated along the axis of sex hormone attributes. Further analysis was carried out separately for male and female patients. The optimal partitioning of the male set resulted in five subgroups. Two of them were related to diseased patients: 1) diabetes and 2) hypogonadism patients. Analysis of the female set suggested that it was more homogeneous than the male dataset. No evidence of pathological patient subgroups was found. In the study we showed that outlier detection with MD and rMD not only identifies outliers but can also assess the heterogeneity of a dataset. The case study showed that our procedure is well suited for the identification and visualization of biologically meaningful patient subgroups.
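The outlier-detection step can be sketched as follows for the two-attribute case. This is an illustrative implementation of the classical Mahalanobis distance only, under the assumption of invented sample points; it is not the authors' code, and the robust variant (rMD) would replace the mean and covariance below with robust estimates.

```python
import math


def mahalanobis_2d(points):
    """Mahalanobis distance of each 2-D point from the sample mean.

    Uses the classical (non-robust) sample mean and covariance.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]], n - 1 denominator.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, expanded for the 2x2 case.
        d_squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(d_squared))
    return distances


# Invented sample: five points near a line plus one point far off it.
sample = [(1.0, 2.0), (2.0, 3.1), (3.0, 3.9), (4.0, 5.2), (5.0, 6.0), (3.0, 9.0)]
distances = mahalanobis_2d(sample)
most_extreme = max(range(len(sample)), key=lambda i: distances[i])
```

Because the distance accounts for the correlation between the two attributes, the last point stands out even though its x value equals the sample mean.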
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains financial transaction records, including revenue and expenses, over a specified period. It is designed for data analysis and visualization tasks, providing insights into financial performance and trends.
Key features include:
* Transaction Details: Includes transaction ID, date, category (revenue or expense), and amount in USD.
* Payment Methods: Tracks different payment channels like credit cards and bank transfers.
* Remarks: Additional context for each transaction, such as "Office Supplies" or "Quarterly Sales."
This dataset is ideal for practicing data cleaning, exploratory data analysis, and visualization. It supports applications like trend analysis, category comparison, and payment method distributions, making it a great resource for aspiring data analysts.
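A trend analysis on data of this shape might start with something like the sketch below, which nets revenue against expenses per month. The rows and dictionary keys are hypothetical, modelled loosely on the fields described above rather than taken from the dataset itself.

```python
from collections import defaultdict

# Hypothetical rows shaped like the fields the description lists; the keys
# and values are assumptions for illustration.
transactions = [
    {"id": 1, "date": "2024-01-05", "category": "revenue", "amount": 1200.0,
     "payment_method": "bank transfer", "remarks": "Quarterly Sales"},
    {"id": 2, "date": "2024-01-12", "category": "expense", "amount": 300.0,
     "payment_method": "credit card", "remarks": "Office Supplies"},
    {"id": 3, "date": "2024-02-03", "category": "revenue", "amount": 900.0,
     "payment_method": "bank transfer", "remarks": "Quarterly Sales"},
    {"id": 4, "date": "2024-02-20", "category": "expense", "amount": 450.0,
     "payment_method": "credit card", "remarks": "Office Supplies"},
    {"id": 5, "date": "2024-02-25", "category": "expense", "amount": 50.0,
     "payment_method": "credit card", "remarks": "Office Supplies"},
]


def monthly_net(rows):
    """Net amount (revenue minus expenses) per YYYY-MM month."""
    net = defaultdict(float)
    for row in rows:
        month = row["date"][:7]  # "YYYY-MM" prefix of an ISO date string
        sign = 1.0 if row["category"] == "revenue" else -1.0
        net[month] += sign * row["amount"]
    return dict(sorted(net.items()))


net_by_month = monthly_net(transactions)
```

Grouping on `payment_method` instead of the month would give the payment method distribution mentioned above.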
Welcome to the Cyclistic bike-share analysis case study! In this case study, you will perform many real-world tasks of a junior data analyst. You will work for a fictional company, Cyclistic, and meet different characters and team members. In order to answer the key business questions, you will follow the steps of the data analysis process: ask, prepare, process, analyze, share, and act. Along the way, the Case Study Roadmap tables, including guiding questions and key tasks, will help you stay on the right path. By the end of this lesson, you will have a portfolio-ready case study. Download the packet and reference the details of this case study anytime. Then, when you begin your job hunt, your case study will be a tangible way to demonstrate your knowledge and skills to potential employers.
You are a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, your team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, your team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve your recommendations, so they must be backed up with compelling data insights and professional data visualizations.
Characters and teams
● Cyclistic: A bike-share program that features more than 5,800 bicycles and 600 docking stations. Cyclistic sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use the assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use them to commute to work each day.
● Lily Moreno: The director of marketing and your manager. Moreno is responsible for the development of campaigns and initiatives to promote the bike-share program. These may include email, social media, and other channels.
● Cyclistic marketing analytics team: A team of data analysts who are responsible for collecting, analyzing, and reporting data that helps guide Cyclistic marketing strategy. You joined this team six months ago and have been busy learning about Cyclistic’s mission and business goals, as well as how you, as a junior data analyst, can help Cyclistic achieve them.
● Cyclistic executive team: The notoriously detail-oriented executive team will decide whether to approve the recommended marketing program.
In 2016, Cyclistic launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system anytime. Until now, Cyclistic’s marketing strategy relied on building general awareness and appealing to broad consumer segments. One approach that helped make these things possible was the flexibility of its pricing plans: single-ride passes, full-day passes, and annual memberships. Customers who purchase single-ride or full-day passes are referred to as casual riders. Customers who purchase annual memberships are Cyclistic members. Cyclistic’s finance analysts have concluded that annual members are much more profitable than casual riders. Although the pricing flexibility helps Cyclistic attract more customers, Moreno believes that maximizing the number of annual members will be key to future growth. Rather than creating a marketing campaign that targets all-new customers, Moreno believes there is a very good chance to convert casual riders into members. She notes that casual riders are already aware of the Cyclistic program and have chosen Cyclistic for their mobility needs. Moreno has set a clear goal: Design marketing strategies aimed at converting casual riders into annual members. In order to do that, however, the marketing analyst team needs to better understand how annual members and casual riders differ, why casual riders would buy a membership, and how digital media could affect their marketing tactics. Moreno and her team are interested in analyzing the Cyclistic historical bike trip data to identify trends
How do annual members and casual riders use Cyclistic bikes differently? Why would casual riders buy Cyclistic annual memberships? How can Cyclistic use digital media to influence casual riders to become members? Moreno has assigned you the first question to answer: How do annual members and casual rid...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
As high-throughput methods become more common, training undergraduates to analyze data must include having them generate informative summaries of large datasets. This flexible case study provides an opportunity for undergraduate students to become familiar with the capabilities of R programming in the context of high-throughput evolutionary data collected using macroarrays. The story line introduces a recent graduate hired at a biotech firm and tasked with analysis and visualization of changes in gene expression from 20,000 generations of the Lenski Lab’s Long-Term Evolution Experiment (LTEE). Our main character is not familiar with R and is guided by a coworker to learn about this platform. Initially this involves a step-by-step analysis of the small Iris dataset built into R which includes sepal and petal length of three species of irises. Practice calculating summary statistics and correlations, and making histograms and scatter plots, prepares the protagonist to perform similar analyses with the LTEE dataset. In the LTEE module, students analyze gene expression data from the long-term evolutionary experiments, developing their skills in manipulating and interpreting large scientific datasets through visualizations and statistical analysis. Prerequisite knowledge is basic statistics, the Central Dogma, and basic evolutionary principles. The Iris module provides hands-on experience using R programming to explore and visualize a simple dataset; it can be used independently as an introduction to R for biological data or skipped if students already have some experience with R. Both modules emphasize understanding the utility of R, rather than creation of original code. Pilot testing showed the case study was well-received by students and faculty, who described it as a clear introduction to R and appreciated the value of R for visualizing and analyzing large datasets.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Database for the article: Data analytics and Artificial Neural Network framework to profile academic success: Case Study of Leaders of Tomorrow Program
https://creativecommons.org/publicdomain/zero/1.0/
Welcome to the Cyclistic bike-share analysis case study! In this case study, you will perform many real-world tasks of a junior data analyst. You will work for a fictional company, Cyclistic, and meet different characters and team members.
You are a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, your team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, your team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve your recommendations, so they must be backed up with compelling data insights and professional data visualizations.
● Cyclistic: A bike-share program that features more than 5,800 bicycles and 600 docking stations. Cyclistic sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use the assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use them to commute to work each day.
● Lily Moreno: The director of marketing and your manager. Moreno is responsible for the development of campaigns and initiatives to promote the bike-share program. These may include email, social media, and other channels.
● Cyclistic marketing analytics team: A team of data analysts who are responsible for collecting, analyzing, and reporting data that helps guide Cyclistic marketing strategy. You joined this team six months ago and have been busy learning about Cyclistic’s mission and business goals, as well as how you, as a junior data analyst, can help Cyclistic achieve them.
● Cyclistic executive team: The notoriously detail-oriented executive team will decide whether to approve the recommended marketing program.
The data has been made available by Motivate International Inc. under this license.
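A first pass at the scenario's central comparison, how annual members and casual riders use the bikes differently, could look like the sketch below, which averages ride duration per rider type. The trip records and field names are assumptions invented for illustration; they are not the schema of the actual trip data.

```python
from datetime import datetime
from statistics import mean

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# Hypothetical trip records: field names and values are invented.
trips = [
    {"rider_type": "member", "started_at": "2023-06-01 08:00:00",
     "ended_at": "2023-06-01 08:12:00"},
    {"rider_type": "member", "started_at": "2023-06-02 17:30:00",
     "ended_at": "2023-06-02 17:40:00"},
    {"rider_type": "casual", "started_at": "2023-06-03 14:00:00",
     "ended_at": "2023-06-03 14:45:00"},
    {"rider_type": "casual", "started_at": "2023-06-04 11:00:00",
     "ended_at": "2023-06-04 11:35:00"},
]


def ride_minutes(trip):
    """Duration of one trip in minutes."""
    start = datetime.strptime(trip["started_at"], TIME_FORMAT)
    end = datetime.strptime(trip["ended_at"], TIME_FORMAT)
    return (end - start).total_seconds() / 60


def mean_ride_minutes_by_type(records):
    """Average ride duration in minutes for each rider type."""
    durations = {}
    for trip in records:
        durations.setdefault(trip["rider_type"], []).append(ride_minutes(trip))
    return {rider: mean(values) for rider, values in durations.items()}


averages = mean_ride_minutes_by_type(trips)
```

Other cuts, such as rides per weekday or per start station, would follow the same group-then-summarize pattern.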
One of the challenges of teaching scientific courses is helping students understand research methods, biological models, and data analysis, which can be especially difficult in classes without a laboratory component. Within the field of toxicology, it is also important for students to understand how living organisms are affected by exposure to toxicants and how these toxicants can impact the ecosystem. Resources focusing on active learning pedagogy are scarce in the field of toxicology compared to other disciplines. In this activity, upper-level students in an introductory toxicology course learn to interpret data from primary literature, draw conclusions about how toxicants, specifically metals, can impact susceptible populations, and understand the One Environmental Health approach. Students work in small groups to answer questions concerning data from a paper and then share their responses with the entire class, building their communication skills. The instructor serves as a moderator, allowing the students to work through concepts, intervening only when necessary. This approach enables a deeper level of understanding of content and allows the students to engage actively in the learning process. As such, students think critically through relevant problems and find connections to the real world. This lesson can be adapted for several levels of students and could be modified depending on the objectives of the course.
Primary Image: One Environmental Health Approach in the Gulf of Maine. Representation of the movement of chemicals through the ecosystem and into humans which illustrates the basic principles of the One Environmental Health Approach.
The summary from the detailed analysis of the case study in EPA (1988b) is provided in Table 3 of the manuscript, and was used as the data source for the two datasets used in this study. These include a flat and a hierarchical structure of the five balancing criteria, shown in Table 4 and Table 5, respectively. Table 4 provides a comprehensive score for each balancing criterion, similar to the summary tables presented in the FS of Superfund sites (e.g., EPA 2016b; AECOM 2019). Table 5 uses the same information as Table 3, but in this case each piece of information is used to define multiple sub-criteria for each balancing criterion except cost. This leads to a much more elaborate information table in which the four remaining balancing criteria are characterized by 13 sub-criteria. It is important to note that the scores provided in Table 4 and Table 5, with the exception of cost (c_5), were derived from the author’s interpretation of the descriptive language of the detailed analysis for the hypothetical case study presented in Table A-7 in Appendix A of the EPA (1988b) guidance document. It should be noted that the analysis of the three remedy alternatives presented in this hypothetical case study is governed by site-specific characteristics and may not represent the potential performance of these remediation alternatives at other sites. The intent of this exercise is to illustrate the flexibility and adaptability of the MCDA process to address both the main, overarching criteria and sub-criteria that may have specific importance in the decision process for a particular site. Ultimately, the sub-criteria can be adapted to address specific stakeholder perspectives or technical factors that may be linked to properties unique to the contaminant or physical characteristics of the site.
This dataset is associated with the following publication: Cinelli, M., M.A. Gonzalez, R. Ford, J. McKernan, S. Corrente, M. Kadziński, and R. Słowiński. Supporting contaminated sites management with Multiple Criteria Decision Analysis: Demonstration of a regulation-consistent approach. JOURNAL OF CLEANER PRODUCTION. Elsevier Science Ltd, New York, NY, USA, 316: 128347, (2021).
https://entrepot.recherche.data.gouv.fr/api/datasets/:persistentId/versions/3.0/customlicense?persistentId=doi:10.15454/YNMQUY
This dataset comes from the public repository TCGA (https://portal.gdc.cancer.gov/) and contains several files, each corresponding to a given omic on the same individuals with breast cancer. Raw data were obtained from the mixOmics case study described in http://mixomics.org/mixdiablo/case-study-tcga/ [link accessed on August 18, 2021] and were made available by the package authors at http://mixomics.org/wp-content/uploads/2016/08/TCGA.normalised.mixDIABLO.RData_.zip (R data format). Data in the zip file had been normalised for technical biases by the package authors. Data from the train and test sets were exported as TXT/CSV files and completed with miRNA expression on the same individuals and with toy datasets to handle missing value cases and the like. They serve as a basis for the illustration of the web data analysis tool ASTERICS (Project 20008788 funded by Région Occitanie).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT Companies are encouraged by the big data trend to experiment with advanced analytics, and many turn to specialist consultancies to help them get started where they lack the necessary competences. We investigate the program of one such consultancy, Advectas, and in particular its advanced analytics Jumpstart. Using qualitative techniques including semi-structured interviews and content analysis, we investigate the nature and value of the Jumpstart concept through five cases in different companies. We provide a definition, a process model and a set of thirteen best practices derived from these experiences, and discuss the distinctive qualities of this approach.
Introduction
After completing my Google Data Analytics Professional Certificate on Coursera, I completed a capstone project, recommended by Google, to improve and highlight my technical data analysis skills, such as R programming, SQL, and Tableau. In the Cyclistic Case Study, I performed many real-world tasks of a junior data analyst. To answer the critical business questions, I followed the steps of the data analysis process: ask, prepare, process, analyze, share, and act.
Scenario
You are a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, your team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, your team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve your recommendations, so they must be backed up with compelling data insights and professional data visualizations.
Characters and teams
Cyclistic: A bike-share program that has grown to a fleet of 5,824 bicycles that are tracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system at any time. Cyclistic sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use them to commute to work each day.
Stakeholders
Lily Moreno: The director of marketing and your manager. Moreno is responsible for the development of campaigns and initiatives to promote the bike-share program. These may include email, social media, and other channels.
Cyclistic marketing analytics team: A team of data analysts responsible for collecting, analyzing, and reporting data that helps guide Cyclistic marketing strategy. You joined this team six months ago and have been busy learning about Cyclistic’s mission and business goals and how you, as a junior data analyst, can help Cyclistic achieve them.
Cyclistic executive team: The notoriously detail-oriented executive team will decide whether to approve the recommended marketing program.
The KILVOLC_FlowerKahn2021_1 dataset is the MISR Derived Case Study Data for Kilauea Volcanic Eruptions Including Geometric Plume Height and Qualitative Radiometric Particle Property Information version 1 dataset. It comprises MISR-derived output from a comprehensive analysis of Kilauea volcanic eruptions (2000-2018). Data collection for this dataset is complete. The data presented here are analyzed and discussed in the following paper: Flower, V.J.B., and R.A. Kahn, 2021. Twenty years of NASA-EOS multi-sensor satellite observations at Kīlauea volcano (2000-2019). J. Volc. Geo. Res. (in press). The data is subdivided by date and MISR orbit number. Within each case folder, there are up to 11 files relating to an individual MISR overpass. Files include plume height records (from both the red and blue spectral bands) derived from the MISR INteractive eXplorer (MINX) program, displayed in: map view, downwind profile plot (along with the associated wind vectors retrieved at plume elevation), a histogram of retrieved plume heights and a text file containing the digital plume height values. An additional JPG is included delineating the plume analysis region, start point for assessing downwind distance, and input wind direction used to initialize the MINX retrieval. A final two files are generated from the MISR Research Aerosol (RA) retrieval algorithm (Limbacher, J.A., and R.A. Kahn, 2014. MISR Research-Aerosol-Algorithm: Refinements For Dark Water Retrievals. Atm. Meas. Tech. 7, 1-19, doi:10.5194/amt-7-1-2014). These files include the RA model output in HDF5, and an associated JPG of key derived variables (e.g. Aerosol Optical Depth, Angstrom Exponent, Single Scattering Albedo, Fraction of Non-Spherical components, model uncertainty classifications and example camera views). File numbers per folder vary depending on the retrieval conditions of specific observations. RA plume retrievals are limited when cloud cover is widespread or the solar radiance is insufficient to run the RA. In these cases the RA files are not included in the individual folders. In cases where activity was observed from multiple volcanic zones in a single overpass, individual folders, each containing data relating to a single region, are included and identified by a qualifier (e.g. '_1').
https://www.shibatadb.com/license/data/proprietary/v1.0/license.txt
Yearly citation counts for the publication titled "Wind Data Analysis and a Case Study of Wind Power Generation in Hong Kong".
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data present the results of an analysis of contemporary neighbourhoods. The aim of this part of the research was to analyse existing housing estates in various European cities. The analyses were performed in real time with AI for key factors such as sun hours, daylight potential, noise, wind, and microclimate. These data are available to subsequent researchers and can be used to verify conditions for specific locations.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Patient categorical and nominal attributes.
The use of common operational picture (COP) technology can give law enforcement and its public safety response partners the capacity to develop a shared situational awareness to support effective and timely decision-making. These technologies collate and display information relevant for situational awareness (e.g., the location and what is known about a crime incident, the location and operational status of an agency's patrol units, the duty status of officers). CNA conducted a mixed-methods study including a technical review of COP technologies and their capacities and a set of case studies intended to produce narratives of the COP technology adoption process as well as lessons learned and best practices regarding implementation and use of COP technologies. This study involved four phases over two years: (1) preparation and technology review, (2) qualitative case studies, (3) analysis, and (4) development and dissemination of results. This study produced a market review report describing the results from the technical review, including common technical characteristics and logistical requirements associated with COP technologies and a case study report of law enforcement agencies' adoption and use of COP technologies. This study provides guidance and lessons learned to agencies interested in implementing or revising their use of COP technology. Agencies will be able to identify how they can improve their information sharing and situational awareness capabilities using COP technology, and will be able to refer to the processes used by other, model agencies when undertaking the implementation of COP technology.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Learn how top startups leverage data for competitive advantage. Get the new Audible version of Winning with Data, featuring real SaaS case studies and proven strategies.