This dataset is a practical SQL case study designed for learners who are looking to enhance their SQL skills in analyzing sales, products, and marketing data. It contains several SQL queries related to a simulated business database for product sales, marketing expenses, and location data. The database consists of three main tables: Fact, Product, and Location.
Objective of the Case Study: The purpose of this case study is to provide learners with a variety of practical SQL exercises that involve real-world business problems. The queries explore topics such as:
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Analyzing HR Data for Improved Workforce Management: A Case Study
INTRODUCTION
HR analytics, also known as people analytics, is a data-driven approach to managing human resources. It involves gathering and analyzing data related to employees, such as recruitment, performance, engagement, and retention, to derive insights and make informed decisions. This case study explores the application of HR analytics in a hypothetical organization and showcases its benefits in optimizing workforce management.
CASE STUDY OVERVIEW
Organization Description: Let's consider a medium-sized technology company called "TechSolutions Inc." The company specializes in software development and has a diverse workforce across different departments, including engineering, marketing, sales, and customer support.
Objectives: The main objectives of this case study are as follows:
1. Understand the factors influencing employee attrition and job satisfaction.
2. Identify key predictors of employee performance.
3. Develop strategies to improve employee engagement and retention.
DATA COLLECTION AND ANALYSIS
Data Sources: To conduct HR analytics, the following data sources can be utilized:
1. HRIS (Human Resource Information System): Employee demographic information, employment history, and compensation details.
2. Performance Management System: Employee performance ratings, goals, and achievements.
3. Employee Surveys: Feedback on job satisfaction, work-life balance, and engagement.
4. Exit Interviews: Reasons for employee departures and feedback on their experiences.
Data Analysis Steps:
1. Data Preprocessing: Clean and prepare the collected data, handle missing values, and ensure data quality.
2. Attrition Analysis: Analyze historical data to understand factors contributing to employee attrition, such as department, job level, salary, tenure, performance ratings, and employee demographics.
3. Job Satisfaction Analysis: Explore survey data to identify key drivers of job satisfaction, including work environment, career growth opportunities, compensation, and employee benefits.
4. Performance Prediction: Utilize machine learning techniques, such as regression or classification models, to identify predictors of employee performance based on historical performance data, employee characteristics, and other relevant variables.
5. Employee Engagement Analysis: Analyze survey data and feedback to assess employee engagement levels and identify areas of improvement, such as communication, recognition programs, or training opportunities.
6. Actionable Insights: Derive actionable insights from the analysis results to develop targeted strategies for improving employee retention, job satisfaction, and performance.
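The first two analysis steps (preprocessing and attrition analysis) can be illustrated with a minimal sketch. The records, field names, and the drop-missing-values policy below are assumptions made for illustration; they are not data from the case study.

```python
from collections import defaultdict

# Hypothetical employee records; field names and values are illustrative only.
employees = [
    {"department": "engineering", "tenure_years": 1.5, "left": True},
    {"department": "engineering", "tenure_years": 4.0, "left": False},
    {"department": "engineering", "tenure_years": 6.2, "left": False},
    {"department": "engineering", "tenure_years": 0.8, "left": False},
    {"department": "sales", "tenure_years": 0.9, "left": True},
    {"department": "sales", "tenure_years": 2.1, "left": True},
    {"department": "sales", "tenure_years": 3.0, "left": False},
    {"department": "sales", "tenure_years": None, "left": False},  # missing tenure
]


def preprocess(records):
    """Step 1: drop records with missing values (one simple policy among many)."""
    return [r for r in records if all(v is not None for v in r.values())]


def attrition_rate_by(records, key):
    """Step 2: share of employees who left, grouped by the given field."""
    totals = defaultdict(int)
    leavers = defaultdict(int)
    for r in records:
        totals[r[key]] += 1
        leavers[r[key]] += int(r["left"])
    return {group: leavers[group] / totals[group] for group in totals}


clean = preprocess(employees)
rates = attrition_rate_by(clean, "department")
for department, rate in sorted(rates.items()):
    print(f"{department}: {rate:.0%}")
```

The same grouping function could be called with another field, such as a job level, if the records carried one.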
RESULTS AND RECOMMENDATIONS
Based on the analysis conducted in the previous steps, let's assume the following findings and corresponding recommendations:
Attrition Analysis:
Job Satisfaction Analysis:
Performance Prediction:
Employee Engagement Analysis:
By implementing these recommendations, TechSolutions Inc. can enhance employee satisfaction, engagement, and retention, leading to a more productive and motivated workforce.
https://creativecommons.org/publicdomain/zero/1.0/
This is a summary table of insights from my first data analyst project, a case study for the Google Data Analytics Professional Certificate programme.
It has nearly 5M rows and 20 columns.
Introduction
Welcome to the Cyclistic bike-share analysis case study! In this case study, you will perform many real-world tasks of a junior data analyst. You will work for a fictional company, Cyclistic, and meet different characters and team members. In order to answer the key business questions, you will follow the steps of the data analysis process: ask, prepare, process, analyze, share, and act. Along the way, the Case Study Roadmap tables — including guiding questions and key tasks — will help you stay on the right path.
You are a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, your team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, your team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve your recommendations, so they must be backed up with compelling data insights and professional data visualizations.
Characters and teams
Cyclistic: A bike-share program that features more than 5,800 bicycles and 600 docking stations. Cyclistic sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use the assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use them to commute to work each day.
Lily Moreno: The director of marketing and your manager. Moreno is responsible for the development of campaigns and initiatives to promote the bike-share program. These may include email, social media, and other channels.
Cyclistic marketing analytics team: A team of data analysts who are responsible for collecting, analyzing, and reporting data that helps guide Cyclistic marketing strategy. You joined this team six months ago and have been busy learning about Cyclistic’s mission and business goals — as well as how you, as a junior data analyst, can help Cyclistic achieve them.
Cyclistic executive team: The notoriously detail-oriented executive team will decide whether to approve the recommended marketing program.
ride_id: A distinct identifier assigned to each individual ride.
rideable_type: The type of bike used for the ride.
started_at: The timestamp when the ride began.
ended_at: The timestamp when the ride concluded.
start_station_name: The name of the station where the ride originated.
start_station_id: The unique identifier of the station where the ride originated.
end_station_name: The name of the station where the ride concluded.
end_station_id: The unique identifier of the station where the ride concluded.
start_lat: The latitude of the starting point of the ride.
start_lng: The longitude of the starting point of the ride.
end_lat: The latitude of the ending point of the ride.
end_lng: The longitude of the ending point of the ride.
member_casual: Whether the rider is a member or a casual user.
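As a minimal sketch of how these columns support the member-versus-casual comparison, the snippet below derives ride duration from started_at and ended_at and averages it per member_casual value. The rows and the timestamp format are assumptions for illustration, not values taken from the dataset.

```python
from datetime import datetime
from statistics import mean

# Hypothetical rows using the column names described above; values are made up.
rides = [
    {"ride_id": "A1", "started_at": "2023-06-01 08:00:00",
     "ended_at": "2023-06-01 08:12:00", "member_casual": "member"},
    {"ride_id": "A2", "started_at": "2023-06-01 09:30:00",
     "ended_at": "2023-06-01 09:48:00", "member_casual": "member"},
    {"ride_id": "B1", "started_at": "2023-06-03 14:00:00",
     "ended_at": "2023-06-03 14:40:00", "member_casual": "casual"},
    {"ride_id": "B2", "started_at": "2023-06-04 16:10:00",
     "ended_at": "2023-06-04 16:30:00", "member_casual": "casual"},
]

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format of the timestamp columns


def ride_minutes(ride):
    """Ride duration in minutes, derived from started_at and ended_at."""
    start = datetime.strptime(ride["started_at"], TIMESTAMP_FORMAT)
    end = datetime.strptime(ride["ended_at"], TIMESTAMP_FORMAT)
    return (end - start).total_seconds() / 60


def mean_duration_by_rider_type(rows):
    """Average ride duration for each value of member_casual."""
    durations = {}
    for ride in rows:
        durations.setdefault(ride["member_casual"], []).append(ride_minutes(ride))
    return {rider_type: mean(values) for rider_type, values in durations.items()}


summary = mean_duration_by_rider_type(rides)
print(summary)
```

With nearly 5M rows a dataframe library or SQL would be the more practical choice; plain Python is used here only to keep the sketch self-contained.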
The Poverty Mapping Project: Poverty and Food Security Case Studies data set consists of small area estimates of poverty, inequality, food security and related measures for subnational administrative units in Mexico, Ecuador, Kenya, Malawi, Bangladesh, Sri Lanka, Nigeria and Vietnam. These data come from country-level case studies that examine poverty and food security from a spatial analysis perspective. The data products include shapefiles (vector data) and tabular data sets (csv format). Additionally, a data catalog (xls format) containing detailed information and documentation is provided. This data set is produced by the Columbia University Center for International Earth Science Information Network (CIESIN) and Centro Internacional de Agricultura Tropical (CIAT). The data set was originally produced by CIAT, International Maize and Wheat Improvement Center (CIMMYT), International Livestock Research Institute (ILRI), International Food Policy Research Institute (IFPRI), International Rice Research Institute (IRRI), International Water Management Institute (IWMI), and International Institute for Tropical Agriculture (IITA).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Thorough knowledge of the structure of the analyzed data allows one to form detailed scientific hypotheses and research questions. The structure of data can be revealed with methods for exploratory data analysis. Due to the multitude of available methods, selecting those that work well together and facilitate data interpretation is not an easy task. In this work we present a well-fitted set of tools for a complete exploratory analysis of a clinical dataset and perform a case study analysis on a set of 515 patients. The proposed procedure comprises several steps: 1) robust data normalization, 2) outlier detection with Mahalanobis (MD) and robust Mahalanobis distances (rMD), 3) hierarchical clustering with Ward’s algorithm, 4) Principal Component Analysis with biplot vectors. The analyzed set comprised elderly patients who participated in the PolSenior project. Each patient was characterized by over 40 biochemical and socio-geographical attributes. Introductory analysis showed that the case-study dataset comprises two clusters separated along the axis of sex hormone attributes. Further analysis was carried out separately for male and female patients. The optimal partitioning of the male set resulted in five subgroups. Two of them were related to diseased patients: 1) diabetes and 2) hypogonadism patients. Analysis of the female set suggested that it was more homogeneous than the male dataset. No evidence of pathological patient subgroups was found. In the study we showed that outlier detection with MD and rMD not only identifies outliers but can also be used to assess the heterogeneity of a dataset. The case study showed that our procedure is well suited for identification and visualization of biologically meaningful patient subgroups.
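To make the outlier-detection step concrete, here is a minimal sketch of the classical Mahalanobis distance, restricted to two attributes so that the covariance matrix can be inverted by hand. It does not cover the robust variant (rMD) or the other steps of the procedure, and the data points are invented for illustration rather than taken from the PolSenior dataset.

```python
import math


def mahalanobis_distances(points):
    """Classical Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sample covariance matrix [[var_x, cov_xy], [cov_xy, var_y]].
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    det = var_x * var_y - cov_xy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # (dx, dy) times the inverse covariance matrix times (dx, dy) transposed.
        squared = (var_y * dx * dx - 2 * cov_xy * dx * dy + var_x * dy * dy) / det
        distances.append(math.sqrt(squared))
    return distances


# Invented measurements of two attributes; the last point is deliberately atypical.
patients = [
    (1.0, 2.0), (1.2, 2.1), (0.9, 1.8), (1.1, 2.2), (1.0, 1.9),
    (1.3, 2.3), (0.8, 1.7), (1.1, 2.0), (0.9, 2.1), (4.0, 0.5),
]
distances = mahalanobis_distances(patients)
most_atypical = max(range(len(patients)), key=lambda i: distances[i])
print(most_atypical, round(distances[most_atypical], 2))
```

Because the mean and covariance are estimated from the same points, a single extreme observation inflates them and partly masks itself, which is one motivation for pairing the classical distance with a robust one.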
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
List of Top Schools of Communications in Statistics Case Studies Data Analysis and Applications sorted by citations.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Database for the article: Data analytics and Artificial Neural Network framework to profile academic success: Case Study of Leaders of Tomorrow Program
This dataset was created by Danell Eduardo Rapozo Ramirez
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
List of Top Authors of Communications in Statistics Case Studies Data Analysis and Applications sorted by articles.
https://creativecommons.org/publicdomain/zero/1.0/
Welcome to the Cyclistic bike-share analysis case study! In this case study, you will perform many real-world tasks of a junior data analyst. You will work for a fictional company, Cyclistic, and meet different characters and team members.
You are a junior data analyst working in the marketing analyst team at Cyclistic, a bike-share company in Chicago. The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, your team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, your team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve your recommendations, so they must be backed up with compelling data insights and professional data visualizations.
● Cyclistic: A bike-share program that features more than 5,800 bicycles and 600 docking stations. Cyclistic sets itself apart by also offering reclining bikes, hand tricycles, and cargo bikes, making bike-share more inclusive to people with disabilities and riders who can’t use a standard two-wheeled bike. The majority of riders opt for traditional bikes; about 8% of riders use the assistive options. Cyclistic users are more likely to ride for leisure, but about 30% use them to commute to work each day.
● Lily Moreno: The director of marketing and your manager. Moreno is responsible for the development of campaigns and initiatives to promote the bike-share program. These may include email, social media, and other channels.
● Cyclistic marketing analytics team: A team of data analysts who are responsible for collecting, analyzing, and reporting data that helps guide Cyclistic marketing strategy. You joined this team six months ago and have been busy learning about Cyclistic’s mission and business goals, as well as how you, as a junior data analyst, can help Cyclistic achieve them.
● Cyclistic executive team: The notoriously detail-oriented executive team will decide whether to approve the recommended marketing program.
The data has been made available by Motivate International Inc. under this license.
Previous studies that used data from Stack Overflow to develop predictive models often employed limited benchmarks of 3-5 models or adopted arbitrary selection methods. Despite being insightful, such approaches may not provide optimal results given their limited scope, suggesting the need to benchmark more models to avoid overlooking untested algorithms. Our study evaluates 21 algorithms across three tasks: predicting the number of questions a user is likely to answer, their code quality violations, and their dropout status. We employed normalisation, standardisation, as well as logarithmic and power transformations paired with Bayesian hyperparameter optimisation and genetic algorithms. CodeBERT, a pre-trained language model for both natural and programming languages, was fine-tuned to classify user dropout given their posts (questions and answers) and code snippets. This replication package is provided for those interested in further examining our research methodology.
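The four feature transformations named above can be sketched in a few lines. This is an illustration under assumptions, not the replication package's code: the sample values are invented, and a plain exponent stands in for whichever power transformation the study actually used. Hyperparameter optimisation and model fitting are not covered.

```python
import math
from statistics import mean, pstdev


def normalise(values):
    """Min-max normalisation to the range [0, 1]."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def standardise(values):
    """Z-score standardisation: zero mean, unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def log_transform(values):
    """Logarithmic transformation log(1 + v), defined for v > -1."""
    return [math.log1p(v) for v in values]


def power_transform(values, exponent=0.5):
    """Simple power transformation v ** exponent for non-negative values."""
    return [v ** exponent for v in values]


# Invented, right-skewed counts standing in for "answers per user".
answers_per_user = [0, 1, 1, 2, 4, 9, 100]
scaled = normalise(answers_per_user)
z_scores = standardise(answers_per_user)
logged = log_transform(answers_per_user)
rooted = power_transform(answers_per_user)
print(scaled[-1], round(logged[-1], 3), rooted[-1])
```

On skewed counts like these, the logarithmic and power transformations compress the long tail, whereas normalisation and standardisation only rescale it.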
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Data obtained for the study case 5: RSA/ECC.
One of the challenges of teaching scientific courses is helping students understand research methods, biological models, and data analysis, which can be especially difficult in classes without a laboratory component. Within the field of toxicology, it is also important for students to understand how living organisms are affected by exposure to toxicants and how these toxicants can impact the ecosystem. Resources focusing on active learning pedagogy are scarce in the field of toxicology compared to other disciplines. In this activity, upper-level students in an introductory toxicology course learn to interpret data from primary literature, draw conclusions about how toxicants, specifically metals, can impact susceptible populations, and understand the One Environmental Health approach. Students work in small groups to answer questions concerning data from a paper and then share their responses with the entire class, building their communication skills. The instructor serves as a moderator, allowing the students to work through concepts and intervening only when necessary. This approach enables a deeper level of understanding of content and allows the students to engage actively in the learning process. As such, students think critically through relevant problems and find connections to the real world. This lesson can be adapted for several levels of students and could be modified depending on the objectives of the course.
Primary Image: One Environmental Health Approach in the Gulf of Maine. Representation of the movement of chemicals through the ecosystem and into humans which illustrates the basic principles of the One Environmental Health Approach.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains the text of the documents that are sources of evidence used in [1] and [2] to distill our reference scenarios according to the methodology suggested by Yin in [3].
The dataset is composed of 95 unique document texts spanning the period 2005-2022. It makes available a corpus of documentary sources useful for outlining case studies about the scenarios in which a data protection officer (DPO) operates in the course of daily activities.
The language used in the corpus is mainly Italian, but some documents are in English and French. For the reader's benefit, we provide an English translation of the title of each document.
The documentary sources are of many types (for example, court decisions, supervisory authorities' decisions, job advertisements, and newspaper articles), provided by different bodies (such as supervisory authorities, data controllers, European Union institutions, private companies, courts, public authorities, research organizations, newspapers, and public administrations), and written by people in distinct professional roles (for example, data protection officers, general managers, university rectors, collegiate bodies, judges, and journalists).
The documentary sources were collected from 31 different bodies. Most of the documents in the corpus (a total of 83 documents) have been transformed into Rich Text Format (RTF), while the other documents (a total of 12) are in PDF format. All the documents have been manually read and verified. The dataset is helpful as a starting point for a case study analysis of the daily issues a data protection officer faces. Details on the methodology can be found in the accompanying papers.
The available files are as follows:
documents-texts.zip --> Contains a directory of .rtf files (in some cases .pdf files) with the text of the documents used as sources for the case studies. Each file has been renamed with its SHA1 hash so that it can be easily recognized.
documents-metadata.csv --> A CSV file with the metadata for each document used as a source for the case studies.
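The SHA1-based naming can be checked with a short sketch. It assumes that each file name is the hexadecimal SHA1 digest of that file's own bytes, which the description does not state explicitly; the helper names and the temporary placeholder file are hypothetical and are not part of the dataset.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1_of_file(path, chunk_size=65536):
    """Hexadecimal SHA-1 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def name_matches_hash(path):
    """True if the file name (without extension) equals the file's SHA-1 digest.

    Assumption: the name was derived from the bytes of this very file.
    """
    path = Path(path)
    return path.stem.lower() == sha1_of_file(path)


# Demonstration on a throwaway file rather than on the real corpus.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "placeholder.rtf"
    sample.write_bytes(b"abc")
    target = sample.with_name(sha1_of_file(sample) + ".rtf")
    sample.rename(target)
    verified = name_matches_hash(target)
    print(target.name, verified)
```

If the check fails for the real files, the hashes were probably computed over something else, such as the original source document before conversion to RTF.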
This dataset is the original one used in the publication [1] and the preprint containing the additional material [2].
[1] F. Ciclosi and F. Massacci, "The Data Protection Officer: A Ubiquitous Role That No One Really Knows" in IEEE Security & Privacy, vol. 21, no. 01, pp. 66-77, 2023, doi: 10.1109/MSEC.2022.3222115, url: https://doi.ieeecomputersociety.org/10.1109/MSEC.2022.3222115.
[2] F. Ciclosi and F. Massacci, "The Data Protection Officer, an ubiquitous role nobody really knows." arXiv preprint arXiv:2212.07712, 2022.
[3] R. K. Yin, Case study research and applications. Sage, 2018.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This paper presents a critical review of 47 articles published between 2006 and 2011 to identify how case studies have been applied in Brazilian research on public administration. In addition to their theoretical and methodological characteristics, four further specific topics of interest were addressed: (a) what is meant by case study; (b) the relationship between the phenomenon of interest and the case under investigation; (c) the possibility of replication; and (d) how the supposed method contributes towards the development of the field of public administration. The main inconsistencies found were: the methodological descriptions are confusing; the results are inconsistent compared with data gathering procedures and data analysis techniques; a lack of information about the number of interviewed individuals; and no descriptions of research variables. The results suggest the reviewed case studies present methodological inconsistencies and limitations, which undermine their scientific value and relevance to academic work in Brazil.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
We present a showcase of our experience with videos complementing analytical chemistry lectures to familiarize undergraduate students with instrumental element analysis. This includes a detailed account of how we planned, produced, and utilized a video to review the course content at the end of the semester. The analytical case study focused on the determination of magnesium in two well water samples with emphasis on flame atomic absorption spectroscopy, while also comparing results with inductively coupled plasma optical emission spectroscopy and titration measurements. During the lecture, we engaged students by asking them for suggestions on how to carry out the measurements before showing the respective video sections. A survey among the students revealed a remarkably positive response to this approach. We demonstrate our approach by making explicit the decisions and choices made during video production, such as recording and editing, and conclude with practical advice for planning and producing similar videos to visualize case studies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
An anonymized dataset compiled by NipsApp from 112 game projects completed between January and August 2025. The dataset includes data points on game production efficiency, post-launch analytics, and team communication collected during real development cycles.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT Companies are encouraged by the big data trend to experiment with advanced analytics, and many turn to specialist consultancies to help them get started where they lack the necessary competences. We investigate the program of one such consultancy, Advectas, in particular the advanced analytics Jumpstart. Using qualitative techniques including semi-structured interviews and content analysis, we investigate the nature and value of the Jumpstart concept through five cases in different companies. We provide a definition, a process model and a set of thirteen best practices derived from these experiences, and discuss the distinctive qualities of this approach.