This is a Jupyter Notebook demonstrating the Exploratory Data Analysis (EDA) process on the primary dataset available for this training program: Arkansas’ Registered Apprenticeship Partners Information Management Data System (RAPIDS) Data. EDA is a vital first step as it provides numerical and visual summaries of the data. This notebook was developed for the Summer 2022 Applied Data Analytics training facilitated by the State of Arkansas and Coleridge Initiative.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
McKinsey's Solve is a gamified problem-solving assessment used globally in the consulting firm’s recruitment process. This dataset simulates assessment results across geographies, education levels, and roles over a 7-year period. It aims to provide deep insights into performance trends, candidate readiness, resume quality, and cognitive task outcomes.
Inspired by McKinsey’s real-world assessment framework, this dataset was designed to enable:
- Exploratory Data Analysis (EDA)
- Recruitment trend analysis
- Gamified performance modelling
- Dashboard development in Excel / Power BI
- Resume and education impact evaluation
- Regional performance benchmarking
- Data storytelling for portfolio projects
Whether you're building dashboards or training models, this dataset offers practical and relatable data for HR analytics and consulting use cases.
This dataset includes 4,000 rows and the following columns:
- Testtaker ID: Unique identifier
- Country / Region: Geographic segmentation
- Gender / Age: Demographics
- Year: Assessment year (2018–2025)
- Highest Level of Education: From high school to PhD / MBA
- School or University Attended: Mapped to country and education level
- First-generation University Student: Yes/No
- Employment Status: Student, Employed, Unemployed
- Role Applied For and Department / Interest: Business/tech disciplines
- Past Test Taker: Indicates repeat attempts
- Prepared with Online Materials: Indicates test prep involvement
- Desired Office Location: Mapped to McKinsey's international offices
- Ecosystem / Redrock / Seawolf (%): Game performance scores
- Time Spent on Each Game (mins)
- Total Product Score: Average of the 3 game scores
- Process Score: A secondary assessment component
- Resume Score: Scored based on education prestige, role fit, and clarity
- Total Assessment Score (%): Final decision metric
- Status (Pass/Fail): Based on total score ≥ 75%
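The two derived columns described above (Total Product Score as the average of the three game scores, and Status as a pass at a total score of 75% or more) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the dictionary keys, and the sample values are hypothetical, and the description does not say how Total Assessment Score is combined from the product, process, and resume scores, so that value is taken as given here.

```python
from statistics import mean

# Threshold taken from the column description: Pass when total score >= 75%.
PASS_THRESHOLD = 75.0


def total_product_score(ecosystem, redrock, seawolf):
    """Return the average of the three game scores (each a percentage)."""
    return mean([ecosystem, redrock, seawolf])


def status(total_assessment_score):
    """Return "Pass" if the total score meets the threshold, else "Fail"."""
    return "Pass" if total_assessment_score >= PASS_THRESHOLD else "Fail"


# Hypothetical candidate record; keys and values are made up for illustration.
candidate = {
    "Ecosystem": 82.0,
    "Redrock": 74.0,
    "Seawolf": 90.0,
    "Total Assessment Score": 78.5,  # taken as given, not derived here
}

product = total_product_score(
    candidate["Ecosystem"], candidate["Redrock"], candidate["Seawolf"]
)
outcome = status(candidate["Total Assessment Score"])
print(product, outcome)
```

A check like this can be useful during EDA to confirm that the derived columns in the file agree with their stated definitions before building dashboards or models on them.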
Analytic provenance is a data repository that can be used to study human analysis activity, thought processes, and software interaction with visual analysis tools during exploratory data analysis. It was collected during a series of user studies involving exploratory data analysis scenarios with textual and cyber security data. Interaction logs, think-alouds, videos, and all coded data in this study are available online for research purposes. Analysis sessions are segmented into multiple sub-task steps based on the user think-alouds, video, and audio captured during the studies. These analytic provenance datasets can be used for research involving tools and techniques for analyzing interaction logs and analysis history.