15 datasets found
  1. SPORTS_DATA_ANALYSIS_ON_EXCEL

    • kaggle.com
    zip
    Updated Dec 12, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Nil kamal Saha (2024). SPORTS_DATA_ANALYSIS_ON_EXCEL [Dataset]. https://www.kaggle.com/datasets/nilkamalsaha/sports-data-analysis-on-excel
    Explore at:
    zip(1203633 bytes)Available download formats
    Dataset updated
    Dec 12, 2024
    Authors
    Nil kamal Saha
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    PROJECT OBJECTIVE

    We are a part of XYZ Co Pvt Ltd company who is in the business of organizing the sports events at international level. Countries nominate sportsmen from different departments and our team has been given the responsibility to systematize the membership roster and generate different reports as per business requirements.

    Questions (KPIs)

    TASK 1: STANDARDIZING THE DATASET

    • Populate the FULLNAME consisting of the following fields ONLY, in the prescribed format: PREFIX FIRSTNAME LASTNAME.{Note: All UPPERCASE)
    • Get the COUNTRY NAME to which these sportsmen belong to. Make use of LOCATION sheet to get the required data
    • Populate the LANGUAGE_!poken by the sportsmen. Make use of LOCTION sheet to get the required data
    • Generate the EMAIL ADDRESS for those members, who speak English, in the prescribed format :lastname.firstnamel@xyz .org {Note: All lowercase) and for all other members, format should be lastname.firstname@xyz.com (Note: All lowercase)
    • Populate the SPORT LOCATION of the sport played by each player. Make use of SPORT sheet to get the required data

    TASK 2: DATA FORMATING

    • Display MEMBER IDas always 3 digit number {Note: 001,002 ...,D2D,..etc)
    • Format the BIRTHDATE as dd mmm'yyyy (Prescribed format example: 09 May' 1986)
    • Display the units for the WEIGHT column (Prescribed format example: 80 kg)
    • Format the SALARY to show the data In thousands. If SALARY is less than 100,000 then display data with 2 decimal places else display data with one decimal place. In both cases units should be thousands (k) e.g. 87670 -> 87.67 k and 12 250 -> 123.2 k

    TASK 3: SUMMARIZE DATA - PIVOT TABLE (Use SPORTSMEN worksheet after attempting TASK 1) • Create a PIVOT table in the worksheet ANALYSIS, starting at cell B3,with the following details:

    • In COLUMNS; Group : GENDER.
    • In ROWS; Group : COUNTRY (Note: use COUNTRY NAMES).
    • In VALUES; calculate the count of candidates from each COUNTRY and GENDER type, Remove GRAND TOTALs.

    TASK 4: SUMMARIZE DATA - EXCEL FUNCTIONS (Use SPORTSMEN worksheet after attempting TASK 1)

    • Create a SUMMARY table in the worksheet ANALYSIS,starting at cell G4, with the following details:

    • Starting from range RANGE H4; get the distinct GENDER. Use remove duplicates option and transpose the data.
    • Starting from range RANGE GS; get the distinct COUNTRY (Note: use COUNTRY NAMES).
    • In the cross table,get the count of candidates from each COUNTRY and GENDER type.

    TASK 5: GENERATE REPORT - PIVOT TABLE (Use SPORTSMEN worksheet after attempting TASK 1)

    • Create a PIVOT table report in the worksheet REPORT, starting at cell A3, with the following information:

    • Change the report layout to TABULAR form.
    • Remove expand and collapse buttons.
    • Remove GRAND TOTALs.
    • Allow user to filter the data by SPORT LOCATION.

    Process

    • Verify data for any missing values and anomalies, and sort out the same.
    • Made sure data is consistent and clean with respect to data type, data format and values used.
    • Created pivot tables according to the questions asked.
  2. Scooter Sales - Excel Project

    • kaggle.com
    Updated Jun 8, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ann Truong (2023). Scooter Sales - Excel Project [Dataset]. https://www.kaggle.com/datasets/bvanntruong/scooter-sales-excel-project
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    Kaggle
    Authors
    Ann Truong
    Description

    The link for the Excel project to download can be found on GitHub here. It includes the raw data, Pivot Tables, and an interactive dashboard with Pivot Charts and Slicers. The project also includes business questions and the formulas I used to answer. The image below is included for ease. https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2F61e460b5f6a1fa73cfaaa33aa8107bd5%2FBusinessQuestions.png?generation=1686190703261971&alt=media" alt=""> The link for the Tableau adjusted dashboard can be found here.

    A screenshot of the interactive Excel dashboard is also included below for ease. https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2Fe581f1fce8afc732f7823904da9e4cce%2FScooter%20Dashboard%20Image.png?generation=1686190815608343&alt=media" alt="">

  3. Pivot table - Data analysis project

    • kaggle.com
    Updated Jul 18, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Gamal Khattab (2022). Pivot table - Data analysis project [Dataset]. https://www.kaggle.com/datasets/gamalkhattab/pivot-table
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 18, 2022
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Gamal Khattab
    Description

    Summarize big data with pivot table and charts and slicers

  4. VLOOKUP & PIVOT TABLE

    • kaggle.com
    zip
    Updated May 3, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Derrick Mallison (2023). VLOOKUP & PIVOT TABLE [Dataset]. https://www.kaggle.com/datasets/derrickmallison/vlookup-and-pivot-table
    Explore at:
    zip(58855 bytes)Available download formats
    Dataset updated
    May 3, 2023
    Authors
    Derrick Mallison
    Description

    Dataset

    This dataset was created by Derrick Mallison

    Contents

  5. HUD FHA Single Family Portfolio Snapshot

    • datalumos.org
    • openicpsr.org
    Updated Feb 20, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    United States Department of Housing and Urban Development (2025). HUD FHA Single Family Portfolio Snapshot [Dataset]. http://doi.org/10.3886/E220223V2
    Explore at:
    Dataset updated
    Feb 20, 2025
    Dataset authored and provided by
    United States Department of Housing and Urban Developmenthttp://www.hud.gov/
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Feb 2010 - Nov 2024
    Area covered
    United States of America
    Description

    The Single-Family Portfolio Snapshot consists of a monthly data table and a report generator (Excel pivot table) that can be used to quickly create new reports of interest to the user from the data records. The data records themselves are loan level records using all of the categorical variables highlighted on the report generator table. Users may download and save the Excel file that contains the data records and the pivot table.The report generator sheet consists of an Excel pivot table that gives individual users some ability to analyze monthly trends on dimensions of interest to them. There are six choice dimensions: property state, property county, loan purpose, loan type, property product type, and downpayment source.Each report generator selection variable has an associated drop-down menu that is accessed by clicking once on the associated arrows. Only single selections can be made from each menu. For example, users must choose one state or all states, one county or all counties. If a county is chosen that does not correspond with the selected state, the result will be null values.The data records include each report generator choice variable plus the property zip code, originating mortgagee (lender) number, sponsor-lender name, sponsor number, nonprofit gift provider tax identification number, interest rate, and FHA insurance endorsement year and month. The report generator only provides output for the dollar amount of loans. Users who desire to analyze other data that are available on the data table, for example, interest rates or sponsor number, must first download the Excel file. See the data definitions (PDF in top folder) for details on each data element.Files switch from .zip to excel in August 2017.

  6. Europe Bike Store Sales

    • kaggle.com
    zip
    Updated Mar 21, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    PrepInsta Technologies (2023). Europe Bike Store Sales [Dataset]. https://www.kaggle.com/datasets/prepinstaprime/europe-bike-store-sales/versions/1
    Explore at:
    zip(1209546 bytes)Available download formats
    Dataset updated
    Mar 21, 2023
    Authors
    PrepInsta Technologies
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Europe
    Description

    In the Europe bikes dataset, Extract the insight into sales in each country and each state of their countries using Excel.

  7. d

    GP Practice Prescribing Presentation-level Data - July 2014

    • digital.nhs.uk
    csv, zip
    Updated Oct 31, 2014
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2014). GP Practice Prescribing Presentation-level Data - July 2014 [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/practice-level-prescribing-data
    Explore at:
    csv(1.4 GB), zip(257.7 MB), csv(1.7 MB), csv(275.8 kB)Available download formats
    Dataset updated
    Oct 31, 2014
    License

    https://digital.nhs.uk/about-nhs-digital/terms-and-conditionshttps://digital.nhs.uk/about-nhs-digital/terms-and-conditions

    Time period covered
    Jul 1, 2014 - Jul 31, 2014
    Area covered
    United Kingdom
    Description

    Warning: Large file size (over 1GB). Each monthly data set is large (over 4 million rows), but can be viewed in standard software such as Microsoft WordPad (save by right-clicking on the file name and selecting 'Save Target As', or equivalent on Mac OSX). It is then possible to select the required rows of data and copy and paste the information into another software application, such as a spreadsheet. Alternatively, add-ons to existing software, such as the Microsoft PowerPivot add-on for Excel, to handle larger data sets, can be used. The Microsoft PowerPivot add-on for Excel is available from Microsoft http://office.microsoft.com/en-gb/excel/download-power-pivot-HA101959985.aspx Once PowerPivot has been installed, to load the large files, please follow the instructions below. Note that it may take at least 20 to 30 minutes to load one monthly file. 1. Start Excel as normal 2. Click on the PowerPivot tab 3. Click on the PowerPivot Window icon (top left) 4. In the PowerPivot Window, click on the "From Other Sources" icon 5. In the Table Import Wizard e.g. scroll to the bottom and select Text File 6. Browse to the file you want to open and choose the file extension you require e.g. CSV Once the data has been imported you can view it in a spreadsheet. What does the data cover? General practice prescribing data is a list of all medicines, dressings and appliances that are prescribed and dispensed each month. A record will only be produced when this has occurred and there is no record for a zero total. For each practice in England, the following information is presented at presentation level for each medicine, dressing and appliance, (by presentation name): - the total number of items prescribed and dispensed - the total net ingredient cost - the total actual cost - the total quantity The data covers NHS prescriptions written in England and dispensed in the community in the UK. Prescriptions written in England but dispensed outside England are included. The data includes prescriptions written by GPs and other non-medical prescribers (such as nurses and pharmacists) who are attached to GP practices. GP practices are identified only by their national code, so an additional data file - linked to the first by the practice code - provides further detail in relation to the practice. Presentations are identified only by their BNF code, so an additional data file - linked to the first by the BNF code - provides the chemical name for that presentation.

  8. e

    Road traffic accidents

    • data.europa.eu
    • gimi9.com
    csv
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Leeds City Council, Road traffic accidents [Dataset]. https://data.europa.eu/data/datasets/road-traffic-accidents?locale=da
    Explore at:
    csvAvailable download formats
    Dataset authored and provided by
    Leeds City Council
    License

    Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Information on accidents across Leeds. Data includes location, number of people and vehicles involved, road surface, weather conditions and severity of any casualties.

    Please note

    • The Eastings and Northings are generated at the roadside where the accident occurred. Sometimes due to poor internet connectivity this data is may not be as accurate as it could be. If you notice any errors please contact accident.studies@leeds.gov.uk.

    Due to the format of the report a number of figures in the columns are repeated, these are:

    • Reference Number
    • Easting
    • Northing
    • Number of Vehicles
    • Accident Date
    • Time (24hr)
    • 1st Road Class
    • Road Surface
    • Lighting Conditions
    • Weather Conditions

    Reference Number

    Grid Ref: Easting

    Grid Ref: Northing

    Number of vehicles

    Accident Date

    Time (24hr)

    21G0539

    427798

    426248

    5

    16/01/2015

    1205

    21G0539

    427798

    426248

    5

    16/01/2015

    1205

    21G1108

    431142

    430087

    1

    16/01/2015

    1732

    21H0565

    434602

    436699>

    1

    17/01/2015

    930

    21H0638

    434254

    434318

    2

    17/01/2015

    1315

    21H0638

    434254

    434318

    2

    17/01/2015

    1315

    Therefore the number of vehicles involved in accident 21G0539 were 5, and in accident 21H0638 were 2. Overall in the example above a total of 9 vehicles were involved in accidents

    A useful tool to analyse the data is Excel pivot tables, these help summarise large amounts of data in a easy to view table, for further information on pivot table visit here.

    Further Information

    • Please see the guidance document for further information on categories.
  9. w

    Road Traffic Accident Data

    • data.wu.ac.at
    csv, html, json
    Updated Aug 24, 2018
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Calderdale Council (2018). Road Traffic Accident Data [Dataset]. https://data.wu.ac.at/schema/data_gov_uk/MDUzYTY1MjktNmM4Yy00MmFjLWFlMWUtNDU1YjI3MDhlNTM1
    Explore at:
    json, html, csvAvailable download formats
    Dataset updated
    Aug 24, 2018
    Dataset provided by
    Calderdale Council
    License

    Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Information on accidents casualites across Calderdale. Data includes location, number of people and vehicles involved, road surface, weather conditions and severity of any casualties.

    Please note

    • The Eastings and Northings are generated at the roadside where the accident occurred. Sometimes due to poor internet connectivity this data is may not be as accurate as it could be. If you notice any errors please contact accident.studies@leeds.gov.uk.

    Due to the format of the report a number of figures in the columns are repeated, these are:

    • Reference Number
    • Easting
    • Northing
    • Number of Vehicles
    • Accident Date
    • Time (24hr)
    • 1st Road Class
    • Road Surface
    • Lighting Conditions
    • Weather Conditions

    Reference Number

    Grid Ref: Easting

    Grid Ref: Northing

    Number of vehicles

    Accident Date

    Time (24hr)

    21G0539

    427798

    426248

    5

    16/01/2015

    1205

    21G0539

    427798

    426248

    5

    16/01/2015

    1205

    21G1108

    431142

    430087

    1

    16/01/2015

    1732

    21H0565

    434602

    436699>

    1

    17/01/2015

    930

    21H0638

    434254

    434318

    2

    17/01/2015

    1315

    21H0638

    434254

    434318

    2

    17/01/2015

    1315

    Therefore the number of vehicles involved in accident 21G0539 were 5, and in accident 21H0638 were 2. Overall in the example above a total of 9 vehicles were involved in accidents

    A useful tool to analyse the data is Excel pivot tables, these help summarise large amounts of data in a easy to view table, for further information on pivot tables visit here.

  10. Checklist derived from plant species of Benin downloaded from GBIF site

    • gbif.org
    Updated May 9, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Jean GANGLO; Jean GANGLO (2019). Checklist derived from plant species of Benin downloaded from GBIF site [Dataset]. http://doi.org/10.15468/mid3vk
    Explore at:
    Dataset updated
    May 9, 2019
    Dataset provided by
    Global Biodiversity Information Facilityhttps://www.gbif.org/
    Laboratory of Forest Sciences (University of Abomey-Calavi)
    Authors
    Jean GANGLO; Jean GANGLO
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1780 - Jan 11, 2019
    Area covered
    Description

    Plant species collected throughout Benin were published on GBIF site. Data concerning those species were downloaded from GBIF site. Using Excel dynamic pivot table we derived and achieved the checklist of plant species of Benin from the dataset downloaded.

  11. Coffee Shop Sales Analysis

    • kaggle.com
    Updated Apr 25, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Monis Amir (2024). Coffee Shop Sales Analysis [Dataset]. https://www.kaggle.com/datasets/monisamir/coffee-shop-sales-analysis
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 25, 2024
    Dataset provided by
    Kaggle
    Authors
    Monis Amir
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Analyzing Coffee Shop Sales: Excel Insights 📈

    In my first Data Analytics Project, I Discover the secrets of a fictional coffee shop's success with my data-driven analysis. By Analyzing a 5-sheet Excel dataset, I've uncovered valuable sales trends, customer preferences, and insights that can guide future business decisions. 📊☕

    DATA CLEANING 🧹

    • REMOVED DUPLICATES OR IRRELEVANT ENTRIES: Thoroughly eliminated duplicate records and irrelevant data to refine the dataset for analysis.

    • FIXED STRUCTURAL ERRORS: Rectified any inconsistencies or structural issues within the data to ensure uniformity and accuracy.

    • CHECKED FOR DATA CONSISTENCY: Verified the integrity and coherence of the dataset by identifying and resolving any inconsistencies or discrepancies.

    DATA MANIPULATION 🛠️

    • UTILIZED LOOKUPS: Used Excel's lookup functions for efficient data retrieval and analysis.

    • IMPLEMENTED INDEX MATCH: Leveraged the Index Match function to perform advanced data searches and matches.

    • APPLIED SUMIFS FUNCTIONS: Utilized SumIFs to calculate totals based on specified criteria.

    • CALCULATED PROFITS: Used relevant formulas and techniques to determine profit margins and insights from the data.

    PIVOTING THE DATA 𝄜

    • CREATED PIVOT TABLES: Utilized Excel's PivotTable feature to pivot the data for in-depth analysis.

    • FILTERED DATA: Utilized pivot tables to filter and analyze specific subsets of data, enabling focused insights. Specially used in “PEAK HOURS” and “TOP 3 PRODUCTS” charts.

    VISUALIZATION 📊

    • KEY INSIGHTS: Unveiled the grand total sales revenue while also analyzing the average bill per person, offering comprehensive insights into the coffee shop's performance and customer spending habits.

    • SALES TREND ANALYSIS: Used Line chart to compute total sales across various time intervals, revealing valuable insights into evolving sales trends.

    • PEAK HOUR ANALYSIS: Leveraged Clustered Column chart to identify peak sales hours, shedding light on optimal operating times and potential staffing needs.

    • TOP 3 PRODUCTS IDENTIFICATION: Utilized Clustered Bar chart to determine the top three coffee types, facilitating strategic decisions regarding inventory management and marketing focus.

    *I also used a Timeline to visualize chronological data trends and identify key patterns over specific times.

    While it's a significant milestone for me, I recognize that there's always room for growth and improvement. Your feedback and insights are invaluable to me as I continue to refine my skills and tackle future projects. I'm eager to hear your thoughts and suggestions on how I can make my next endeavor even more impactful and insightful.

    THANKS TO: WsCube Tech Mo Chen Alex Freberg

    TOOLS USED: Microsoft Excel

    DataAnalytics #DataAnalyst #ExcelProject #DataVisualization #BusinessIntelligence #SalesAnalysis #DataAnalysis #DataDrivenDecisions

  12. E-Commerce Sales Data Analysis Using Excel

    • kaggle.com
    zip
    Updated Dec 27, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Utkarsh Anand (2024). E-Commerce Sales Data Analysis Using Excel [Dataset]. https://www.kaggle.com/datasets/utkarshanand09/e-commerce-sales-data-analysis-using-excel
    Explore at:
    zip(60943371 bytes)Available download formats
    Dataset updated
    Dec 27, 2024
    Authors
    Utkarsh Anand
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Performed in-depth analysis of Myntra's e-commerce data using Excel to identify sales trends, customer behavior, and performance metrics. Leveraged advanced Excel functionalities, including pivot tables, charts, conditional formatting, and data cleaning techniques, to derive actionable insights and create visually compelling reports.

  13. Video Game Sales Dataset (Excel Dashboard Project)

    • kaggle.com
    Updated Oct 7, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Adewale Lateef W (2025). Video Game Sales Dataset (Excel Dashboard Project) [Dataset]. https://www.kaggle.com/datasets/adewalelateefw/video-game-sales-dataset-excel-dashboard-project
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 7, 2025
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Adewale Lateef W
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains video game sales data prepared for an Excel data analysis and dashboard project.

    It includes detailed information on:

    Game titles

    Platforms

    Genres

    Publishers

    Regional and global sales

    The dataset was cleaned, structured, and analyzed in Microsoft Excel to explore patterns in the global video game market. It can be used to:

    Practice data cleaning and pivot tables

    Build interactive dashboards

    Perform sales comparisons across regions and genres

    Develop business insights from entertainment data

    🧩 File Information

    Format: .xlsx (Excel Workbook)

    Columns: Name, Platform, Year, Genre, Publisher, NA_Sales, EU_Sales, JP_Sales, Other_Sales, Global_Sales

    💡 Use Cases

    Excel dashboard and chart creation

    Data visualization and storytelling

    Business and market analysis practice

    Portfolio or learning projects

    👤 Prepared by

    Adewale Lateef W — for data analysis and Excel dashboard learning purposes.

  14. Bike Buyers - Excel Project

    • kaggle.com
    zip
    Updated Jun 8, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ann Truong (2023). Bike Buyers - Excel Project [Dataset]. https://www.kaggle.com/datasets/bvanntruong/bike-buyers
    Explore at:
    zip(11866 bytes)Available download formats
    Dataset updated
    Jun 8, 2023
    Authors
    Ann Truong
    Description

    This dataset illustrates customer data from bike sales. It contains information such as Income, Occupation, Age, Commute, Gender, Children, and more. This is fictional data, created and used for data exploration and cleaning.

    The link for the Excel project to download can be found on GitHub here. It includes the raw data, the cleaned data, Pivot Tables, and a dashboard with Pivot Charts and Slicers for interaction. This allows the interactive dashboard to filter by Marital Status, Region, and Education.

    Below is a screenshot of the dashboard for ease. https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2Fcbc9db6fe00f3201c64e4fdb668ce9d1%2FBikeBuyers%20Dashboard%20Image.png?generation=1686186378985936&alt=media" alt="">

  15. 2022 Bikeshare Data -Reduced File Size -All Months

    • kaggle.com
    zip
    Updated Mar 8, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Kendall Marie (2023). 2022 Bikeshare Data -Reduced File Size -All Months [Dataset]. https://www.kaggle.com/datasets/kendallmarie/2022-bikeshare-data-all-months-combined
    Explore at:
    zip(98884 bytes)Available download formats
    Dataset updated
    Mar 8, 2023
    Authors
    Kendall Marie
    License

    http://opendatacommons.org/licenses/dbcl/1.0/http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This is a condensed version of the raw data obtained through the Google Data Analytics Course, made available by Lyft and the City of Chicago under this license (https://ride.divvybikes.com/data-license-agreement).

    I originally did my study in another platform, and the original files were too large to upload to Posit Cloud in full. Each of the 12 monthly files contained anywhere from 100k to 800k rows. Therefore, I decided to reduce the number of rows drastically by performing grouping, summaries, and thoughtful omissions in Excel for each csv file. What I have uploaded here is the result of that process.

    Data is grouped by: month, day, rider_type, bike_type, and time_of_day. total_rides represent the sum of the data in each grouping as well as the total number of rows that were combined to make the new summarized row, avg_ride_length is the calculated average of all data in each grouping.

    Be sure that you use weighted averages if you want to calculate the mean of avg_ride_length for different subgroups as the values in this file are already averages of the summarized groups. You can include the total_rides value in your weighted average calculation to weigh properly.

    9 Columns:

    date - year, month, and day in date format - includes all days in 2022 day_of_week - Actual day of week as character. Set up a new sort order if needed. rider_type - values are either 'casual', those who pay per ride, or 'member', for riders who have annual memberships. bike_type - Values are 'classic' (non-electric, traditional bikes), or 'electric' (e-bikes). time_of_day - this divides the day into 6 equal time frames, 4 hours each, starting at 12AM. Each individual ride was placed into one of these time frames using the time they STARTED their rides, even if the ride was long enough to end in a later time frame. This column was added to help summarize the original dataset. total_rides - Count of all individual rides in each grouping (row). This column was added to help summarize the original dataset. avg_ride_length - The calculated average of all rides in each grouping (row). Look to total_rides to know how many original rides length values were included in this average. This column was added to help summarize the original dataset. min_ride_length - Minimum ride length of all rides in each grouping (row). This column was added to help summarize the original dataset. max_ride_length - Maximum ride length of all rides in each grouping (row). This column was added to help summarize the original dataset.

    Please note: the time_of_day column has inconsistent spacing. Use mutate(time_of_day = gsub(" ", "", time_of _day)) to remove all spaces.

    Revisions

    Below is the list of revisions I made in Excel before uploading the final csv files to the R environment:

    • Deleted station location columns and lat/long as much of this data was already missing.

    • Deleted ride id column since each observation was unique and I would not be joining with another table on this variable.

    • Deleted rows pertaining to "docked bikes" since there were no member entries for this type and I could not compare member vs casual rider data. I also received no information in the project details about what constitutes a "docked" bike.

    • Used ride start time and end time to calculate a new column called ride_length (by subtracting), and deleted all rows with 0 and 1 minute results, which were explained in the project outline as being related to staff tasks rather than users. An example would be taking a bike out of rotation for maintenance.

    • Placed start time into a range of times (time_of_day) in order to group more observations while maintaining general time data. time_of_day now represents a time frame when the bike ride BEGAN. I created six 4-hour time frames, beginning at 12AM.

    • Added a Day of Week column, with Sunday = 1 and Saturday = 7, then changed from numbers to the actual day names.

    • Used pivot tables to group total_rides, avg_ride_length, min_ride_length, and max_ride_length by date, rider_type, bike_type, and time_of_day.

    • Combined into one csv file with all months, containing less than 9,000 rows (instead of several million)

  16. Not seeing a result you expected?
    Learn how you can add new datasets to our index.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Nil kamal Saha (2024). SPORTS_DATA_ANALYSIS_ON_EXCEL [Dataset]. https://www.kaggle.com/datasets/nilkamalsaha/sports-data-analysis-on-excel
Organization logo

SPORTS_DATA_ANALYSIS_ON_EXCEL

Explore at:
zip(1203633 bytes)Available download formats
Dataset updated
Dec 12, 2024
Authors
Nil kamal Saha
License

https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

Description

PROJECT OBJECTIVE

We are a part of XYZ Co Pvt Ltd company who is in the business of organizing the sports events at international level. Countries nominate sportsmen from different departments and our team has been given the responsibility to systematize the membership roster and generate different reports as per business requirements.

Questions (KPIs)

TASK 1: STANDARDIZING THE DATASET

  • Populate the FULLNAME consisting of the following fields ONLY, in the prescribed format: PREFIX FIRSTNAME LASTNAME.{Note: All UPPERCASE)
  • Get the COUNTRY NAME to which these sportsmen belong to. Make use of LOCATION sheet to get the required data
  • Populate the LANGUAGE_!poken by the sportsmen. Make use of LOCTION sheet to get the required data
  • Generate the EMAIL ADDRESS for those members, who speak English, in the prescribed format :lastname.firstnamel@xyz .org {Note: All lowercase) and for all other members, format should be lastname.firstname@xyz.com (Note: All lowercase)
  • Populate the SPORT LOCATION of the sport played by each player. Make use of SPORT sheet to get the required data

TASK 2: DATA FORMATING

  • Display MEMBER IDas always 3 digit number {Note: 001,002 ...,D2D,..etc)
  • Format the BIRTHDATE as dd mmm'yyyy (Prescribed format example: 09 May' 1986)
  • Display the units for the WEIGHT column (Prescribed format example: 80 kg)
  • Format the SALARY to show the data In thousands. If SALARY is less than 100,000 then display data with 2 decimal places else display data with one decimal place. In both cases units should be thousands (k) e.g. 87670 -> 87.67 k and 12 250 -> 123.2 k

TASK 3: SUMMARIZE DATA - PIVOT TABLE (Use SPORTSMEN worksheet after attempting TASK 1) • Create a PIVOT table in the worksheet ANALYSIS, starting at cell B3,with the following details:

  • In COLUMNS; Group : GENDER.
  • In ROWS; Group : COUNTRY (Note: use COUNTRY NAMES).
  • In VALUES; calculate the count of candidates from each COUNTRY and GENDER type, Remove GRAND TOTALs.

TASK 4: SUMMARIZE DATA - EXCEL FUNCTIONS (Use SPORTSMEN worksheet after attempting TASK 1)

• Create a SUMMARY table in the worksheet ANALYSIS,starting at cell G4, with the following details:

  • Starting from range RANGE H4; get the distinct GENDER. Use remove duplicates option and transpose the data.
  • Starting from range RANGE GS; get the distinct COUNTRY (Note: use COUNTRY NAMES).
  • In the cross table,get the count of candidates from each COUNTRY and GENDER type.

TASK 5: GENERATE REPORT - PIVOT TABLE (Use SPORTSMEN worksheet after attempting TASK 1)

• Create a PIVOT table report in the worksheet REPORT, starting at cell A3, with the following information:

  • Change the report layout to TABULAR form.
  • Remove expand and collapse buttons.
  • Remove GRAND TOTALs.
  • Allow user to filter the data by SPORT LOCATION.

Process

  • Verify data for any missing values and anomalies, and sort out the same.
  • Made sure data is consistent and clean with respect to data type, data format and values used.
  • Created pivot tables according to the questions asked.
Search
Clear search
Close search
Google apps
Main menu