License: https://creativecommons.org/publicdomain/zero/1.0/
PROJECT OBJECTIVE
We are part of XYZ Co Pvt Ltd, a company that organizes sports events at the international level. Countries nominate sportsmen from different departments, and our team has been given the responsibility of systematizing the membership roster and generating various reports as per business requirements.
Questions (KPIs)
TASK 1: STANDARDIZING THE DATASET
TASK 2: DATA FORMATTING
TASK 3: SUMMARIZE DATA - PIVOT TABLE (Use SPORTSMEN worksheet after attempting TASK 1)
• Create a PIVOT table in the worksheet ANALYSIS, starting at cell B3, with the following details:
TASK 4: SUMMARIZE DATA - EXCEL FUNCTIONS (Use SPORTSMEN worksheet after attempting TASK 1)
• Create a SUMMARY table in the worksheet ANALYSIS, starting at cell G4, with the following details:
TASK 5: GENERATE REPORT - PIVOT TABLE (Use SPORTSMEN worksheet after attempting TASK 1)
• Create a PIVOT table report in the worksheet REPORT, starting at cell A3, with the following information:
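The task details are truncated in this extract, but the kind of pivot summary TASK 3 asks for can also be sketched outside Excel. A minimal pandas sketch with hypothetical roster columns (Country, Sport, Name), since the SPORTSMEN worksheet layout is not shown here:

```python
import pandas as pd

# Hypothetical roster: the real SPORTSMEN worksheet columns are not
# shown in this extract, so Country/Sport/Name are placeholders.
roster = pd.DataFrame({
    "Country": ["India", "India", "Kenya", "Kenya", "Japan"],
    "Sport":   ["Hockey", "Boxing", "Athletics", "Boxing", "Judo"],
    "Name":    ["A", "B", "C", "D", "E"],
})

# Equivalent of an Excel pivot table counting sportsmen per country and sport.
pivot = pd.pivot_table(roster, index="Country", columns="Sport",
                       values="Name", aggfunc="count", fill_value=0)
print(pivot)
```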
Process
This dataset was created by Derrick Mallison.
The link for the Excel project to download can be found on GitHub here.
It includes the raw data, Pivot Tables, and an interactive dashboard with Pivot Charts and Slicers. The project also includes business questions and the formulas I used to answer them. The image below is included for ease of reference.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2F61e460b5f6a1fa73cfaaa33aa8107bd5%2FBusinessQuestions.png?generation=1686190703261971&alt=media
The link for the Tableau-adjusted dashboard can be found here.
A screenshot of the interactive Excel dashboard is also included below for ease of reference.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2Fe581f1fce8afc732f7823904da9e4cce%2FScooter%20Dashboard%20Image.png?generation=1686190815608343&alt=media
License: https://creativecommons.org/publicdomain/zero/1.0/
In the Europe bikes dataset, extract insights into sales in each country and each state of those countries using Excel.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Single-Family Portfolio Snapshot consists of a monthly data table and a report generator (an Excel pivot table) that can be used to quickly create new reports of interest to the user from the data records. The data records themselves are loan-level records using all of the categorical variables highlighted on the report generator table. Users may download and save the Excel file that contains the data records and the pivot table.

The report generator sheet consists of an Excel pivot table that gives individual users some ability to analyze monthly trends on dimensions of interest to them. There are six choice dimensions: property state, property county, loan purpose, loan type, property product type, and downpayment source. Each report generator selection variable has an associated drop-down menu that is accessed by clicking once on the associated arrows. Only single selections can be made from each menu. For example, users must choose one state or all states, one county or all counties. If a county is chosen that does not correspond with the selected state, the result will be null values.

The data records include each report generator choice variable plus the property zip code, originating mortgagee (lender) number, sponsor-lender name, sponsor number, nonprofit gift provider tax identification number, interest rate, and FHA insurance endorsement year and month. The report generator only provides output for the dollar amount of loans. Users who desire to analyze other data available on the data table, for example interest rates or sponsor numbers, must first download the Excel file. See the data definitions (PDF in top folder) for details on each data element. Files switch from .zip to Excel format in August 2017.
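The report generator itself is Excel-only, but its single-selection logic is easy to mirror outside Excel. A minimal sketch with hypothetical column and file names (the real ones are defined in the data definitions PDF):

```python
import pandas as pd

# Emulates the snapshot's report generator: pick one state (or all) and
# one county (or all), then total the loan dollar amounts. Column names
# and the file name here are assumptions, not the published layout.
records = pd.read_excel("fha_snapshot.xlsx")

def report(df, state="ALL", county="ALL"):
    if state != "ALL":
        df = df[df["property_state"] == state]
    if county != "ALL":
        # A county that does not belong to the selected state yields an
        # empty frame -- the pivot table's "null values" case.
        df = df[df["property_county"] == county]
    return df.groupby(["loan_purpose", "loan_type"])["loan_amount"].sum()

print(report(records, state="TX"))
```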
Summarize big data with pivot tables, charts, and slicers.
Terms and conditions: https://digital.nhs.uk/about-nhs-digital/terms-and-conditions
Warning: Large file size (over 1GB). Each monthly data set is large (over 4 million rows), but can be viewed in standard software such as Microsoft WordPad (save by right-clicking on the file name and selecting 'Save Target As', or the equivalent on Mac OS X). It is then possible to select the required rows of data and copy and paste the information into another software application, such as a spreadsheet. Alternatively, add-ons to existing software that handle larger data sets, such as the Microsoft PowerPivot add-on for Excel, can be used. The Microsoft PowerPivot add-on for Excel is available from Microsoft: http://office.microsoft.com/en-gb/excel/download-power-pivot-HA101959985.aspx

Once PowerPivot has been installed, follow the instructions below to load the large files. Note that it may take at least 20 to 30 minutes to load one monthly file.

1. Start Excel as normal.
2. Click on the PowerPivot tab.
3. Click on the PowerPivot Window icon (top left).
4. In the PowerPivot Window, click on the "From Other Sources" icon.
5. In the Table Import Wizard, scroll to the bottom and select Text File.
6. Browse to the file you want to open and choose the file extension you require, e.g. CSV.

Once the data has been imported, you can view it in a spreadsheet.

What does the data cover? General practice prescribing data is a list of all medicines, dressings and appliances that are prescribed and dispensed each month. A record will only be produced when this has occurred; there is no record for a zero total. For each practice in England, the following information is presented at presentation level for each medicine, dressing and appliance (by presentation name):

- the total number of items prescribed and dispensed
- the total net ingredient cost
- the total actual cost
- the total quantity

The data covers NHS prescriptions written in England and dispensed in the community in the UK. Prescriptions written in England but dispensed outside England are included. The data includes prescriptions written by GPs and other non-medical prescribers (such as nurses and pharmacists) who are attached to GP practices. GP practices are identified only by their national code, so an additional data file, linked to the first by the practice code, provides further detail in relation to the practice. Presentations are identified only by their BNF code, so an additional data file, linked to the first by the BNF code, provides the chemical name for that presentation.
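If PowerPivot isn't available, a chunked read achieves the same end for files of this size. A sketch assuming illustrative column names (PRACTICE, ITEMS); check the actual file header before running:

```python
import pandas as pd

# The monthly extract is too big to open directly in Excel, so stream it
# in chunks and aggregate as you go. Column names below are assumptions
# based on the description, not the verified file header.
totals = {}
for chunk in pd.read_csv("prescribing_month.csv", chunksize=500_000):
    counts = chunk.groupby("PRACTICE")["ITEMS"].sum()
    for practice, items in counts.items():
        totals[practice] = totals.get(practice, 0) + items

items_per_practice = pd.Series(totals).sort_values(ascending=False)
print(items_per_practice.head())
```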
License: Open Government Licence 3.0, http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Information on accidents across Leeds. Data includes location, number of people and vehicles involved, road surface, weather conditions and severity of any casualties.
Due to the format of the report, a number of figures in the columns are repeated, as in the example below:
| Reference Number | Grid Ref: Easting | Grid Ref: Northing | Number of vehicles | Accident Date | Time (24hr) |
| --- | --- | --- | --- | --- | --- |
| 21G0539 | 427798 | 426248 | 5 | 16/01/2015 | 1205 |
| 21G0539 | 427798 | 426248 | 5 | 16/01/2015 | 1205 |
| 21G1108 | 431142 | 430087 | 1 | 16/01/2015 | 1732 |
| 21H0565 | 434602 | 436699 | 1 | 17/01/2015 | 930 |
| 21H0638 | 434254 | 434318 | 2 | 17/01/2015 | 1315 |
| 21H0638 | 434254 | 434318 | 2 | 17/01/2015 | 1315 |
Therefore the number of vehicles involved in accident 21G0539 was 5, and in accident 21H0638 it was 2. Overall, in the example above, a total of 9 vehicles were involved in accidents.
A useful tool for analysing the data is an Excel pivot table: pivot tables help summarise large amounts of data in an easy-to-view table. For further information on pivot tables, visit here.
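Because figures repeat across rows of the same accident, vehicle totals must be deduplicated by reference number before summing, whether in a pivot table or in code. A minimal pandas sketch using the example rows above:

```python
import pandas as pd

# The example rows above: accidents 21G0539 and 21H0638 appear twice
# because the report repeats figures for each casualty.
accidents = pd.DataFrame({
    "Reference Number": ["21G0539", "21G0539", "21G1108",
                         "21H0565", "21H0638", "21H0638"],
    "Number of vehicles": [5, 5, 1, 1, 2, 2],
})

# Keep one row per accident before summing, otherwise repeated rows
# double-count vehicles (16 instead of 9).
unique = accidents.drop_duplicates(subset="Reference Number")
print(unique["Number of vehicles"].sum())  # 9, matching the text
```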
This dataset contains annual Excel pivot tables that display summaries of the patients treated in each hospital-based and freestanding Ambulatory Surgery Clinic licensed by the California Department of Public Health (CDPH). The summary data includes discharge disposition, expected payer, preferred language spoken, age groups, race groups, sex, principal diagnosis groups, principal procedure groups, and principal external cause of injury/morbidity groups. The data can also be summarized statewide or for a specific facility county, type of control, and/or type of license (hospital or clinic). Note: Physician-owned ambulatory surgery clinics do not report their data to HCAI and, therefore, are not included in the statewide frequencies.
License: Open Government Licence 3.0, http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Information on accident casualties across Calderdale. Data includes location, number of people and vehicles involved, road surface, weather conditions and severity of any casualties.
Due to the format of the report, a number of figures in the columns are repeated, as in the example below:
| Reference Number | Grid Ref: Easting | Grid Ref: Northing | Number of vehicles | Accident Date | Time (24hr) |
| --- | --- | --- | --- | --- | --- |
| 21G0539 | 427798 | 426248 | 5 | 16/01/2015 | 1205 |
| 21G0539 | 427798 | 426248 | 5 | 16/01/2015 | 1205 |
| 21G1108 | 431142 | 430087 | 1 | 16/01/2015 | 1732 |
| 21H0565 | 434602 | 436699 | 1 | 17/01/2015 | 930 |
| 21H0638 | 434254 | 434318 | 2 | 17/01/2015 | 1315 |
| 21H0638 | 434254 | 434318 | 2 | 17/01/2015 | 1315 |
Therefore the number of vehicles involved in accident 21G0539 was 5, and in accident 21H0638 it was 2. Overall, in the example above, a total of 9 vehicles were involved in accidents.
A useful tool for analysing the data is an Excel pivot table: pivot tables help summarise large amounts of data in an easy-to-view table. For further information on pivot tables, visit here.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Plant species collected throughout Benin were published on the GBIF site, and data concerning those species were downloaded from GBIF. Using an Excel dynamic pivot table, we derived the checklist of plant species of Benin from the downloaded dataset.
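Deriving a checklist this way is essentially deduplication over taxon names. A rough pandas equivalent, assuming a standard GBIF occurrence download with Darwin Core 'family' and 'species' columns (the file name is illustrative):

```python
import pandas as pd

# GBIF simple occurrence downloads are tab-separated; 'family' and
# 'species' are standard Darwin Core columns in such exports.
occ = pd.read_csv("gbif_benin_plants.csv", sep="\t",
                  usecols=["family", "species"])

# One row per distinct species = the checklist the pivot table produces.
checklist = (occ.dropna(subset=["species"])
                .drop_duplicates(subset="species")
                .sort_values(["family", "species"]))
print(len(checklist), "species")
```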
The purpose of the WASH KAP survey was to collect primary data on several indicators related to the WASH Program implemented in the refugee and host communities of Palabek Settlement, Uganda. The survey aimed to assess the level of improvement in the accessibility of WASH facilities after a 2-year intervention project.
The survey used a cross-sectional design, and both qualitative and quantitative techniques, such as UNHCR standard WASH questionnaires, field visits, and observations, were employed during the study. In 2019/20, the LWF provided WASH services to both the refugee settlement and the host community living in and around Palabek settlement. To gauge the coverage, the LWF conducted this KAP survey. The respondents were drawn from the host community (238 households) and the refugee settlement (446 households).
Palabek
Households
Refugees and host community
Sample survey data [ssd]
A sample size of 578 respondents was determined for the survey using a sample size calculator, with a 4% margin of error and a 95% confidence level. The survey process, however, reached a total of 684 respondents. Analysis, interpretation, and presentation of data were undertaken using Microsoft Excel pivot tables and charts.
Computer Assisted Personal Interview [capi]
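For reference, one plausible route to a figure near 578 is Cochran's sample-size formula with a finite population correction; the population size used below is purely illustrative, since the survey documentation here does not state it.

```python
import math

# Cochran's formula with a finite population correction -- one plausible
# way a sample size calculator arrives near 578 for a 4% margin of error
# at 95% confidence. N below is a hypothetical household count, not a
# figure from the survey documentation.
z, p, e = 1.96, 0.5, 0.04
n0 = (z**2 * p * (1 - p)) / e**2    # ~600 for an infinite population
N = 16_000                          # hypothetical population size
n = n0 / (1 + (n0 - 1) / N)         # finite population correction
print(math.ceil(n0), math.ceil(n))  # 601, ~579
```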
Stream Temperature: Site: Gwynns Falls at Gwynnbrook (GFGB):
In the Baltimore urban long-term ecological research (LTER) project (Baltimore Ecosystem Study, BES), we use the watershed approach to evaluate integrated ecosystem function. The LTER research is centered on the Gwynns Falls watershed, a 17,150 ha catchment that traverses a gradient from the urban core of Baltimore, through older urban residential (1900-1950) and suburban (1950-1980) zones, to rapidly suburbanizing areas and a rural/suburban fringe.
Stream temperature is continuously measured throughout the Gwynns Falls watershed along with supplemental sites around Baltimore County/City. A total of 22 sites contain sensors (HOBO Pro v2 Water Temperature Data Logger - U22-001) that take an instantaneous temperature reading every 2 minutes. These data are downloaded on a monthly basis.
This dataset is for the Gwynns Falls at Gwynnbrook/Delight site. This site samples drainage from approximately 1,000 ha of old and new suburban and suburbanizing land use.
A detailed description of this site is posted at: http://md.water.usgs.gov/BES/01589197/
Streamflow data for this site are posted at: http://waterdata.usgs.gov/md/nwis/nwisman?site_no=01589197
Purpose: Long-term monitoring of stream temperature in a suburban catchment.
Theme keywords: stream, watershed, temperature, suburban, Baltimore Ecosystem Study
Coordinates: Lat/Long
39.4430 (39 26 35), -76.7834 (-76 47 00)
Review process for BES stream temperature data:
Raw data were recorded and logged every 2 minutes using the HOBO Pro v2 Water Temperature Data Logger (U22-001).
Data are exported into Microsoft Excel documents, then organized by site and by month.
Each month's data were entered into a pivot table in Microsoft Excel, and daily means and counts of daily data points were calculated (see the sketch after this list).
Sites in close geographic proximity were plotted on the same graph to illustrate possible outlier data.
Missing and odd data were flagged, and notes taken from the field visits are provided where applicable.
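The pivot-table step above (daily means and counts from 2-minute readings) maps onto a one-line resample outside Excel. A sketch with assumed column names for the logger export:

```python
import pandas as pd

# Daily means and counts from 2-minute logger readings -- a pandas
# equivalent of the Excel pivot-table step in the review process above.
# Column names ("timestamp", "temp_c") and the file name are assumptions
# about the HOBO export format.
log = pd.read_csv("GFGB_month.csv", parse_dates=["timestamp"])

daily = (log.set_index("timestamp")["temp_c"]
            .resample("D")
            .agg(["mean", "count"]))

# A full day of 2-minute readings has count == 720 (24 * 60 / 2); lower
# counts flag gaps worth checking against the field notes.
print(daily[daily["count"] < 720])
```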
License: https://creativecommons.org/publicdomain/zero/1.0/
Analyzing Coffee Shop Sales: Excel Insights 📈
In my first data analytics project, I discover the secrets of a fictional coffee shop's success with my data-driven analysis. By analyzing a 5-sheet Excel dataset, I've uncovered valuable sales trends, customer preferences, and insights that can guide future business decisions. 📊☕
DATA CLEANING 🧹
• REMOVED DUPLICATES OR IRRELEVANT ENTRIES: Thoroughly eliminated duplicate records and irrelevant data to refine the dataset for analysis.
• FIXED STRUCTURAL ERRORS: Rectified any inconsistencies or structural issues within the data to ensure uniformity and accuracy.
• CHECKED FOR DATA CONSISTENCY: Verified the integrity and coherence of the dataset by identifying and resolving any inconsistencies or discrepancies.
DATA MANIPULATION 🛠️
• UTILIZED LOOKUPS: Used Excel's lookup functions for efficient data retrieval and analysis.
• IMPLEMENTED INDEX MATCH: Leveraged the INDEX-MATCH combination to perform advanced data searches and matches.
• APPLIED SUMIFS FUNCTIONS: Utilized SUMIFS to calculate totals based on specified criteria.
• CALCULATED PROFITS: Used relevant formulas and techniques to determine profit margins and insights from the data (see the sketch below).
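For readers working outside Excel, these manipulation steps map onto pandas operations: merge stands in for LOOKUP/INDEX-MATCH, and a filtered aggregation stands in for SUMIFS. A minimal sketch with illustrative columns, since the project's actual sheet layout is not shown here:

```python
import pandas as pd

# Illustrative stand-ins for the Excel steps: column and table contents
# are assumptions, not the project's actual layout.
orders = pd.DataFrame({
    "product_id": [1, 2, 1, 3],
    "quantity":   [2, 1, 3, 1],
})
products = pd.DataFrame({
    "product_id": [1, 2, 3],
    "product":    ["Latte", "Espresso", "Mocha"],
    "price":      [4.0, 3.0, 4.5],
    "cost":       [1.5, 1.0, 2.0],
})

# LOOKUP / INDEX-MATCH equivalent: pull price and cost onto each order.
orders = orders.merge(products, on="product_id")

# SUMIFS equivalent: total revenue for rows meeting a criterion.
latte_revenue = (orders.loc[orders["product"] == "Latte", "quantity"]
                 * orders.loc[orders["product"] == "Latte", "price"]).sum()

# Profit calculation.
orders["profit"] = (orders["price"] - orders["cost"]) * orders["quantity"]
print(latte_revenue, orders["profit"].sum())
```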
PIVOTING THE DATA 𝄜
• CREATED PIVOT TABLES: Utilized Excel's PivotTable feature to pivot the data for in-depth analysis.
• FILTERED DATA: Utilized pivot tables to filter and analyze specific subsets of data, enabling focused insights, especially for the “PEAK HOURS” and “TOP 3 PRODUCTS” charts (see the sketch below).
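The two pivot views called out above each reduce to a single group-and-aggregate. A rough analog with assumed column names (the transaction sheet itself is not shown here):

```python
import pandas as pd

# Assumed columns -- placeholders for the project's transaction sheet.
sales = pd.DataFrame({
    "hour":    [8, 8, 9, 10, 8, 9],
    "product": ["Latte", "Mocha", "Latte", "Espresso", "Latte", "Mocha"],
    "amount":  [4.0, 4.5, 4.0, 3.0, 4.0, 4.5],
})

# PEAK HOURS: total sales per hour, the source pivot behind the
# clustered column chart.
peak_hours = sales.groupby("hour")["amount"].sum().sort_values(ascending=False)

# TOP 3 PRODUCTS: the source pivot behind the clustered bar chart.
top3 = sales.groupby("product")["amount"].sum().nlargest(3)
print(peak_hours.head(1), top3, sep="\n")
```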
VISUALIZATION 📊
• KEY INSIGHTS: Unveiled the grand total sales revenue while also analyzing the average bill per person, offering comprehensive insights into the coffee shop's performance and customer spending habits.
• SALES TREND ANALYSIS: Used a line chart to show total sales across various time intervals, revealing valuable insights into evolving sales trends.
• PEAK HOUR ANALYSIS: Leveraged a clustered column chart to identify peak sales hours, shedding light on optimal operating times and potential staffing needs.
• TOP 3 PRODUCTS IDENTIFICATION: Utilized a clustered bar chart to determine the top three coffee types, facilitating strategic decisions regarding inventory management and marketing focus.
I also used a Timeline to filter the visuals chronologically and identify key patterns over specific periods.
While it's a significant milestone for me, I recognize that there's always room for growth and improvement. Your feedback and insights are invaluable to me as I continue to refine my skills and tackle future projects. I'm eager to hear your thoughts and suggestions on how I can make my next endeavor even more impactful and insightful.
THANKS TO: WsCube Tech, Mo Chen, Alex Freberg
TOOLS USED: Microsoft Excel
License: Open Government Licence - Canada 2.0, https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
The National Pollutant Release Inventory (NPRI) is Canada's public inventory of pollutant releases (to air, water and land), disposals and transfers for recycling. Each file contains data from 1993 to the latest reporting year. These CSV format datasets are in normalized or 'list' format and are optimized for pivot table analyses. Here is a description of each file:

- The RELEASES file contains all substance release quantities.
- The DISPOSALS file contains all on-site and off-site disposal quantities, including tailings and waste rock (TWR).
- The TRANSFERS file contains all quantities transferred for recycling or treatment prior to disposal.
- The COMMENTS file contains all the comments provided by facilities about substances included in their report.
- The GEO LOCATIONS file contains complete geographic information for all facilities that have reported to the NPRI.

Please consult the following resources to enhance your analysis:

- Guide on using and interpreting NPRI data: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/using-interpreting-data.html
- Access additional data from the NPRI, including datasets and mapping products: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/exploredata.html

Supplemental Information: More NPRI datasets and mapping products are available here: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/access.html

Supporting Projects: National Pollutant Release Inventory (NPRI)
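Because the files are already normalized 'list' data, they pivot directly. A sketch, assuming the RELEASES file carries Substance, Year, and Quantity columns (verify against the actual header; the file name is illustrative):

```python
import pandas as pd

# Assumed column and file names; verify against the real RELEASES file.
releases = pd.read_csv("NPRI_RELEASES.csv")

# Normalized "list" data pivots directly: total releases by substance
# and year, the same cross-tab an Excel pivot table would build.
by_year = pd.pivot_table(releases, index="Substance", columns="Year",
                         values="Quantity", aggfunc="sum", fill_value=0)
print(by_year.tail())
```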
License: MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
Performed in-depth analysis of Myntra's e-commerce data using Excel to identify sales trends, customer behavior, and performance metrics. Leveraged advanced Excel functionalities, including pivot tables, charts, conditional formatting, and data cleaning techniques, to derive actionable insights and create visually compelling reports.
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains video game sales data prepared for an Excel data analysis and dashboard project.
It includes detailed information on:
Game titles
Platforms
Genres
Publishers
Regional and global sales
The dataset was cleaned, structured, and analyzed in Microsoft Excel to explore patterns in the global video game market. It can be used to:
Practice data cleaning and pivot tables
Build interactive dashboards
Perform sales comparisons across regions and genres
Develop business insights from entertainment data
🧩 File Information
Format: .xlsx (Excel Workbook)
Columns: Name, Platform, Year, Genre, Publisher, NA_Sales, EU_Sales, JP_Sales, Other_Sales, Global_Sales
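With the columns listed above, a first pivot is immediate. A sketch of the kind of summary the project's Excel pivot tables compute (the workbook file name is hypothetical):

```python
import pandas as pd

# Column names come from the File Information section above; the file
# name is a placeholder for the actual workbook.
games = pd.read_excel("video_game_sales.xlsx")

# Total sales per genre across each region -- the core pivot behind a
# regional-comparison dashboard.
pivot = pd.pivot_table(
    games,
    index="Genre",
    values=["NA_Sales", "EU_Sales", "JP_Sales", "Global_Sales"],
    aggfunc="sum",
).sort_values("Global_Sales", ascending=False)
print(pivot.head())
```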
💡 Use Cases
Excel dashboard and chart creation
Data visualization and storytelling
Business and market analysis practice
Portfolio or learning projects
👤 Prepared by
Adewale Lateef W — for data analysis and Excel dashboard learning purposes.
This dataset illustrates customer data from bike sales. It contains information such as Income, Occupation, Age, Commute, Gender, Children, and more. This is fictional data, created and used for data exploration and cleaning.
The link for the Excel project to download can be found on GitHub here. It includes the raw data, the cleaned data, Pivot Tables, and a dashboard with Pivot Charts and Slicers for interaction. The Slicers allow the interactive dashboard to be filtered by Marital Status, Region, and Education.
Below is a screenshot of the dashboard for ease.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2Fcbc9db6fe00f3201c64e4fdb668ce9d1%2FBikeBuyers%20Dashboard%20Image.png?generation=1686186378985936&alt=media
License: http://opendatacommons.org/licenses/dbcl/1.0/
This is a condensed version of the raw data obtained through the Google Data Analytics Course, made available by Lyft and the City of Chicago under this license (https://ride.divvybikes.com/data-license-agreement).
I originally did my study in another platform, and the original files were too large to upload to Posit Cloud in full. Each of the 12 monthly files contained anywhere from 100k to 800k rows. Therefore, I decided to reduce the number of rows drastically by performing grouping, summaries, and thoughtful omissions in Excel for each csv file. What I have uploaded here is the result of that process.
Data is grouped by: month, day, rider_type, bike_type, and time_of_day. total_rides represents the count of rides in each grouping, which is also the number of original rows combined to make the new summarized row; avg_ride_length is the calculated average of all data in each grouping.
Be sure to use weighted averages if you want to calculate the mean of avg_ride_length for different subgroups, as the values in this file are already averages of the summarized groups. Include the total_rides value in your weighted average calculation to weight properly.
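A short sketch of that weighted calculation, using the total_rides and avg_ride_length columns described here (the file name is hypothetical):

```python
import pandas as pd

# Weighted mean of avg_ride_length, weighted by total_rides -- required
# because each row is already an average over many original rides.
df = pd.read_csv("divvy_2022_summary.csv")

weighted = df.groupby("rider_type").apply(
    lambda g: (g["avg_ride_length"] * g["total_rides"]).sum()
              / g["total_rides"].sum()
)
# A plain df.groupby("rider_type")["avg_ride_length"].mean() would
# over-weight small groups.
print(weighted)
```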
- date: year, month, and day in date format; includes all days in 2022.
- day_of_week: actual day of the week as a character value. Set up a new sort order if needed.
- rider_type: values are either 'casual' (those who pay per ride) or 'member' (riders who have annual memberships).
- bike_type: values are 'classic' (non-electric, traditional bikes) or 'electric' (e-bikes).
- time_of_day: divides the day into six equal time frames of 4 hours each, starting at 12AM. Each individual ride was placed into one of these time frames using the time it STARTED, even if the ride was long enough to end in a later time frame. This column was added to help summarize the original dataset.
- total_rides: count of all individual rides in each grouping (row). This column was added to help summarize the original dataset.
- avg_ride_length: the calculated average of all rides in each grouping (row). Look to total_rides to know how many original ride-length values were included in this average. This column was added to help summarize the original dataset.
- min_ride_length: minimum ride length of all rides in each grouping (row). This column was added to help summarize the original dataset.
- max_ride_length: maximum ride length of all rides in each grouping (row). This column was added to help summarize the original dataset.
Please note: the time_of_day column has inconsistent spacing. Use mutate(time_of_day = gsub(" ", "", time_of_day)) to remove all spaces.
Below is the list of revisions I made in Excel before uploading the final csv files to the R environment:
Deleted station location columns and lat/long as much of this data was already missing.
Deleted ride id column since each observation was unique and I would not be joining with another table on this variable.
Deleted rows pertaining to "docked bikes" since there were no member entries for this type and I could not compare member vs casual rider data. I also received no information in the project details about what constitutes a "docked" bike.
Used ride start time and end time to calculate a new column called ride_length (by subtracting), and deleted all rows with 0 and 1 minute results, which were explained in the project outline as being related to staff tasks rather than users. An example would be taking a bike out of rotation for maintenance.
Placed start time into a range of times (time_of_day) in order to group more observations while maintaining general time data. time_of_day now represents a time frame when the bike ride BEGAN. I created six 4-hour time frames, beginning at 12AM.
Added a Day of Week column, with Sunday = 1 and Saturday = 7, then changed from numbers to the actual day names.
Used pivot tables to group total_rides, avg_ride_length, min_ride_length, and max_ride_length by date, rider_type, bike_type, and time_of_day (see the sketch after this list).
Combined everything into one CSV file with all months, containing fewer than 9,000 rows (instead of several million).
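The pivot-table step in the list above corresponds to a single group-and-aggregate. A sketch against the original ride-level data, with assumed pre-summary column names (ride_length, date, rider_type, bike_type, time_of_day):

```python
import pandas as pd

# rides = the original ride-level data after cleaning; column names
# mirror the summary columns described above but are assumptions about
# the pre-summary layout, as is the file name.
rides = pd.read_csv("rides_2022_clean.csv", parse_dates=["date"])

summary = (rides
    .groupby(["date", "rider_type", "bike_type", "time_of_day"])
    ["ride_length"]
    .agg(total_rides="count", avg_ride_length="mean",
         min_ride_length="min", max_ride_length="max")
    .reset_index())

summary.to_csv("divvy_2022_summary.csv", index=False)
```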