https://creativecommons.org/publicdomain/zero/1.0/
Analyzing Coffee Shop Sales: Excel Insights 📈
In my first data analytics project, I explore what drives a fictional coffee shop's success through data-driven analysis. By analyzing a 5-sheet Excel dataset, I uncovered sales trends, customer preferences, and insights that can guide future business decisions. 📊☕
DATA CLEANING 🧹
• REMOVED DUPLICATES OR IRRELEVANT ENTRIES: Thoroughly eliminated duplicate records and irrelevant data to refine the dataset for analysis.
• FIXED STRUCTURAL ERRORS: Rectified any inconsistencies or structural issues within the data to ensure uniformity and accuracy.
• CHECKED FOR DATA CONSISTENCY: Verified the integrity and coherence of the dataset by identifying and resolving any inconsistencies or discrepancies.
DATA MANIPULATION 🛠️
• UTILIZED LOOKUPS: Used Excel's lookup functions for efficient data retrieval and analysis.
• IMPLEMENTED INDEX MATCH: Combined the INDEX and MATCH functions to perform more flexible lookups across sheets.
• APPLIED SUMIFS FUNCTIONS: Used SUMIFS to calculate totals based on specified criteria.
• CALCULATED PROFITS: Used relevant formulas and techniques to determine profit margins and insights from the data.
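The lookup, SUMIFS, and profit steps above can be sketched outside Excel as well. The following is a minimal Python sketch, not the workbook's actual formulas: the product table, the order rows, and every column name (`unit_price`, `unit_cost`, `store`, and so on) are hypothetical, since the write-up does not show the sheets.

```python
# Hypothetical sample data; the real workbook's sheets and columns are not shown.
products = {
    "P1": {"name": "Latte", "unit_price": 4.0, "unit_cost": 1.5},
    "P2": {"name": "Espresso", "unit_price": 3.0, "unit_cost": 1.0},
}
orders = [
    {"product_id": "P1", "quantity": 2, "store": "Downtown"},
    {"product_id": "P2", "quantity": 1, "store": "Downtown"},
    {"product_id": "P1", "quantity": 3, "store": "Uptown"},
]


def lookup(product_id, field):
    """Rough analogue of INDEX/MATCH: find the row by key, return one column."""
    return products[product_id][field]


def sum_ifs(rows, value_fn, **criteria):
    """Rough analogue of SUMIFS: sum value_fn(row) over rows matching all criteria."""
    return sum(
        value_fn(row)
        for row in rows
        if all(row[key] == wanted for key, wanted in criteria.items())
    )


def revenue(row):
    # Quantity times the looked-up unit price.
    return row["quantity"] * lookup(row["product_id"], "unit_price")


def profit(row):
    # Quantity times (price minus cost), both looked up from the product table.
    margin = lookup(row["product_id"], "unit_price") - lookup(row["product_id"], "unit_cost")
    return row["quantity"] * margin


downtown_revenue = sum_ifs(orders, revenue, store="Downtown")
latte_profit = sum_ifs(orders, profit, product_id="P1")
print(downtown_revenue, latte_profit)
```

Keeping the lookup separate from the conditional sum mirrors how the two Excel functions are usually combined: one fetches a value per row, the other aggregates the rows that match.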
PIVOTING THE DATA 𝄜
• CREATED PIVOT TABLES: Utilized Excel's PivotTable feature to pivot the data for in-depth analysis.
• FILTERED DATA: Used pivot tables to filter and analyze specific subsets of data, enabling focused insights. Used especially for the “PEAK HOURS” and “TOP 3 PRODUCTS” charts.
VISUALIZATION 📊
• KEY INSIGHTS: Unveiled the grand total sales revenue while also analyzing the average bill per person, offering comprehensive insights into the coffee shop's performance and customer spending habits.
• SALES TREND ANALYSIS: Used a line chart to plot total sales across time intervals, revealing how sales trends evolve.
• PEAK HOUR ANALYSIS: Used a clustered column chart to identify peak sales hours, shedding light on optimal operating times and potential staffing needs.
• TOP 3 PRODUCTS IDENTIFICATION: Used a clustered bar chart to determine the top three coffee types, supporting decisions on inventory management and marketing focus.
*I also used a Timeline to visualize chronological data trends and identify key patterns over specific times.
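The aggregations behind these charts (grand total, average bill, peak hour, top three products) can be sketched in a few lines of Python. This is an illustrative sketch only: the transaction rows are made up, and it assumes one row per bill, which the write-up does not confirm.

```python
from collections import Counter, defaultdict

# Hypothetical transactions: (hour of day, product, amount).
transactions = [
    (8, "Latte", 4.0),
    (8, "Espresso", 3.0),
    (9, "Latte", 4.0),
    (9, "Mocha", 5.0),
    (9, "Latte", 4.0),
    (14, "Cappuccino", 4.5),
]

# Grand total and average bill (assumes each row is one customer's bill).
total_revenue = sum(amount for _, _, amount in transactions)
average_bill = total_revenue / len(transactions)

# Sales per hour, as a pivot table grouped by hour would produce.
sales_by_hour = defaultdict(float)
for hour, _, amount in transactions:
    sales_by_hour[hour] += amount
peak_hour = max(sales_by_hour, key=sales_by_hour.get)

# Revenue per product, then keep the three largest.
sales_by_product = Counter()
for _, product, amount in transactions:
    sales_by_product[product] += amount
top_three = [name for name, _ in sales_by_product.most_common(3)]

print(total_revenue, round(average_bill, 2), peak_hour, top_three)
```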
While it's a significant milestone for me, I recognize that there's always room for growth and improvement. Your feedback and insights are invaluable to me as I continue to refine my skills and tackle future projects. I'm eager to hear your thoughts and suggestions on how I can make my next endeavor even more impactful and insightful.
THANKS TO: WsCube Tech Mo Chen Alex Freberg
TOOLS USED: Microsoft Excel
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
About Datasets:
- Domain: Finance
- Project: Bank loan of customers
- Datasets: Finance_1.xlsx & Finance_2.xlsx
- Dataset Type: Excel Data
- Dataset Size: Each Excel file has 39k+ records
KPIs:
1. Year-wise loan amount stats
2. Grade-wise and sub-grade-wise revol_bal
3. Total payment for verified status vs. total payment for non-verified status
4. State-wise loan status
5. Month-wise loan status
6. Get more insights based on your understanding of the data
Process:
1. Understanding the problem
2. Data collection
3. Data cleaning
4. Exploring and analyzing the data
5. Interpreting the results
This project uses Power Query, Power Pivot, merged data, a clustered bar chart, a clustered column chart, a line chart, a 3D pie chart, a dashboard, slicers, a timeline, and formatting techniques.
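As a rough illustration of the merge step and two of the KPIs listed above, here is a minimal Python sketch. The rows and the column names (`id`, `issue_year`, `loan_amnt`, `verification_status`, `total_pymnt`) are assumptions made for the example; the description does not state which columns the two files contain or which key joins them.

```python
from collections import defaultdict

# Hypothetical rows standing in for Finance_1.xlsx and Finance_2.xlsx.
finance_1 = [
    {"id": 1, "issue_year": 2010, "loan_amnt": 5000, "verification_status": "Verified"},
    {"id": 2, "issue_year": 2010, "loan_amnt": 2500, "verification_status": "Not Verified"},
    {"id": 3, "issue_year": 2011, "loan_amnt": 10000, "verification_status": "Verified"},
]
finance_2 = [
    {"id": 1, "total_pymnt": 5800.0},
    {"id": 2, "total_pymnt": 1000.0},
    {"id": 3, "total_pymnt": 12000.0},
]

# Inner join on the assumed shared key, similar to a Power Query merge.
payments_by_id = {row["id"]: row for row in finance_2}
merged = [
    {**row, **payments_by_id[row["id"]]}
    for row in finance_1
    if row["id"] in payments_by_id
]

# KPI 1: total loan amount per year.
# KPI 3: total payment per verification status.
loan_by_year = defaultdict(int)
payment_by_status = defaultdict(float)
for row in merged:
    loan_by_year[row["issue_year"]] += row["loan_amnt"]
    payment_by_status[row["verification_status"]] += row["total_pymnt"]

print(dict(loan_by_year), dict(payment_by_status))
```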
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
"Classification and Quantification of Strawberry Fruit Shape" is a dataset that includes raw RGB images and binary images of strawberry fruit. These folders contain JPEG images taken from the same experimental units on 2 different harvest dates. Images in each folder are labeled according to the 4 digit plot ID from the field experiment (####_) and the 10 digit individual ID (_##########).
"H1" and "H2" folders contain RGB images of multiple fruits. Each fruit was extracted and binarized to become the images in "H1_indiv" and "H2_indiv".
"H1_indiv" and "H2_indiv" folders contain images of individual fruit. Each fruit is bordered by ten white pixels. There are a total of 6,874 images between these two folders. The images were then resized and scaled to become the images in "ReSized".
"ReSized" contains 6,874 binary images of individual berries. These images are all square images (1000x1000px) with the object represented by black pixels (0) and background represented with white pixels (1). Each image was scaled so that it would take up the maximum number of pixels in a 1000 x 1000px image and would maintain the aspect ratio.
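The aspect-preserving scaling described above can be sketched as a small geometry calculation. This is a hedged sketch rather than the authors' procedure: the function name `fit_to_square` is hypothetical, and centering the scaled object in the square is an assumption the description does not state.

```python
def fit_to_square(width, height, side=1000):
    """Scale a (width x height) object to fill a side x side square as much
    as possible while keeping its aspect ratio.

    Returns the scale factor, the scaled size, and the top-left offset that
    would center the object (centering is an assumption).
    """
    scale = side / max(width, height)  # the longer edge fills the square
    new_width = round(width * scale)
    new_height = round(height * scale)
    offset_x = (side - new_width) // 2
    offset_y = (side - new_height) // 2
    return scale, (new_width, new_height), (offset_x, offset_y)


# A 400 x 500 px fruit is doubled in size; its height fills the square.
print(fit_to_square(400, 500))
```

Scaling by the longer edge is what guarantees the object never exceeds the square while still using as many pixels as the aspect ratio allows.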
"Fruit_image_data.csv" contains all of the morphometric features extracted from individual images including intermediate values.
All images titled with the form "B##_NA" were discarded prior to any analyses. These images come from the buffer plots, not the experimental units of the study.
"PPKC_Figures.zip" contains all figures (F1-F7) and supplemental figures (S1-S7) from the manuscript. Captions for the main figures are found in the manuscript. Captions for the supplemental figures are below.
Fig. S1 Results of PPKC against original cluster assignments. Ordered centroids from k = 2 to k = 8. On the left are the unordered assignments from k-means, and on the right are the ordered assignments following PPKC. Cluster position indicated on the right [1, 8].
Fig. S2 Optimal Value of k. (A) Total within clusters sum of squares. (B) The inverse of the Adjusted R². (C) Akaike information criterion (AIC). (D) Bayesian information criterion (BIC). All metrics were calculated on a random sample of 3,437 images (50%). 10 samples were randomly drawn. The vertical dashed line in each plot represents the optimal value of k. Reported metrics are standardized to be between [0, 1].
Fig. S3 Hierarchical clustering and distance between classes on PC1. The relationship between clusters at each value of k is represented as both a dendrogram and a bar plot. The labels on the dendrogram (i.e., V1, V2, V3,..., V10) represent the original cluster assignment from k-means. The bar plot to the right of each dendrogram depicts the elements of the eigenvector associated with the largest eigenvalue from PPKC. The labels above each line represent the original cluster assignment.
Fig. S4 BLUPs for 13 selected features. For each plot, the X-axis is the index and the Y-axis is the BLUP value estimated from a linear mixed model. Grey points represent the mean feature value for each individual. Each point is the BLUP for a single genotype.
Fig. S5 Effects of Eigenfruit, Vertical Biomass, and Horizontal Biomass Analyses. (A) Effects of PC [1, 7] from the Eigenfruit analysis on the mean shape (center column). The left column is the mean shape minus 1.5× the standard deviation. Right is the mean shape plus 1.5× the standard deviation. The horizontal axis is the horizontal pixel position. The vertical axis is the vertical pixel position. (B) Effects of PC [1, 3] from the Horizontal Biomass analysis on the mean shape (center column). The left column is the mean shape minus 1.5× the standard deviation. Right is the mean shape plus 1.5× the standard deviation. The horizontal axis is the vertical position from the image (height). The vertical axis is the number of activated pixels (RowSum) at the given vertical position. (C) Effects of PC [1, 3] from the Vertical Biomass analysis on the mean shape (center column). The left column is the mean shape minus 1.5× the standard deviation. Right is the mean shape plus 1.5× the standard deviation. The horizontal axis is the horizontal position from the image (width). The vertical axis is the number of activated pixels (ColSum) at the given horizontal position.
Fig. S6 PPKC with variable sample size. Ordered centroids from k = 2 to k = 5 using different image sets for clustering. For all k = [2, 5], k-means clustering was performed using either 100%, 80%, 50%, or 20% of the total number of images; 6,874, 5,500, 3,437, and 1,374 respectively. Cluster position indicated on the right [1, 5].
Fig. S7 Comparison of scale and continuous features. (A.) PPKC 4-unit ordinal scale. (B.) Distributions of the selected features with each level of k = 4 from the PPKC 4-unit ordinal scale. The light gray line is cluster 1, the medium gray line is cluster 2, the dark gray line is cluster 3, and the black line is cluster 4.
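The RowSum and ColSum profiles mentioned in the caption for Fig. S5 can be illustrated with a tiny binary image. This is a sketch under one assumption: that "activated" pixels are the object pixels, which the "ReSized" description encodes as 0 (black) against a background of 1 (white). The 4 x 5 image is made up for the example.

```python
# Hypothetical 4 x 5 binary image: 0 = object (black), 1 = background (white).
image = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
]


def row_sums(img):
    """Number of object pixels at each vertical position (one value per row)."""
    return [sum(1 for pixel in row if pixel == 0) for row in img]


def col_sums(img):
    """Number of object pixels at each horizontal position (one value per column)."""
    width = len(img[0])
    return [sum(1 for row in img if row[col] == 0) for col in range(width)]


print(row_sums(image))
print(col_sums(image))
```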