26 datasets found
  1. Dairy Goods Sales Dataset

    • kaggle.com
    zip
    Updated Jun 6, 2023
    Cite
    Suraj (2023). Dairy Goods Sales Dataset [Dataset]. https://www.kaggle.com/datasets/suraj520/dairy-goods-sales-dataset
    Available download formats: zip (232961 bytes)
    Dataset updated
    Jun 6, 2023
    Authors
    Suraj
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The Dairy Goods Sales Dataset provides a detailed and comprehensive collection of data related to dairy farms, dairy products, sales, and inventory management. This dataset encompasses a wide range of information, including farm location, land area, cow population, farm size, production dates, product details, brand information, quantities, pricing, shelf life, storage conditions, expiration dates, sales information, customer locations, sales channels, stock quantities, stock thresholds, and reorder quantities.

    Features:

    1. Location: The geographical location of the dairy farm.
    2. Total Land Area (acres): The total land area occupied by the dairy farm.
    3. Number of Cows: The number of cows present in the dairy farm.
    4. Farm Size: The size of the dairy farm (in sq. km).
    5. Date: The date of data recording.
    6. Product ID: The unique identifier for each dairy product.
    7. Product Name: The name of the dairy product.
    8. Brand: The brand associated with the dairy product.
    9. Quantity (liters/kg): The quantity of the dairy product available.
    10. Price per Unit: The price per unit of the dairy product.
    11. Total Value: The total value of the available quantity of the dairy product.
    12. Shelf Life (days): The shelf life of the dairy product in days.
    13. Storage Condition: The recommended storage condition for the dairy product.
    14. Production Date: The date of production for the dairy product.
    15. Expiration Date: The date of expiration for the dairy product.
    16. Quantity Sold (liters/kg): The quantity of the dairy product sold.
    17. Price per Unit (sold): The price per unit at which the dairy product was sold.
    18. Approx. Total Revenue (INR): The approximate total revenue generated from the sale of the dairy product.
    19. Customer Location: The location of the customer who purchased the dairy product.
    20. Sales Channel: The channel through which the dairy product was sold (Retail, Wholesale, Online).
    21. Quantity in Stock (liters/kg): The quantity of the dairy product remaining in stock.
    22. Minimum Stock Threshold (liters/kg): The minimum stock threshold for the dairy product.
    23. Reorder Quantity (liters/kg): The recommended quantity to reorder for the dairy product.

    Potential Use-Case:

    This dataset can be used by researchers, analysts, and businesses in the dairy industry for various purposes, such as:

    1. Analyzing the performance of dairy farms based on location, land area, and cow population.
    2. Understanding the sales and distribution patterns of different dairy products across various brands and regions.
    3. Studying the impact of storage conditions and shelf life on the quality and availability of dairy products.
    4. Analyzing customer preferences and buying behavior based on location and sales channels.
    5. Optimizing inventory management by tracking stock quantities, minimum thresholds, and reorder quantities.
    6. Conducting market research and trend analysis in the dairy industry.
    7. Developing predictive models for demand forecasting and pricing strategies.
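
    Use case 5 can be illustrated with a minimal sketch, assuming each record carries the three stock fields listed under Features (21 to 23). The `StockRecord` class, the `reorder_plan` helper, the sample values, and the "strictly below the threshold" rule are illustrative assumptions, not part of the dataset or its documentation.

```python
from dataclasses import dataclass


@dataclass
class StockRecord:
    # Hypothetical container; field names mirror the dataset's column descriptions.
    product_name: str
    quantity_in_stock: float    # "Quantity in Stock (liters/kg)"
    min_stock_threshold: float  # "Minimum Stock Threshold (liters/kg)"
    reorder_quantity: float     # "Reorder Quantity (liters/kg)"


def reorder_plan(records):
    """Return (product_name, reorder_quantity) pairs for records below threshold.

    Assumption: a reorder is triggered when stock is strictly below the
    minimum threshold; the dataset description does not state the exact rule.
    """
    return [
        (r.product_name, r.reorder_quantity)
        for r in records
        if r.quantity_in_stock < r.min_stock_threshold
    ]


# Made-up sample rows for illustration only.
sample = [
    StockRecord("Milk", 20.0, 50.0, 100.0),
    StockRecord("Paneer", 80.0, 30.0, 60.0),
]
plan = reorder_plan(sample)
print(plan)
```

    With real data, the same check would run over rows loaded from the CSV file, one `StockRecord` per row.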

    Note: This dataset covers the period from 2019 to 2022 and focuses on selected dairy brands operating in specific states and union territories of India. The figures contain an intentional drift, which the author attributes to the dataset's open-source and creative license.

  2. Bellabeat - Case Study (Google Career Certificate)

    • kaggle.com
    zip
    Updated Feb 17, 2024
    Cite
    Alexandra Loop (2024). Bellabeat - Case Study (Google Career Certificate) [Dataset]. https://www.kaggle.com/datasets/alexandraloop/bellabeat-case-study-google-career-certificate/code
    Available download formats: zip (25317533 bytes)
    Dataset updated
    Feb 17, 2024
    Authors
    Alexandra Loop
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Analyst: Alexandra Loop
    Date: 12/02/2024

    Business Task:

    Questions to be answered:

    • What are trends in non-Bellabeat smart device usage?
    • What do these trends suggest for Bellabeat customers?
    • How could these trends help influence Bellabeat marketing strategy?

    Description of Data Sources:

    Data Set to be studied: FitBit Fitness Tracker Data: Pattern Recognition with tracker data: Improve Your Overall Health

    Data privacy: Data was sourced from a public dataset available on Kaggle. Information has been anonymized prior to being posted online.
    

    Bias: Due to the degree of anonymity in this study, the only demographic data available in this study is weight, and other cultural differences or lifestyle requirements cannot be accounted for. The sample size is quite small. The time period of the study is only a month so the observer effect could conceivably still be influencing the sample groups. We also have no information on the weather in the region studied. April and May are very variable months in terms of accessible outdoor activities.

    Process:

    Cleaning Process: After going through the data to find duplicates, whitespace, and nulls, I have determined that this set of data has been well-cleaned and already aggregated into several reasonably sized spreadsheets.

    Trim: No issues found

    Consistent length ID: No issues found

    Irrelevant columns: In WLI_M, the fat column is not consistently filled in, so it is not productive to use it in analysis. Sedentary_active_distance was mostly filled with nulls and could confuse the data. I have removed both columns.

    Irrelevant rows: 77 rows in daily_Activity_merged had 0s across the board. As there is little chance that someone would take zero steps, I decided to interpret these days as ones where people did not put on the Fitbit. As such, they are irrelevant rows, and all 77 were removed. 85 rows in daily_intensities_merged registered 0 minutes of sedentary activity, which I do not believe to be possible. Row 241 logged 2 minutes of sedentary activity; I have determined it to be unusable. Row 322 likewise does not add up to a day's minutes and has been deleted. These 85 rows were removed. 7 rows had 1440 sedentary minutes, which I have determined to be time when the device was on but not used. The implication of their presence was noted.

    Scientifically debunked information: BMI as a measurement has been determined to be problematic on many lines: it misrepresents non-white people who have different healthy body types, does not account for muscle mass or scoliosis, has been known to change definitions in accordance with business interests rather than health data, and was never meant to be used as a measure of individual health. I have removed the BMI column from the Weight Log Info chart.

    Cleaning Process 1: I have elected to see what can be found in the data as it was organized by the providers first.
    Cleaning Process 2: I calculated and removed rows where the participants did not put on the fitbit. These rows were removed, and the implications of their presence have been noted. Found Averages, Minimum, and Maximum Values of Steps, distance, types of active minutes, and calories. Found the sum of all kinds of minutes documented to check for inconsistencies. Found the difference between total minutes and a full 1440 minutes. I tried to make a pie chart to convey the average minutes of activity, and so created a duplicate dataset to trim down and remove misleading data caused by different inputs.
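
    The row checks described in the cleaning notes above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, parameter names, and labels are hypothetical, and the analyst's actual spreadsheet steps are not shown in the source.

```python
MINUTES_PER_DAY = 1440


def classify_day(total_steps, very_active, fairly_active, lightly_active, sedentary):
    """Label one daily record using the checks from the cleaning notes.

    All parameters are minute counts for one day, except total_steps.
    The labels are illustrative, not taken from the original analysis.
    """
    tracked = very_active + fairly_active + lightly_active + sedentary
    if total_steps == 0 and tracked == 0:
        return "not_worn"       # zeros across the board
    if sedentary == MINUTES_PER_DAY:
        return "idle"           # device on but not used
    if sedentary == 0 or tracked > MINUTES_PER_DAY:
        return "inconsistent"   # a day without sedentary time, or too many minutes
    return "ok"


def untracked_minutes(very_active, fairly_active, lightly_active, sedentary):
    """Difference between a full 1440-minute day and the minutes documented."""
    return MINUTES_PER_DAY - (very_active + fairly_active + lightly_active + sedentary)


# Made-up values for illustration only.
print(classify_day(8000, 30, 20, 200, 700))
print(untracked_minutes(30, 20, 200, 700))
```

    Rows labelled anything other than "ok" would then be set aside or removed, depending on the interpretation chosen for each case.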

    Analysis:

    Observations:

    On average, the participants do not seem interested in moderate physical activity, as it was the category with the fewest active minutes. Perhaps advertise the effectiveness of low-impact workouts.

    Very few participants volunteered their weights, and none of those who did lost weight. The person with the highest weight volunteered it only once, near the beginning. Given evidence from the Health At Every Size movement, we cannot deny the possibility that having to be weight conscious could have had negative effects on this individual. I would suggest that weight would be a counterproductive focus for our marketing campaign, as it would make heavier people less likely to want to participate, and any claims of weight loss would be statistically unfounded and open us up to false advertising lawsuits.

    Fully half of the participants had days where they did not put on their Fitbit at all. This amounts to a total of 77-84 lost days of data, meaning that on average participants who did not wear their Fitbit daily lost 5 days of data, though of course some lost significantly more. I would suggest focusing on creating a biometric tracker that is comfortable and rarely needs to be charged so that people will gain more reliable resources from it.

    400 full days of data are recorded, meaning that the participants did not take the device off to sleep, shower, or swim. 280 more have 16...

  3. FitBit Fitness Tracker Data - Capstone Project

    • kaggle.com
    zip
    Updated Mar 14, 2022
    Cite
    Gloria (2022). FitBit Fitness Tracker Data - Capstone Project [Dataset]. https://www.kaggle.com/datasets/gloriarc/fitbit-fitness-tracker-data-capstone-project/discussion
    Available download formats: zip (26793 bytes)
    Dataset updated
    Mar 14, 2022
    Authors
    Gloria
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Capstone Project: How Can a Wellness Technology Company Play It Smart?

    Bellabeat, a high-tech company that manufactures health-focused smart products, is seeking growth opportunities in the global smart device market. The company's Time smart device, which is fashionable, functional, and designed specifically for women, has been selected for a marketing campaign. The founding owners have asked the marketing analytics team to provide high-level recommendations and an analysis of smart device data, specifically "FitBit Data."

    Content

    The dataset was downloaded from https://creativecommons.org/publicdomain/zero/1.0/ and made available through https://wordpress.org/openverse/search/?q=FitBit%20Fitness%20Tracker%20Data. Downloaded into Kaggle.

    Dataset: Daily_Activity_2022_27_02
    Rows: 940
    Columns: 15
    Data is from April 2016 to May 2016.

    30 eligible Fitbit users consented to the submission of personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring. The data includes information about daily activity, steps, and heart rate that can be used to explore users' habits.

    Acknowledgements

    Thank you to all the Google Analytics Course Instructors, all of whom were well versed in their areas of expertise. The instructors helped to build confidence and noted that coding errors are part of the learning process. Thank you to the RStudio Community and all of the online resources provided.

    Inspiration

    The inspiration for selecting this capstone project is the belief that health and well-being are ultimately a person's wealth. If there were a catastrophe tomorrow and we were to lose all of our possessions, we would soon realize that possessions can be recovered and life goes on. When our health is threatened, our world can be turned upside down and we may never recover. Tracking smart device users' activities provides insight into how much activity individuals are getting, for how many minutes, and when we tend to get the most activity. The amount of activity and exercise is important for our overall health. Physical activity guidelines for adults recommend at least 150 minutes (2 hours and 30 minutes) a week of moderate-intensity aerobic physical activity. The benefits of physical activity are numerous: improved bone health, improved weight status, reduced anxiety, and many more, which lead to an overall improved quality of life.

  4. Brewery Operations and Market Analysis Dataset

    • kaggle.com
    zip
    Updated Mar 13, 2025
    Cite
    AQMAR HUSAIN (2025). Brewery Operations and Market Analysis Dataset [Dataset]. https://www.kaggle.com/datasets/aqmarh11/brewery-operations-and-market-analysis-dataset
    Available download formats: zip (27314 bytes)
    Dataset updated
    Mar 13, 2025
    Authors
    AQMAR HUSAIN
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Overview

    This dataset presents real-world brewing records from Arbor Brewing Company, a renowned craft brewery in Bangalore, India, covering January 2020 to December 2020. It offers a comprehensive view of brewing parameters, quality assessments, sales trends, and operational efficiency, all crucial for optimizing brewing processes and market strategies.

    Dataset Contents

    1. Brewing Parameters

    • Fermentation Time (days): Duration of fermentation for each batch.
    • Temperature (°C): Fermentation temperature.
    • pH Level: Acidity measurement affecting beer flavor and stability.
    • Gravity: Indicator of fermentation progress.
    • Alcohol Content (% ABV): Alcohol percentage per batch.
    • Bitterness (IBU): Measurement of hop bitterness.
    • Color (EBC): Beer color intensity based on the European Brewery Convention scale.
    • Ingredient Ratio: Composition of raw materials in each batch.

    2. Beer Styles & Packaging

    • Beer Styles: Includes IPA, Lager, Stout, Pilsner, and Wheat Beer.
    • SKU (Stock Keeping Unit): Packaging formats: Pints, Bottles, Cans, Kegs.
    • Location: Sales across different regions in Bangalore.

    3. Sales & Quality Metrics

    • Volume Produced (liters): Batch-wise beer production.
    • Total Sales (INR): Revenue generated per batch.
    • Quality Score: Sensory evaluation rating for each batch.

    4. Efficiency & Loss Metrics

    • Brewhouse Efficiency (%): Indicator of brewing system efficiency.
    • Losses at Various Stages:
      Brewing Loss (%): Losses due to raw material inefficiencies.
      Fermentation Loss (%): Yield reduction during fermentation.
      Bottling/Kegging Loss (%): Losses occurring during final packaging.

    Applications

    • Brewing Process Optimization: Identifying brewing conditions that improve beer quality.
    • Market Analysis: Analyzing sales trends based on beer styles and packaging choices.
    • Supply Chain Optimization: Reducing brewing and packaging losses to improve profitability.
    • Quality Control: Evaluating the impact of brewing parameters on final product consistency.

    Data Format

    • File Format: CSV (Comma-Separated Values)
    • Total Records: 583 batches
    • Date Range: January 1, 2020 to December 30, 2020

    Intended Audience

    This dataset is valuable for brewing professionals, data scientists, market analysts, and quality control experts. It also serves as an excellent resource for academic research in food technology, fermentation science, and business analytics.

    Disclaimer

    While derived from real brewing practices, this dataset has been anonymized and processed for educational and analytical purposes. Data accuracy should be validated for business use.

  5. The Price and Sales of Avocado

    • kaggle.com
    zip
    Updated Jan 14, 2020
    Cite
    Chris (2020). The Price and Sales of Avocado [Dataset]. https://www.kaggle.com/alanluo418/avocado-prices-20152019
    Available download formats: zip (942369 bytes)
    Dataset updated
    Jan 14, 2020
    Authors
    Chris
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Context

    An update of Justin's dataset (https://www.kaggle.com/neuromusic/avocado-prices).

    From a BIG Fan of Avocado Toast

    Content

    This data was downloaded from the Hass Avocado Board website in January of 2020.

    Columns in the dataset:

     Date - The date of the observation
    
     AveragePrice - The Average Sales Price of Current Year
    
     Total Volume - Total Bulk and Bags Units
    
     4046 - Total number of avocados with PLU 4046 sold
    
     4225 - Total number of avocados with PLU 4225 sold
    
     4770 - Total number of avocados with PLU 4770 sold
    
     type - conventional or organic
    
     year - the current year
    
     region - the city or region of the observation
    

    Acknowledgements

    Thanks again to the Hass Avocado Board for sharing this data, and thanks to Justin for sharing the idea about avocados. http://www.hassavocadoboard.com/retail/volume-and-price-data

  6. U.S. fitness center/health club memberships 2000-2024

    • statista.com
    Updated Nov 26, 2025
    Cite
    Statista (2025). U.S. fitness center/health club memberships 2000-2024 [Dataset]. https://www.statista.com/statistics/236123/us-fitness-center-health-club-memberships/
    Dataset updated
    Nov 26, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States
    Description

    The number of members of fitness centers and health clubs within the United States has experienced a near continual increase between 2000 and 2024. In 2024, there were found to be around ** million members of fitness centers and health clubs within the U.S., the greatest number during the period of observation.

  7. Retail Precincts Business Location Data | 20,000+ APAC & Middle East...

    • datarade.ai
    Updated Nov 19, 2025
    + more versions
    Cite
    GapMaps (2025). Retail Precincts Business Location Data | 20,000+ APAC & Middle East Locations [Dataset]. https://datarade.ai/data-products/retail-precincts-business-location-data-20-000-apac-midd-gapmaps
    Available download formats: .csv, .pdf, .geojson
    Dataset updated
    Nov 19, 2025
    Dataset authored and provided by
    GapMaps
    Area covered
    Middle East, United Arab Emirates, Australia, India, Malaysia, Saudi Arabia, Vietnam, Philippines, Thailand, Singapore, New Zealand
    Description

    This dataset provides a complete and highly structured view of retail precincts across multiple regions, designed to support market analysis, location intelligence, retail expansion, and AI/ML modelling. It delivers information in multiple formats to accommodate a wide range of analytical, GIS, and business use cases, making it an essential resource for retail analysts, urban planners, investment teams, and data-driven decision-makers.

    Included Data Files & Formats

    Point File – All Precincts by Centroid (GeoJSON/Shapefile)

    • Each precinct is represented as a single point located at its geometric centroid.
    • Includes key attributes: id, precinct name.
    • Ideal for quick visualisation, clustering, and spatial reference when boundary shapes are not required.
    • Supports applications such as proximity analysis, mapping, and location-based AI/ML models.

    Polygon File – All Precincts by Polygon (GeoJSON/Shapefile)

    • Provides full precinct boundaries in polygon geometry for precise spatial representation.
    • Includes key attributes: id, precinct name.
    • Enables detailed GIS analysis, including area calculations, spatial overlays, and integration with mobility or demographic datasets.
    • Suitable for urban planning, retail network optimisation, trade area analysis, and catchment studies.
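
    The relationship between a precinct's polygon boundary, its area, and its geometric centroid can be illustrated with the standard shoelace formula. This is a minimal sketch assuming a simple (non-self-intersecting) polygon in planar coordinates; how the provider actually derives its centroids is not stated, and GeoJSON longitude/latitude data would normally be projected before a calculation like this. The function name and sample ring are hypothetical.

```python
def polygon_area_and_centroid(ring):
    """Return (area, (cx, cy)) for a simple polygon given as (x, y) vertices.

    The ring may be open or closed (first vertex repeated at the end, as in
    GeoJSON); a repeated closing vertex contributes nothing to the sums.
    """
    twice_area = 0.0
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0   # shoelace term for this edge
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon: zero area")
    centroid = (cx / (3.0 * twice_area), cy / (3.0 * twice_area))
    return abs(twice_area) / 2.0, centroid


# Made-up rectangular "precinct" in arbitrary planar units.
area, centroid = polygon_area_and_centroid([(0, 0), (4, 0), (4, 3), (0, 3)])
print(area, centroid)
```

    For production GIS work, a dedicated geometry library would handle projections, holes, and multi-part polygons, which this sketch ignores.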

    PDF – Precinct Reports (see attached sample)

    • Reports include comprehensive retail precinct insights across malls and high streets, showing retailer mix by category (F&B, Apparel, Fitness, Grocery, Health/Fitness, and more), catchment size, shopper origins, population, consuming class population, and precinct ranking, designed to provide insights on store expansion opportunities.
    • Supports qualitative assessments, market research, and executive reporting.

    Excel – Tabular Overview

    • Comprehensive spreadsheet with all precincts, including the following fields: ID, Precinct Name, Ranking, State, Country.
    • Enables straightforward filtering, sorting, and integration with other datasets.
    • Useful for high-level analysis, reporting, and as a reference table for GIS mapping or AI models.

  8. how Can Wellness technology company play it smart?

    • kaggle.com
    zip
    Updated Jul 29, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Aurelien Kuate Kamno (2024). how Can Wellness technology company play it smart? [Dataset]. https://www.kaggle.com/datasets/aurelienkuatekamno/how-can-wellness-technology-company-play-it-smart/versions/1
    Available download formats: zip (190187 bytes)
    Dataset updated
    Jul 29, 2024
    Authors
    Aurelien Kuate Kamno
    Description

    Description of the Dataset

    1. Dataset Overview

    Name: Wellness Technology Market Analysis Dataset
    Purpose: This dataset is designed to analyze various factors influencing the success of wellness technology companies. It aims to identify strategic opportunities and challenges in the wellness tech industry by evaluating market trends, customer behavior, and competitive dynamics.

    2. Key Attributes

    • Company ID: A unique identifier for each wellness technology company.
    • Company Name: The name of the company.
    • Product Categories: Types of wellness products offered (e.g., wearables, fitness apps, mental health platforms).
    • Market Share: Percentage of market share held by the company in different regions.
    • Revenue: Annual revenue generated by the company (numerical, in USD).
    • Customer Satisfaction Score: Average customer satisfaction ratings (numerical, e.g., 1 to 10 scale).
    • Investment Amount: Total investment received by the company (numerical, in USD).
    • Product Features: Key features of each product (categorical, e.g., heart rate monitoring, sleep tracking).
    • Competitive Position: Assessment of the company’s position relative to competitors (categorical, e.g., leader, challenger, niche).
    • Innovation Index: An index score representing the level of innovation in the company’s product offerings (numerical).
    • Marketing Spend: Annual expenditure on marketing and promotional activities (numerical, in USD).
    • User Demographics: Age, gender, and location of the users (categorical and numerical).

    3. Data Collection Method

    Sources: The data was collected from a combination of primary and secondary sources:

    Industry Reports: Data was sourced from market research reports and industry analysis published by organizations like Gartner, IDC, and Statista.

    Company Financial Statements: Financial information and market share data were obtained from public financial reports and investor relations sections of company websites.

    Customer Reviews and Ratings: Customer satisfaction scores and feedback were collected from review platforms such as Trustpilot, Google Reviews, and app store ratings.

    Surveys and Interviews: Direct surveys and interviews with industry experts, company executives, and customers were conducted to gather qualitative insights into product features and competitive positioning.

    Market Analysis Tools: Tools like Google Trends and social media analytics were used to assess market trends and consumer sentiment.

    Collection Tools and Techniques:

    • Web Scraping: Automated scripts were used to extract data from online reviews and financial websites.
    • APIs: Data was pulled from APIs provided by financial databases and market analysis tools.
    • Surveys: Surveys were administered using platforms like SurveyMonkey to gather direct feedback from stakeholders.

    Data Quality Assurance:

    • Data Cleaning: Involves handling missing values, correcting data inconsistencies, and ensuring accurate data entry.
    • Validation: Data was cross-verified with multiple sources to ensure reliability and accuracy.

    4. Dataset Size and Format

    Size: The dataset comprises data from [number of companies, e.g., 50] wellness technology companies and covers [number of records, e.g., 500] individual data points.
    Format: The data is stored in [format, e.g., Excel spreadsheets, SQL database] for ease of analysis and integration with analytical tools.

    5. Privacy and Compliance

    Data Privacy: All data collected is anonymized to ensure the privacy of individuals and companies.
    Compliance: The data collection process adheres to relevant data protection regulations such as GDPR and CCPA, ensuring proper consent and secure handling of data.

  9. Avocado Prices

    • kaggle.com
    zip
    Updated Jun 6, 2018
    + more versions
    Cite
    Justin Kiggins (2018). Avocado Prices [Dataset]. https://www.kaggle.com/neuromusic/avocado-prices
    Available download formats: zip (643781 bytes)
    Dataset updated
    Jun 6, 2018
    Authors
    Justin Kiggins
    License

    Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
    License information was derived automatically

    Description

    Context

    It is a well-known fact that Millennials LOVE Avocado Toast. It's also a well-known fact that all Millennials live in their parents' basements.

    Clearly, they aren't buying homes because they are buying too much Avocado Toast!

    But maybe there's hope... if a Millennial could find a city with cheap avocados, they could live out the Millennial American Dream.

    Content

    This data was downloaded from the Hass Avocado Board website in May of 2018 & compiled into a single CSV. Here's how the Hass Avocado Board describes the data on their website:

    The table below represents weekly 2018 retail scan data for National retail volume (units) and price. Retail scan data comes directly from retailers’ cash registers based on actual retail sales of Hass avocados. Starting in 2013, the table below reflects an expanded, multi-outlet retail data set. Multi-outlet reporting includes an aggregation of the following channels: grocery, mass, club, drug, dollar and military. The Average Price (of avocados) in the table reflects a per unit (per avocado) cost, even when multiple units (avocados) are sold in bags. The Product Lookup codes (PLU’s) in the table are only for Hass avocados. Other varieties of avocados (e.g. greenskins) are not included in this table.

    Some relevant columns in the dataset:

    • Date - The date of the observation
    • AveragePrice - the average price of a single avocado
    • type - conventional or organic
    • year - the year
    • Region - the city or region of the observation
    • Total Volume - Total number of avocados sold
    • 4046 - Total number of avocados with PLU 4046 sold
    • 4225 - Total number of avocados with PLU 4225 sold
    • 4770 - Total number of avocados with PLU 4770 sold
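
    As an illustration of how the columns above might be used, the following sketch ranks regions by their mean `AveragePrice`. It is a minimal example under stated assumptions: the sample rows are made up, the helper name is hypothetical, the region column is assumed to be named `region` (adjustable through `region_key`), and the mean is a simple unweighted average that ignores volume and avocado type.

```python
import csv
import io
from collections import defaultdict


def mean_price_by_region(rows, region_key="region"):
    """Return [(region, mean AveragePrice)] sorted from cheapest to priciest."""
    totals = defaultdict(lambda: [0.0, 0])  # region -> [price sum, row count]
    for row in rows:
        acc = totals[row[region_key]]
        acc[0] += float(row["AveragePrice"])
        acc[1] += 1
    means = {region: s / n for region, (s, n) in totals.items()}
    return sorted(means.items(), key=lambda item: item[1])


# Made-up rows for illustration; the real file would be opened with open(...).
SAMPLE = """Date,AveragePrice,type,region
2018-01-07,1.00,conventional,CityA
2018-01-14,1.50,conventional,CityA
2018-01-07,2.00,conventional,CityB
"""

ranking = mean_price_by_region(csv.DictReader(io.StringIO(SAMPLE)))
print(ranking)
```

    A more careful analysis would probably separate conventional from organic avocados and weight prices by `Total Volume`.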

    Acknowledgements

    Many thanks to the Hass Avocado Board for sharing this data!!

    http://www.hassavocadoboard.com/retail/volume-and-price-data

    Inspiration

    In which cities can Millennials have their avocado toast AND buy a home?

    Was the Avocadopocalypse of 2017 real?

  10. Football Players' Transfer Fee Prediction Dataset

    • kaggle.com
    zip
    Updated Nov 30, 2023
    Cite
    Khang Huynh Nguyen Trong (2023). Football Players' Transfer Fee Prediction Dataset [Dataset]. https://www.kaggle.com/khanghunhnguyntrng/football-players-transfer-fee-prediction-dataset
    Available download formats: zip (562377 bytes)
    Dataset updated
    Nov 30, 2023
    Authors
    Khang Huynh Nguyen Trong
    Description

    This dataset was assembled to build a predictive model for the transfer values of football players. We use player data to construct a model that predicts transfer fees. Player data includes basic information such as age, height, and playing position, as well as professional statistics such as goals and assists (in the 2021-2022 and 2022-2023 seasons), injuries, and total individual and team awards over a player's career.

    We gathered information on players competing in several top-tier global football leagues:

    11 European leagues, including the Premier League and Championship in England, Bundesliga in Germany, La Liga in Spain, Serie A in Italy, Ligue 1 in France, Eredivisie in the Netherlands, Liga NOS in Portugal, Premier Liga in Russia, Super Lig in Turkey, and Bundesliga in Austria.

    4 American leagues, including Brasileiro in Brazil, Major League Soccer in the United States, Primera División in Argentina, and Liga MX in Mexico.

    1 African league, namely the DStv Premiership in South Africa.

    4 Asian leagues, comprising J-League in Japan, Saudi Pro League in Saudi Arabia, K-League 1 in South Korea, and A-League in Australia.

  11. Data from: Bellabeat case study

    • kaggle.com
    zip
    Updated Sep 26, 2024
    Cite
    Aisha Shadad (2024). Bellabeat case study [Dataset]. https://www.kaggle.com/datasets/aishashadad9/bellabeat-case-study
    Available download formats: zip (552611 bytes)
    Dataset updated
    Sep 26, 2024
    Authors
    Aisha Shadad
    Description

    This dataset consists of anonymized data collected from Fitbit users, encompassing various metrics related to physical activity, sleep patterns, and heart rate. It serves as a foundational resource for analyzing user behavior and health insights relevant to Bellabeat’s mission.

    Dataset Details:

    Source: The dataset is derived from Fitbit users who voluntarily shared their data.
    Variables:
      Steps: Number of steps taken daily.
      Calories: Total calories burned throughout the day.
      Heart Rate: Average heart rate recorded during the day.
      Sleep Duration: Total hours of sleep tracked.
      Activity Levels: Categorization of daily activity levels (sedentary, lightly active, fairly active, very active).
      Date: Date of the recorded metrics.
    

    Size and Format:

    File Type: CSV (Comma-Separated Values)
    Number of Records: [Insert number of rows/records]
    Number of Variables: [Insert number of columns/variables]
    

    Purpose:

    The dataset is used to uncover trends in user health and activity, providing insights that can guide product development and marketing strategies for Bellabeat. By analyzing these metrics, the project aims to identify how users engage with fitness tracking and how Bellabeat can better support their wellness journeys.

  12. EU-wide transparency exercise results 2017 - Asset quality

    • data.europa.eu
    csv, excel xls
    Updated Dec 21, 2018
    Cite
    European Banking Authority (2018). EU-wide transparency exercise results 2017 - Asset quality [Dataset]. https://data.europa.eu/data/datasets/eu-wide-transparency-exercise-results-credit-risk?locale=es
    Explore at:
    Available download formats: excel xls, csv
    Dataset updated
    Dec 21, 2018
    Dataset authored and provided by
    European Banking Authority (http://eba.europa.eu/)
    License

    http://data.europa.eu/eli/dec/2011/833/oj

    Area covered
    European Union
    Description

    The 2017 EU-wide transparency exercise provides detailed bank-by-bank data on capital positions, risk exposure amounts, leverage exposures and asset quality for 132 banks across 25 countries of the European Union (EU) and the European Economic Area (EEA). The data, which is exclusively based on supervisory reporting, is published at the highest level of consolidation for the reference dates of 31 December 2016 and 30 June 2017.

    The EU-wide transparency exercise is published along with the Risk Assessment Report (RAR), which is based on the full EBA's reporting sample, made up of 186 banks, of which 36 EU foreign subsidiaries of other EU banks (sample as of June 2017). In order to allow users to reconcile Transparency data with respective figures for the EU/EEA in the RAR, as well as in the interactive tools, data is also disclosed for the bucket "All other banks", which includes the aggregated values for the banks, excluding subsidiaries of other EU banks, that are in the RAR sample but have not participated in the transparency exercise.

    The EBA has been conducting transparency exercises at the EU-wide level on an annual basis since 2011. The transparency exercise is part of the EBA's ongoing efforts to foster transparency and market discipline in the EU financial market, and complements banks' own Pillar 3 disclosures, as laid down in the EU's capital requirements directive (CRD). Along with the dataset, the EBA also provides a wide range of interactive tools that allow users to compare and to visualise data across time and at a country and a bank-by-bank level.

  13. Brewery CSV

    • kaggle.com
    zip
    Updated Mar 7, 2025
    Cite
    abcbbong (2025). Brewery CSV [Dataset]. https://www.kaggle.com/datasets/abcbbong/brewery-csv
    Explore at:
    Available download formats: zip (1137561506 bytes)
    Dataset updated
    Mar 7, 2025
    Authors
    abcbbong
    Description

    This dataset offers a detailed look into brewery operations, capturing the intricate process of beer production from fermentation to bottling. It includes comprehensive metrics on brewing parameters, quality scores, sales performance, and operational efficiency across multiple locations. Originally compiled for a data science Capstone Project, this dataset is ideal for enthusiasts and analysts interested in exploring the intersection of manufacturing, quality control, and market trends in the craft beer industry.

    Columns (each entry gives the description, data type, units, and notes):

    Batch_ID: Unique identifier for each brewing batch. Data type: Numeric/String. Units: -. Notes: Could be numeric or alphanumeric depending on the system's ID generation method.
    Brew_Date: Date and time when the brewing process started for the batch. Data type: Date/Time. Units: -. Notes: Format appears to be "MM/DD/YYYY HH:MM". May require consistent parsing if formats vary in the larger dataset. Time component seems to be granular down to minutes.
    Beer_Style: Style or type of beer being brewed (e.g., Wheat Beer, Sour, Ale, Stout, Lager, Pilsner, IPA, Porter). Data type: Text/Categorical. Units: -. Notes: Categorical values representing different beer styles.
    SKU: Stock Keeping Unit. Potentially a code representing the packaging type (e.g., Kegs, Cans, Pints, Bottles). Data type: Text/Categorical. Units: -. Notes: Seems to describe packaging form, similar to 'Form' in the previous data dictionary but potentially more product-centric.
    Location: Location associated with the brewing process, potentially the brewery location or intended sales region (e.g., Whitefield, Malleswaram, Rajajinagar, Marathahalli, Electronic City, Indiranagar, Koramangala). Data type: Text/Categorical. Units: -. Notes: Likely refers to brewery or distribution location. Needs context to understand if it represents production site or intended market.
    Fermentation_Time: Duration of the fermentation process. Data type: Numeric (Integer). Units: Hours. Notes: Integer values representing time in hours. This is the time the wort ferments.
    Temperature: Temperature during fermentation. Data type: Numeric (Float). Units: Degrees Celsius (°C). Notes: Float values likely in degrees Celsius. Crucial parameter for fermentation control.
    pH_Level: pH level of the brew during fermentation. Data type: Numeric (Float). Units: pH Units. Notes: Float values representing pH, a measure of acidity/alkalinity. Important for yeast activity and beer quality.
    Gravity: Specific Gravity of the wort before fermentation (Original Gravity - OG). Data type: Numeric (Float). Units: SG Units. Notes: Float values representing Specific Gravity, a measure of sugar concentration in the wort. Used to estimate potential alcohol content. Typically represented as values like 1.xxx.
    Alcohol_Content: Final alcohol content of the beer. Data type: Numeric (Float). Units: Percentage (%). Notes: Float values representing alcohol by volume (ABV) as a percentage.
    Bitterness: Perceived bitterness of the beer, measured in International Bitterness Units (IBUs). Data type: Numeric (Integer). Units: IBUs. Notes: Integer values representing International Bitterness Units. Higher IBU means more bitter beer.
    Color: Color of the beer, often measured on the Standard Reference Method (SRM) scale or similar color scale. Data type: Numeric (Integer). Units: SRM (or similar). Notes: Integer values representing beer color intensity. Higher value means darker beer. Scale might be SRM or EBC - need context to confirm, but SRM is common in US brewing.
    Ingredient_Ratio: Ratio of key ingredients used in the brew. Format appears to be "1:X:Y", possibly representing Malt:Hops:Yeast ratios or similar key ingredient proportions. Data type: Text/Categorical. Units: Ratio. Notes: Text-based ratio. Needs further decoding to understand what 'X' and 'Y' represent in the ratio. Common ratios in brewing might involve malt types, hop varieties, yeast strains, or water-to-grain ratios.
    Volume_Produced: Total volume of beer produced in this batch. Data type: Numeric (Integer). Units: Liters/Gallons. Notes: Integer values. Units are likely Liters or Gallons. Context needed to determine which volume unit is used. Could also be in Barrels (US or UK).
    Total_Sales: Total sales revenue generated from this batch of beer. Data type: Numeric (Float). Units: Currency Units. Notes: Float values representing revenue. Currency units will depend on the context (e.g., USD, EUR, INR).
    Quality_Score: Overall quality score of the beer batch, possibly based on sensory evaluation, lab tests, or a combination. Data type: Numeric (Float). Units: Score/Points. Notes: Float values representing a quality score. Scale and meaning of the score (higher is better? range?) needs to be defined by the quality assessment process used.
    Brewhouse_Efficiency: Efficiency of the brewhouse operation, indicating how effectively sugars are extracted from grains during mashing and lautering. Data type: Numeric (Float). Units: Percentage (%). Notes: Float values in percentage. Higher efficiency is generally better, indicating less sugar loss during the mashing process.
    Loss_During_Brewing: Percentage of volume lost during the brewing process (pre-fermentation). Data type: Numeric (Float). Units: Percentage (%). Notes: Float values in percentage. Represents losses during wort production - e.g., evapora...

  14. Fruits Classification 🍇

    • kaggle.com
    zip
    Updated Apr 9, 2023
    Cite
    DeepNets (2023). Fruits Classification 🍇 [Dataset]. https://www.kaggle.com/datasets/utkarshsaxenadn/fruits-classification/suggestions
    Explore at:
    Available download formats: zip (88954615 bytes)
    Dataset updated
    Apr 9, 2023
    Authors
    DeepNets
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The fruit classification dataset is a collection of images of various fruits, intended for training and testing computer vision models. The dataset includes five different types of fruit:

    • Apples
    • Bananas
    • Grapes
    • Mangoes
    • Strawberries

    Each class contains 2000 images, resulting in a total of 10,000 images in the dataset.

    The images in the dataset are of various shapes, sizes, and colors, and have been captured under different lighting conditions. The dataset is useful for training and testing models that perform tasks such as object detection, image classification, and segmentation.

    The dataset can be used for various research projects, such as developing and testing new image classification algorithms, and for benchmarking existing algorithms. The dataset can also be used to train machine learning models that can be used in real-world applications, such as in the agricultural industry for fruit grading and sorting.

    Overall, the fruit classification dataset is a valuable resource for researchers and developers working in the field of computer vision, and its availability will help advance the development of new algorithms and technologies for image analysis and classification.

    Data Structure

    The data is split into three sets: training, validation, and testing. The training set is used to train the model, while the validation set is used to evaluate the model's performance during training and make adjustments as necessary. The testing set is used to evaluate the final performance of the model after training is complete.

    The dataset is split based on a ratio of 97% for training, 2% for validation, and 1% for testing. This means that the training set contains 9700 images (97% of the total), the validation set contains 200 images (2% of the total), and the testing set contains 100 images (1% of the total).

    Each class in the dataset is split into three sets based on the ratio. For example, for the "Apple" class, 97% (1940 images) are used for training, 2% (40 images) are used for validation, and 1% (20 images) are used for testing. This ensures that the distribution of classes is consistent across all three sets and that the model is trained on a representative sample of all classes.

    Overall, the split of the dataset into training, validation, and testing sets ensures that the model is robust and generalizes well to new, unseen data.

    Python Script

    The script provided creates train, validation, and test sets from a fruit image dataset by splitting the dataset into predetermined ratios, shuffling the images, and moving them to their respective directories.
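
    The script itself is not reproduced on this page. As a rough sketch of the per-class 97/2/1 split described above, the following Python computes the three file lists for each class. The file names, the `split_class` helper, and the fixed seed are assumptions made for illustration, and the sketch only builds the lists rather than moving files into directories.

```python
import random

def split_class(files, ratios=(0.97, 0.02, 0.01), seed=0):
    """Shuffle one class's file names and cut them into train/val/test lists."""
    files = sorted(files)                      # stable order before shuffling
    random.Random(seed).shuffle(files)         # seeded so the split is repeatable
    n_train = round(len(files) * ratios[0])
    n_val = round(len(files) * ratios[1])
    train = files[:n_train]
    val = files[n_train:n_train + n_val]
    test = files[n_train + n_val:]             # whatever remains goes to test
    return train, val, test

# Hypothetical file names: 2000 images for each of the five classes.
classes = ["Apple", "Banana", "Grape", "Mango", "Strawberry"]
images = {c: [f"{c}_{i}.jpg" for i in range(2000)] for c in classes}

splits = {c: split_class(files) for c, files in images.items()}

# Total images per split across all classes: train, validation, test.
totals = [sum(len(splits[c][k]) for c in classes) for k in range(3)]
print(totals)
```

    Splitting each class separately, rather than the pooled image list, is what keeps the class distribution identical across the three sets.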

  15. Zomato Dataset

    • kaggle.com
    zip
    Updated Feb 17, 2023
    Cite
    UTTAM KUMAR (2023). Zomato Dataset [Dataset]. https://www.kaggle.com/datasets/uttamkumar15802/zomato-dataset
    Explore at:
    Available download formats: zip (3942644 bytes)
    Dataset updated
    Feb 17, 2023
    Authors
    UTTAM KUMAR
    Description

    res_id: A unique identifier for each restaurant in the dataset.
    name: The name of the restaurant.
    type: The type of restaurant, such as Cafe, Bar, or Fine Dining.
    address: The street address of the restaurant.
    city: The city where the restaurant is located.
    locality: The neighborhood or locality where the restaurant is located.
    latitude: The latitude coordinate of the restaurant's location.
    longitude: The longitude coordinate of the restaurant's location.
    cuisines: The types of cuisine served at the restaurant.
    timings: The opening and closing times of the restaurant.
    average_cost_for_two: The average cost for two people to dine at the restaurant.
    price_range: The price range of the restaurant, such as Inexpensive, Moderate, or Expensive.
    highlights: The features of the restaurant, such as Takeaway, Outdoor Seating, or Live Music.
    aggregate_rating: The overall rating of the restaurant based on customer reviews.
    votes: The total number of customer reviews for the restaurant.
    photo_count: The number of photos uploaded by customers for the restaurant.
    opentable_support: Whether the restaurant supports online reservations through OpenTable.
    delivery: Whether the restaurant offers delivery.
    state: The state or province where the restaurant is located.
    area: The area within the city where the restaurant is located.

  16. Azerbaijan Premier League 2023-2024

    • kaggle.com
    zip
    Updated Sep 13, 2024
    Cite
    Fuad Ibrahimli (2024). Azerbaijan Premier League 2023-2024 [Dataset]. https://www.kaggle.com/fuadibrahimli/azerbaijan-premier-league-2023-2024
    Explore at:
    Available download formats: zip (782 bytes)
    Dataset updated
    Sep 13, 2024
    Authors
    Fuad Ibrahimli
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Area covered
    Azerbaijan
    Description

    This dataset contains detailed information about the Azerbaijan Premier League for the 2023-2024 season.

    Azerbaijan Premier League 2023-2024 Dataset Description:

    Team: The name of the football club participating in the league.
    Matches Played: Total number of matches each team has played during the season.
    Win: Number of matches won by the team.
    Draw: Number of matches that ended in a draw.
    Lose: Number of matches lost by the team.
    Goal For: Total goals scored by the team.
    Goals Against: Total goals conceded by the team.
    Goal Difference: The difference between goals scored and goals conceded (Goal For - Goals Against).
    Points: Total points accumulated by the team during the season (3 points for a win, 1 point for a draw).
    Squad: The size of the squad (number of players).
    Age: The average age of the players in the squad.
    Foreigners: The number of foreign players in the squad.
    Total Market Value (€): The estimated total market value of the team, expressed in euros.
    Stadium Name: The name of the home stadium for the team.
    Stadium Capacity: The total seating capacity of the stadium.

    This dataset provides a comprehensive overview of team performances, squad composition, and stadium information for the 2023-2024 season of the Azerbaijan Premier League.
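
    The derived columns can be recomputed from the raw counts as a consistency check. The sketch below follows the two formulas stated in the description (Goal Difference and Points); treating Matches Played as the sum of wins, draws, and losses is an assumption, and the team name and figures are invented for illustration.

```python
def standings_row(team, win, draw, lose, goals_for, goals_against):
    """Recompute the derived standings columns from the raw counts."""
    return {
        "Team": team,
        # Assumption: every match played ends as a win, draw, or loss.
        "Matches Played": win + draw + lose,
        # Goal Difference = Goal For - Goals Against (as stated in the description).
        "Goal Difference": goals_for - goals_against,
        # 3 points for a win, 1 point for a draw.
        "Points": 3 * win + draw,
    }

# Invented figures for a hypothetical club, not taken from the dataset.
row = standings_row("Example FC", win=20, draw=10, lose=6,
                    goals_for=60, goals_against=30)
print(row)
```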

  17. FIFA 23 Fut Players Dataset

    • kaggle.com
    zip
    Updated May 26, 2025
    Cite
    Mohammad Essam (2025). FIFA 23 Fut Players Dataset [Dataset]. https://www.kaggle.com/datasets/mohammedessam97/fifa-23-fut-players-dataset
    Explore at:
    Available download formats: zip (736121 bytes)
    Dataset updated
    May 26, 2025
    Authors
    Mohammad Essam
    Description

    Context

    This dataset includes player data for the Ultimate Team mode in FIFA 23.

    Columns Description

    Here is a description of some columns in the dataset:

    Ratings: Player rating.
    Position: Player position.
    Version: Card version.
    PS: Price on PlayStation; if 0, the card is not available on the market.
    SKI: Skills rating of the player (from 0 to 5).
    WF: Weak foot skills (from 0 to 5).
    WR: Work rate of the player on the field, given in the form (Attack work rate / Defence work rate); each value can be low, medium, or high.
    PAC: Player pace (speed).
    SHO: Player shooting power.
    PAS: Player passing.
    DRI: Player dribbling.
    DEF: Player defence.
    PHY: Player physicality.
    Body: Player height given in cm and feet, followed by the player's body type (some players have a custom body in the game).
    Popularity: How popular the player is among users.
    BS: Base stats.
    IGS: In-game stats.

  18. Wyscout Player Profile Data

    • kaggle.com
    zip
    Updated Dec 29, 2024
    Cite
    Pastor Soto (2024). Wyscout Player Profile Data [Dataset]. https://www.kaggle.com/datasets/pastorsoto/wyscout-players
    Explore at:
    Available download formats: zip (189846 bytes)
    Dataset updated
    Dec 29, 2024
    Authors
    Pastor Soto
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Source: This dataset contains player profile information originally sourced from Wyscout.

    Content: The dataset includes biographical and physical details for football players. Key information includes:

    • Unique Wyscout player identifier (wyId)
    • Player names (firstName, lastName, shortName)
    • Demographics: birthDate, passportArea (including name, ID, alpha3code), birthArea
    • Physical attributes: height (cm), weight (kg), preferred foot
    • Team information: currentTeamId, currentNationalTeamId
    • Playing role: role (including code2, code3, and full name like 'Goalkeeper')

    Structure: The data is provided in JSON. Note that some fields (passportArea, birthArea, role) contain nested JSON-like structures providing more detailed information.
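
    Nested fields like these usually need flattening before they fit a tabular tool. The sketch below shows one way to do that with the Python standard library, using dotted column names. The `flatten` helper and the sample record are hypothetical: the field names follow the description above, but the values are invented and the real file layout may differ.

```python
import json

def flatten(record, parent="", sep="."):
    """Flatten nested dictionaries into one level with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested objects such as passportArea or role.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat

# Hypothetical record; values are placeholders, not real Wyscout data.
raw = """
{
  "wyId": 1,
  "shortName": "A. Player",
  "role": {"code2": "GK", "code3": "GKP", "name": "Goalkeeper"},
  "passportArea": {"name": "Exampleland", "id": 0, "alpha3code": "XXX"}
}
"""

player = flatten(json.loads(raw))
print(sorted(player))
```

    Each nested field becomes a set of flat columns (for example `role.name`), which makes it easier to group players by role or to merge with other tables on `wyId`.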

    Potential Use Cases:

    • Demographic analysis of player populations
    • Exploratory data analysis on player attributes (height, weight, age) by position
    • Foundation for merging with other football datasets (e.g., match events, market values)
    • Educational purposes for data analysis and visualization

  19. Zomato Restaurants Dataset

    • kaggle.com
    zip
    Updated Jul 23, 2023
    Cite
    Bharath Devanaboina (2023). Zomato Restaurants Dataset [Dataset]. https://www.kaggle.com/datasets/bharathdevanaboina/zomato-restaurants-dataset
    Explore at:
    Available download formats: zip (1143744 bytes)
    Dataset updated
    Jul 23, 2023
    Authors
    Bharath Devanaboina
    Description

    Kaggle Dataset: Zomato Restaurants Data

    Overview: This Kaggle dataset contains information about 850 restaurants listed on the popular online food delivery and restaurant discovery platform, Zomato. The dataset provides a comprehensive collection of restaurant details, including their names, locations, ratings, cuisines, pricing, and more. This data is valuable for data analysis, market research, and gaining insights into the restaurant landscape in various cities.

    Data Description: The dataset comprises the following columns:

    1. Restaurant ID: A unique identifier for each restaurant.

    2. Restaurant Name: The name of the restaurant.

    3. Cuisines: The cuisines or types of food served by the restaurant.

    4. Average Cost for Two: The average cost for two people to dine at the restaurant.

    5. Currency: The currency used for pricing (e.g., USD, INR, etc.).

    6. Has Online Delivery: Indicates whether the restaurant offers online delivery (Yes/No).

    7. Has Table Booking: Indicates whether the restaurant allows table booking (Yes/No).

    8. Is Delivering Now: Indicates whether the restaurant is currently providing delivery services (Yes/No).

    9. Aggregate Rating: The overall rating of the restaurant, which is an average of all customer ratings.

    10. Rating Color: A color code representing the restaurant's rating (e.g., Green, Yellow, Orange).

    11. Rating Text: A text description of the restaurant's rating (e.g., Excellent, Good, Average).

    12. Votes: The total number of customer votes or ratings received by the restaurant.

    13. Country Code: The country code where the restaurant is located.

    14. City: The city in which the restaurant is situated.

    15. Locality: The specific locality or area of the city where the restaurant is located.

    16. Address: The complete address of the restaurant.

    Potential Use-Cases:

    • Restaurant Analysis: Researchers and analysts can study various aspects of restaurants, such as their cuisines, ratings, and pricing, to identify patterns and trends.

    • Location-Based Insights: The dataset allows for geographic analysis, providing insights into popular localities and cities with a higher concentration of restaurants.

    • Market Research: Food and hospitality businesses can use the dataset to understand customer preferences, competitor analysis, and market opportunities.

    • Predictive Models: The data can be used to build predictive models for factors like customer ratings or average cost based on cuisines and other attributes.

    • Data Visualization: Data enthusiasts can create engaging visualizations to showcase restaurant distributions, rating trends, and more.

    Acknowledgements: The dataset was sourced from Zomato, a popular restaurant search and discovery service. The data was made available on Kaggle for educational and research purposes.

    Disclaimer: The data is as of a specific date and may not reflect real-time information or changes made to restaurants after the data collection. Users should verify the accuracy of the data before using it for critical purposes.

  20. Pistachio Species Classification

    • kaggle.com
    zip
    Updated May 20, 2023
    Cite
    Gaurav Dutta (2023). Pistachio Species Classification [Dataset]. https://www.kaggle.com/datasets/gauravduttakiit/pistachio-species-classification
    Explore at:
    Available download formats: zip (46130452 bytes)
    Dataset updated
    May 20, 2023
    Authors
    Gaurav Dutta
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    DATASET: https://www.muratkoklu.com/datasets/ CV:https://www.muratkoklu.com/en/publications/

    Pistachio Image Dataset Citation Request :

    OZKAN IA., KOKLU M. and SARACOGLU R. (2021). Classification of Pistachio Species Using Improved K-NN Classifier. Progress in Nutrition, Vol. 23, N. 2, pp. DOI:10.23751/pn.v23i2.9686. (Open Access) https://www.mattioli1885journals.com/index.php/progressinnutrition/article/view/9686/9178

    SINGH D, TASPINAR YS, KURSUN R, CINAR I, KOKLU M, OZKAN IA, LEE H-N., (2022). Classification and Analysis of Pistachio Species with Pre-Trained Deep Learning Models, Electronics, 11 (7), 981. https://doi.org/10.3390/electronics11070981. (Open Access)

    Article Download (PDF): 1: https://www.mattioli1885journals.com/index.php/progressinnutrition/article/view/9686/9178 2: https://doi.org/10.3390/electronics11070981

    ABSTRACT: To preserve the economic value of pistachio nuts, which have an important place in the agricultural economy, the efficiency of post-harvest industrial processes is very important. To provide this efficiency, new methods and technologies are needed for the separation and classification of pistachios. Different pistachio species address different markets, which increases the need for the classification of pistachio species. This study aims to develop a classification model, different from traditional separation methods, based on image processing and artificial intelligence, which are capable of providing the required classification. A computer vision system has been developed to distinguish two different species of pistachios with different characteristics that address different market types. 2148 sample images for these two kinds of pistachios were taken with a high-resolution camera. Image processing techniques, segmentation, and feature extraction were applied to the obtained images of the pistachio samples. A pistachio dataset that has sixteen attributes was created. An advanced classifier based on the k-NN method, which is a simple and successful classifier, and principal component analysis was designed on the obtained dataset. In this study, a multi-level system including feature extraction, dimension reduction, and dimension weighting stages has been proposed. Experimental results showed that the proposed approach achieved a classification success of 94.18%. The presented high-performance classification model meets an important need for the separation of pistachio species and increases the economic value of species. In addition, the developed model is important in terms of its application to similar studies.

Cite
Suraj (2023). Dairy Goods Sales Dataset [Dataset]. https://www.kaggle.com/datasets/suraj520/dairy-goods-sales-dataset
Dairy Goods Sales Dataset

A comprehensive dataset on dairy farms, products, sales, and inventory tracking

Explore at:
5 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip (232961 bytes)
Dataset updated
Jun 6, 2023
Authors
Suraj
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

The Dairy Goods Sales Dataset provides a detailed and comprehensive collection of data related to dairy farms, dairy products, sales, and inventory management. This dataset encompasses a wide range of information, including farm location, land area, cow population, farm size, production dates, product details, brand information, quantities, pricing, shelf life, storage conditions, expiration dates, sales information, customer locations, sales channels, stock quantities, stock thresholds, and reorder quantities.

Features:

  1. Location: The geographical location of the dairy farm.
  2. Total Land Area (acres): The total land area occupied by the dairy farm.
  3. Number of Cows: The number of cows present in the dairy farm.
  4. Farm Size: The size of the dairy farm (in sq. km).
  5. Date: The date of data recording.
  6. Product ID: The unique identifier for each dairy product.
  7. Product Name: The name of the dairy product.
  8. Brand: The brand associated with the dairy product.
  9. Quantity (liters/kg): The quantity of the dairy product available.
  10. Price per Unit: The price per unit of the dairy product.
  11. Total Value: The total value of the available quantity of the dairy product.
  12. Shelf Life (days): The shelf life of the dairy product in days.
  13. Storage Condition: The recommended storage condition for the dairy product.
  14. Production Date: The date of production for the dairy product.
  15. Expiration Date: The date of expiration for the dairy product.
  16. Quantity Sold (liters/kg): The quantity of the dairy product sold.
  17. Price per Unit (sold): The price per unit at which the dairy product was sold.
  18. Approx. Total Revenue (INR): The approximate total revenue generated from the sale of the dairy product.
  19. Customer Location: The location of the customer who purchased the dairy product.
  20. Sales Channel: The channel through which the dairy product was sold (Retail, Wholesale, Online).
  21. Quantity in Stock (liters/kg): The quantity of the dairy product remaining in stock.
  22. Minimum Stock Threshold (liters/kg): The minimum stock threshold for the dairy product.
  23. Reorder Quantity (liters/kg): The recommended quantity to reorder for the dairy product.
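
The inventory fields above lend themselves to a simple stock check. The sketch below is illustrative only: it assumes that Total Value equals Quantity multiplied by Price per Unit, and that a product should be reordered when Quantity in Stock falls below Minimum Stock Threshold. Neither rule is confirmed by the dataset description, and the sample record is invented.

```python
def total_value(quantity, price_per_unit):
    """Assumed relationship: Total Value = Quantity x Price per Unit."""
    return quantity * price_per_unit

def reorder_amount(quantity_in_stock, minimum_stock_threshold, reorder_quantity):
    """Return how much to reorder under the assumed below-threshold rule."""
    if quantity_in_stock < minimum_stock_threshold:
        return reorder_quantity
    return 0

# Invented record using the column names from the feature list.
record = {
    "Product Name": "Milk",
    "Quantity (liters/kg)": 200,
    "Price per Unit": 50,
    "Quantity in Stock (liters/kg)": 30,
    "Minimum Stock Threshold (liters/kg)": 50,
    "Reorder Quantity (liters/kg)": 100,
}

value = total_value(record["Quantity (liters/kg)"], record["Price per Unit"])
to_order = reorder_amount(
    record["Quantity in Stock (liters/kg)"],
    record["Minimum Stock Threshold (liters/kg)"],
    record["Reorder Quantity (liters/kg)"],
)
print(value, to_order)
```

Comparing a recomputed value against the stored Total Value column is a quick way to test whether the assumed formula actually holds for the data.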

Potential Use-Case:

This dataset can be used by researchers, analysts, and businesses in the dairy industry for various purposes, such as:

  1. Analyzing the performance of dairy farms based on location, land area, and cow population.
  2. Understanding the sales and distribution patterns of different dairy products across various brands and regions.
  3. Studying the impact of storage conditions and shelf life on the quality and availability of dairy products.
  4. Analyzing customer preferences and buying behavior based on location and sales channels.
  5. Optimizing inventory management by tracking stock quantities, minimum thresholds, and reorder quantities.
  6. Conducting market research and trend analysis in the dairy industry.
  7. Developing predictive models for demand forecasting and pricing strategies.

Note: This dataset covers the period from 2019 to 2022 and focuses on selected dairy brands operating in specific states and union territories of India. The figures in the dataset contain an intentional drift, which the author attributes to its open-source and Creative Commons licensing.
