22 datasets found
  1. Retail Transactions Dataset

    • kaggle.com
    Updated May 18, 2024
    Cite
    Prasad Patil (2024). Retail Transactions Dataset [Dataset]. https://www.kaggle.com/datasets/prasad22/retail-transactions-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 18, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Prasad Patil
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset was created to simulate a market basket dataset, providing insights into customer purchasing behavior and store operations. The dataset facilitates market basket analysis, customer segmentation, and other retail analytics tasks. Here's more information about the context and inspiration behind this dataset:

    Context:

    Retail businesses, from supermarkets to convenience stores, are constantly seeking ways to better understand their customers and improve their operations. Market basket analysis, a technique used in retail analytics, explores customer purchase patterns to uncover associations between products, identify trends, and optimize pricing and promotions. Customer segmentation allows businesses to tailor their offerings to specific groups, enhancing the customer experience.

    Inspiration:

    The inspiration for this dataset comes from the need for accessible and customizable market basket datasets. While real-world retail data is sensitive and often restricted, synthetic datasets offer a safe and versatile alternative. Researchers, data scientists, and analysts can use this dataset to develop and test algorithms, models, and analytical tools.

    Dataset Information:

    The columns provide information about the transactions, customers, products, and purchasing behavior, making the dataset suitable for various analyses, including market basket analysis and customer segmentation. Here's a brief explanation of each column in the Dataset:

    • Transaction_ID: A unique identifier for each transaction, represented as a 10-digit number. This column is used to uniquely identify each purchase.
    • Date: The date and time when the transaction occurred. It records the timestamp of each purchase.
    • Customer_Name: The name of the customer who made the purchase. It provides information about the customer's identity.
    • Product: A list of products purchased in the transaction. It includes the names of the products bought.
    • Total_Items: The total number of items purchased in the transaction. It represents the quantity of products bought.
    • Total_Cost: The total cost of the purchase, in currency. It represents the financial value of the transaction.
    • Payment_Method: The method used for payment in the transaction, such as credit card, debit card, cash, or mobile payment.
    • City: The city where the purchase took place. It indicates the location of the transaction.
    • Store_Type: The type of store where the purchase was made, such as a supermarket, convenience store, department store, etc.
    • Discount_Applied: A binary indicator (True/False) representing whether a discount was applied to the transaction.
    • Customer_Category: A category representing the customer's background or age group.
    • Season: The season in which the purchase occurred, such as spring, summer, fall, or winter.
    • Promotion: The type of promotion applied to the transaction, such as "None," "BOGO (Buy One Get One)," or "Discount on Selected Items."

    Use Cases:

    • Market Basket Analysis: Discover associations between products and uncover buying patterns.
    • Customer Segmentation: Group customers based on purchasing behavior.
    • Pricing Optimization: Optimize pricing strategies and identify opportunities for discounts and promotions.
    • Retail Analytics: Analyze store performance and customer trends.

    Note: This dataset is entirely synthetic and was generated using the Python Faker library, which means it doesn't contain real customer data. It's designed for educational and research purposes.
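    The market basket use case described above can be sketched in a few lines. This is a minimal illustration, not the dataset author's method: the sample rows are invented, and it assumes the Product column stores each basket as a Python-style list literal in a string, which should be checked against the actual file.

```python
import ast
from collections import Counter
from itertools import combinations

# Invented sample rows. Assumption: Product holds a list literal stored as a string.
rows = [
    {"Transaction_ID": "1000000001", "Product": "['Milk', 'Bread', 'Eggs']"},
    {"Transaction_ID": "1000000002", "Product": "['Bread', 'Milk']"},
    {"Transaction_ID": "1000000003", "Product": "['Eggs', 'Soap']"},
]


def pair_counts(rows):
    """Count how often each unordered product pair shares a transaction."""
    counts = Counter()
    for row in rows:
        # Deduplicate and sort so ('Bread', 'Milk') and ('Milk', 'Bread') coincide.
        items = sorted(set(ast.literal_eval(row["Product"])))
        counts.update(combinations(items, 2))
    return counts


def support(counts, n_transactions):
    """Share of all transactions that contain each pair."""
    return {pair: c / n_transactions for pair, c in counts.items()}


counts = pair_counts(rows)
pair_support = support(counts, len(rows))
```

    Pair support is only the first step of market basket analysis; association rules would additionally need confidence and lift.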

  2. Retail Sales Index (RSI) - Datasets - Government of the Republic of Trinidad...

    • data.gov.tt
    Updated Sep 28, 2023
    Cite
    data.gov.tt (2023). Retail Sales Index (RSI) - Datasets - Government of the Republic of Trinidad and Tobago Open Data Platform [Dataset]. https://data.gov.tt/dataset/retail-sales-index
    Explore at:
    Dataset updated
    Sep 28, 2023
    Dataset provided by
    Data.gov (https://data.gov/)
    Area covered
    Trinidad and Tobago
    Description

    The Retail Sales Index (RSI) is like a health check-up for the shopping world, done every three (3) months. Imagine visiting many different stores, from big to small, and noting how much they are selling. That is what the RSI does. It adds up the sales from these stores to get a feel for how well retail businesses are doing. This index helps us understand if people spend more or less at shops, which is a big deal for the economy. Think of it as a way to gauge our shopping habits. Plus, by comparing it with the Retail Price Index (RPI), which tracks price changes, we can see not only how much we are spending but also how much stuff we are actually buying once price changes are taken into account.
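    The comparison between a sales index and a price index can be illustrated with a standard deflation step. The sketch below is a generic textbook calculation, not the published methodology for these indices; the function name and the index values are invented for illustration, and both indices are assumed to share the same base period.

```python
def volume_index(value_index, price_index, base=100.0):
    """Deflate a sales value index by a price index.

    Assumes both indices use the same base period (base = 100).
    The result approximates the change in quantity bought.
    """
    return value_index / price_index * base


# Invented figures: spending is up 10% while prices are up 5%.
real_sales = volume_index(110.0, 105.0)
print(round(real_sales, 2))  # about 104.76, so volume rose by roughly 4.8%
```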

  3. US Retail Sales

    • tradingeconomics.com
    • zh.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Updated Jun 17, 2025
    Cite
    TRADING ECONOMICS (2025). US Retail Sales [Dataset]. https://tradingeconomics.com/united-states/retail-sales
    Explore at:
    Available download formats: csv, xml, excel, json
    Dataset updated
    Jun 17, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Feb 29, 1992 - May 31, 2025
    Area covered
    United States
    Description

    Retail Sales in the United States decreased 0.90 percent in May of 2025 over the previous month. This dataset provides - U.S. December Retail Sales Increased More Than Forecast - actual values, historical data, forecast, chart, statistics, economic calendar and news.

  4. sales data

    • kaggle.com
    Updated Aug 2, 2023
    Cite
    Ronny Kimathi kaimenyi (2023). sales data [Dataset]. https://www.kaggle.com/datasets/ronnykym/online-store-sales-data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 2, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ronny Kimathi kaimenyi
    License

    https://ec.europa.eu/info/legal-notice_en

    Description

    Deluxe is an online retailer based in the UK that deals in a wide range of products in the following categories:

    1. Clothing
    2. Games
    3. Appliances
    4. Electronics
    5. Books
    6. Beauty products
    7. Smartphones
    8. Outdoors products
    9. Accessories
    10. Other

    Basic household products are classified as 'Other' in the category column since they have small value to the business.

    Data Description:

    • dates: sale date
    • order_value_EUR: sale price in EUR
    • cost: cost of goods sold in EUR
    • category: item category
    • country: customers' country at the time of purchase
    • customer_name: name of customer
    • device_type: the gadget used by the customer to access our online store (PC, mobile, tablet)
    • sales_manager: name of the sales manager for each sale
    • sales_representative: name of the sales rep for each sale
    • order_id: unique identifier of an order

    The data was recorded for the period between 1/2/2019 and 12/30/2020 with the aim of generating business insights to guide business direction. We would like to see what interesting insights the Kaggle community members can produce from this data.
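    One simple insight the columns above support is gross margin per category. The sketch below is only an illustration under stated assumptions: the three orders are invented, and margin is taken to be (order_value_EUR - cost) / order_value_EUR, which the dataset description does not itself define.

```python
from collections import defaultdict

# Invented sample orders using the column names from the data description.
orders = [
    {"category": "Books", "order_value_EUR": 100.0, "cost": 80.0},
    {"category": "Books", "order_value_EUR": 50.0, "cost": 35.0},
    {"category": "Games", "order_value_EUR": 200.0, "cost": 150.0},
]


def margin_by_category(orders):
    """Gross margin per category: (revenue - cost) / revenue."""
    revenue = defaultdict(float)
    cost = defaultdict(float)
    for order in orders:
        revenue[order["category"]] += order["order_value_EUR"]
        cost[order["category"]] += order["cost"]
    return {cat: (revenue[cat] - cost[cat]) / revenue[cat] for cat in revenue}


margins = margin_by_category(orders)
```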

  5. Retail Price Index (RPI) - Datasets - Government of the Republic of Trinidad...

    • data.gov.tt
    Updated Nov 21, 2023
    Cite
    (2023). Retail Price Index (RPI) - Datasets - Government of the Republic of Trinidad and Tobago Open Data Platform [Dataset]. https://data.gov.tt/dataset/retail-price-index
    Explore at:
    Dataset updated
    Nov 21, 2023
    Description

    The Retail Price Index (RPI) is a tool that helps us understand how the prices of everyday items change over time in Trinidad and Tobago. Imagine you have a shopping basket filled with various items people commonly buy, like food, gas, and other services. The RPI keeps track of how the prices of these items in the basket change each month. To do this, experts regularly check the prices of these items in fifteen (15) different areas across Trinidad and Tobago. They visit local stores, markets, and gas stations to note the current prices of food and gas, which tend to change often. For items whose prices do not change as quickly, they check the prices every three (3) months. This way, the RPI gives a clear picture of how much more or less it costs to buy the same set of items over time.
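    The fixed shopping basket idea can be made concrete with a small calculation. This is a generic fixed-basket (Laspeyres-style) sketch, not the official RPI methodology; the items, quantities, and prices are invented for illustration.

```python
def basket_cost(prices, quantities):
    """Cost of buying the fixed basket at the given prices."""
    return sum(prices[item] * qty for item, qty in quantities.items())


def fixed_basket_index(base_prices, current_prices, quantities, base=100.0):
    """Cost of the same basket today relative to the base period."""
    return basket_cost(current_prices, quantities) / basket_cost(base_prices, quantities) * base


# Invented basket: quantities stay fixed, only prices move.
quantities = {"bread": 10, "gas": 20, "bus_fare": 5}
base_prices = {"bread": 2.0, "gas": 1.5, "bus_fare": 4.0}
current_prices = {"bread": 2.2, "gas": 1.5, "bus_fare": 4.4}

index = fixed_basket_index(base_prices, current_prices, quantities)
print(round(index, 1))  # about 105.7: the same basket costs roughly 5.7% more
```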

  6. Person Counter Dataset

    • universe.roboflow.com
    zip
    Updated Jun 15, 2023
    Cite
    Tkbees (2023). Person Counter Dataset [Dataset]. https://universe.roboflow.com/tkbees-ogrtd/person-counter-tq0wf
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 15, 2023
    Dataset authored and provided by
    Tkbees
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Person Bounding Boxes
    Description

    Here are a few use cases for this project:

    1. Retail Analytics: Store owners can use the model to track the number of customers visiting their stores during different times of the day or seasons, which can help in workforce and resource allocation.

    2. Crowd Management: Event organizers or public authorities can utilize the model to monitor crowd sizes at concerts, festivals, public gatherings or protests, aiding in security and emergency planning.

    3. Smart Transportation: The model can be integrated into public transit systems to count the number of passengers in buses or trains, providing real-time occupancy information and assisting in transportation planning.

    4. Health and Safety Compliance: During times of pandemics or emergencies, the model can be used to count the number of people in a location, ensuring compliance with restrictions on gathering sizes.

    5. Building Security: The model can be adopted in security systems to track how many people enter and leave a building or a particular area, providing useful data for access control.

  7. Data from: Retail Sales Index

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Jun 20, 2025
    + more versions
    Cite
    Office for National Statistics (2025). Retail Sales Index [Dataset]. https://www.ons.gov.uk/businessindustryandtrade/retailindustry/datasets/retailsalesindexreferencetables
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 20, 2025
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0, http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    A series of retail sales data for Great Britain in value and volume terms, seasonally and non-seasonally adjusted.

  8. Data from: LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive...

    • zenodo.org
    • explore.openaire.eu
    zip
    Updated Oct 20, 2022
    + more versions
    Cite
    Sofia Yfantidou; Christina Karagianni; Stefanos Efstathiou; Athena Vakali; Joao Palotti; Dimitrios Panteleimon Giakatos; Thomas Marchioro; Andrei Kazlouski; Elena Ferrari; Šarūnas Girdzijauskas (2022). LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive snapshots of our lives in the wild [Dataset]. http://doi.org/10.5281/zenodo.6832242
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 20, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sofia Yfantidou; Christina Karagianni; Stefanos Efstathiou; Athena Vakali; Joao Palotti; Dimitrios Panteleimon Giakatos; Thomas Marchioro; Andrei Kazlouski; Elena Ferrari; Šarūnas Girdzijauskas
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    LifeSnaps Dataset Documentation

    Ubiquitous self-tracking technologies have penetrated various aspects of our lives, from physical and mental health monitoring to fitness and entertainment. Yet, limited data exist on the association between in the wild large-scale physical activity patterns, sleep, stress, and overall health, and behavioral patterns and psychological measurements due to challenges in collecting and releasing such datasets, such as waning user engagement, privacy considerations, and diversity in data modalities. In this paper, we present the LifeSnaps dataset, a multi-modal, longitudinal, and geographically-distributed dataset, containing a plethora of anthropological data, collected unobtrusively for the total course of more than 4 months by n=71 participants, under the European H2020 RAIS project. LifeSnaps contains more than 35 different data types from second to daily granularity, totaling more than 71M rows of data. The participants contributed their data through numerous validated surveys, real-time ecological momentary assessments, and a Fitbit Sense smartwatch, and consented to make these data available openly to empower future research. We envision that releasing this large-scale dataset of multi-modal real-world data, will open novel research opportunities and potential applications in the fields of medical digital innovations, data privacy and valorization, mental and physical well-being, psychology and behavioral sciences, machine learning, and human-computer interaction.

    The following instructions will get you started with the LifeSnaps dataset and are complementary to the original publication.

    Data Import: Reading CSV

    For ease of use, we provide CSV files containing Fitbit, SEMA, and survey data at daily and/or hourly granularity. You can read the files via any programming language. For example, in Python, you can read the files into a Pandas DataFrame with the pandas.read_csv() command.
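    As a minimal sketch of the pandas.read_csv() route mentioned above: the snippet reads an in-memory stand-in rather than a real LifeSnaps file, and the column names (id, date, steps) and values are invented, so the actual files should be inspected for their real schema.

```python
import io

import pandas as pd

# Stand-in for a daily-granularity CSV file. Column names and values are
# hypothetical; replace io.StringIO(...) with the path to a real file.
csv_text = """id,date,steps
p01,2021-05-24,8123
p01,2021-05-25,10450
p02,2021-05-24,4310
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Example aggregation: mean daily steps per participant.
mean_steps = df.groupby("id")["steps"].mean()
```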

    Data Import: Setting up a MongoDB (Recommended)

    To take full advantage of the LifeSnaps dataset, we recommend that you use the raw, complete data via importing the LifeSnaps MongoDB database.

    To do so, open the terminal/command prompt and run the following command for each collection in the DB. Ensure you have the MongoDB Database Tools installed.

    For the Fitbit data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c fitbit 

    For the SEMA data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c sema 

    For surveys data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c surveys 

    If you have access control enabled, then you will need to add the --username and --password parameters to the above commands.

    Data Availability

    The MongoDB database contains three collections, fitbit, sema, and surveys, containing the Fitbit, SEMA3, and survey data, respectively. Similarly, the CSV files contain related information to these collections. Each document in any collection follows the format shown below:

    {
      _id: 
  9. United Kingdom Retail Sales MoM

    • tradingeconomics.com
    • de.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Updated Jun 20, 2025
    Cite
    TRADING ECONOMICS (2025). United Kingdom Retail Sales MoM [Dataset]. https://tradingeconomics.com/united-kingdom/retail-sales
    Explore at:
    Available download formats: json, xml, excel, csv
    Dataset updated
    Jun 20, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Feb 29, 1996 - May 31, 2025
    Area covered
    United Kingdom
    Description

    Retail Sales in the United Kingdom decreased 2.70 percent in May of 2025 over the previous month. This dataset provides the latest reported value for - United Kingdom Retail Sales MoM - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
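    A month-over-month (MoM) figure such as the one quoted above is a percentage change between two consecutive monthly values. The sketch below shows the arithmetic only; the two index levels are invented to reproduce a 2.70 percent fall and are not taken from the dataset.

```python
def mom_change(previous, current):
    """Percentage change from the previous month to the current month."""
    return (current - previous) / previous * 100.0


# Invented index levels chosen to give a 2.70 percent decrease.
april_level = 100.0
may_level = 97.3
change = mom_change(april_level, may_level)
print(f"{change:.2f}%")  # prints -2.70%
```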

  10. Walmart Retail Data

    • kaggle.com
    Updated May 6, 2024
    Cite
    Saad Abdur Razzaq (2024). Walmart Retail Data [Dataset]. https://www.kaggle.com/datasets/saadabdurrazzaq/walmart-retail-data/data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 6, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Saad Abdur Razzaq
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    The dataset comprises transactional information from the previous 5 years from Walmart retail stores, with diverse details such as customer demographics, order specifics, product attributes, and sales logistics. It includes data on the city where purchases were made, customer age, names, and segments, along with any applied discounts and the quantity of products ordered. Each transaction is uniquely identified by an order ID, accompanied by order date, priority, and shipping details like mode, cost, and dates. Product-related information encompasses base margins, categories, containers, names, and sub-categories, enabling insights into profitability, sales, and regional performance. The dataset also provides granular details such as profit margins, unit prices, and ZIP codes, facilitating analysis at multiple levels like customer behavior, product performance, and operational efficiencies within Walmart's retail ecosystem.

    The columns in the dataset are:

    1. City: The city where the purchase was made.
    2. Customer Age: Age of the customer making the purchase.
    3. Customer Name: Name of the customer.
    4. Customer Segment: Segment to which the customer belongs (like retail, wholesale, etc.).
    5. Discount: Any discount applied to the purchase.
    6. Number of Records: The count of records for each transaction.
    7. Order Date: Date when the order was placed.
    8. Order ID: Unique identifier for each order.
    9. Order Priority: Priority level of the order (like high, medium, low).
    10. Order Quantity: Quantity of products ordered.
    11. Product Base Margin: Base margin percentage for the product.
    12. Product Category: Category to which the product belongs (like electronics, groceries, etc.).
    13. Product Container: Container type of the product.
    14. Product Name: Name of the product.
    15. Product Sub-Category: Sub-category to which the product belongs.
    16. Profit: Profit earned from the transaction.
    17. Region: Region where the purchase was made.
    18. Row ID: Unique identifier for each row.
    19. Sales: Total sales amount.
    20. Ship Date: Date when the order was shipped.
    21. Ship Mode: Mode of shipping (like standard, express, etc.).
    22. Shipping Cost: Cost associated with shipping.
    23. State: State where the purchase was made.
    24. Unit Price: Price per unit of the product.
    25. Zip Code: ZIP code of the customer or store location.
  11. Terrace Fusion Dataset

    • universe.roboflow.com
    zip
    Updated Jun 3, 2022
    Cite
    datasets connection (2022). Terrace Fusion Dataset [Dataset]. https://universe.roboflow.com/datasets-connection/terrace-fusion/dataset/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 3, 2022
    Dataset authored and provided by
    datasets connection
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Social Distancing Bounding Boxes
    Description

    Here are a few use cases for this project:

    1. "Public Safety Compliance": Use this model to monitor public spaces like parks, beaches, or shopping areas to ensure compliance with social distancing protocols. The nature of the images in the dataset could help identify instances where people are or aren't practicing safe distances and provide data on public adherence to guidelines.

    2. "Event Management": Event organizers can integrate this model into their security system to enforce social distancing norms during festivals, concerts, games, or any other mass gathering. This will enable efficient crowd control without requiring extensive human effort.

    3. "Retail Analytics": Retail stores could use this model to monitor customers' adherence to social distancing norms inside their stores. Understanding customer behavior with respect to these norms may provide insights for strategic decisions and operational efficiency.

    4. "Urban Planning and Research": Researchers or urban planners can utilize this model to study the effectiveness of current social distancing policies and norms in different environments. This could help guide future policies or planning of city spaces.

    5. "Education Sector": Schools, colleges, and universities can input live feeds or recorded footage to monitor student behavior regarding social-distancing norms. Providing real-time feedback or periodic reports might help educational institutions ensure an appropriate level of safety on their campuses.

  12. Data from: Retail Theft

    • data.cityofchicago.org
    Updated Jul 12, 2025
    Cite
    Chicago Police Department (2025). Retail Theft [Dataset]. https://data.cityofchicago.org/Public-Safety/Retail-Theft/skfh-cpun
    Explore at:
    Available download formats: tsv, csv, application/rssxml, xml, application/rdfxml, application/geo+json, kml, kmz
    Dataset updated
    Jul 12, 2025
    Authors
    Chicago Police Department
    Description

    This dataset reflects reported incidents of crime (with the exception of murders where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. Data is extracted from the Chicago Police Department's CLEAR (Citizen Law Enforcement Analysis and Reporting) system. In order to protect the privacy of crime victims, addresses are shown at the block level only and specific locations are not identified. Should you have questions about this dataset, you may contact the Research & Development Division of the Chicago Police Department at 312.745.6071 or RandD@chicagopolice.org.

    Disclaimer: These crimes may be based upon preliminary information supplied to the Police Department by the reporting parties that have not been verified. The preliminary crime classifications may be changed at a later date based upon additional investigation and there is always the possibility of mechanical or human error. Therefore, the Chicago Police Department does not guarantee (either expressed or implied) the accuracy, completeness, timeliness, or correct sequencing of the information and the information should not be used for comparison purposes over time. The Chicago Police Department will not be responsible for any error or omission, or for the use of, or the results obtained from the use of this information. All data visualizations on maps should be considered approximate and attempts to derive specific addresses are strictly prohibited. The Chicago Police Department is not responsible for the content of any off-site pages that are referenced by or that reference this web page other than an official City of Chicago or Chicago Police Department web page. The user specifically acknowledges that the Chicago Police Department is not responsible for any defamatory, offensive, misleading, or illegal conduct of other users, links, or third parties and that the risk of injury from the foregoing rests entirely with the user. The unauthorized use of the words "Chicago Police Department," "Chicago Police," or any colorable imitation of these words or the unauthorized use of the Chicago Police Department logo is unlawful. This web page does not, in any way, authorize such use.

    Data is updated daily Tuesday through Sunday. The dataset contains more than 65,000 records/rows of data and cannot be viewed in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as Wordpad, to view and search. To access a list of Chicago Police Department - Illinois Uniform Crime Reporting (IUCR) codes, go to http://data.cityofchicago.org/Public-Safety/Chicago-Police-Department-Illinois-Uniform-Crime-R/c7ck-438e

  13. Monthly average retail prices for selected products

    • www150.statcan.gc.ca
    • datasets.ai
    • +2 more
    Updated Jul 2, 2025
    + more versions
    Cite
    Government of Canada, Statistics Canada (2025). Monthly average retail prices for selected products [Dataset]. http://doi.org/10.25318/1810024501-eng
    Explore at:
    Dataset updated
    Jul 2, 2025
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Monthly average retail prices for selected products, for Canada and provinces. Prices are presented for the current month and the previous four months. Prices are based on transaction data from Canadian retailers, and are presented in Canadian current dollars.

  14. sd.SD.HLTH FOODASSIST P

    • opendata-geospatialdenver.hub.arcgis.com
    Updated Jun 17, 2025
    Cite
    geospatialDENVER: Putting Denver on the map. (2025). sd.SD.HLTH FOODASSIST P [Dataset]. https://opendata-geospatialdenver.hub.arcgis.com/items/26c4b54a14cb434493ed6257d6cde718
    Explore at:
    Dataset updated
    Jun 17, 2025
    Dataset authored and provided by
    geospatialDENVER: Putting Denver on the map.
    Area covered
    Description

    This feature layer is maintained by Denver Department of Public Health and Environment and is a combination of data available through Hunger Free Colorado and ground truthing by the DDPHE CBH Food Systems Team. The data stewards are Paola Babb and Jessika Brenin. It displays food retail locations in Denver. The dataset displays the locations of food pantries classified as traditional or non-traditional, food banks, and food distributors.

    The data set comes from Hunger Free Colorado, a non-profit that supports connecting people to food resources. They review food access points across the state on a yearly basis. The data set was filtered by Denver County, specifically looking at what FIC classified as a Traditional Food Pantry, Non-traditional Food Pantry, Food Bank, or Food Distributor.

    • Traditional Food Pantry: Brick and mortar location that distributes food on a regular basis (daily, weekly, monthly). Typically distributed via in-store shopping and/or food boxes.
    • Non-traditional Food Pantry: Food assistance program that occurs without a brick and mortar structure and/or on an ad hoc basis. Often targets a specific neighborhood or population and may not be open to the public.
    • Food Bank: A place where food pantries purchase food from.
    • Food Distributor: A place that supports pantries in the delivery of food, procurement and access to local fresh produce.

  15. Clickstream Data for Online Shopping

    • kaggle.com
    Updated Apr 13, 2021
    Cite
    Bojan Tunguz (2021). Clickstream Data for Online Shopping [Dataset]. https://www.kaggle.com/datasets/tunguz/clickstream-data-for-online-shopping/discussion
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 13, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Bojan Tunguz
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Source:

    Mariusz Łapczyński, Cracow University of Economics, Poland, lapczynm '@' uek.krakow.pl
    Sylwester Białowąs, Poznan University of Economics and Business, Poland, sylwester.bialowas '@' ue.poznan.pl

    Data Set Information:

    The dataset contains information on clickstream from online store offering clothing for pregnant women. Data are from five months of 2008 and include, among others, product category, location of the photo on the page, country of origin of the IP address and product price in US dollars.

    Attribute Information:

    The dataset contains 14 variables described in a separate file (See 'Data set description')

    Relevant Papers:

    N/A

    Citation Request:

    If you use this dataset, please cite:

    Łapczyński M., Białowąs S. (2013) Discovering Patterns of Users' Behaviour in an E-shop - Comparison of Consumer Buying Behaviours in Poland and Other European Countries, “Studia Ekonomiczne”, nr 151, “La société de l'information : perspective européenne et globale : les usages et les risques d'Internet pour les citoyens et les consommateurs”, p. 144-153

    Data description “e-shop clothing 2008”

    Variables:

    1. YEAR (2008)

    ========================================================

    2. MONTH -> from April (4) to August (8)

    ========================================================

    3. DAY -> day number of the month

    ========================================================

    4. ORDER -> sequence of clicks during one session

    ========================================================

    5. COUNTRY -> variable indicating the country of origin of the IP address, with the following categories:

    1-Australia 2-Austria 3-Belgium 4-British Virgin Islands 5-Cayman Islands 6-Christmas Island 7-Croatia 8-Cyprus 9-Czech Republic 10-Denmark 11-Estonia 12-unidentified 13-Faroe Islands 14-Finland 15-France 16-Germany 17-Greece 18-Hungary 19-Iceland 20-India 21-Ireland 22-Italy 23-Latvia 24-Lithuania 25-Luxembourg 26-Mexico 27-Netherlands 28-Norway 29-Poland 30-Portugal 31-Romania 32-Russia 33-San Marino 34-Slovakia 35-Slovenia 36-Spain 37-Sweden 38-Switzerland 39-Ukraine 40-United Arab Emirates 41-United Kingdom 42-USA 43-biz (*.biz) 44-com (*.com) 45-int (*.int) 46-net (*.net) 47-org (*.org)

    ========================================================

    6. SESSION ID -> variable indicating session id (short record)

    ========================================================

    7. PAGE 1 (MAIN CATEGORY) -> concerns the main product category:

    1-trousers 2-skirts 3-blouses 4-sale

    ========================================================

    8. PAGE 2 (CLOTHING MODEL) -> contains information about the code for each product (217 products)

    ========================================================

    9. COLOUR -> colour of product

    1-beige 2-black 3-blue 4-brown 5-burgundy 6-gray 7-green 8-navy blue 9-of many colors 10-olive 11-pink 12-red 13-violet 14-white

    ========================================================

    10. LOCATION -> photo location on the page, the screen has been divided into six parts:

    1-top left 2-top in the middle 3-top right 4-bottom left 5-bottom in the middle 6-bottom right

    ========================================================

    11. MODEL PHOTOGRAPHY -> variable with two categories:

    1-en face 2-profile

    ========================================================

    12. PRICE -> price in US dollars

    ========================================================

    13. PRICE 2 -> variable informing whether the price of a particular product is higher than the average price for the entire product category:

    1-yes 2-no

    ========================================================

    14. PAGE -> page number within the e-store website (from 1 to 5)

    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
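The coded variables above can be turned into readable labels with plain lookup tables. The following is a minimal sketch, assuming each click is available as a dict keyed by the variable names in the description; the actual column headers in the file may differ, and `decode_click` and the example row are hypothetical.

```python
# Code-to-label maps copied from the data description above.
MAIN_CATEGORY = {1: "trousers", 2: "skirts", 3: "blouses", 4: "sale"}
LOCATION = {
    1: "top left", 2: "top in the middle", 3: "top right",
    4: "bottom left", 5: "bottom in the middle", 6: "bottom right",
}
MODEL_PHOTOGRAPHY = {1: "en face", 2: "profile"}
# PRICE 2: 1 means the price is above the category average, 2 means it is not.
PRICE_ABOVE_CATEGORY_AVERAGE = {1: True, 2: False}

# Assumed field names (taken from the variable list, not verified against the file).
FIELD_MAPS = {
    "PAGE 1 (MAIN CATEGORY)": MAIN_CATEGORY,
    "LOCATION": LOCATION,
    "MODEL PHOTOGRAPHY": MODEL_PHOTOGRAPHY,
    "PRICE 2": PRICE_ABOVE_CATEGORY_AVERAGE,
}


def decode_click(row):
    """Return a copy of `row` with coded fields replaced by their labels."""
    decoded = dict(row)
    for field, labels in FIELD_MAPS.items():
        if field in decoded:
            decoded[field] = labels[decoded[field]]
    return decoded


# Made-up example row for illustration only.
example = {
    "PAGE 1 (MAIN CATEGORY)": 2,
    "LOCATION": 5,
    "MODEL PHOTOGRAPHY": 1,
    "PRICE": 38,
    "PRICE 2": 2,
}
decoded_example = decode_click(example)
print(decoded_example)
```

Fields without a lookup table, such as PRICE, pass through unchanged.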

  16. Online Retail E-Commerce Data

    • kaggle.com
    Updated Mar 12, 2025
    Shravan Kanamadi (2025). Online Retail E-Commerce Data [Dataset]. https://www.kaggle.com/datasets/shravankanamadi/online-retail-e-commerce-data/data
    Dataset updated
    Mar 12, 2025
    Dataset provided by
    Kaggle
    Authors
    Shravan Kanamadi
    Description

    Online Retail E-Commerce Data

    Hey everyone! 👋

    This dataset contains real e-commerce transaction data from 2009 to 2011. It comes from a UK-based online store that sells a variety of products. The data includes details like invoices, product codes, descriptions, prices, and even customer IDs.

    What’s Inside?

    Each row represents a transaction, and the dataset has the following key columns:

    • 🛒 Invoice – Unique order ID
    • 📦 StockCode – Product code
    • 📝 Description – Name of the product
    • 📊 Quantity – Number of units sold
    • ⏳ InvoiceDate – When the purchase happened
    • 💰 Price – Unit price of the product
    • 👤 Customer ID – Unique identifier for each customer
    • 🌍 Country – Where the customer is from

    Why is this dataset useful?

    This dataset is great for exploring:

    • Customer Segmentation (Find high-value customers)
    • Customer Lifetime Value (LTV) Analysis
    • Sales & Revenue Trends
    • Market Basket Analysis (Which products are bought together?)
    • Predicting Churn & Retention Strategies

    How Can You Use It?

    If you're into data science, machine learning, or business analytics, this dataset is perfect for hands-on projects. You can analyze customer behavior, predict sales, or even build recommendation systems.

    Hope this dataset helps with your projects! Let me know if you find something interesting.

  17. Data from: Online retail

    • kaggle.com
    Updated Mar 5, 2020
    Hicham IKNE (2020). Online retail [Dataset]. https://www.kaggle.com/hikne707/online-retail/tasks
    Dataset updated
    Mar 5, 2020
    Dataset provided by
    Kaggle
    Authors
    Hicham IKNE
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Data Set Information:

    This Online Retail II data set contains all the transactions occurring for a UK-based and registered, non-store online retailer between 01/12/2009 and 09/12/2011. The company mainly sells unique all-occasion giftware. Many customers of the company are wholesalers.

    Attribute Information:

    • InvoiceNo: Invoice number. Nominal. A 6-digit integral number uniquely assigned to each transaction. If this code starts with the letter 'c', it indicates a cancellation.
    • StockCode: Product (item) code. Nominal. A 5-digit integral number uniquely assigned to each distinct product.
    • Description: Product (item) name. Nominal.
    • Quantity: The quantities of each product (item) per transaction. Numeric.
    • InvoiceDate: Invoice date and time. Numeric. The day and time when a transaction was generated.
    • UnitPrice: Unit price. Numeric. Product price per unit in sterling (£).
    • CustomerID: Customer number. Nominal. A 5-digit integral number uniquely assigned to each customer.
    • Country: Country name. Nominal. The name of the country where a customer resides.
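Because cancellations are flagged only by the invoice code, they usually need to be filtered before totals are computed. Below is a minimal sketch, assuming rows are dicts keyed by the attribute names above; the check is case-insensitive as a precaution, and the sample rows are made up for illustration.

```python
def is_cancellation(invoice_no):
    """True when the invoice code starts with the letter 'c' (either case)."""
    return str(invoice_no).strip().lower().startswith("c")


def net_revenue(rows):
    """Sum Quantity * UnitPrice over rows that are not cancellations."""
    return sum(
        row["Quantity"] * row["UnitPrice"]
        for row in rows
        if not is_cancellation(row["InvoiceNo"])
    )


# Hypothetical rows, not taken from the dataset.
rows = [
    {"InvoiceNo": "489434", "Quantity": 12, "UnitPrice": 2.5},
    {"InvoiceNo": "C489449", "Quantity": -1, "UnitPrice": 4.0},
    {"InvoiceNo": "489435", "Quantity": 2, "UnitPrice": 1.5},
]

total = net_revenue(rows)
print(total)  # 33.0 (the cancelled row is skipped)
```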

  18. Online Retail List for RFM

    • kaggle.com
    Updated Sep 23, 2021
    İlker Yıldız (2021). Online Retail List for RFM [Dataset]. https://www.kaggle.com/ilkeryildiz/online-retail-listing/tasks
    Dataset updated
    Sep 23, 2021
    Dataset provided by
    Kaggle
    Authors
    İlker Yıldız
    Description

    Context

    This Online Retail II data set contains all the transactions occurring for a UK-based and registered, non-store online retailer between 01/12/2009 and 09/12/2011. The company mainly sells unique all-occasion giftware. Many customers of the company are wholesalers.

    Details

    An e-commerce company wants to segment its customers and determine marketing strategies according to these segments. To this end, we will define the behavior of customers and create groups according to clusters in these behaviors. In other words, we will include those who exhibit common behaviors in the same groups and we will try to develop special sales and marketing techniques for these groups.

    • InvoiceNo: Invoice number. Nominal. A 6-digit integral number uniquely assigned to each transaction. If this code starts with the letter 'c', it indicates a cancellation.
    • StockCode: Product (item) code. Nominal. A 5-digit integral number uniquely assigned to each distinct product.
    • Description: Product (item) name. Nominal.
    • Quantity: The quantities of each product (item) per transaction. Numeric.
    • InvoiceDate: Invoice date and time. Numeric. The day and time when a transaction was generated.
    • UnitPrice: Unit price. Numeric. Product price per unit in sterling (£).
    • CustomerID: Customer number. Nominal. A 5-digit integral number uniquely assigned to each customer.
    • Country: Country name. Nominal. The name of the country where a customer resides.
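The "RFM" in the dataset title refers to recency, frequency, and monetary value, a common basis for the kind of behavioural segmentation described above. The sketch below is one possible way to derive the three values per customer from the listed columns; the function name `rfm`, the snapshot date, and the sample rows are all assumptions made for illustration.

```python
from datetime import datetime


def rfm(rows, snapshot):
    """Return {CustomerID: (recency_days, frequency, monetary)}.

    recency_days: days between the customer's last invoice and `snapshot`
    frequency:    number of distinct invoices
    monetary:     sum of Quantity * UnitPrice
    """
    last_seen, invoices, spend = {}, {}, {}
    for row in rows:
        cid = row["CustomerID"]
        when = row["InvoiceDate"]
        last_seen[cid] = max(last_seen.get(cid, when), when)
        invoices.setdefault(cid, set()).add(row["InvoiceNo"])
        spend[cid] = spend.get(cid, 0.0) + row["Quantity"] * row["UnitPrice"]
    return {
        cid: ((snapshot - last_seen[cid]).days, len(invoices[cid]), spend[cid])
        for cid in last_seen
    }


# Hypothetical rows, not taken from the dataset.
rows = [
    {"CustomerID": 12345, "InvoiceNo": "500001",
     "InvoiceDate": datetime(2011, 12, 1), "Quantity": 2, "UnitPrice": 2.5},
    {"CustomerID": 12345, "InvoiceNo": "500002",
     "InvoiceDate": datetime(2011, 12, 5), "Quantity": 1, "UnitPrice": 4.0},
    {"CustomerID": 54321, "InvoiceNo": "500003",
     "InvoiceDate": datetime(2011, 11, 10), "Quantity": 4, "UnitPrice": 1.25},
]

scores = rfm(rows, snapshot=datetime(2011, 12, 10))
print(scores)
```

Customers can then be grouped by ranking or binning each of the three values; the binning scheme is a modelling choice that the dataset description does not prescribe.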
  19. Predicting Coupon Redemption

    • kaggle.com
    Updated Nov 17, 2019
    vasudeva (2019). Predicting Coupon Redemption [Dataset]. https://www.kaggle.com/vasudeva009/predicting-coupon-redemption/discussion
    Dataset updated
    Nov 17, 2019
    Dataset provided by
    Kaggle
    Authors
    vasudeva
    Description

    Problem Statement

    Predicting Coupon Redemption

    XYZ Credit Card company regularly helps its merchants understand their data better and take key business decisions accurately by providing machine learning and analytics consulting. ABC is an established Brick & Mortar retailer that frequently conducts marketing campaigns for its diverse product range. As a merchant of XYZ, they have sought XYZ to assist them in their discount marketing process using the power of machine learning.

    Discount marketing and coupon usage are very widely used promotional techniques to attract new customers and to retain & reinforce loyalty of existing customers. The measurement of a consumer’s propensity towards coupon usage and the prediction of the redemption behaviour are crucial parameters in assessing the effectiveness of a marketing campaign.

    ABC promotions are shared across various channels including email, notifications, etc. A number of these campaigns include coupon discounts that are offered for a specific product/range of products. The retailer would like the ability to predict whether customers redeem the coupons received across channels, which will enable the retailer’s marketing team to accurately design coupon construct, and develop more precise and targeted marketing strategies.

    The data available in this problem contains the following information, including the details of a sample of campaigns and coupons used in previous campaigns -

    User Demographic Details

    Campaign and coupon Details

    Product details

    Previous transactions

    Based on previous transaction and performance data from the last 18 campaigns, predict, for each coupon and customer combination in the next 10 campaigns of the test set, the probability that the customer will redeem the coupon.

    Dataset Description

    Here is the schema for the different data tables available. The detailed data dictionary is provided next.

    You are provided with the following files:

    train.csv: Train data containing the coupons offered to the given customers under the 18 campaigns

    | Variable | Definition |
    |---|---|
    | id | Unique id for coupon customer impression |
    | campaign_id | Unique id for a discount campaign |
    | coupon_id | Unique id for a discount coupon |
    | customer_id | Unique id for a customer |
    | redemption_status | (target) (0 - Coupon not redeemed, 1 - Coupon redeemed) |

    campaign_data.csv: Campaign information for each of the 28 campaigns

    | Variable | Definition |
    |---|---|
    | campaign_id | Unique id for a discount campaign |
    | campaign_type | Anonymised Campaign Type (X/Y) |
    | start_date | Campaign Start Date |
    | end_date | Campaign End Date |

    coupon_item_mapping.csv: Mapping of coupon and items valid for discount under that coupon

    | Variable | Definition |
    |---|---|
    | coupon_id | Unique id for a discount coupon (no order) |
    | item_id | Unique id for items for which given coupon is valid (no order) |

    customer_demographics.csv: Customer demographic information for some customers

    | Variable | Definition |
    |---|---|
    | customer_id | Unique id for a customer |
    | age_range | Age range of customer family in years |
    | marital_status | Married/Single |
    | rented | 0 - not rented accommodation, 1 - rented accommodation |
    | family_size | Number of family members |
    | no_of_children | Number of children in the family |
    | income_bracket | Label Encoded Income Bracket (Higher income corresponds to higher number) |

    customer_transaction_data.csv: Transaction data for all customers for duration of campaigns in the train data

    | Variable | Definition |
    |---|---|
    | date | Date of Transaction |
    | customer_id | Unique id for a customer |
    | item_id | Unique id for item |
    | quantity | Quantity of item bought |
    | selling_price | Sales value of the transaction |
    | other_discount | Discount from other sources such as manufacturer coupon/loyalty card |
    | coupon_discount | Discount availed from retailer coupon |

    item_data.csv: Item information for each item sold by the retailer

    | Variable | Definition |
    |---|---|
    | item_id | Unique id for item |
    | brand | Unique id for item brand |
    | brand_type | Brand Type (local/Established) |
    | category | Item Category |

    test.csv: Contains the coupon customer combination for which redemption status is to be predicted

    | Variable | Definition |
    |---|---|
    | id | Unique id for coupon customer impression |
    | campaign_id | Unique id for a discount campaign |
    | coupon_id | Unique id for a discount coupon |
    | customer_id | Unique id for a customer |

    To summarise the entire process:

    • Customers receive coupons under various campaigns and may choose to redeem them.
    • They can redeem a given coupon for any product that is valid for that coupon, as per the coupon item mapping, between the campaign start date and end date.
    • Next, the customer will redeem the coupon for an item at the retailer store and that will reflect in the transaction table in the column co...
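The redemption rule summarised above (a valid item for the coupon, bought within the campaign window) can be expressed as a small check. This is only a sketch under stated assumptions: the coupon-to-items mapping is held as a dict of sets, dates are already parsed into `date` objects, and both window endpoints are treated as inclusive, which the description does not specify. All ids and dates below are made up.

```python
from datetime import date


def is_eligible_purchase(item_id, purchase_date, coupon_id, campaign, coupon_items):
    """True when the item is valid for the coupon and the purchase date
    falls between the campaign start and end dates (inclusive, by assumption)."""
    valid_items = coupon_items.get(coupon_id, set())
    in_window = campaign["start_date"] <= purchase_date <= campaign["end_date"]
    return item_id in valid_items and in_window


# Hypothetical stand-ins for coupon_item_mapping.csv and campaign_data.csv.
coupon_items = {27: {101, 102, 103}}
campaign = {
    "campaign_id": 13,
    "start_date": date(2013, 5, 19),
    "end_date": date(2013, 7, 5),
}

inside = is_eligible_purchase(102, date(2013, 6, 1), 27, campaign, coupon_items)
wrong_item = is_eligible_purchase(999, date(2013, 6, 1), 27, campaign, coupon_items)
too_late = is_eligible_purchase(102, date(2013, 8, 1), 27, campaign, coupon_items)
print(inside, wrong_item, too_late)  # True False False
```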
  20. Supplement Sales Prediction

    • kaggle.com
    Updated Sep 17, 2021
    A SURESH (2021). Supplement Sales Prediction [Dataset]. https://www.kaggle.com/sureshmecad/supplement-sales-prediction/tasks
    Dataset updated
    Sep 17, 2021
    Dataset provided by
    Kaggle
    Authors
    A SURESH
    Description

    Context

    Supplement Sales Prediction

    • Your Client WOMart is a leading nutrition and supplement retail chain that offers a comprehensive range of products for all your wellness and fitness needs.

    • WOMart follows a multi-channel distribution strategy with 350+ retail stores spread across 100+ cities.

    • Effective forecasting for store sales gives essential insight into upcoming cash flow, meaning WOMart can more accurately plan the cashflow at the store level.

    • Sales data for 18 months from 365 stores of WOMart is available, along with the Store Type, Location Type, and Region Code of each store, the Discount offered by the store on each day, the Number of Orders each day, etc.

    • Your task is to predict the store sales for each store in the test set for the next two months.

    Content

    Train Data

    | Variable | Definition |
    |---|---|
    | ID | Unique Identifier for a row |
    | Store_id | Unique id for each Store |
    | Store_Type | Type of the Store |
    | Location_Type | Type of the location where Store is located |
    | Region_Code | Code of the Region where Store is located |
    | Date | Information about the Date |
    | Holiday | If there is holiday on the given Date, 1 : Yes, 0 : No |
    | Discount | If discount is offered by store on the given Date, Yes/ No |
    | #Orders | Number of Orders received by the Store on the given Day |
    | Sales | Total Sale for the Store on the given Day |

    Test Data

    | Variable | Definition |
    |---|---|
    | ID | Unique Identifier for a row |
    | Store_id | Unique id for each Store |
    | Store_Type | Type of the Store |
    | Location_Type | Type of the location where Store is located |
    | Region_Code | Code of the Region where Store is located |
    | Date | Information about the Date |
    | Holiday | If there is holiday on the given Date, 1 : Yes, 0 : No |
    | Discount | If discount is offered by store on the given Date, Yes/ No |

    Sample_Submission

    | Variable | Definition |
    |---|---|
    | ID | Unique Identifier for a row |
    | Sales | Total Sale for the Store on the given Day |

    Evaluation

    • The evaluation metric for this competition is MSLE * 1000 across all entries in the test set.

    Public and Private Split

    • Test data is further divided into Public (First 20 Days) and Private (Last 41 Days). You will make the prediction for two months (61 days).
    • Your initial responses will be checked and scored on the Public data.
    • The final rankings would be based on your private score which will be published once the competition is over.

    The submitted Sales column is compared with the actual values as in the following example. The real test set has 22266 items rather than 8 (the function is available in sklearn).

    Sample Input :

    actual = [27.5, 55.9, 25.8, 17.7, 27.6, 55.9, 25.7, 17.8]
    predicted = [24.0, 49.1, 21.0, 16.2, 23.3, 47.0, 12.1, 15.2]
    score = MSLE(actual, predicted) * 1000

    Sample Output:

    82.9949678377161
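The metric can be reproduced without any third-party library. The sketch below assumes the standard definition of mean squared logarithmic error, the mean of (ln(1 + actual) - ln(1 + predicted))², which is also what `sklearn.metrics.mean_squared_log_error` computes; the function name `msle_times_1000` is hypothetical.

```python
import math


def msle_times_1000(actual, predicted):
    """Mean squared logarithmic error, scaled by 1000."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_log_errors = [
        (math.log1p(a) - math.log1p(p)) ** 2
        for a, p in zip(actual, predicted)
    ]
    return 1000 * sum(squared_log_errors) / len(squared_log_errors)


# Values from the sample input above.
actual = [27.5, 55.9, 25.8, 17.7, 27.6, 55.9, 25.7, 17.8]
predicted = [24.0, 49.1, 21.0, 16.2, 23.3, 47.0, 12.1, 15.2]

score = msle_times_1000(actual, predicted)
print(score)  # approximately 82.995, in line with the sample output above
```

Most of the score in this example comes from the seventh pair (25.7 vs 12.1), which shows how heavily the metric penalises a large relative under-prediction.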



Prasad Patil (2024). Retail Transactions Dataset [Dataset]. https://www.kaggle.com/datasets/prasad22/retail-transactions-dataset

Retail Transactions Dataset

For market basket analysis, customer segmentation & other retail analytics tasks

Dataset updated
May 18, 2024
Dataset provided by
Kaggle
Authors
Prasad Patil
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

This dataset was created to simulate a market basket dataset, providing insights into customer purchasing behavior and store operations. The dataset facilitates market basket analysis, customer segmentation, and other retail analytics tasks. Here's more information about the context and inspiration behind this dataset:

Context:

Retail businesses, from supermarkets to convenience stores, are constantly seeking ways to better understand their customers and improve their operations. Market basket analysis, a technique used in retail analytics, explores customer purchase patterns to uncover associations between products, identify trends, and optimize pricing and promotions. Customer segmentation allows businesses to tailor their offerings to specific groups, enhancing the customer experience.

Inspiration:

The inspiration for this dataset comes from the need for accessible and customizable market basket datasets. While real-world retail data is sensitive and often restricted, synthetic datasets offer a safe and versatile alternative. Researchers, data scientists, and analysts can use this dataset to develop and test algorithms, models, and analytical tools.

Dataset Information:

The columns provide information about the transactions, customers, products, and purchasing behavior, making the dataset suitable for various analyses, including market basket analysis and customer segmentation. Here's a brief explanation of each column in the Dataset:

  • Transaction_ID: A unique identifier for each transaction, represented as a 10-digit number. This column is used to uniquely identify each purchase.
  • Date: The date and time when the transaction occurred. It records the timestamp of each purchase.
  • Customer_Name: The name of the customer who made the purchase. It provides information about the customer's identity.
  • Product: A list of products purchased in the transaction. It includes the names of the products bought.
  • Total_Items: The total number of items purchased in the transaction. It represents the quantity of products bought.
  • Total_Cost: The total cost of the purchase, in currency. It represents the financial value of the transaction.
  • Payment_Method: The method used for payment in the transaction, such as credit card, debit card, cash, or mobile payment.
  • City: The city where the purchase took place. It indicates the location of the transaction.
  • Store_Type: The type of store where the purchase was made, such as a supermarket, convenience store, department store, etc.
  • Discount_Applied: A binary indicator (True/False) representing whether a discount was applied to the transaction.
  • Customer_Category: A category representing the customer's background or age group.
  • Season: The season in which the purchase occurred, such as spring, summer, fall, or winter.
  • Promotion: The type of promotion applied to the transaction, such as "None," "BOGO (Buy One Get One)," or "Discount on Selected Items."

Use Cases:

  • Market Basket Analysis: Discover associations between products and uncover buying patterns.
  • Customer Segmentation: Group customers based on purchasing behavior.
  • Pricing Optimization: Optimize pricing strategies and identify opportunities for discounts and promotions.
  • Retail Analytics: Analyze store performance and customer trends.

Note: This dataset is entirely synthetic and was generated using the Python Faker library, which means it doesn't contain real customer data. It's designed for educational and research purposes.
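As a starting point for the market basket use case, co-purchase counts can be computed directly from the per-transaction product lists. The sketch below assumes the Product column has already been parsed into Python lists (how it is serialised in the file is not stated here); `pair_counts` and the sample baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each pair has one canonical ordering.
        items = sorted(set(basket))
        counts.update(combinations(items, 2))
    return counts


# Made-up baskets for illustration only.
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
]

counts = pair_counts(baskets)
# Support of a pair = share of baskets that contain both products.
support = {pair: n / len(baskets) for pair, n in counts.items()}
print(counts.most_common())
```

Pair counts of this kind are the raw input to association measures such as support, confidence, and lift; dedicated algorithms become worthwhile once the product catalogue is large.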
