23 datasets found
  1. Data from: Retail Sales Analysis:

    • kaggle.com
    Updated Aug 6, 2024
    Cite
    Talha khalid (2024). Retail Sales Analysis: [Dataset]. https://www.kaggle.com/datasets/talhachoudary/sales-of-company/code
    Dataset updated
    Aug 6, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Talha khalid
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Overview
    This collection of datasets is designed to provide a comprehensive overview of a retail business's operations, focusing on calendar information, customer demographics, order details, and product information. These datasets are ideal for performing in-depth sales analysis, customer segmentation, demand forecasting, and inventory management.

    Dataset Descriptions

    Calendar.csv
    Description: This file contains detailed calendar information to assist with time-based analysis. It includes important dates, such as holidays, weekends, and fiscal periods, which can be critical for analyzing sales trends, seasonality, and promotional impacts.
    Key Columns:
    • Date: The specific date.
    • Day of Week: The day of the week (e.g., Monday, Tuesday).
    • Month: The month corresponding to the date.
    • Quarter: The fiscal quarter (Q1, Q2, etc.).
    • Year: The year of the date.
    • Holiday Flag: Indicates if the date is a public holiday.

    Customer.csv
    Description: This dataset contains demographic information about the customers. It’s useful for customer segmentation, lifetime value analysis, and targeted marketing campaigns.
    Key Columns:
    • Customer ID: A unique identifier for each customer.
    • Name: The full name of the customer.
    • Age: The age of the customer.
    • Gender: The gender of the customer.
    • Location: The geographic location (city/state) of the customer.
    • Loyalty Tier: The loyalty program tier of the customer (e.g., Bronze, Silver, Gold).

    Order.csv
    Description: This dataset tracks individual customer orders, including transaction details. It is essential for sales analysis, order fulfillment tracking, and revenue analysis.
    Key Columns:
    • Order ID: A unique identifier for each order.
    • Customer ID: The ID of the customer who placed the order (linking to Customer.csv).
    • Order Date: The date the order was placed.
    • Product ID: The ID of the product ordered (linking to Product.csv).
    • Quantity: The quantity of the product ordered.
    • Total Price: The total price of the order.

    Product.csv
    Description: This dataset provides detailed information on the products available in the retail store. It includes categories, pricing, and supplier information, making it useful for inventory management and product performance analysis.
    Key Columns:
    • Product ID: A unique identifier for each product.
    • Product Name: The name of the product.
    • Category: The category under which the product falls (e.g., Electronics, Clothing).
    • Supplier ID: The ID of the supplier providing the product.
    • Unit Price: The price per unit of the product.
    • Stock Quantity: The number of units available in stock.

    Usability
    These datasets can be utilized for various business analytics tasks, including:
    • Sales and Revenue Analysis: By linking Order.csv and Product.csv, one can analyze sales performance by product category, identify best-sellers, and determine revenue drivers.
    • Customer Segmentation: Using Customer.csv, segment customers based on demographics or purchase behavior to tailor marketing efforts.
    • Demand Forecasting: Integrate Calendar.csv to model seasonality effects and predict future sales trends.

    Provenance
    These datasets are typically generated from an ERP system or CRM and are structured to support a variety of business intelligence applications. Users may need to perform data cleaning or transformation depending on the specific use case.

    Licensing and Coverage
    The datasets are provided without a specific license. Users are encouraged to verify and attribute the source as needed. Coverage typically includes the entire operational history of the retail business, though users should check for any specific time range covered.
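
    The Sales and Revenue Analysis use case for this dataset (linking Order.csv to Product.csv on Product ID) can be sketched with the Python standard library alone. This is a minimal illustration, not the dataset author's method: the header spellings are assumed to match the column names listed in the description, and the sample rows are made up so that the block runs without downloading anything.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows standing in for Product.csv and Order.csv.
# Header spellings are an assumption based on the listed column names;
# the real files may use different headers.
PRODUCT_CSV = """Product ID,Product Name,Category,Supplier ID,Unit Price,Stock Quantity
P1,Headphones,Electronics,S1,50.00,20
P2,T-Shirt,Clothing,S2,15.00,100
P3,Charger,Electronics,S1,10.00,75
"""

ORDER_CSV = """Order ID,Customer ID,Order Date,Product ID,Quantity,Total Price
O1,C1,2024-01-05,P1,1,50.00
O2,C2,2024-01-06,P2,2,30.00
O3,C1,2024-01-07,P3,3,30.00
O4,C3,2024-01-08,P9,1,5.00
"""


def revenue_by_category(order_file, product_file):
    """Sum Total Price per product Category by joining on Product ID."""
    category_of = {
        row["Product ID"]: row["Category"]
        for row in csv.DictReader(product_file)
    }
    totals = defaultdict(float)
    for row in csv.DictReader(order_file):
        # Orders whose Product ID is missing from the product file are
        # grouped under "Unknown" rather than silently dropped.
        category = category_of.get(row["Product ID"], "Unknown")
        totals[category] += float(row["Total Price"])
    return dict(totals)


totals = revenue_by_category(io.StringIO(ORDER_CSV), io.StringIO(PRODUCT_CSV))
for category, revenue in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{category}: {revenue:.2f}")
```

    To run this against the real files, the two `io.StringIO` objects would be replaced by opened file handles. Keeping unmatched orders under "Unknown" is a design choice that makes referential gaps visible during the data cleaning the description says may be needed.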

  2. US Consumer Demographics | Homeowners & Renters | Email & Mobile Phone |...

    • datarade.ai
    .json, .csv, .xls
    Updated Oct 18, 2024
    Cite
    CompCurve (2024). US Consumer Demographics | Homeowners & Renters | Email & Mobile Phone | Bulk & Custom | 255M People [Dataset]. https://datarade.ai/data-products/compcurve-us-consumer-demographics-homeowners-renters-compcurve
    Explore at:
    Available download formats: .json, .csv, .xls
    Dataset updated
    Oct 18, 2024
    Dataset authored and provided by
    CompCurve
    Area covered
    United States
    Description

    Knowing who your consumers are is essential for businesses, marketers, and researchers. This detailed demographic file offers an in-depth look at American consumers, packed with insights about personal details, household information, financial status, and lifestyle choices. Let's take a closer look at the data:

    Personal Identifiers and Basic Demographics
    At the heart of this dataset are the key details that make up a consumer profile:

    • Unique IDs (PID, HHID) for individuals and households
    • Full names (First, Middle, Last) and suffixes
    • Gender and age
    • Date of birth
    • Complete location details (address, city, state, ZIP)

    These identifiers are critical for accurate marketing and form the base for deeper analysis.

    Geospatial Intelligence
    This file goes beyond just listing addresses by including rich geospatial data like:

    • Latitude and longitude
    • Census tract and block details
    • Codes for Metropolitan Statistical Areas (MSA) and Core-Based Statistical Areas (CBSA)
    • County size codes
    • Geocoding accuracy

    This allows for precise geographic segmentation and localized marketing.

    Housing and Property Data
    The dataset covers a lot of ground when it comes to housing, providing valuable insights for real estate professionals, lenders, and home service providers:

    • Homeownership status
    • Dwelling type (single-family, multi-family, etc.)
    • Property values (market, assessed, and appraised)
    • Year built and square footage
    • Room count, amenities like fireplaces or pools, and building quality

    This data is crucial for targeting homeowners with products and services like refinancing or home improvement offers.

    Wealth and Financial Data
    For a deeper dive into consumer wealth, the file includes:

    • Estimated household income
    • Wealth scores
    • Credit card usage
    • Mortgage info (loan amounts, rates, terms)
    • Home equity estimates and investment property ownership

    These indicators are invaluable for financial services, luxury brands, and fundraising organizations looking to reach affluent individuals.

    Lifestyle and Interests
    One of the most useful features of the dataset is its extensive lifestyle segmentation:

    • Hobbies and interests (e.g., gardening, travel, sports)
    • Book preferences, magazine subscriptions
    • Outdoor activities (camping, fishing, hunting)
    • Pet ownership, tech usage, political views, and religious affiliations

    This data is perfect for crafting personalized marketing campaigns and developing products that align with specific consumer preferences.

    Consumer Behavior and Purchase Habits
    The file also sheds light on how consumers behave and shop:

    • Online and catalog shopping preferences
    • Gift-giving tendencies, presence of children, vehicle ownership
    • Media consumption (TV, radio, internet)

    Retailers and e-commerce businesses will find this behavioral data especially useful for tailoring their outreach.

    Demographic Clusters and Segmentation
    The file includes pre-built segments such as:

    • Household, neighborhood, family, and digital clusters
    • Generational and lifestage groups

    These make it easier to quickly target specific demographics, streamlining the process for market analysis and campaign planning.

    Ethnicity and Language Preferences
    In today's multicultural market, knowing your audience's cultural background is key. The file includes:

    • Ethnicity codes and language preferences
    • Flags for Hispanic/Spanish-speaking households

    This helps ensure culturally relevant and sensitive communication.

    Education and Occupation Data
    The dataset also tracks education and career info:

    • Education level and occupation codes
    • Home-based business indicators

    This data is essential for B2B marketers, recruitment agencies, and education-focused campaigns.

    Digital and Social Media Habits
    With everyone online, digital behavior insights are a must:

    • Internet, TV, radio, and magazine usage
    • Social media platform engagement (Facebook, Instagram, LinkedIn)
    • Streaming subscriptions (Netflix, Hulu)

    This data helps marketers, app developers, and social media managers connect with their audience in the digital space.

    Political and Charitable Tendencies
    For political campaigns or non-profits, this dataset offers:

    • Political affiliations and outlook
    • Charitable donation history
    • Volunteer activities

    These insights are perfect for cause-related marketing and targeted political outreach.

    Neighborhood Characteristics
    By incorporating census data, the file provides a bigger picture of the consumer's environment:

    • Population density, racial composition, and age distribution
    • Housing occupancy and ownership rates

    This offers important context for understanding the demographic landscape.

    Predictive Consumer Indexes
    The dataset includes forward-looking indicators in categories like:

    • Fashion, automotive, and beauty products
    • Health, home decor, pet products, sports, and travel

    These predictive insights help businesses anticipate consumer trends and needs.

    Contact Information
    Finally, the file includes key communication details:

    • Multiple phone numbers (landline, mobile) and email addresses
    • Do Not Call (DNC) flags...

  3. Dataplex: All CMS Data Feeds | Access 1519 Reports & 26B+ Rows of Data |...

    • datarade.ai
    .csv
    Updated Aug 14, 2024
    Cite
    Dataplex (2024). Dataplex: All CMS Data Feeds | Access 1519 Reports & 26B+ Rows of Data | Perfect for Historical Analysis & Easy Ingestion [Dataset]. https://datarade.ai/data-products/dataplex-all-cms-data-feeds-access-1519-reports-26b-row-dataplex
    Explore at:
    Available download formats: .csv
    Dataset updated
    Aug 14, 2024
    Dataset authored and provided by
    Dataplex
    Area covered
    United States of America
    Description

    The All CMS Data Feeds dataset is an expansive resource offering access to 118 unique report feeds, providing in-depth insights into various aspects of the U.S. healthcare system. With over 25.8 billion rows of data meticulously collected since 2007, this dataset is invaluable for healthcare professionals, analysts, researchers, and businesses seeking to understand and analyze healthcare trends, performance metrics, and demographic shifts over time. The dataset is updated monthly, ensuring that users always have access to the most current and relevant data available.

    Dataset Overview:

    118 Report Feeds:

    • The dataset includes a wide array of report feeds, each providing unique insights into a different dimension of healthcare. Topics include Medicare and Medicaid service metrics, patient demographics, provider information, financial data, and much more. This breadth means users can find relevant data for nearly any healthcare-related analysis.
    • As CMS releases new report feeds, they are automatically added to this dataset, keeping it current and expanding its utility for users.

    25.8 Billion Rows of Data:

    • With over 25.8 billion rows of data, this dataset provides a comprehensive view of the U.S. healthcare system. This extensive volume of data allows for granular analysis, enabling users to uncover insights that might be missed in smaller datasets. The data is also meticulously cleaned and aligned, ensuring accuracy and ease of use.

    Historical Data Since 2007:

    • The dataset spans from 2007 to the present, offering a rich historical perspective that is essential for tracking long-term trends and changes in healthcare delivery, policy impacts, and patient outcomes. This historical data is particularly valuable for conducting longitudinal studies and evaluating the effects of various healthcare interventions over time.

    Monthly Updates:

    • To ensure that users have access to the most current information, the dataset is updated monthly. These updates include new reports as well as revisions to existing data, making the dataset a continuously evolving resource that stays relevant and accurate.

    Data Sourced from CMS:

    • The data in this dataset is sourced directly from the Centers for Medicare & Medicaid Services (CMS). After collection, the data is meticulously cleaned and its attributes are aligned, ensuring consistency, accuracy, and ease of use for any application. Furthermore, any new updates or releases from CMS are automatically integrated into the dataset, keeping it comprehensive and current.

    Use Cases:

    Market Analysis:

    • The dataset is ideal for market analysts who need to understand the dynamics of the healthcare industry. The extensive historical data allows for detailed segmentation and analysis, helping users identify trends, market shifts, and growth opportunities. The comprehensive nature of the data enables users to perform in-depth analyses of specific market segments, making it a valuable tool for strategic decision-making.

    Healthcare Research:

    • Researchers will find the All CMS Data Feeds dataset to be a robust foundation for academic and commercial research. The historical data, combined with the breadth of coverage across various healthcare metrics, supports rigorous, in-depth analysis. Researchers can explore the effects of healthcare policies, study patient outcomes, analyze provider performance, and more, all within a single, comprehensive dataset.

    Performance Tracking:

    • Healthcare providers and organizations can use the dataset to track performance metrics over time. By comparing data across different periods, organizations can identify areas for improvement, monitor the effectiveness of initiatives, and ensure compliance with regulatory standards. The dataset provides the detailed, reliable data needed to track and analyze key performance indicators.

    Compliance and Regulatory Reporting:

    • The dataset is also an essential tool for compliance officers and those involved in regulatory reporting. With detailed data on provider performance, patient outcomes, and healthcare utilization, the dataset helps organizations meet regulatory requirements, prepare for audits, and ensure adherence to best practices. The accuracy and comprehensiveness of the data make it a trusted resource for regulatory compliance.

    Data Quality and Reliability:

    The All CMS Data Feeds dataset is designed with a strong emphasis on data quality and reliability. Each row of data is meticulously cleaned and aligned, ensuring that it is both accurate and consistent. This attention to detail makes the dataset a trusted resource for high-stakes applications, where data quality is critical.

    Integration and Usability:

    Ease of Integration:

    • The dataset is provided in a CSV format, which is widely compatible with most data analysis tools and platforms. This ensures that users can easily integrate the data into their existing wo...

  4. Geodemographic Segmentation Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Oct 1, 2025
    Cite
    Dataintelo (2025). Geodemographic Segmentation Market Research Report 2033 [Dataset]. https://dataintelo.com/report/geodemographic-segmentation-market
    Explore at:
    Available download formats: pdf, csv, pptx
    Dataset updated
    Oct 1, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Geodemographic Segmentation Market Outlook




    As per our latest research, the global geodemographic segmentation market size in 2024 stands at USD 3.2 billion, demonstrating robust momentum driven by the rising demand for advanced customer profiling and targeted marketing strategies. The market is projected to expand at a CAGR of 11.7% from 2025 to 2033, reaching an estimated value of USD 8.9 billion by the end of the forecast period. This growth is primarily fueled by the increasing adoption of data-driven decision-making across industries and the integration of artificial intelligence with geodemographic analytics.
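
    The market-size figures in the paragraph above follow the usual compound-growth relationship, end = start × (1 + rate)^years. A short sketch of that arithmetic is given here as a reader's check only. It assumes growth compounds over the nine years from 2024 to 2033; the report does not state which base year or rounding it uses, and the helper names are hypothetical.

```python
def project(value, rate, years):
    """Compound `value` forward by `rate` per year for `years` years."""
    return value * (1 + rate) ** years


def implied_cagr(start, end, years):
    """Annual growth rate that turns `start` into `end` over `years` years."""
    return (end / start) ** (1 / years) - 1


# Figures quoted in the report summary.
start_2024 = 3.2     # USD billion, 2024 market size
end_2033 = 8.9       # USD billion, quoted 2033 estimate
quoted_cagr = 0.117  # 11.7% per year

# Assumption: nine compounding periods, 2024 -> 2033.
years = 2033 - 2024

projected = project(start_2024, quoted_cagr, years)
implied = implied_cagr(start_2024, end_2033, years)
print(f"Projected 2033 value at 11.7%: USD {projected:.2f} billion")
print(f"CAGR implied by 3.2 -> 8.9: {implied:.1%}")
```

    Under this nine-year convention the projection comes out near USD 8.7 billion and the implied rate near 12%, so the quoted figures agree only approximately; a different base year or rounding in the report could account for the gap.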




    The primary growth factor for the geodemographic segmentation market is the unparalleled need for precise consumer insights in a rapidly digitizing world. As businesses strive to understand and anticipate customer behavior, geodemographic segmentation enables organizations to dissect vast datasets, combining geographic, demographic, and socioeconomic attributes. This approach not only enhances marketing efficiency but also allows for hyper-localized targeting, which has become essential in today’s competitive landscape. The proliferation of digital channels and mobile devices has further augmented the availability of granular data, empowering organizations to craft personalized experiences that resonate with specific audience clusters. Moreover, the integration of advanced analytics tools and machine learning algorithms has significantly improved the accuracy and predictive power of geodemographic models, making them indispensable for modern enterprises.




    Another significant driver is the transformative impact of geodemographic segmentation in sectors such as retail, real estate, and financial services. Retailers, for instance, leverage these insights to optimize store locations, tailor product offerings, and refine promotional strategies, resulting in enhanced customer engagement and increased sales conversion rates. In real estate, geodemographic analysis aids in identifying emerging neighborhoods, understanding population trends, and assessing investment risks. The banking and financial sector utilizes these tools to refine credit risk models, detect fraud, and design customized offerings for diverse demographic segments. Furthermore, the healthcare industry is increasingly adopting geodemographic segmentation to improve outreach for preventive care programs and allocate resources more efficiently, particularly in underserved regions. This cross-industry adoption underscores the versatility and strategic value of geodemographic segmentation solutions.




    Additionally, regulatory shifts and the growing emphasis on privacy and data security are shaping the evolution of the geodemographic segmentation market. With the implementation of stringent data protection laws such as GDPR in Europe and CCPA in California, organizations are compelled to adopt transparent and compliant data practices. This has led to a surge in demand for secure, privacy-focused geodemographic solutions that ensure robust data governance while delivering actionable insights. Vendors are responding by incorporating advanced encryption, anonymization, and consent management features into their offerings. While these regulatory requirements present challenges, they also create opportunities for innovation and differentiation, as companies that prioritize ethical data use are likely to gain a competitive edge and foster greater trust among consumers.




    From a regional perspective, North America remains the dominant market for geodemographic segmentation, accounting for approximately 38% of global revenue in 2024, followed closely by Europe and the rapidly expanding Asia Pacific region. The presence of leading technology providers, a mature digital ecosystem, and high adoption rates of analytics solutions contribute to North America’s leadership. Europe’s market growth is buoyed by regulatory compliance and the proliferation of smart city initiatives, while Asia Pacific’s market is witnessing accelerated growth due to urbanization, a burgeoning middle class, and increasing investments in digital infrastructure. Latin America and the Middle East & Africa are also experiencing steady progress, driven by the digital transformation of commercial and government sectors. This regional diversification is expected to intensify competition and spur innovation across the global market.



  5. Distribution of samples by age group and gender.

    • plos.figshare.com
    xls
    Updated Mar 19, 2025
    Cite
    Mansi Patel; Uzma Shamim; Umang Umang; Rajesh Pandey; Jitendra Narayan (2025). Distribution of samples by age group and gender. [Dataset]. http://doi.org/10.1371/journal.pntd.0012918.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Mar 19, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Mansi Patel; Uzma Shamim; Umang Umang; Rajesh Pandey; Jitendra Narayan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background
    The COVID-19 pandemic had an unprecedented global impact on health and society, highlighting the need for a detailed understanding of SARS-CoV-2 evolution in response to host and environmental factors. This study investigates the evolution of SARS-CoV-2 via mutation dynamics, focusing on distinct age cohorts, geographical location, and vaccination status within the Indian population, one of the nations most affected by COVID-19.

    Methodology
    A comprehensive dataset, spanning diverse time points during the Alpha, Delta, and Omicron variant waves, captured essential phases of the pandemic’s footprint in India. By leveraging genomic data from the Global Initiative on Sharing Avian Influenza Data (GISAID), we examined the substitution mutation landscape of SARS-CoV-2 in three demographic segments: children (1–17 years), working-age adults (18–64 years), and elderly individuals (65+ years). A balanced dataset of 69,975 samples was used for the study, comprising 23,325 samples from each group. This design ensured high statistical power, as confirmed by power analysis. We employed bioinformatics and statistical analyses to explore genetic diversity patterns and substitution frequencies across the age groups.

    Principal findings
    The working-age group exhibited a notably high frequency of unique substitutions, suggesting that immune pressures within highly interactive populations may accelerate viral adaptation. Geographic analysis shows notable regional variation in substitution rates, potentially driven by population density and local transmission dynamics, while regions with more homogeneous strain circulation show relatively lower substitution rates. The analysis also revealed a significant surge in unique substitutions across all age groups during the vaccination period, with substitution rates remaining elevated even after widespread vaccination, compared to pre-vaccination levels. This trend supports the virus's adaptive response to heightened immune pressures from vaccination, as observed through the increased prevalence of substitutions in important regions of the SARS-CoV-2 genome such as ORF1ab and Spike, potentially contributing to immune escape and transmissibility.

    Conclusion
    Our findings affirm the importance of continuous surveillance of viral evolution, particularly in countries with high transmission rates. This research provides insights for anticipating future viral outbreaks and refining pandemic preparedness strategies, thus enhancing our capacity for proactive global health responses.

  6. Customer Purchasing Patterns with Market Basket

    • kaggle.com
    zip
    Updated Feb 7, 2023
    Cite
    The Devastator (2023). Customer Purchasing Patterns with Market Basket [Dataset]. https://www.kaggle.com/datasets/thedevastator/customer-purchasing-patterns-with-market-basket/data
    Explore at:
    Available download formats: zip (170949 bytes)
    Dataset updated
    Feb 7, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Customer Purchasing Patterns with Market Basket Analysis

    Identifying Key Associations

    By [source]

    About this dataset

    This dataset contains customer purchase patterns from a retail company that were used to identify key associations using the Market Basket Analysis model. In particular, this dataset provides insights into loyalty programs, customer segmentation, product recommendations and even cross-selling opportunities. The records contain customer demographic information such as age, gender and income, as well as details about their purchasing history such as payment methods, product quantity purchased and shipping status. Through a deep analysis of these data points using Market Basket Analysis we can gain insights into how customers interact with the products in order to increase sales and optimize loyalty incentives. The dataset is composed of two files: the prepared_dataset file, containing aggregate customer purchase data, and teleco_market_basket, which contains individual-level customer purchase information. With these datasets we can start tracking important itemsets, or combinations of items purchased together by customers, and use them to provide better service levels while increasing overall satisfaction.

    How to use the dataset

    To use this dataset, start by exploring the available variables by looking at descriptive statistics such as mean, median and standard deviation for each variable. This will allow you to gain a better understanding of how customers are utilizing different products or services. Additionally, you can look for correlations between variables to identify associations between different variables or products purchased in tandem.

    Once you’ve determined which associations are most meaningful in terms of predicting customer behavior, you can use these insights to inform new marketing strategies or other business decisions. For instance, if your customers often buy one product category together with another, that insight could be used to drive sales of both products simultaneously.

    Additionally, using this dataset offers an opportunity to compare and analyze sales figures against time periods or particular seasons (excluding dates), which allows managers to anticipate future trends more confidently without relying on gut intuitions alone!
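
    The idea of finding products bought in tandem, described in this section, is usually quantified with support, confidence and lift for each item pair. The following is a minimal sketch of those three measures, not the dataset author's pipeline: the baskets are made up for illustration, and `pair_metrics` is a hypothetical helper name. Real use would build one basket per order from the item columns of the dataset files.

```python
from collections import Counter
from itertools import combinations

# Made-up baskets for illustration; each set is one customer's purchase.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def pair_metrics(baskets):
    """Return (support, confidence, lift) for every co-purchased item pair.

    Pairs are keyed as (a, b) in alphabetical order, and confidence is
    computed in the direction a -> b.
    """
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    metrics = {}
    for (a, b), count in pair_counts.items():
        support = count / n                  # share of baskets with both items
        confidence = count / item_counts[a]  # share of a-baskets that also hold b
        # Lift above 1 means the pair co-occurs more often than independence predicts.
        lift = support / ((item_counts[a] / n) * (item_counts[b] / n))
        metrics[(a, b)] = (support, confidence, lift)
    return metrics


metrics = pair_metrics(baskets)
for pair, (support, confidence, lift) in sorted(
    metrics.items(), key=lambda kv: -kv[1][2]
):
    print(pair, f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

    Counting only pairs keeps the sketch short; larger itemsets need an algorithm such as Apriori or FP-Growth, which prune the candidate space instead of enumerating every combination.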

    Research Ideas

    • Identifying which customer segments are most likely to purchase complementary products based on which items have been purchased together.
    • Suggesting potential discounts and incentives for customers based on their previous purchases or purchase patterns.
    • Identifying whether customer loyalty programs have an effect on the purchasing habits of customers by analyzing changes in their purchase patterns over time

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: prepared_dataset.csv

    | Column name | Description                        |
    |:------------|:-----------------------------------|
    | 0           | Customer ID (Integer)              |
    | 1           | Product ID (Integer)               |
    | 2           | Price of product purchased (Float) |
    | 3           | Quantity Purchased (Integer)       |
    | 4           | Payment Method (String)            |
    | 5           | Shipping Cost (Float)              |
    | 6           | Shipping Method (String)           |
    | 7           | Order Date (Date)                  |
    | 8           | Delivery Date (Date)               |
    | 9           | Delivery Status (String)           |
    | 10          | Customer Name (String)             |
    | 11          | Customer Address (String)          |
    | 12          | Customer City (String)             |
    | 13          | Customer State (String)            |
    | 14          | Customer Zip Code (Integer)        |
    | 15          | Customer Country (String)          |
    | 16          | Customer Phone Number (String)     |
    | 17          | Customer Email (String)            |
    | 18          | Customer IP Address (String)       |
    | 19          | Item20 (String)                    |
    ...

  7. UK Consumer Data | Sagacity Enhance Core | 95m+ individuals | 100+ full...

    • datarade.ai
    .csv, .xls, .txt
    Updated Mar 20, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sagacity (2021). UK Consumer Data | Sagacity Enhance Core | 95m+ individuals | 100+ full coverage variables | Audience & Segmentation Data | UK Coverage [Dataset]. https://datarade.ai/data-products/enhance-core-consumer-marketing-data-uk-coverage-sagacity
    Explore at:
    Available download formats: .csv, .xls, .txt
    Dataset updated
    Mar 20, 2021
    Dataset authored and provided by
    Sagacity
    Area covered
    United Kingdom
    Description

    Overview This product, with over 100 actual and modelled variables, is designed to help you gain better insight into your customers and prospects. The Enhance dataset provides users with a set of predictive and descriptive attributes which support more informed, targeted and relevant marketing to consumers.

    What is it?
    Enhance Core is an individual-level data set containing self-declared, freely given socio-demographic data on over 90m individuals. The data is obtained from a range of sources, including satisfaction and lifestyle surveys, website registrations, newsletter and service subscriptions, offers and competition websites, and public social media feeds.

    Use cases
    • Using key information, appended from Enhance, to create personalised messaging for direct mail & digital marketing campaigns
    • Using Profiling & Predictive messaging to identify important cohorts within the customer base, and those that can be “Forgotten”
    • Seeing how the current customer base compares to the UK base, so you can identify which potential audiences you are missing and also those that your business excels in
    • Segmenting your customers into distinct groups so that you can offer them the right products through the most appropriate channels

    Additional Insights Enhance Core, Property & Geo (Individual, Property & Postcode level data) can all be used modularly, allowing you to understand the full picture of your customer base, considering not only their individual variance but also where they live & those around them.

  8. E-commerce Customer Behaviour Dataset

    • kaggle.com
    zip
    Updated Sep 27, 2025
    Cite
    Paul Samuel W E (2025). E-commerce Customer Behaviour Dataset [Dataset]. https://www.kaggle.com/datasets/paulsamuelwe/e-commerce-customer-behaviour-dataset
    Explore at:
    Available download formats: zip (10257 bytes)
    Dataset updated
    Sep 27, 2025
    Authors
    Paul Samuel W E
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    E-Commerce Customer Behavior Dataset

    The E-Commerce Customer Behavior Dataset is a synthetic dataset designed to capture the full spectrum of customer interactions with an online retail platform. Created by Gretel AI for educational and research purposes, it provides a comprehensive view of how customers browse, purchase, and review products. The dataset is ideal for data science practice, machine learning modeling, and exploratory analytics.

    Features and Variables

    Customer ID

    • Unique identifier for each customer.
    • Allows tracking customer behavior across multiple features.

    Age

    • Numeric value representing customer age.
    • Useful for demographic analysis and segmentation.

    Gender

    • Categorical: Male, Female, Other.
    • Enables study of gender-specific purchasing patterns.

    Location

    • Geographic location of the customer (city or region).
    • Supports regional analysis and location-based marketing insights.

    Annual Income

    • Customer’s annual income in USD.
    • Key for understanding purchasing power and spending habits.

    Purchase History

    • Structured list of products purchased, including:

      • Date of purchase
      • Product category
      • Price
    • Allows analysis of repeat purchases, product popularity, and category trends.

    Browsing History

    • Records of products viewed by the customer with timestamps.
    • Useful to study engagement patterns, interests, and conversion likelihood.

    Product Reviews

    • Textual reviews and ratings (1–5 stars) provided by customers.
    • Enables qualitative analysis of customer satisfaction and sentiment.

    Time on Site

    • Total duration (in minutes) spent by the customer per session.
    • Indicator of user engagement and browsing intensity.

    Data Summary

    | Feature | Range / Distribution | Notes |
    |:--------------|:-----------------------------------------|:--------------|
    | Age | 24–65 | Mean: 40, Std: 11 |
    | Gender | Female 52%, Male 36%, Other 12% | Categorical |
    | Location | Most common: City D (24%), City E (12%), Other (64%) | Regional trends |
    | Annual Income | $40,000–$100,000 | Mean: $65,800, Std: $16,900 |
    | Time on Site | 32.5–486.3 mins | Mean: 233, Std: 109 |

    Example Entries

    Purchase History

    [
     {"Date": "2022-03-05", "Category": "Clothing", "Price": 34.99},
     {"Date": "2022-02-12", "Category": "Electronics", "Price": 129.99},
     {"Date": "2022-01-20", "Category": "Home & Garden", "Price": 29.99}
    ]
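
    A minimal sketch of how the Purchase History example above could be parsed and summarised per category. It assumes the field is stored as a JSON string with the keys shown in the example; the function name `spend_by_category` is hypothetical and not part of the dataset.

```python
import json
from collections import defaultdict

# The example purchase history from the description, assumed to be stored as a JSON string.
purchase_history_json = """
[
 {"Date": "2022-03-05", "Category": "Clothing", "Price": 34.99},
 {"Date": "2022-02-12", "Category": "Electronics", "Price": 129.99},
 {"Date": "2022-01-20", "Category": "Home & Garden", "Price": 29.99}
]
"""


def spend_by_category(raw_json):
    """Sum prices per product category for one customer's purchase history."""
    totals = defaultdict(float)
    for purchase in json.loads(raw_json):
        totals[purchase["Category"]] += purchase["Price"]
    return dict(totals)


totals = spend_by_category(purchase_history_json)
# Round to cents to avoid floating-point noise in the grand total.
total_spend = round(sum(totals.values()), 2)
print(totals)
print(total_spend)
```

    The same loop could be extended to count repeat purchases per category, which is one of the analyses the description suggests.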
    

    Browsing History

    [
     {"Timestamp": "2022-03-10T14:30:00Z"},
     {"Timestamp": "2022-03-11T09:45:00Z"},
     {"Timestamp": "2022-03-12T16:20:00Z"}
    ]
    

    Product Review

    {
     "Review Text": "Excellent product, highly recommend!",
     "Rating": 5
    }
    

    Methodology

    This dataset was synthetically generated using machine learning techniques to simulate realistic customer behavior:

    1. Pattern Recognition: Identifying trends and correlations observed in real-world e-commerce datasets.

    2. Synthetic Data Generation: Producing data points for all features while preserving realistic relationships.

    3. Controlled Variation: Introducing diversity to reflect a wide range of customer behaviors while maintaining logical consistency.

    Potential Use Cases

    • Customer segmentation and profiling
    • Predictive modeling of purchases and churn
    • Recommender system development
    • Sentiment analysis and natural language processing on reviews
    • Engagement and behavioral analytics

    License

    CC BY 4.0 (Attribution 4.0 International) Free to use for educational and research purposes with attribution.

    Important Notes

    • This dataset is fully synthetic — it contains no personal or sensitive information.
    • Ideal for learners, educators, and researchers looking to practice analytics and machine learning in a realistic e-commerce context.
  9. Marketing Insights for E-Commerce Company

    • kaggle.com
    zip
    Updated Oct 27, 2023
    Cite
    Rishi Kumar (2023). Marketing Insights for E-Commerce Company [Dataset]. https://www.kaggle.com/datasets/rishikumarrajvansh/marketing-insights-for-e-commerce-company
    Explore at:
    Available download formats: zip (628618 bytes)
    Dataset updated
    Oct 27, 2023
    Authors
    Rishi Kumar
    Description

    Inputs related to the analysis, for additional reference:

    1. Why do we need customer segmentation? Every customer is unique and can be targeted in different ways, so customer segmentation plays an important role. Segmentation helps to understand customer profiles and can be helpful in defining cross-sell, upsell, activation and acquisition strategies.

    2. What is RFM segmentation? RFM stands for recency, frequency and monetary based segmentation. Recency is about when a customer last ordered: the number of days since the customer made the last purchase. For a website or an app, this could be interpreted as the last visit day or the last login time. Frequency is the number of purchases in a given period, which could be 3 months, 6 months or 1 year. It can be understood as how often customers used a company's product; the bigger the value, the more engaged the customers are. Alternatively, it can be defined as the average duration between two transactions. Monetary is the total amount of money a customer spent in that given period, so big spenders can be differentiated from other customers, for example as MVP or VIP.

    3. What is LTV and how is it defined? Today almost every retailer promotes its subscription, and this is further used to understand the customer lifetime. Retailers can manage customers better if they know which customers have a high lifetime value. Customer lifetime value (LTV) can also be defined as the monetary value of a customer relationship, based on the present value of the projected future cash flows from that relationship. It is an important concept because it encourages firms to shift their focus from quarterly profits to the long-term health of their customer relationships, and an important metric because it represents an upper limit on spending to acquire new customers. For this reason it is an important element in calculating the payback of advertising spend in marketing mix modelling.

    4. Why do we need to predict customer lifetime value? LTV is an important building block in campaign design and marketing mix management. Although targeting models can help to identify the right customers to target, LTV analysis can help to quantify the expected outcome of targeting in terms of revenues and profits. LTV is also important because other major metrics and decision thresholds can be derived from it. For example, LTV is naturally an upper limit on the spending to acquire a customer, and the sum of the LTVs of all of a brand's customers, known as the customer equity, is a major metric for business valuations. As with many other problems in marketing analytics and algorithmic marketing, LTV modelling can be approached from descriptive, predictive and prescriptive perspectives.

    5. How does Next Purchase Day help retailers? The objective is to analyse when customers will purchase products in the future, so that strategies and marketing campaigns can be built for them accordingly.
       a. Group-1: Customers who will purchase in more than 60 days
       b. Group-2: Customers who will purchase in 30-60 days
       c. Group-3: Customers who will purchase in 0-30 days

    6. What is cohort analysis, and how is it helpful? A cohort is a group of users who share a common characteristic that is identified in this report by an Analytics dimension. For example, all users with the same Acquisition Date belong to the same cohort. The Cohort Analysis report lets you isolate and analyze cohort behaviour. Cohort analysis in e-commerce means monitoring your customers’ behaviour based on common traits they share (the first product they bought, when they became customers, etc.) to find patterns and tailor marketing activities for the group.
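
    The Next Purchase Day grouping described above can be sketched as a small function. This is only an illustration: the function name is hypothetical, and because the description lists both "0-30" and "30-60" days, the treatment of the boundary values (exactly 30 goes to Group-3, exactly 60 to Group-2) is an assumption.

```python
def next_purchase_group(days_until_next_purchase):
    """Map a predicted number of days until the next purchase to a group label.

    Boundary handling is an assumption: 30 falls in Group-3 and 60 in Group-2.
    """
    if days_until_next_purchase < 0:
        raise ValueError("days_until_next_purchase must be non-negative")
    if days_until_next_purchase <= 30:
        return "Group-3"  # expected to purchase within 0-30 days
    if days_until_next_purchase <= 60:
        return "Group-2"  # expected to purchase within 30-60 days
    return "Group-1"      # expected to purchase in more than 60 days


# Hypothetical predictions for three customers.
groups = [next_purchase_group(d) for d in (12, 45, 90)]
print(groups)
```

    How the number of days is predicted in the first place is outside the scope of this sketch; any regression or survival model could feed it.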

    Transaction data has been provided for the period from 1 Jan 2019 to 31 Dec 2019. The following data sets have been provided.

    Online_Sales.csv: This file contains actual orders data (point-of-sale data) at transaction level, with the following variables.

    • CustomerID: Customer unique ID
    • Transaction_ID: Transaction unique ID
    • Transaction_Date: Date of transaction
    • Product_SKU: SKU ID, the unique ID for a product
    • Product_Description: Product description
    • Product_Cateogry: Product category
    • Quantity: Number of items ordered
    • Avg_Price: Price per one quantity
    • Delivery_Charges: Charges for delivery
    • Coupon_Status: Any discount coupon applied

    Customers_Data.csv: This file contains customers’ demographics.

    • CustomerID: Customer unique ID
    • Gender: Gender of customer
    • Location: Location of customer
    • Tenure_Months: Tenure in months

    Discount_Coupon.csv: Discount coupons have been given for different categories in different months.

    • Month: Discount coupon applied in that month
    • Product_Category: Product categor...
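
    As an illustration of the RFM idea using the Online_Sales.csv column names listed above, here is a minimal sketch with made-up rows. The rows, the function name `rfm`, and the choice of monetary value as Quantity × Avg_Price (ignoring delivery charges and coupons) are assumptions, not the dataset author's method. The reference date is the end of the stated data period.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows that reuse the Online_Sales.csv column names.
rows = [
    {"CustomerID": "C1", "Transaction_ID": "T1",
     "Transaction_Date": date(2019, 1, 10), "Quantity": 2, "Avg_Price": 10.0},
    {"CustomerID": "C1", "Transaction_ID": "T2",
     "Transaction_Date": date(2019, 12, 1), "Quantity": 1, "Avg_Price": 50.0},
    {"CustomerID": "C2", "Transaction_ID": "T3",
     "Transaction_Date": date(2019, 6, 15), "Quantity": 3, "Avg_Price": 20.0},
]


def rfm(rows, as_of):
    """Return recency (days), frequency (distinct transactions) and monetary value per customer."""
    last_purchase = {}
    transactions = defaultdict(set)
    monetary = defaultdict(float)
    for row in rows:
        cid = row["CustomerID"]
        purchased_on = row["Transaction_Date"]
        if cid not in last_purchase or purchased_on > last_purchase[cid]:
            last_purchase[cid] = purchased_on
        transactions[cid].add(row["Transaction_ID"])
        # Assumption: line value is quantity times unit price, nothing else.
        monetary[cid] += row["Quantity"] * row["Avg_Price"]
    return {
        cid: {
            "recency_days": (as_of - last_purchase[cid]).days,
            "frequency": len(transactions[cid]),
            "monetary": monetary[cid],
        }
        for cid in last_purchase
    }


scores = rfm(rows, as_of=date(2019, 12, 31))
print(scores)
```

    Counting distinct Transaction_ID values rather than rows keeps a multi-item order from inflating frequency.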

  10. Standardized synonymous and non-synonymous substitution counts across genes...

    • plos.figshare.com
    xls
    Updated Mar 19, 2025
    Cite
    Mansi Patel; Uzma Shamim; Umang Umang; Rajesh Pandey; Jitendra Narayan (2025). Standardized synonymous and non-synonymous substitution counts across genes in different groups. [Dataset]. http://doi.org/10.1371/journal.pntd.0012918.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Mar 19, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Mansi Patel; Uzma Shamim; Umang Umang; Rajesh Pandey; Jitendra Narayan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Standardized synonymous and non-synonymous substitution counts across genes in different groups.

  11. Segmentation and socio-demographic variables.

    • plos.figshare.com
    xls
    Updated Jun 14, 2023
    Cite
    Mauricio Carvache-Franco; Tahani Hassan; Orly Carvache-Franco; Wilmer Carvache-Franco; Olga Martin-Moreno (2023). Segmentation and socio-demographic variables. [Dataset]. http://doi.org/10.1371/journal.pone.0287113.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Mauricio Carvache-Franco; Tahani Hassan; Orly Carvache-Franco; Wilmer Carvache-Franco; Olga Martin-Moreno
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Food festivals have been a growing tourism sector in recent years due to their contributions to a region’s economic, marketing, brand, and social growth. This study analyses the demand for the Bahrain food festival. The stated objectives were: i) To identify the motivational dimensions of the demand for the food festival, (ii) To determine the segments of the demand for the food festival, and (iii) To establish the relationship between the demand segments and socio-demographic aspects. The food festival investigated was the Bahrain Food Festival held in Bahrain, located on the east coast of the Persian Gulf. The sample consisted of 380 valid questionnaires and was taken using social networks from those attending the event. The statistical techniques used were factorial analysis and the K-means grouping method. The results show five motivational dimensions: Local food, Art, Entertainment, Socialization, and Escape and novelty. In addition, two segments were found; the first, Entertainment and novelties, is related to attendees who seek to enjoy the festive atmosphere and discover new restaurants. The second is Multiple motives, formed by attendees with several motivations simultaneously. This segment has the highest income and expenses, making it the most important group for developing plans and strategies. The results will contribute to the academic literature and the organizers of food festivals.

  12. Social Influence on Shopping

    • kaggle.com
    zip
    Updated Dec 5, 2022
    Cite
    The Devastator (2022). Social Influence on Shopping [Dataset]. https://www.kaggle.com/thedevastator/uncovering-millennials-shopping-habits-and-socia
    Explore at:
    Available download formats: zip (15369 bytes)
    Dataset updated
    Dec 5, 2022
    Authors
    The Devastator
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Social Influence on Shopping

    Social Survey Data from 300,000 Millennials and Gen Z Members

    By Adam Halper [source]

    About this dataset

    This dataset offers a comprehensive look into the shopping habits of millennials and Gen Z members, including valuable insights about how their choices are influenced by social media. By exploring the responses given to survey questions related to this topic, we can gain an understanding of how these generations' interests, beliefs and desires shape their decisions when it comes to retail experiences. With 150 million survey responses from our 300,000+ millennial and Gen Z participants, we can uncover powerful insights that could help influencers, businesses and marketers more accurately target this demographic. Our data includes important information such as questions asked during the survey, segment types targeted by those questions and corresponding answers gathered with detailed counts/percentages - making this dataset incredibly useful for anyone wanting an in-depth understanding of what drives the purchasing behavior of today's youth

    How to use the dataset

    The first step in using this dataset is to take a look at each column: Question, Segment Type, Segment Description, Answer, Count & Percentage. The Question column will provide background on what exactly each survey question was asking - allowing you to get an overall view of what kind of topics were being surveyed in relation to millennials' shopping habits & social media influence. You will then be able to follow up with analysis based on the respective Segment Types & Descriptions given (such as income levels), which leads us into analyzing answers from both Count & Percentage columns combined - providing absolute numbers vs relative ones for further analysis (such as percentages).
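
    A minimal sketch of how the Count and Percentage columns relate, using made-up rows with the column names listed above. It assumes that a percentage is the share of an answer within one question and segment; whether the real file defines Percentage this way is not confirmed by the description. The function name `answer_shares` is hypothetical.

```python
from collections import defaultdict

# Hypothetical rows using the column names from the dataset description.
rows = [
    {"Question": "Q1", "Segment Type": "Gender", "Segment Description": "Female",
     "Answer": "Yes", "Count": 30},
    {"Question": "Q1", "Segment Type": "Gender", "Segment Description": "Female",
     "Answer": "No", "Count": 70},
    {"Question": "Q1", "Segment Type": "Gender", "Segment Description": "Male",
     "Answer": "Yes", "Count": 20},
    {"Question": "Q1", "Segment Type": "Gender", "Segment Description": "Male",
     "Answer": "No", "Count": 20},
]


def answer_shares(rows):
    """Compute each answer's share of the total count within its question and segment."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["Question"], row["Segment Description"])] += row["Count"]
    return {
        (row["Question"], row["Segment Description"], row["Answer"]):
            row["Count"] / totals[(row["Question"], row["Segment Description"])]
        for row in rows
    }


shares = answer_shares(rows)
print(shares)
```

    Comparing shares rather than raw counts is what makes segments of different sizes comparable.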

    Afterwards you'll need a data analysis program such as SPSS or RStudio, depending on your technical ability, though most basic spreadsheet programs should suffice; MATLAB-based tools are excessively complex for something this simple. After selecting your preferred program, loading the file with all 150 million survey responses may take some time depending on your computer's processing capabilities, but once it is loaded you are ready to start. You can then pull out the key insights you need using the tools these platforms provide, such as linear regression or ANOVA testing; choose whichever technique best fits your specific question.

    As a final precaution, keep note of any adjustments needed for overfitting or multicollinearity, which could otherwise skew the end results and force you to start the whole process anew. Good luck exploring millennial behavior in the digital world!

    Research Ideas

    • Identifying which type of segment is most responsive to engaging shopping experiences, such as influencer marketing, social media discounts and campaigns, etc.
    • Analyzing the answers given to survey questions in order to understand millennial and Gen Z's opinion about social influence on their shopping habits - what do they view positively or negatively?
    • Using the survey responses to uncover any interesting trends or correlations between different segments - is there a particular demographic that values or uses certain types of social influence on their shopping habits more than others?

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

    You are free to:

    • Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:

    • Give appropriate credit - provide a link to the license, and indicate if changes were made.
    • ShareAlike - distribute your contributions under the same license as the original.

    Columns

    File: WhatsgoodlyData-6.csv | Column name | Description ...

  13. Demographic characteristics.

    • plos.figshare.com
    xls
    Updated Dec 14, 2023
    Cite
    Tianyang Huang (2023). Demographic characteristics. [Dataset]. http://doi.org/10.1371/journal.pone.0295581.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Dec 14, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Tianyang Huang
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In the current severe aging of the population, the problem of "digital divide" of the elderly has become increasingly prominent, and the elderly market represents a vast demographic that is increasingly becoming an important customer segment for mobile shopping in the future. However, there is currently insufficient attention given to the research on mobile shopping behavior among older adults. This study tries to answer the question: what are the driving factors of mobile phone shopping behavior among the elderly? The purpose of this study is to analyze the factors that drive the elderly’s mobile phone shopping behavior, and to establish a mobile phone shopping acceptance model for the elderly to predict the factors of the elderly’s behavioral intention of using smart phones. Based on the second edition of Unified Theory of Acceptance and Use of Technology theory (UTAUT 2), this study proposed a mobile phone shopping acceptance model for the elderly. The study collected valid data from 389 Chinese elderly people through questionnaires and analyzed them using structural equation models. The results showed that utilitarian, anxiety, trust, performance expectancy, effort expectancy, social influence, facilitating conditions and habit directly impact the older adults’ intention to engage in mobile shopping. Additionally, facilitating conditions, habit and the older adults’ intention to engage in mobile shopping act as driving factors for actual use behavior. This study further expands the UTAUT theoretical model, provides a theoretical basis for the research of mobile shopping behavior of the elderly, and enriches the application groups and fields of the UTAUT theoretical model. The results of this study provide inspiration for the development, design and marketing of age-appropriate mobile shopping products, and contribute to the realization and further adoption of age-appropriate mobile shopping, and also contribute to promoting the active aging of the elderly.

  14. Pairwise comparisons of age groups using chi-square statistics.

    • plos.figshare.com
    xls
    Updated Mar 19, 2025
    Cite
    Mansi Patel; Uzma Shamim; Umang Umang; Rajesh Pandey; Jitendra Narayan (2025). Pairwise comparisons of age groups using chi-square statistics. [Dataset]. http://doi.org/10.1371/journal.pntd.0012918.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Mar 19, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Mansi Patel; Uzma Shamim; Umang Umang; Rajesh Pandey; Jitendra Narayan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Pairwise comparisons of age groups using chi-square statistics.

  15. Kolmogorov-Smirnov test results for temporal distribution.

    • plos.figshare.com
    xls
    Updated Mar 19, 2025
    Cite
    Mansi Patel; Uzma Shamim; Umang Umang; Rajesh Pandey; Jitendra Narayan (2025). Kolmogorov-Smirnov test results for temporal distribution. [Dataset]. http://doi.org/10.1371/journal.pntd.0012918.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Mar 19, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Mansi Patel; Uzma Shamim; Umang Umang; Rajesh Pandey; Jitendra Narayan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Kolmogorov-Smirnov test results for temporal distribution.

  16. Demographic and pathological characteristics of the seven participants.

    • plos.figshare.com
    xls
    Updated Jun 9, 2023
    Cite
    Daisuke Nishiyama; Hiroshi Iwasaki; Takaya Taniguchi; Daisuke Fukui; Manabu Yamanaka; Teiji Harada; Hiroshi Yamada (2023). Demographic and pathological characteristics of the seven participants. [Dataset]. http://doi.org/10.1371/journal.pone.0257371.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Daisuke Nishiyama; Hiroshi Iwasaki; Takaya Taniguchi; Daisuke Fukui; Manabu Yamanaka; Teiji Harada; Hiroshi Yamada
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Demographic and pathological characteristics of the seven participants.

  17. Demographic, radiological, and cancer staging sample statistics of the...

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 8, 2023
    Cite
    Moritz Gross; Michael Spektor; Ariel Jaffe; Ahmet S. Kucukkaya; Simon Iseke; Stefan P. Haider; Mario Strazzabosco; Julius Chapiro; John A. Onofrey (2023). Demographic, radiological, and cancer staging sample statistics of the training, validation, and testing cohorts from 219 HCC patients included in this study. [Dataset]. http://doi.org/10.1371/journal.pone.0260630.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Moritz Gross; Michael Spektor; Ariel Jaffe; Ahmet S. Kucukkaya; Simon Iseke; Stefan P. Haider; Mario Strazzabosco; Julius Chapiro; John A. Onofrey
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Demographic, radiological, and cancer staging sample statistics of the training, validation, and testing cohorts from 219 HCC patients included in this study.

  18. Estimation of the mutation rate μ per site per generation (in units of 10−8)...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jan 21, 2025
    Cite
    Zhendong Huang; Jerome Kelleher; Yao-ban Chan; David Balding (2025). Estimation of the mutation rate μ per site per generation (in units of 10−8) on human chromosome 20 and 21 for populations MSL (Mende in Sierra Leone), LWK (Luhya in Webuye, Kenya), BEB (Bengali from Bangladesh), ITU (Indian Telugu from the UK), FIN (Finnish in Finland), GBR (British in England and Scotland), JPT (Japanese in Tokyo, Japan), and CHB (Han Chinese in Beijing, China). [Dataset]. http://doi.org/10.1371/journal.pgen.1011537.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Jan 21, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Zhendong Huang; Jerome Kelleher; Yao-ban Chan; David Balding
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United Kingdom, Finland, Great Britain, India, Tokyo, Sierra Leone, Bangladesh, England, Kenya, Japan
    Description

    The TSABC analysis assumes the 1KGP demographic model in each population.

  19. Smartphone Sensor Data for Mental Health Research

    • kaggle.com
    zip
    Updated Jan 21, 2023
    Cite
    The Devastator (2023). Smartphone Sensor Data for Mental Health Research [Dataset]. https://www.kaggle.com/datasets/thedevastator/smartphone-sensor-data-for-mental-health-researc/code
    Explore at:
    Available download formats: zip (757326 bytes)
    Dataset updated
    Jan 21, 2023
    Authors
    The Devastator
    License

    CC0 1.0 Universal (CC0 1.0) https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Smartphone Sensor Data for Mental Health Research

    User Engagement, Experience, and Ethics

    By [source]

    About this dataset

    In addition to smartphone sensor data, survey responses were also collected that provide an insight into participants' views on passive data collection for research purposes. This set of information opens up new possibilities for understanding and treating mental health by leveraging technological advances, while also providing valuable insights into the ethical considerations involved. The dataset thus offers researchers a tool for advancing the understanding of mental health and its associated conditions, and for further exploration in different contexts.

    How to use the dataset

    This dataset contains valuable and comprehensive sensor data collected from smartphones as part of a feasibility study aimed at understanding mental health through the use of smartphone data. This dataset is ideal for researchers and data scientists who are interested in exploring the potential of smartphone sensors for aiding understanding of mental health.

    In this guide, we will be discussing how to best utilize this dataset to explore the different aspects associated with a user’s experience with collecting mental health-related data via their device. All these attributes have been organized into columns within the dataset.

    The first set of columns, ‘os’, ‘model’ and ‘phone_age’, provides information about the participants' devices: operating system, type/make and age respectively. This can be used to group users who share similar technologies or devices, which can help us better understand how device differences may affect user engagement or experience with collecting this kind of data.

    The second set consists of demographics-related details such as participant 'age', 'gender' and 'phone_use' (or frequency). These columns give us insight into who is using what types/makes of devices in order to collect their mental health related data; it may uncover any trends associated with certain demographic segments receiving more benefit from certain types/makes compared to others etc.

    The third set relates more closely to participant engagement. It includes the 'time', 'bluetooth_use' and 'running_problem' columns, which help determine whether participants experienced any issues, for example with Bluetooth, while collecting their data, and whether they felt comfortable doing so. It also includes 'data_use', which indicates how much data each participant used on average (in MB). In addition, there are survey-based opinions on acceptability ('settings'), describing whether participants felt that automated collection was acceptable, alongside battery status ('battery').

    All in all, by combining approaches, examining different attributes separately as well as consulting other sources such as the survey results, deeper insights into user experience can be drawn from this dataset.
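
    The device-based grouping described in this guide can be sketched as follows. The rows are made up, and the assumption that 'data_use' is stored as a number of megabytes is not confirmed by the description; the function name `mean_data_use_by_os` is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical participant rows using the 'os' and 'data_use' column names.
participants = [
    {"os": "Android", "data_use": 10.0},
    {"os": "Android", "data_use": 30.0},
    {"os": "iOS", "data_use": 5.0},
]


def mean_data_use_by_os(rows):
    """Average data usage (assumed to be in MB) for each operating system."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["os"]].append(row["data_use"])
    return {os_name: mean(values) for os_name, values in grouped.items()}


usage = mean_data_use_by_os(participants)
print(usage)
```

    The same pattern applies to any other grouping column in the file, such as 'model' or 'gender'.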

    Research Ideas

    • Analyzing the data to understand user engagement with the app, in order to develop methods of encouraging consistent use of smartphone sensors for mental health research.
    • Investigating how battery life and device settings affect user experience with the app, as knowing these factors could help optimize usage in future studies.
    • Combining this data with other datasets to build a better understanding of how mental health changes over time and how different activities might affect it – such as looking at changes in communication patterns or phone usage depending on mood levels/symptoms

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: participant_info.csv | Column name | Description | |:--------------|:-----------------------------------------| | os | Operating system of the device. (String) | | model | Model of the ...

  20. Sound and Audio Data in Syria

    • kaggle.com
    zip
    Updated Apr 1, 2025
    Cite
    Techsalerator (2025). Sound and Audio Data in Syria [Dataset]. https://www.kaggle.com/datasets/techsalerator/sound-and-audio-data-in-syria
    Explore at:
    Available download formats: zip (12171329 bytes)
    Dataset updated
    Apr 1, 2025
    Authors
    Techsalerator
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    Syria
    Description

    Techsalerator’s Location Sentiment Data for Syria

    Techsalerator’s Location Sentiment Data for Syria offers a comprehensive collection of information vital for businesses, researchers, and technology developers. This dataset provides profound insights into location-based sentiment, mood, and emotional patterns across different regions of Syria.

    For access to the full dataset, contact us at info@techsalerator.com or visit Techsalerator Contact Us.

    Techsalerator’s Location Sentiment Data for Syria provides a structured analysis of sentiment-related data across urban, rural, and conflict-affected regions. This dataset is essential for social science research, crisis management, political analysis, and local community development.

    Top 5 Key Data Fields

    • Location of Sentiment Capture – Identifies the geographic area where the sentiment data was collected, enabling region-specific sentiment analysis.
    • Sentiment Intensity – Measures the strength of positive or negative sentiment in different locations, important for understanding emotional responses to events.
    • Time of Capture – Records the exact time and date when sentiment was captured, allowing for temporal analysis of emotional trends.
    • Sentiment Category – Categorizes sentiment into various themes, such as political, social, economic, and cultural sentiments.
    • Sentiment Source Identification – Categorizes data based on the source of sentiment, including social media, news outlets, public forums, and interviews.

    Top 5 Sentiment Trends in Syria

    • Impact of Conflict on Emotional Well-being – Increasing negative sentiment related to ongoing conflicts, highlighting the psychological toll on affected populations.
    • Economic Sentiment Fluctuations – Growing concerns over inflation and unemployment are driving negative sentiment across major urban centers.
    • Political Sentiment Analysis – Divided political sentiments in urban and rural regions reflect differing perspectives on governance and security policies.
    • Youth Sentiment – Younger populations exhibit a mix of frustration and hope, with a desire for greater opportunities and stability.
    • Humanitarian Sentiment – Sentiment data reveals increasing empathy and support for displaced populations, shaping international aid efforts.

    Top 5 Applications of Location Sentiment Data in Syria

    • Crisis Management – Sentiment analysis helps humanitarian organizations respond effectively by understanding local emotional states during crises.
    • Political Strategy – Political analysts use sentiment data to gauge public opinion and adjust campaigns accordingly in different regions of Syria.
    • Community Development – Local governments and NGOs utilize sentiment data to address community concerns and tailor development programs to meet emotional and social needs.
    • Media and Journalism – Journalists use sentiment data to report on public reactions and shifts in national and local moods.
    • Market Research – Businesses analyze sentiment to assess customer perceptions, preferences, and reactions to products and services in Syria.

    Accessing Techsalerator’s Location Sentiment Data

    To obtain Techsalerator’s Location Sentiment Data for Syria, contact info@techsalerator.com with your specific requirements. Techsalerator offers customized datasets based on requested fields, with delivery available within 24 hours. Ongoing access options can also be discussed.

    Included Data Fields

    • Location of Sentiment Capture
    • Sentiment Intensity
    • Time of Capture
    • Sentiment Category (Political, Economic, Social, etc.)
    • Sentiment Source Identification
    • Geographic Region (Urban, Rural, Conflict Zones)
    • Demographic Segmentation (Age, Gender, etc.)
    • Emotion Type (Fear, Hope, Frustration, etc.)
    • Political Affiliation Sentiment
    • Cultural Sentiment Trends

    For deep insights into sentiment patterns and emotional landscapes in Syria, Techsalerator’s dataset is an invaluable resource for political analysts, researchers, NGOs, and businesses.

Cite
Talha khalid (2024). Retail Sales Analysis: [Dataset]. https://www.kaggle.com/datasets/talhachoudary/sales-of-company/code

Data from: Retail Sales Analysis:

Comprehensive Data for Customer, Order, Product, and Time-Based Analysis

Dataset updated
Aug 6, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Talha khalid
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

Overview

This collection of datasets is designed to provide a comprehensive overview of a retail business's operations, focusing on calendar information, customer demographics, order details, and product information. These datasets are ideal for performing in-depth sales analysis, customer segmentation, demand forecasting, and inventory management.

Dataset Descriptions

Calendar.csv

Description: This file contains detailed calendar information to assist with time-based analysis. It includes important dates, such as holidays, weekends, and fiscal periods, which can be critical for analyzing sales trends, seasonality, and promotional impacts.

Key Columns:
• Date: The specific date.
• Day of Week: The day of the week (e.g., Monday, Tuesday).
• Month: The month corresponding to the date.
• Quarter: The fiscal quarter (Q1, Q2, etc.).
• Year: The year of the date.
• Holiday Flag: Indicates if the date is a public holiday.

Customer.csv

Description: This dataset contains demographic information about the customers. It’s useful for customer segmentation, lifetime value analysis, and targeted marketing campaigns.

Key Columns:
• Customer ID: A unique identifier for each customer.
• Name: The full name of the customer.
• Age: The age of the customer.
• Gender: The gender of the customer.
• Location: The geographic location (city/state) of the customer.
• Loyalty Tier: The loyalty program tier of the customer (e.g., Bronze, Silver, Gold).

Order.csv

Description: This dataset tracks individual customer orders, including transaction details. It is essential for sales analysis, order fulfillment tracking, and revenue analysis.

Key Columns:
• Order ID: A unique identifier for each order.
• Customer ID: The ID of the customer who placed the order (linking to Customer.csv).
• Order Date: The date the order was placed.
• Product ID: The ID of the product ordered (linking to Product.csv).
• Quantity: The quantity of the product ordered.
• Total Price: The total price of the order.

Product.csv

Description: This dataset provides detailed information on the products available in the retail store. It includes categories, pricing, and supplier information, making it useful for inventory management and product performance analysis.

Key Columns:
• Product ID: A unique identifier for each product.
• Product Name: The name of the product.
• Category: The category under which the product falls (e.g., Electronics, Clothing).
• Supplier ID: The ID of the supplier providing the product.
• Unit Price: The price per unit of the product.
• Stock Quantity: The number of units available in stock.

Usability

These datasets can be utilized for various business analytics tasks, including:

• Sales and Revenue Analysis: By linking the Order.csv and Product.csv, one can analyze sales performance by product category, identify best-sellers, and determine revenue drivers.
• Customer Segmentation: Using Customer.csv, segment customers based on demographics or purchase behavior to tailor marketing efforts.
• Demand Forecasting: Integrate Calendar.csv to model seasonality effects and predict future sales trends.

Provenance

These datasets are typically generated from an ERP system or CRM and are structured to support a variety of business intelligence applications. Users may need to perform data cleaning or transformation depending on the specific use case.
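
The Order.csv to Product.csv link described under Sales and Revenue Analysis can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the dataset author's method. It assumes the CSV headers are spelled exactly like the Key Columns listed above, which may not match the real files. The inline sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows for illustration only; header spellings are an
# assumption based on the "Key Columns" in the description.
ORDER_CSV = """Order ID,Customer ID,Order Date,Product ID,Quantity,Total Price
1,C1,2024-01-05,P1,2,40.00
2,C2,2024-01-06,P2,1,15.50
3,C1,2024-01-07,P1,1,20.00
4,C3,2024-01-07,P3,3,30.00
"""

PRODUCT_CSV = """Product ID,Product Name,Category,Supplier ID,Unit Price,Stock Quantity
P1,Headphones,Electronics,S1,20.00,100
P2,T-Shirt,Clothing,S2,15.50,50
P3,Socks,Clothing,S2,10.00,200
"""


def revenue_by_category(order_file, product_file):
    """Join orders to products on Product ID and sum Total Price per Category."""
    # Build a lookup table: Product ID -> Category.
    category_of = {
        row["Product ID"]: row["Category"]
        for row in csv.DictReader(product_file)
    }
    totals = defaultdict(float)
    for row in csv.DictReader(order_file):
        # Orders whose product is missing from the product file are
        # grouped under "Unknown" rather than silently dropped.
        category = category_of.get(row["Product ID"], "Unknown")
        totals[category] += float(row["Total Price"])
    return dict(totals)


result = revenue_by_category(io.StringIO(ORDER_CSV), io.StringIO(PRODUCT_CSV))

# Print categories from highest to lowest revenue.
for category, total in sorted(result.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{category}: {total:.2f}")
```

To run this against the real files, pass open file handles (for example `open("Order.csv", newline="")`) instead of the `io.StringIO` wrappers, after checking the actual header names.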

Licensing and Coverage

The datasets are provided without a specific license. Users are encouraged to verify and attribute the source as needed. Coverage typically includes the entire operational history of the retail business, though users should check for any specific time range covered.
