99 datasets found
  1. Bank Customer Segmentation (1M+ Transactions)

    • kaggle.com
    zip
    Updated Oct 26, 2021
    Cite
    Shivam Bansal (2021). Bank Customer Segmentation (1M+ Transactions) [Dataset]. https://www.kaggle.com/shivamb/bank-customer-segmentation
    Available download formats: zip (25360448 bytes)
    Dataset updated
    Oct 26, 2021
    Authors
    Shivam Bansal
    Description

    Bank Customer Segmentation

    Most banks have a large customer base - with different characteristics in terms of age, income, values, lifestyle, and more. Customer segmentation is the process of dividing a customer dataset into specific groups based on shared traits.

    According to a report from Ernst & Young, “A more granular understanding of consumers is no longer a nice-to-have item, but a strategic and competitive imperative for banking providers. Customer understanding should be a living, breathing part of everyday business, with insights underpinning the full range of banking operations.”

    About this Dataset

    This dataset consists of 1 million+ transactions by over 800K customers of a bank in India. The data contains information such as customer age (DOB), location, gender, account balance at the time of the transaction, transaction details, and transaction amount.

    Interesting Analysis Ideas

    The dataset can be used for different analyses, for example:

    1. Perform Clustering / Segmentation on the dataset and identify popular customer groups along with their definitions/rules
    2. Perform Location-wise analysis to identify regional trends in India
    3. Perform transaction-related analysis to identify interesting trends that a bank can use to improve or optimize its user experience
    4. Customer Recency, Frequency, Monetary analysis
    5. Network analysis or Graph analysis of customer data.
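
    Idea 4 above (Recency, Frequency, Monetary analysis) can be sketched roughly as follows. The field names (`customer_id`, `date`, `amount`), the sample rows, and the snapshot date are all illustrative assumptions and are not taken from the dataset's actual schema.

```python
from collections import defaultdict
from datetime import date

# Illustrative transactions only; field names are assumptions, not the dataset's schema.
transactions = [
    {"customer_id": "C1", "date": date(2016, 8, 1), "amount": 250.0},
    {"customer_id": "C1", "date": date(2016, 9, 10), "amount": 100.0},
    {"customer_id": "C2", "date": date(2016, 8, 20), "amount": 900.0},
]

def rfm_table(rows, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for row in rows:
        cid = row["customer_id"]
        # Track the most recent transaction date per customer.
        if cid not in last_seen or row["date"] > last_seen[cid]:
            last_seen[cid] = row["date"]
        frequency[cid] += 1
        monetary[cid] += row["amount"]
    return {
        cid: ((snapshot - last_seen[cid]).days, frequency[cid], monetary[cid])
        for cid in last_seen
    }

rfm = rfm_table(transactions, snapshot=date(2016, 9, 30))
for cid, (recency, freq, money) in sorted(rfm.items()):
    print(cid, recency, freq, money)
```

    The resulting triples could then be binned into scores or passed to a clustering step; how to bin them is a separate modeling choice.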
  2. E-Commerce Customer Segmentation Dataset

    • kaggle.com
    zip
    Updated Aug 2, 2025
    Cite
    Zeynep Üstün (2025). E-Commerce Customer Segmentation Dataset [Dataset]. https://www.kaggle.com/datasets/zeynepustun/e-commerce-customer-segmentation-dataset
    Available download formats: zip (517 bytes)
    Dataset updated
    Aug 2, 2025
    Authors
    Zeynep Üstün
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    E-Commerce Customer Segmentation Dataset

    This synthetic dataset contains information about 20 customers of an e-commerce platform, designed for customer segmentation and classification tasks.

    Dataset Overview

    Each record represents a unique customer with demographic and behavioral features that help classify them into different customer segments.

    Features:

    • customer_id: Unique identifier for each customer
    • age: Age of the customer (years)
    • annual_income_k$: Annual income in thousands of dollars
    • spending_score: A score between 0 and 100 indicating customer spending habits (higher means more spending)
    • membership_years: Length of membership in years
    • segment: Customer segment label; possible values are Low (low-value customers), Medium (medium-value customers), and High (high-value customers)

    Potential Use Cases

    • Customer segmentation
    • Targeted marketing campaigns
    • Customer lifetime value prediction
    • Behavioral analytics and profiling
    • Clustering and classification algorithm testing

    Dataset Size

    • 20 samples
    • 6 columns

    License

    This dataset is provided under the Apache 2.0 License.
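
    The six documented columns map naturally onto a small typed record. The sketch below parses rows with those column names and counts customers per segment. The Python types are assumptions, and the two sample rows are made up for illustration; they are not values from the dataset.

```python
import csv
import io
from collections import Counter
from dataclasses import dataclass

@dataclass
class Customer:
    customer_id: str
    age: int
    annual_income_k: float   # the "annual_income_k$" column
    spending_score: int      # 0-100, higher means more spending
    membership_years: int
    segment: str             # "Low", "Medium", or "High"

# Made-up sample rows (NOT from the dataset), using the documented column names.
SAMPLE_CSV = """customer_id,age,annual_income_k$,spending_score,membership_years,segment
1,25,40,30,1,Low
2,41,95,85,6,High
"""

def load_customers(text):
    customers = []
    for row in csv.DictReader(io.StringIO(text)):
        # Reject labels outside the three documented segment values.
        if row["segment"] not in {"Low", "Medium", "High"}:
            raise ValueError(f"unexpected segment: {row['segment']}")
        customers.append(Customer(
            customer_id=row["customer_id"],
            age=int(row["age"]),
            annual_income_k=float(row["annual_income_k$"]),
            spending_score=int(row["spending_score"]),
            membership_years=int(row["membership_years"]),
            segment=row["segment"],
        ))
    return customers

customers = load_customers(SAMPLE_CSV)
print(Counter(c.segment for c in customers))
```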

  3. Customer Attributes Dataset - Demographics, Devices & Locations APAC Data...

    • datarade.ai
    Cite
    AI Keyboard, Customer Attributes Dataset - Demographics, Devices & Locations APAC Data (1st Party Data w/90M+ records) [Dataset]. https://datarade.ai/data-products/bobble-ai-demographic-data-apac-age-gender-1st-party-data-w-52m-records-bobble-ai
    Available download formats: .json, .csv, .xls, .parquet
    Dataset authored and provided by
    AI Keyboard
    Area covered
    Germany, India, Indonesia, Pakistan, United Arab Emirates, Saudi Arabia, Netherlands, United States of America, Nepal, Philippines
    Description

    The User Profile Data is a structured, anonymized dataset designed to help organizations understand who their users are, what devices they use, and where they are located. Each record provides privacy-compliant linkages between user IDs, demographic profiles, device intelligence, and geolocation data, offering deep context for analytics, segmentation, and personalization.

    Built for privacy-safe analytics, the dataset uses hashed identifiers like phone number and email and standardized formats, making it easy to integrate into big-data platforms, AI pipelines, and machine learning models for advanced analytics.
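
    As a rough illustration of what a hashed identifier might look like, the sketch below normalizes an email address and hashes it with SHA-256. The provider does not state which hash function or normalization it uses, so both are assumptions here, and the address is a placeholder.

```python
import hashlib

def hash_identifier(value: str) -> str:
    """Normalize an identifier and return its SHA-256 hex digest (assumed scheme)."""
    normalized = value.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Placeholder address; equal identifiers map to the same digest, so records
# can be joined without exposing the raw value.
a = hash_identifier("User@Example.com ")
b = hash_identifier("user@example.com")
print(a == b, len(a))
```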

    Demographic insights include gender, age, and age group, essential for audience profiling, marketing optimization, and consumer intelligence. All gender data is user-declared and AI-verified through image-based avatar validation, ensuring data accuracy and authenticity.

    The dataset’s Device Intelligence Layer includes rich technical attributes such as device brand, model, OS version, user agent, RAM, language, and timezone, enabling technical segmentation, performance analytics, and targeted ad delivery across diverse device ecosystems.

    On the location and POI front, the dataset combines GPS-based and IP-based coordinates—including country, region, city, latitude, longitude —to provide high-precision geospatial insights. This enables mobility pattern analysis, market expansion planning, and POI clustering for advanced location intelligence.

    Each user record contains onboarding and lifecycle fields like unique IDs, and profile update timestamps, allowing accurate tracking of user acquisition trends, data freshness, and activity duration.

    🔍 Key Features

    • 1st-party, consent-based demographic & device data
    • AI-verified gender insights via avatar recognition
    • OS-level app data with 120+ daily sessions per user
    • Global coverage across APAC and emerging markets
    • GPS + IP-based geolocation & POI intelligence
    • Privacy-compliant, hashed identifiers for safe integration

    🚀 Use Cases

    • Audience segmentation & lookalike modeling
    • Ad-tech and mar-tech optimization
    • Geospatial & POI analytics
    • Fraud detection & risk scoring
    • Personalization & recommendation engines
    • App performance & device compatibility insights

    🏢 Industries Served

    Ad-Tech • Mar-Tech • FinTech • Telecom • Retail Analytics • Consumer Intelligence • AI & ML Platforms

  4. GIS Data | USA & Canada | Over 40k Demographics Variables To Inform Business...

    • datarade.ai
    .json, .csv
    Updated Aug 13, 2024
    + more versions
    Cite
    GapMaps (2024). GIS Data | USA & Canada | Over 40k Demographics Variables To Inform Business Decisions | Consumer Spending Data| Demographic Data [Dataset]. https://datarade.ai/data-products/gapmaps-premium-demographic-data-by-ags-usa-canada-gis-gapmaps
    Available download formats: .json, .csv
    Dataset updated
    Aug 13, 2024
    Dataset authored and provided by
    GapMaps
    Area covered
    Canada, United States
    Description

    GapMaps GIS data for USA and Canada sourced from Applied Geographic Solutions (AGS) includes an extensive range of the highest quality demographic and lifestyle segmentation products. All databases are derived from superior source data and the most sophisticated, refined, and proven methodologies.

    GIS Data attributes include:

    1. Latest Estimates and Projections: The estimates and projections database includes a wide range of core demographic data variables for the current year and 5-year projections, covering five broad topic areas: population, households, income, labor force, and dwellings.

    2. Crime Risk: Crime Risk is the result of an extensive analysis of a rolling seven years of FBI crime statistics. Based on detailed modeling of the relationships between crime and demographics, Crime Risk provides an accurate view of the relative risk of specific crime types (personal, property and total) at the block and block group level.

    3. Panorama Segmentation: AGS has created a segmentation system for the United States called Panorama. Panorama has been coded with the MRI Survey data to bring you Consumer Behavior profiles associated with this segmentation system.

    4. Business Counts: Business Counts is a geographic summary database of business establishments, employment, occupation and retail sales.

    5. Non-Resident Population: The AGS non-resident population estimates utilize a wide range of data sources to model the factors which drive tourists to particular locations, and to match that demand with the supply of available accommodations.

    6. Consumer Expenditures: AGS provides current year and 5-year projected expenditures for over 390 individual categories that collectively cover almost 95% of household spending.

    7. Retail Potential: This tabulation utilizes the Census of Retail Trade tables which cross-tabulate store type by merchandise line.

    8. Environmental Risk: The environmental suite of data consists of several separate database components including:
      • Weather Risks
      • Seismological Risks
      • Wildfire Risk
      • Climate
      • Air Quality
      • Elevation and terrain

    Primary Use Cases for GapMaps GIS Data:

    1. Retail (e.g. Fast Food/QSR, Cafe, Fitness, Supermarket/Grocery)

      • Customer Profiling: get a detailed understanding of the demographic & segmentation profile of your customers, where they work and their spending potential
      • Analyse your trade areas at a granular census block level using all the key metrics
      • Site Selection: Identify optimal locations for future expansion and benchmark performance across existing locations.
      • Target Marketing: Develop effective marketing strategies to acquire more customers.
      • Integrate AGS demographic data with your existing GIS or BI platform to generate powerful visualizations.
    2. Finance / Insurance (e.g. Hedge Funds, Investment Advisors, Investment Research, REITs, Private Equity, VC)

      • Network Planning
      • Customer (Risk) Profiling for insurance/loan approvals
      • Target Marketing
      • Competitive Analysis
      • Market Optimization
    3. Commercial Real-Estate (Brokers, Developers, Investors, Single & Multi-tenant O/O)

      • Tenant Recruitment
      • Target Marketing
      • Market Potential / Gap Analysis
    4. Marketing / Advertising (Billboards/OOH, Marketing Agencies, Indoor Screens)

      • Customer Profiling
      • Target Marketing
      • Market Share Analysis

  5. E-Commerce Customer Behavior & Sales Analysis -TR

    • kaggle.com
    zip
    Updated Oct 29, 2025
    Cite
    UmutUygurr (2025). E-Commerce Customer Behavior & Sales Analysis -TR [Dataset]. https://www.kaggle.com/datasets/umuttuygurr/e-commerce-customer-behavior-and-sales-analysis-tr
    Available download formats: zip (138245 bytes)
    Dataset updated
    Oct 29, 2025
    Authors
    UmutUygurr
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    🛒 E-Commerce Customer Behavior and Sales Dataset

    📊 Dataset Overview

    This comprehensive dataset contains 5,000 e-commerce transactions from a Turkish online retail platform, spanning from January 2023 to March 2024. The dataset provides detailed insights into customer demographics, purchasing behavior, product preferences, and engagement metrics.

    🎯 Use Cases

    This dataset is perfect for:

    • Customer Segmentation Analysis: Identify distinct customer groups based on behavior
    • Sales Forecasting: Predict future sales trends and patterns
    • Recommendation Systems: Build product recommendation engines
    • Customer Lifetime Value (CLV) Prediction: Estimate customer value
    • Churn Analysis: Identify customers at risk of leaving
    • Marketing Campaign Optimization: Target customers effectively
    • Price Optimization: Analyze price sensitivity across categories
    • Delivery Performance Analysis: Optimize logistics and shipping

    📁 Dataset Structure

    The dataset contains 18 columns with the following features:

    Order Information
    • Order_ID: Unique identifier for each order (ORD_XXXXXX format)
    • Date: Transaction date (2023-01-01 to 2024-03-26)

    Customer Demographics
    • Customer_ID: Unique customer identifier (CUST_XXXXX format)
    • Age: Customer age (18-75 years)
    • Gender: Customer gender (Male, Female, Other)
    • City: Customer city (10 major Turkish cities)

    Product Information
    • Product_Category: 8 categories (Electronics, Fashion, Home & Garden, Sports, Books, Beauty, Toys, Food)
    • Unit_Price: Price per unit (in TRY/Turkish Lira)
    • Quantity: Number of units purchased (1-5)

    Transaction Details
    • Discount_Amount: Discount applied (if any)
    • Total_Amount: Final transaction amount after discount
    • Payment_Method: Payment method used (5 types)

    Customer Behavior Metrics
    • Device_Type: Device used for purchase (Mobile, Desktop, Tablet)
    • Session_Duration_Minutes: Time spent on website (1-120 minutes)
    • Pages_Viewed: Number of pages viewed during session (1-50)
    • Is_Returning_Customer: Whether customer has purchased before (True/False)

    Post-Purchase Metrics
    • Delivery_Time_Days: Delivery duration (1-30 days)
    • Customer_Rating: Customer satisfaction rating (1-5 stars)

    📈 Key Statistics

    • Total Records: 5,000 transactions
    • Date Range: January 2023 - March 2024 (15 months)
    • Average Transaction Value: ~450 TRY
    • Customer Satisfaction: 3.9/5.0 average rating
    • Returning Customer Rate: 60%
    • Mobile Usage: 55% of transactions

    🔍 Data Quality

    ✅ No missing values
    ✅ Consistent formatting across all fields
    ✅ Realistic data distributions
    ✅ Proper data types for all columns
    ✅ Logical relationships between features

    💡 Sample Analysis Ideas

    • Customer Segmentation with K-Means Clustering: Segment customers based on spending, frequency, and recency
    • Sales Trend Analysis: Identify seasonal patterns and peak shopping periods
    • Product Category Performance: Compare revenue, ratings, and return rates across categories
    • Device-Based Behavior Analysis: Understand how device choice affects purchasing patterns
    • Predictive Modeling: Build models to predict customer ratings or purchase amounts
    • City-Level Market Analysis: Compare market performance across different cities

    🛠️ Technical Details

    • File Format: CSV (Comma-Separated Values)
    • Encoding: UTF-8
    • File Size: ~500 KB
    • Delimiter: Comma (,)

    📚 Column Descriptions

    Column Name | Data Type | Description | Example
    Order_ID | String | Unique order identifier | ORD_001337
    Customer_ID | String | Unique customer identifier | CUST_01337
    Date | DateTime | Transaction date | 2023-06-15
    Age | Integer | Customer age | 35
    Gender | String | Customer gender | Female
    City | String | Customer city | Istanbul
    Product_Category | String | Product category | Electronics
    Unit_Price | Float | Price per unit | 1299.99
    Quantity | Integer | Units purchased | 2
    Discount_Amount | Float | Discount applied | 129.99
    Total_Amount | Float | Final amount paid | 2469.99
    Payment_Method | String | Payment method | Credit Card
    Device_Type | String | Device used | Mobile
    Session_Duration_Minutes | Integer | Session time | 15
    Pages_Viewed | Integer | Pages viewed | 8
    Is_Returning_Customer | Boolean | Returning customer | True
    Delivery_Time_Days | Integer | Delivery duration | 3
    Customer_Rating | Integer | Satisfaction rating | 5

    🎓 Learning Outcomes

    By working with this dataset, you can learn:

    • Data cleaning and preprocessing techniques
    • Exploratory Data Analysis (EDA) with Python/R
    • Statistical analysis and hypothesis testing
    • Machine learning model development
    • Data visualization best practices
    • Business intelligence and reporting

    📝 Citation

    If you use this dataset in your research or project, please cite:

    E-Commerce Customer Behavior and Sales Dataset (2024)
    Turkish Online Retail Platform Data (2023-2024)
    Available on Kaggle

    ⚖️ License

    This dataset is released under the CC0: Public Domain license. You are free to use it for any purpose.

    🤝 Contribution

    Found any issues or have suggestions? Feel free to provide feedback!

    📞 Contact

    For questions or collaborations, please reach out through Kaggle.

    Happy Analyzing! 🚀

    Keywords: e-c...
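
    The example values in the column descriptions are consistent with Total_Amount being Unit_Price × Quantity minus Discount_Amount. A minimal sketch of that check follows, assuming this formula holds for every row (the description does not state it explicitly):

```python
from decimal import Decimal

def expected_total(unit_price: str, quantity: int, discount: str) -> Decimal:
    """Unit_Price * Quantity - Discount_Amount, in exact decimal arithmetic."""
    return Decimal(unit_price) * quantity - Decimal(discount)

# Values taken from the Example column of the column descriptions.
total = expected_total("1299.99", 2, "129.99")
print(total)
```

    Decimal is used here instead of float so that currency values compare exactly; applying the same check across the full CSV would be one way to test the "logical relationships between features" claim.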

  6. 🌆 City Lifestyle Segmentation Dataset

    • kaggle.com
    zip
    Updated Nov 15, 2025
    Cite
    UmutUygurr (2025). 🌆 City Lifestyle Segmentation Dataset [Dataset]. https://www.kaggle.com/datasets/umuttuygurr/city-lifestyle-segmentation-dataset
    Available download formats: zip (11274 bytes)
    Dataset updated
    Nov 15, 2025
    Authors
    UmutUygurr
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    🌆 About This Dataset

    This synthetic dataset simulates 300 global cities across 6 major geographic regions, designed specifically for unsupervised machine learning and clustering analysis. It explores how economic status, environmental quality, infrastructure, and digital access shape urban lifestyles worldwide.

    🎯 Perfect For:

    • 📊 K-Means, DBSCAN, Agglomerative Clustering
    • 🔬 PCA & t-SNE Dimensionality Reduction
    • 🗺️ Geospatial Visualization (Plotly, Folium)
    • 📈 Correlation Analysis & Feature Engineering
    • 🎓 Educational Projects (Beginner to Intermediate)

    📦 What's Inside?

    Feature | Description | Range
    10 Features | Economic, environmental & social indicators | Realistically scaled
    300 Cities | Europe, Asia, Americas, Africa, Oceania | Diverse distributions
    Strong Correlations | Income ↔ Rent (+0.8), Density ↔ Pollution (+0.6) | ML-ready
    No Missing Values | Clean, preprocessed data | Ready for analysis
    4-5 Natural Clusters | Metropolitan hubs, eco-towns, developing centers | Pre-validated

    🔥 Key Features

    Realistic Correlations: Income strongly predicts rent (+0.8), internet access (+0.7), and happiness (+0.6)
    Regional Diversity: Each region has distinct economic and environmental characteristics
    Clustering-Ready: Naturally separable into 4-5 lifestyle archetypes
    Beginner-Friendly: No data cleaning required, includes example code
    Documented: Comprehensive README with methodology and use cases

    🚀 Quick Start Example

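    import pandas as pd
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # Load the data and drop the two identifier columns before scaling
    df = pd.read_csv('city_lifestyle_dataset.csv')
    X = df.drop(['city_name', 'country'], axis=1)
    X_scaled = StandardScaler().fit_transform(X)

    # Cluster the standardized features (n_init set explicitly for consistent results across versions)
    kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
    df['cluster'] = kmeans.fit_predict(X_scaled)

    # Analyze: per-cluster means of the numeric columns only,
    # since the text columns cannot be averaged
    print(df.groupby('cluster').mean(numeric_only=True))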
    

    🎓 Learning Outcomes

    After working with this dataset, you will be able to:

    1. Apply K-Means, DBSCAN, and Hierarchical Clustering
    2. Use PCA for dimensionality reduction and visualization
    3. Interpret correlation matrices and feature relationships
    4. Create geographic visualizations with cluster assignments
    5. Profile and name discovered clusters based on characteristics

    📚 Ideal For These Projects

    • 🏆 Kaggle Competitions: Practice clustering techniques
    • 📝 Academic Projects: Urban planning, sociology, environmental science
    • 💼 Portfolio Work: Showcase ML skills to employers
    • 🎓 Learning: Hands-on practice with unsupervised learning
    • 🔬 Research: Urban lifestyle segmentation studies

    🌍 Expected Clusters

    Cluster | Characteristics | Example Cities
    Metropolitan Tech Hubs | High income, density, rent | Silicon Valley, Singapore
    Eco-Friendly Towns | Low density, clean air, high happiness | Nordic cities
    Developing Centers | Mid income, high density, poor air | Emerging markets
    Low-Income Suburban | Low infrastructure, income | Rural areas
    Industrial Mega-Cities | Very high density, pollution | Manufacturing hubs

    🛠️ Technical Details

    • Format: CSV (UTF-8)
    • Size: ~300 rows × 10 columns
    • Missing Values: 0%
    • Data Types: 2 categorical, 8 numerical
    • Target Variable: None (unsupervised)
    • Correlation Strength: Pre-validated (r: 0.4 to 0.8)

    📖 What Makes This Dataset Special?

    Unlike random synthetic data, this dataset was carefully engineered with:

    • ✨ Realistic correlation structures based on urban research
    • 🌍 Regional characteristics matching real-world patterns
    • 🎯 Optimal cluster separability (validated via silhouette scores)
    • 📚 Comprehensive documentation and starter code
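
    Silhouette-based validation of cluster separability can be sketched as follows. This is not the author's validation code: it runs on synthetic blobs generated with scikit-learn rather than on the dataset itself, and the candidate range of k is an arbitrary choice.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in data: four well-separated 2-D blobs (NOT the city dataset).
centers = [(-10, -10), (-10, 10), (10, -10), (10, 10)]
X, _ = make_blobs(n_samples=300, centers=centers, cluster_std=1.0, random_state=42)

scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    # Mean silhouette coefficient: closer to 1 means tighter, better-separated clusters.
    scores[k] = silhouette_score(X, labels)

# Pick the k with the highest mean silhouette score.
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

    On the real data, X would be the standardized feature matrix, and the score for each k would indicate whether the claimed 4-5 clusters are actually the best-separated choice.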

    🏅 Use This Dataset If You Want To:

    ✓ Learn clustering without data cleaning hassles
    ✓ Practice PCA and dimensionality reduction
    ✓ Create beautiful geographic visualizations
    ✓ Understand feature correlation in real-world contexts
    ✓ Build a portfolio project with clear business insights

    📊 Acknowledgments

    This dataset was designed for educational purposes in machine learning and data science. While synthetic, it reflects real patterns observed in global urban development research.

    Happy Clustering! 🎉

  7. Consumer Data | Global Population Data | Audience Targeting Data |...

    • datarade.ai
    .csv
    Updated Jul 11, 2024
    + more versions
    Cite
    GeoPostcodes (2024). Consumer Data | Global Population Data | Audience Targeting Data | Segmentation data [Dataset]. https://datarade.ai/data-products/geopostcodes-consumer-data-population-data-audience-targe-geopostcodes
    Available download formats: .csv
    Dataset updated
    Jul 11, 2024
    Dataset authored and provided by
    GeoPostcodes
    Area covered
    Uzbekistan, Nepal, Sint Maarten (Dutch part), Cameroon, Pitcairn, Guam, Guernsey, Malawi, Syrian Arab Republic, Algeria
    Description

    A global database of population segmentation data that provides an understanding of population distribution at administrative and zip code levels over 55 years, past, present, and future.

    Leverage up-to-date audience targeting data trends for market research, audience targeting, and sales territory mapping.

    Self-hosted consumer data curated based on trusted sources such as the United Nations or the European Commission, with a 99% match accuracy. The Consumer Data is standardized, unified, and ready to use.

    Use cases for the Global Population Database (Consumer Data/Segmentation Data)

    • Ad targeting

    • B2B Market Intelligence

    • Customer analytics

    • Marketing campaign analysis

    • Demand forecasting

    • Sales territory mapping

    • Retail site selection

    • Reporting

    • Audience targeting

    Segmentation data export methodology

    Our location data packages are offered in CSV format. All geospatial data are optimized for seamless integration with popular systems like Esri ArcGIS, Snowflake, QGIS, and more.

    Product Features

    • Historical population data (55 years)

    • Changes in population density

    • Urbanization Patterns

    • Accurate at zip code and administrative level

    • Optimized for easy integration

    • Easy customization

    • Global coverage

    • Updated yearly

    • Standardized and reliable

    • Self-hosted delivery

    • Fully aggregated (ready to use)

    • Rich attributes

    Why do companies choose our Population Databases

    • Standardized and unified demographic data structure

    • Seamless integration in your system

    • Dedicated location data expert

    Note: Custom population data packages are available. Please submit a request via the above contact button for more details.

  8. Lisbon, Portugal, hotel’s customer dataset with three years of personal,...

    • data.mendeley.com
    Updated Nov 18, 2020
    + more versions
    Cite
    Nuno Antonio (2020). Lisbon, Portugal, hotel’s customer dataset with three years of personal, behavioral, demographic, and geographic information [Dataset]. http://doi.org/10.17632/j83f5fsh6c.1
    Dataset updated
    Nov 18, 2020
    Authors
    Nuno Antonio
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Lisbon, Portugal
    Description

    Hotel customer dataset with 31 variables describing a total of 83,590 instances (customers). It covers three full years of customer behavioral data. In addition to personal and behavioral information, the dataset also contains demographic and geographical information. This dataset helps address the lack of real-world business data that can be used for educational and research purposes. The dataset can be used in data mining, machine learning, and other analytical problems within the scope of data science. Because its unit of analysis is the customer, it is especially suitable for building customer segmentation models, including clustering and RFM (Recency, Frequency, and Monetary value) models, but it can also be used in classification and regression problems.

  9. Customer Segmentation Data

    • kaggle.com
    zip
    Updated Mar 11, 2024
    Cite
    Smit Raval (2024). Customer Segmentation Data [Dataset]. https://www.kaggle.com/datasets/ravalsmit/customer-segmentation-data/discussion
    Available download formats: zip (1842344 bytes)
    Dataset updated
    Mar 11, 2024
    Authors
    Smit Raval
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides comprehensive customer data suitable for segmentation analysis. It includes anonymized demographic, transactional, and behavioral attributes, allowing for detailed exploration of customer segments. Leveraging this dataset, marketers, data scientists, and business analysts can uncover valuable insights to optimize targeted marketing strategies and enhance customer engagement. Whether you're looking to understand customer behavior or improve campaign effectiveness, this dataset offers a rich resource for actionable insights and informed decision-making.

    Key Features:

    • Anonymized demographic, transactional, and behavioral data.
    • Suitable for customer segmentation analysis.
    • Opportunities to optimize targeted marketing strategies.
    • Valuable insights for improving campaign effectiveness.
    • Ideal for marketers, data scientists, and business analysts.

    Usage Examples:

    • Segmenting customers based on demographic attributes.
    • Analyzing purchase behavior to identify high-value customer segments.
    • Optimizing marketing campaigns for targeted engagement.
    • Understanding customer preferences and tailoring product offerings accordingly.
    • Evaluating the effectiveness of marketing strategies and iterating for improvement.

    Explore this dataset to unlock actionable insights and drive success in your marketing initiatives!

  10. Consumer Marketing Data API | Tailored Consumer Insights | Target with...

    • datarade.ai
    Updated Oct 27, 2021
    + more versions
    Cite
    Success.ai (2021). Consumer Marketing Data API | Tailored Consumer Insights | Target with Precision | Best Price Guarantee [Dataset]. https://datarade.ai/data-products/consumer-marketing-data-api-tailored-consumer-insights-ta-success-ai
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Oct 27, 2021
    Dataset provided by
    Area covered
    Senegal, Hong Kong, Madagascar, United Arab Emirates, Sweden, Turkey, Estonia, Philippines, Vanuatu, Burundi
    Description

    Success.ai’s Consumer Marketing Data API empowers your marketing, analytics, and product teams with on-demand access to a vast and continuously updated dataset of consumer insights. Covering detailed demographics, behavioral patterns, and purchasing histories, this API enables you to go beyond generic outreach and craft tailored campaigns that truly resonate with your target audiences.

    With AI-validated accuracy and support for precise filtering, the Consumer Marketing Data API ensures you’re always equipped with the most relevant data. Backed by our Best Price Guarantee, this solution is essential for refining your strategies, improving conversion rates, and driving sustainable growth in today’s competitive consumer landscape.

    Why Choose Success.ai’s Consumer Marketing Data API?

    1. Tailored Consumer Insights for Precision Targeting

      • Access verified demographic, behavioral, and purchasing data to understand what consumers truly value.
      • AI-driven validation ensures 99% accuracy, minimizing wasted spend and improving engagement outcomes.
    2. Comprehensive Global Reach

      • Includes consumer profiles from diverse regions and markets, enabling you to scale campaigns and discover emerging opportunities.
      • Adapt swiftly to new markets, product launches, and shifting consumer preferences with real-time data at your fingertips.
    3. Continuously Updated and Real-Time Data

      • Receive ongoing updates that reflect evolving consumer behaviors, interests, and market trends.
      • Respond quickly to seasonal changes, competitor moves, and industry disruptions, ensuring your campaigns remain timely and relevant.
    4. Ethical and Compliant

      • Fully adheres to GDPR, CCPA, and other global data privacy regulations, guaranteeing responsible and lawful data usage.

    Data Highlights:

    • Detailed Demographics: Age, gender, location, and income levels to refine targeting and messaging.
    • Behavioral Insights: Interests, browsing patterns, and content consumption habits to anticipate consumer needs.
    • Purchasing History: Understand consumer spending, brand loyalty, and product preferences to tailor promotions effectively.
    • Real-Time Updates: Keep pace with evolving consumer tastes, ensuring your strategies remain forward-focused and competitive.

    Key Features of the Consumer Marketing Data API:

    1. Granular Targeting and Segmentation

      • Query the API to segment consumers by demographics, interests, past purchases, or engagement patterns.
      • Focus campaigns on the most receptive audiences, enhancing conversion rates and ROI.
    2. Flexible and Seamless Integration

      • Easily integrate the API into CRM systems, marketing automation tools, or analytics platforms.
      • Streamline workflows and eliminate manual data imports, freeing resources for strategic initiatives.
    3. Continuous Data Enrichment

      • Refresh consumer profiles with the latest data, ensuring every decision is backed by current insights.
      • Reduce data decay and maintain top-notch data hygiene to maximize long-term marketing effectiveness.
    4. AI-Driven Validation

      • Rely on advanced AI validation techniques to guarantee high-quality data accuracy and reliability.
      • Increase confidence in your campaigns and decrease budget wasted on irrelevant targets.

    Strategic Use Cases:

    1. Highly Personalized Marketing Campaigns

      • Deliver tailored offers, recommendations, and content that align with individual consumer preferences.
      • Boost engagement and loyalty by making every touchpoint relevant and meaningful.
    2. Market Expansion and Product Launches

      • Identify segments most receptive to new products or services, ensuring successful market entry.
      • Stay ahead of consumer demands, evolving your product line and marketing mix to meet changing preferences.
    3. Competitive Analysis and Trend Forecasting

      • Leverage consumer insights to anticipate emerging trends and outpace competitors in capturing new markets.
      • Adjust marketing strategies proactively to capitalize on seasonal, cultural, or economic shifts.
    4. Customer Retention and Loyalty Programs

      • Use historical purchase and engagement data to identify at-risk customers and implement retention strategies.
      • Cultivate brand advocates by delivering personalized offers and exclusive perks to loyal consumers.

    Why Choose Success.ai?

    1. Best Price Guarantee

      • Access premium-quality consumer marketing data at unmatched prices, ensuring maximum ROI for your outreach efforts.
    2. Seamless Integration

      • Easily incorporate the API into existing workflows, eliminating data silos and manual data management.
    3. Data Accuracy with AI Validation

      • Depend on 99% accuracy to guide data-driven decisions, refine targeting, and elevate your marketing initiatives.
    4. Customizable and Scalable Solutions

      • Tailor datasets to focus on specific demog...
  11. AIWR Dataset

    • data.mendeley.com
    Updated Aug 10, 2021
    Cite
    Sangdaow Noppitak (2021). AIWR Dataset [Dataset]. http://doi.org/10.17632/d73mpc529b.2
    Explore at:
    Dataset updated
    Aug 10, 2021
    Authors
    Sangdaow Noppitak
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Aerial Image Water Resources (AIWR) Dataset

    According to the land use code standard of the fundamental geographic data set (FGDS), land use classification in Thailand requires analysis and transformation of satellite image data together with field survey data. In this article, the researchers studied only land use in water bodies. The water bodies in this research are divided into two levels: natural bodies of water (W1) and artificial bodies of water (W2).

    The aerial image data used in this research was 1:50 meters. Every aerial image had 650x650 pixels. The images included water bodies of types W1 and W2. The ground truth for all aerial images was prepared and then sent to remote sensing experts for analysis and interpretation, which ensured that the water body groupings were correct. The expert-checked ground truth was used to train the deep learning model and in further evaluation.

    The aerial images used in the experiment contain water bodies of types W1 and W2. The Aerial Image Water Resources (AIWR) dataset has 800 images. The data were chosen at random and divided into three sections: training, validation, and test sets, with a ratio of 8:1:1. Therefore, 640 aerial images were used for learning and creating the model, 80 images were used for validation, and the remaining 80 images were used for testing.
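
The 8:1:1 random split described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the file names, the seed, and the `split_dataset` helper are assumptions introduced here.

```python
import random

def split_dataset(items, ratios=(8, 1, 1), seed=0):
    """Shuffle items and split them into train/validation/test by integer ratios."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_val = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

# Hypothetical file names; the real AIWR naming scheme is not given in the description.
image_names = [f"aiwr_{i:04d}.png" for i in range(800)]
train, val, test = split_dataset(image_names)
print(len(train), len(val), len(test))  # 640 80 80
```

Fixing the seed keeps the split reproducible, which matters when the same test set has to be reused across experiments.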

  12. App Users Segmentation: Case Study

    • kaggle.com
    zip
    Updated Jun 12, 2023
    Cite
    Bhanupratap Biswas (2023). App Users Segmentation: Case Study [Dataset]. https://www.kaggle.com/datasets/bhanupratapbiswas/app-users-segmentation-case-study
    Explore at:
    Available download formats: zip (11584 bytes)
    Dataset updated
    Jun 12, 2023
    Authors
    Bhanupratap Biswas
    Description

    Here's a step-by-step guide on how to approach user segmentation for FitTrackr:

    Define your segmentation goals: Start by determining what you want to achieve with user segmentation. For example, you might want to identify the most engaged users, understand the demographics of your user base, or target specific user groups with personalized promotions.

    Gather data: Collect relevant data about your app users. This can include demographic information (age, gender, location), app usage data (frequency of app usage, time spent on different features), user behavior (types of workouts, goals set, achievements unlocked), and any other relevant data points available to you.

    Identify relevant segmentation variables: Based on the goals you defined, identify the key variables that will help you segment your user base effectively. For FitTrackr, potential variables could include age, gender, fitness goals (e.g., weight loss, muscle gain), workout preferences (e.g., cardio, strength training), and user engagement level.

    Segment the user base: Use clustering techniques or segmentation algorithms to divide your user base into distinct segments based on the identified variables. You can employ methods such as k-means clustering, hierarchical clustering, or even machine learning algorithms like decision trees or random forests.

    Analyze and profile each segment: Once the segmentation is done, analyze each segment to understand their characteristics, preferences, and needs. Create detailed user profiles for each segment, including demographic information, app usage patterns, fitness goals, and any other relevant attributes. This will help you tailor your marketing messages and app features to each segment's specific requirements.

    Develop targeted strategies: Based on the insights gained from user profiles, develop targeted marketing strategies and app features for each segment. For example, if you have a segment of users who primarily focus on weight loss, you might create personalized workout plans or send them motivational content related to weight management.

    Implement and evaluate: Implement the targeted strategies and monitor their effectiveness. Continuously evaluate and refine your segmentation approach based on user feedback, engagement metrics, and the achievement of your goals.
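
The "analyze and profile each segment" step can be sketched as below, assuming segment labels have already been assigned by some clustering method. The records, field names, and labels are hypothetical and are not taken from the FitTrackr data.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical user records; field names and segment labels are assumptions.
users = [
    {"age": 24, "sessions_per_week": 5, "goal": "muscle gain", "segment": 0},
    {"age": 27, "sessions_per_week": 6, "goal": "muscle gain", "segment": 0},
    {"age": 41, "sessions_per_week": 2, "goal": "weight loss", "segment": 1},
    {"age": 45, "sessions_per_week": 1, "goal": "weight loss", "segment": 1},
    {"age": 38, "sessions_per_week": 3, "goal": "cardio", "segment": 1},
]

def profile_segments(records):
    """Summarize each segment: size, mean age, mean weekly sessions, most common goal."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["segment"]].append(record)
    profiles = {}
    for segment, members in grouped.items():
        profiles[segment] = {
            "size": len(members),
            "mean_age": mean(m["age"] for m in members),
            "mean_sessions": mean(m["sessions_per_week"] for m in members),
            "top_goal": Counter(m["goal"] for m in members).most_common(1)[0][0],
        }
    return profiles

profiles = profile_segments(users)
for segment, summary in sorted(profiles.items()):
    print(segment, summary)
```

A summary like this is what the targeted strategies in the later steps would be built on, for example a weight-management plan for a segment whose most common goal is weight loss.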

  13. Data from: A multi-scale labeled dataset for boulder segmentation and...

    • data.niaid.nih.gov
    • data.europa.eu
    Updated Oct 5, 2023
    Cite
    Mattia Pugliatti; Michele Maestrini (2023). A multi-scale labeled dataset for boulder segmentation and navigation on small bodies [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8231154
    Explore at:
    Dataset updated
    Oct 5, 2023
    Dataset provided by
    Politecnico di Milano
    Authors
    Mattia Pugliatti; Michele Maestrini
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The capability to detect boulders on the surface of small bodies is beneficial for vision-based applications such as hazard detection during critical operations, safety quantification, autonomous planning of scientific operations, and autonomous navigation. This task, however, is challenging due to the wide assortment of irregular shapes, the characteristics of the boulder population, and the rapid variability in illumination conditions. Moreover, the lack of publicly available labeled datasets hampers research on data-driven algorithms. The following dataset has been designed and made publicly available to tackle these challenges. Its purpose is twofold. First, building on the lessons learned from previous datasets, to develop a multi-purpose, high-fidelity dataset with boulders scattered across the surface of a small body. Second, to exploit domain randomization, artificial noise addition, scaling, and post-processing, enabling the design of data-driven pipelines.

    The methodology used to generate the dataset is illustrated in the work "A multi-scale labeled dataset for boulder segmentation and navigation on small bodies" by Mattia Pugliatti and Michele Maestrini, presented at the 74th IAC (International Astronautical Congress), 2023, Baku, Azerbaijan.

    The dataset contains the image-label pairs of 47502 samples, organized with the following structure:

    Dataset_PugliattiMaestrini_2023IAC
    --img
    --labels
    --masks

    The "img" folder contains the input 512x512 grayscale images. The "labels" folder includes the .txt segmentation labels of the 15 most prominent boulders for each image, detected with the methodology illustrated in the IAC paper. The "masks" folder contains the segmentation masks for all image layers, with the values encoded between 0 and 17 as uint8. The samples are named XXXXXX_YYY, where XXXXXX is the image's original ID during rendering and YYY identifies the sub-split of the original image obtained at rendering:

    001 - Top-Left crop
    002 - Top-Right crop
    003 - Bottom-Left crop
    004 - Bottom-Right crop
    005 - Whole, resized
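
The XXXXXX_YYY naming convention above can be decoded with a small helper. This is a sketch: the `parse_sample_name` function and the `.png` extension in the usage line are assumptions, since the description does not state the image file format.

```python
# Crop codes as listed in the dataset description.
CROP_NAMES = {
    "001": "Top-Left crop",
    "002": "Top-Right crop",
    "003": "Bottom-Left crop",
    "004": "Bottom-Right crop",
    "005": "Whole, resized",
}

def parse_sample_name(name):
    """Split 'XXXXXX_YYY' (optionally with a file extension) into (image_id, crop_name)."""
    stem = name.rsplit(".", 1)[0] if "." in name else name
    image_id, crop_code = stem.split("_")
    if crop_code not in CROP_NAMES:
        raise ValueError(f"unknown crop code: {crop_code}")
    return image_id, CROP_NAMES[crop_code]

# Hypothetical sample name; the extension is an assumption.
print(parse_sample_name("000123_004.png"))
```

Grouping samples by the returned image ID keeps all five crops of one rendering in the same split, which avoids leakage between training and test sets.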
    

    The file "10000_ub_2023-01-18 00.09.43.txt" contains all the values of the rendering inputs detailed in the IAC paper.

  14. Distribution of samples by age group and gender.

    • plos.figshare.com
    xls
    Updated Mar 19, 2025
    Cite
    Mansi Patel; Uzma Shamim; Umang Umang; Rajesh Pandey; Jitendra Narayan (2025). Distribution of samples by age group and gender. [Dataset]. http://doi.org/10.1371/journal.pntd.0012918.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Mar 19, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Mansi Patel; Uzma Shamim; Umang Umang; Rajesh Pandey; Jitendra Narayan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background

    The COVID-19 pandemic had an unprecedented global impact on health and society, highlighting the need for a detailed understanding of SARS-CoV-2 evolution in response to host and environmental factors. This study investigates the evolution of SARS-CoV-2 via mutation dynamics, focusing on distinct age cohorts, geographical location, and vaccination status within the Indian population, one of the nations most affected by COVID-19.

    Methodology

    A comprehensive dataset, spanning diverse time points during the Alpha, Delta, and Omicron variant waves, captured essential phases of the pandemic’s footprint in India. By leveraging genomic data from Global Initiative on Sharing Avian Influenza Data (GISAID), we examined the substitution mutation landscape of SARS-CoV-2 in three demographic segments: children (1–17 years), working-age adults (18–64 years), and elderly individuals (65+ years). A balanced dataset of 69,975 samples was used for the study, comprising 23,325 samples from each group. This design ensured high statistical power, as confirmed by power analysis. We employed bioinformatics and statistical analyses to explore genetic diversity patterns and substitution frequencies across the age groups.

    Principal findings

    The working-age group exhibited a notably high frequency of unique substitutions, suggesting that immune pressures within highly interactive populations may accelerate viral adaptation. Geographic analysis emphasizes notable regional variation in substitution rates, potentially driven by population density and local transmission dynamics, while regions with more homogeneous strain circulation show relatively lower substitution rates. The analysis also revealed a significant surge in unique substitutions across all age groups during the vaccination period, with substitution rates remaining elevated even after widespread vaccination, compared to pre-vaccination levels. This trend supports the virus's adaptive response to heightened immune pressures from vaccination, as observed through the increased prevalence of substitutions in important regions of the SARS-CoV-2 genome such as ORF1ab and Spike, potentially contributing to immune escape and transmissibility.

    Conclusion

    Our findings affirm the importance of continuous surveillance of viral evolution, particularly in countries with high transmission rates. This research provides insights for anticipating future viral outbreaks and refining pandemic preparedness strategies, thus enhancing our capacity for proactive global health responses.

  15. Customer Transactions Dataset

    • cubig.ai
    zip
    Updated Jun 22, 2025
    Cite
    CUBIG (2025). Customer Transactions Dataset [Dataset]. https://cubig.ai/store/products/496/customer-transactions-dataset
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 22, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction • The Customer Transactions Dataset is an actual transaction-based customer analysis dataset that records 100,000 customer-specific transaction details (such as payment method, purchased product, amount, date, status, type, etc.) in a tabular format.

    2) Data Utilization

    (1) Characteristics of the Customer Transactions Dataset:
    • Each row contains customer ID, payment method, purchased goods, transaction amount, transaction date, transaction status (success/failure, etc.), and transaction type (purchase/refund, etc.).
    • The data is organized for customer segmentation and behavioral analysis, covering repeat transactions per customer, various payment methods, and product-specific purchase patterns.

    (2) Uses of the Customer Transactions Dataset:
    • Customer Segmentation and Target Marketing: Use transaction patterns, payment methods, purchase history, etc. to define customer groups and develop customized marketing strategies for them.
    • Purchase behavior and churn (departure) prediction: Data such as transaction type, amount, and status can be applied to customer purchase behavior analysis, churn risk prediction, and finding loyal customers.

  16. Customer Segmentation for Targeted Campaigns

    • kaggle.com
    zip
    Updated May 21, 2024
    Cite
    Mani Devesh (2024). Customer Segmentation for Targeted Campaigns [Dataset]. https://www.kaggle.com/datasets/manidevesh/customer-sales-data
    Explore at:
    Available download formats: zip (914292 bytes)
    Dataset updated
    May 21, 2024
    Authors
    Mani Devesh
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Project Overview: Customer Segmentation Using K-Means Clustering

    Introduction

    In this project, I analysed customer data from a retail store to identify distinct customer segments. The dataset includes key attributes such as age, city, and total sales of the customers. By leveraging K-Means clustering, an unsupervised machine learning technique, I aim to group customers based on their age and sales metrics. These insights will enable the creation of targeted marketing campaigns tailored to the specific needs and behaviours of each customer segment.

    Objectives

    • Cluster Customers: Use K-Means clustering to group customers based on age and total sales.
    • Analyse Segments: Examine the characteristics of each customer segment.
    • Targeted Marketing: Develop strategies for personalized marketing campaigns targeting each identified customer group.

    Data Description

    The dataset comprises:

    • Age: The age of the customers.
    • City: The city where the customers reside.
    • Total Sales: The total sales generated by each customer.

    Methodology

    • Data Preprocessing: Clean and preprocess the data to handle any missing or inconsistent entries.
    • Feature Selection: Focus on age and total sales as primary features for clustering.
    • K-Means Clustering: Apply the K-Means algorithm to identify distinct customer segments.
    • Cluster Analysis: Analyse the resulting clusters to understand the demographic and sales characteristics of each group.
    • Marketing Strategy Development: Create targeted marketing strategies for each customer segment to enhance engagement and sales.

    Expected Outcomes

    • Customer Segments: Clear identification of customer groups based on age and purchasing behaviour.
    • Insights for Marketing: Detailed understanding of each segment to inform targeted marketing efforts.
    • Business Impact: Enhanced ability to tailor marketing campaigns, potentially leading to increased customer satisfaction and sales.

    By clustering customers based on age and total sales, this project aims to provide actionable insights for personalized marketing, ultimately driving better customer engagement and higher sales for the retail store.
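
A minimal sketch of K-Means on age and total sales, written in plain Python so it runs without extra libraries. It is not the author's notebook code: the customer values, the choice of k=2, the seed, and the standardization step are assumptions made for illustration.

```python
import random
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each column to zero mean and unit variance so age and sales are comparable."""
    columns = list(zip(*rows))
    stats = [(mean(c), pstdev(c) or 1.0) for c in columns]
    return [tuple((v - m) / s for v, (m, s) in zip(row, stats)) for row in rows]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: assign each point to its nearest centroid, then recompute centroids."""
    centroids = random.Random(seed).sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j])) for p in points]
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(tuple(mean(c) for c in zip(*members)) if members else centroids[j])
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical (age, total_sales) pairs; not taken from the dataset.
customers = [(22, 150.0), (25, 180.0), (24, 160.0), (51, 900.0), (55, 950.0), (53, 870.0)]
labels, _ = kmeans(standardize(customers), k=2)
print(labels)
```

Standardizing first matters here because total sales are on a much larger numeric scale than age and would otherwise dominate the distance calculation.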

  17. Demographic and Health Survey 2000 - Armenia

    • microdata.armstat.am
    • catalog.ihsn.org
    Updated Oct 10, 2019
    Cite
    National Statistical Service (2019). Demographic and Health Survey 2000 - Armenia [Dataset]. https://microdata.armstat.am/index.php/catalog/1
    Explore at:
    Dataset updated
    Oct 10, 2019
    Dataset provided by
    National Statistical Service
    Ministry of Health
    Time period covered
    2000
    Area covered
    Armenia
    Description

    Abstract

    The Armenia Demographic and Health Survey (ADHS) was a nationally representative sample survey designed to provide information on population and health issues in Armenia. The primary goal of the survey was to develop a single integrated set of demographic and health data, the first such data set pertaining to the population of the Republic of Armenia. In addition to integrating measures of reproductive, child, and adult health, another feature of the DHS survey is that the majority of data are presented at the marz level.

    The ADHS was conducted by the National Statistical Service and the Ministry of Health of the Republic of Armenia during October through December 2000. ORC Macro provided technical support for the survey through the MEASURE DHS+ project. MEASURE DHS+ is a worldwide project, sponsored by the USAID, with a mandate to assist countries in obtaining information on key population and health indicators. USAID/Armenia provided funding for the survey. The United Nations Children’s Fund (UNICEF)/Armenia provided support through the donation of equipment.

    The ADHS collected national- and regional-level data on fertility and contraceptive use, maternal and child health, adult health, and AIDS and other sexually transmitted diseases. The survey obtained detailed information on these issues from women of reproductive age and, on certain topics, from men as well. Data are presented by marz wherever sample size permits.

    The ADHS results are intended to provide the information needed to evaluate existing social programs and to design new strategies for improving the health of and health services for the people of Armenia. The ADHS also contributes to the growing international database on demographic and health-related variables.

    Geographic coverage

    National

    Analysis unit

    • Household
    • Children under five years
    • Women age 15-49
    • Men age 15-54

    Kind of data

    Sample survey data

    Sampling procedure

    The sample was designed to provide estimates of most survey indicators (including fertility, abortion, and contraceptive prevalence) for Yerevan and each of the other ten administrative regions (marzes). The design also called for estimates of infant and child mortality at the national level for Yerevan and other urban areas and rural areas.

    The target sample size of 6,500 completed interviews with women age 15-49 was allocated as follows: 1,500 to Yerevan and 500 to each of the ten marzes. Within each marz, the sample was allocated between urban and rural areas in proportion to the population size. This gave a target sample of approximately 2,300 completed interviews for urban areas exclusive of Yerevan and 2,700 completed interviews for the rural sector. Interviews were completed with 6,430 women. Men age 15-54 were interviewed in every third household; this yielded 1,719 completed interviews.

    A two-stage sample was used. In the first stage, 260 areas or primary sampling units (PSUs) were selected with probability proportional to population size (PPS) by systematic selection from a list of areas. The list of areas was the 1996 Data Base of Addresses and Households constructed by the National Statistical Service. Because most selected areas were too large to be directly listed, a separate segmentation operation was conducted prior to household listing. Large selected areas were divided into segments of which two segments were included in the sample. A complete listing of households was then carried out in selected segments as well as selected areas that were not segmented.

    The listing of households served as the sampling frame for the selection of households in the second stage of sampling. Within each area, households were selected systematically so as to yield an average of 25 completed interviews with eligible women per area. All women 15-49 who stayed in the sampled households on the night before the interview were eligible for the survey. In each segment, a subsample of one-third of all households was selected for the men's component of the survey. In these households, all men 15-54 who stayed in the household on the previous night were eligible for the survey.
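
The first-stage selection described above, systematic selection with probability proportional to size (PPS), can be sketched as follows. The area sizes, the sample of 4 areas, and the `pps_systematic_sample` helper are hypothetical; the ADHS itself selected 260 areas from the 1996 address database.

```python
import random
from itertools import accumulate

def pps_systematic_sample(sizes, n, seed=0):
    """Select n units with probability proportional to size by systematic selection.

    sizes: size measure (for example, household count) of each area, in list order.
    Returns the indices of the selected areas. An area larger than the
    sampling interval can be selected more than once.
    """
    total = sum(sizes)
    interval = total / n
    start = random.Random(seed).random() * interval  # random start in [0, interval)
    cumulative = list(accumulate(sizes))
    selected = []
    idx = 0
    for i in range(n):
        target = start + i * interval
        # Advance to the area whose cumulative size range contains the target.
        while cumulative[idx] <= target:
            idx += 1
        selected.append(idx)
    return selected

# Hypothetical area sizes, not taken from the survey frame.
area_sizes = [120, 300, 80, 450, 200, 150, 400, 250]
selected = pps_systematic_sample(area_sizes, n=4)
print(selected)
```

Larger areas cover a longer stretch of the cumulative size scale, so the evenly spaced selection points are more likely to land in them, which is what makes the selection proportional to size.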

    Note: See detailed description of sample design in APPENDIX A of the survey report.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    Three questionnaires were used in the ADHS: a Household Questionnaire, a Women’s Questionnaire, and a Men’s Questionnaire. The questionnaires were based on the model survey instruments developed for the MEASURE DHS+ program. The model questionnaires were adapted for use during a series of expert meetings hosted by the Center of Perinatology, Obstetrics, and Gynecology. The questionnaires were developed in English and translated into Armenian and Russian. The questionnaires were pretested in July 2000.

    The Household Questionnaire was used to list all usual members of and visitors to a household and to collect information on the physical characteristics of the dwelling unit. The first part of the household questionnaire collected information on the age, sex, residence, educational attainment, and relationship to the household head of each household member or visitor. This information provided basic demographic data for Armenian households. It also was used to identify the women and men who were eligible for the individual interview (i.e., women 15-49 and men 15-54). The second part of the Household Questionnaire consisted of questions on housing characteristics (e.g., the flooring material, the source of water, and the type of toilet facilities) and on ownership of a variety of consumer goods.

    The Women’s Questionnaire obtained information on the following topics:

    • Background characteristics
    • Pregnancy history
    • Antenatal, delivery, and postnatal care
    • Knowledge and use of contraception
    • Attitudes toward contraception and abortion
    • Reproductive and adult health
    • Vaccinations, birth registration, and health of children under age five
    • Episodes of diarrhea and respiratory illness of children under age five
    • Breastfeeding and weaning practices
    • Height and weight of women and children under age five
    • Hemoglobin measurement of women and children under age five
    • Marriage and recent sexual activity
    • Fertility preferences
    • Knowledge of and attitude toward AIDS and other sexually transmitted infections

    The Men’s Questionnaire focused on the following topics:

    • Background characteristics
    • Health
    • Marriage and recent sexual activity
    • Attitudes toward and use of condoms
    • Knowledge of and attitude toward AIDS and other sexually transmitted infections

    Cleaning operations

    After a team had completed interviewing in a cluster, questionnaires were returned promptly to the National Statistical Service in Yerevan for data processing. The office editing staff first checked that questionnaires for all selected households and eligible respondents had been received from the field staff. In addition, a few questions that had not been precoded (e.g., occupation) were coded at this time. Using the ISSA (Integrated System for Survey Analysis) software, a specially trained team of data processing staff entered the questionnaires and edited the resulting data set on microcomputers. The process of office editing and data processing was initiated soon after the beginning of fieldwork and was completed by the end of January 2001.

    Response rate

    A total of 6,524 households were selected for the sample, of which 6,150 were occupied at the time of fieldwork. The main reason for the difference is that some of the dwelling units that were occupied during the household listing operation were either vacant or the household was away for an extended period at the time of interviewing. Of the occupied households, 97 percent were successfully interviewed.

    In these households, 6,685 women were identified as eligible for the individual interview (i.e., age 15-49). Interviews were completed with 96 percent of them. Of the 1,913 eligible men identified, 90 percent were successfully interviewed. The principal reason for non-response among eligible women and men was the failure to find them at home despite repeated visits to the household. The refusal rate was low.

    The overall response rates, the product of the household and the individual response rates, were 94 percent for women and 87 percent for men.
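
The product rule for the overall response rate can be checked with the figures quoted in this description. This is a rough sketch: the household rate is only available here as the rounded 97 percent, so the result for women comes out slightly below the published 94 percent, which presumably was computed from unrounded rates.

```python
# Counts and the rounded household rate are taken from the survey description.
household_rate = 0.97        # rounded; the unrounded value is not given here
women_rate = 6430 / 6685     # completed interviews / eligible women
men_rate = 1719 / 1913       # completed interviews / eligible men

# Overall response rate = household response rate x individual response rate.
overall_women = household_rate * women_rate
overall_men = household_rate * men_rate
print(f"women: {overall_women:.1%}, men: {overall_men:.1%}")
```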

    Note: See summarized response rates by residence (urban/rural) in Table 1.1 of the survey report.

    Sampling error estimates

    The estimates from a sample survey are affected by two types of errors: (1) nonsampling errors, and (2) sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2000 Armenia Demographic and Health Survey (ADHS) to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.

    Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the ADHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey

  18. Consumer B2C Data | United States

    • datarade.ai
    .csv, .xls
    Updated Nov 21, 2025
    Cite
    Archetype Data (2025). Consumer B2C Data | United States [Dataset]. https://datarade.ai/data-products/consumer-b2c-data-united-states-archetype-data
    Explore at:
    Available download formats: .csv, .xls
    Dataset updated
    Nov 21, 2025
    Dataset authored and provided by
    Archetype Data
    Area covered
    United States
    Description

    Archetype Data’s B2C Consumer File is one of the most comprehensive and data-rich consumer datasets in the United States, encompassing over 260 million verified individuals and households. Designed for precision marketing, analytics, and customer intelligence, this dataset delivers unparalleled depth across lifestyle, demographic, financial, and behavioral dimensions, enabling businesses to understand, segment, and engage consumers with accuracy and confidence.

    Each consumer record includes fundamental demographic elements such as name, age, gender, location, household composition, and contact information. Building upon that, Archetype Data enriches every profile with 400+ lifestyle, financial, and behavioral variables that capture consumer intent, spending capacity, purchasing habits, media preferences, and digital engagement patterns. This multidimensional view empowers marketers, insurers, and data-driven enterprises to identify not just who a consumer is—but how they live, shop, and connect.

    What truly differentiates Archetype Data’s B2C file is its integration with our Linq360™ B2B2C dataset, which links consumers to the businesses they own or operate. This linkage provides a powerful bridge between professional and personal identity, offering unparalleled insight into small business owners, entrepreneurs, and professionals as both business decision-makers and consumers.

    Whether activating audiences across CTV, programmatic display, social, or direct mail, our data seamlessly maps into today’s leading marketing and advertising ecosystems, including LiveRamp, The Trade Desk, and other major platforms.

    The B2C Consumer File supports a wide range of applications, including audience segmentation, modeling, CRM enrichment, lookalike development, and attribution measurement, across industries such as retail, finance, insurance, media, and healthcare. Whether you’re building a custom audience for a digital campaign, enriching customer records, or analyzing lifestyle trends within a region, Archetype Data’s file provides the scale and precision needed to deliver meaningful results.

  19. Data from: ICDAR 2021 Competition on Historical Map Segmentation — Dataset

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, bin
    Updated May 30, 2021
    Cite
    Joseph Chazalon; Edwin Carlinet; Yizi Chen; Julien Perret; Bertrand Duménieu; Clément Mallet; Thierry Géraud (2021). ICDAR 2021 Competition on Historical Map Segmentation — Dataset [Dataset]. http://doi.org/10.5281/zenodo.4817662
    Explore at:
    Available download formats: bin, application/gzip
    Dataset updated
    May 30, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Joseph Chazalon; Edwin Carlinet; Yizi Chen; Julien Perret; Bertrand Duménieu; Clément Mallet; Thierry Géraud
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ICDAR 2021 Competition on Historical Map Segmentation — Dataset

    This is the dataset of the ICDAR 2021 Competition on Historical Map Segmentation (“MapSeg”).
    This competition ran from November 2020 to April 2021.
    Evaluation tools are freely available but distributed separately.

    Official competition website: https://icdar21-mapseg.github.io/

    The competition report can be cited as:

    Joseph Chazalon, Edwin Carlinet, Yizi Chen, Julien Perret, Bertrand Duménieu, Clément Mallet, Thierry Géraud, Vincent Nguyen, Nam Nguyen, Josef Baloun, Ladislav Lenc, and Pavel Král, "ICDAR 2021 Competition on Historical Map Segmentation", in Proceedings of the 16th International Conference on Document Analysis and Recognition (ICDAR'21), September 5-10, 2021, Lausanne, Switzerland.

    BibTeX entry:

    @InProceedings{chazalon.21.icdar.mapseg,
     author  = {Joseph Chazalon and Edwin Carlinet and Yizi Chen and Julien Perret and Bertrand Duménieu and Clément Mallet and Thierry Géraud and Vincent Nguyen and Nam Nguyen and Josef Baloun and Ladislav Lenc and Pavel Král},
     title   = {ICDAR 2021 Competition on Historical Map Segmentation},
     booktitle = {Proceedings of the 16th International Conference on Document Analysis and Recognition (ICDAR'21)},
     year   = {2021},
     address  = {Lausanne, Switzerland},
    }

    We thank the City of Paris for granting us permission to use and reproduce the atlases used in this work.

    The images of this dataset are extracted from a series of 9 atlases of the City of Paris produced between 1894 and 1937 by the Map Service (“Service du plan”) of the City of Paris, France, for the purpose of urban management and planning. For each year, a set of approximately 20 sheets forms a tiled view of the city, drawn at 1/5000 scale using trigonometric triangulation.

    Sample citation of original documents:

    Atlas municipal des vingt arrondissements de Paris. 1894, 1895, 1898, 1905, 1909, 1912, 1925, 1929, and 1937. Bibliothèque de l’Hôtel de Ville. City of Paris. France.

    Motivation

    This competition aims at encouraging research in the digitization of historical maps. To be usable in historical studies, the information contained in such images needs to be extracted. The general pipeline involves multiple stages; we list some essential ones here:

    • segment map content: locate the area of the image which contains map content;
    • extract map objects from different layers: detect objects like roads, buildings, building blocks, rivers, etc. to create geometric data;
    • georeference the map: by detecting objects at known geographic coordinates, compute the transformation to turn geometric objects into geographic ones (which can be overlaid on current maps).
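As a rough illustration of the georeferencing stage listed above, the sketch below fits a least-squares affine transform from pixel positions to geographic coordinates. The affine model, the function names (`solve3`, `fit_affine`, `apply_affine`), and the control-point values are all assumptions made for this example; the dataset description does not say which transformation the organizers use.

```python
def solve3(a, b):
    """Solve the 3x3 linear system a x = b by Gauss-Jordan elimination."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear or too few")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_affine(pixels, geo):
    """Least-squares affine map from pixel (col, row) to geographic (x, y)."""
    rows = [(px, py, 1.0) for px, py in pixels]
    # Normal equations (A^T A) p = A^T t, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):
        atb = [sum(r[i] * g[k] for r, g in zip(rows, geo)) for i in range(3)]
        params.append(solve3(ata, atb))
    return params  # [[a, b, c], [d, e, f]]


def apply_affine(params, point):
    """Map one pixel position to geographic coordinates."""
    px, py = point
    return tuple(a * px + b * py + c for a, b, c in params)


# Hypothetical control points (invented values, not taken from the dataset):
# pixel positions of detected objects and their known geographic coordinates.
pixels = [(0, 0), (100, 0), (0, 100), (100, 100)]
geo = [(10.0, 500.0), (210.0, 500.0), (10.0, 300.0), (210.0, 300.0)]

params = fit_affine(pixels, geo)
print(apply_affine(params, (50, 50)))
```

At least three non-collinear control points are needed for an affine fit; a scanned sheet with paper distortion may call for a richer model than this one.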

    Task overview

    • Task 1: Detection of building blocks
    • Task 2: Segmentation of map content within map sheets
    • Task 3: Localization of graticule lines intersections

    Please refer to the enclosed README.md file or to the official website for the description of tasks and file formats.

    Evaluation metrics and tools

    Evaluation metrics are described in the competition report. The evaluation tools are available at https://github.com/icdar21-mapseg/icdar21-mapseg-eval and should also be archived using Zenodo.

  20. A dataset of building instances of typical cities in China

    • scidb.cn
    Updated Mar 25, 2021
    + more versions
    Cite
    Fang Fang; Kaishun Wu; Daoyuan Zheng (2021). A dataset of building instances of typical cities in China [Dataset]. http://doi.org/10.11922/sciencedb.00620
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Mar 25, 2021
    Dataset provided by
    Science Data Bank
    Authors
    Fang Fang; Kaishun Wu; Daoyuan Zheng
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    China
    Description

    Building contour data is an important part of basic geographic information. The performance of automatic building extraction is driven by a large number of training samples. To enrich the publicly available datasets of cities in China, we created a building instance dataset that is sourced from high-resolution remote sensing images and was manually and interactively annotated. This dataset consists of 7,260 region samples containing 63,886 building instances in the cities of Beijing, Shanghai, Shenzhen and Wuhan, China. The annotations of the dataset consist of MS COCO 2017 format files and the corresponding binary building mask maps. This dataset provides fundamental data for research on building detection and extraction from high-resolution remote sensing images.
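Because the annotations are described as MS COCO 2017 format files, a per-image instance count can be sketched by grouping the `annotations` entries by `image_id`. The keys used below (`images`, `annotations`, `id`, `image_id`, `file_name`) are standard COCO fields; the sample content and file names are invented for illustration and do not come from the dataset.

```python
import json
from collections import Counter


def count_instances(coco):
    """Return {file_name: number of annotated instances} for a COCO-style dict."""
    names = {img["id"]: img["file_name"] for img in coco["images"]}
    counts = Counter(ann["image_id"] for ann in coco["annotations"])
    # Images without any annotation are reported with a count of zero.
    return {name: counts.get(img_id, 0) for img_id, name in names.items()}


# Tiny hypothetical annotation file; a real one would be read with json.load().
sample = json.loads("""
{
  "images": [
    {"id": 1, "file_name": "tile_a.png", "width": 500, "height": 500},
    {"id": 2, "file_name": "tile_b.png", "width": 500, "height": 500}
  ],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 1, "bbox": [5, 5, 40, 30], "area": 1200, "iscrowd": 0},
    {"id": 11, "image_id": 1, "category_id": 1, "bbox": [60, 80, 20, 20], "area": 400, "iscrowd": 0}
  ],
  "categories": [{"id": 1, "name": "building"}]
}
""")

print(count_instances(sample))
```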

Shivam Bansal (2021). Bank Customer Segmentation (1M+ Transactions) [Dataset]. https://www.kaggle.com/shivamb/bank-customer-segmentation

Bank Customer Segmentation (1M+ Transactions)

Customer demographics and transactions data from an Indian Bank

Explore at:
2 scholarly articles cite this dataset (View in Google Scholar)
zip (25360448 bytes), available download formats
Dataset updated
Oct 26, 2021
Authors
Shivam Bansal
Description

Bank Customer Segmentation

Most banks have a large customer base with different characteristics in terms of age, income, values, lifestyle, and more. Customer segmentation is the process of dividing a customer dataset into specific groups based on shared traits.

According to a report from Ernst & Young, “A more granular understanding of consumers is no longer a nice-to-have item, but a strategic and competitive imperative for banking providers. Customer understanding should be a living, breathing part of everyday business, with insights underpinning the full range of banking operations.”

About this Dataset

This dataset consists of 1 million+ transactions by over 800K customers of a bank in India. The data contains information such as customer age (DOB), location, gender, account balance at the time of the transaction, transaction details, transaction amount, etc.

Interesting Analysis Ideas

The dataset can be used for different analyses, for example:

  1. Perform Clustering / Segmentation on the dataset and identify popular customer groups along with their definitions/rules
  2. Perform Location-wise analysis to identify regional trends in India
  3. Perform transaction-related analysis to identify interesting trends that can be used by a bank to improve / optimize their user experiences
  4. Customer Recency, Frequency, Monetary analysis
  5. Network analysis or Graph analysis of customer data.
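The Recency, Frequency, Monetary idea in the list above can be sketched as a per-customer aggregation. This is a minimal sketch under stated assumptions: the tuple layout `(customer_id, date, amount)`, the function name `rfm`, and the sample rows are hypothetical and do not reflect the dataset's actual column names or values.

```python
from collections import defaultdict
from datetime import date


def rfm(transactions, as_of):
    """Compute recency (days), frequency (count) and monetary (sum) per customer."""
    by_customer = defaultdict(list)
    for customer_id, tx_date, amount in transactions:
        by_customer[customer_id].append((tx_date, amount))

    table = {}
    for customer_id, txs in by_customer.items():
        last_seen = max(d for d, _ in txs)
        table[customer_id] = {
            "recency_days": (as_of - last_seen).days,  # days since last transaction
            "frequency": len(txs),                     # number of transactions
            "monetary": sum(a for _, a in txs),        # total amount transacted
        }
    return table


# Hypothetical rows, invented for illustration only.
transactions = [
    ("C1", date(2016, 8, 1), 100.0),
    ("C1", date(2016, 8, 20), 50.0),
    ("C2", date(2016, 7, 15), 300.0),
]

for customer, scores in rfm(transactions, as_of=date(2016, 9, 1)).items():
    print(customer, scores)
```

The three scores can then be binned (for example into quantiles) or passed to a clustering method to form customer segments.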