24 datasets found
  1. Demographic profile of audience segments.

    • plos.figshare.com
    xls
    Updated Jan 31, 2024
    Cite
    Stephen Coleman; Michael D. Slater; Phil Wright; Oliver Wright; Lauren Skardon; Gillian Hayes (2024). Demographic profile of audience segments. [Dataset]. http://doi.org/10.1371/journal.pone.0296049.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jan 31, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Stephen Coleman; Michael D. Slater; Phil Wright; Oliver Wright; Lauren Skardon; Gillian Hayes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Pandemics such as Covid-19 pose tremendous public health communication challenges in promoting protective behaviours, vaccination, and educating the public about risks. Segmenting audiences based on attitudes and behaviours is a means to increase the precision and potential effectiveness of such communication. The present study reports on such an audience segmentation effort for the population of England, sponsored by the United Kingdom Health Security Agency (UKHSA) and involving a collaboration of market research and academic experts. A cross-sectional online survey was conducted between 4 and 24 January 2022 with 5525 respondents (5178 used in our analyses) in England using a market research opt-in panel. An additional 105 telephone interviews were conducted to sample persons without online or smartphone access. Respondents were quota sampled to be demographically representative. The primary analytic technique was k-means cluster analysis, supplemented with other techniques including multi-dimensional scaling and the use of respondent- as well as sample-standardized data when necessary to address differences in response set for some groups of respondents. Identified segments were profiled against demographic, behavioural self-report, attitudinal, and communication channel variables, with differences by segment tested for statistical significance. Seven segments were identified, including distinctly different groups of persons who tended toward a high level of compliance and several that were relatively low in compliance. The segments were characterized by distinctive patterns of demographics, attitudes, behaviours, trust in information sources, and communication channels preferred.
Segments were further validated by comparing the segmentation variable versus a set of demographic variables as predictors of reported protective behaviours in the past two weeks and of vaccine refusal; the demographics together had about one-quarter the effect size of the single seven-level segment variable. With respect to managerial implications, different communication strategies are suggested for each segment, illustrating advantages of rich segmentation descriptions for understanding public health communication audiences. Strengths and weaknesses of the methods used are discussed, to help guide future efforts.
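
The difference between respondent-standardized and sample-standardized data mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the study's code: the ratings matrix and the function names are hypothetical, and the sketch assumes all survey items are scored on the same numeric scale.

```python
from statistics import mean, pstdev

def standardize_rows(data):
    """Respondent-standardize: z-score each respondent's answers across items."""
    out = []
    for row in data:
        m, s = mean(row), pstdev(row)
        # A respondent with identical answers everywhere has no spread; map to zeros.
        out.append([(x - m) / s if s else 0.0 for x in row])
    return out

def standardize_columns(data):
    """Sample-standardize: z-score each item across all respondents."""
    stats = [(mean(col), pstdev(col)) for col in zip(*data)]
    return [
        [(x - m) / s if s else 0.0 for x, (m, s) in zip(row, stats)]
        for row in data
    ]

# Hypothetical 1-5 agreement ratings: rows are respondents, columns are items.
ratings = [
    [5, 5, 4, 5],  # tends to agree with everything
    [2, 2, 1, 2],  # tends to disagree with everything
    [1, 5, 3, 3],
]
by_respondent = standardize_rows(ratings)
by_sample = standardize_columns(ratings)
```

In this toy data the first two respondents differ only in their overall level of agreement, so respondent-standardization gives them identical profiles, while sample-standardization keeps them apart. That is the sense in which row-wise standardization can offset differences in response set before clustering.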

  2. Jimrealtex customer dataset

    • kaggle.com
    zip
    Updated Nov 22, 2025
    Cite
    JIMOH YEKINI (2025). Jimrealtex customer dataset [Dataset]. https://www.kaggle.com/datasets/jimohyekini/jimrealtex-customer-dataset
    Explore at:
    Available download formats: zip (1591 bytes)
    Dataset updated
    Nov 22, 2025
    Authors
    JIMOH YEKINI
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset Description: Jimrealtex Customer Dataset

    This dataset contains customer demographic and behavioral information designed for exploring segmentation, clustering, and predictive analytics in retail and marketing contexts. It provides a simple yet powerful foundation for practicing data science techniques such as K-Means clustering, customer profiling, and recommendation systems.

    ### Dataset Features
    - CustomerID: Unique identifier for each customer
    - Genre: Gender of the customer (Male/Female)
    - Age: Age of the customer (years)
    - Annual Income (k$): Annual income in thousands of dollars
    - Spending Score: A score assigned by the business based on customer behavior and spending patterns

    ### Notes
    - Some records contain missing values (nan) in Age, Annual Income, or Spending Score. These can be handled using imputation, removal, or advanced techniques depending on the analysis.
    - Spending Score is an arbitrary metric often used in clustering exercises to simulate customer engagement.
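
One of the options named in the notes, imputation, might look like the following sketch. It fills missing values with the median of the observed values in each column. The rows are invented for illustration and only reuse the column names from the feature list; the helper names are hypothetical.

```python
import math
from statistics import median

def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def impute_median(records, field):
    """Return new records with missing `field` values replaced by the observed median."""
    observed = [r[field] for r in records if not is_missing(r[field])]
    fill = median(observed)
    return [{**r, field: fill if is_missing(r[field]) else r[field]} for r in records]

# Invented rows that reuse the column names from the dataset description.
customers = [
    {"CustomerID": 1, "Age": 19.0, "Annual Income (k$)": 15.0, "Spending Score": 39.0},
    {"CustomerID": 2, "Age": float("nan"), "Annual Income (k$)": 16.0, "Spending Score": 81.0},
    {"CustomerID": 3, "Age": 35.0, "Annual Income (k$)": float("nan"), "Spending Score": 6.0},
    {"CustomerID": 4, "Age": 23.0, "Annual Income (k$)": 18.0, "Spending Score": 77.0},
]

cleaned = customers
for field in ("Age", "Annual Income (k$)", "Spending Score"):
    cleaned = impute_median(cleaned, field)
```

The median is used here because it is less sensitive to outliers than the mean; dropping incomplete rows is the simpler alternative when few records are affected.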

    ### Potential Use Cases
    - Customer Segmentation: Apply clustering algorithms (e.g., K-Means, DBSCAN) to group customers by income and spending habits.
    - Marketing Strategy: Identify high-value customers and tailor promotions.
    - Predictive Modeling: Build models to predict spending behavior based on demographics.
    - Data Cleaning Practice: Handle missing values and prepare the dataset for machine learning tasks.
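
As a rough illustration of the customer segmentation use case, the sketch below implements plain K-Means (Lloyd's algorithm) on (annual income, spending score) pairs using only the standard library. The points are invented, and the function names, the choice of `k=2`, and the fixed seed are assumptions made for the example; a real analysis would more likely use an established library and consider scaling the features first.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))

def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm: returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            new_centroids.append(mean_point(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels

# Invented (annual income in k$, spending score) pairs.
points = [(15.0, 10.0), (16.0, 12.0), (17.0, 11.0), (90.0, 85.0), (92.0, 88.0), (95.0, 90.0)]
centroids, labels = kmeans(points, k=2)
```

With these well-separated toy points the first three customers end up in one cluster and the last three in the other.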

    **Why This Dataset?**

    This dataset is widely used in machine learning tutorials and business analytics projects because it is small, interpretable, and directly applicable to real-world scenarios like retail customer analysis. It’s ideal for beginners learning clustering and for professionals prototyping segmentation strategies.

  3. Segments and demographic variables predicting Covid-19 protective behaviors.

    • plos.figshare.com
    xls
    Updated Jan 31, 2024
    Cite
    Stephen Coleman; Michael D. Slater; Phil Wright; Oliver Wright; Lauren Skardon; Gillian Hayes (2024). Segments and demographic variables predicting Covid-19 protective behaviors. [Dataset]. http://doi.org/10.1371/journal.pone.0296049.t006
    Explore at:
    Available download formats: xls
    Dataset updated
    Jan 31, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Stephen Coleman; Michael D. Slater; Phil Wright; Oliver Wright; Lauren Skardon; Gillian Hayes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Segments and demographic variables predicting Covid-19 protective behaviors.

  4. App Users Segmentation: Case Study

    • kaggle.com
    zip
    Updated Jun 12, 2023
    Cite
    Bhanupratap Biswas (2023). App Users Segmentation: Case Study [Dataset]. https://www.kaggle.com/datasets/bhanupratapbiswas/app-users-segmentation-case-study
    Explore at:
    Available download formats: zip (11584 bytes)
    Dataset updated
    Jun 12, 2023
    Authors
    Bhanupratap Biswas
    Description

    Here's a step-by-step guide on how to approach user segmentation for FitTrackr:

    Define your segmentation goals: Start by determining what you want to achieve with user segmentation. For example, you might want to identify the most engaged users, understand the demographics of your user base, or target specific user groups with personalized promotions.

    Gather data: Collect relevant data about your app users. This can include demographic information (age, gender, location), app usage data (frequency of app usage, time spent on different features), user behavior (types of workouts, goals set, achievements unlocked), and any other relevant data points available to you.

    Identify relevant segmentation variables: Based on the goals you defined, identify the key variables that will help you segment your user base effectively. For FitTrackr, potential variables could include age, gender, fitness goals (e.g., weight loss, muscle gain), workout preferences (e.g., cardio, strength training), and user engagement level.

    Segment the user base: Use clustering techniques or segmentation algorithms to divide your user base into distinct segments based on the identified variables. You can employ methods such as k-means clustering, hierarchical clustering, or even machine learning algorithms like decision trees or random forests.

    Analyze and profile each segment: Once the segmentation is done, analyze each segment to understand their characteristics, preferences, and needs. Create detailed user profiles for each segment, including demographic information, app usage patterns, fitness goals, and any other relevant attributes. This will help you tailor your marketing messages and app features to each segment's specific requirements.

    Develop targeted strategies: Based on the insights gained from user profiles, develop targeted marketing strategies and app features for each segment. For example, if you have a segment of users who primarily focus on weight loss, you might create personalized workout plans or send them motivational content related to weight management.

    Implement and evaluate: Implement the targeted strategies and monitor their effectiveness. Continuously evaluate and refine your segmentation approach based on user feedback, engagement metrics, and the achievement of your goals.
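
The "analyze and profile each segment" step could be sketched as below, assuming segment labels have already been assigned by whatever method was chosen. The field names (`segment`, `sessions_per_week`, `avg_minutes`) and the user rows are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean

def profile_segments(users, segment_key, metrics):
    """Summarize each segment by its size and the mean of each metric."""
    groups = defaultdict(list)
    for user in users:
        groups[user[segment_key]].append(user)
    return {
        segment: {
            "size": len(members),
            **{m: mean(u[m] for u in members) for m in metrics},
        }
        for segment, members in groups.items()
    }

# Hypothetical app users with pre-assigned segment labels.
users = [
    {"segment": "engaged", "sessions_per_week": 6, "avg_minutes": 40},
    {"segment": "engaged", "sessions_per_week": 5, "avg_minutes": 50},
    {"segment": "casual", "sessions_per_week": 1, "avg_minutes": 10},
    {"segment": "casual", "sessions_per_week": 2, "avg_minutes": 20},
]
profile = profile_segments(users, "segment", ["sessions_per_week", "avg_minutes"])
```

A per-segment summary like this is a natural starting point for the targeted strategies described in the guide, since it shows at a glance how the groups differ.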

  5. Audience Targeting Data | 330M+ Global Devices | Audience Data & Advertising...

    • datarade.ai
    .json, .csv
    Updated Feb 4, 2025
    Cite
    DRAKO (2025). Audience Targeting Data | 330M+ Global Devices | Audience Data & Advertising | API Delivery [Dataset]. https://datarade.ai/data-products/audience-targeting-data-330m-global-devices-audience-dat-drako
    Explore at:
    Available download formats: .json, .csv
    Dataset updated
    Feb 4, 2025
    Dataset authored and provided by
    DRAKO
    Area covered
    Czech Republic, Armenia, Curaçao, Russian Federation, Equatorial Guinea, Serbia, Namibia, Suriname, San Marino, Eritrea
    Description

    DRAKO is a Mobile Location Audience Targeting provider with a programmatic trading desk specialising in geolocation analytics and programmatic advertising. Through our customised approach, we offer business and consumer insights as well as addressable audiences for advertising.

    Mobile Location Data can be meaningfully transformed into Audience Targeting when used in conjunction with other datasets. Our expansive POI Data allows us to segment users by visitation to major brands and retailers and to categorize them into syndicated segments. Beyond POI visits, our proprietary Home Location Model determines residents of geographic areas such as Designated Market Areas, Counties, or States. Relatedly, our Home Location Model also fuels our Geodemographic Census Data segments, as we are able to determine residents of the smallest census units. Additionally, we have audiences of ticketed event and venue visitors, survey data, and retail data.

    All of our Audience Targeting is 100% deterministic in that it only includes high-quality, real visits to locations as defined by a POI's building contour in satellite imagery. We never use a radius when building an audience unless requested. We have a horizontal accuracy of 5m.

    Additionally, we can always cross reference your audience targeting with our syndicated segments:

    Overview of our Syndicated Audience Data Segments:
    - Brand/POI segments (specific named stores and locations)
    - Categories (behavioural segments - revealed habits)
    - Census demographic segments (HH income, race, religion, age, family structure, language, etc.)
    - Events segments (ticketed live events, conferences, and seminars)
    - Resident segments (State/province, CMAs, DMAs, city, county, sub-county)
    - Political segments (Canadian Federal and Provincial, US Congressional Upper and Lower House, US States, City elections, etc.)
    - Survey Data (Psychosocial/Demographic survey data)
    - Retail Data (Receipt/transaction data)

    All of our syndicated segments are customizable. That means you can limit them to people within a certain geography, remove employees, include only the most frequent visitors, define your own custom lookback, or extend our audiences using our Home, Work, and Social Extensions.

    In addition to our syndicated segments, we're also able to run custom queries that return all the Mobile Ad IDs (MAIDs) seen at a specific location (address; latitude and longitude; or WKT84 Polygon) or in your defined geographic area of interest (political districts, DMAs, Zip Codes, etc.).
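
To give a feel for what a polygon geofence query involves, here is a minimal point-in-polygon sketch using the standard ray-casting test. It is not the provider's implementation: the sighting records, the identifiers, and the rectangle are invented, and the test treats longitude and latitude as planar coordinates, which is only a reasonable approximation for small areas.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; `polygon` is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed by the ray.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside  # each crossing flips inside/outside
    return inside

# Hypothetical geofence (a small rectangle) and device sightings.
geofence = [(-79.40, 43.64), (-79.38, 43.64), (-79.38, 43.66), (-79.40, 43.66)]
sightings = [
    ("maid-a", -79.39, 43.65),
    ("maid-b", -79.50, 43.65),
    ("maid-a", -79.39, 43.651),
    ("maid-c", -79.385, 43.645),
]

# Unique device IDs seen at least once inside the geofence.
ids_in_fence = sorted({maid for maid, lon, lat in sightings if point_in_polygon(lon, lat, geofence)})
```

Collecting IDs into a set before sorting deduplicates devices that were seen inside the fence more than once.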

    Beyond just returning all the MAIDs seen within a geofence, we are also able to offer additional customizable advantages:
    - Average precision between 5 and 15 meters
    - CRM list activation + extension
    - Extend beyond Mobile Location Data (MAIDs) with our device graph
    - Filter by frequency of visitations
    - Home and Work targeting (retrieve only employees or residents of an address)
    - Home extensions (devices that reside in the same dwelling from your seed geofence)
    - Rooftop level address geofencing precision (no radius used EVER unless user specified)
    - Social extensions (devices in the same social circle as users in your seed geofence)
    - Turn analytics into addressable audiences
    - Work extensions (coworkers of users in your seed geofence)

    Data Compliance: All of our Audience Targeting Data is fully CCPA compliant and 100% sourced from SDKs (Software Development Kits), the most reliable and consistent mobile data stream with end user consent, available with only a 4-5 day delay. This means that our location and device ID data comes from partnerships with more than 1,500 mobile apps. This data comes with an associated location, which is how we are able to segment using geofences.

    Data Quality: In addition to partnering with trusted SDKs, DRAKO has additional screening methods to ensure that our mobile location data is consistent and reliable. This includes data harmonization and quality scoring from all of our partners in order to disregard MAIDs with a low quality score.

  6. Customer Segmentation for Targeted Campaigns

    • kaggle.com
    zip
    Updated May 21, 2024
    Cite
    Mani Devesh (2024). Customer Segmentation for Targeted Campaigns [Dataset]. https://www.kaggle.com/datasets/manidevesh/customer-sales-data
    Explore at:
    Available download formats: zip (914292 bytes)
    Dataset updated
    May 21, 2024
    Authors
    Mani Devesh
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Project Overview: Customer Segmentation Using K-Means Clustering

    Introduction

    In this project, I analysed customer data from a retail store to identify distinct customer segments. The dataset includes key attributes such as age, city, and total sales of the customers. By leveraging K-Means clustering, an unsupervised machine learning technique, I aim to group customers based on their age and sales metrics. These insights will enable the creation of targeted marketing campaigns tailored to the specific needs and behaviours of each customer segment.

    Objectives

    • Cluster Customers: Use K-Means clustering to group customers based on age and total sales.
    • Analyse Segments: Examine the characteristics of each customer segment.
    • Targeted Marketing: Develop strategies for personalized marketing campaigns targeting each identified customer group.

    Data Description

    The dataset comprises:

    • Age: The age of the customers.
    • City: The city where the customers reside.
    • Total Sales: The total sales generated by each customer.

    Methodology

    • Data Preprocessing: Clean and preprocess the data to handle any missing or inconsistent entries.
    • Feature Selection: Focus on age and total sales as primary features for clustering.
    • K-Means Clustering: Apply the K-Means algorithm to identify distinct customer segments.
    • Cluster Analysis: Analyse the resulting clusters to understand the demographic and sales characteristics of each group.
    • Marketing Strategy Development: Create targeted marketing strategies for each customer segment to enhance engagement and sales.

    Expected Outcomes

    • Customer Segments: Clear identification of customer groups based on age and purchasing behaviour.
    • Insights for Marketing: Detailed understanding of each segment to inform targeted marketing efforts.
    • Business Impact: Enhanced ability to tailor marketing campaigns, potentially leading to increased customer satisfaction and sales.

    By clustering customers based on age and total sales, this project aims to provide actionable insights for personalized marketing, ultimately driving better customer engagement and higher sales for the retail store.

  7. AI in Consumer Decision Making | Global Coverage | 190+ Countries

    • datarade.ai
    .json, .csv, .xls
    Updated Aug 21, 2025
    Cite
    Rwazi (2025). AI in Consumer Decision Making | Global Coverage | 190+ Countries [Dataset]. https://datarade.ai/data-products/ai-in-consumer-decision-making-global-coverage-190-count-rwazi
    Explore at:
    Available download formats: .json, .csv, .xls
    Dataset updated
    Aug 21, 2025
    Dataset authored and provided by
    Rwazi (http://rwazi.com/)
    Area covered
    United Kingdom
    Description

    AI in Consumer Decision-Making: Global Zero-Party Dataset

    This dataset captures how consumers around the world are using AI tools like ChatGPT, Perplexity, Gemini, Claude, and Copilot to guide their purchase decisions. It spans multiple product categories, demographics, and geographies, mapping the emerging role of AI as a decision-making companion across the consumer journey.

    What Makes This Dataset Unique

    Unlike datasets inferred from digital traces or modeled from third-party assumptions, this collection is built entirely on zero-party data: direct responses from consumers who voluntarily share their habits and preferences. That means the insights come straight from the people making the purchases, ensuring unmatched accuracy and relevance.

    For FMCG leaders, retailers, and financial services strategists, this dataset provides the missing piece: visibility into how often consumers are letting AI shape their decisions, and where that influence is strongest.

    Dataset Structure

    Each record is enriched with:
    - Product Category – from high-consideration items like electronics to daily staples such as groceries and snacks.
    - AI Tool Used – identifying whether consumers turn to ChatGPT, Gemini, Perplexity, Claude, or Copilot.
    - Influence Level – the percentage of consumers in a given context who rely on AI to guide their choices.
    - Demographics – generational breakdowns from Gen Z through Boomers.
    - Geographic Detail – city- and country-level coverage across Africa, LATAM, Asia, Europe, and North America.

    This structure allows filtering and comparison across categories, age groups, and markets, giving users a multidimensional view of AI’s impact on purchasing.
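
The kind of filtering and comparison described here might be sketched as follows. The actual schema is not published in this listing, so the field names (`category`, `ai_tool`, `generation`, `influence_pct`) and every value below are invented purely for illustration.

```python
from collections import defaultdict
from statistics import mean

def select(rows, **criteria):
    """Keep rows whose fields equal every given criterion."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]

def mean_by(rows, key, value):
    """Average `value` within each group defined by `key`."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[key]].append(r[value])
    return {group: mean(vals) for group, vals in groups.items()}

# Invented records; field names and numbers are assumptions, not real data.
records = [
    {"category": "electronics", "ai_tool": "ChatGPT", "generation": "Gen Z", "influence_pct": 60.0},
    {"category": "electronics", "ai_tool": "Gemini", "generation": "Boomers", "influence_pct": 20.0},
    {"category": "groceries", "ai_tool": "ChatGPT", "generation": "Gen Z", "influence_pct": 30.0},
    {"category": "electronics", "ai_tool": "ChatGPT", "generation": "Boomers", "influence_pct": 40.0},
]

electronics = select(records, category="electronics")
influence_by_generation = mean_by(electronics, "generation", "influence_pct")
```

Filtering first and aggregating second keeps each step simple; the same two helpers can compare markets or AI tools by changing the keyword argument and the grouping key.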

    Why It Matters

    AI has become a trusted voice in consumers’ daily lives. From meal planning to product comparisons, many people now consult AI before making a purchase—often without realizing how much it shapes the options they consider. For brands, this means that the path to purchase increasingly runs through an AI filter.

    This dataset provides a comprehensive view of that hidden step in the consumer journey, enabling decision-makers to quantify:
    - How much AI shapes consumer thinking before they even reach the shelf or checkout.
    - Which product categories are most influenced by AI consultation.
    - How adoption varies by geography and generation.
    - Which AI platforms are most commonly trusted by consumers.

    Opportunities for Business Leaders

    - FMCG & Retail Brands: Understand where AI-driven decision-making is already reshaping category competition.
    - Marketers: Identify demographic segments most likely to consult AI, enabling targeted strategies.
    - Retailers: Align assortments and promotions with the purchase patterns influenced by AI queries.
    - Investors & Innovators: Gauge market readiness for AI-integrated commerce solutions.

    The dataset doesn’t just describe what’s happening—it opens doors to the “so what” questions that define strategy. Which categories are becoming algorithm-driven? Which markets are shifting fastest? Where is the opportunity to get ahead of competitors in an AI-shaped funnel?

    Why Now

    Consumer AI adoption is no longer a forecast; it is a daily behavior. Just as search engines once rewrote the rules of marketing, conversational AI is quietly rewriting how consumers decide what to buy. This dataset offers an early, detailed view into that change, giving brands the ability to act while competitors are still guessing.

    What You Get

    Users gain:
    - A global, city-level view of AI adoption in consumer decision-making.
    - Cross-category comparability to see where AI influence is strongest and weakest.
    - Generational breakdowns that show how adoption differs between younger and older cohorts.
    - AI platform analysis, highlighting how tool preferences vary by region and category.

    Every row is powered by zero-party input, ensuring the insights reflect actual consumer behavior, not modeled assumptions.

    How It’s Used

    Leverage this data to:

    - Validate strategies before entering new markets or categories.
    - Benchmark competitors on AI readiness and influence.
    - Identify growth opportunities in categories where AI-driven recommendations are rapidly shaping decisions.
    - Anticipate risks where brand visibility could be disrupted by algorithmic mediation.

    Core Insights

    The full dataset reveals:
    - Surprising adoption curves across categories where AI wasn't expected to play a role.
    - Geographic pockets where AI has already become a standard step in purchase decisions.
    - Demographic contrasts showing who trusts AI most, and where skepticism still holds.
    - Clear differences between AI platforms and the consumer profiles most drawn to each.

    These patterns are not visible in traditional retail data, sales reports, or survey summaries. They are only captured here, directly from the consumers themselves.

    Summary

    Winning in FMCG and retail today means more than getting on shelves, capturing price points, or running promotions. It means understanding the invisible algorithms consumers are ...

  8. U.S. population by generation 2024

    • statista.com
    Updated Nov 19, 2025
    Cite
    Statista (2025). U.S. population by generation 2024 [Dataset]. https://www.statista.com/statistics/797321/us-population-by-generation/
    Explore at:
    Dataset updated
    Nov 19, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States
    Description

    Millennials were the largest generation group in the United States in 2024, with an estimated population of ***** million. Born between 1981 and 1996, Millennials recently surpassed Baby Boomers as the biggest group, and they will continue to be a major part of the population for many years.

    The rise of Generation Alpha

    Generation Alpha is the most recent generation to have been named, and many group members will not be able to remember a time before smartphones and social media. As of 2024, the oldest Generation Alpha members were still only entering adolescence. However, the group already makes up around ***** percent of the U.S. population, and they are said to be the most racially and ethnically diverse of all the generation groups.

    Boomers vs. Millennials

    The number of Baby Boomers, whose generation was defined by the boom in births following the Second World War, has fallen by around ***** million since 2010. However, they remain the second-largest generation group, and aging Boomers are contributing to steady increases in the median age of the population. Meanwhile, the Millennial generation continues to grow, and one reason for this is the increasing number of young immigrants arriving in the United States.

  9. Factors used to create segmentation and items comprising them.

    • plos.figshare.com
    ods
    Updated Jan 31, 2024
    Cite
    Stephen Coleman; Michael D. Slater; Phil Wright; Oliver Wright; Lauren Skardon; Gillian Hayes (2024). Factors used to create segmentation and items comprising them. [Dataset]. http://doi.org/10.1371/journal.pone.0296049.s002
    Explore at:
    Available download formats: ods
    Dataset updated
    Jan 31, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Stephen Coleman; Michael D. Slater; Phil Wright; Oliver Wright; Lauren Skardon; Gillian Hayes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Factors used to create segmentation and items comprising them.

  10. Vaccination status and past two-week protective behavior by segment.

    • plos.figshare.com
    xls
    Updated Jan 31, 2024
    Cite
    Stephen Coleman; Michael D. Slater; Phil Wright; Oliver Wright; Lauren Skardon; Gillian Hayes (2024). Vaccination status and past two-week protective behavior by segment. [Dataset]. http://doi.org/10.1371/journal.pone.0296049.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jan 31, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Stephen Coleman; Michael D. Slater; Phil Wright; Oliver Wright; Lauren Skardon; Gillian Hayes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Vaccination status and past two-week protective behavior by segment.

  11. Salmon Population

    • kaggle.com
    zip
    Updated Apr 26, 2024
    Cite
    MiddleHigh (2024). Salmon Population [Dataset]. https://www.kaggle.com/datasets/middlehigh/salmon-population
    Explore at:
    Available download formats: zip (779085 bytes)
    Dataset updated
    Apr 26, 2024
    Authors
    MiddleHigh
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides comprehensive information on global salmon populations, focusing on their decline in oceanic environments. It includes various data points collected over time to track and analyze trends in salmon populations. The key columns in this dataset are:

    SERIES - Internal code for dataset indicating Domain, species, and Status Review data set year and when applicable, method.

    NMFS_POPID - The unique numeric value for a population as determined by NMFS. This value will not change over time, even if the population name (NWR Population Name) does.

    RECOVERY_DOMAIN - Discrete geographic areas for which comprehensive recovery plans are being developed: Puget Sound, Willamette/Lower Columbia, Interior Columbia (including the Mid-Columbia, Upper Columbia, and Snake River sub-domains), Oregon Coast, and Southern Oregon/Northern California Coast.

    ESU - For populations listed under the federal ESA, this is the name of a defined Evolutionarily Significant Unit (ESU) or Distinct Population Segment (DPS) as defined by NMFS Northwest Region or by USFWS.

    MAJOR_POPULATION_GROUP - Major Population Group, as defined by the NWR. Groups of populations within an ESU/DPS that are more similar to each other than they are to other populations. They are based on similarities in genetic characteristics, demographic patterns and habitat types and on geographic structure.

    POPULATION_NAME - Legal given name for a listed population within the ESU.

    COMMON_POPULATION_NAME - Shortened population name

    DISPLAY_ORDER - Geographically based display order within ESUs.

    SPECIES - Salmon species name

    RUN_TIMING - Run of fish, generally determined on the basis of the time of year at which adults enter fresh water to spawn. (Spring, Summer, Spring/Summer, Fall, Winter, early, or late)

    STREAM_NAME - Name of the primary stream for the Population

    YEAR - Calendar year of return

    NUMBER_OF_SPAWNERS - Estimated number of natural origin (parents spawned in the wild) spawners contributing to spawning in a particular year. Includes both adults and jacks of natural origin (except for SR fall chinook which typically does have jack returns)

    FRACWILD - The fraction of the total spawners that are the progeny of naturally-spawning fish.

    CATCH - Terminal fishery harvest

    AGE_1_RETURNS - The fraction of fish who are defined as having an age of 1 that returned to spawn in a given year.

    AGE_2_RETURNS - The fraction of fish who are defined as having an age of 2 that returned to spawn in a given year.

    AGE_3_RETURNS - The fraction of fish who are defined as having an age of 3 that returned to spawn in a given year.

    AGE_4_RETURNS - The fraction of fish who are defined as having an age of 4 that returned to spawn in a given year.

    AGE_5_RETURNS - The fraction of fish who are defined as having an age of 5 that returned to spawn in a given year.

    AGE_6_RETURNS - The fraction of fish who are defined as having an age of 6 that returned to spawn in a given year.

    AGE_7_RETURNS - The fraction of fish who are defined as having an age of 7 that returned to spawn in a given year.
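
One thing the age columns make possible is a mean age at return for a given population and year. The sketch below assumes that the seven `AGE_n_RETURNS` values are shares of that year's returning spawners by age, which is one reading of the column descriptions and not something the listing confirms. The example row is invented, and the function name is hypothetical.

```python
# Map each age (1..7) to its column name as listed in the dataset description.
AGE_COLUMNS = {age: f"AGE_{age}_RETURNS" for age in range(1, 8)}

def mean_age_at_return(row):
    """Weighted mean age, normalizing in case the fractions do not sum to exactly 1."""
    total = sum(row.get(col) or 0.0 for col in AGE_COLUMNS.values())
    if total == 0:
        return None  # no age composition recorded for this row
    weighted = sum(age * (row.get(col) or 0.0) for age, col in AGE_COLUMNS.items())
    return weighted / total

# Invented row: 20% age-3, 50% age-4, 30% age-5 returns.
example_row = {"YEAR": 2000, "AGE_3_RETURNS": 0.2, "AGE_4_RETURNS": 0.5, "AGE_5_RETURNS": 0.3}
mean_age = mean_age_at_return(example_row)
```

Missing or empty age columns are treated as zero here, and rows with no age data return `None` so they can be filtered out before any trend analysis.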

    METHOD - Survey (spawning ground), Model (PIT tag data), or GSI (genetic stock inventory data), Ladder count (at dam)

    CITATION - Data source citation

    CONTRIBUTOR - Agency, Tribe or other entity responsible for these data that is the best contact for questions that may arise about this data record.

    DOCUMENT_CITATION - Citation of the document this dataset archive informed.

    CODE_LINK - Location of the code used to generate analysis for the document


  12. Segmentation and socio-demographic variables.

    • plos.figshare.com
    xls
    Updated Jun 14, 2023
    Cite
    Mauricio Carvache-Franco; Tahani Hassan; Orly Carvache-Franco; Wilmer Carvache-Franco; Olga Martin-Moreno (2023). Segmentation and socio-demographic variables. [Dataset]. http://doi.org/10.1371/journal.pone.0287113.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Mauricio Carvache-Franco; Tahani Hassan; Orly Carvache-Franco; Wilmer Carvache-Franco; Olga Martin-Moreno
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Food festivals have been a growing tourism sector in recent years due to their contributions to a region’s economic, marketing, brand, and social growth. This study analyses the demand for the Bahrain food festival. The stated objectives were: (i) to identify the motivational dimensions of the demand for the food festival, (ii) to determine the segments of the demand for the food festival, and (iii) to establish the relationship between the demand segments and socio-demographic aspects. The food festival investigated was the Bahrain Food Festival held in Bahrain, located on the east coast of the Persian Gulf. The sample consisted of 380 valid questionnaires and was taken using social networks from those attending the event. The statistical techniques used were factorial analysis and the K-means grouping method. The results show five motivational dimensions: Local food, Art, Entertainment, Socialization, and Escape and novelty. In addition, two segments were found; the first, Entertainment and novelties, is related to attendees who seek to enjoy the festive atmosphere and discover new restaurants. The second is Multiple motives, formed by attendees with several motivations simultaneously. This segment has the highest income and expenses, making it the most important group for developing plans and strategies. The results will contribute to the academic literature and the organizers of food festivals.

  13. Trust in information sources re Covid-19 guidance by segment.

    • plos.figshare.com
    xls
    Updated Jan 31, 2024
    Cite
    Stephen Coleman; Michael D. Slater; Phil Wright; Oliver Wright; Lauren Skardon; Gillian Hayes (2024). Trust in information sources re Covid-19 guidance by segment. [Dataset]. http://doi.org/10.1371/journal.pone.0296049.t005
    Available download formats: xls
    Dataset updated
    Jan 31, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Stephen Coleman; Michael D. Slater; Phil Wright; Oliver Wright; Lauren Skardon; Gillian Hayes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Trust in information sources re Covid-19 guidance by segment.

  14. Mini Demographic and Health Survey 2019 - Ethiopia

    • microdata.worldbank.org
    • catalog.ihsn.org
    • +1 more
    Updated May 11, 2021
    + more versions
    Cite
    Central Statistical Agency (CSA) (2021). Mini Demographic and Health Survey 2019 - Ethiopia [Dataset]. https://microdata.worldbank.org/index.php/catalog/3946
    Dataset updated
    May 11, 2021
    Dataset provided by
    Central Statistical Agency (https://ess.gov.et/)
    Ethiopian Public Health Institute (EPHI)
    Federal Ministry of Health (FMoH)
    Time period covered
    2019
    Area covered
    Ethiopia
    Description

    Abstract

    The 2019 Ethiopia Mini Demographic and Health Survey (EMDHS) is a nationwide survey with a nationally representative sample of 9,150 selected households. All women age 15-49 who were usual members of the selected households and those who spent the night before the survey in the selected households were eligible to be interviewed in the survey. In the selected households, all children under age 5 were eligible for height and weight measurements. The survey was designed to produce reliable estimates of key indicators at the national level as well as for urban and rural areas and each of the 11 regions in Ethiopia.

    The primary objective of the 2019 EMDHS is to provide up-to-date estimates of key demographic and health indicators. Specifically, the main objectives of the survey are:

    • To collect high-quality data on contraceptive use; maternal and child health; infant, child, and neonatal mortality levels; child nutrition; and other health issues relevant to achievement of the Sustainable Development Goals (SDGs)
    • To collect information on health-related matters such as breastfeeding, maternal and child care (antenatal, delivery, and postnatal), children’s immunizations, and childhood diseases
    • To assess the nutritional status of children under age 5 by measuring weight and height

    Geographic coverage

    National coverage

    Analysis unit

    • Household
    • Individual
    • Children age 0-5
    • Woman age 15-49
    • Health facility

    Universe

    The survey covered all de jure household members (usual residents), all women aged 15-49 and all children aged 0-5 resident in the household.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The sampling frame used for the 2019 EMDHS is a frame of all census enumeration areas (EAs) created for the 2019 Ethiopia Population and Housing Census (EPHC) and conducted by the Central Statistical Agency (CSA). The census frame is a complete list of the 149,093 EAs created for the 2019 EPHC. An EA is a geographic area covering an average of 131 households. The sampling frame contains information about EA location, type of residence (urban or rural), and estimated number of residential households.

    Administratively, Ethiopia is divided into nine geographical regions and two administrative cities. The sample for the 2019 EMDHS was designed to provide estimates of key indicators for the country as a whole, for urban and rural areas separately, and for each of the nine regions and the two administrative cities.

    The 2019 EMDHS sample was stratified and selected in two stages. Each region was stratified into urban and rural areas, yielding 21 sampling strata. Samples of EAs were selected independently in each stratum in two stages. Implicit stratification and proportional allocation were achieved at each of the lower administrative levels by sorting the sampling frame within each sampling stratum before sample selection, according to administrative units in different levels, and by using a probability proportional to size selection at the first stage of sampling.

    To ensure that survey precision was comparable across regions, sample allocation was done through an equal allocation wherein 25 EAs were selected from eight regions. However, 35 EAs were selected from each of the three larger regions: Amhara, Oromia, and the Southern Nations, Nationalities, and Peoples’ Region (SNNPR).
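The allocation described above can be checked with a few lines of arithmetic. This is only a sketch that restates figures from the survey description (25 EAs in each of eight regions, 35 EAs in each of the three larger regions, and the fixed take of 30 households per cluster used in the second stage); the variable names are illustrative assumptions.

```python
# Equal allocation with a larger take in the three biggest regions.
# All figures come from the survey description; the names are illustrative.
SMALL_REGIONS = 8       # regions allocated 25 EAs each
LARGE_REGIONS = 3       # Amhara, Oromia, SNNPR: 35 EAs each
EAS_PER_SMALL = 25
EAS_PER_LARGE = 35
HOUSEHOLDS_PER_EA = 30  # fixed number of households per cluster (second stage)

total_eas = SMALL_REGIONS * EAS_PER_SMALL + LARGE_REGIONS * EAS_PER_LARGE
total_households = total_eas * HOUSEHOLDS_PER_EA

print(total_eas)         # 305
print(total_households)  # 9150
```

Both totals agree with the 305 EAs and 9,150 selected households reported for the survey.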

    In the first stage, a total of 305 EAs (93 in urban areas and 212 in rural areas) were selected with probability proportional to EA size (based on the 2019 EPHC frame) and with independent selection in each sampling stratum. A household listing operation was carried out in all selected EAs from January through April 2019. The resulting lists of households served as a sampling frame for the selection of households in the second stage. Some of the selected EAs for the 2019 EMDHS were large, with more than 300 households. To minimise the task of household listing, each large EA selected for the 2019 EMDHS was segmented. Only one segment was selected for the survey, with probability proportional to segment size. Household listing was conducted only in the selected segment; that is, a 2019 EMDHS cluster is either an EA or a segment of an EA.

    In the second stage of selection, a fixed number of 30 households per cluster were selected with an equal probability systematic selection from the newly created household listing. All women age 15-49 who were either permanent residents of the selected households or visitors who slept in the household the night before the survey were eligible to be interviewed. In all selected households, height and weight measurements were collected from children age 0-59 months, and women age 15-49 were interviewed using the Woman’s Questionnaire.
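The two selection stages can be sketched roughly as follows. This is a minimal illustration, not the survey's actual procedure: the frame sizes are randomly generated, the function names `pps_systematic` and `systematic_sample` are hypothetical, and stratification, implicit sorting, and segmentation of large EAs are left out.

```python
import random


def pps_systematic(sizes, n, rng):
    """Pick n unit indices with probability proportional to size.

    Systematic PPS: lay the units end to end on a line of length sum(sizes)
    and take every step-th point after a random start. A unit larger than
    the step could be hit more than once; the demo sizes avoid that.
    """
    step = sum(sizes) / n
    start = rng.random() * step
    chosen, cum, idx = [], 0.0, 0
    for i in range(n):
        target = start + i * step
        # Advance until the target falls inside the current unit's interval.
        while idx < len(sizes) - 1 and cum + sizes[idx] <= target:
            cum += sizes[idx]
            idx += 1
        chosen.append(idx)
    return chosen


def systematic_sample(items, n, rng):
    """Equal-probability systematic selection of n items from an ordered list."""
    step = len(items) / n
    start = rng.random() * step
    last = len(items) - 1
    return [items[min(int(start + i * step), last)] for i in range(n)]


rng = random.Random(2019)

# Hypothetical stratum frame: 500 EAs with 80 to 200 households each.
ea_sizes = [rng.randint(80, 200) for _ in range(500)]

# Stage 1: 25 EAs with probability proportional to size.
selected_eas = pps_systematic(ea_sizes, 25, rng)

# Stage 2: 30 households from the listing of the first selected EA.
first_ea = selected_eas[0]
listing = [f"EA{first_ea}-HH{j}" for j in range(ea_sizes[first_ea])]
households = systematic_sample(listing, 30, rng)

print(len(selected_eas), len(households))
```

Systematic selection from a sorted frame is what produces the implicit stratification mentioned in the description: units that are adjacent in the sort order are spread evenly across the sample.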

    For further details on sample selection, see Appendix A of the final report.

    Mode of data collection

    Computer Assisted Personal Interview [capi]

    Research instrument

    Five questionnaires were used for the 2019 EMDHS: (1) the Household Questionnaire, (2) the Woman’s Questionnaire, (3) the Anthropometry Questionnaire, (4) the Health Facility Questionnaire, and (5) the Fieldworker’s Questionnaire. These questionnaires, based on The DHS Program’s standard questionnaires, were adapted to reflect the population and health issues relevant to Ethiopia. They were shortened substantially to collect data on indicators of particular relevance to Ethiopia and donors to child health programmes.

    Cleaning operations

    All electronic data files were transferred via the secure internet file streaming system (IFSS) to the EPHI central office in Addis Ababa, where they were stored on a password-protected computer. The data processing operation included secondary editing, which required resolution of computer-identified inconsistencies and coding of open-ended questions. The data were processed by EPHI staff members and an ICF consultant who took part in the main fieldwork training. They were supervised remotely by staff from The DHS Program. Data editing was accomplished using CSPro System software. During the fieldwork, field-check tables were generated to check various data quality parameters, and specific feedback was given to the teams to improve performance. Secondary editing, double data entry from both the anthropometry and health facility questionnaires, and data processing were initiated in April 2019 and completed in July 2019.

    Response rate

    A total of 9,150 households were selected for the sample, of which 8,794 were occupied. Of the occupied households, 8,663 were successfully interviewed, yielding a response rate of 99%.

    In the interviewed households, 9,012 eligible women were identified for individual interviews; interviews were completed with 8,885 women, yielding a response rate of 99%. Overall, there was little variation in response rates according to residence; however, rates were slightly higher in rural than in urban areas.
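The two reported response rates follow directly from the counts given above. The helper name `response_rate` is an illustrative assumption; the counts are taken from the description.

```python
def response_rate(completed, eligible):
    """Completed interviews as a percentage of eligible units."""
    return 100.0 * completed / eligible


household_rate = response_rate(8663, 8794)  # interviewed / occupied households
women_rate = response_rate(8885, 9012)      # interviewed / eligible women

print(f"{household_rate:.1f}%")  # 98.5%
print(f"{women_rate:.1f}%")      # 98.6%
```

Both values round to the 99% quoted in the text.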

    Sampling error estimates

    The estimates from a sample survey are affected by two types of errors: nonsampling errors and sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2019 Ethiopia Mini Demographic and Health Survey (EMDHS) to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.

    Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2019 EMDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability among all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.

    Sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
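As a rough illustration of the plus-or-minus two standard errors rule, the sketch below builds an approximate 95% interval for a proportion. The estimate and sample size are hypothetical, and the simple-random-sample formula is used only for illustration; the survey's own standard errors account for its complex design.

```python
import math


def approx_95_ci(estimate, standard_error):
    """Approximate 95% interval: estimate plus or minus two standard errors."""
    return estimate - 2 * standard_error, estimate + 2 * standard_error


def srs_se_proportion(p, n):
    """Standard error of a proportion under simple random sampling (assumption)."""
    return math.sqrt(p * (1 - p) / n)


p, n = 0.40, 1000  # hypothetical estimate and sample size
se = srs_se_proportion(p, n)
low, high = approx_95_ci(p, se)

print(f"SE = {se:.4f}, interval = ({low:.3f}, {high:.3f})")
```

With these hypothetical inputs the interval runs from roughly 0.369 to 0.431.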

    If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2019 EMDHS sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulas. Sampling errors are computed in SAS, using programs developed by ICF. These programs use the Taylor linearization method to estimate variances for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
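The idea behind jackknife repeated replication can be sketched as a delete-one-cluster jackknife. This is a generic illustration under simplifying assumptions (a single stratum, made-up cluster totals, hypothetical function names), not the SAS programs developed by ICF.

```python
def jackknife_variance(clusters, statistic):
    """Delete-one-cluster jackknife variance of statistic(clusters).

    Each replicate drops one cluster and recomputes the statistic; the
    variance is estimated from the spread of the resulting pseudo-values.
    """
    k = len(clusters)
    full = statistic(clusters)
    pseudo_values = []
    for i in range(k):
        replicate = clusters[:i] + clusters[i + 1:]
        pseudo_values.append(k * full - (k - 1) * statistic(replicate))
    return sum((v - full) ** 2 for v in pseudo_values) / (k * (k - 1))


def ratio(clusters):
    """Ratio estimate sum(y) / sum(x) over (y, x) cluster totals."""
    return sum(y for y, _ in clusters) / sum(x for _, x in clusters)


# Hypothetical cluster totals: (events, exposure) per cluster.
clusters = [(12, 30), (9, 28), (15, 31), (11, 30), (8, 27)]

variance = jackknife_variance(clusters, ratio)
print(f"ratio = {ratio(clusters):.4f}, SE = {variance ** 0.5:.4f}")
```

When every cluster has the same denominator, the pseudo-values reduce to the cluster values themselves and the estimate equals the sample variance divided by the number of clusters, which is a quick sanity check on the formula.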

    Note: A more detailed description of estimates of sampling errors is presented in APPENDIX B of the survey report.

    Data appraisal

    Data Quality Tables

    • Household age distribution

    • Age distribution of eligible and interviewed women

  15. RGB Image Pine-seedling Dataset: Three Population with half-sib structure,...

    • figshare.com
    zip
    Updated Jun 19, 2025
    Cite
    Jiri Chuchlík; Jaroslav Čepl; Eva Neuwirthová; Jan Stejskal; Jiří Korecký (2025). RGB Image Pine-seedling Dataset: Three Population with half-sib structure, dataset for segmentation model training and data of mean seedlings' color [Dataset]. http://doi.org/10.6084/m9.figshare.28239326.v2
    Available download formats: zip
    Dataset updated
    Jun 19, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jiri Chuchlík; Jaroslav Čepl; Eva Neuwirthová; Jan Stejskal; Jiří Korecký
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The datasets contain RGB photos of Scots pine seedlings of three populations from two different ecotypes originating in the Czech Republic:

    • Plasy - lowland ecotype
    • Trebon - lowland ecotype
    • Decin - upland ecotype

    These photos were taken in three different periods (September 10th 2021, October 23rd 2021, January 22nd 2022).

    File dataset_for_YOLOv7_training.zip contains image data with annotations for training YOLOv7 segmentation model (training and validation sets).

    The dataset also contains a table with information on individual Scots pine seedlings:

    • affiliation to parent tree (mum)
    • affiliation to population (site)
    • row and column in which the seedling was grown (row, col)
    • affiliation to the planter in which the seedling was grown (box)
    • mean RGB values of pine seedling in three different periods (B_september, G_september, R_september, B_october, G_october, R_october, B_january, G_january, R_january)
    • mean HSV values of pine seedling in three different periods (H_september, S_september, V_september, H_october, S_october, V_october, H_january, S_january, V_january)

  16. Multidimensional scaling for preliminary assessment of segment...

    • plos.figshare.com
    zip
    Updated Jan 31, 2024
    Cite
    Stephen Coleman; Michael D. Slater; Phil Wright; Oliver Wright; Lauren Skardon; Gillian Hayes (2024). Multidimensional scaling for preliminary assessment of segment interpretability. [Dataset]. http://doi.org/10.1371/journal.pone.0296049.s001
    Available download formats: zip
    Dataset updated
    Jan 31, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Stephen Coleman; Michael D. Slater; Phil Wright; Oliver Wright; Lauren Skardon; Gillian Hayes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Multidimensional scaling for preliminary assessment of segment interpretability.

  17. Assessing the validity of a data driven segmentation approach: A 4 year...

    • plos.figshare.com
    docx
    Updated May 31, 2023
    Cite
    Lian Leng Low; Shi Yan; Yu Heng Kwan; Chuen Seng Tan; Julian Thumboo (2023). Assessing the validity of a data driven segmentation approach: A 4 year longitudinal study of healthcare utilization and mortality [Dataset]. http://doi.org/10.1371/journal.pone.0195243
    Available download formats: docx
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Lian Leng Low; Shi Yan; Yu Heng Kwan; Chuen Seng Tan; Julian Thumboo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Segmentation of heterogeneous patient populations into parsimonious and relatively homogenous groups with similar healthcare needs can facilitate healthcare resource planning and development of effective integrated healthcare interventions for each segment. We aimed to apply a data-driven, healthcare utilization-based clustering analysis to segment a regional health system patient population and validate its discriminative ability on 4-year longitudinal healthcare utilization and mortality data.

    Methods: We extracted data from the Singapore Health Services Electronic Health Intelligence System, an electronic medical record database that included healthcare utilization (inpatient admissions, specialist outpatient clinic visits, emergency department visits, and primary care clinic visits), mortality, diseases, and demographics for all adult Singapore residents who resided in and had a healthcare encounter with our regional health system in 2012. Hierarchical clustering analysis (Ward’s linkage) and K-means cluster analysis using age and healthcare utilization data in 2012 were applied to segment the selected population. These segments were compared using their demographics (other than age) and morbidities in 2012, and longitudinal healthcare utilization and mortality from 2013–2016.

    Results: Among 146,999 subjects, five distinct patient segments “Young, healthy”; “Middle age, healthy”; “Stable, chronic disease”; “Complicated chronic disease” and “Frequent admitters” were identified. Healthcare utilization patterns in 2012, morbidity patterns and demographics differed significantly across all segments. The “Frequent admitters” segment had the smallest number of patients (1.79% of the population) but consumed 69% of inpatient admissions, 77% of specialist outpatient visits, 54% of emergency department visits, and 23% of primary care clinic visits in 2012. 11.5% and 31.2% of this segment have end stage renal failure and malignancy respectively. The validity of cluster-analysis derived segments is supported by discriminative ability for longitudinal healthcare utilization and mortality from 2013–2016. Incident rate ratios for healthcare utilization and Cox hazards ratio for mortality increased as patient segments increased in complexity. Patients in the “Frequent admitters” segment accounted for a disproportionate share of healthcare utilization and had an 8.16 times higher mortality rate.

    Conclusion: Our data-driven clustering analysis on a general patient population in Singapore identified five patient segments with distinct longitudinal healthcare utilization patterns and mortality risk to provide an evidence-based segmentation of a regional health system’s healthcare needs.
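The K-means step named in the Methods can be sketched in plain Python. This is a minimal illustration of Lloyd's algorithm on made-up (age, annual visits) pairs with hand-picked starting centroids; it is not the authors' pipeline, which also used Ward's hierarchical clustering, and in practice the features would normally be standardized first.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iterations=20):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(dim) / len(members)
                                     for dim in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return labels, centroids


# Hypothetical patients: (age, healthcare visits in a year).
patients = [(25, 1), (30, 2), (28, 1), (70, 12), (75, 15), (68, 10)]
start = [(25, 1), (70, 12)]  # hand-picked initial centroids (assumption)

labels, centroids = kmeans(patients, start)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The result depends on the starting centroids, which is one reason a hierarchical method is often run first to suggest the number of clusters and sensible starting points.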

  18. Demographic data: Gender and race distribution, and mean values with...

    • plos.figshare.com
    xls
    Updated Jun 5, 2023
    Cite
    Stefan Maetschke; Bhavna Antony; Hiroshi Ishikawa; Gadi Wollstein; Joel Schuman; Rahil Garnavi (2023). Demographic data: Gender and race distribution, and mean values with standard deviations and ranges for age, IOP, MD and GHT. [Dataset]. http://doi.org/10.1371/journal.pone.0219126.t001
    Available download formats: xls
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Stefan Maetschke; Bhavna Antony; Hiroshi Ishikawa; Gadi Wollstein; Joel Schuman; Rahil Garnavi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Demographic data: Gender and race distribution, and mean values with standard deviations and ranges for age, IOP, MD and GHT.

  19. Data_Sheet_1_The Effect of Training Sample Size on the Prediction of White...

    • frontiersin.figshare.com
    zip
    Updated May 31, 2023
    + more versions
    Cite
    Niklas Wulms; Lea Redmann; Christine Herpertz; Nadine Bonberg; Klaus Berger; Benedikt Sundermann; Heike Minnerup (2023). Data_Sheet_1_The Effect of Training Sample Size on the Prediction of White Matter Hyperintensity Volume in a Healthy Population Using BIANCA.zip [Dataset]. http://doi.org/10.3389/fnagi.2021.720636.s001
    Available download formats: zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    Frontiers
    Authors
    Niklas Wulms; Lea Redmann; Christine Herpertz; Nadine Bonberg; Klaus Berger; Benedikt Sundermann; Heike Minnerup
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: White matter hyperintensities of presumed vascular origin (WMH) are an important magnetic resonance imaging marker of cerebral small vessel disease and are associated with cognitive decline, stroke, and mortality. Their relevance in healthy individuals, however, is less clear. This is partly due to the methodological challenge of accurately measuring rare and small WMH with automated segmentation programs. In this study, we tested whether WMH volumetry with FMRIB software library v6.0 (FSL; https://fsl.fmrib.ox.ac.uk/fsl/fslwiki) Brain Intensity AbNormality Classification Algorithm (BIANCA), a customizable and trainable algorithm that quantifies WMH volume based on individual data training sets, can be optimized for a normal aging population.

    Methods: We evaluated the effect of varying training sample sizes on the accuracy and the robustness of the predicted white matter hyperintensity volume in a population (n = 201) with a low prevalence of confluent WMH and a substantial proportion of participants without WMH. BIANCA was trained with seven different sample sizes between 10 and 40 with increments of 5. For each sample size, 100 random samples of T1w and FLAIR images were drawn and trained with manually delineated masks. For validation, we defined an internal and external validation set and compared the mean absolute error, resulting from the difference between manually delineated and predicted WMH volumes for each set. For spatial overlap, we calculated the Dice similarity index (SI) for the external validation cohort.

    Results: The study population had a median WMH volume of 0.34 ml (IQR of 1.6 ml) and included n = 28 (18%) participants without any WMH. The mean absolute error of the difference between BIANCA prediction and manually delineated masks was minimized and became more robust with an increasing number of training participants. The lowest mean absolute error of 0.05 ml (SD of 0.24 ml) was identified in the external validation set with a training sample size of 35. Compared to the volumetric overlap, the spatial overlap was poor with an average Dice similarity index of 0.14 (SD 0.16) in the external cohort, driven by subjects with very low lesion volumes.

    Discussion: We found that the performance of BIANCA, particularly the robustness of predictions, could be optimized for use in populations with a low WMH load by enlargement of the training sample size. Further work is needed to evaluate and potentially improve the prediction accuracy for low lesion volumes. These findings are important for current and future population-based studies with the majority of participants being normal aging people.
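The two validation measures named in the abstract, the mean absolute error of predicted volumes and the Dice similarity index for spatial overlap, can be sketched as follows. The voxel coordinates and volumes are made up, the function names are illustrative, and treating two empty masks as a perfect match is an assumption rather than something the study states.

```python
def dice_index(mask_a, mask_b):
    """Dice similarity index 2|A and B| / (|A| + |B|) for two sets of voxels."""
    if not mask_a and not mask_b:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))


def mean_absolute_error(predicted, reference):
    """Average absolute difference between paired predicted and reference values."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


# Hypothetical lesion masks as sets of (x, y, z) voxel coordinates.
manual_mask = {(1, 1, 1), (1, 1, 2), (1, 2, 1)}
predicted_mask = {(1, 1, 2), (1, 2, 1), (2, 2, 2)}

# Hypothetical WMH volumes in ml for three participants.
predicted_ml = [0.30, 0.10, 1.20]
manual_ml = [0.34, 0.00, 1.50]

overlap = dice_index(manual_mask, predicted_mask)      # 2 * 2 / (3 + 3)
volume_error = mean_absolute_error(predicted_ml, manual_ml)
print(f"Dice = {overlap:.3f}, MAE = {volume_error:.3f} ml")
```

The example also shows why the two measures can disagree: a prediction can match the total volume closely while placing voxels in the wrong locations, which keeps the volume error small but the Dice index low.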

  20. Parameter values, sample properties and demographic models for the...

    • plos.figshare.com
    xls
    Updated Jan 21, 2025
    Cite
    Zhendong Huang; Jerome Kelleher; Yao-ban Chan; David Balding (2025). Parameter values, sample properties and demographic models for the simulation study. [Dataset]. http://doi.org/10.1371/journal.pgen.1011537.t001
    Available download formats: xls
    Dataset updated
    Jan 21, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Zhendong Huang; Jerome Kelleher; Yao-ban Chan; David Balding
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Unless otherwise stated, 25 simulation replicates were generated in each scenario. Model Ga is used for inferences given true IBD and Model Gb is used for inferences from inferred IBD. The value of r is assumed known for all inferences, whereas μ, ϵ and N(g), g ≥ 0, are targets of inference.

Cite
Stephen Coleman; Michael D. Slater; Phil Wright; Oliver Wright; Lauren Skardon; Gillian Hayes (2024). Demographic profile of audience segments. [Dataset]. http://doi.org/10.1371/journal.pone.0296049.t001

Demographic profile of audience segments.

Available download formats: xls
Dataset updated
Jan 31, 2024
Dataset provided by
PLOS (http://plos.org/)
Authors
Stephen Coleman; Michael D. Slater; Phil Wright; Oliver Wright; Lauren Skardon; Gillian Hayes
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Pandemics such as Covid-19 pose tremendous public health communication challenges in promoting protective behaviours, vaccination, and educating the public about risks. Segmenting audiences based on attitudes and behaviours is a means to increase the precision and potential effectiveness of such communication. The present study reports on such an audience segmentation effort for the population of England, sponsored by the United Kingdom Health Security Agency (UKHSA) and involving a collaboration of market research and academic experts. A cross-sectional online survey was conducted between 4 and 24 January 2022 with 5525 respondents (5178 used in our analyses) in England using a market research opt-in panel. An additional 105 telephone interviews were conducted to sample persons without online or smartphone access. Respondents were quota sampled to be demographically representative. The primary analytic technique was k-means cluster analysis, supplemented with other techniques including multi-dimensional scaling and use of respondent- as well as sample-standardized data when necessary to address differences in response set for some groups of respondents. Identified segments were profiled against demographic, behavioural self-report, attitudinal, and communication channel variables, with differences by segment tested for statistical significance. Seven segments were identified, including distinctly different groups of persons who tended toward a high level of compliance and several that were relatively low in compliance. The segments were characterized by distinctive patterns of demographics, attitudes, behaviours, trust in information sources, and communication channels preferred. Segments were further validated by comparing the segmentation variable versus a set of demographic variables as predictors of reported protective behaviours in the past two weeks and of vaccine refusal; the demographics together had about one-quarter the effect size of the single seven-level segment variable. With respect to managerial implications, different communication strategies are suggested for each segment, illustrating advantages of rich segmentation descriptions for understanding public health communication audiences. Strengths and weaknesses of the methods used are discussed, to help guide future efforts.
