48 datasets found
  1. Annual Population Survey Two-Year Longitudinal Dataset, January 2020 -...

    • beta.ukdataservice.ac.uk
    • datacatalogue.cessda.eu
    Updated 2023
    Cite
    Office For National Statistics (2023). Annual Population Survey Two-Year Longitudinal Dataset, January 2020 - December 2021 [Dataset]. http://doi.org/10.5255/ukda-sn-8984-4
    Dataset updated
    2023
    Dataset provided by
    DataCite (https://www.datacite.org/)
    UK Data Service (https://ukdataservice.ac.uk/)
    Authors
    Office For National Statistics
    Description

    The Annual Population Survey (APS) is a major survey series, which aims to provide data that can produce reliable estimates at local authority level. Key topics covered in the survey include education, employment, health and ethnicity. The APS comprises key variables from the Labour Force Survey (LFS), all its associated LFS boosts and the APS boost.

    The APS allows for analysis to be carried out on detailed subgroups and below regional level. In recent years (particularly with the sample size of the LFS 5 quarter dataset reducing) there has been some interest in producing a two year APS longitudinal dataset to look at any trends that may occur over a year. The APS Two-Year Longitudinal Datasets, covering 2012/13 onwards, have been deposited as a result of this work. Person- and Household-level APS datasets are also available.

    For further detailed information about methodology, users should consult the Labour Force Survey User Guide, included with the APS documentation.

    Occupation data for 2021 and 2022
    The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. None of ONS' headline statistics, other than those directly sourced from occupational data, are affected and you can continue to rely on their accuracy. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022

    Latest edition information

    For the fourth edition (September 2023), a new version of the data file with revised SOC variables was deposited. Further information on the SOC revisions can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022.

  2. NYSERDA Low- to Moderate-Income New York State Census Population Analysis...

    • catalog.data.gov
    • datasets.ai
    • +3more
    Updated Nov 29, 2021
    Cite
    data.ny.gov (2021). NYSERDA Low- to Moderate-Income New York State Census Population Analysis Dataset: Average for 2013-2015 [Dataset]. https://catalog.data.gov/dataset/nyserda-low-to-moderate-income-new-york-state-census-population-analysis-dataset-aver-2013
    Dataset updated
    Nov 29, 2021
    Dataset provided by
    data.ny.gov
    Area covered
    New York
    Description

    How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.

    The Low- to Moderate-Income (LMI) New York State (NYS) Census Population Analysis dataset results from the LMI market database designed by APPRISE as part of the NYSERDA LMI Market Characterization Study (https://www.nyserda.ny.gov/lmi-tool). All data are derived from the U.S. Census Bureau’s American Community Survey (ACS) 1-year Public Use Microdata Sample (PUMS) files for 2013, 2014, and 2015.

    Each row in the LMI dataset is an individual record for a household that responded to the survey and each column is a variable of interest for analyzing the low- to moderate-income population.

    The LMI dataset includes: county/county group, households with elderly, households with children, economic development region, income groups, percent of poverty level, low- to moderate-income groups, household type, non-elderly disabled indicator, race/ethnicity, linguistic isolation, housing unit type, owner-renter status, main heating fuel type, home energy payment method, housing vintage, LMI study region, LMI population segment, mortgage indicator, time in home, head of household education level, head of household age, and household weight.

    The LMI NYS Census Population Analysis dataset is intended for users who want to explore the underlying data that supports the LMI Analysis Tool. The majority of those interested in LMI statistics and generating custom charts should use the interactive LMI Analysis Tool at https://www.nyserda.ny.gov/lmi-tool. This underlying LMI dataset is intended for users with experience working with survey data files and producing weighted survey estimates using statistical software packages (such as SAS, SPSS, or Stata).
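    As a rough illustration of what "weighted survey estimates" means for a file with one row per responding household and a household weight column, the sketch below computes a weighted share. The column names `has_elderly` and `household_weight` and all values are hypothetical placeholders, not the dataset's actual field names or data.

```python
def weighted_share(rows, flag, weight="household_weight"):
    """Return the weighted proportion of households where `flag` is truthy."""
    total = 0.0
    hits = 0.0
    for row in rows:
        w = float(row[weight])
        total += w
        if row[flag]:
            hits += w
    if total == 0:
        raise ValueError("sum of weights is zero")
    return hits / total


# Made-up records for illustration only; not real NYSERDA data.
households = [
    {"has_elderly": True, "household_weight": 120.0},
    {"has_elderly": False, "household_weight": 80.0},
    {"has_elderly": True, "household_weight": 40.0},
    {"has_elderly": False, "household_weight": 160.0},
]

share = weighted_share(households, "has_elderly")
print(f"Weighted share of households with elderly members: {share:.1%}")
```

    In this toy sample the unweighted share would be 2 of 4 households, while the weighted share differs because each record stands in for a different number of households.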

  3. GlobPOP: A 33-year (1990-2022) global gridded population dataset (Version...

    • zenodo.org
    tiff
    Updated Sep 4, 2024
    Cite
    Luling Liu; Xin Cao; Shijie Li; Na Jie (2024). GlobPOP: A 33-year (1990-2022) global gridded population dataset (Version 2.0-test-beta) [Dataset]. http://doi.org/10.5281/zenodo.11071404
    Available download formats
    tiff
    Dataset updated
    Sep 4, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Luling Liu; Xin Cao; Shijie Li; Na Jie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data Usage Notice

    This version is not recommended for download. Please check the newest version.

    We would like to inform you that the updated GlobPOP dataset (2021-2022) is now available in version 2.0. The GlobPOP dataset (2021-2022) in the current version is not recommended for your work. The GlobPOP dataset (1990-2020) in the current version is the same as version 1.0.

    Thank you for your continued support of the GlobPOP.

    If you encounter any issues, please contact us via email at lulingliu@mail.bnu.edu.cn.

    Introduction

    Continuously monitoring global population spatial dynamics is essential for implementing effective policies related to sustainable development, such as epidemiology, urban planning, and global inequality.

    Here, we present GlobPOP, a new continuous global gridded population product with a high-precision spatial resolution of 30 arcseconds from 1990 to 2020. Our data-fusion framework is based on cluster analysis and statistical learning approaches, and fuses five existing products (Global Human Settlements Layer Population (GHS-POP), Global Rural Urban Mapping Project (GRUMP), Gridded Population of the World Version 4 (GPWv4), LandScan Population datasets and WorldPop datasets) into a new continuous global gridded population product (GlobPOP). The spatial validation results demonstrate that the GlobPOP dataset is highly accurate. To validate the temporal accuracy of GlobPOP at the country level, we have developed an interactive web application, accessible at https://globpop.shinyapps.io/GlobPOP/, where data users can explore the country-level population time-series curves of interest and compare them with census data.

    With the availability of GlobPOP dataset in both population count and population density formats, researchers and policymakers can leverage our dataset to conduct time-series analysis of population and explore the spatial patterns of population development at various scales, ranging from national to city level.

    Data description

    The product is produced at 30 arc-second resolution (approximately 1 km at the equator) and is made available in GeoTIFF format. There are two population formats: 'Count' (population count per grid cell) and 'Density' (population count per square kilometer in each grid cell).

    Each GeoTIFF filename has 5 fields that are separated by an underscore "_". A filename extension follows these fields. The fields are described below with the example filename:

    GlobPOP_Count_30arc_1990_I32

    Field 1: GlobPOP (Global gridded population)
    Field 2: Pixel unit is population "Count" or population "Density"
    Field 3: Spatial resolution is 30 arc seconds
    Field 4: Year "1990"
    Field 5: Data type is I32 (Int32) or F32 (Float32)
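
    The five-field naming scheme above can be parsed mechanically. The following is a minimal sketch, assuming the fields appear exactly as in the example filename; the `GlobPopName` class, the `parse_globpop_name` function, and the `.tif` extension in the usage line are illustrative assumptions, not part of the dataset's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobPopName:
    product: str      # Field 1, e.g. "GlobPOP"
    unit: str         # Field 2, "Count" or "Density"
    resolution: str   # Field 3, e.g. "30arc"
    year: int         # Field 4, e.g. 1990
    dtype: str        # Field 5, "I32" or "F32"


def parse_globpop_name(filename: str) -> GlobPopName:
    """Split a GlobPOP filename into its five underscore-separated fields."""
    # Drop a trailing extension if one is present.
    stem = filename.rsplit(".", 1)[0] if "." in filename else filename
    parts = stem.split("_")
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {filename!r}")
    product, unit, resolution, year, dtype = parts
    if unit not in {"Count", "Density"}:
        raise ValueError(f"unexpected pixel unit: {unit!r}")
    if dtype not in {"I32", "F32"}:
        raise ValueError(f"unexpected data type: {dtype!r}")
    return GlobPopName(product, unit, resolution, int(year), dtype)


# The extension here is assumed for illustration.
info = parse_globpop_name("GlobPOP_Count_30arc_1990_I32.tif")
print(info)
```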

    More information

    Please refer to the paper for detailed information:

    Liu, L., Cao, X., Li, S. et al. A 31-year (1990–2020) global gridded population dataset generated by cluster analysis and statistical learning. Sci Data 11, 124 (2024). https://doi.org/10.1038/s41597-024-02913-0.

    The fully reproducible codes are publicly available at GitHub: https://github.com/lulingliu/GlobPOP.

  4. Data from: Assessing cetacean populations using integrated population...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Mar 13, 2020
    Cite
    Eiren Jacobson; Charlotte Boyd; Tamara McGuire; Kim Shelden; Gina Himes Boor; André Punt (2020). Assessing cetacean populations using integrated population models: an example with Cook Inlet beluga whales [Dataset]. http://doi.org/10.5061/dryad.9zw3r229w
    Available download formats
    zip
    Dataset updated
    Mar 13, 2020
    Dataset provided by
    Montana State University
    National Oceanic and Atmospheric Administration
    Cook Inlet Beluga Whale Photo ID Project-Alaska WildLife Alliance*
    University of St Andrews
    University of Washington
    Authors
    Eiren Jacobson; Charlotte Boyd; Tamara McGuire; Kim Shelden; Gina Himes Boor; André Punt
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    Cook Inlet
    Description

    Effective conservation and management of animal populations requires knowledge of abundance and trends. For many species, these quantities are estimated using systematic visual surveys. Additional individual-level data are available for some species. Integrated population modelling (IPM) offers a mechanism for leveraging these datasets into a single estimation framework. IPMs that incorporate both population- and individual-level data have previously been developed for birds, but have rarely been applied to cetaceans. Here, we explore how IPMs can be used to improve the assessment of cetacean populations. We combined three types of data that are typically available for cetaceans of conservation concern: population-level visual survey data, individual-level capture-recapture data, and data on anthropogenic mortality. We used this IPM to estimate the population dynamics of the Cook Inlet population of beluga whales (CIBW; Delphinapterus leucas) as a case study. Our state-space IPM included a population process model and three observational submodels: 1) a group detection model to describe group size estimates from aerial survey data; 2) a capture-recapture model to describe individual photographic capture-recapture data; and 3) a Poisson regression model to describe historical hunting data. The IPM produces biologically plausible estimates of population trajectories consistent with all three datasets. The estimated population growth rate since 2000 is less than expected for a recovering population. The estimated juvenile/adult survival rate is also low compared to other cetacean populations, indicating that low survival may be impeding recovery. This work demonstrates the value of integrating various data sources to assess cetacean populations and serves as an example of how multiple, imperfect datasets can be combined to improve our understanding of a population of interest. 
    The model framework is applicable to other cetacean populations and to other taxa for which similar data types are available.

    Methods

    /Data/CIBW_RSideCapHist_McGuire&Stephens.csv contains a matrix of right-side capture histories (1 = captured, 0 = not captured) for each individual (rows) and year (columns). Photographic capture-recapture data were collected by Tamara McGuire. These data are made available here, without restriction, but anyone wishing to use these data is requested to contact tamaracookinletbeluga@gmail.com, who can provide further information on how raw data were processed to provide capture histories.
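
    To make the capture-history layout concrete (individuals as rows, years as columns, 1 = captured), here is a small sketch that summarizes such a matrix. The matrix values and both function names are made up for illustration; the sketch does not read the actual CSV file.

```python
def captures_per_year(histories):
    """Count how many individuals were captured in each year (column sums)."""
    return [sum(column) for column in zip(*histories)]


def ever_captured(histories):
    """Count individuals captured in at least one year."""
    return sum(1 for row in histories if any(row))


# Made-up capture histories: 4 individuals (rows) over 3 years (columns).
toy_histories = [
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
]

print(captures_per_year(toy_histories))
print(ever_captured(toy_histories))
```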

    /Data/CIBW_HuntData_Mahoney&Shelden2000.xlsx contains the minimum documented number of animals killed (MinKilled) for years between 1950 and 1998 as published in Mahoney and Shelden 2000. Entries which are NA indicate that no data were available for that year.

    /Data/CIBW_Abundance_HobbsEtAl2015.xlsx contains the total group size estimates from Hobbs et al. 2015.

    /Data/CIBW_Abundance_BoydEtAl2019.txt contains an array with dimensions [1:1000, 1:8, 1:11] containing 1000 posterior samples of total group size for up to 8 survey days over 11 years, as described in Boyd et al. 2019.
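
    The posterior-sample array described above is indexed by sample, survey day, and year. The sketch below shows one way to summarize the draws for a single day and year, using a tiny made-up array (4 samples, 2 days, 3 years) in place of the real [1000, 8, 11] array. Representing an unsurveyed day as `None` is an assumption; the source does not say how the file encodes missing days.

```python
from statistics import mean


def posterior_summary(samples, day, year):
    """Return (mean, min, max) of the draws for one survey day and year.

    `samples` is indexed as samples[sample][day][year]. Entries equal to
    None (assumed here to mark days without a survey) are skipped.
    """
    draws = [s[day][year] for s in samples if s[day][year] is not None]
    if not draws:
        return None
    return mean(draws), min(draws), max(draws)


# Made-up group size draws for illustration only.
toy_samples = [
    [[300, 310, 290], [305, None, 295]],
    [[320, 300, 280], [315, None, 285]],
    [[310, 305, 285], [300, None, 290]],
    [[330, 315, 275], [320, None, 300]],
]

print(posterior_summary(toy_samples, day=0, year=0))
print(posterior_summary(toy_samples, day=1, year=1))
```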

  5. American Community Survey

    • gstore.unm.edu
    csv, geojson, gml +5
    Updated Mar 6, 2020
    Cite
    Earth Data Analysis Center (2020). American Community Survey [Dataset]. https://gstore.unm.edu/apps/rgis/datasets/cd10009e-a79f-4de5-a12c-87bb5b499e9f/metadata/FGDC-STD-001-1998.html
    Available download formats
    json(5), gml(5), xls(5), geojson(5), kml(5), zip(1), csv(5), shp(5)
    Dataset updated
    Mar 6, 2020
    Dataset provided by
    Earth Data Analysis Center
    Time period covered
    2017
    Area covered
    New Mexico (West Bounding Coordinate -109.05017, East Bounding Coordinate -103.00196, North Bounding Coordinate 37.000293, South Bounding Coordinate 31.33217)
    Description

    A broad and generalized selection of US Census Bureau 2013-2017 5-year American Community Survey population data estimates, obtained via Census API and joined to the appropriate geometry (in this case, New Mexico counties). The selection is not comprehensive, but allows a first-level characterization of total population, male and female, and both broad and narrowly-defined age groups. In addition to the standard selection of age-group breakdowns (by male or female), the dataset provides supplemental calculated fields which combine several attributes into one (for example, the total population of persons under 18, or the number of females over 65 years of age). The determination of which estimates to include was based upon level of interest and providing a manageable dataset for users.

    The U.S. Census Bureau's American Community Survey (ACS) is a nationwide, continuous survey designed to provide communities with reliable and timely demographic, housing, social, and economic data every year. The ACS collects long-form-type information throughout the decade rather than only once every 10 years. As in the decennial census, strict confidentiality laws protect all information that could be used to identify individuals or households.

    The ACS combines population or housing data from multiple years to produce reliable numbers for small counties, neighborhoods, and other local areas. To provide information for communities each year, the ACS provides 1-, 3-, and 5-year estimates. ACS 5-year estimates (multiyear estimates) are “period” estimates that represent data collected over a 60-month period of time (as opposed to “point-in-time” estimates, such as the decennial census, that approximate the characteristics of an area on a specific date). ACS data are released in the year immediately following the year in which they are collected. ACS estimates based on data collected from 2009–2014 should not be called “2009” or “2014” estimates. Multiyear estimates should be labeled to indicate clearly the full period of time. The primary advantage of using multiyear estimates is the increased statistical reliability of the data for less populated areas and small population subgroups. Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. While each full Data Profile contains margin of error (MOE) information, this dataset does not. Those individuals requiring more complete data are directed to download the more detailed datasets from the ACS American FactFinder website. This dataset is organized by New Mexico county boundaries.
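
    The "supplemental calculated fields" mentioned above combine several age-group estimates into one. A minimal sketch of that idea follows; the field names and the county values are hypothetical and do not reflect the dataset's actual schema or data.

```python
# Hypothetical names for the age-group fields that make up "under 18".
UNDER_18_FIELDS = (
    "male_under_5", "male_5_to_9", "male_10_to_14", "male_15_to_17",
    "female_under_5", "female_5_to_9", "female_10_to_14", "female_15_to_17",
)


def combine_fields(record, fields):
    """Sum several estimate fields of one county record into a single value."""
    return sum(record[name] for name in fields)


# Made-up county record for illustration only.
toy_county = {
    "male_under_5": 100, "male_5_to_9": 110,
    "male_10_to_14": 120, "male_15_to_17": 70,
    "female_under_5": 95, "female_5_to_9": 105,
    "female_10_to_14": 115, "female_15_to_17": 65,
}

under_18 = combine_fields(toy_county, UNDER_18_FIELDS)
print(under_18)
```

    Because this dataset carries no margins of error, a combined value like this comes without an uncertainty measure.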

  6. ‘NYSERDA Low- to Moderate-Income New York State Census Population Analysis...

    • analyst-2.ai
    Updated Feb 12, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘NYSERDA Low- to Moderate-Income New York State Census Population Analysis Dataset: Average for 2013-2015’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-nyserda-low-to-moderate-income-new-york-state-census-population-analysis-dataset-average-for-2013-2015-0724/f3a01d19/?iid=020-485&v=presentation
    Dataset updated
    Feb 12, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    New York
    Description

    Analysis of ‘NYSERDA Low- to Moderate-Income New York State Census Population Analysis Dataset: Average for 2013-2015’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/8bd0ae94-40d3-4c9b-8a6b-de032e07929f on 12 February 2022.

    --- Dataset description provided by original source is as follows ---

    How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.

    The Low- to Moderate-Income (LMI) New York State (NYS) Census Population Analysis dataset is resultant from the LMI market database designed by APPRISE as part of the NYSERDA LMI Market Characterization Study (https://www.nyserda.ny.gov/lmi-tool). All data are derived from the U.S. Census Bureau’s American Community Survey (ACS) 1-year Public Use Microdata Sample (PUMS) files for 2013, 2014, and 2015.

    Each row in the LMI dataset is an individual record for a household that responded to the survey and each column is a variable of interest for analyzing the low- to moderate-income population.

    The LMI dataset includes: county/county group, households with elderly, households with children, economic development region, income groups, percent of poverty level, low- to moderate-income groups, household type, non-elderly disabled indicator, race/ethnicity, linguistic isolation, housing unit type, owner-renter status, main heating fuel type, home energy payment method, housing vintage, LMI study region, LMI population segment, mortgage indicator, time in home, head of household education level, head of household age, and household weight.

    The LMI NYS Census Population Analysis dataset is intended for users who want to explore the underlying data that supports the LMI Analysis Tool. The majority of those interested in LMI statistics and generating custom charts should use the interactive LMI Analysis Tool at https://www.nyserda.ny.gov/lmi-tool. This underlying LMI dataset is intended for users with experience working with survey data files and producing weighted survey estimates using statistical software packages (such as SAS, SPSS, or Stata).

    --- Original source retains full ownership of the source dataset ---

  7. Broadband Adoption and Computer Use by year, state, demographic...

    • data.amerigeoss.org
    • datadiscoverystudio.org
    • +1more
    csv, json, rdf, xml
    Updated Jul 27, 2019
    Cite
    United States[old] (2019). Broadband Adoption and Computer Use by year, state, demographic characteristics [Dataset]. https://data.amerigeoss.org/pt_BR/dataset/broadband-adoption-and-computer-use-by-year-state-demographic-characteristics
    Available download formats
    rdf, csv, xml, json
    Dataset updated
    Jul 27, 2019
    Dataset provided by
    United States[old]
    Description

    This dataset is imported from the US Department of Commerce, National Telecommunications and Information Administration (NTIA) and its "Data Explorer" site. The underlying data comes from the US Census

    1. dataset: Specifies the month and year of the survey as a string, in "Mon YYYY" format. The CPS is a monthly survey, and NTIA periodically sponsors Supplements to that survey.

    2. variable: Contains the standardized name of the variable being measured. NTIA identified the availability of similar data across Supplements, and assigned variable names to ease time-series comparisons.

    3. description: Provides a concise description of the variable.

    4. universe: Specifies the variable representing the universe of persons or households included in the variable's statistics. The specified variable is always included in the file. The only variables lacking universes are isPerson and isHouseholder, as they are themselves the broadest universes measured in the CPS.

    5. A large number of *Prop, *PropSE, *Count, and *CountSE columns comprise the remainder of the columns. For each demographic being measured (see below), four statistics are produced, including the estimated proportion of the group for which the variable is true (*Prop), the standard error of that proportion (*PropSE), the estimated number of persons or households in that group for which the variable is true (*Count), and the standard error of that count (*CountSE).
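
    The column layout above lends itself to a small helper. The sketch below parses a "Mon YYYY" survey label, builds the four statistic column names for a demographic prefix, and derives an approximate 95% interval from a proportion and its standard error. The normal approximation (plus or minus 1.96 standard errors), the three-letter English month abbreviations, and the row values are assumptions for illustration, not something the source prescribes.

```python
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def parse_survey_label(label):
    """Turn a 'Mon YYYY' string into a (year, month) tuple."""
    month_name, year = label.split()
    return int(year), MONTHS[month_name]


def stat_columns(group):
    """Return the four statistic column names for a demographic prefix."""
    return {stat: f"{group}{stat}" for stat in ("Prop", "PropSE", "Count", "CountSE")}


def approx_interval(row, group, z=1.96):
    """Approximate interval for a proportion: estimate +/- z * standard error."""
    cols = stat_columns(group)
    prop = row[cols["Prop"]]
    se = row[cols["PropSE"]]
    return prop - z * se, prop + z * se


# Made-up row for illustration only; not real NTIA figures.
toy_row = {
    "dataset": "Nov 2017",
    "variable": "internetUser",
    "usProp": 0.78,
    "usPropSE": 0.002,
}

print(parse_survey_label(toy_row["dataset"]))
print(stat_columns("us"))
print(approx_interval(toy_row, "us"))
```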

    DEMOGRAPHIC CATEGORIES

    1. us: The usProp, usPropSE, usCount, and usCountSE columns contain statistics about all persons and households in the universe (which represents the population of the fifty states and the District of Columbia). For example, to see how the prevalence of Internet use by Americans has changed over time, look at the usProp column for each survey's internetUser variable.

    2. age: The age category is divided into five ranges: ages 3-14, 15-24, 25-44, 45-64, and 65+. The CPS only includes data on Americans ages 3 and older. Also note that household reference persons must be at least 15 years old, so the age314* columns are blank for household-based variables. Those columns are also blank for person-based variables where the universe is "isAdult" (or a sub-universe of "isAdult"), as the CPS defines adults as persons ages 15 or older. Finally, note that some variables where children are technically in the universe will show zero values for the age314* columns. This occurs in cases where a variable simply cannot be true of a child (e.g. the workInternetUser variable, as the CPS presumes children under 15 are not eligible to work), but the topic of interest is relevant to children (e.g. locations of Internet use).

    3. work: Employment status is divided into "Employed," "Unemployed," and "NILF" (Not in the Labor Force). These three categories reflect the official BLS definitions used in official labor force statistics. Note that employment status is only recorded in the CPS for individuals ages 15 and older. As a result, children are excluded from the universe when calculating statistics by work status, even if they are otherwise considered part of the universe for the variable of interest.

    4. income: The income category represents annual family income, rather than just an individual person's income. It is divided into five ranges: below $25K, $25K-49,999, $50K-74,999, $75K-99,999, and $100K or more. Statistics by income group are only available in this file for Supplements beginning in 2010; prior to 2010, family income range is available in public use datasets, but is not directly comparable to newer datasets due to the 2010 introduction of the practice of allocating "don't know," "refused," and other responses that result in missing data. Prior to 2010, family income is unknown for approximately 20 percent of persons, while in 2010 the Census Bureau began imputing likely income ranges to replace missing data.

    5. education: Educational attainment is divided into "No Diploma," "High School Grad," "Some College," and "College Grad." High school graduates are considered to include GED completers, and those with some college include community college attendees (and graduates) and those who have attended certain postsecondary vocational or technical schools--in other words, it signifies additional education beyond high school, but short of attaining a bachelor's degree or equivalent. Note that educational attainment is only recorded in the CPS for individuals ages 15 and older. As a result, children are excluded from the universe when calculating statistics by education, even if they are otherwise considered part of the universe for the variable of interest.

    6. sex: "Male" and "Female" are the two groups in this category. The CPS does not currently provide response options for intersex individuals.

    7. race: This category includes "White," "Black," "Hispanic," "Asian," "Am Indian," and "Other" groups. The CPS asks about Hispanic origin separately from racial identification; as a result, all persons identifying as Hispanic are in the Hispanic group, regardless of how else they identify. Furthermore, all non-Hispanic persons identifying with two or more races are tallied in the "Other" group (along with other less-prevalent responses). The Am Indian group includes both American Indians and Alaska Natives.

    8. disability: Disability status is divided into "No" and "Yes" groups, indicating whether the person was identified as having a disability. Disabilities screened for in the CPS include hearing impairment, vision impairment (not sufficiently correctable by glasses), cognitive difficulties arising from physical, mental, or emotional conditions, serious difficulty walking or climbing stairs, difficulty dressing or bathing, and difficulties performing errands due to physical, mental, or emotional conditions. The Census Bureau began collecting data on disability status in June 2008; accordingly, this category is unavailable in Supplements prior to that date. Note that disability status is only recorded in the CPS for individuals ages 15 and older. As a result, children are excluded from the universe when calculating statistics by disability status, even if they are otherwise considered part of the universe for the variable of interest.

    9. metro: Metropolitan status is divided into "No," "Yes," and "Unknown," reflecting information in the dataset about the household's location. A household located within a metropolitan statistical area is assigned to the Yes group, and those outside such areas are assigned to No. However, due to the risk of de-anonymization, the metropolitan area status of certain households is unidentified in public use datasets. In those cases, the Census Bureau has determined that revealing this geographic information poses a disclosure risk. Such households are tallied in the Unknown group.

    10. scChldHome:
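
    The age and income ranges listed in the categories above can be expressed as simple lookups. This is a sketch under stated assumptions: the range labels are taken from the descriptions, but the function names and the choice to return `None` for ages under 3 (outside the CPS universe described above) are illustrative.

```python
import bisect

# Lower bounds of every range after the first one.
AGE_EDGES = [15, 25, 45, 65]
AGE_LABELS = ["3-14", "15-24", "25-44", "45-64", "65+"]

INCOME_EDGES = [25_000, 50_000, 75_000, 100_000]
INCOME_LABELS = [
    "below $25K", "$25K-49,999", "$50K-74,999", "$75K-99,999", "$100K or more",
]


def age_group(age):
    """Map an age in years to its range label; ages under 3 are not covered."""
    if age < 3:
        return None
    return AGE_LABELS[bisect.bisect_right(AGE_EDGES, age)]


def income_group(family_income):
    """Map annual family income in dollars to its range label."""
    return INCOME_LABELS[bisect.bisect_right(INCOME_EDGES, family_income)]


print(age_group(14), age_group(15), age_group(70))
print(income_group(49_999), income_group(100_000))
```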

  8. Annual Population Survey Two-Year Longitudinal Dataset, January 2018 -...

    • datacatalogue.cessda.eu
    Updated May 16, 2025
    Cite
    Office for National Statistics (2025). Annual Population Survey Two-Year Longitudinal Dataset, January 2018 - December 2019 [Dataset]. http://doi.org/10.5255/UKDA-SN-8840-1
    Dataset updated
    May 16, 2025
    Dataset provided by
    Social Survey Division
    Authors
    Office for National Statistics
    Area covered
    United Kingdom
    Variables measured
    Individuals, National
    Measurement technique
    Compilation/Synthesis
    Description

    Abstract copyright UK Data Service and data collection copyright owner.

    The Annual Population Survey (APS) is a major survey series, which aims to provide data that can produce reliable estimates at local authority level. Key topics covered in the survey include education, employment, health and ethnicity. The APS comprises key variables from the Labour Force Survey (LFS), all its associated LFS boosts and the APS boost.

    The APS allows for analysis to be carried out on detailed subgroups and below regional level. In recent years (particularly with the sample size of the LFS 5 quarter dataset reducing) there has been some interest in producing a two year APS longitudinal dataset to look at any trends that may occur over a year. The APS Two-Year Longitudinal Datasets, covering 2012/13 onwards, have been deposited as a result of this work. Person- and Household-level APS datasets are also available.

    For further detailed information about methodology, users should consult the Labour Force Survey User Guide, included with the APS documentation.

    Occupation data for 2021 and 2022
    The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. None of ONS' headline statistics, other than those directly sourced from occupational data, are affected and you can continue to rely on their accuracy. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022


    Main Topics:
    Topics covered include: household composition and relationships, housing tenure, nationality, ethnicity and residential history, employment and training (including government schemes), workplace and location, job hunting, educational background and qualifications. Many of the variables included in the survey are the same as those in the LFS.

  9. ‘Broadband Adoption and Computer Use by year, state, demographic...

    • analyst-2.ai
    Updated Oct 29, 2015
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2015). ‘Broadband Adoption and Computer Use by year, state, demographic characteristics’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-broadband-adoption-and-computer-use-by-year-state-demographic-characteristics-49e2/latest
    Dataset updated
    Oct 29, 2015
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Broadband Adoption and Computer Use by year, state, demographic characteristics’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/720f8c4b-7a1c-415c-9297-55904ba24840 on 26 January 2022.

    --- Dataset description provided by original source is as follows ---

    This dataset is imported from the US Department of Commerce, National Telecommunications and Information Administration (NTIA) and its "Data Explorer" site. The underlying data comes from the US Census

    1. dataset: Specifies the month and year of the survey as a string, in "Mon YYYY" format. The CPS is a monthly survey, and NTIA periodically sponsors Supplements to that survey.

    2. variable: Contains the standardized name of the variable being measured. NTIA identified the availability of similar data across Supplements, and assigned variable names to ease time-series comparisons.

    3. description: Provides a concise description of the variable.

    4. universe: Specifies the variable representing the universe of persons or households included in the variable's statistics. The specified variable is always included in the file. The only variables lacking universes are isPerson and isHouseholder, as they are themselves the broadest universes measured in the CPS.

    5. A large number of *Prop, *PropSE, *Count, and *CountSE columns comprise the remainder of the columns. For each demographic being measured (see below), four statistics are produced, including the estimated proportion of the group for which the variable is true (*Prop), the standard error of that proportion (*PropSE), the estimated number of persons or households in that group for which the variable is true (*Count), and the standard error of that count (*CountSE).
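
    As a minimal sketch of the column layout described above, the four statistic columns for a demographic can be derived from its prefix, and a rough interval can be formed from a proportion and its standard error. The prefix "us" comes from the category list in this description; the row values and the normal-approximation interval are illustrative assumptions, not part of the source dataset documentation.

```python
def stat_columns(prefix):
    """Return the four statistic column names for one demographic prefix."""
    return [f"{prefix}{suffix}" for suffix in ("Prop", "PropSE", "Count", "CountSE")]


def approx_interval(prop, prop_se, z=1.96):
    """Normal-approximation interval around an estimated proportion (an assumption)."""
    return (prop - z * prop_se, prop + z * prop_se)


# Hypothetical row: the numbers are invented for illustration only.
row = {"variable": "internetUser", "usProp": 0.75, "usPropSE": 0.005}

cols = stat_columns("us")
low, high = approx_interval(row["usProp"], row["usPropSE"])
print(cols)
print(round(low, 4), round(high, 4))
```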

    DEMOGRAPHIC CATEGORIES

    1. us: The usProp, usPropSE, usCount, and usCountSE columns contain statistics about all persons and households in the universe (which represents the population of the fifty states and the District of Columbia). For example, to see how the prevalence of Internet use by Americans has changed over time, look at the usProp column for each survey's internetUser variable.

    2. age: The age category is divided into five ranges: ages 3-14, 15-24, 25-44, 45-64, and 65+. The CPS only includes data on Americans ages 3 and older. Also note that household reference persons must be at least 15 years old, so the age314* columns are blank for household-based variables. Those columns are also blank for person-based variables where the universe is "isAdult" (or a sub-universe of "isAdult"), as the CPS defines adults as persons ages 15 or older. Finally, note that some variables where children are technically in the universe will show zero values for the age314* columns. This occurs in cases where a variable simply cannot be true of a child (e.g. the workInternetUser variable, as the CPS presumes children under 15 are not eligible to work), but the topic of interest is relevant to children (e.g. locations of Internet use).

    3. work: Employment status is divided into "Employed," "Unemployed," and "NILF" (Not in the Labor Force). These three categories reflect the official BLS definitions used in official labor force statistics. Note that employment status is only recorded in the CPS for individuals ages 15 and older. As a result, children are excluded from the universe when calculating statistics by work status, even if they are otherwise considered part of the universe for the variable of interest.

    4. income: The income category represents annual family income, rather than just an individual person's income. It is divided into five ranges: below $25K, $25K-49,999, $50K-74,999, $75K-99,999, and $100K or more. Statistics by income group are only available in this file for Supplements beginning in 2010; prior to 2010, family income range is available in public use datasets, but is not directly comparable to newer datasets due to the 2010 introduction of the practice of allocating "don't know," "refused," and other responses that result in missing data. Prior to 2010, family income is unknown for approximately 20 percent of persons, while in 2010 the Census Bureau began imputing likely income ranges to replace missing data.

    5. education: Educational attainment is divided into "No Diploma," "High School Grad,

    --- Original source retains full ownership of the source dataset ---

  10. National Sustainable Development Plan Baseline Survey 2019, Household Income...

    • microdata.pacificdata.org
    Updated Oct 9, 2020
    + more versions
    Cite
    Vanuatu National Statistics Office (2020). National Sustainable Development Plan Baseline Survey 2019, Household Income and Expenditure Survey 2019 - Vanuatu [Dataset]. https://microdata.pacificdata.org/index.php/catalog/742
    Dataset updated
    Oct 9, 2020
    Dataset authored and provided by
    Vanuatu National Statistics Office
    Time period covered
    2019 - 2020
    Area covered
    Vanuatu
    Description

    Abstract

    The National Sustainable Development Plan (NSDP) Baseline Survey 2019 is an expanded Household Income and Expenditure Survey (HIES) and is inclusive of health, educational, cultural, and productive dimensions previously uncollected or in need of updating. The results of this survey will directly inform more than 30 key indicators listed in the NSDP M&E (Monitoring and Evaluation) Framework, as well as more than 40 of the listed indicators for the United Nations Sustainable Development Goals (SDGs). The NSDP Baseline Survey also presents an opportunity for Vanuatu to establish a comprehensive Melanesian Wellbeing baseline, as well as an updated baseline for calculating the Consumer Price Index (CPI) and revising National Accounts.

    Geographic coverage

    National coverage. Below are the details of this national coverage: 1. National (Vanuatu); 2. Provinces (Torba, Sanma, Penama, Malampa, Shefa, Tafea); 4. Area Councils (Torres Area council right to Futuna & Aneityum Area Council); 5. Villages / Towns; 6. Urban/Rural.

    Analysis unit

    Household and Individual.

    Universe

    All de jure residents.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The sample size for this survey was determined using outputs from the previous 2010 Household Income and Expenditure Survey (HIES), especially per capita monthly total expenditure. From the 2010 HIES, the mean, standard deviation and standard error of per capita expenditure were computed, and the distribution of the population across the 6 provinces of Vanuatu from the 2016 Census was used as a base. According to the accuracy of this variable of interest within each province, the sample size per province was adjusted in order to get an expected sampling error of around 5% within each province. The sampling frame is the 2016 Vanuatu census, which was used to compute the probability of selection of the Enumeration Areas (EAs). Selection started with a random selection of EAs with probability proportional to size. Then, within each selected EA, 10 households were randomly selected with uniform probability. Within each selected EA, the household listing was updated by the team before random selection and interview.

    i) The only variable considered is per capita total household expenditure (the variable of interest) because, in addition to being one of the main indicators derived from the Household Income and Expenditure Survey (HIES), it is likely highly correlated with many other variables of interest (e.g. poverty). From the 2010 HIES dataset, a list of relevant indicators was calculated using this variable of interest. Those indicators provide information on: - (a) the status of the household expenditure distribution within each province; - (b) the efficiency provided by the 2010 HIES sample design; - (c) the accuracy of the estimates calculated from the 2010 HIES dataset (especially the per capita household expenditure, our variable of interest).

    ii) The original dataset has been trimmed using the variable of interest, the lowest and the highest percentiles (the 1% households with the lowest and highest per capita total household expenditure) were removed from the analysis (outliers). The dataset ends up with 4,289 households (given 4,377 households were completed).

    iii) The 2010 Vanuatu HIES sample was based on a stratified multi-stage selection: - Stratification: geographical provinces (by urban/rural location) - First stage of selection: Enumeration Areas (EAs) with probability of selection proportional to size - Second stage: households, with uniform probability of selection within the EAs

    iv) The mean and standard deviation indicate the status of the variable of interest within each stratum. The intracluster correlation (p) and the design effect (DEFF) highlight the efficiency of the sampling strategy, and the standard error/relative standard error (SE/RSE) of the variable of interest shows its accuracy.

    v) The purpose of this analysis is to get some insights from the 2010 HIES sample design in order to improve the 2019 survey. There is no point in increasing the sample size in strata where the sample is not efficient (the gain in accuracy will be minor compared to the related cost).

    vi) The challenges in the 2019 Vanuatu baseline survey: - Meet precision targets in each stratum (provincial level), including Penama, where Ambae island had been evacuated at the time of the sample design - Keep an acceptable sample size (due to budget constraints) - Follow international recommendations (12 months of field operation) - Enhance the monitoring and supervision of the field staff and simplify management of the logistics in the field

    ==> Optimize the variance/cost ratio of the survey design

    vii) Table 1 from the Document Sample Design (provided as External Resources) presents the Vanuatu 2010 HIES survey specifications, efficiency and accuracy in each stratum (for the variable of interest). It shows that some improvements can be made in Torba and Shefa rural (where the RSE is higher than 5%), and it shows a high intraclass correlation in Malampa, Shefa rural and Tafea (which leads to a high design effect in those strata). In Torba, the high design effect comes from the high number of households interviewed in each selected EA (on average 33 households per selected EA in this stratum were interviewed). - Torba: the sample size is good; there is just a need to reduce the number of households to interview within each EA (and in order to keep a similar sample size the number of EAs to select in the province will be increased) - Malampa: given the high intracluster correlation in this province, a higher number of EAs to select is required (with the same number of households per EA to interview) - Shefa rural: keep the same number of households to interview within each EA, and increase the number of EAs to select (this will lead to a higher sample size) - Tafea: similar to Malampa province, the high intraclass correlation indicates that the number of EAs to select has to be increased (therefore the sample size as well). The sample size has to be increased in Malampa, Shefa rural and Tafea; for the rest, the 2019 design will have to be similar to the 2010 design (in order to provide at least the same level of accuracy).

    viii) The 2019 Vanuatu baseline survey follows the international recommendations in terms of data collection schedule (12-month coverage) and considers a better management and supervision of the field staff. In this context, the field staff will work in teams, given that: - A team is made of 1 supervisor (team leader) and 2 or 3 interviewers - Each interviewer will be responsible for 5 interviews per round - A round of survey is a 1-week period - 1 EA is covered during 1 round; after the round completion, the team moves to the next EA for the next round - A team completes 32 rounds during the 12-month field operation period (roughly every 2 rounds/2 weeks of work are followed by 1 round/1 week of rest).

    ix) Table 3 from the Document Sample Design (provided as External Resources) presents a survey schedule starting February 2019 and ending February 2020. During this period of 32 working weeks (corresponding to 32 different selected EAs) the teams will be in the field (with a 3-week rest period over Christmas).

    x) The number of interviewers per team and the number of teams per province will determine the total sample size within each province. A team of 3 interviewers can achieve 480 households over the period, while a team of 2 interviewers can achieve only 320.
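
    The team workload figures follow directly from the field plan in viii): 5 interviews per interviewer per round and 32 rounds per team. A short sketch of that arithmetic (the function name is hypothetical):

```python
def households_per_team(interviewers, interviews_per_round=5, rounds=32):
    """Households one team can complete over the whole field period."""
    return interviewers * interviews_per_round * rounds


team_of_three = households_per_team(3)
team_of_two = households_per_team(2)
print(team_of_three, team_of_two)  # 480 320
```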

    xi) The intraclass correlation is used to calculate the precision loss due to clustering. Like the standard deviation, the intracluster correlation is considered to be a true population parameter, and therefore transferable between designs. We have to accept the hypothesis that this correlation factor has not changed during the period 2010-2019, and therefore can be used to predict DEFF and RSE for the next survey given an adjusted design (based on the conclusions provided by the 2010 design). Table 2 from the Document Sample Design (provided as External Resources) predicts the design effect and sampling error of the variable of interest given the new sample design, which is based on: - the sample size within each stratum - the number of teams within each stratum - the number of interviewers per team. In order to allow more flexibility in the sample size, it is preferable to set up some teams of 3 interviewers, which can achieve 480 households (a good sample size for Torba and Sanma urban), and some teams of 2 interviewers, which will achieve 320 households each (2 teams will be required in other provinces).
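
    The source does not state which formula was used to predict DEFF and RSE. A common approximation, assumed here, is Kish's design effect DEFF = 1 + (m - 1) * rho, where m is the number of households interviewed per EA and rho is the intracluster correlation. The sketch below uses invented values for rho, the mean and the standard deviation purely to show why fewer households per EA lowers the design effect.

```python
import math


def design_effect(households_per_ea, rho):
    """Kish approximation (an assumption): DEFF = 1 + (m - 1) * rho."""
    return 1.0 + (households_per_ea - 1) * rho


def predicted_rse(mean, std_dev, n_households, households_per_ea, rho):
    """Relative standard error of the mean under a clustered design."""
    deff = design_effect(households_per_ea, rho)
    se = std_dev / math.sqrt(n_households) * math.sqrt(deff)
    return se / mean


# Hypothetical inputs, not taken from the 2010 HIES tables.
rho = 0.1
deff_33 = design_effect(33, rho)  # about 33 households per EA, as in Torba 2010
deff_10 = design_effect(10, rho)  # 10 households per EA, as in the 2019 plan
rse_33 = predicted_rse(mean=100.0, std_dev=80.0, n_households=480,
                       households_per_ea=33, rho=rho)
rse_10 = predicted_rse(mean=100.0, std_dev=80.0, n_households=480,
                       households_per_ea=10, rho=rho)
print(deff_33, deff_10, rse_33 > rse_10)
```

    With the same total sample size, spreading interviews over more EAs gives a smaller design effect and therefore a smaller predicted RSE.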

    xii) The proposed design in Table 2 from the Document Sample Design (provided as External Resources) shows a total sample size of 4,640 households and a higher level of accuracy of the estimate of the variable of interest in all strata. Only Shefa rural shows an RSE higher than 5%, which is still acceptable. The high intraclass correlation in Shefa rural impacts the variance of the estimates; lowering the RSE there would require an increase in the sample size or a decrease in the number of households to interview per EA, which is logistically and financially not recommended.

    Mode of data collection

    Computer Assisted Personal Interview [capi]

    Research instrument

    The questionnaire was developed in English using the World Bank software Survey Solutions. This questionnaire is divided into 18 modules that are detailed below.

    -Introduction (geographic areas, list of household members) -Module 1: Demographic characteristics: ethnicity, marital status; -Module 2: Wellbeing: culture

  11. GlobPOP: A 31-year (1990-2020) global gridded population dataset generated...

    • zenodo.org
    tiff
    Updated Apr 18, 2025
    + more versions
    Cite
    Luling Liu; Xin Cao; Shijie Li; Na Jie (2025). GlobPOP: A 31-year (1990-2020) global gridded population dataset generated by cluster analysis and statistical learning [Dataset]. http://doi.org/10.5281/zenodo.10088105
    Available download formats: tiff
    Dataset updated
    Apr 18, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Luling Liu; Xin Cao; Shijie Li; Na Jie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data Update Notice

    We are pleased to announce that the GlobPOP dataset for the years 2021-2022 has undergone a comprehensive quality check and has now been updated accordingly. Following the established methodology that ensures high precision and reliability, these latest updates allow for even more comprehensive time-series analysis. The updated GlobPOP dataset remains available in GeoTIFF format for easy integration into your existing workflows.

    To reflect these updates, our interactive web application has also been refreshed. Users can now explore the updated national population time-series curves from 1990 to 2022. This can be accessed via the same link: https://globpop.shinyapps.io/GlobPOP/. Thank you for your continued support of the GlobPOP, and we hope that the updated data will further enhance your research and policy analysis endeavors.

    If you encounter any issues, please contact us via email at lulingliu@mail.bnu.edu.cn.

    Introduction

    Continuously monitoring global population spatial dynamics is essential for implementing effective policies related to sustainable development, such as epidemiology, urban planning, and global inequality.

    Here, we present GlobPOP, a new continuous global gridded population product with a high-precision spatial resolution of 30 arcseconds from 1990 to 2020. Our data-fusion framework is based on cluster analysis and statistical learning approaches, and aims to fuse five existing products (Global Human Settlements Layer Population (GHS-POP), Global Rural Urban Mapping Project (GRUMP), Gridded Population of the World Version 4 (GPWv4), LandScan Population datasets and WorldPop datasets) into a new continuous global gridded population product (GlobPOP). The spatial validation results demonstrate that the GlobPOP dataset is highly accurate. To validate the temporal accuracy of GlobPOP at the country level, we have developed an interactive web application, accessible at https://globpop.shinyapps.io/GlobPOP/, where data users can explore the country-level population time-series curves of interest and compare them with census data.

    With the availability of GlobPOP dataset in both population count and population density formats, researchers and policymakers can leverage our dataset to conduct time-series analysis of population and explore the spatial patterns of population development at various scales, ranging from national to city level.

    Data description

    The product is produced at 30 arc-second resolution (approximately 1 km at the equator) and is made available in GeoTIFF format. There are two population formats: 'Count' (population count per grid cell) and 'Density' (population count per square kilometer in each grid cell).

    Each GeoTIFF filename has 5 fields that are separated by an underscore "_". A filename extension follows these fields. The fields are described below with the example filename:

    GlobPOP_Count_30arc_1990_I32

    Field 1: GlobPOP(Global gridded population)
    Field 2: Pixel unit is population "Count" or population "Density"
    Field 3: Spatial resolution is 30 arc seconds
    Field 4: Year "1990"
    Field 5: Data type is I32(Int 32) or F32(Float32)
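
    The five-field naming scheme above can be parsed mechanically. The following is a minimal sketch, not code from the GlobPOP repository: the function and class names are hypothetical, and the ".tiff" extension in the usage line is an assumption based on the listed download format.

```python
from pathlib import Path
from typing import NamedTuple


class GlobPopName(NamedTuple):
    product: str     # Field 1, e.g. "GlobPOP"
    unit: str        # Field 2, "Count" or "Density"
    resolution: str  # Field 3, e.g. "30arc"
    year: int        # Field 4, e.g. 1990
    dtype: str       # Field 5, "I32" or "F32"


def parse_globpop_name(filename):
    """Split a GlobPOP filename into its five underscore-separated fields."""
    stem = Path(filename).stem  # drops the extension, if there is one
    parts = stem.split("_")
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {filename!r}")
    product, unit, resolution, year, dtype = parts
    if unit not in ("Count", "Density"):
        raise ValueError(f"unexpected pixel unit: {unit!r}")
    if dtype not in ("I32", "F32"):
        raise ValueError(f"unexpected data type: {dtype!r}")
    return GlobPopName(product, unit, resolution, int(year), dtype)


parsed = parse_globpop_name("GlobPOP_Count_30arc_1990_I32.tiff")
print(parsed.unit, parsed.year, parsed.dtype)  # Count 1990 I32
```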

    More information

    Please refer to the paper for detailed information:

    Liu, L., Cao, X., Li, S. et al. A 31-year (1990–2020) global gridded population dataset generated by cluster analysis and statistical learning. Sci Data 11, 124 (2024). https://doi.org/10.1038/s41597-024-02913-0.

    The fully reproducible codes are publicly available at GitHub: https://github.com/lulingliu/GlobPOP.

  12. National Population Database

    • data.wu.ac.at
    • gimi9.com
    wms
    Updated Apr 20, 2018
    + more versions
    Cite
    Health and Safety Laboratory (2018). National Population Database [Dataset]. https://data.wu.ac.at/schema/data_gov_uk/NzJkOGJmNjMtN2NjMi00OGI2LThkOTctYTg1ZDQ4MmJmMjlj
    Available download formats: wms
    Dataset updated
    Apr 20, 2018
    Dataset provided by
    Health and Safety Laboratory
    Description

    The National Population Database (NPD) is a point-based Geographical Information System (GIS) dataset that combines locational information from providers like the Ordnance Survey with population information about those locations, mainly sourced from Government statistics. The points (and sometimes polygons) represent individual buildings, so the NPD allows detailed local analysis for anywhere in Great Britain.

    The Health & Safety Laboratory (HSL) working with Staffordshire University originally created the NPD in 2004 to help its parent organisation, the Health and Safety Executive (HSE), assess the risks to society of major hazard sites e.g. oil refineries, chemical works and gas holders. Of particular interest to HSE were 'sensitive' populations e.g. schools and hospitals where the people at those locations may be more vulnerable to harm and potentially harder to evacuate in an emergency. The data is split into 5 themes: residential, sensitive populations, transport, workplaces and leisure.

    More information about the NPD can be found here:

    https://www.hsl.gov.uk/what-we-do/better-decisions/geoanalytics/national-population-database

    The NPD was created using various datasets available within Government as part of the Public Sector Mapping Agreement (PSMA) and contains other intellectual property so is only available under license and for a fee. Please contact the HSL GIS Team if you would like to discuss gaining access to the sample or full dataset.

  13. Landing Page A/B Testing Dataset

    • kaggle.com
    Updated May 28, 2024
    Cite
    FeelDidaxie (2024). Landing Page A/B Testing Dataset [Dataset]. https://www.kaggle.com/datasets/feeldidaxie/landing-page-ab-testing-dataset
    Dataset updated
    May 28, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    FeelDidaxie
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The dataset originates from the book "Practical Statistics for Data Scientists" by Peter Bruce, Andrew Bruce, and Peter Gedeck.

    Context:

    A company selling a high-value service wants to determine which of two web presentations is more effective at selling. Due to the high value and infrequent nature of the sales, as well as the lengthy sales cycle, it would take too long to accumulate enough sales data to identify the superior presentation. Therefore, the company uses a proxy variable to measure effectiveness.

    A proxy variable stands in for the true variable of interest, which may be unavailable, too costly, or too time-consuming to measure directly. In this case, the proxy variable is the amount of time users spend on a detailed interior page that describes the service.

    Content:

    The dataset includes a total of 36 sessions across the two web presentations: 21 sessions for page A and 15 sessions for page B. The goal is to determine if users spend more time on page B compared to page A. If users spend more time on page B, it would suggest that page B is more effective at engaging potential customers, and therefore, does a better selling job.

    The time is expressed in hundreds of seconds. For example, a value of 0.1 indicates 10 seconds, and a value of 2.53 indicates 253 seconds.
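
    One way to check whether page B holds users longer is a permutation test on the difference in mean session time. The sketch below is an illustration under stated assumptions: the session values are invented (they are not the 36 sessions in the dataset), the function names are hypothetical, and the conversion to seconds follows the worked example in the text (0.1 corresponds to 10 seconds).

```python
import random
from statistics import mean


def to_seconds(value):
    """Convert a raw time value to seconds (0.1 -> 10 s, 2.53 -> 253 s)."""
    return value * 100


def permutation_p_value(page_a, page_b, n_iter=5000, seed=0):
    """One-sided permutation test: is mean(B) - mean(A) larger than chance?"""
    rng = random.Random(seed)
    observed = mean(page_b) - mean(page_a)
    pooled = list(page_a) + list(page_b)
    n_a = len(page_a)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)  # reassign sessions to pages at random
        diff = mean(pooled[n_a:]) - mean(pooled[:n_a])
        if diff >= observed:
            hits += 1
    return observed, hits / n_iter


# Hypothetical session times in the dataset's raw units.
page_a = [0.21, 0.35, 0.67, 1.10, 1.47, 0.93, 2.53, 0.71]
page_b = [0.48, 1.47, 2.33, 1.18, 3.57, 1.87]

observed, p_value = permutation_p_value(page_a, page_b)
print(f"observed difference: {to_seconds(observed):.1f} s, p-value: {p_value:.3f}")
```

    A small p-value would suggest that the longer sessions on page B are unlikely to be due to chance alone; with samples this small, the result should be read cautiously.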

  14. Living Standards Survey V 2005-2006 - World Bank SHIP Harmonized Dataset -...

    • dev.ihsn.org
    • microdata.worldbank.org
    Updated Apr 25, 2019
    + more versions
    Cite
    Ghana Statistical Service (GSS) (2019). Living Standards Survey V 2005-2006 - World Bank SHIP Harmonized Dataset - Ghana [Dataset]. https://dev.ihsn.org/nada/catalog/73251
    Dataset updated
    Apr 25, 2019
    Dataset provided by
    Ghana Statistical Services
    Authors
    Ghana Statistical Service (GSS)
    Time period covered
    2005 - 2006
    Area covered
    Ghana
    Description

    Abstract

    Survey based Harmonized Indicators (SHIP) files are harmonized data files from household surveys that are conducted by countries in Africa. To ensure the quality and transparency of the data, it is critical to document the procedures of compiling consumption aggregation and other indicators so that the results can be duplicated with ease. This process enables consistency and continuity that make temporal and cross-country comparisons consistent and more reliable.

    Four harmonized data files are prepared for each survey to generate a set of harmonized variables that have the same variable names. Invariably, in each survey, questions are asked in a slightly different way, which poses challenges for the consistent definition of harmonized variables. The harmonized household survey data present the best available variables with harmonized definitions, but not identical variables. The four harmonized data files are:

    a) Individual level file (labor force indicators are in a separate file): This file has information on basic characteristics of individuals such as age and sex, literacy, education, health, anthropometry and child survival.
    b) Labor force file: This file has information on labor force including employment/unemployment, earnings, sectors of employment, etc.
    c) Household level file: This file has information on household expenditure, household head characteristics (age and sex, level of education, employment), housing amenities, assets, and access to infrastructure and services.
    d) Household Expenditure file: This file has consumption/expenditure aggregates by consumption group, following the UN Classification of Individual Consumption According to Purpose (COICOP).

    Geographic coverage

    National

    Analysis unit

    • Individual level for datasets with suffix _I and _L
    • Household level for datasets with suffix _H and _E

    Universe

    The survey covered all de jure household members (usual residents).

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Sampling Frame and Units: As in all probability sample surveys, it is important that each sampling unit in the surveyed population has a known, non-zero probability of selection. To achieve this, there has to be an appropriate list, or sampling frame, of the primary sampling units (PSUs). The universe defined for the GLSS 5 is the population living within private households in Ghana. The institutional population (such as schools, hospitals, etc.), which represents a very small percentage in the 2000 Population and Housing Census (PHC), is excluded from the frame for the GLSS 5.

    The Ghana Statistical Service (GSS) maintains a complete list of census EAs, together with their respective population and number of households as well as maps, with well-defined boundaries, of the EAs. This information was used as the sampling frame for the GLSS 5. Specifically, the EAs were defined as the primary sampling units (PSUs), while the households within each EA constituted the secondary sampling units (SSUs).

    Stratification: In order to take advantage of possible gains in precision and reliability of the survey estimates from stratification, the EAs were first stratified into the ten administrative regions. Within each region, the EAs were further sub-divided according to their rural and urban areas of location. The EAs were also classified according to ecological zones and inclusion of Accra (GAMA) so that the survey results could be presented according to the three ecological zones, namely 1) Coastal, 2) Forest, and 3) Northern Savannah, and for Accra.

    Sample size and allocation: The number and allocation of sample EAs for the GLSS 5 depend on the type of estimates to be obtained from the survey and the corresponding precision required. It was decided to select a total sample of around 8000 households nationwide.

    To ensure adequate numbers of complete interviews that will allow for reliable estimates at the various domains of interest, the GLSS 5 sample was designed to ensure that at least 400 households were selected from each region.

    A two-stage stratified random sampling design was adopted. Initially, a total sample of 550 EAs was considered at the first stage of sampling, followed by a fixed take of 15 households per EA. The distribution of the selected EAs into the ten regions or strata was based on proportionate allocation using the population.

    For example, the number of selected EAs allocated to the Western Region was obtained as: 1924577/18912079*550 = 56

    Under this sampling scheme, it was observed that the 400 households minimum requirement per region could be achieved in all the regions but not the Upper West Region. The proportionate allocation formula assigned only 17 EAs out of the 550 EAs nationwide and selecting 15 households per EA would have yielded only 255 households for the region. In order to surmount this problem, two options were considered: retaining the 17 EAs in the Upper West Region and increasing the number of selected households per EA from 15 to about 25, or increasing the number of selected EAs in the region from 17 to 27 and retaining the second stage sample of 15 households per EA.

    The second option was adopted in view of the fact that it was more likely to provide smaller sampling errors for the separate domains of analysis. Based on this, the number of EAs in Upper East and the Upper West were adjusted from 27 and 17 to 40 and 34 respectively, bringing the total number of EAs to 580 and the number of households to 8,700.
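
    The allocation arithmetic in the paragraphs above can be reproduced in a few lines. This is a sketch using only the figures quoted in the text; the function name is hypothetical.

```python
def proportional_allocation(region_population, total_population, total_eas):
    """EAs allocated to a region in proportion to its population, rounded."""
    return round(region_population / total_population * total_eas)


# Western Region example from the text.
western_eas = proportional_allocation(1_924_577, 18_912_079, 550)

# Upper West under the initial allocation: 17 EAs at 15 households each.
upper_west_households = 17 * 15

# Adjusted design: Upper East 27 -> 40 EAs, Upper West 17 -> 34 EAs.
total_eas = 550 + (40 - 27) + (34 - 17)
total_households = total_eas * 15

print(western_eas, upper_west_households, total_eas, total_households)
# 56 255 580 8700
```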

    A complete household listing exercise was carried out between May and June 2005 in all the selected EAs to provide the sampling frame for the second stage selection of households. At the second stage of sampling, a fixed number of 15 households per EA was selected in all the regions. In addition, five households per EA were selected as replacement samples. The overall sample size therefore came to 8,700 households nationwide.

    Mode of data collection

    Face-to-face [f2f]

  15. National Population Database Northern Ireland

    • data.wu.ac.at
    • gimi9.com
    • +1more
    html, wms
    Updated Feb 10, 2016
    Cite
    Health and Safety Laboratory (2016). National Population Database Northern Ireland [Dataset]. https://data.wu.ac.at/odso/data_gov_uk/NjkwMzExY2MtN2JkNi00OGY0LWJhZWQtNjQ3ZTBjYTAwNTU2
    Available download formats: wms, html
    Dataset updated
    Feb 10, 2016
    Dataset provided by
    Health and Safety Laboratory
    Area covered
    Northern Ireland
    Description

    The National Population Database (NPD) for Northern Ireland is a point-based Geographical Information System (GIS) dataset that combines locational information from Ordnance Survey Northern Ireland (OSNI) with population information about those locations, mainly sourced from Northern Irish government statistics. The points represent individual buildings allowing the NPD NI to provide detailed local analysis for anywhere in Northern Ireland.

    The Health and Safety Laboratory (HSL) working with Staffordshire University originally created the NPD for Great Britain in 2004 to help its parent organisation, the Health and Safety Executive (HSE), assess the risks to society of major hazard sites e.g. oil refineries, chemical works and gas holders. Of particular interest to HSE were ‘sensitive’ populations e.g. schools and hospitals where the people at those locations may be more vulnerable to harm and potentially harder to evacuate in an emergency. The data for the NPD NI includes residential, schools and colleges, hospitals and workplace layers.

    The NPD NI was created using various datasets from OSNI and government organisations and contains other intellectual property so is only available under a license and for a fee. Please contact the HSL GIS team if you would like to discuss gaining access to the sample or full dataset.

  16. Current Population Survey

    • dataverse.harvard.edu
    • data.niaid.nih.gov
    Updated May 31, 2011
    Harvard Dataverse (2011). Current Population Survey [Dataset]. http://doi.org/10.7910/DVN/35IUVQ
    Dataset updated
    May 31, 2011
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Users can download data or view data tables on topics related to the labor force of the United States.

    Background
    The Current Population Survey (CPS) is a joint effort between the Bureau of Labor Statistics and the Census Bureau. It provides information and data on the labor force of the United States, such as employment, unemployment, earnings, hours of work, school enrollment, health, employee benefits and income. The CPS is conducted monthly and has a sample of approximately 50,000 households. It is representative of the non-institutionalized US population. The sample provides estimates for the nation as a whole and serves as part of model-based estimates for individual states and other geographic areas.

    User Functionality
    Users can download data sets or view data tables on their topic of interest. Data can be organized by a variety of demographic variables, including sex, age, race, marital status and educational attainment. Data is available on a national or state level.

    Data Notes
    The CPS is conducted monthly and has a sample of approximately 50,000 households. It is representative of the non-institutionalized US population. The sample provides estimates for the nation as a whole and serves as part of model-based estimates for individual states and other geographic areas.

  17. Gallup World Poll 2013, June - Afghanistan, Angola, Albania...and 183 more

    • datacatalog.ihsn.org
    • catalog.ihsn.org
    Updated Jun 14, 2022
    Gallup, Inc. (2022). Gallup World Poll 2013, June - Afghanistan, Angola, Albania...and 183 more [Dataset]. https://datacatalog.ihsn.org/catalog/8494
    Dataset updated
    Jun 14, 2022
    Dataset authored and provided by
    Gallup, Inc. (http://gallup.com/)
    Time period covered
    2005 - 2012
    Area covered
    Angola, Albania, Afghanistan
    Description

    Abstract

    Gallup Worldwide Research continually surveys residents in more than 150 countries, representing more than 98% of the world's adult population, using randomly selected, nationally representative samples. Gallup typically surveys 1,000 individuals in each country, using a standard set of core questions that has been translated into the major languages of the respective country. In some regions, supplemental questions are asked in addition to core questions. Face-to-face interviews are approximately 1 hour, while telephone interviews are about 30 minutes. In many countries, the survey is conducted once per year, and fieldwork is generally completed in two to four weeks. The Country Dataset Details spreadsheet displays each country's sample size, month/year of the data collection, mode of interviewing, languages employed, design effect, margin of error, and details about sample coverage.

    Gallup is entirely responsible for the management, design, and control of Gallup Worldwide Research. For the past 70 years, Gallup has been committed to the principle that accurately collecting and disseminating the opinions and aspirations of people around the globe is vital to understanding our world. Gallup's mission is to provide information in an objective, reliable, and scientifically grounded manner. Gallup is not associated with any political orientation, party, or advocacy group and does not accept partisan entities as clients. Any individual, institution, or governmental agency may access the Gallup Worldwide Research regardless of nationality. The identities of clients and all surveyed respondents will remain confidential.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    SAMPLING AND DATA COLLECTION METHODOLOGY With some exceptions, all samples are probability based and nationally representative of the resident population aged 15 and older. The coverage area is the entire country including rural areas, and the sampling frame represents the entire civilian, non-institutionalized, aged 15 and older population of the entire country. Exceptions include areas where the safety of interviewing staff is threatened, scarcely populated islands in some countries, and areas that interviewers can reach only by foot, animal, or small boat.

    Telephone surveys are used in countries where telephone coverage represents at least 80% of the population or is the customary survey methodology (see the Country Dataset Details for detailed information for each country). In Central and Eastern Europe, as well as in the developing world, including much of Latin America, the former Soviet Union countries, nearly all of Asia, the Middle East, and Africa, an area frame design is used for face-to-face interviewing.

    The typical Gallup Worldwide Research survey includes at least 1,000 surveys of individuals. In some countries, oversamples are collected in major cities or areas of special interest. Additionally, in some large countries, such as China and Russia, sample sizes of at least 2,000 are collected. Although rare, in some instances the sample size is between 500 and 1,000. See the Country Dataset Details for detailed information for each country.

    FACE-TO-FACE SURVEY DESIGN

    FIRST STAGE In countries where face-to-face surveys are conducted, the first stage of sampling is the identification of 100 to 135 ultimate clusters (sampling units), consisting of clusters of households. Sampling units are stratified by population size and/or geography, and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size; otherwise, simple random sampling is used. Samples are drawn independently of any samples drawn for surveys conducted in previous years.

    There are two methods for sample stratification:

    METHOD 1: The sample is stratified into 100 to 125 ultimate clusters drawn proportional to the national population, using the following strata:
    1) Areas with a population of at least 1 million
    2) Areas 500,000-999,999
    3) Areas 100,000-499,999
    4) Areas 50,000-99,999
    5) Areas 10,000-49,999
    6) Areas with fewer than 10,000

    The strata could include additional strata to reflect populations that exceed 1 million as well as areas with populations of less than 10,000.

    Worldwide Research Methodology and Codebook Copyright © 2008-2012 Gallup, Inc. All rights reserved.

    METHOD 2:

    A multi-stage design is used. The country is first stratified by large geographic units, and then by smaller units within geography. A minimum of 33 Primary Sampling Units (PSUs), which are first stage sampling units, are selected. The sample design results in 100 to 125 ultimate clusters.
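    To illustrate what selection with "probabilities proportional to population size" means in the first stage, here is a minimal sketch of systematic PPS selection. This is one common way to implement PPS and is an assumption for illustration; the source does not say which PPS procedure Gallup uses, and the function name and population figures are hypothetical.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n, rng):
    """Pick n units with probability proportional to size (systematic PPS).

    Returns indices into `sizes`. A unit larger than the sampling step
    can be picked more than once.
    """
    total = sum(sizes)
    step = total / n                  # sampling interval on the cumulative scale
    start = rng.random() * step       # random start in [0, step)
    cumulative = list(accumulate(sizes))
    picks, idx = [], 0
    for k in range(n):
        point = start + k * step
        # Advance to the unit whose cumulative range contains this point.
        while idx < len(cumulative) - 1 and cumulative[idx] <= point:
            idx += 1
        picks.append(idx)
    return picks


# Hypothetical area populations, largest first.
populations = [1_200_000, 300_000, 80_000, 20_000]
picks = pps_systematic(populations, 4, random.Random(0))
print(picks)
```

    With these made-up figures the sampling interval is 400,000, so the largest area is certain to be picked three times and the fourth pick falls among the smaller areas in proportion to their size.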

    SECOND STAGE

    Random route procedures are used to select sampled households. Unless an outright refusal occurs, interviewers make up to three attempts to survey the sampled household. To increase the probability of contact and completion, attempts are made at different times of the day, and where possible, on different days. If an interviewer cannot obtain an interview at the initial sampled household, he or she uses a simple substitution method. Refer to Appendix C for a more in-depth description of random route procedures.

    THIRD STAGE

    Respondents are randomly selected within the selected households. Interviewers list all eligible household members and their ages or birthdays. The respondent is selected by means of the Kish grid (refer to Appendix C) in countries where face-to-face interviewing is used. The interviewer does not inform the person who answers the door of the selection criteria until after the respondent has been identified. In a few Middle East and Asian countries where cultural restrictions dictate gender matching, respondents are randomly selected using the Kish grid from among all eligible adults of the matching gender.

    TELEPHONE SURVEY DESIGN

    In countries where telephone interviewing is employed, random-digit-dial (RDD) or a nationally representative list of phone numbers is used. In select countries where cell phone penetration is high, a dual sampling frame is used. Random respondent selection is achieved by using either the latest birthday or Kish grid method. At least three attempts are made to reach a person in each household, spread over different days and times of day. Appointments for callbacks that fall within the survey data collection period are made.

    PANEL SURVEY DESIGN

    Prior to 2009, United States data were collected using The Gallup Panel. The Gallup Panel is a probability-based, nationally representative panel, for which all members are recruited via random-digit-dial methodology and is only used in the United States. Participants who elect to join the panel are committing to the completion of two to three surveys per month, with the typical survey lasting 10 to 15 minutes. The Gallup Worldwide Research panel survey is conducted over the telephone and takes approximately 30 minutes. No incentives are given to panel participants.

    Research instrument

    QUESTION DESIGN

    Many of the Worldwide Research questions are items that Gallup has used for years. When developing additional questions, Gallup employed its worldwide network of research and political scientists to better understand key issues with regard to question development and construction and data gathering. Hundreds of items were developed, tested, piloted, and finalized. The best questions were retained for the core questionnaire and organized into indexes. Most items have a simple dichotomous ("yes or no") response set to minimize contamination of data because of cultural differences in response styles and to facilitate cross-cultural comparisons.

    The Gallup Worldwide Research measures key indicators such as Law and Order, Food and Shelter, Job Creation, Migration, Financial Wellbeing, Personal Health, Civic Engagement, and Evaluative Wellbeing and demonstrates their correlations with world development indicators such as GDP and Brain Gain. These indicators assist leaders in understanding the broad context of national interests and establishing organization-specific correlations between leading indexes and lagging economic outcomes.

    Gallup organizes its core group of indicators into the Gallup World Path. The Path is an organizational conceptualization of the seven indexes and is not to be construed as a causal model. The individual indexes have many properties of a strong theoretical framework. A more in-depth description of the questions and Gallup indexes is included in the indexes section of this document. In addition to World Path indexes, Gallup Worldwide Research questions also measure opinions about national institutions, corruption, youth development, community basics, diversity, optimism, communications, religiosity, and numerous other topics. For many regions of the world, additional questions that are specific to that region or country are included in surveys. Region-specific questions have been developed for predominantly Muslim nations, former Soviet Union countries, the Balkans, sub-Saharan Africa, Latin America, China and India, South Asia, and Israel and the Palestinian Territories.

    The questionnaire is translated into the major conversational languages of each country. The translation process starts with an English, French, or Spanish version, depending on the region. One of two translation methods may be used.

    METHOD 1: Two independent translations are completed. An independent third party, with some knowledge of survey research methods, adjudicates the differences. A professional translator translates the final version back into the source language.

    METHOD 2: A translator

  18. WageIndicator Survey

    • dataverse.iza.org
    • datasets.iza.org
    zip
    Updated Jan 29, 2024
    Research Data Center of IZA (IDSC) (2024). WageIndicator Survey [Dataset]. http://doi.org/10.15185/wif.1
    Available download formats: zip (1429922892), zip (4109134)
    Dataset updated
    Jan 29, 2024
    Dataset provided by
    Research Data Center of IZA (IDSC)
    License

    https://www.iza.org/wc/dataverse/IIL-1.0.pdf

    Time period covered
    2000 - 2021
    Area covered
    Angola, Georgia, United States, Netherlands, South Africa, Ukraine, United Kingdom, Malawi, Mozambique, Italy
    Description

    The WageIndicator Survey is a continuous, multilingual, multi-country web-survey, conducted across 65 countries since 2000. The web-survey generates cross-sectional and longitudinal data, in particular about wages, benefits, working hours, working conditions and industrial relations. The survey has detailed questions about earnings, benefits, working conditions, employment contracts and training, as well as questions about education, occupation, industry and household characteristics.

    The WageIndicator Survey is a multilingual questionnaire and aims to collect information on wages and working conditions. As labour markets and wage setting processes vary across countries, country-specific translations have been favoured over literal translations. The WageIndicator Survey regularly includes extra survey questions for projects targeting specific countries, specific groups or specific events. These projects usually address a specific audience (employees of a company, employees in an industry, readers of a magazine, members of a trade union or an occupational association, and the like). The data of the project questions are included in the dataset.

    Bias:
    Non-probability web-based surveys are problematic because not every individual has the same probability of being selected into the survey. The probability of being selected depends on national or regional internet access rates and on the number of visitors accessing the website. Data from such surveys form a convenience sample rather than a probability sample. Due to the non-probability nature of the survey and its selectivity, the results cannot be generalized to the population of interest, i.e. the labour force. Comparisons with representative studies found an underrepresentation of the male labour force, part-timers, older age groups, and low educated persons. Besides other strategies to reduce the bias, WageIndicator provides different weighting schemes in order to correct for selection bias.

    Data Characteristics:
    The data is organised in annual releases. The data of the period 2000-2005 is released as one dataset. Each data release consists of a dataset with continuous variables and one with project variables. The continuous variables can be merged across years. All variable and value labels are in English. The data does not include the text variables and verbatims from open-ended survey questions; these are available in Excel format upon request.

    Spatial Coverage:
    The survey started in 2000 in the Netherlands. Since 2004, websites have been launched in many European countries, in North and South America and in countries in Asia. From 2008 on, websites have been launched in more African countries, as well as in Indonesia and in a number of post-Soviet countries. For each country, the questions have been translated. Multilingual countries employ multilingual questionnaires. Country-specific translations and locally accepted terminology have been favoured over literal translations.

  19. Verwijzing naar de data van: WageIndicator continuous web-survey on work and...

    • datacatalogue.cessda.eu
    • ssh.datastations.nl
    Updated Apr 11, 2023
    Tijdens; Stichting Loonwijzer (2023). Verwijzing naar de data van: WageIndicator continuous web-survey on work and wages 2000 - (ongoing) [Dataset]. http://doi.org/10.17026/dans-zpb-xqpv
    Dataset updated
    Apr 11, 2023
    Dataset provided by
    University of Amsterdam
    Amsterdam Institute for Advanced Labour Studies - AIAS
    K.G. and P.Osse
    Authors
    Tijdens; Stichting Loonwijzer
    Description

    The WageIndicator Survey is a continuous, multilingual, multi-country web-survey, conducted across 65 countries since 2000. The web-survey generates cross-sectional and longitudinal data, in particular about wages, benefits, working hours, working conditions and industrial relations.
    The survey has detailed questions about earnings, benefits, working conditions, employment contracts and training, as well as questions about education, occupation, industry and household characteristics.

    Research Focus:
    The WageIndicator Survey is a multilingual questionnaire and aims to collect information on wages and working conditions. As labour markets and wage setting processes vary across countries, country-specific translations have been favoured over literal translations. The WageIndicator Survey regularly includes extra survey questions for projects targeting specific countries, specific groups or specific events.
    These projects usually address a specific audience (employees of a company, employees in an industry, readers of a magazine, members of a trade union or an occupational association, and alike). The data of the project questions are included in the dataset.

    Sample:
    The target population of the WageIndicator is the labour force, that is, individuals in paid employment as well as job seekers. In addition to workers in formal dependent employment, the survey aims to include apprentices, employers, own-account workers, freelancers, workers in family businesses, workers in the informal sector, unemployed workers, job seekers, individuals who never had a job, as well as retired workers, housewives, school pupils or students with a job on the side, and persons performing voluntary work.
    The WageIndicator data is derived from a volunteer survey, inviting web visitors to the national WageIndicator websites to complete the web-survey. Annually, the websites receive millions of web visitors.

    Bias:
    Non-probability web-based surveys are problematic because not every individual has the same probability of being selected into the survey. The probability of being selected depends on national or regional internet access rates and on the number of visitors accessing the website. Data from such surveys form a convenience sample rather than a probability sample. Due to the non-probability nature of the survey and its selectivity, the results cannot be generalized to the population of interest, i.e. the labour force.
    Comparisons with representative studies found an underrepresentation of the male labour force, part-timers, older age groups, and low educated persons.
    Besides other strategies to reduce the bias, WageIndicator provides different weighting schemes in order to correct for selection bias.
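    The source does not describe the weighting schemes themselves. As a generic illustration of how weights can offset the kind of group underrepresentation noted above, here is a minimal post-stratification sketch. The group labels, counts and population shares are hypothetical and are not taken from the WageIndicator data.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share.

    Groups that are underrepresented in the sample get weights above 1.
    """
    n = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / n)
        for group, count in sample_counts.items()
    }


# Hypothetical example: men are 40% of respondents but 55% of the labour force.
sample_counts = {"men": 400, "women": 600}
population_shares = {"men": 0.55, "women": 0.45}

weights = poststratification_weights(sample_counts, population_shares)
weighted_total = sum(sample_counts[g] * weights[g] for g in sample_counts)
print(weights, weighted_total)
```

    Weights of this kind align the sample with known population margins, but they cannot correct for selection on characteristics that are not used in the weighting.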

    Data Characteristics:
    The data is organised in annual releases. The data of the period 2000-2005 is released as one dataset. Each data release consists of a dataset with continuous variables and one with project variables. The continuous variables can be merged across years. All variable and value labels are in English. The data does not include the text variables and verbatims from open-ended survey questions; these are available in Excel format upon request.

    Spatial Coverage:
    The survey started in 2000 in the Netherlands. Since 2004, websites have been launched in many European countries, in North and South America and in countries in Asia. From 2008 on web sites have been launched in more African countries, as well as in Indonesia and in a number of post-Soviet countries.
    For each country, the questions have been translated. Multilingual countries employ multilingual questionnaires. Country-specific translations and locally accepted terminology have been favoured over literal translations.

    Rights: Due to the confidential character of the WageIndicator microdata, direct access to the data is only provided by means of research contracts. Access is in principle restricted to universities and research institutes.


    Date: 2000 -

  20. Social Giving Survey 2003 - South Africa

    • datacatalog.ihsn.org
    • dev.ihsn.org
    • +2more
    Updated Mar 29, 2019
    Southern African Grantmakers’ Association (SAGA) (2019). Social Giving Survey 2003 - South Africa [Dataset]. https://datacatalog.ihsn.org/catalog/3358
    Dataset updated
    Mar 29, 2019
    Dataset provided by
    National Development Agency (NDA)
    Centre for Civil Society (CCS)
    Southern African Grantmakers’ Association (SAGA)
    Time period covered
    2003
    Area covered
    South Africa
    Description

    Abstract

    The State of Giving project, established by the Centre for Civil Society (CCS) at the University of KwaZulu-Natal (UKZN), the Southern African Grantmakers’ Association (SAGA) and the National Development Agency (NDA), was initiated to generate information on and analyse the resource flows to poverty alleviation and development in South Africa. One component of the broader project was a focus on individual-level giving, which involved the design, implementation and analysis of a national sample survey on individual level giving behaviour. The sample, a random stratified one comprising 3000 respondents, is representative of all South Africans aged 18 and above. It thus speaks to both the urban and rural and the formal and informal dimensions of our social context. The survey collected data on who gives, why and how much they give, as well as what they give and the recipients of their giving.

    Geographic coverage

    National coverage

    Analysis unit

    Units of analysis in the survey were households and individuals

    Universe

    The population of interest in the survey was all South Africans aged 18 and above.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A random stratified survey sample was drawn by Ross Jennings at S&T. The sample was stratified by race and province at the first level, and then by area (rural/urban/etc.) at the second level. The sample frame comprised 3000 respondents, yielding an error bar of 1.8%. The results are representative of all South Africans aged 18 and above, in all parts of the country, including formal and informal dwellings. Unlike many surveys, the project partners ensured that the rural component of the sample (commonly the most expensive for logistical reasons) was large and did not require heavy weighting (where a small number of respondents have to represent the views of a far larger community).
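    The quoted error bar of 1.8% is consistent with the standard margin-of-error formula for a proportion. The sketch below assumes a 95% confidence level, the worst-case proportion of 0.5 and simple random sampling with no design effect; the source does not state which assumptions were used.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion under simple random sampling.

    Assumptions (not stated in the source): 95% confidence (z = 1.96),
    worst-case proportion p = 0.5, no design effect.
    """
    return z * math.sqrt(p * (1 - p) / n)


moe = margin_of_error(3000)
print(f"{moe:.1%}")
```

    For a stratified, clustered design the effective margin depends on the design effect, so this figure is best read as an approximation.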

    Randomness was built into the selection of starting points (from which fieldworkers begin their work) - every 5th dwelling was selected, after a randomly selected starting point had been identified - and into the selection of respondents, where the birthday rule was applied. That is, a household roster was completed, all those aged 18 and above were listed, and the householder whose birthday came next was identified as the respondent. Three call-backs were undertaken to interview the selected respondent; if s/he was unavailable, the household was substituted.
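    The two selection steps described above, taking every 5th dwelling from a random start and applying the birthday rule within the household, can be sketched as follows. The roster layout, the function names and the choice to count a birthday falling on the interview date as "next" are assumptions for illustration, not details from the survey documentation.

```python
import random
from datetime import date


def select_dwellings(dwellings, interval=5, rng=None):
    """Systematic selection: random starting point, then every `interval`-th dwelling."""
    rng = rng or random.Random()
    start = rng.randrange(interval)
    return dwellings[start::interval]


def next_occurrence(month, day, today):
    """Next date on or after `today` on which this birthday falls."""
    for year in (today.year, today.year + 1):
        try:
            candidate = date(year, month, day)
        except ValueError:            # 29 February in a non-leap year
            candidate = date(year, 3, 1)
        if candidate >= today:
            return candidate


def next_birthday_respondent(roster, today, min_age=18):
    """Pick the eligible household member whose birthday comes next."""
    adults = [m for m in roster if m["age"] >= min_age]
    return min(
        adults,
        key=lambda m: next_occurrence(m["birth_month"], m["birth_day"], today),
    )


# Hypothetical household roster and interview date.
roster = [
    {"name": "A", "age": 34, "birth_month": 2, "birth_day": 10},
    {"name": "B", "age": 41, "birth_month": 11, "birth_day": 3},
    {"name": "C", "age": 12, "birth_month": 6, "birth_day": 2},
]
selected = select_dwellings(list(range(1, 21)), rng=random.Random(1))
respondent = next_birthday_respondent(roster, today=date(2003, 6, 1))
print(selected, respondent["name"])
```

    In this example the child C has the nearest birthday but is excluded by the age filter, so B is selected.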

    A second sample was drawn, specifically to boost the minority religious groups – namely Hindus, Jews and Muslims. They are separately analysed and reported as part of the broader project, since area sampling was used, disallowing us from incorporating them into the national survey dataset.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    A set of focus groups were staged across the country in order to inform questionnaire design. Groups were recruited across a range of criteria, including demographic and religious differences, in order to ensure a wide range of views were canvassed. Direct input from focus group participants informed a series of robust design sessions with all the project partners, from which a draft questionnaire was designed. The questionnaire was piloted in two provinces, involving urban and rural respondents and covering all four race groups. The pilot included testing specific questions, and the overall methodological approach, namely our ability to quantify giving. After the pilot results had been assessed, the questionnaire was revised before going into field.

    Data appraisal

    1. "0" values in some variables Many of the variables have a "0" value in addition to the values for responses, e.g. variables with yes/no responses are coded "0", "1", "2". There is no indication that the 0 represents "missing" (only Q75 specifies the use of "0" for none/nobody).

    2. Variable Q9 (Question 9) Q8 lists the number of resident children under the age of 18. Q9 refers to this question with: "of these children aged below 16 living in your household". This should probably be "aged below 18", in line with Q8. The data only reflects children under 16, so the question should probably have been "of these children, how many below the age of 16 are (Q9A) children of the head of the household and (Q9B) children not born to the head of household, i.e. children born to others". It seems, though, that Q8 and Q9 should match, with Q8 identifying children and Q9 identifying children of the household head. If specifying 16 rather than 18 in Q9 is an error, then this has been reflected in the data. This means that household members aged 17-18 are listed, but the data does not record whether they are children of the household head.

    3. Variable Q21 (Question 21) "What do you think is the most deserving cause that you support or would support if you could?" There are 14 values for Q21 (1-14). According to the report (Everatt, D. and G. Solanki. 2005. A Nation of givers: Social giving amongst South Africans), this and other open-ended questions were later categorised and given numeric codes. However, a codebook was not included with the documentation provided to DataFirst.

    4. Variable Q22 (Question 22) "Is there one cause or charity or organisation you would definitely NOT give money to?" There are 14 values for Q22 (1-14). Again, this requires a code list for explanation.

    5. Variable Q29 (Question 29) Q28 deals with the giving of goods/food/clothes. Q29 provides a breakdown of these items, and Q29L lists time/labour as one of these. It seems that Q29L is incorrectly listed as a sub-set of goods/food/clothes. Also, giving time to causes is dealt with extensively in Q30A-Q and Q31A-Q, so this variable seems out of place.

    6. Variable Q39 (Question 36) This concerns the giving of food, goods, or other forms of help to beggars/street children/people asking for help, but the question text does not specifically mention these forms of help, so can be misleading.

    7. Variable Q44 (Question 44) Q44 asks the respondent to complete the sentence "Help the poor because…." There are 8 values for this variable (0-7 and 11). Again, a code list is required to explain these values.

    8. Variable Q59 (Question 59) This question has three coded responses (1-3) so should have three values (or 4, with a "missing" value). There are 12 values for this variable, though (59A-59L). It is possible that this variable has been swopped with Q60. (However, Q60 only has 11 options in the questionnaire.)

    9. Variable Q60 (Question 60) The variable from this question only has 4 values, but there are 11 possible responses to this question (60A-60K). This variable could have been swopped with Q59. (In which case, the extra value needs explanation, as Q59 only has 11 options in the questionnaire.)

    10. Variables Q67 - Q82 From this point on the order of variables seems wrong, as the responses don't match the number of values listed in the questionnaire. The variables seem to refer to the next question along, e.g. Variable Q67 seems to have data emanating from Question 68, and so on. The data in the revised dataset has been corrected to reflect this.

    11. There is no variable Q83 in the dataset, although there is a question 83 in the questionnaire. This seems to support the above explanation. Data users are requested to provide any additional findings on this that come to light in their research.

Office For National Statistics (2023). Annual Population Survey Two-Year Longitudinal Dataset, January 2020 - December 2021 [Dataset]. http://doi.org/10.5255/ukda-sn-8984-4

Annual Population Survey Two-Year Longitudinal Dataset, January 2020 - December 2021

471 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
2023
Dataset provided by
DataCitehttps://www.datacite.org/
UK Data Servicehttps://ukdataservice.ac.uk/
Authors
Office For National Statistics
Description

The Annual Population Survey (APS) is a major survey series, which aims to provide data that can produce reliable estimates at local authority level. Key topics covered in the survey include education, employment, health and ethnicity. The APS comprises key variables from the Labour Force Survey (LFS), all its associated LFS boosts and the APS boost.

The APS allows for analysis to be carried out on detailed subgroups and below regional level. In recent years (particularly with the sample size of the LFS 5 quarter dataset reducing) there has been some interest in producing a two year APS longitudinal dataset to look at any trends that may occur over a year. The APS Two-Year Longitudinal Datasets, covering 2012/13 onwards, have been deposited as a result of this work. Person- and Household-level APS datasets are also available.

For further detailed information about methodology, users should consult the Labour Force Survey User Guide, included with the APS documentation.

Occupation data for 2021 and 2022
The ONS has identified an issue with the collection of some occupational data in 2021 and 2022 data files in a number of their surveys. While they estimate any impacts will be small overall, this will affect the accuracy of the breakdowns of some detailed (four-digit Standard Occupational Classification (SOC)) occupations, and data derived from them. None of ONS' headline statistics, other than those directly sourced from occupational data, are affected and you can continue to rely on their accuracy. Further information can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022

Latest edition information

For the fourth edition (September 2023), a new version of the data file with revised SOC variables was deposited. Further information on the SOC revisions can be found in the ONS article published on 11 July 2023: Revision of miscoded occupational data in the ONS Labour Force Survey, UK: January 2021 to September 2022.
