94 datasets found

High Frequency Survey 2021 - Ecuador
microdata.worldbank.org
catalog.ihsn.org
Updated Jan 20, 2023
Cite
UN Refugee Agency (UNHCR) (2023). High Frequency Survey 2021 - Ecuador [Dataset]. https://microdata.worldbank.org/index.php/catalog/5289
Dataset updated
Jan 20, 2023
Dataset provided by
United Nations High Commissioner for Refugees (http://www.unhcr.org/)
Authors
UN Refugee Agency (UNHCR)
Time period covered
2021
Area covered
Ecuador
Description
Abstract

The data was collected using the High Frequency Survey (HFS), the new regional data collection tool & methodology launched in the Americas. The survey allowed for better reaching populations of interest with new remote modalities (phone interviews and self-administered surveys online) and improved sampling guidance and strategies. It includes a set of standardized regional core questions while allowing for operation-specific customizations. The core questions revolve around populations of interest's demographic profile, difficulties during their journey, specific protection needs, access to documentation & regularization, health access, coverage of basic needs, coping capacity & negative mechanisms used, and well-being & local integration. The data collected has been used by countries in their protection monitoring analysis and vulnerability analysis.

Geographic coverage

Whole country

Analysis unit

Household

Universe

All people of concern.

Kind of data

Sample survey data [ssd]

Sampling procedure

In the absence of a well-developed sampling frame for forcibly displaced populations in the Americas, the High Frequency Survey employed a multi-frame sampling strategy where respondents entered the sample through one of three channels: (i) those who opted in to complete an online self-administered version of the questionnaire, which was widely circulated through refugee social media; (ii) persons identified through UNHCR and partner databases who were interviewed remotely by phone; and (iii) random selection from the cases approaching UNHCR for registration or assistance. The total sample size was 3,950 households. At the time of the survey, the population of concern was estimated at around 500,000 individuals.

Mode of data collection

Other [oth]

Research instrument

The questionnaire contained the following sections: journey, family composition, vulnerability, basic needs, coping capacity, well-being, and COVID-19 impact.
Data from: Assessing cetacean populations using integrated population...
data.niaid.nih.gov
datadryad.org
zip
Updated Mar 13, 2020
Cite
Eiren Jacobson; Charlotte Boyd; Tamara McGuire; Kim Shelden; Gina Himes Boor; André Punt (2020). Assessing cetacean populations using integrated population models: an example with Cook Inlet beluga whales [Dataset]. http://doi.org/10.5061/dryad.9zw3r229w
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.9zw3r229w
Dataset updated
Mar 13, 2020
Dataset provided by
National Oceanic and Atmospheric Administration
Montana State University
University of Washington
University of St Andrews
Cook Inlet Beluga Whale Photo ID Project-Alaska WildLife Alliance*
Authors
Eiren Jacobson; Charlotte Boyd; Tamara McGuire; Kim Shelden; Gina Himes Boor; André Punt
License
https://spdx.org/licenses/CC0-1.0.html
Area covered
Cook Inlet
Description
Effective conservation and management of animal populations requires knowledge of abundance and trends. For many species, these quantities are estimated using systematic visual surveys. Additional individual-level data are available for some species. Integrated population modelling (IPM) offers a mechanism for leveraging these datasets into a single estimation framework. IPMs that incorporate both population- and individual-level data have previously been developed for birds, but have rarely been applied to cetaceans. Here, we explore how IPMs can be used to improve the assessment of cetacean populations. We combined three types of data that are typically available for cetaceans of conservation concern: population-level visual survey data, individual-level capture-recapture data, and data on anthropogenic mortality. We used this IPM to estimate the population dynamics of the Cook Inlet population of beluga whales (CIBW; Delphinapterus leucas) as a case study. Our state-space IPM included a population process model and three observational submodels: 1) a group detection model to describe group size estimates from aerial survey data; 2) a capture-recapture model to describe individual photographic capture-recapture data; and 3) a Poisson regression model to describe historical hunting data. The IPM produces biologically plausible estimates of population trajectories consistent with all three datasets. The estimated population growth rate since 2000 is less than expected for a recovering population. The estimated juvenile/adult survival rate is also low compared to other cetacean populations, indicating that low survival may be impeding recovery. This work demonstrates the value of integrating various data sources to assess cetacean populations and serves as an example of how multiple, imperfect datasets can be combined to improve our understanding of a population of interest. 
The model framework is applicable to other cetacean populations and to other taxa for which similar data types are available.

Methods

/Data/CIBW_RSideCapHist_McGuire&Stephens.csv contains a matrix of right side capture histories (1 = captured, 0 = not captured) for each individual (rows) and year (columns). Photographic capture-recapture data were collected by Tamara McGuire. These data are made available here, without restriction, but anyone wishing to use these data is requested to contact tamaracookinletbeluga@gmail.com, who can provide further information on how raw data were processed to provide capture histories.

/Data/CIBW_HuntData_Mahoney&Shelden2000.xlsx contains the minimum documented number of animals killed (MinKilled) for years between 1950 and 1998 as published in Mahoney and Shelden 2000. Entries which are NA indicate that no data were available for that year.

/Data/CIBW_Abundance_HobbsEtAl2015.xlsx contains the total group size estimates from Hobbs et al. 2015.

/Data/CIBW_Abundance_BoydEtAl2019.txt contains an array with dimensions [1:1000, 1:8, 1:11] containing 1000 posterior samples of total group size for up to 8 survey days over 11 years, as described in Boyd et al. 2019.
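
As an illustration of the capture-history layout described above (individuals in rows, years in columns, 1 = captured), the following minimal Python sketch summarizes such a matrix. The small matrix and the function name are hypothetical and made up for the example; they are not taken from the actual CSV file.

```python
import numpy as np

# Hypothetical capture-history matrix: 4 individuals (rows) x 5 years (columns).
# 1 = captured (photographed) in that year, 0 = not captured.
capture_histories = np.array([
    [1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
])


def summarize_capture_histories(histories):
    """Return per-year and per-individual capture counts and first-capture index."""
    histories = np.asarray(histories)
    captured_per_year = histories.sum(axis=0)        # individuals seen each year
    captures_per_individual = histories.sum(axis=1)  # years each individual was seen
    # argmax returns the position of the first 1 in each row; this assumes every
    # individual was captured at least once (an all-zero row would give 0).
    first_capture_year = histories.argmax(axis=1)
    return captured_per_year, captures_per_individual, first_capture_year


per_year, per_individual, first_seen = summarize_capture_histories(capture_histories)
print("captured per year:", per_year.tolist())
print("captures per individual:", per_individual.tolist())
print("first capture (0-based year index):", first_seen.tolist())
```

The same axis-wise summaries would apply to the real file once it is loaded into an array, assuming any identifier column is separated from the 0/1 columns first.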
NYSERDA Low- to Moderate-Income New York State Census Population Analysis...
catalog.data.gov
datasets.ai
Updated Nov 29, 2021
Cite
data.ny.gov (2021). NYSERDA Low- to Moderate-Income New York State Census Population Analysis Dataset: Average for 2013-2015 [Dataset]. https://catalog.data.gov/dataset/nyserda-low-to-moderate-income-new-york-state-census-population-analysis-dataset-aver-2013
Dataset updated
Nov 29, 2021
Dataset provided by
data.ny.gov
Area covered
New York
Description
How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.

The Low- to Moderate-Income (LMI) New York State (NYS) Census Population Analysis dataset results from the LMI market database designed by APPRISE as part of the NYSERDA LMI Market Characterization Study (https://www.nyserda.ny.gov/lmi-tool). All data are derived from the U.S. Census Bureau’s American Community Survey (ACS) 1-year Public Use Microdata Sample (PUMS) files for 2013, 2014, and 2015. Each row in the LMI dataset is an individual record for a household that responded to the survey and each column is a variable of interest for analyzing the low- to moderate-income population. The LMI dataset includes: county/county group, households with elderly, households with children, economic development region, income groups, percent of poverty level, low- to moderate-income groups, household type, non-elderly disabled indicator, race/ethnicity, linguistic isolation, housing unit type, owner-renter status, main heating fuel type, home energy payment method, housing vintage, LMI study region, LMI population segment, mortgage indicator, time in home, head of household education level, head of household age, and household weight.

The LMI NYS Census Population Analysis dataset is intended for users who want to explore the underlying data that supports the LMI Analysis Tool. The majority of those interested in LMI statistics and generating custom charts should use the interactive LMI Analysis Tool at https://www.nyserda.ny.gov/lmi-tool. This underlying LMI dataset is intended for users with experience working with survey data files and producing weighted survey estimates using statistical software packages (such as SAS, SPSS, or Stata).
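
To make "weighted survey estimates" concrete, here is a minimal Python sketch of a weighted share computed from household records. The field names (`owner_renter`, `household_weight`) and the three records are hypothetical stand-ins, not the dataset's actual column names or values, and the sketch produces a point estimate only, without standard errors.

```python
def weighted_share(records, weight_key, predicate):
    """Weighted proportion of records that satisfy `predicate`."""
    total_weight = sum(r[weight_key] for r in records)
    matching_weight = sum(r[weight_key] for r in records if predicate(r))
    return matching_weight / total_weight


# Hypothetical household records; each weight is the number of households
# in the population that the sampled household represents.
households = [
    {"owner_renter": "owner", "household_weight": 120.0},
    {"owner_renter": "renter", "household_weight": 80.0},
    {"owner_renter": "renter", "household_weight": 200.0},
]

renter_share = weighted_share(
    households, "household_weight", lambda r: r["owner_renter"] == "renter"
)
print(f"weighted renter share: {renter_share:.2f}")
```

An unweighted count would put renters at two of three records; the weights shift the estimate to 280 of 400 represented households.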
High Frequency Survey 2021 - Peru
microdata.worldbank.org
datacatalog.ihsn.org
Updated Dec 16, 2022
Cite
UN Refugee Agency (UNHCR) (2022). High Frequency Survey 2021 - Peru [Dataset]. https://microdata.worldbank.org/index.php/catalog/5317
Dataset updated
Dec 16, 2022
Dataset provided by
United Nations High Commissioner for Refugees (http://www.unhcr.org/)
Authors
UN Refugee Agency (UNHCR)
Time period covered
2021
Area covered
Peru
Description
Abstract

The data was collected using the High Frequency Survey (HFS), the new regional data collection tool & methodology launched in the Americas. The survey allowed for better reaching populations of interest with new remote modalities (phone interviews and self-administered surveys online) and improved sampling guidance and strategies. It includes a set of standardized regional core questions while allowing for operation-specific customizations. The core questions revolve around populations of interest's demographic profile, difficulties during their journey, specific protection needs, access to documentation & regularization, health access, coverage of basic needs, coping capacity & negative mechanisms used, and well-being & local integration. The data collected has been used by countries in their protection monitoring analysis and vulnerability analysis.

Geographic coverage

Whole country

Analysis unit

Household

Universe

All people of concern.

Kind of data

Sample survey data [ssd]

Sampling procedure

In the absence of a well-developed sampling frame for forcibly displaced populations in the Americas, the High Frequency Survey employed a multi-frame sampling strategy where respondents entered the sample through one of three channels: (i) those who opted in to complete an online self-administered version of the questionnaire, which was widely circulated through refugee social media; (ii) persons identified through UNHCR and partner databases who were interviewed remotely by phone; and (iii) random selection from the cases approaching UNHCR for registration or assistance. The total sample size was 3,343 households. At the time of the survey, the population of concern was estimated at around 1,600,000 individuals.

Mode of data collection

Other [oth]

Research instrument

The questionnaire contained the following sections: journey, family composition, vulnerability, basic needs, coping capacity, well-being, and COVID-19 impact.
National Sustainable Development Plan Baseline Survey 2019, Household Income...
microdata.pacificdata.org
Updated Oct 9, 2020
Cite
Vanuatu National Statistics Office (2020). National Sustainable Development Plan Baseline Survey 2019, Household Income and Expenditure Survey 2019 - Vanuatu [Dataset]. https://microdata.pacificdata.org/index.php/catalog/742
Dataset updated
Oct 9, 2020
Dataset authored and provided by
Vanuatu National Statistics Office
Time period covered
2019 - 2020
Area covered
Vanuatu
Description
Abstract

The National Sustainable Development Plan (NSDP) Baseline Survey 2019 is an expanded Household Income and Expenditure Survey (HIES) that includes health, educational, cultural, and productive dimensions previously uncollected or in need of updating. The results of this survey will directly inform more than 30 key indicators listed in the NSDP M&E (Monitoring and Evaluation) Framework, as well as more than 40 of the listed indicators for the United Nations Sustainable Development Goals (SDGs). The NSDP Baseline Survey also presents an opportunity for Vanuatu to establish a comprehensive Melanesian Wellbeing baseline, as well as an updated baseline for the calculation of the Consumer Price Index (CPI) and for revising National Accounts.

Geographic coverage

National coverage. Below are the details of this national coverage: 1. National (Vanuatu); 2. Provinces (Torba, Sanma, Penama, Malampa, Shefa, Tafea); 4. Area Councils (Torres Area council right to Futuna & Aneityum Area Council); 5. Villages / Towns; 6. Urban/Rural.

Analysis unit

Household and Individual.

Universe

All de jure residents.

Kind of data

Sample survey data [ssd]

Sampling procedure

The sample size for this survey was determined using the outputs of the previous 2010 Household Income and Expenditure Survey (HIES), especially the per capita monthly total expenditure. From the 2010 HIES, the mean, standard deviation and standard error of per capita expenditure were computed, and the distribution of the population across the 6 provinces of Vanuatu from the 2016 Census was used as a base. According to the accuracy of this variable of interest within each province, the sample size per province was adjusted in order to get an expected sampling error of around 5% within each province. The sampling frame is the 2016 Vanuatu census, which was used to compute the probability of selection of the Enumeration Areas (EAs). Selection started with a random selection of EAs with probability proportional to size. Then, within each selected EA, 10 households were randomly selected with uniform probability. Within each selected EA, the household listing was updated by the team before random selection and interview.
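
The two-stage selection described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the description does not say which probability-proportional-to-size (PPS) procedure was used, so systematic PPS is swapped in here as one common choice, and the EA sizes, seed, and function names are hypothetical.

```python
import random


def pps_systematic(sizes, n_select, rng):
    """Select n_select unit indices with probability proportional to size,
    using systematic sampling along the cumulative size scale."""
    total = sum(sizes)
    step = total / n_select
    start = rng.random() * step  # random start in [0, step)
    points = [start + k * step for k in range(n_select)]
    selected, cumulative, next_point = [], 0.0, 0
    for index, size in enumerate(sizes):
        cumulative += size
        # A unit is selected once for every point that falls inside its interval,
        # so a unit larger than `step` can be selected more than once.
        while next_point < n_select and points[next_point] < cumulative:
            selected.append(index)
            next_point += 1
    return selected


def select_households(household_ids, k, rng):
    """Second stage: simple random sample of k households within one EA."""
    return rng.sample(household_ids, k)


rng = random.Random(2019)             # fixed seed so the sketch is repeatable
ea_sizes = [120, 80, 200, 50, 150]    # hypothetical household counts per EA
chosen_eas = pps_systematic(ea_sizes, 2, rng)
sample = {
    ea: select_households(list(range(ea_sizes[ea])), 10, rng) for ea in chosen_eas
}
for ea, households in sample.items():
    print(f"EA {ea}: households {sorted(households)}")
```

Larger EAs cover a longer stretch of the cumulative scale and are therefore more likely to contain a selection point, which is what makes the first stage proportional to size.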

i) The only variable considered is per capita total household expenditure (the variable of interest): in addition to being one of the main indicators derived from the Household Income and Expenditure Survey (HIES), it is likely highly correlated with many other variables of interest (e.g. poverty). From the 2010 HIES dataset, using this variable of interest, a list of relevant indicators was calculated. Those indicators provide information on:
- (a) the status of the household expenditure distribution within each province
- (b) the efficiency provided by the 2010 HIES sample design
- (c) the accuracy of the estimates calculated from the 2010 HIES dataset (especially the per capita household expenditure, our variable of interest)

ii) The original dataset has been trimmed using the variable of interest, the lowest and the highest percentiles (the 1% households with the lowest and highest per capita total household expenditure) were removed from the analysis (outliers). The dataset ends up with 4,289 households (given 4,377 households were completed).

iii) The 2010 Vanuatu HIES sample was based on a stratified multi stages selection - Stratification: geographical provinces (by urban / rural locations) - First stage of selection: Enumerations Areas (EAs) with probability of selection proportional to size - Second stage: households, with uniform probability of selection within the EAs

iv) The mean and standard deviation indicate the status of the variable of interest within each strata. The intracluster correlation (p), and the design effect (DEFF) highlight the efficiency of the sampling strategy, and the standard error/relative standard error (SE/RSE) of the variable of interest show its accuracy.

v) The purpose of this analysis is to get some insights from the 2010 HIES sample design in order to improve the 2019 survey. There is no point to improve the sample size in strata where the sample is not efficient (the gain in accuracy will be minor compared to the related cost).

vi) The challenge in the 2019 Vanuatu baseline survey: - Meet precision targets in each strata (provincial level) including Penama where Ambae island has been evacuated at the time of the sample design. - Acceptable sample size (due to budget constraints) - Following international recommendations (12 months of field operation) - Enhance the monitoring and supervision of the field staff and simplify management of the logistics in the field

==> Optimize the variance/cost ratio of the survey design

vii) Table 1 from the Document Sample Design (provided as External Resources) presents the Vanuatu 2010 HIES survey specifications, efficiency and accuracy in each strata (for the variable of interest). It shows that some improvements can be made in Torba and Shefa rural (where the RSE is higher than 5%), and it shows a high intraclass correlation in Malampa, Shefa rural and Tafea (that leads to a high design effect in those strata). In Torba, the high design effect comes from the high number of households interviewed in each selected EA (on average 33 households per selected EA in this strata were interviewed).
- Torba: the sample size is good, there is just a need to reduce the number of households to interview within each strata (and in order to keep a similar sample size the number of EAs to select in the province will be increased)
- Malampa: given the high intracluster correlation in this province, a higher number of EAs to select is required (with the same number of households per EA to interview)
- Shefa rural: keep the same number of households to interview within each EA, and increase the number of EAs to select (this will lead to a higher sample size)
- Tafea: similar to Malampa province, the high intraclass correlation indicates that the number of EAs to select has to be increased (therefore the sample size as well)
The sample size has to be increased in Malampa, Shefa rural and Tafea; for the rest, the 2019 design will have to be similar to 2010 (in order to provide at least the same level of accuracy).

viii) The 2019 Vanuatu baseline survey follows the international recommendations in terms of data collection schedule (12-month coverage) and considers a better management and supervision of the field staff. In this context, the field staff will work in teams, given that:
- A team is made of 1 supervisor (team leader) and 2 or 3 interviewers
- Each interviewer will be responsible for 5 interviews per round
- A round of survey is a 1-week period
- 1 EA is covered during 1 round; after the round completion, the team moves to the next EA for the next round
- A team completes 32 rounds during the 12-month field operation period (roughly every 2 rounds/2 weeks of work is followed by 1 round/1 week of rest)

ix) Table 3 from the Document Sample Design (provided as External Resources) presents a survey schedule starting February 2019 and ending February 2020. During this period of 32 working weeks (corresponding to 32 different selected EAs) the teams will be in the field (with a 3-week rest period during Christmas).

x) The number of interviewer by team and number of team by province will determine the total sample size within each province. A team made of 3 interviewers can achieve 480 households over the period, while a team of 2 interviewers can achieve only 320 cases.

xi) The intraclass correlation is used to calculate the precision loss due to clustering. Like the standard deviation, the intracluster correlation is considered to be a true population parameter, and therefore transferable between designs. We have to accept the hypothesis that this correlation factor has not changed during the period 2010-2019, and therefore can be used to predict DEFF and RSE for the next survey given an adjusted design (based on the conclusions provided by the 2010 design). Table 2 from the Document Sample Design (provided as External Resources) predicts the design effect and sampling error of the variable of interest given the new sample design that is based on: - the sample size within each strata - the number of teams within each strata - the number of interviewers per team In order to allow more flexibility in the sample size, it is preferable to set up some teams of 3 interviewers, that can achieve 480 households, which represent a good sample size for Torba and Sanma urban and some teams of 2 interviewers that will achieve 320 households each (2 teams will be required in other provinces).

xii) The proposed design in Table 2 from the Document Sample Design (provided as External Resources) shows a total sample size of 4,640 households and a higher level of accuracy of the estimate of the variable of interest in all strata. Only Shefa rural shows an RSE higher than 5%, which is still acceptable. The high intraclass correlation in Shefa rural inflates the variance of the estimates; bringing the RSE down would require an increase in the sample size or a decrease in the number of households to interview per EA, which is logistically and financially not recommended.
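
The arithmetic behind the team workloads and the predicted design effect can be sketched as below. The team capacity follows directly from the figures given above (5 interviews per interviewer per round, 32 rounds). The design-effect and RSE formulas are an assumption: the description does not state which formulas were used, so the common Kish approximation DEFF = 1 + (m - 1) * rho is shown, with hypothetical values for the intraclass correlation, mean, and standard deviation.

```python
import math

INTERVIEWS_PER_ROUND = 5  # per interviewer, as stated in the field plan
ROUNDS_PER_TEAM = 32      # rounds completed over the 12-month field period


def team_capacity(interviewers):
    """Households one team can complete over the whole field period."""
    return interviewers * INTERVIEWS_PER_ROUND * ROUNDS_PER_TEAM


def design_effect(households_per_ea, rho):
    """Kish approximation (assumed here): variance inflation due to clustering."""
    return 1.0 + (households_per_ea - 1) * rho


def relative_standard_error(mean, sd, n, deff):
    """RSE of a mean under a clustered design, as a proportion of the mean."""
    return math.sqrt(deff) * sd / (mean * math.sqrt(n))


print(team_capacity(3), team_capacity(2))  # 480 and 320 households

# Hypothetical rho = 0.1: interviewing 33 households per EA (the Torba average
# in 2010) inflates the variance far more than interviewing 10 per EA.
print(round(design_effect(33, 0.1), 2), round(design_effect(10, 0.1), 2))

# Hypothetical stratum: mean 100, sd 50, 400 households, design effect 4.
print(relative_standard_error(mean=100, sd=50, n=400, deff=4))
```

Under this approximation, spreading the same number of interviews over more EAs with fewer households each lowers the design effect, which is consistent with the adjustments proposed for Torba.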

Mode of data collection

Computer Assisted Personal Interview [capi]

Research instrument

The questionnaire was developed in English using the World Bank software Survey Solutions. This questionnaire is divided into 18 modules that are detailed below.

-Introduction (geographic areas, list of household members) -Module 1: Demographic characteristics: ethnicity, marital status; -Module 2: Wellbeing: culture
Consumer Expenditure Survey (CE)
dataverse.harvard.edu
Updated May 30, 2013
Cite
Anthony Damico (2013). Consumer Expenditure Survey (CE) [Dataset]. http://doi.org/10.7910/DVN/UTNJAH
Unique identifier
https://doi.org/10.7910/DVN/UTNJAH
Dataset updated
May 30, 2013
Dataset provided by
Harvard Dataverse
Authors
Anthony Damico
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
analyze the consumer expenditure survey (ce) with r

the consumer expenditure survey (ce) is the primo data source to understand how americans spend money. participating households keep a running diary about every little purchase over the year. those diaries are then summed up into precise expenditure categories. how else are you gonna know that the average american household spent $34 (±2) on bacon, $826 (±17) on cellular phones, and $13 (±2) on digital e-readers in 2011? an integral component of the market basket calculation in the consumer price index, this survey recently became available as public-use microdata and they're slowly releasing historical files back to 1996. hooray!

for a taste of what's possible with ce data, look at the quick tables listed on their main page - these tables contain approximately a bazillion different expenditure categories broken down by demographic groups. guess what? i just learned that americans living in households with $5,000 to $9,999 of annual income spent an average of $283 (±90) on pets, toys, hobbies, and playground equipment (pdf page 3). you can often get close to your statistic of interest from these web tables. but say you wanted to look at domestic pet expenditure among only households with children between 12 and 17 years old. another one of the thirteen web tables - the consumer unit composition table - shows a few different breakouts of households with kids, but none matching that exact population of interest. the bureau of labor statistics (bls) (the survey's designers) and the census bureau (the survey's administrators) have provided plenty of the major statistics and breakouts for you, but they're not psychic. if you want to comb through this data for specific expenditure categories broken out by a you-defined segment of the united states' population, then let a little r into your life. fun starts now.

fair warning: only analyze the consumer expenditure survey if you are nerd to the core. the microdata ship with two different survey types (interview and diary), each containing five or six quarterly table formats that need to be stacked, merged, and manipulated prior to a methodologically-correct analysis. the scripts in this repository contain examples to prepare 'em all, just be advised that magnificent data like this will never be no-assembly-required. the folks at bls have posted an excellent summary of what's available - read it before anything else. after that, read the getting started guide. don't skim. a few of the descriptions below refer to sas programs provided by the bureau of labor statistics. you'll find these in the C:\My Directory\CES\2011\docs directory after you run the download program.

this new github repository contains three scripts:

2010-2011 - download all microdata.R
- loop through every year and download every file hosted on the bls's ce ftp site
- import each of the comma-separated value files into r with read.csv
- depending on user-settings, save each table as an r data file (.rda) or stata-readable file (.dta)

2011 fmly intrvw - analysis examples.R
- load the r data files (.rda) necessary to create the 'fmly' table shown in the ce macros program documentation.doc file
- construct that 'fmly' table, using five quarters of interviews (q1 2011 thru q1 2012)
- initiate a replicate-weighted survey design object
- perform some lovely li'l analysis examples
- replicate the %mean_variance() macro found in "ce macros.sas" and provide some examples of calculating descriptive statistics using unimputed variables
- replicate the %compare_groups() macro found in "ce macros.sas" and provide some examples of performing t-tests using unimputed variables
- create an rsqlite database (to minimize ram usage) containing the five imputed variable files, after identifying which variables were imputed based on pdf page 3 of the user's guide to income imputation
- initiate a replicate-weighted, database-backed, multiply-imputed survey design object
- perform a few additional analyses that highlight the modified syntax required for multiply-imputed survey designs
- replicate the %mean_variance() macro found in "ce macros.sas" and provide some examples of calculating descriptive statistics using imputed variables
- replicate the %compare_groups() macro found in "ce macros.sas" and provide some examples of performing t-tests using imputed variables
- replicate the %proc_reg() and %proc_logistic() macros found in "ce macros.sas" and provide some examples of regressions and logistic regressions using both unimputed and imputed variables

replicate integrated mean and se.R
- match each step in the bls-provided sas program "integrated mean and se.sas" but with r instead of sas
- create an rsqlite database when the expenditure table gets too large for older computers to handle in ram
- export a table "2011 integrated mean and se.csv" that exactly matches the contents of the sas-produced "2011 integrated mean and se.lst" text file

click here to view these three scripts for...
2010 American Community Survey: B19054 | INTEREST, DIVIDENDS, OR NET RENTAL...
data.census.gov
Cite
ACS, 2010 American Community Survey: B19054 | INTEREST, DIVIDENDS, OR NET RENTAL INCOME IN THE PAST 12 MONTHS FOR HOUSEHOLDS (ACS 5-Year Estimates Selected Population Detailed Tables) [Dataset]. https://data.census.gov/table/ACSDT5YSPT2010.B19054?q=American+Trailer+Rentals+Inc
Dataset provided by
United States Census Bureau (http://census.gov/)
Authors
ACS
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Time period covered
2010
Description
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.

Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.

Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, for 2010, the 2010 Census provides the official counts of the population and housing units for the nation, states, counties, cities and towns. For 2006 to 2009, the Population Estimates Program provides intercensal estimates of the population for the nation, states, and counties.

Explanation of Symbols:

An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
An '(X)' means that the estimate is not applicable or not available.

Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2000 data. Boundaries for urban areas have not been updated since Census 2000. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.

While the 2006-2010 American Community Survey (ACS) data generally reflect the December 2009 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas; in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.

Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.

Source: U.S. Census Bureau, 2006-2010 American Community Survey
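
The margin-of-error interpretation above can be worked through in a short Python sketch. The estimate and margin of error below are hypothetical, not values from table B19054. The z-values 1.645 (90 percent) and 1.960 (95 percent) are the standard normal multipliers; the function names are illustrative.

```python
Z_90 = 1.645  # normal multiplier behind the published 90 percent margin of error
Z_95 = 1.960  # normal multiplier for a 95 percent margin of error


def standard_error_from_moe(moe_90):
    """Recover the standard error from a 90 percent margin of error."""
    return moe_90 / Z_90


def confidence_bounds(estimate, moe):
    """Lower and upper confidence bounds: estimate minus/plus the margin of error."""
    return estimate - moe, estimate + moe


def moe_at_95(moe_90):
    """Rescale a 90 percent margin of error to the 95 percent level."""
    return standard_error_from_moe(moe_90) * Z_95


estimate, moe_90 = 1000.0, 164.5  # hypothetical household count and its MOE
lower, upper = confidence_bounds(estimate, moe_90)
print(f"standard error: {standard_error_from_moe(moe_90):.1f}")
print(f"90% bounds: {lower:.1f} to {upper:.1f}")
print(f"95% margin of error: {moe_at_95(moe_90):.1f}")
```

These bounds reflect sampling variability only; as noted above, nonsampling error is not represented in the published margins.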
Data from: Target Population Statistical Inference With Data Integration...
tandf.figshare.com
txt
Updated Feb 12, 2024
Cite
Xihao Li; Yang Song (2024). Target Population Statistical Inference With Data Integration Across Multiple Sources—An Approach to Mitigate Information Shortage in Rare Disease Clinical Trials [Dataset]. http://doi.org/10.6084/m9.figshare.9594392.v2
Explore at:
txt (available download formats)
Unique identifier
https://doi.org/10.6084/m9.figshare.9594392.v2
Dataset updated
Feb 12, 2024
Dataset provided by
Taylor & Francis
Authors
Xihao Li; Yang Song
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A major challenge for rare disease clinical trials is the limited amount of available information for making robust statistical inference. While external data present information integration opportunities to enhance statistical inference, conventional data combining methods, for example, meta-analysis, usually do not adequately address study population differences. Matching methods, on the other hand, directly account for population characteristics but often lead to inefficient use of data by underutilizing unmatched data points. Aiming at a better bias-variance tradeoff, we propose an intuitive integrated inference framework to borrow information from all relevant data sources and make inference on the response of interest over a target population precisely characterized by the joint distribution of baseline covariates. The method is easily implemented and can be complemented by modern statistical learning or machine learning tools. Statistical inference is facilitated by the bootstrap. We argue that the integrated inference framework not only provides an intuitive and coherent perspective for a variety of clinical trial inference problems but also has broad application areas in clinical trial settings and beyond, as a quantitative data integration tool for making robust inference in a target population precise manner for policy and decision makers.
Annual Survey of Refugees 2018 - United States
catalog.ihsn.org
datacatalog.ihsn.org
Updated Dec 22, 2022
Cite
Office of Refugee Resettlement (2022). Annual Survey of Refugees 2018 - United States [Dataset]. https://catalog.ihsn.org/catalog/10755
Explore at:
Dataset updated
Dec 22, 2022
Dataset provided by
Office of Refugee Resettlement
Urban Institute (Contractor)
Time period covered
2019
Area covered
United States
Description
Abstract

Since the 1980s, the Office of Refugee Resettlement (ORR) has conducted the Annual Survey of Refugees (ASR), which collects information on refugees during their first five years after arrival in the U.S. The ASR is the only scientifically-collected source of national data on refugees’ progress toward self-sufficiency and integration. ORR uses the ASR results alongside other information sources to fulfill its Congressionally-mandated reporting following the Refugee Act of 1980. Historically, the microdata from these surveys have generally been unavailable to researchers.

In the spring of 2019, ORR completed its 52nd Annual Survey of Refugees (ASR). The data from the ASR offer a window into respondents’ first five years in the United States and show the progress that refugee families made towards learning English, participating in the workforce, and establishing permanent residence.

Geographic coverage

National coverage

Analysis unit

Households and individuals

Universe

Refugees aged 16 years old or over at the time of interview who arrived in the U.S. during FY 2013-2017 inclusive.

While this covers five distinct fiscal years of refugee entrants, there is special policy/analytic interest in collapsing years into three domains as follows:

• Cohort 1 – Refugees entering FY 2013 and FY 2014
• Cohort 2 – Refugees entering FY 2015 and FY 2016
• Cohort 3 – Refugees entering FY 2017

Kind of data

Sample survey data [ssd]

Sampling procedure

The 2018 ASR employed a stratified probability sample design of refugees. The first stage of selection was the household (PA) and the second stage was the selection of persons within households. Principal features of the sample design are highlighted below.

The 2018 ASR design replicated the 2017 and 2016 ASR design, which used a full cross-sectional national sample of refugees entering within the past five years. This section documents the research design, data collection and data processing protocols. It also presents outcomes (e.g., sample sizes) and paradata results such as response rates.

The population of interest - the study population - for the 2018 ASR is defined as refugees entering the U.S. between FY 2013 and FY 2017, inclusive, who are at ages 16 and over at the time of the 2018 ASR interview. Because the interviews were conducted in early 2019, the population includes a small number of refugee respondents younger than 16 at the time of arrival to the U.S.

The 2018 ASR targeted 1,500 completed interviews from refugee households entering the U.S. between FY 2013-2017. The sample was designed to allow for separate estimates and analyses from each of the three designated cohorts. Moreover, the design needed to accommodate both household- and person-level analyses. The sample was drawn as fresh cross sections by cohort; there was no longitudinal component. The survey objectives required that - in addition to primary stratification by cohort - the sample of households (i.e., PAs) be stratified at least by year of entry and geographic region of origin.

The 2018 ASR sampling frame was ORR's Refugee Arrivals Data System (RADS) dataset.

Within each of the three cohort strata, the following factors were used for stratification: year of arrival (for cohorts 1 and 2 only), geographic region, native language, age group, gender, and family size at arrival (1, 2, 3+ persons). Missing contact information status was also used as a stratification variable for cohort 3 due to an unusual degree of missing contact information among FY 2017 arrivals. Proportionate stratified samples were drawn independently within cohort.
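A minimal sketch of proportionate stratified sampling as described above: each stratum receives a share of the sample equal to its share of the frame, and units are drawn at random within strata. The frame, the stratum labels, and the sample size are hypothetical; they do not reflect the RADS data or the selection software actually used for the ASR.

```python
import random
from collections import defaultdict


def proportionate_stratified_sample(frame, stratum_of, n_total, seed=0):
    """Draw a sample whose per-stratum size is proportional to the stratum's frame size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for label in sorted(strata):
        units = strata[label]
        # Proportional share, rounded; rounding can make the total differ slightly from n_total.
        n_h = round(n_total * len(units) / len(frame))
        sample.extend(rng.sample(units, min(n_h, len(units))))
    return sample


# Hypothetical frame of (household id, region of origin) pairs.
frame = [(i, "Region A") for i in range(600)] + [(i, "Region B") for i in range(600, 1000)]
sample = proportionate_stratified_sample(frame, stratum_of=lambda u: u[1], n_total=100)
counts = {r: sum(1 for _, reg in sample if reg == r) for r in ("Region A", "Region B")}
print(counts)
```

In a design like the one described, the same routine would be run independently within each cohort, with the stratum label built from the combined stratification variables.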

Sampling deviation

The 2018 ASR employed a sample management plan integrating the sample design and field protocols to include locating subjects, contacting them and conducting telephone interviews. A sample of 3,050 PAs was released at the start of data collection. A reserve sample of about 6,100 was held in case some portion was needed to meet the interview target of 1,500.

Mode of data collection

Telephone interview

Response rate

An overall response rate of 21 percent was achieved. The response rate was driven by the ability to locate and speak to (1,514 + 510) / 7,315 = 28 percent of the sample, meaning that more than two thirds of the sample (about 72 percent) could neither be located nor, if located, successfully contacted.

The overall response rates decreased with time since arrival to the U.S., varying from 17 percent for FY 2013-14 refugees to 23 percent for FY 2015-16 refugees and a high of 25 percent for FY 2017 refugees.
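The rates quoted above can be reproduced with simple arithmetic. The sketch assumes that 1,514 is the number of completed interviews and 510 the number of cases that were located but did not complete one; the source gives the figures without labelling them, so that reading is an assumption.

```python
# Figures from the response-rate paragraph; the labels are an assumed interpretation.
completed = 1_514            # assumed: completed interviews
located_no_interview = 510   # assumed: located or contacted, but no completed interview
released_sample = 7_315

contact_rate = (completed + located_no_interview) / released_sample
response_rate = completed / released_sample

print(f"located and spoken to: {contact_rate:.0%}")
print(f"overall response rate: {response_rate:.0%}")
print(f"not located or contacted: {1 - contact_rate:.0%}")
```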
High Frequency Survey 2020, Quarter 4 - Panama
microdata.worldbank.org
catalog.ihsn.org
Updated May 5, 2022
Cite
UN Refugee Agency (UNHCR) (2022). High Frequency Survey 2020, Quarter 4 - Panama [Dataset]. https://microdata.worldbank.org/index.php/catalog/4470
Explore at:
Dataset updated
May 5, 2022
Dataset provided by
United Nations High Commissioner for Refugees http://www.unhcr.org/
Authors
UN Refugee Agency (UNHCR)
Time period covered
2020
Area covered
Panama
Description
Abstract

The data was collected using the High Frequency Survey (HFS), the new regional data collection tool & methodology launched in the Americas. The survey allowed for better reaching populations of interest with new remote modalities (phone interviews and self-administered surveys online) and improved sampling guidance and strategies. It includes a set of standardized regional core questions while allowing for operation-specific customizations. The core questions revolve around populations of interest’s demographic profile, difficulties during their journey, specific protection needs, access to documentation & regularization, health access, coverage of basic needs, coping capacity & negative mechanisms used, and well-being & local integration. The data collected has been used by countries in their protection monitoring analysis and vulnerability analysis.

Geographic coverage

National coverage

Analysis unit

Household

Universe

All people of concern.

Kind of data

Sample survey data [ssd]

Sampling procedure

In the absence of a well-developed sampling-frame for forcibly displaced populations in the Americas, the High Frequency Survey employed a multi-frame sampling strategy where respondents entered the sample through one of three channels: (i) those who opt-in to complete an online self-administered version of the questionnaire which was widely circulated through refugee social media; (ii) persons identified through UNHCR and partner databases who were remotely-interviewed by phone; and (iii) random selection from the cases approaching UNHCR for registration or assistance. The total sample size was 388 refugee households.

Mode of data collection

Other [oth]

Research instrument

The questionnaire contained the following sections: journey, family composition, vulnerability, basic needs, coping capacity, well-being, and COVID-19 impact.
Household Income and Expenditure Survey 2022 - Tuvalu
microdata.pacificdata.org
Updated May 15, 2025
Cite
Central Statistics Division (2025). Household Income and Expenditure Survey 2022 - Tuvalu [Dataset]. https://microdata.pacificdata.org/index.php/catalog/880
Explore at:
Dataset updated
May 15, 2025
Dataset authored and provided by
Central Statistics Division
Time period covered
2022 - 2023
Area covered
Tuvalu
Description
Abstract

The main purpose of the Household Income and Expenditure Survey (HIES) was to produce high-quality, representative national household data on income and expenditure in order to update the Consumer Price Index (CPI), improve National Accounts statistics and measure poverty within the country. These statistics are required for evidence-based policy-making to reduce poverty and to monitor progress under the national strategic plan in place.

Geographic coverage

Urban (Funafuti) and rural areas (outer islands).

Analysis unit

Household and Individual.

Universe

Private households.

Kind of data

Sample survey data [ssd]

Sampling procedure

The sampling design of the Tuvalu 2022 HIES consists of the random selection of an appropriate number of households within each stratum (urban and rural), so that HIES results can be disaggregated at the stratum level in addition to the national level. The urban stratum is the island of Funafuti as a whole, and the rest of the country (all outer islands) makes up the rural stratum. The statistical unit used for this sampling analysis is the household.

The sampling procedure is based on the following steps:
- Assess the accuracy of the previous 2015 HIES in terms of per capita total expenditure (the variable of interest) and check whether the sample size at that time was appropriate and correctly distributed between the two strata.
- Update this assessment using the most recent population count to obtain the new sample size and distribution.
- Randomly select households using this most recent population count.

The sampling frame (the most recent household listing and population count) used to update the assessment and select households is the 2021 Tuvalu Household Listing conducted by the Central Statistics Division of Tuvalu.

At the national level, the 2015 Tuvalu HIES reported good accuracy for per capita total expenditure (less than 5%), but the results disaggregated by stratum showed lower quality in urban Tuvalu. The 2021 household listing provides the most recent distribution of households across all the islands of Tuvalu. The update step consists of recomputing the accuracy of the 2015 HIES with this recent household count and obtaining the appropriate RSE by changing the sample size. Because of budget constraints, the total sample size cannot be increased, so the only parameter that can be modified is the distribution of the sample across the strata.
Sample size by stratum:
- Urban: 350 (out of 1,010 urban households as per the 2021 listing)
- Rural: 310 (out of 835 rural households as per the 2021 listing)
- National: 660 (out of 1,845 total households as per the 2021 listing)

2015 per capita mean total expenditure (AUD):
- Urban: 3,190
- Rural: 2,780
- National: 3,000

Relative Standard Error (RSE):
- Urban: 5.1%
- Rural: 4.1%
- National: 3.3%
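For illustration, the sampling fraction and the base design weight (listed households divided by sampled households) implied by the stratum sample sizes quoted above can be computed as follows. This sketch assumes equal-probability selection within each stratum and ignores non-response and replacement adjustments, which the survey's actual weights may include.

```python
# Stratum totals and sample sizes as quoted above (2021 household listing).
strata = {
    "Urban": {"listed": 1_010, "sampled": 350},
    "Rural": {"listed": 835, "sampled": 310},
}

for name, s in strata.items():
    fraction = s["sampled"] / s["listed"]      # share of listed households selected
    base_weight = s["listed"] / s["sampled"]   # listed households represented by each sampled one
    print(f"{name}: sampling fraction {fraction:.1%}, base weight {base_weight:.2f}")

national_fraction = sum(s["sampled"] for s in strata.values()) / sum(
    s["listed"] for s in strata.values()
)
print(f"National: sampling fraction {national_fraction:.1%}")
```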

This new sample design yields a distribution with a larger sample in urban Funafuti, mainly because of:
- the low quality of the survey results from the 2015 HIES;
- the increase of more than 15% in the number of households in the Tuvalu urban area between 2015 and 2020.

The household selection process is based on a simple random procedure within each stratum:
- The 350 households in Funafuti are selected with the same probability of selection across all villages of the island.
- The 310 households in rural Tuvalu are distributed proportionally to the size of each rural island. This proportional allocation of the sample across the rural islands generates the best accuracy at the stratum level.

Distribution of sample across strata:
Urban:
- Funafuti: 350
Rural:
- Nanumea: 42
- Nanumaga: 37
- Niutao: 46
- Nui: 39
- Vaitupu: 75
- Nukufetau: 45
- Nukulaelae: 23
- Niulakita: 4

Non-response is a problem in surveys, and it is crucial that the field teams interview the selected households (the location on the map and the name of the household head are used to help identify the selected households). During the first visit, interviewers must do their best to convince the household head to participate in the survey (and obtain his/her approval to proceed with the interview). The first visit may result in:
I. A refusal: the household head does not show any interest in the survey and is reluctant to participate.
II. An empty house: household members are away at the time of the visit.

(I) Refusal: if the interviewer cannot convince the household head to participate, he or she has to liaise with the survey management, and the supervisor will help in the discussion to convince the household head to respond. In this case, it is important to mention that all responses are kept confidential and to insist on the importance of the survey for the benefit of the Tuvalu population.

(II) Empty house: the interviewer must investigate (by checking with neighbours) whether the house is still inhabited by the family:
- If it is not, the dwelling is vacant and the replacement procedure must be activated.
- If the dwelling is still occupied, the interviewer must come back later the same day, or the day after at a different time.

The replacement procedure must be activated only in extreme cases of persistent refusal or an empty house (household members away during the collection period). It consists of replacing the selected household with the closest neighbour who is available.

Mode of data collection

Computer Assisted Personal Interview [capi]

Research instrument

The 2022 Tuvalu Household Income and Expenditure Survey (HIES) questionnaire was developed in English and follows the Pacific Standard HIES questionnaire structure. It is administered by CAPI using Survey Solutions, and the diary is no longer part of the form. All transactions (food, non-food, home production and gifts) are collected through different recall sections during the same visit. The traditional 14-day diary is no longer recommended in the region. This new method of implementing the HIES presents some valuable advantages, such as cost savings, data quality, and reduced time for data processing and reporting. The 2022 HIES of Tuvalu was directly integrated into a census through a Long Form Census (LFC). The LFC was an experiment led by the World Bank and the Pacific Community to combine a census and a HIES collection. All households were enumerated as usual during the 2022 Census, and households selected to participate in the HIES were then asked the HIES questions.

Below is a list of all modules in this questionnaire:
- Household ID
- Demographic characteristics
- Education
- Health
- Functional difficulties
- Communication
- Alcohol
- Other individual expenses
- Labour force
- Fisheries
- Handicraft and home-processed food
- Dwelling characteristics
- Assets
- Home maintenance
- Vehicles
- International trips
- Domestic trips
- Household services
- Financial support
- Other household expenditure
- Ceremonies
- Remittances
- Food insecurity
- Financial inclusion
- Livestock & aquaculture
- Agriculture parcel
- Agriculture vegetables
- Agriculture rootcrops
- Agriculture fruits

The survey questionnaire can be found in this documentation.

Cleaning operations

Data was edited, cleaned and imputed using the software Stata.

Response rate

There was a total of 662 households in the original sample selection. Of these, 592 were contacted and 528 accepted the interview. The number of valid households is 464, or 70% of households before replacement. After replacement, a further 54 households were considered valid, bringing the final completion rate to 78% (73% in urban and 85% in rural areas).
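The two rates in this paragraph follow from the counts given, as the short check below shows. It assumes that the 54 replacement households are added to the 464 valid original households and that the denominator remains the 662 selected households; the source does not spell out the formula.

```python
selected = 662            # households in the original selection
valid_original = 464      # valid households before replacement
valid_replacements = 54   # valid households obtained through replacement (assumed additive)

rate_before = valid_original / selected
rate_after = (valid_original + valid_replacements) / selected

print(f"before replacement: {rate_before:.0%}")
print(f"after replacement:  {rate_after:.0%}")
```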
Annual Agricultural Survey 2022-2023 - Senegal
microdata.worldbank.org
microdata.fao.org
Updated Oct 30, 2024
Cite
Directorate of Analysis, Forecasting and Agricultural Statistics (2024). Annual Agricultural Survey 2022-2023 - Senegal [Dataset]. https://microdata.worldbank.org/index.php/catalog/6387
Explore at:
Dataset updated
Oct 30, 2024
Dataset authored and provided by
Directorate of Analysis, Forecasting and Agricultural Statistics
Time period covered
2022 - 2023
Area covered
Senegal
Description
Abstract

The agricultural survey in its current form covers all regions of the country and all 45 departments of Senegal. The agricultural survey is an annual statistical operation whose general objective is to estimate the level of the main agricultural output of family-type agricultural holdings. It also provides information on the physical characteristics of cultivated plots (geo-location, area) and major investments made in them (agricultural inputs, cultivation operations, soil management and restoration). The main indicators relate to yield levels, areas sown, production and means of production.

Following a modular approach, the 2022-2023 edition of the annual agricultural survey is characterized by the integration of the MEA module (Machines, Equipment and other Agricultural Assets). In addition, the basic module of the 50x2030 questionnaire allows the collection of data for the calculation of SDG 5.a.1.

Geographic coverage

The annual agricultural survey covers all 45 departments of Senegal. However, for reasons related to anonymization, the variable "Department" has been replaced by the variable "Agroecological Zone" which constitutes groupings in relation to the departments. The variable "Region" remains in the anonymized version of the data.

Analysis unit

Households and agricultural plots

Universe

The agricultural survey covers all households and plots in the 45 departments of Senegal.

Kind of data

Sample survey data [ssd]

Sampling procedure

The AAS was built on a two-stage survey, with census districts (CDs) as primary units (PUs) and agricultural households as secondary units (SUs), as defined during the 2013 General Census of Population and Housing, Agriculture and Livestock (Recensement Général de la Population et de l'Habitat, de l'Agriculture et de l'Élevage, RGPHAE). In line with the broadening of the scope of the survey recommended by the AGRIS approach, from this campaign onwards the sample design incorporates a first-stage stratification, induced by the second-stage stratification, to better reflect the various agricultural activities and improve the efficiency of the estimates. The choice of a first-stage stratification induced by that of the second stage, although less efficient than an independent first-stage stratification, was guided by the absence of relevant variables in the RGPHAE sampling frame to discriminate between the CDs. The stratification took into account the relative importance of the main agricultural activities (in terms of household size) identified during the 2013 RGPHAE, namely rainfed agriculture, livestock and horticulture.

Thus, four strata were formed as follows:
- the "rainfed only" stratum, which groups together all the households practicing only rainfed crops;
- the "livestock only" stratum, for households that practice animal husbandry only;
- the "Horticulture and other crops" stratum, which includes households that mainly practice horticulture and secondarily other crops (forestry, fruit growing, etc.);
- the "Rainfed-livestock" stratum, made up of households that practice both rainfed agriculture and livestock breeding.

The size of the sample of agricultural households to be surveyed was calculated by department (area of study) by setting a relative error of 10% on the variable of interest. The distribution of the sample of each department in the strata was made using the Bankier method (1988) developed in the methodological guide to the main sampling frame practices (pp. 79-81) of the Global Strategy for Agricultural and Rural Statistics (GSARS).

At the national level, the total theoretical sample is equal to 7,450 households, spread over 1,460 physical CDs, with 5 households per CD. At the end of the enumeration operation carried out in the physical sample CDs, adjustments were made to take into account the actual updated size of the CDs, which led to a final size of 7,378 households, or 1,382 CDs.

Sampling deviation

Compared to the survey plan, adjustments were made based on the response rate at each phase.

Mode of data collection

Computer Assisted Personal Interview [capi]

Research instrument

The first questionnaire collected information on census and characteristics of agricultural household plots. The second questionnaire collected information on agricultural production, machinery, equipment and agricultural productivity.

Response rate

First phase: sample of 7,378 households, including 6,360 surveyed, i.e. a coverage rate of 86%.

Second phase: sample of 7,218 households, including 6,834 surveyed, i.e. a coverage rate of 95%.
Broadband Adoption and Computer Use by year, state, demographic...
data.amerigeoss.org
datadiscoverystudio.org
csv, json, rdf, xml
Updated Oct 31, 2019
Cite
United States (2019). Broadband Adoption and Computer Use by year, state, demographic characteristics [Dataset]. https://data.amerigeoss.org/dataset/broadband-adoption-and-computer-use-by-year-state-demographic-characteristics1
Explore at:
xml, json, rdf, csv (available download formats)
Dataset updated
Oct 31, 2019
Dataset provided by
United States
License
https://www.usa.gov/government-works
Description
This dataset is imported from the US Department of Commerce, National Telecommunications and Information Administration (NTIA) and its "Data Explorer" site. The underlying data comes from the US Census

dataset: Specifies the month and year of the survey as a string, in "Mon YYYY" format. The CPS is a monthly survey, and NTIA periodically sponsors Supplements to that survey.

variable: Contains the standardized name of the variable being measured. NTIA identified the availability of similar data across Supplements, and assigned variable names to ease time-series comparisons.

description: Provides a concise description of the variable.

universe: Specifies the variable representing the universe of persons or households included in the variable's statistics. The specified variable is always included in the file. The only variables lacking universes are isPerson and isHouseholder, as they are themselves the broadest universes measured in the CPS.

A large number of *Prop, *PropSE, *Count, and *CountSE columns comprise the remainder of the columns. For each demographic being measured (see below), four statistics are produced, including the estimated proportion of the group for which the variable is true (*Prop), the standard error of that proportion (*PropSE), the estimated number of persons or households in that group for which the variable is true (*Count), and the standard error of that count (*CountSE).
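As a hedged illustration of how the four statistic columns fit together, the sketch below reads rows shaped like the file described above and builds an approximate 90 percent confidence interval from usProp and usPropSE. The two rows are invented values, the exact CSV header names are assumed from the column descriptions, and the normal approximation is a common convention rather than something the source prescribes.

```python
import csv
import io

# Invented sample rows; only the column naming pattern comes from the description above.
SAMPLE_CSV = """dataset,variable,universe,usProp,usPropSE,usCount,usCountSE
Nov 2017,internetUser,isPerson,0.78,0.002,250000000,650000
Nov 2019,internetUser,isPerson,0.80,0.002,260000000,660000
"""

Z_90 = 1.645  # normal multiplier for an approximate 90 percent interval


def proportion_interval(row, group="us", z=Z_90):
    """Return (estimate, lower, upper) for the <group>Prop column of one row."""
    prop = float(row[f"{group}Prop"])
    se = float(row[f"{group}PropSE"])
    return prop, prop - z * se, prop + z * se


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    if row["variable"] == "internetUser":
        est, lo, hi = proportion_interval(row)
        print(f'{row["dataset"]}: {est:.1%} ({lo:.1%} to {hi:.1%})')
```

Passing a different `group` prefix would select another demographic's columns, assuming they follow the same *Prop and *PropSE naming pattern.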

DEMOGRAPHIC CATEGORIES

us: The usProp, usPropSE, usCount, and usCountSE columns contain statistics about all persons and households in the universe (which represents the population of the fifty states and the District of Columbia). For example, to see how the prevalence of Internet use by Americans has changed over time, look at the usProp column for each survey's internetUser variable.

age: The age category is divided into five ranges: ages 3-14, 15-24, 25-44, 45-64, and 65+. The CPS only includes data on Americans ages 3 and older. Also note that household reference persons must be at least 15 years old, so the age314* columns are blank for household-based variables. Those columns are also blank for person-based variables where the universe is "isAdult" (or a sub-universe of "isAdult"), as the CPS defines adults as persons ages 15 or older. Finally, note that some variables where children are technically in the universe will show zero values for the age314* columns. This occurs in cases where a variable simply cannot be true of a child (e.g. the workInternetUser variable, as the CPS presumes children under 15 are not eligible to work), but the topic of interest is relevant to children (e.g. locations of Internet use).

work: Employment status is divided into "Employed," "Unemployed," and "NILF" (Not in the Labor Force). These three categories reflect the official BLS definitions used in official labor force statistics. Note that employment status is only recorded in the CPS for individuals ages 15 and older. As a result, children are excluded from the universe when calculating statistics by work status, even if they are otherwise considered part of the universe for the variable of interest.

income: The income category represents annual family income, rather than just an individual person's income. It is divided into five ranges: below $25K, $25K-49,999, $50K-74,999, $75K-99,999, and $100K or more. Statistics by income group are only available in this file for Supplements beginning in 2010; prior to 2010, family income range is available in public use datasets, but is not directly comparable to newer datasets due to the 2010 introduction of the practice of allocating "don't know," "refused," and other responses that result in missing data. Prior to 2010, family income is unknown for approximately 20 percent of persons, while in 2010 the Census Bureau began imputing likely income ranges to replace missing data.

education: Educational attainment is divided into "No Diploma," "High School Grad,
‘NYSERDA Low- to Moderate-Income New York State Census Population Analysis...
analyst-2.ai
Updated Feb 12, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘NYSERDA Low- to Moderate-Income New York State Census Population Analysis Dataset: Average for 2013-2015’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-nyserda-low-to-moderate-income-new-york-state-census-population-analysis-dataset-average-for-2013-2015-0724/f3a01d19/?iid=020-481&v=presentation
Explore at:
Dataset updated
Feb 12, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
New York
Description
Analysis of ‘NYSERDA Low- to Moderate-Income New York State Census Population Analysis Dataset: Average for 2013-2015’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/8bd0ae94-40d3-4c9b-8a6b-de032e07929f on 12 February 2022.

--- Dataset description provided by original source is as follows ---

How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.

The Low- to Moderate-Income (LMI) New York State (NYS) Census Population Analysis dataset is derived from the LMI market database designed by APPRISE as part of the NYSERDA LMI Market Characterization Study (https://www.nyserda.ny.gov/lmi-tool). All data are derived from the U.S. Census Bureau’s American Community Survey (ACS) 1-year Public Use Microdata Sample (PUMS) files for 2013, 2014, and 2015.

Each row in the LMI dataset is an individual record for a household that responded to the survey and each column is a variable of interest for analyzing the low- to moderate-income population.

The LMI dataset includes: county/county group, households with elderly, households with children, economic development region, income groups, percent of poverty level, low- to moderate-income groups, household type, non-elderly disabled indicator, race/ethnicity, linguistic isolation, housing unit type, owner-renter status, main heating fuel type, home energy payment method, housing vintage, LMI study region, LMI population segment, mortgage indicator, time in home, head of household education level, head of household age, and household weight.

The LMI NYS Census Population Analysis dataset is intended for users who want to explore the underlying data that supports the LMI Analysis Tool. The majority of those interested in LMI statistics and generating custom charts should use the interactive LMI Analysis Tool at https://www.nyserda.ny.gov/lmi-tool. This underlying LMI dataset is intended for users with experience working with survey data files and producing weighted survey estimates using statistical software packages (such as SAS, SPSS, or Stata).

--- Original source retains full ownership of the source dataset ---
GlobPOP: A 33-year (1990-2022) global gridded population dataset (Version...
zenodo.org
tiff
Updated Sep 4, 2024
Cite
Luling Liu; Xin Cao; Shijie Li; Na Jie (2024). GlobPOP: A 33-year (1990-2022) global gridded population dataset (Version 2.0-test-alpha) [Dataset]. http://doi.org/10.5281/zenodo.11071249
Explore at:
tiff (available download formats)
Unique identifier
https://doi.org/10.5281/zenodo.11071249
Dataset updated
Sep 4, 2024
Dataset provided by
Zenodo http://zenodo.org/
Authors
Luling Liu; Xin Cao; Shijie Li; Na Jie
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data Usage Notice

This version is not recommended for download. Please check the newest version.

Please note that the updated GlobPOP dataset (2021-2022) is available in version 2.0. The GlobPOP dataset (2021-2022) in the current version is not recommended for your work. The GlobPOP dataset (1990-2020) in the current version is the same as version 1.0.

Thank you for your continued support of the GlobPOP.

If you encounter any issues, please contact us via email at lulingliu@mail.bnu.edu.cn.

Introduction

Continuously monitoring global population spatial dynamics is essential for implementing effective policies related to sustainable development, such as epidemiology, urban planning, and global inequality.

Here, we present GlobPOP, a new continuous global gridded population product with a high-precision spatial resolution of 30 arcseconds from 1990 to 2020. Our data-fusion framework is based on cluster analysis and statistical learning approaches, and fuses five existing products (Global Human Settlements Layer Population (GHS-POP), Global Rural Urban Mapping Project (GRUMP), Gridded Population of the World Version 4 (GPWv4), LandScan Population datasets and WorldPop datasets) into a new continuous global gridded population product (GlobPOP). The spatial validation results demonstrate that the GlobPOP dataset is highly accurate. To validate the temporal accuracy of GlobPOP at the country level, we have developed an interactive web application, accessible at https://globpop.shinyapps.io/GlobPOP/, where data users can explore the country-level population time-series curves of interest and compare them with census data.

With the availability of GlobPOP dataset in both population count and population density formats, researchers and policymakers can leverage our dataset to conduct time-series analysis of population and explore the spatial patterns of population development at various scales, ranging from national to city level.

Data description

The product is produced at 30 arc-second resolution (approximately 1 km at the equator) and is made available in GeoTIFF format. There are two population formats: 'Count' (population count per grid cell) and 'Density' (population count per square kilometer in each grid cell).
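To make the relationship between the two formats concrete, the Python sketch below estimates the ground area of a 30 arc-second cell and converts a count into a density. It assumes a spherical Earth with a mean radius of 6371 km and a simple width-times-height cell area; this is an illustrative approximation, not necessarily the procedure the GlobPOP authors used.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius (spherical approximation, an assumption)
CELL_DEG = 30 / 3600       # 30 arc-seconds expressed in degrees


def cell_area_km2(lat_deg):
    """Approximate area in square kilometres of a 30 arc-second cell centred at lat_deg."""
    step = math.radians(CELL_DEG)
    height = EARTH_RADIUS_KM * step                                      # north-south extent
    width = EARTH_RADIUS_KM * step * math.cos(math.radians(lat_deg))     # east-west extent
    return height * width


def count_to_density(count, lat_deg):
    """Convert a per-cell population count into persons per square kilometre."""
    return count / cell_area_km2(lat_deg)


equator_area = cell_area_km2(0.0)
```

Under these assumptions a cell covers somewhat less than 1 square kilometre at the equator and shrinks with the cosine of latitude, so the same count corresponds to a higher density at higher latitudes.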

Each GeoTIFF filename has 5 fields that are separated by an underscore "_". A filename extension follows these fields. The fields are described below with the example filename:

GlobPOP_Count_30arc_1990_I32

Field 1: GlobPOP (Global gridded population)
Field 2: Pixel unit is population "Count" or population "Density"
Field 3: Spatial resolution is 30 arc seconds
Field 4: Year "1990"
Field 5: Data type is I32 (Int32) or F32 (Float32)
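The five-field naming scheme can be parsed mechanically. The following Python sketch splits a filename on underscores and validates the fields listed above; the `GlobPopName` type, the `parse_globpop_name` helper, and the `.tiff` extension in the usage line are illustrative assumptions rather than part of the dataset's tooling.

```python
from typing import NamedTuple


class GlobPopName(NamedTuple):
    product: str      # Field 1, e.g. "GlobPOP"
    unit: str         # Field 2, "Count" or "Density"
    resolution: str   # Field 3, e.g. "30arc"
    year: int         # Field 4, e.g. 1990
    dtype: str        # Field 5, "I32" or "F32"


def parse_globpop_name(filename):
    """Split a GlobPOP filename into its five underscore-separated fields."""
    stem = filename.rsplit(".", 1)[0] if "." in filename else filename
    parts = stem.split("_")
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {filename!r}")
    product, unit, resolution, year, dtype = parts
    if unit not in ("Count", "Density"):
        raise ValueError(f"unexpected pixel unit: {unit!r}")
    if dtype not in ("I32", "F32"):
        raise ValueError(f"unexpected data type: {dtype!r}")
    return GlobPopName(product, unit, resolution, int(year), dtype)


# The extension here is assumed for illustration.
info = parse_globpop_name("GlobPOP_Count_30arc_1990_I32.tiff")
```

Parsing the name this way makes it straightforward to group files by year or to select only the count or density rasters.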

More information

Please refer to the paper for detailed information:

Liu, L., Cao, X., Li, S. et al. A 31-year (1990–2020) global gridded population dataset generated by cluster analysis and statistical learning. Sci Data 11, 124 (2024). https://doi.org/10.1038/s41597-024-02913-0.

The fully reproducible code is publicly available on GitHub: https://github.com/lulingliu/GlobPOP.
2010 American Community Survey: B19064 | AGGREGATE INTEREST, DIVIDENDS, OR...
data.census.gov
Cite
ACS, 2010 American Community Survey: B19064 | AGGREGATE INTEREST, DIVIDENDS, OR NET RENTAL INCOME IN THE PAST 12 MONTHS (IN 2010 INFLATION-ADJUSTED DOLLARS) FOR HOUSEHOLDS (ACS 5-Year Estimates Selected Population Detailed Tables) [Dataset]. https://data.census.gov/table/ACSDT5YSPT2010.B19064
Explore at:
Dataset provided by
United States Census Bureau (http://census.gov/)
Authors
ACS
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Time period covered
2010
Description
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.

Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.

Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, for 2010, the 2010 Census provides the official counts of the population and housing units for the nation, states, counties, cities and towns. For 2006 to 2009, the Population Estimates Program provides intercensal estimates of the population for the nation, states, and counties.

Explanation of Symbols:

An "**" entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.

An "-" entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.

An "-" following a median estimate means the median falls in the lowest interval of an open-ended distribution.

An "+" following a median estimate means the median falls in the upper interval of an open-ended distribution.

An "***" entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.

An "*****" entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.

An "N" entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.

An "(X)" means that the estimate is not applicable or not available.

Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2000 data. Boundaries for urban areas have not been updated since Census 2000. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.

While the 2006-2010 American Community Survey (ACS) data generally reflect the December 2009 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas; in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.

Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.

Source: U.S. Census Bureau, 2006-2010 American Community Survey
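The margin-of-error interpretation above can be expressed in a few lines of Python. This is a minimal sketch: the helper names and the sample figures are hypothetical, and the 1.645 factor is the standard normal multiplier conventionally associated with a 90 percent margin of error.

```python
Z_90 = 1.645  # normal multiplier for a 90 percent margin of error


def confidence_bounds(estimate, margin_of_error):
    """Return the (lower, upper) 90 percent confidence bounds for an estimate."""
    return estimate - margin_of_error, estimate + margin_of_error


def standard_error(margin_of_error):
    """Recover the standard error implied by a 90 percent margin of error."""
    return margin_of_error / Z_90


# Hypothetical values, not taken from table B19064.
lower, upper = confidence_bounds(1000.0, 164.5)
se = standard_error(164.5)
```

Symbol entries such as "**" or "N" are not numeric and would need to be filtered out before applying either helper.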
‘Broadband Adoption and Computer Use by year, state, demographic...
analyst-2.ai
Updated Oct 29, 2015
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2015). ‘Broadband Adoption and Computer Use by year, state, demographic characteristics’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-broadband-adoption-and-computer-use-by-year-state-demographic-characteristics-49e2/latest
Explore at:
Dataset updated
Oct 29, 2015
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Broadband Adoption and Computer Use by year, state, demographic characteristics’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/720f8c4b-7a1c-415c-9297-55904ba24840 on 26 January 2022.

--- Dataset description provided by original source is as follows ---

This dataset is imported from the US Department of Commerce, National Telecommunications and Information Administration (NTIA) and its "Data Explorer" site. The underlying data comes from the US Census

dataset: Specifies the month and year of the survey as a string, in "Mon YYYY" format. The CPS is a monthly survey, and NTIA periodically sponsors Supplements to that survey.

variable: Contains the standardized name of the variable being measured. NTIA identified the availability of similar data across Supplements, and assigned variable names to ease time-series comparisons.

description: Provides a concise description of the variable.

universe: Specifies the variable representing the universe of persons or households included in the variable's statistics. The specified variable is always included in the file. The only variables lacking universes are isPerson and isHouseholder, as they are themselves the broadest universes measured in the CPS.

A large number of *Prop, *PropSE, *Count, and *CountSE columns comprise the remainder of the columns. For each demographic being measured (see below), four statistics are produced, including the estimated proportion of the group for which the variable is true (*Prop), the standard error of that proportion (*PropSE), the estimated number of persons or households in that group for which the variable is true (*Count), and the standard error of that count (*CountSE).
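A short Python sketch of how the four statistics might be used together: it builds the column names for a demographic prefix and derives an approximate interval around a proportion from its standard error. The `us` prefix follows the column names used in this file; the helper names, the sample row values, and the normal-approximation interval with z = 1.96 are illustrative assumptions and not part of the NTIA documentation.

```python
def stat_columns(prefix):
    """Map each statistic suffix to its full column name for a demographic prefix."""
    return {s: f"{prefix}{s}" for s in ("Prop", "PropSE", "Count", "CountSE")}


def prop_interval(row, prefix, z=1.96):
    """Approximate interval for a proportion, clipped to the range [0, 1]."""
    cols = stat_columns(prefix)
    p = row[cols["Prop"]]
    se = row[cols["PropSE"]]
    return max(0.0, p - z * se), min(1.0, p + z * se)


# Hypothetical row; the numbers are made up for illustration.
row = {"usProp": 0.75, "usPropSE": 0.005, "usCount": 1000, "usCountSE": 10}
low, high = prop_interval(row, "us")
```

The same helpers apply to any other demographic prefix in the file, since every group exposes the same four suffixes.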

DEMOGRAPHIC CATEGORIES

us: The usProp, usPropSE, usCount, and usCountSE columns contain statistics about all persons and households in the universe (which represents the population of the fifty states and the District of Columbia). For example, to see how the prevalence of Internet use by Americans has changed over time, look at the usProp column for each survey's internetUser variable.

age: The age category is divided into five ranges: ages 3-14, 15-24, 25-44, 45-64, and 65+. The CPS only includes data on Americans ages 3 and older. Also note that household reference persons must be at least 15 years old, so the age314* columns are blank for household-based variables. Those columns are also blank for person-based variables where the universe is "isAdult" (or a sub-universe of "isAdult"), as the CPS defines adults as persons ages 15 or older. Finally, note that some variables where children are technically in the universe will show zero values for the age314* columns. This occurs in cases where a variable simply cannot be true of a child (e.g. the workInternetUser variable, as the CPS presumes children under 15 are not eligible to work), but the topic of interest is relevant to children (e.g. locations of Internet use).

work: Employment status is divided into "Employed," "Unemployed," and "NILF" (Not in the Labor Force). These three categories reflect the official BLS definitions used in official labor force statistics. Note that employment status is only recorded in the CPS for individuals ages 15 and older. As a result, children are excluded from the universe when calculating statistics by work status, even if they are otherwise considered part of the universe for the variable of interest.

income: The income category represents annual family income, rather than just an individual person's income. It is divided into five ranges: below $25K, $25K-49,999, $50K-74,999, $75K-99,999, and $100K or more. Statistics by income group are only available in this file for Supplements beginning in 2010; prior to 2010, family income range is available in public use datasets, but is not directly comparable to newer datasets due to the 2010 introduction of the practice of allocating "don't know," "refused," and other responses that result in missing data. Prior to 2010, family income is unknown for approximately 20 percent of persons, while in 2010 the Census Bureau began imputing likely income ranges to replace missing data.

education: Educational attainment is divided into "No Diploma," "High School Grad,

--- Original source retains full ownership of the source dataset ---
Data from: Population 24/7 Near Real Time: Data Library, Sample Outputs and...
beta.ukdataservice.ac.uk
Updated 2021
Cite
UK Data Service (2021). Population 24/7 Near Real Time: Data Library, Sample Outputs and Batch Files for England, 2011 [Dataset]. http://doi.org/10.5255/ukda-sn-853950
Explore at:
Unique identifier
https://doi.org/10.5255/ukda-sn-853950
Dataset updated
2021
Dataset provided by
UK Data Service (https://ukdataservice.ac.uk/)
datacite
Area covered
England
Description
This data collection comprises a data library, sample outputs, batch files and accompanying documentation from the ESRC-funded project “Population247NRT: Near real-time spatiotemporal population estimates for health, emergency response and national security”. The data comprise a structured set of input data for use with the authors’ SurfaceBuilder247 software and sample outputs which estimate the population distribution of England at specific times on specific dates, referenced to 2011 census population totals. The sample output files (provided as GeoTIFFs) contain population estimates in 200m grid cells, based on the British National Grid, for 02:00 (2am) and 14:00 (2pm) on a typical weekday in University and school term-time and out of term-time. The estimates are broken down by seven age/economic activity sub-groups for term-time and six for out of term-time, and include estimates of population activity in residential, workplace, education, healthcare and road transportation domains. The data library, which has been constructed entirely using open data sources, comprises population estimates, by age/economic activity sub-groups, for point locations (typically population-weighted centroids of census output areas and workplace zones, or postcode centroids of sites such as schools or hospitals); time profiles representing usual patterns of population activity at these sites during a 24-hour period; and background grid layers representing the land surface area and major road network. SurfaceBuilder247 uses the data library to generate time-specific gridded population estimates by redistributing the population of each sub-group across the available locations and background grid in accordance with the reference time profiles. 
The sample output grids provided in this resource may be used directly in GIS software or, alternatively, the input data library may be reprocessed using SurfaceBuilder247 to generate estimates for specific dates and times of interest to the user. Sample batch and session parameter files are included in the resource.
Map Data | Asia & MENA | Premium Demographics & Point-of-Interest Data To...
datarade.ai
.json, .csv
Cite
GapMaps, Map Data | Asia & MENA | Premium Demographics & Point-of-Interest Data To Optimise Business Decisions | GIS Data | Demographic Data [Dataset]. https://datarade.ai/data-products/gapmaps-global-map-data-asia-mena-150m-x-150m-grids-cu-gapmaps
Explore at:
Available download formats: .json, .csv
Dataset authored and provided by
GapMaps
Area covered
Malaysia, Indonesia, Saudi Arabia, India, Singapore, Philippines, Asia
Description
Sourcing accurate and up-to-date map data across Asia and MENA has historically been difficult for retail brands looking to expand their store networks in these regions. Either the data does not exist or it isn't readily accessible or updated regularly.

GapMaps Map Data uses known population data combined with billions of mobile device location points to provide highly accurate and globally consistent demographics data across Asia and MENA at 150m x 150m grid levels in major cities and 1km grids outside of major cities.

GapMaps Map Data also includes the latest Point-of-Interest (POI) Data for leading retail brands across a range of categories including Fast Food/ QSR, Health & Fitness, Supermarket/Grocery and Cafe sectors which is updated monthly.

With this information, brands can get a detailed understanding of who lives in a catchment, where they work and their spending potential which allows you to:

Better understand your customers

Identify optimal locations to expand your retail footprint

Define sales territories for franchisees

Run targeted marketing campaigns.

GapMaps Map Data for Asia and MENA can be utilized in any GIS platform and includes the latest estimates (updated annually) on:

Population (how many people live in your local catchment)

Demographics (who lives within your local catchment)

Worker population (how many people work within your local catchment)

Consuming Class and Premium Consuming Class (who can afford to buy goods & services beyond their basic needs and/or shop at premium retailers)

Retail Spending (Food & Beverage, Grocery, Apparel, Other). How much are consumers spending on retail goods and services by category.

Primary Use Cases for GapMaps Map Data:

Retail Site Selection - identify optimal locations for future expansion and benchmark performance across existing locations.

Customer Profiling: get a detailed understanding of the demographic profile of your customers, where they work and their spending potential

Analyse your trade areas at a granular 150m x 150m grid level using all the key metrics

Target Marketing: Develop effective marketing strategies to acquire more customers.

Integrate GapMaps demographic data with your existing GIS or BI platform to generate powerful visualizations.

Marketing / Advertising (Billboards/OOH, Marketing Agencies, Indoor Screens)

Customer Profiling

Target Marketing

Market Share Analysis
High Frequency Survey 2021, Quarter 3 - Bolivia
microdata.worldbank.org
catalog.ihsn.org
Updated May 5, 2022
+ more versions
Cite
UN Refugee Agency (UNHCR) (2022). High Frequency Survey 2021, Quarter 3 - Bolivia [Dataset]. https://microdata.worldbank.org/index.php/catalog/4455
Explore at:
Dataset updated
May 5, 2022
Dataset provided by
United Nations High Commissioner for Refugees (http://www.unhcr.org/)
Authors
UN Refugee Agency (UNHCR)
Time period covered
2021
Area covered
Bolivia
Description
Abstract

The data was collected using the High Frequency Survey (HFS), the new regional data collection tool & methodology launched in the Americas. The survey allowed for better reaching populations of interest with new remote modalities (phone interviews and self-administered surveys online) and improved sampling guidance and strategies. It includes a set of standardized regional core questions while allowing for operation-specific customizations. The core questions revolve around populations of interest's demographic profile, difficulties during their journey, specific protection needs, access to documentation & regularization, health access, coverage of basic needs, coping capacity & negative mechanisms used, and well-being & local integration. The data collected has been used by countries in their protection monitoring analysis and vulnerability analysis.

Geographic coverage

National coverage

Analysis unit

Household

Universe

All people of concern.

Kind of data

Sample survey data [ssd]

Sampling procedure

In the absence of a well-developed sampling-frame for forcibly displaced populations in the Americas, the High Frequency Survey employed a multi-frame sampling strategy where respondents entered the sample through one of three channels: (i) those who opt-in to complete an online self-administered version of the questionnaire which was widely circulated through refugee social media; (ii) persons identified through UNHCR and partner databases who were remotely-interviewed by phone; and (iii) random selection from the cases approaching UNHCR for registration or assistance. The total sample size was 129 households. At the time of the survey, the population of concern was estimated at around 11000 individuals.

Mode of data collection

Other [oth]

Research instrument

The questionnaire contained the following sections: journey, family composition, vulnerability, basic Needs, coping capacity, well-being, COVID-19 Impact.
