100+ datasets found

Statistics review 2: Samples and populations
catalog.data.gov
data.virginia.gov
+1more
Updated Jul 24, 2025
Cite
National Institutes of Health (2025). Statistics review 2: Samples and populations [Dataset]. https://catalog.data.gov/dataset/statistics-review-2-samples-and-populations
Dataset updated
Jul 24, 2025
Dataset provided by
National Institutes of Health
Description
The previous review in this series introduced the notion of data description and outlined some of the more common summary measures used to describe a dataset. However, a dataset is typically only of interest for the information it provides regarding the population from which it was drawn. The present review focuses on estimation of population values from a sample.
Population and Family Health Survey 1997 - Jordan
catalog.ihsn.org
datacatalog.ihsn.org
+1more
Updated Mar 29, 2019
+ more versions
Cite
Department of Statistics (DOS) (2019). Population and Family Health Survey 1997 - Jordan [Dataset]. http://catalog.ihsn.org/catalog/182
Dataset updated
Mar 29, 2019
Dataset authored and provided by
Department of Statistics (DOS)
Time period covered
1997
Area covered
Jordan
Description
Abstract

The 1997 Jordan Population and Family Health Survey (JPFHS) is a national sample survey carried out by the Department of Statistics (DOS) as part of its National Household Surveys Program (NHSP). The JPFHS was specifically aimed at providing information on fertility, family planning, and infant and child mortality. Information was also gathered on breastfeeding, on maternal and child health care and nutritional status, and on the characteristics of households and household members. The survey will provide policymakers and planners with important information for use in formulating informed programs and policies on reproductive behavior and health.

Geographic coverage

National

Analysis unit

Household

Children under five years

Women age 15-49

Men

Kind of data

Sample survey data

Sampling procedure

SAMPLE DESIGN AND IMPLEMENTATION

The 1997 JPFHS sample was designed to produce reliable estimates of major survey variables for the country as a whole, for urban and rural areas, for the three regions (each composed of a group of governorates), and for the three major governorates, Amman, Irbid, and Zarqa.

The 1997 JPFHS sample is a subsample of the master sample that was designed using the frame obtained from the 1994 Population and Housing Census. A two-stage sampling procedure was employed. First, primary sampling units (PSUs) were selected with probability proportional to the number of housing units in the PSU. A total of 300 PSUs were selected at this stage. In the second stage, in each selected PSU, occupied housing units were selected with probability inversely proportional to the number of housing units in the PSU. This design maintains a self-weighted sampling fraction within each governorate.
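
The self-weighting property described above can be checked with a little arithmetic: a first-stage probability proportional to PSU size multiplied by a second-stage probability inversely proportional to the same size gives a constant. The sketch below is only an illustration. The PSU sizes, the stratum total, the number of PSUs drawn and the second-stage constant are all hypothetical values, not figures from the survey.

```python
from fractions import Fraction


def overall_selection_probability(psu_size, total_size, n_psus, take_constant):
    """Overall selection probability of one housing unit under a two-stage design.

    Stage 1: the PSU is selected with probability proportional to its size.
    Stage 2: units inside the PSU are selected with probability inversely
    proportional to that same size.
    """
    p_stage1 = Fraction(n_psus * psu_size, total_size)
    p_stage2 = Fraction(take_constant, psu_size)
    return p_stage1 * p_stage2


# All numbers below are hypothetical and chosen only for illustration.
psu_sizes = [120, 250, 400, 80]  # housing units per PSU
total = 10_000                   # housing units in the stratum
n_psus = 10                      # PSUs drawn in the stratum
take = 25                        # second-stage constant (expected units per PSU)

probs = [overall_selection_probability(s, total, n_psus, take) for s in psu_sizes]
for size, p in zip(psu_sizes, probs):
    print(f"PSU with {size} units: overall probability {p}")
```

Because the PSU size cancels, every unit in the stratum ends up with the same overall probability, which is what "self-weighted" means here.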

UPDATING OF SAMPLING FRAME

Prior to the main fieldwork, mapping operations were carried out and the sample units/blocks were selected and then identified and located in the field. The selected blocks were delineated and the outer boundaries were demarcated with special signs. During this process, the numbers on buildings and housing units were updated, listed and documented, along with the name of the owner/tenant of the unit or household and the name of the household head. These activities took place between January 7 and February 28, 1997.

Note: See detailed description of sample design in APPENDIX A of the survey report.

Mode of data collection

Face-to-face

Research instrument

The 1997 JPFHS used two questionnaires, one for the household interview and the other for eligible women. Both questionnaires were developed in English and then translated into Arabic. The household questionnaire was used to list all members of the sampled households, including usual residents as well as visitors. For each member of the household, basic demographic and social characteristics were recorded and women eligible for the individual interview were identified. The individual questionnaire was developed utilizing the experience gained from previous surveys, in particular the 1983 and 1990 Jordan Fertility and Family Health Surveys (JFFHS).

The 1997 JPFHS individual questionnaire consists of 10 sections:
- Respondent’s background
- Marriage
- Reproduction (birth history)
- Contraception
- Pregnancy, breastfeeding, health and immunization
- Fertility preferences
- Husband’s background, woman’s work and residence
- Knowledge of AIDS
- Maternal mortality
- Height and weight of children and mothers

Cleaning operations

Fieldwork and data processing activities overlapped. After a week of data collection, and after field editing of questionnaires for completeness and consistency, the questionnaires for each cluster were packaged together and sent to the central office in Amman where they were registered and stored. Special teams were formed to carry out office editing and coding.

Data entry started after a week of office data processing. The process of data entry, editing, and cleaning was done by means of the ISSA (Integrated System for Survey Analysis) program DHS has developed especially for such surveys. The ISSA program allows data to be edited while being entered. Data entry was completed on November 14, 1997. A data processing specialist from Macro made a trip to Jordan in November and December 1997 to identify problems in data entry, editing, and cleaning, and to work on tabulations for both the preliminary and final report.

Response rate

A total of 7,924 occupied housing units were selected for the survey; from among those, 7,592 households were found. Of the occupied households, 7,335 (97 percent) were successfully interviewed. In those households, 5,765 eligible women were identified, and complete interviews were obtained with 5,548 of them (96 percent of all eligible women). Thus, the overall response rate of the 1997 JPFHS was 93 percent. The principal reason for nonresponse among the women was the failure of interviewers to find them at home despite repeated callbacks.
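
The percentages quoted above can be reproduced from the counts in the paragraph. The sketch assumes that the overall response rate is the product of the household and individual response rates, which is consistent with the 93 percent reported; the variable names are illustrative.

```python
# Counts taken from the response-rate paragraph above.
households_found = 7_592
households_interviewed = 7_335
eligible_women = 5_765
women_interviewed = 5_548

household_rate = households_interviewed / households_found
women_rate = women_interviewed / eligible_women

# Assumption: the overall rate is the product of the two stage-level rates.
overall_rate = household_rate * women_rate

print(f"household response rate: {household_rate:.1%}")
print(f"eligible women response rate: {women_rate:.1%}")
print(f"overall response rate: {overall_rate:.1%}")
```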

Note: See summarized response rates by place of residence in Table 1.1 of the survey report.

Sampling error estimates

The estimates from a sample survey are subject to two types of errors: nonsampling errors and sampling errors. Nonsampling errors are the result of mistakes made in implementing data collection and data processing (such as failure to locate and interview the correct household, misunderstanding questions either by the interviewer or the respondent, and data entry errors). Although during the implementation of the 1997 JPFHS numerous efforts were made to minimize this type of error, nonsampling errors are not only impossible to avoid but also difficult to evaluate statistically.

Sampling errors, on the other hand, can be evaluated statistically. The respondents selected in the 1997 JPFHS constitute only one of many samples that could have been selected from the same population, given the same design and expected size. Each of those samples would have yielded results differing somewhat from the results of the sample actually selected. Sampling errors are a measure of the variability among all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.

A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
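
As a rough illustration of the "plus or minus two standard errors" rule, the sketch below builds such an interval for a proportion. It assumes simple random sampling, which the JPFHS design is not, and the proportion and sample size are hypothetical, so it shows only the arithmetic and not the survey's actual sampling errors.

```python
import math


def srs_proportion_interval(p_hat, n, multiplier=2.0):
    """Approximate interval for a proportion under simple random sampling.

    Returns (lower, upper, standard_error), where the standard error is the
    square root of the estimated variance p(1 - p) / n.
    """
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - multiplier * se, p_hat + multiplier * se, se


# Hypothetical estimate: 50 percent observed in a simple random sample of 2,500.
low, high, se = srs_proportion_interval(p_hat=0.50, n=2_500)
print(f"standard error: {se:.4f}")
print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
```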

If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, since the 1997 JPFHS sample resulted from a multistage stratified design, more complex formulas had to be used. The computer software used to calculate sampling errors for the 1997 JPFHS was the ISSA Sampling Error Module, which uses the Taylor linearization method of variance estimation for survey estimates that are means or proportions. The Jackknife repeated replication method is used for variance estimation of more complex statistics, such as fertility and mortality rates.
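
To make the jackknife idea concrete, here is a minimal delete-one-cluster sketch for a ratio estimate. It is a generic textbook version and not the ISSA Sampling Error Module: it ignores stratification, and the cluster totals are hypothetical.

```python
import math


def jackknife_ratio_se(cluster_y, cluster_x):
    """Delete-one-cluster jackknife standard error of the ratio sum(y) / sum(x).

    cluster_y and cluster_x hold one total per cluster. Each cluster is dropped
    in turn, the ratio is recomputed, and the spread of the resulting
    pseudo-values estimates the variance.
    """
    k = len(cluster_y)
    if k < 2 or k != len(cluster_x):
        raise ValueError("need at least two clusters with matching totals")
    total_y, total_x = sum(cluster_y), sum(cluster_x)
    r = total_y / total_x
    pseudo_values = []
    for y_i, x_i in zip(cluster_y, cluster_x):
        r_without_i = (total_y - y_i) / (total_x - x_i)
        pseudo_values.append(k * r - (k - 1) * r_without_i)
    variance = sum((p - r) ** 2 for p in pseudo_values) / (k * (k - 1))
    return r, math.sqrt(variance)


# Hypothetical cluster totals: y = women with some characteristic, x = women interviewed.
ratio, se = jackknife_ratio_se(cluster_y=[12, 30, 18, 25], cluster_x=[40, 60, 50, 50])
print(f"ratio = {ratio:.3f}, jackknife SE = {se:.4f}")
```

When every cluster has the same ratio, dropping a cluster never changes the estimate, so the jackknife standard error is zero; the more the clusters differ, the larger it gets.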

Note: See detailed estimate of sampling error calculation in APPENDIX B of the survey report.

Data appraisal

Data Quality Tables:
- Household age distribution
- Age distribution of eligible and interviewed women
- Completeness of reporting
- Births by calendar years
- Reporting of age at death in days
- Reporting of age at death in months

Note: See detailed tables in APPENDIX C of the survey report.
Census Microdata Samples Project
neuinfo.org
dknet.org
+2more
Updated Sep 12, 2024
+ more versions
Cite
(2024). Census Microdata Samples Project [Dataset]. http://identifiers.org/RRID:SCR_008902
Unique identifier
https://identifiers.org/RRID:SCR_008902
Dataset updated
Sep 12, 2024
Description
A data set of cross-nationally comparable microdata samples for 15 Economic Commission for Europe (ECE) countries (Bulgaria, Canada, Czech Republic, Estonia, Finland, Hungary, Italy, Latvia, Lithuania, Romania, Russia, Switzerland, Turkey, UK, USA) based on the 1990 national population and housing censuses in countries of Europe and North America to study the social and economic conditions of older persons. These samples have been designed to allow research on a wide range of issues related to aging, as well as on other social phenomena. A common set of nomenclatures and classifications, derived on the basis of a study of census data comparability in Europe and North America, was adopted as a standard for recoding. This series was formerly called Dynamics of Population Aging in ECE Countries. The recommendations regarding the design and size of the samples drawn from the 1990 round of censuses envisaged: (1) drawing individual-based samples of about one million persons; (2) progressive oversampling with age in order to ensure sufficient representation of various categories of older people; and (3) retaining information on all persons co-residing in the sampled individual's dwelling unit. Estonia, Latvia and Lithuania provided the entire population over age 50, while Finland sampled it with progressive over-sampling. Canada, Italy, Russia, Turkey, UK, and the US provided samples that had not been drawn specially for this project, and cover the entire population without over-sampling. Given its wide user base, the US 1990 PUMS was not recoded. Instead, PAU offers mapping modules, which recode the PUMS variables into the project's classifications, nomenclatures, and coding schemes.
Because of the high sampling density, these data cover various small groups of older people; contain as much geographic detail as possible under each country's confidentiality requirements; include more extensive information on housing conditions than many other data sources; and provide information for a number of countries whose data were not accessible until recently.

Data Availability: Eight of the fifteen participating countries have signed the standard data release agreement making their data available through NACDA/ICPSR (see links below). Hungary and Switzerland require a clearance to be obtained from their national statistical offices for the use of microdata; however, the documents signed between the PAU and these countries include clauses stipulating that, in general, all scholars interested in social research will be granted access. Russia requested that certain provisions for archiving the microdata samples be removed from its data release arrangement. The PAU has an agreement with several British scholars to facilitate access to the 1991 UK data through collaborative arrangements. Statistics Canada and the Italian Institute of Statistics (ISTAT) provide access to data from Canada and Italy, respectively.

* Dates of Study: 1989-1992
* Study Features: International, Minority Oversamples
* Sample Size: Approx. 1 million/country

Links:
* Bulgaria (1992), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/02200
* Czech Republic (1991), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06857
* Estonia (1989), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06780
* Finland (1990), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06797
* Romania (1992), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06900
* Latvia (1989), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/02572
* Lithuania (1989), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/03952
* Turkey (1990), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/03292
* U.S. (1990), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06219
China Population Statistics: Sample Survey: Sampling Fraction
ceicdata.com
Updated Dec 15, 2020
Cite
CEICdata.com (2020). China Population Statistics: Sample Survey: Sampling Fraction [Dataset]. https://www.ceicdata.com/en/china/population-sample-survey-level-of-education/population-statistics-sample-survey-sampling-fraction
Dataset updated
Dec 15, 2020
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 1, 2012 - Dec 1, 2023
Area covered
China
Variables measured
Population
Description
China Population Statistics: Sample Survey: Sampling Fraction was reported at 0.105 % in 2023, an increase from 0.102 % in 2022. The series is updated yearly, with a median of 0.100 % from Dec 1982 to 2023 across 37 observations. It reached an all-time high of 100.000 % in 2020 and a record low of 0.063 % in 1994. The series remains active in CEIC and is reported by the National Bureau of Statistics. It is categorized under China Premium Database’s Socio-Demographic – Table CN.GA: Population: Sample Survey: Level of Education.
European Union Statistics on Income and Living Conditions 2013 -...
catalog.ihsn.org
Updated Mar 29, 2019
+ more versions
Cite
Eurostat (2019). European Union Statistics on Income and Living Conditions 2013 - Cross-Sectional User Database - Netherlands [Dataset]. https://catalog.ihsn.org/index.php/catalog/7684
Dataset updated
Mar 29, 2019
Dataset authored and provided by
Eurostat (https://ec.europa.eu/eurostat)
Time period covered
2013
Area covered
Netherlands
Description
Abstract

In 2013, the EU-SILC instrument covered all EU Member States plus Iceland, Turkey, Norway, Switzerland and Croatia. EU-SILC has become the EU reference source for comparative statistics on income distribution and social exclusion at European level, particularly in the context of the "Program of Community action to encourage cooperation between Member States to combat social exclusion" and for producing structural indicators on social cohesion for the annual spring report to the European Council. The first priority is to be given to the delivery of comparable, timely and high quality cross-sectional data.

There are two types of datasets: 1) Cross-sectional data pertaining to fixed time periods, with variables on income, poverty, social exclusion and living conditions. 2) Longitudinal data pertaining to individual-level changes over time, observed periodically - usually over four years.

Social exclusion and housing-condition information is collected at household level. Income at a detailed component level is collected at personal level, with some components included in the "Household" section. Labor, education and health observations only apply to persons aged 16 and over. EU-SILC was established to provide data on structural indicators of social cohesion (at-risk-of-poverty rate, S80/S20 and gender pay gap) and to provide relevant data for the two 'open methods of coordination' in the field of social inclusion and pensions in Europe.

This is the 1st version of the 2013 Cross-Sectional User Database as released in July 2015.

Geographic coverage

The survey covers the following countries: Austria; Belgium; Bulgaria; Croatia; Cyprus; Czech Republic; Denmark; Estonia; Finland; France; Germany; Greece; Spain; Ireland; Italy; Latvia; Lithuania; Luxembourg; Hungary; Malta; Netherlands; Poland; Portugal; Romania; Slovenia; Slovakia; Serbia; Sweden; United Kingdom; Iceland; Norway; Turkey; Switzerland

Small parts of the national territory amounting to no more than 2% of the national population and the national territories listed below may be excluded from EU-SILC: France - French Overseas Departments and territories; Netherlands - The West Frisian Islands with the exception of Texel; Ireland - All offshore islands with the exception of Achill, Bull, Cruit, Gorumna, Inishnee, Lettermore, Lettermullan and Valentia; United Kingdom - Scotland north of the Caledonian Canal, the Scilly Islands.

Analysis unit

Households;

Individuals 16 years and older.

Universe

The survey covered all household members over 16 years old. Persons living in collective households and in institutions are generally excluded from the target population.

Kind of data

Sample survey data [ssd]

Sampling procedure

On the basis of various statistical and practical considerations and the precision requirements for the most critical variables, the minimum effective sample sizes to be achieved were defined. Sample size for the longitudinal component refers, for any pair of consecutive years, to the number of households successfully interviewed in the first year in which all or at least a majority of the household members aged 16 or over are successfully interviewed in both the years.

For the cross-sectional component, the plan is to achieve a minimum effective sample size of around 131,000 households in the EU as a whole (137,000 including Iceland and Norway). The allocation of the EU sample among countries represents a compromise between two objectives: the production of results at the level of individual countries, and production for the EU as a whole. Requirements for the longitudinal data will be less important. For this component, an effective sample size of around 98,000 households (103,000 including Iceland and Norway) is planned.

Member States using registers for income and other data may use a sample of persons (selected respondents) rather than a sample of complete households in the interview survey. The minimum effective sample size in terms of the number of persons aged 16 or over to be interviewed in detail is in this case taken as 75 % of the figures shown in columns 3 and 4 of the table I, for the cross-sectional and longitudinal components respectively.

The reference is to the effective sample size, which is the size required if the survey were based on simple random sampling (design effect in relation to the 'risk of poverty rate' variable = 1.0). The actual sample sizes will have to be larger to the extent that the design effects exceed 1.0 and to compensate for all kinds of non-response. Furthermore, the sample size refers to the number of valid households which are households for which, and for all members of which, all or nearly all the required information has been obtained. For countries with a sample of persons design, information on income and other data shall be collected for the household of each selected respondent and for all its members.
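
The relationship between effective and actual sample size can be sketched as a small planning calculation. The formula used here (effective size times design effect, divided by the expected response rate) is a common planning rule and an assumption on my part; the source only says that the actual sample must be larger when the design effect exceeds 1.0 and to compensate for non-response. The input numbers are hypothetical.

```python
import math


def required_gross_sample(effective_size, design_effect, response_rate):
    """Households to select so that the achieved sample matches a target effective size.

    Assumed planning rule: inflate the effective size by the design effect,
    then divide by the expected response rate.
    """
    if design_effect <= 0:
        raise ValueError("design_effect must be positive")
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return math.ceil(effective_size * design_effect / response_rate)


# Hypothetical country: effective size 4,000, design effect 1.5, 75% response.
print(required_gross_sample(effective_size=4_000, design_effect=1.5, response_rate=0.75))
```

With a design effect of 1.0 and full response, the function returns the effective size itself, matching the simple-random-sampling reference case described above.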

At the beginning, a cross-sectional representative sample of households is selected. It is divided into, say, 4 sub-samples, each by itself representative of the whole population and similar in structure to the whole sample. One sub-sample is purely cross-sectional and is not followed up after the first round. Respondents in the second sub-sample are requested to participate in the panel for 2 years, in the third sub-sample for 3 years, and in the fourth for 4 years. From year 2 onwards, one new panel is introduced each year, with a request for participation for 4 years. In any one year, the sample consists of 4 sub-samples, which together constitute the cross-sectional sample. In year 1 they are all new samples; in all subsequent years, only one is a new sample. In year 2, three are panels in the second year; in year 3, one is a panel in the second year and two in the third year; in subsequent years, one is a panel for the second year, one for the third year, and one for the fourth (final) year.
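
The rotation scheme above is easier to follow when it is enumerated year by year. The sketch below follows the description with 4 sub-samples; the function name and the representation (a sorted list giving the panel year of each sub-sample) are illustrative choices.

```python
def waves_in_year(year, n_subsamples=4):
    """Return the panel year (wave) of each sub-sample active in a survey year."""
    waves = []
    # The initial sub-samples all start in year 1 and stay for 1, 2, ..., n years.
    for duration in range(1, n_subsamples + 1):
        if year <= duration:
            waves.append(year)
    # From year 2 onwards, one new panel starts each year and stays for n years.
    for start in range(2, year + 1):
        wave = year - start + 1
        if wave <= n_subsamples:
            waves.append(wave)
    return sorted(waves)


for y in range(1, 7):
    print(f"year {y}: sub-samples in panel years {waves_in_year(y)}")
```

Every year contains exactly 4 sub-samples, and from year 4 onwards the composition settles into one sub-sample in each of panel years 1 to 4.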

According to the Commission Regulation on sampling and tracing rules, the selection of the sample will be drawn according to the following requirements:

For all components of EU-SILC (whether survey or register based), the cross-sectional and longitudinal (initial sample) data shall be based on a nationally representative probability sample of the population residing in private households within the country, irrespective of language, nationality or legal residence status. All private households and all persons aged 16 and over within the household are eligible for the operation.

Representative probability samples shall be achieved both for households, which form the basic units of sampling, data collection and data analysis, and for individual persons in the target population.

The sampling frame and methods of sample selection shall ensure that every individual and household in the target population is assigned a known and non-zero probability of selection.

By way of exception, paragraphs 1 to 3 shall apply in Germany exclusively to the part of the sample based on probability sampling according to Article 8 of the Regulation of the European Parliament and of the Council (EC) No 1177/2003 concerning Community Statistics on Income and Living Conditions.

Article 8 of the EU-SILC Regulation of the European Parliament and of the Council mentions:
1. The cross-sectional and longitudinal data shall be based on nationally representative probability samples.
2. By way of exception to paragraph 1, Germany shall supply cross-sectional data based on a nationally representative probability sample for the first time for the year 2008. For the year 2005, Germany shall supply data for one fourth based on probability sampling and for three fourths based on quota samples, the latter to be progressively replaced by random selection so as to achieve fully representative probability sampling by 2008. For the longitudinal component, Germany shall supply for the year 2006 one third of longitudinal data (data for year 2005 and 2006) based on probability sampling and two thirds based on quota samples. For the year 2007, half of the longitudinal data relating to years 2005, 2006 and 2007 shall be based on probability sampling and half on quota sample. After 2007 all of the longitudinal data shall be based on probability sampling.

Detailed information about sampling is available in Quality Reports in Related Materials.

Mode of data collection

Mixed
Data from: RESEARCH METHODOLOGY FOR NOVELTY TECHNOLOGY
scielo.figshare.com
search.datacite.org
jpeg
Updated May 31, 2023
Cite
P.C. Lai (2023). RESEARCH METHODOLOGY FOR NOVELTY TECHNOLOGY [Dataset]. http://doi.org/10.6084/m9.figshare.7482734.v1
Available download formats: jpeg
Unique identifier
https://doi.org/10.6084/m9.figshare.7482734.v1
Dataset updated
May 31, 2023
Dataset provided by
SciELO journals
Authors
P.C. Lai
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract This paper contributes to the existing literature by reviewing the research methodology and the literature review with the focus on potential applications for the novelty technology of the single platform E-payment. These included, but were not restricted to the subjects, population, sample size requirement, data collection method and measurement of variables, pilot study and statistical techniques for data analysis. The reviews will shed some light and potential applications for future researchers, students and others to conceptualize, operationalize and analyze the underlying research methodology to assist in the development of their research methodology.
Sample Size and Population Estimates Tables (Prevalence Estimates) - 3.1 to...
catalog.data.gov
data.virginia.gov
+1more
Updated Sep 7, 2025
+ more versions
Cite
Substance Abuse and Mental Health Services Administration (2025). Sample Size and Population Estimates Tables (Prevalence Estimates) - 3.1 to 3.8 [Dataset]. https://catalog.data.gov/dataset/sample-size-and-population-estimates-tables-prevalence-estimates-3-1-to-3-8-439cc
Dataset updated
Sep 7, 2025
Dataset provided by
Substance Abuse and Mental Health Services Administration (https://www.samhsa.gov/)
Description
These detailed tables show sample sizes and population estimates pertaining to mental health from the 2010 National Survey on Drug Use and Health (NSDUH). Sample sizes and population estimates are provided by age group, gender, race/ethnicity, education level, employment status, poverty level, geographic area, and insurance status.
Current Population Survey (CPS)
dataverse.harvard.edu
search.dataone.org
Updated May 30, 2013
Cite
Anthony Damico (2013). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD
Unique identifier
https://doi.org/10.7910/DVN/AK4FDD
Dataset updated
May 30, 2013
Dataset provided by
Harvard Dataverse
Authors
Anthony Damico
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
analyze the current population survey (cps) annual social and economic supplement (asec) with r

the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population. the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.
this new github repository contains three scripts:

2005-2012 asec - download all microdata.R
- download the fixed-width file containing household, family, and person records
- import by separating this file into three tables, then merge 'em together at the person-level
- download the fixed-width file containing the person-level replicate weights
- merge the rectangular person-level file with the replicate weights, then store it in a sql database
- create a new variable - one - in the data table

2012 asec - analysis examples.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- perform a boatload of analysis examples

replicate census estimates - 2011.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- match the sas output shown in the png file below

2011 asec replicate weight sas output.png
- statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

click here to view these three scripts

for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
- the census bureau's current population survey page
- the bureau of labor statistics' current population survey page
- the current population survey's wikipedia article

notes:

interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name.

as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
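
the scripts described above are written in r, but the replicate-weight idea can be shown in a few lines of any language. the python sketch below is not a port of those scripts: the toy incomes and weights are hypothetical, and the scale factor of 4 divided by the number of replicates is an assumption that should be checked against the census-provided replicate weights usage instructions mentioned above.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean of values using one set of survey weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def replicate_weight_se(full_estimate, replicate_estimates, scale):
    """Standard error as sqrt(scale * sum of squared deviations from the full estimate)."""
    squared_deviations = sum((rep - full_estimate) ** 2 for rep in replicate_estimates)
    return math.sqrt(scale * squared_deviations)


# Hypothetical toy data: 4 persons, one full-sample weight and 4 replicate weights each.
incomes = [20_000, 35_000, 50_000, 80_000]
full_weights = [1_500, 2_000, 1_800, 1_200]
replicate_weights = [
    [1_400, 2_100, 1_800, 1_200],
    [1_600, 1_900, 1_750, 1_250],
    [1_500, 2_050, 1_900, 1_100],
    [1_450, 2_000, 1_700, 1_300],
]

full = weighted_mean(incomes, full_weights)
reps = [weighted_mean(incomes, w) for w in replicate_weights]

# Assumed scale factor: 4 / (number of replicates); verify before real use.
se = replicate_weight_se(full, reps, scale=4 / len(reps))
print(f"mean income = {full:.0f}, replicate-weight SE = {se:.0f}")
```

the same estimate is recomputed once per replicate weight, and the spread of those re-estimates around the full-sample estimate is what drives the standard error.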
'Dataset1' - Who Tweets with Their Location? Understanding the Relationship...
figshare.com
datasetcatalog.nlm.nih.gov
zip
Updated Jan 20, 2016
+ more versions
Cite
Luke Sloan (2016). 'Dataset1' - Who Tweets with Their Location? Understanding the Relationship Between Demographic Characteristics and the Use of Geoservices and Geotagging on Twitter [Dataset]. http://doi.org/10.6084/m9.figshare.1572291.v2
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.1572291.v2
Dataset updated
Jan 20, 2016
Dataset provided by
Figshare (http://figshare.com/)
Authors
Luke Sloan
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data associated with the paper: Who Tweets with Their Location? Understanding the Relationship Between Demographic Characteristics and the Use of Geoservices and Geotagging on Twitter Luke Sloan & Jeffrey Morgan
Respondent-Driven Sampling and Total Population Data from a Rural Ugandan...
b2find.eudat.eu
Updated Nov 9, 2011
Cite
(2011). Respondent-Driven Sampling and Total Population Data from a Rural Ugandan Cohort, 2010: Special Licence Access - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/a5bd5cb6-6712-5850-97b4-7f0c7b7b9281
Dataset updated
Nov 9, 2011
Area covered
Uganda
Description
Abstract copyright UK Data Service and data collection copyright owner. This is a mixed-methods data collection. This study used Respondent Driven Sampling (RDS) methodology, which is a sampling method designed to generate unbiased estimates of population characteristics for populations where a sampling frame is not available. It is a form of snowball or link-tracing sampling, where respondents are given coupons to recruit other members of the target population, and where respondents are rewarded for both participating and for recruiting others. In addition to variables of interest, data are collected on the number of members of the target population each participant knows. Estimation methods are then applied to account for the non-random sample selection in an attempt to generate unbiased estimates for the target population. In 2010, the researchers conducted an RDS study in a rural Ugandan population where total population data were available. The aim of this study was to evaluate whether RDS could generate representative data on a rural Ugandan population by comparing estimates from an RDS survey with total-population data. The data used to define the target population (male household heads) were available from an ongoing general population cohort of 25 villages in rural Masaka, Uganda covering an area of approximately 38km. Annually, households in the study villages are mapped and after obtaining consent, a total-population household census and an individual questionnaire are administered and blood taken for HIV-1 testing. A random sample of eligible men in the target population who were not recruited during the RDS study were also interviewed, using the same RDS questionnaire. Finally, 49 qualitative interviews (of which summaries have been deposited) were conducted with a range of people (men and women) including RDS participants and non-participants, and RDS interviewers. 
These data can be used to evaluate the RDS sampling method, and to test new RDS estimators. Further information may be found in the documentation and in the journal articles listed in the Publications section.

Special Licence access and geographic data

This data collection is subject to Special Licence access conditions (see Access section for details). Data are analysable at individual village level, and GPS point data are available for the villages and interview sites. Finer detail geographic variables may be available for certain research questions. If these are required, users should request this when making their Special Licence application.

Main Topics:

Quantitative data: demographic characteristics of the individual, including household composition, age, HIV status, tribe, religion, relationship between target population sample member and contacts, geographic data.

Qualitative interview summaries: respondents' opinions of the study, the conduct of the research and the incentives used.

Respondent Driven Sampling methods were used - see Abstract and documentation for details.
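The abstract notes that RDS collects each participant's count of known target-population members and then applies estimation methods to correct for non-random selection. One widely used approach, often called the RDS-II or Volz-Heckathorn estimator, weights each respondent by the inverse of that reported network size. The sketch below is a minimal illustration of that idea, not necessarily the estimator used in this study; the function name and the toy values are hypothetical.

```python
def rds_ii_proportion(has_trait, degrees):
    """Inverse-degree weighted proportion of respondents with a trait.

    has_trait: list of booleans, one per respondent.
    degrees:   reported number of target-population members each
               respondent knows (must be positive).
    """
    if any(d <= 0 for d in degrees):
        raise ValueError("every reported degree must be positive")
    weights = [1.0 / d for d in degrees]
    trait_weight = sum(w for w, t in zip(weights, has_trait) if t)
    return trait_weight / sum(weights)

# Hypothetical toy sample of four respondents.
has_trait = [True, False, True, False]
degrees = [2, 4, 10, 5]

naive = sum(has_trait) / len(has_trait)          # unweighted sample share
weighted = rds_ii_proportion(has_trait, degrees)  # inverse-degree weighted
```

Respondents with many contacts are more likely to be recruited, so they receive smaller weights; comparing `naive` with `weighted` shows how much the correction moves the estimate in a given sample.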
Sample Size and Population Estimates Tables (Standard Errors and P Values) -...
data.virginia.gov
res1catalogd-o-tdatad-o-tgov.vcapture.xyz
html
Updated Jul 30, 2025
Substance Abuse and Mental Health Services Administration (2025). Sample Size and Population Estimates Tables (Standard Errors and P Values) - 8.1 to 8.13 [Dataset]. https://data.virginia.gov/dataset/sample-size-and-population-estimates-tables-standard-errors-and-p-values-8-1-to-8-132
Available download formats: html
Dataset updated
Jul 30, 2025
Dataset provided by
Substance Abuse and Mental Health Services Administration
Description
These detailed tables show standard errors for sample sizes and population estimates from the 2011 National Survey on Drug Use and Health (NSDUH). Standard errors for sample sizes and population estimates are provided by age group, gender, race/ethnicity, education level, employment status, geographic area, pregnancy status, college enrollment status, and probation/parole status.
Census of Population and Housing, 1960: Public Use Sample, 1 in 100
archive.ciser.cornell.edu
Updated Feb 13, 2020
Bureau of the Census (2020). Census of Population and Housing, 1960: Public Use Sample, 1 in 100 [Dataset]. http://doi.org/10.6077/j5/ohycfx
Unique identifier
https://doi.org/10.6077/j5/ohycfx
Dataset updated
Feb 13, 2020
Dataset provided by
United States Census Bureau (http://census.gov/)
Authors
Bureau of the Census
Variables measured
Individual, Household
Description
This collection contains individual-level and 1-percent national sample data from the 1960 Census of Population and Housing conducted by the Census Bureau. It consists of a representative sample of the records from the 1960 sample questionnaires. The data are stored in 30 separate files, containing in total over two million records, organized by state. Some files contain the sampled records of several states while other files contain all or part of the sample for a single state. There are two types of records stored in the data files: one for households and one for persons. Each household record is followed by a variable number of person records, one for each of the household members. Data items in this collection include the individual responses to the basic social, demographic, and economic questions asked of the population in the 1960 Census of Population and Housing. Data are provided on household characteristics and features such as the number of persons in household, number of rooms and bedrooms, and the availability of hot and cold piped water, flush toilet, bathtub or shower, sewage disposal, and plumbing facilities. Additional information is provided on tenure, gross rent, year the housing structure was built, and value and location of the structure, as well as the presence of air conditioners, radio, telephone, and television in the house, and ownership of an automobile. Other demographic variables provide information on age, sex, marital status, race, place of birth, nationality, education, occupation, employment status, income, and veteran status. The data files were obtained by ICPSR from the Center for Social Analysis, Columbia University. (Source: downloaded from ICPSR 7/13/10)

Please Note: This dataset is part of the historical CISER Data Archive Collection and is also available at ICPSR at https://doi.org/10.3886/ICPSR07756.v1. We highly recommend using the ICPSR version as they may make this dataset available in multiple data formats in the future.
China Population: County
ceicdata.com
Updated Apr 14, 2018
CEICdata.com (2018). China Population: County [Dataset]. https://www.ceicdata.com/en/china/population-sample-survey
Dataset updated
Apr 14, 2018
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 1, 2011 - Dec 1, 2022
Area covered
China
Variables measured
Population
Description
Population: County data was reported at 502.967 Person th in 2022. This records a decrease from the previous number of 527.827 Person th for 2021. Population: County data is updated yearly, averaging 753.829 Person th from Dec 1982 (Median) to 2022, with 34 observations. The data reached an all-time high of 797,604.783 Person th in 1982 and a record low of 430.197 Person th in 2019. Population: County data remains active status in CEIC and is reported by National Bureau of Statistics. The data is categorized under China Premium Database’s Socio-Demographic – Table CN.GA: Population: Sample Survey.
Synthetic Data for an Imaginary Country, Sample, 2023 - World
microdata.worldbank.org
nada-demo.ihsn.org
Updated Jul 7, 2023
Development Data Group, Data Analytics Unit (2023). Synthetic Data for an Imaginary Country, Sample, 2023 - World [Dataset]. https://microdata.worldbank.org/index.php/catalog/5906
Dataset updated
Jul 7, 2023
Dataset authored and provided by
Development Data Group, Data Analytics Unit
Time period covered
2023
Area covered
World
Description
Abstract

The dataset is a relational dataset of 8,000 households, representing a sample of the population of an imaginary middle-income country. The dataset contains two data files: one with variables at the household level, the other one with variables at the individual level. It includes variables that are typically collected in population censuses (demography, education, occupation, dwelling characteristics, fertility, mortality, and migration) and in household surveys (household expenditure, anthropometric data for children, assets ownership). The data only includes ordinary households (no community households). The dataset was created using REaLTabFormer, a model that leverages deep learning methods. The dataset was created for the purpose of training and simulation and is not intended to be representative of any specific country.

The full-population dataset (with about 10 million individuals) is also distributed as open data.

Geographic coverage

The dataset is a synthetic dataset for an imaginary country. It was created to represent the population of this country by province (equivalent to admin1) and by urban/rural areas of residence.

Analysis unit

Household, Individual

Universe

The dataset is a fully-synthetic dataset representative of the resident population of ordinary households for an imaginary middle-income country.

Kind of data

ssd

Sampling procedure

The sample size was set to 8,000 households. The fixed number of households to be selected from each enumeration area was set to 25. In a first stage, the number of enumeration areas to be selected in each stratum was calculated, proportional to the size of each stratum (stratification by geo_1 and urban/rural). Then 25 households were randomly selected within each enumeration area. The R script used to draw the sample is provided as an external resource.
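With 8,000 households and 25 households per enumeration area (EA), the design implies 8,000 / 25 = 320 EAs in total. The sketch below illustrates the two stages on a toy frame: EAs are allocated to strata in proportion to stratum size, then 25 households are drawn at random within each selected EA. It is not the R script distributed with the dataset. The largest-remainder rounding, the simple random selection of EAs within a stratum, and all names and toy sizes are assumptions made for illustration.

```python
import random

def allocate_eas(stratum_sizes, total_eas):
    """Proportional allocation of EAs to strata (largest-remainder rounding)."""
    total = sum(stratum_sizes.values())
    quotas = {s: total_eas * n / total for s, n in stratum_sizes.items()}
    alloc = {s: int(q) for s, q in quotas.items()}
    leftover = total_eas - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda s: quotas[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc

def draw_sample(frame, alloc, households_per_ea, rng):
    """frame maps stratum -> {ea_id: [household ids]}."""
    sample = []
    for stratum, n_eas in alloc.items():
        # Assumption: EAs are picked by simple random sampling within a stratum.
        for ea in rng.sample(sorted(frame[stratum]), n_eas):
            sample.extend(rng.sample(frame[stratum][ea], households_per_ea))
    return sample

# Hypothetical toy frame: two strata, 40 households per EA.
frame = {
    "north_urban": {f"ea{i}": [f"nu-{i}-{h}" for h in range(40)] for i in range(6)},
    "north_rural": {f"ea{i}": [f"nr-{i}-{h}" for h in range(40)] for i in range(4)},
}
stratum_sizes = {s: sum(len(hh) for hh in eas.values()) for s, eas in frame.items()}

rng = random.Random(2023)  # fixed seed so the draw is reproducible
alloc = allocate_eas(stratum_sizes, total_eas=5)
sample = draw_sample(frame, alloc, households_per_ea=25, rng=rng)
```

Fixing the number of households per EA keeps the interviewer workload equal across areas, while proportional allocation of EAs keeps each stratum's share of the sample close to its share of the population.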

Mode of data collection

other

Research instrument

The dataset is a synthetic dataset. Although the variables it contains are variables typically collected from sample surveys or population censuses, no questionnaire is available for this dataset. A "fake" questionnaire was however created for the sample dataset extracted from this dataset, to be used as training material.

Cleaning operations

The synthetic data generation process included a set of "validators" (consistency checks, based on which synthetic observations were assessed and rejected/replaced when needed). Also, some post-processing was applied to the data to result in the distributed data files.

Response rate

This is a synthetic dataset; the "response rate" is 100%.
1961 Census Microdata Individual File for Great Britain: 5% Sample
beta.ukdataservice.ac.uk
Updated 2023
UK Data Service (2023). 1961 Census Microdata Individual File for Great Britain: 5% Sample [Dataset]. http://doi.org/10.5255/ukda-sn-8272-1
Unique identifier
https://doi.org/10.5255/ukda-sn-8272-1
Dataset updated
2023
Dataset provided by
UK Data Service (https://ukdataservice.ac.uk/)
datacite
Area covered
United Kingdom
Description
The 1961 Census Microdata Individual File for Great Britain: 5% Sample dataset was created from existing digital records from the 1961 Census under a project known as Enhancing and Enriching Historic Census Microdata Samples (EEHCM), which was funded by the Economic and Social Research Council with input from the Office for National Statistics and National Records of Scotland. The project ran from 2012-2014 and was led from the UK Data Archive, University of Essex, in collaboration with the Cathie Marsh Institute for Social Research (CMIST) at the University of Manchester and the Census Offices. In addition to the 1961 data, the team worked on files from the 1971 Census and 1981 Census.

The original 1961 records preceded current data archival standards and were created before microdata sets for secondary use were anticipated. A process of data recovery and quality checking was necessary to maximise their utility for current researchers, though some imperfections remain (see the User Guide for details). Three other 1961 Census datasets have been created:
SN 8273 - 1961 Census Microdata Household File for Great Britain: 0.95% Sample, which links household members together to allow individuals to be understood within their household context, and is available to registered UK Data Service users based in the United Kingdom (see Access section for non-UK access restrictions);
SN 8274 - 1961 Census Microdata Teaching Dataset for Great Britain: 1% Sample: Open Access, which can be used as a taster file and is freely available for anyone to download under an Open Government Licence; and
SN 8275 - 1961 Census Microdata for Great Britain: 9% Sample: Secure Access, which comprises a larger population sample and so contains sufficient information to constitute personal data, meaning that it is only available to Accredited Researchers, under restrictive Secure Access conditions.
General Population Census of 1968 - IPUMS Subset - France
microdata.worldbank.org
datacatalog.ihsn.org
Updated Aug 1, 2025
IPUMS (2025). General Population Census of 1968 - IPUMS Subset - France [Dataset]. https://microdata.worldbank.org/index.php/catalog/2143
Dataset updated
Aug 1, 2025
Dataset provided by
The National Institute of Statistics and Economic Studies (http://insee.fr/)
IPUMS
Time period covered
1968
Area covered
France
Description
Analysis unit

Persons, households, and dwellings

UNITS IDENTIFIED: - Dwellings: yes* - Vacant Units: No - Households: yes - Individuals: yes - Group quarters: yes*

UNIT DESCRIPTIONS: - Dwellings: no - Households: Yes - Group quarters: A collective household is a group of persons that does not live in an ordinary household, but lives in a collective establishment, sharing meal times.

Universe

Residents in France, of any nationality. Does not include French citizens living in other countries, foreign tourists, or people passing through. Reintegrated persons: Persons living in group quarters or without a fixed address but having a usual home elsewhere (i.e., enumerated away from their usual residence). During data processing, most of these people are reintegrated into their usual households, except in the case of persons in psychiatric hospitals and prisons. Legal population refers to de jure population plus population compte a part.

Kind of data

Population and Housing Census [hh/popcen]

Sampling procedure

MICRODATA SOURCE: INSEE (Institut National de la Statistique et des Etudes Economiques)

SAMPLE SIZE (person records): 2487778.

SAMPLE DESIGN: Systematic manual sorting into lots with different sample units according to target population. Lots divide the population into different samples (1/20, 1/5, 3/4).

Mode of data collection

Face-to-face [f2f]

Research instrument

Separate forms for buildings, group quarters (collective households), group quarters (compte a part), private households, and boats. Four forms for individuals (living in group quarters and private dwellings; two different forms for people compte a part; living in boats).
Data from: The Global Population Dynamics Database
knb.ecoinformatics.org
search.dataone.org
Updated May 18, 2020
John Prendergast; Ellen Bazeley-White; Owen Smith; John Lawton; Pablo Inchausti; David Kidd; Sarah Knight (2020). The Global Population Dynamics Database [Dataset]. http://doi.org/10.5063/F1BZ63Z8
Unique identifier
https://doi.org/10.5063/F1BZ63Z8
Dataset updated
May 18, 2020
Dataset provided by
Knowledge Network for Biocomplexity
Authors
John Prendergast; Ellen Bazeley-White; Owen Smith; John Lawton; Pablo Inchausti; David Kidd; Sarah Knight
Time period covered
Jan 1, 1538 - Jan 1, 2003
Area covered
Earth
Variables measured
End, Area, East, EorW, NorS, West, Year, Begin, LatDD, North, and 71 more
Description
As a source of animal and plant population data, the Global Population Dynamics Database (GPDD) is unrivalled. Nearly five thousand separate time series are available here. In addition to all the population counts, there are taxonomic details of over 1400 species. The type of data contained in the GPDD varies enormously, from annual counts of mammals or birds at individual sampling sites, to weekly counts of zooplankton and other marine fauna. The project commenced in October 1994, following discussions on ways in which the collaborating partners could make a practical and enduring contribution to research into population dynamics. A small team was assembled and, with assistance and advice from numerous interested parties, we decided to construct the database using the popular Microsoft Access platform. After an initial design phase, the major task has been that of locating, extracting, entering and validating the data in all the various tables. Now, nearly 5000 individual datasets have been entered into the GPDD. The Global Population Dynamics Database comprises six tables of data and information. The tables are linked to each other as shown in the diagram in figure 3 of the GPDD User Guide (GPDD-User-Guide.pdf). Referential integrity is maintained through record ID numbers, which are held, along with other information, in the Main Table. Its structure obeys all the rules of a standard relational database.
1961 Census Microdata for Great Britain: 9% Sample: Secure Access
beta.ukdataservice.ac.uk
Updated 2019
UK Data Service (2019). 1961 Census Microdata for Great Britain: 9% Sample: Secure Access [Dataset]. http://doi.org/10.5255/ukda-sn-8275-1
Unique identifier
https://doi.org/10.5255/ukda-sn-8275-1
Dataset updated
2019
Dataset provided by
UK Data Service (https://ukdataservice.ac.uk/)
datacite
Area covered
United Kingdom
Description
The 1961 Census Microdata for Great Britain: 9% Sample: Secure Access dataset was created from existing digital records from the 1961 Census. It comprises a larger population sample than the other files available from the 1961 Census (see below) and so contains sufficient information to constitute personal data, meaning that it is only available to Accredited Researchers, under restrictive Secure Access conditions. See Access section for further details.

The file was created under a project known as Enhancing and Enriching Historic Census Microdata Samples (EEHCM), which was funded by the Economic and Social Research Council with input from the Office for National Statistics and National Records of Scotland. The project ran from 2012-2014 and was led from the UK Data Archive, University of Essex, in collaboration with the Cathie Marsh Institute for Social Research (CMIST) at the University of Manchester and the Census Offices. In addition to the 1961 data, the team worked on files from the 1971 Census and 1981 Census.

The original 1961 records preceded current data archival standards and were created before microdata sets for secondary use were anticipated. A process of data recovery and quality checking was necessary to maximise their utility for current researchers, though some imperfections remain (see the User Guide for details).

Three other 1961 Census datasets have been created; users should obtain the other datasets in the series first to see whether they are sufficient for their research needs before considering making an application for this study (SN 8275), the Secure Access version:
SN 8272 - 1961 Census Microdata Individual File for Great Britain: 5% Sample, which contains information on individuals in larger local authorities;
SN 8273 - 1961 Census Microdata Household File for Great Britain: 0.95% Sample, which links household members together to allow individuals to be understood within their household context. SNs 8272 and 8273 are both available to registered UK Data Service users based in the United Kingdom (see Access section for non-UK access restrictions); and
SN 8274 - 1961 Census Microdata Teaching Dataset for Great Britain: 1% Sample: Open Access, which can be used as a taster file and is freely available for anyone to download under an Open Government Licence.
China Population: City: Age 0 to 14: Jiangsu
ceicdata.com
Updated Apr 4, 2018
CEICdata.com (2018). China Population: City: Age 0 to 14: Jiangsu [Dataset]. https://www.ceicdata.com/en/china/population-sample-survey-by-age-and-region-city
Dataset updated
Apr 4, 2018
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 1, 2011 - Dec 1, 2022
Area covered
China
Variables measured
Population
Description
Population: City: Age 0 to 14: Jiangsu data was reported at 6.392 Person th in 2023. This records an increase from the previous number of 6.010 Person th for 2022. Population: City: Age 0 to 14: Jiangsu data is updated yearly, averaging 3.119 Person th from Dec 1997 (Median) to 2023, with 27 observations. The data reached an all-time high of 5,972.581 Person th in 2020 and a record low of 1.922 Person th in 1999. Population: City: Age 0 to 14: Jiangsu data remains active status in CEIC and is reported by National Bureau of Statistics. The data is categorized under China Premium Database’s Socio-Demographic – Table CN.GA: Population: Sample Survey: By Age and Region: City.
China Population: City: Age 65 and Above: Shanghai
ceicdata.com
Updated Apr 4, 2018
CEICdata.com (2018). China Population: City: Age 65 and Above: Shanghai [Dataset]. https://www.ceicdata.com/en/china/population-sample-survey-by-age-and-region-city
Dataset updated
Apr 4, 2018
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 1, 2011 - Dec 1, 2022
Area covered
China
Variables measured
Population
Description
Population: City: Age 65 and Above: Shanghai data was reported at 4.310 Person th in 2023. This records an increase from the previous number of 3.851 Person th for 2022. Population: City: Age 65 and Above: Shanghai data is updated yearly, averaging 2.021 Person th from Dec 1997 (Median) to 2023, with 27 observations. The data reached an all-time high of 3,226.728 Person th in 2020 and a record low of 1.218 Person th in 2011. Population: City: Age 65 and Above: Shanghai data remains active status in CEIC and is reported by National Bureau of Statistics. The data is categorized under China Premium Database’s Socio-Demographic – Table CN.GA: Population: Sample Survey: By Age and Region: City.