28 datasets found
  1. Summary statistics of population and samples taken at different sampling...

    • plos.figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Maria M; Ibrahim M. Almanjahie; Muhammad Ismail; Ammara Nawaz Cheema (2023). Summary statistics of population and samples taken at different sampling schemes for n = 4, r = 1. [Dataset]. http://doi.org/10.1371/journal.pone.0275340.t001
    Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Maria M; Ibrahim M. Almanjahie; Muhammad Ismail; Ammara Nawaz Cheema
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary statistics of population and samples taken at different sampling schemes for n = 4, r = 1.

  2. Sampled population parameters and corresponding node-based metrics, node...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Aug 12, 2022
    Cite
    Tessier, Nathalie; Lapointe, François-Joseph; Bouchard, Cindy; Lord, Étienne (2022). Sampled population parameters and corresponding node-based metrics, node removal statistics and BRIDES results. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000436321
    Dataset updated
    Aug 12, 2022
    Authors
    Tessier, Nathalie; Lapointe, François-Joseph; Bouchard, Cindy; Lord, Étienne
    Description

    Sampled population parameters and corresponding node-based metrics, node removal statistics and BRIDES results.

  3. Synthetic dataset of Luxembourg citizens

    • kaggle.com
    zip
    Updated Jan 21, 2025
    Cite
    Olaf Yunus Laitinen Imanov (2025). Synthetic dataset of Luxembourg citizens [Dataset]. https://www.kaggle.com/datasets/olaflundstrom/synthetic-dataset-of-luxembourg-citizens
    Available download formats: zip (3016850 bytes)
    Dataset updated
    Jan 21, 2025
    Authors
    Olaf Yunus Laitinen Imanov
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Luxembourg
    Description

    The dataset was created using the open-source code released by LNDS (Luxembourg National Data Service). It is meant as an example of the dataset structure that anyone can generate and customize by setting a few fixed parameters, including the sample size. The file format is .csv, with individual profiles in the rows and their personal features in the columns. The information in the dataset was generated from statistical information about the age-structure distribution, the population of each municipality, the number of different nationalities present in Luxembourg, and salary statistics per municipality. The STATEC platform, the statistics portal of Luxembourg, is the public source we used to gather the real information that we fed into our synthetic generation model. Other features, such as date of birth, social matricule, first name, surname, ethnicity, and physical attributes, were derived from logical relationships between variables without using any additional real information. In compliance with the law, the risk of identifying a real person purely by chance is kept close to zero.

  4. Data from: A hierarchical statistical model for estimating population...

    • catalog.data.gov
    • healthdata.gov
    • +2more
    Updated Sep 7, 2025
    Cite
    National Institutes of Health (2025). A hierarchical statistical model for estimating population properties of quantitative genes [Dataset]. https://catalog.data.gov/dataset/a-hierarchical-statistical-model-for-estimating-population-properties-of-quantitative-gene
    Dataset updated
    Sep 7, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background: Earlier methods for detecting major genes responsible for a quantitative trait rely critically upon a well-structured pedigree in which the segregation pattern of genes exactly follows Mendelian inheritance laws. However, for many outcrossing species, such pedigrees are not available and genes also display population properties.

    Results: In this paper, a hierarchical statistical model is proposed to monitor the existence of a major gene based on its segregation and transmission across two successive generations. The model is implemented with an EM algorithm to provide maximum likelihood estimates for genetic parameters of the major locus. This new method is successfully applied to identify an additive gene having a large effect on stem height growth of aspen trees. The estimates of population genetic parameters for this major gene can be generalized to the original breeding population from which the parents were sampled. A simulation study is presented to evaluate finite sample properties of the model.

    Conclusions: A hierarchical model was derived for detecting major genes affecting a quantitative trait based on progeny tests of outcrossing species. The new model takes into account the population genetic properties of genes and is expected to enhance the accuracy, precision and power of gene detection.

  5. Reproductive Health Survey 2005 - Georgia

    • catalog.ihsn.org
    Updated Mar 29, 2019
    Cite
    Centers for Disease Control and Prevention (CDC) (2019). Reproductive Health Survey 2005 - Georgia [Dataset]. http://catalog.ihsn.org/catalog/1888
    Dataset updated
    Mar 29, 2019
    Dataset provided by
    Georgian Ministry of Health (MoLHSA)
    Georgian Centers for Disease Control (NCDC)
    Centers for Disease Control and Prevention (CDC)
    Time period covered
    2005
    Area covered
    Georgia
    Description

    Geographic coverage

    National, with the exception of the separatist regions of Abkhazia and South Ossetia.

    Analysis unit

    Women aged 15-44 years

    Universe

    Because the survey collected information from a representative sample of Georgian women aged 15-44 years, the data can be used to estimate percentages, averages, and other measures for the entire population of women of reproductive age residing in Georgian households in 2005.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Similar to the 1999 RHS survey, the GERHS05 was a population-based probability survey consisting of face to face interviews with women of reproductive age (15-44 years) at their homes. The survey was designed to collect information from a representative sample of approximately 6,000 women of reproductive age throughout Georgia (excluding the separatist regions of Abkhazia and South Ossetia). The population from which the respondents were selected included all females between the ages of 15 and 44 years, regardless of marital status, who were living in households in Georgia during the survey period.

    The current survey used a stratified multistage sampling design that used the 2002 Georgia census as the sampling frame (State Department for Statistics, 2003). To better assist key stakeholders in assessing the baseline situation at a sub-national level, the sample was designed to produce estimates for 11 regions of the country. Census sectors were grouped into 11 strata, corresponding to Georgia’s administrative regions; three small regions, Racha-Lechkhumi, Kvemo Svaneti, and Zemo Svaneti were included in one stratum, identified as the Racha-Svaneti stratum. Data are also representative for the urban-rural distribution of the population at the national level.

    The first stage of the three-stage sample design was selection of census sectors, with probability of selection proportional to the number of households in each of the 11 regional sectors. The first stage was accomplished by using a systematic sampling process with a random starting point in each stratum. During the first stage, 310 census sectors were selected as primary sampling units (PSUs).
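
    The first-stage selection described above (systematic sampling with a random start and probability proportional to household counts) can be sketched as follows. This is a minimal illustration, not the survey's actual procedure: the function name `pps_systematic`, the sector sizes, and the seed are all made up for the example.

```python
import random

def pps_systematic(sizes, n_select, rng):
    """Select n_select units with probability proportional to size.

    sizes: household counts, one per census sector (hypothetical data).
    Returns the indices of the selected sectors in frame order.
    """
    total = sum(sizes)
    interval = total / n_select        # sampling interval on the cumulative scale
    start = rng.random() * interval    # random starting point within the first interval
    selected = []
    cumulative = 0
    idx = 0
    for k in range(n_select):
        target = start + k * interval
        # Advance until the cumulative household count covers the target.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected

rng = random.Random(2005)
sector_sizes = [rng.randint(50, 400) for _ in range(200)]  # made-up sector sizes
chosen = pps_systematic(sector_sizes, 20, rng)
print(chosen)
```

    Larger sectors span more of the cumulative scale, so they are more likely to contain one of the evenly spaced selection points. In this sketch every sector is smaller than the sampling interval, so no sector can be selected twice.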

    The overall sample consisted of 310 PSUs, and the target number of completed interviews was 6,200 for the entire sample, with an average of 20 completed interviews per PSU. The minimum acceptable number of interviews per stratum was set at 400, so that the minimum number of PSUs per stratum was set at 20. With these criteria, 20 PSUs were allocated to each stratum, which accounted for 220 of the available PSUs. The remaining 80 PSUs were distributed in the largest regions in order to obtain a distribution of PSUs approximately proportional to the distribution of households in the 2002 census. An additional 10 PSUs were added to the smallest stratum, Racha-Svaneti, to compensate for the considerable sparseness of women of reproductive age in this stratum.
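
    The allocation arithmetic in the paragraph above can be checked with a few lines. All figures are taken from the description; only the variable names are introduced here.

```python
N_STRATA = 11
BASE_PSUS_PER_STRATUM = 20      # minimum number of PSUs per stratum
EXTRA_PSUS_LARGE_REGIONS = 80   # distributed among the largest regions
EXTRA_PSUS_RACHA_SVANETI = 10   # added to the smallest stratum
INTERVIEWS_PER_PSU = 20         # average completed interviews per PSU

base_psus = N_STRATA * BASE_PSUS_PER_STRATUM
total_psus = base_psus + EXTRA_PSUS_LARGE_REGIONS + EXTRA_PSUS_RACHA_SVANETI
target_interviews = total_psus * INTERVIEWS_PER_PSU
min_interviews_per_stratum = BASE_PSUS_PER_STRATUM * INTERVIEWS_PER_PSU

print(base_psus, total_psus, target_interviews, min_interviews_per_stratum)
# prints: 220 310 6200 400
```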

    Unlike the 1999 survey, a separate sample of internally displaced persons was not selected for the 2005 survey.

    The sampling fraction ranges from 1 in 16 households in the Racha-Svaneti stratum (the least populated stratum) to 1 in 146 in Adjara. If the ratio of households in the census to households in the sample is above 100.0, the region has been undersampled, whereas if the ratio is less than 100.0, the region has been oversampled.

    In the second stage of sampling, clusters of households were randomly selected from each census sector chosen in the first stage. Determination of cluster size was based on the number of households required to obtain an average of 20 completed interviews per cluster. The total number of households in each cluster took into account estimates of unoccupied households, average number of women aged 15–44 years per household, the interview of only one respondent per household, and an estimated response rate of 98%. In the case of households with more than one woman between the ages of 15 and 44, one woman was selected at random to be interviewed.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaire, already refined during the first RHS in Georgia in 1999, was revised carefully and reviewed by a panel of Georgian experts; in subsequent meetings and informal consultations, CDC sought advice on how to design a more effective and useful survey instrument. As a result, the content of the questionnaire was expanded substantially and made more relevant for programmatic needs.

    The questionnaire was designed to collect information on the following:

    • Demographic characteristics
    • Household assets (durable goods and dwelling characteristics)
    • Fertility and child mortality
    • Family planning and reproduction preferences
    • Use of reproductive and child health care services
    • Range and quality of maternity care services
    • Use of preventive and curative health care services
    • Reproductive health care expenditures
    • Perceptions of health service quality
    • Risky health behaviors (smoking and alcohol use)
    • Young adult health education and behaviors
    • Intimate partner violence
    • HIV/AIDS and other STDs

    The questionnaire was tested extensively, both before and during the pretest and prior to beginning the field work. Testing included practice field interviews and simulated interviews conducted by both CDC and NCDC staff. The questionnaire was translated into Georgian and Russian and back-translated into English.

    The inclusion of life histories (marital history and pregnancy history) and the five-year month-by-month calendar of pregnancy, contraceptive use, and union status helped respondents accurately recall the dates of one event in relation to the dates of others they had already recorded.

    Cleaning operations

    Legal ranges, pre-coded variables, consistency checks, and skips were programmed into the data entry software, so that data entry supervisors would notice errors or inconsistencies and could send problematic interviews back to the field for follow-up visits.

    Response rate

    Of the 12,338 households selected in the household sample, 6,402 included at least one eligible woman (aged 15–44 years). Of these identified respondents, 6,376 women were successfully interviewed, yielding a response rate of 99%. Virtually all respondents who were selected to participate and who could be reached agreed to be interviewed and were very cooperative. Response rates did not vary significantly by geographical location.
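
    As a quick check on the figures above, the eligibility and response rates can be recomputed from the reported counts. The variable names are introduced here for illustration; the counts come from the description.

```python
HOUSEHOLDS_SELECTED = 12_338
HOUSEHOLDS_WITH_ELIGIBLE_WOMAN = 6_402
WOMEN_INTERVIEWED = 6_376

# Share of selected households that contained at least one eligible woman.
eligibility_rate = HOUSEHOLDS_WITH_ELIGIBLE_WOMAN / HOUSEHOLDS_SELECTED
# Share of identified respondents who completed the interview.
response_rate = WOMEN_INTERVIEWED / HOUSEHOLDS_WITH_ELIGIBLE_WOMAN

print(f"eligibility rate: {eligibility_rate:.1%}")  # 51.9%
print(f"response rate:    {response_rate:.1%}")     # 99.6%
```

    The unrounded response rate is about 99.6%, which the description reports as 99%.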

    Sampling error estimates

    The estimates for a sample survey are affected by two types of errors: non-sampling error and sampling error. Non-sampling error is the result of mistakes made in carrying out data collection and data processing, including the failure to locate and interview the right household, errors in the way questions are asked or understood, and data entry errors. Although intensive quality-control efforts were made during the implementation of the GERHS05 to minimize this type of error, non-sampling errors are impossible to avoid altogether and difficult to evaluate statistically.

    Sampling error is a measure of the variability between an estimate and the true value of the population parameter intended to be estimated, which can be attributed to the fact that a sample rather than a complete enumeration was used to produce it. In other words, sampling error is the difference between the expected value for any variable measured in a survey and the value estimated by the survey. This sample is only one of the many probability samples that could have been selected from the female population aged 15–44 using the same sample design and projected sample size. Each of these samples would have yielded slightly different results from the actual sample selected. Because the statistics presented here are based on a sample, they may differ by chance variations from the statistics that would result if all women 15–44 years of age in Georgia had been interviewed.

    Sampling error is usually measured in terms of the variance and standard error (square root of the variance) for a particular statistic (mean, proportion, or ratio). The standard error (SE) can be used to calculate confidence intervals (CI) of the estimates within which we can say with a given level of certainty that the true value of population parameter lies. For example, for any given statistic calculated from the survey sample, there is a 95 percent probability that the true value of that statistic will lie within a range of plus or minus two SE of the survey estimate. The chances are about 68 out of 100 (about two out of three) that a sample estimate would fall within one standard error of a statistic based on a complete count of the population. The estimated sampling errors for 95% confidence intervals (1.96 x SE) for selected proportions and sample sizes are shown in Table A.1 of the Final Report. The estimates in Table A.1 can be used to estimate 95% confidence intervals for the estimated proportions shown for each sample size. The sampling error estimates include an average design effect of 1.6, needed because the GERHS05 did not employ a simple random sample but included clusters of elements in the second stage of the sample selection.
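
    The confidence-interval calculation described above can be sketched as follows. The standard error of a proportion under simple random sampling is inflated by the square root of the design effect (1.6 in the description). The proportion and sample size in the example are hypothetical, and `proportion_ci` is a name chosen for this sketch; Table A.1 of the Final Report remains the authoritative source for the survey's own values.

```python
import math

def proportion_ci(p, n, deff=1.0, z=1.96):
    """Return (lower, upper, se) for a proportion p estimated from n cases.

    deff is the design effect: the ratio of the variance under the complex
    design to the variance under simple random sampling.
    """
    se_srs = math.sqrt(p * (1 - p) / n)   # standard error under simple random sampling
    se = se_srs * math.sqrt(deff)         # inflate for the clustered design
    return p - z * se, p + z * se, se

# Hypothetical example: an estimated proportion of 30% based on 1,000 women.
low, high, se = proportion_ci(0.30, 1000, deff=1.6)
print(f"SE = {se:.4f}, 95% CI = ({low:.3f}, {high:.3f})")
```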

    Clusters are generally characterized by some homogeneity among their members, which tends to increase the variance of the sample. Thus, the variance in the sample for the GERHS05 is greater than that of a simple random sample because of the effect of clustering. The design effect represents the ratio of the two variance estimates: the variance of the complex design using clusters, divided by the variance of a simple random sample.
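
    The design-effect ratio defined above, and one common way to read it, can be illustrated briefly. The effective sample size (n divided by the design effect) is a standard interpretation that the description itself does not state, and the variances below are hypothetical values constructed to give the reported average design effect of 1.6.

```python
def design_effect(var_complex, var_srs):
    """Ratio of the variance under the complex design to the SRS variance."""
    return var_complex / var_srs

def effective_sample_size(n, deff):
    """Size of a simple random sample that would give the same precision."""
    return n / deff

n = 6376                       # completed interviews reported for the survey
p = 0.5                        # hypothetical proportion
var_srs = p * (1 - p) / n      # variance under simple random sampling
var_complex = 1.6 * var_srs    # hypothetical clustered-design variance

deff = design_effect(var_complex, var_srs)
print(round(deff, 2), round(effective_sample_size(n, deff)))
```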

  6. Data from: Estimation of genetic parameters and their sampling variances for...

    • agdatacommons.nal.usda.gov
    • datasetcatalog.nlm.nih.gov
    bin
    Updated Feb 13, 2024
    Cite
    Frank M. You; Qijian Song; Gaofeng Jia; Yanzhao Cheng; Scott D. Duguid; Helen M. Booker; Sylvie J. Cloutier (2024). Data from: Estimation of genetic parameters and their sampling variances for quantitative traits in the type 2 modified augmented design [Dataset]. https://agdatacommons.nal.usda.gov/articles/dataset/Data_from_Estimation_of_genetic_parameters_and_their_sampling_variances_for_quantitative_traits_in_the_type_2_modified_augmented_design/24662844
    Available download formats: bin
    Dataset updated
    Feb 13, 2024
    Dataset provided by
    Science Direct
    Authors
    Frank M. You; Qijian Song; Gaofeng Jia; Yanzhao Cheng; Scott D. Duguid; Helen M. Booker; Sylvie J. Cloutier
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The type 2 modified augmented design (MAD2) is an efficient unreplicated experimental design used for evaluating large numbers of lines in plant breeding and for assessing genetic variation in a population. Statistical methods and data adjustment for soil heterogeneity have been previously described for this design. In the absence of replicated test genotypes in MAD2, their total variance cannot be partitioned into genetic and error components as required to estimate heritability and genetic correlation of quantitative traits, the two conventional genetic parameters used for breeding selection. We propose a method of estimating the error variance of unreplicated genotypes that uses replicated controls, and then of estimating the genetic parameters. Using the Delta method, we also derived formulas for estimating the sampling variances of the genetic parameters. Computer simulations indicated that the proposed method for estimating genetic parameters and their sampling variances was feasible and the reliability of the estimates was positively associated with the level of heritability of the trait. A case study of estimating the genetic parameters of three quantitative traits, iodine value, oil content, and linolenic acid content, in a biparental recombinant inbred line population of flax with 243 individuals, was conducted using our statistical models. A joint analysis of data over multiple years and sites was suggested for genetic parameter estimation. A pipeline module using SAS and Perl was developed to facilitate data analysis and appended to the previously developed MAD data analysis pipeline (http://probes.pw.usda.gov/bioinformatics_tools/MADPipeline/index.html).

    Resources in this dataset:

    Resource Title: Table S1. The raw phenotypic data of a population with 243 RILs derived from a cross between ‘CDC Bethune’ and ‘Macbeth’ (BM) for the case study.
    File Name: 1-s2.0-S2214514116000179-mmc1.xlsx
    URL: https://ars.els-cdn.com/content/image/1-s2.0-S2214514116000179-mmc1.xlsx
    Supplementary data

  7. Household Budget Survey 2010 - Estonia

    • catalog.ihsn.org
    Updated Mar 29, 2019
    Cite
    Statistics Estonia (2019). Household Budget Survey 2010 - Estonia [Dataset]. https://catalog.ihsn.org/index.php/catalog/4509
    Dataset updated
    Mar 29, 2019
    Dataset authored and provided by
    Statistics Estonia
    Time period covered
    2010
    Area covered
    Estonia
    Description

    Abstract

    The aim of the 2010 Estonia Household Budget Survey is to get reliable information on the expenditures and consumption of households. Besides obtaining data about the household composition, the survey also provides information on household members’ main demographic and social indicators (marital status, employment, education), as well as on living conditions and ownership of durable goods. The survey data are widely used by ministries and research institutions.

    Since 2000 the HBS, which consists of four parts, has been rather voluminous. The Household Picture covers background data on household members such as sex, age, marital status, education, coping, employment, etc. The Post-Interview is intended for registering the changes entered during the survey. The Diary Book for Food Expenditure reflects the expenditure made by the household during half a month. The Diary Book for Income, Taxes and Expenditure contains data about monetary and non-monetary income received by the household as well as the expenditures on all commodities and services.

    Geographic coverage

    National

    Analysis unit

    • Households;
    • Individuals.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The population of the Household Budget Survey was made up of all permanent residents of the Republic of Estonia aged 15 or older as of 1 January 2010, who live in private households, excl. those residing in institutions on a long-term basis (at least for a year). The Estonian Population Register, administered by the Ministry of Internal Affairs, was used as a sampling frame representing the survey population.

    The HBS is a sample survey i.e. the population is evaluated on the basis of the data collected from the sample. The survey sample was drawn from among the persons registered in the Population Register who were 15 years of age or older as at 1 January 2009. The person included in the sample (address person) brought his/her household into the sample.

    Sample persons were drawn from the Population Register by the stratified unproportional systematic sampling procedure. In this sampling procedure, the population is divided into non-overlapping subpopulations or strata, and independent subsamples are drawn separately from every subpopulation following the systematic sampling procedure and by applying different inclusion probabilities. The population was stratified by the county in which the address person's place of residence was located. The stratification followed the principles worked out for and applied to the Estonian Social Survey, which has been carried out on an annual basis since 2004, and thus three strata were formed by the number of inhabitants in the respective county. Hiiu county, being smaller than the other counties, formed a separate stratum; the remaining counties were divided into two strata, the larger and the smaller ones. Counties with a population of less than 60,000 (as at 1 January of the survey year) belonged to the stratum of smaller counties.

    To ensure an even distribution of the sample and preclude several address persons living at the same address from falling into the sample, records in the strata were sorted by address: first by the county code; within the county, by the rural municipality code; within the rural municipality, by the name of village; next, by the street name; and finally, by the house number.
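
    The procedure described in the two paragraphs above (independent systematic samples per stratum, different inclusion probabilities, records sorted by address) can be sketched as follows. The stratum sizes, sampling steps, and address fields are hypothetical and chosen only to make the example run; they are not Statistics Estonia's actual values.

```python
import random

def systematic_sample(records, step, rng):
    """Take every step-th record after a random start (records already sorted)."""
    start = rng.randrange(step)
    return records[start::step]

def stratified_systematic(population, steps, rng):
    """Draw an independent systematic sample from each stratum.

    steps maps stratum name to sampling step; the inclusion probability
    in a stratum is 1 / step, so strata can be sampled at different rates.
    """
    sample = []
    for stratum, step in steps.items():
        members = sorted(
            (person for person in population if person["stratum"] == stratum),
            key=lambda person: person["address"],  # county, municipality, village, street, house
        )
        sample.extend(systematic_sample(members, step, rng))
    return sample

rng = random.Random(2010)
sizes = {"Hiiu": 100, "smaller": 1000, "larger": 4000}   # made-up stratum sizes
population = [
    {
        "stratum": stratum,
        "address": (stratum, rng.randint(1, 5), f"village{rng.randint(1, 9)}",
                    f"street{rng.randint(1, 20)}", rng.randint(1, 50)),
    }
    for stratum, size in sizes.items()
    for _ in range(size)
]
steps = {"Hiiu": 5, "smaller": 20, "larger": 40}          # hypothetical sampling steps
sample = stratified_systematic(population, steps, rng)
print(len(sample))
```

    Because the records are sorted by address before every step-th one is taken, people registered at the same address sit next to each other in the list and are unlikely to be selected together, which is the purpose the description gives for the sort.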

    The original sample included 8,100 persons. In order not to put an excessive burden on the respondents, those who had participated in Statistics Estonia's surveys before were excluded. The final size of the sample was 7,803 persons.

    Although the inclusion probability is smaller in the stratum of larger counties than in other strata, the result gives a relatively large sample for Tallinn. This is necessary for the purpose of analysis, because in Tallinn the response probability is the lowest, but the diversity of households is the largest. Thus, a larger sample size from other (more homogenous) regions guarantees a required accuracy of estimates.

    Mode of data collection

    Face-to-face [f2f]

    Sampling error estimates

    Only a part of the population can be surveyed in a sample survey. Because of that, the indicators calculated on the basis of sample data always differ somewhat from the actual value of the estimated population parameter. Such a difference is called the random error or sampling error of estimation. It is not possible to specify the sampling error exactly, but it can be estimated statistically from the variability or dispersion of the statistic used for parameter estimation, taking into account the sample design used in the survey. In addition to the sample design, the sampling error depends on the sample size. A smaller sampling error can be expected with larger sample sizes.

    An important group of quality indicators consists of the accuracy estimations of parameters calculated on the basis of the survey. The accuracy estimations provided by Statistics Estonia are estimates of the sampling error i.e. these estimations do not reflect other possible error sources. Estimates of sampling errors are calculated for more important indicators.

    Standard error is the main sampling error estimate. It is the square root of the variance of a parameter estimate calculated from the sample. As the sample is selected randomly, the parameter estimate is also a random variable, and a variance can be calculated for it. The smaller the variance, the more precise the parameter estimate. The variance of an estimate depends on the sample size and sample design.

    Relative standard error expresses the standard error of an estimate as a proportion of the estimated value. As a rule, it is presented as a percentage. Because relative standard error does not depend on the unit of measurement, it allows different parameter estimates to be compared with each other irrespective of their units. It is a practical tool for getting a quick overview of the accuracy of estimates.
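
    As an illustration of the definition above, relative standard error is the standard error divided by the estimate, expressed as a percentage. The two estimates below are hypothetical and use different units to show why the measure is comparable across indicators.

```python
def relative_standard_error(estimate, standard_error):
    """Standard error as a percentage of the estimated value."""
    return 100.0 * standard_error / abs(estimate)

# Hypothetical estimates in different units.
monthly_expenditure = relative_standard_error(250.0, 5.0)    # currency units
share_owning_car = relative_standard_error(0.12, 0.006)      # proportion

print(f"{monthly_expenditure:.1f}% vs {share_owning_car:.1f}%")
```

    Although the first standard error is far larger in absolute terms, its relative standard error is smaller, so under these made-up numbers the expenditure estimate is the more precise of the two.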

  8. Summary of sampling statistics for extensive and intensive (arrays A–E)...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 3, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Clay M. Wilton; Emily E. Puckett; Jeff Beringer; Beth Gardner; Lori S. Eggert; Jerrold L. Belant (2023). Summary of sampling statistics for extensive and intensive (arrays A–E) black bear survey configurations in south-central Missouri, USA. [Dataset]. http://doi.org/10.1371/journal.pone.0111257.t002
    Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Clay M. Wilton; Emily E. Puckett; Jeff Beringer; Beth Gardner; Lori S. Eggert; Jerrold L. Belant
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Missouri, United States
    Description

    Values are given as mean (standard deviation, total) over six sessions. Note that the sum of new detections (u) was 92 individuals for the intensive design because two individuals were detected in two arrays (i.e., the actual total was 90 individuals).

    a Number of lured snares in each session.
    b Number of individuals detected for the first time on each session.
    c Number of individuals detected on each session.
    d Number of detections, including within-session recaptures.
    e Number of snares having at least one detection per session.

    Summary of sampling statistics for extensive and intensive (arrays A–E) black bear survey configurations in south-central Missouri, USA.

  9. Summary of heterozygosity, f-statistics, and polymorphism by population for...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Aug 7, 2024
    Cite
    Martinelli, Tommaso; De Vita, Pasquale; Puglisi, Damiano; Pecchioni, Nicola; Paris, Roberta; Bassolino, Laura; Pasquariello, Marianna; Esposito, Salvatore (2024). Summary of heterozygosity, f-statistics, and polymorphism by population for codominant data. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001424662
    Dataset updated
    Aug 7, 2024
    Authors
    Martinelli, Tommaso; De Vita, Pasquale; Puglisi, Damiano; Pecchioni, Nicola; Paris, Roberta; Bassolino, Laura; Pasquariello, Marianna; Esposito, Salvatore
    Description

    The table displays the following parameters: sample size (N); number of different alleles (Na); number of effective alleles; Shannon’s information index, I = −∑(p_i · ln(p_i)); observed heterozygosity; expected heterozygosity, He = 1 − ∑(p_i²); unbiased expected heterozygosity; and fixation index. (CSV)
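
    The two formulas that survive in the description above, Shannon's information index and expected heterozygosity, can be computed from allele frequencies as in the sketch below. The allele frequencies are hypothetical, and the remaining parameters are left out because their formulas are not given in the text.

```python
import math

def shannon_index(freqs):
    """Shannon's information index: I = -sum(p_i * ln(p_i))."""
    return -sum(p * math.log(p) for p in freqs if p > 0)

def expected_heterozygosity(freqs):
    """Expected heterozygosity: He = 1 - sum(p_i ** 2)."""
    return 1.0 - sum(p * p for p in freqs)

# Hypothetical allele frequencies at one locus; they must sum to 1.
allele_freqs = [0.5, 0.3, 0.2]

I = shannon_index(allele_freqs)
He = expected_heterozygosity(allele_freqs)
print(f"I = {I:.4f}, He = {He:.2f}")
```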

  10. Statistical analyses of vector competence parameters of Ae. albopictus...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Sep 19, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sarah Hafsia; Tatiana Barbar; Haoues Alout; Fiona Baudino; Cyrille Lebon; Yann Gomard; David A. Wilkinson; Toscane Fourié; Patrick Mavingui; Célestine Atyame (2024). Statistical analyses of vector competence parameters of Ae. albopictus populations infected with the DENV-1 strain. [Dataset]. http://doi.org/10.1371/journal.pone.0310635.t003
    Available download formats: xls
    Dataset updated
    Sep 19, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Sarah Hafsia; Tatiana Barbar; Haoues Alout; Fiona Baudino; Cyrille Lebon; Yann Gomard; David A. Wilkinson; Toscane Fourié; Patrick Mavingui; Célestine Atyame
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Mosquitoes were examined at 14, 21 and 28 days post-exposure (dpe). In these analyses, the influence of mosquito population, dpe, generation and area was tested. d.f. is the degrees of freedom and χ² is the chi-square value.

  11. Data from: Expert opinions of demographic rates of Argentine black and white...

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Oct 29, 2025
    Cite
    U.S. Geological Survey (2025). Expert opinions of demographic rates of Argentine black and white tegus in South Florida [Dataset]. https://catalog.data.gov/dataset/expert-opinions-of-demographic-rates-of-argentine-black-and-white-tegus-in-south-florida
    Dataset updated
    Oct 29, 2025
    Dataset provided by
    U.S. Geological Survey
    Area covered
    Florida
    Description

    We illustrate the utility of expert elicitation, explicit recognition of uncertainty, and the value of information for directing management and research efforts for invasive species, using tegu lizards (Salvator merianae) in southern Florida as a case study. We posited a post-birth-pulse matrix model, which was parameterized using a 3-point process to elicit estimates of tegu demographic rates from herpetology experts. We fit statistical distributions for each parameter and for each expert, then drew and pooled a large number of replicate samples from these to form a distribution for each demographic parameter. Using these distributions, we generated a large sample of matrix models to infer how the tegu population might respond to control efforts. We used the concepts of Pareto efficiency and stochastic dominance to conclude that targeting older age classes at relatively high rates appears to have the best chance of minimizing tegu abundance and control costs. Expert opinion combined with an explicit consideration of uncertainty can be valuable for conducting an initial assessment of the effort needed to control the invader. The value of information can be used to focus research in a way that not only helps increase the efficacy of control, but minimizes costs as well.
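
    The workflow described above (fit a distribution to each expert's 3-point estimate, pool replicate draws, then generate many matrix models) can be sketched in a simplified form. Everything specific here is an assumption made for illustration: the description does not say which distributions were fitted, so triangular distributions are swapped in; the two-stage matrix is far simpler than a real tegu model; and all elicited values are invented.

```python
import math
import random

# Hypothetical 3-point estimates (low, most likely, high) from three experts.
ELICITED = {
    "juvenile_survival": [(0.20, 0.35, 0.50), (0.25, 0.40, 0.60), (0.15, 0.30, 0.45)],
    "adult_survival": [(0.60, 0.75, 0.90), (0.55, 0.70, 0.85), (0.65, 0.80, 0.95)],
    "fertility": [(2.0, 5.0, 9.0), (3.0, 6.0, 10.0), (1.5, 4.0, 8.0)],
}

def pooled_draws(estimates, n_per_expert, rng):
    """Draw from one triangular distribution per expert and pool the draws."""
    draws = []
    for low, mode, high in estimates:
        draws.extend(rng.triangular(low, high, mode) for _ in range(n_per_expert))
    return draws

def growth_rate(juvenile_survival, adult_survival, fertility):
    """Dominant eigenvalue of [[0, fertility], [juvenile_survival, adult_survival]]."""
    discriminant = adult_survival ** 2 + 4 * fertility * juvenile_survival
    return (adult_survival + math.sqrt(discriminant)) / 2

rng = random.Random(42)
pooled = {name: pooled_draws(est, 1000, rng) for name, est in ELICITED.items()}

# Each replicate matrix model combines one pooled draw per parameter.
lambdas = [
    growth_rate(
        rng.choice(pooled["juvenile_survival"]),
        rng.choice(pooled["adult_survival"]),
        rng.choice(pooled["fertility"]),
    )
    for _ in range(5000)
]
share_growing = sum(lam > 1 for lam in lambdas) / len(lambdas)
print(f"share of replicate models with growth rate above 1: {share_growing:.2f}")
```

    The spread of the resulting growth rates reflects both disagreement among experts and each expert's own uncertainty, which is the kind of pooled uncertainty the description refers to.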

  12. Progress in International Reading and Literacy Study 2001 - International

    • datafirst.uct.ac.za
    Updated May 14, 2020
    Cite
    International Association for the Evaluation of Educational Achievement (2020). Progress in International Reading and Literacy Study 2001 - International [Dataset]. https://www.datafirst.uct.ac.za/dataportal/index.php/catalog/543
    Explore at:
    Dataset updated
    May 14, 2020
    Dataset provided by
    International Association for the Evaluation of Educational Achievement
    International Study Centre
    Time period covered
    2001
    Area covered
    International
    Description

    Abstract

    The PIRLS 2001 aimed to generate a database of student achievement data in addition to information on student, parent, teacher, and school background data for the 35 countries that participated in PIRLS 2001.

    Geographic coverage

    The survey had international coverage

    Analysis unit

    Individuals and establishments

    Universe

    The PIRLS 2001 target populations are all children in "the upper of the two grades with the most 9-year-olds at the time of testing" (PIRLS, 1999) in each participating country. This corresponds to the fourth grade in most countries. This population was chosen because it represents an important transition point in children's development as readers. In most countries, by the end of fourth grade, children are expected to have learned how to read, and are now reading to learn.

    The teachers in the PIRLS 2001 international database do not constitute representative samples of teachers in the participating countries. Rather, they are the teachers of nationally representative samples of students. Therefore, analyses with teacher data should be made with students as the units of analysis and reported in terms of students who are taught by teachers with a particular attribute. Teacher data are analyzed by linking the students to their teachers. The student-teacher linkage data files are used for this purpose. The same caveat applies to analyses of schools and parents.

    Kind of data

    Sample survey data

    Sampling procedure

    To be acceptable for PIRLS 2001, national sample designs had to result in probability samples that gave accurate weighted estimates of population parameters such as means and percentages, and for which estimates of sampling variance could be computed. The PIRLS 2001 sample design is derived from the design of IEA's TIMSS (see Foy & Joncas, 2000), with minor refinements. Since sampling for PIRLS was to be implemented by the National Research Coordinator (NRC) in each participating country - often with limited resources - it was essential that the design be simple and easy to implement while yielding accurate and efficient samples of both schools and students.

    The international project team provided manuals and expert advice to help NRCs adapt the PIRLS sample design to their national system, and to guide them through the phases of sampling. The School Sampling Manual (PIRLS, 1999) describes how to implement the international sample design to select the school sample; and offers advice on initial planning, adapting the design to national situations, establishing appropriate sample selection procedures, and conducting fieldwork. The Survey Operations Manual and School Coordinator Manual (PIRLS, 2001b, 2001a) provide information on sampling within schools, assigning assessment booklets and questionnaires to sampled students, and tracking respondents and non-respondents. To automate the rather complex within-school sampling procedures, NRCs were provided with sampling software jointly developed by the IEA Data Processing Center and Statistics Canada (IEA, 2001).

    In IEA studies, the target population for all countries is known as the international desired target population. This is the grade or age level that each country should address in its sampling activities. The international desired target population for PIRLS 2001 was the following: "All students enrolled in the upper of the two adjacent grades that contain the largest proportion of 9-year-olds at the time of testing."

    PIRLS expected all participating countries to define their national desired population to correspond as closely as possible to its definition of the international desired population. Using its national desired population as a basis, each participating country had to define its population in operational terms for sampling purposes. This definition, known in IEA terminology as the national defined population, is essentially the sampling frame from which the first stage of sampling takes place. Ideally, the national defined population should coincide with the national desired population, although in reality there may be some school types or regions that cannot be included; consequently, the national defined population is usually a very large subset of the national desired population. All schools and students in the desired population not included in the defined population are referred to as the excluded population.

    The international sample design for PIRLS is generally referred to as a two-stage stratified cluster sample design. The first stage consists of a sample of schools, which may be stratified; the second stage consists of a sample of one or more classrooms from the target grade in sampled schools.
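A two-stage stratified cluster design of the kind described above can be sketched as follows. This is only an illustration: the school frame, the stratum labels and the use of simple random selection at both stages are assumptions made for the example, and the text above does not say how PIRLS itself selected schools or classrooms.

```python
import random


def two_stage_sample(schools, schools_per_stratum, classes_per_school=1, seed=0):
    """Stage 1: sample schools within each stratum.
    Stage 2: sample classrooms within each sampled school.
    Simple random selection is an assumption made for this sketch."""
    rng = random.Random(seed)

    # Group the (hypothetical) school frame by stratum.
    by_stratum = {}
    for school in schools:
        by_stratum.setdefault(school["stratum"], []).append(school)

    selected = []
    for stratum, members in sorted(by_stratum.items()):
        k = min(schools_per_stratum, len(members))
        for school in rng.sample(members, k):
            n_cls = min(classes_per_school, len(school["classes"]))
            for cls in rng.sample(school["classes"], n_cls):
                selected.append((stratum, school["id"], cls))
    return selected


# Hypothetical frame: 12 schools in two strata, each with 1 to 3 target-grade classes.
frame = [
    {
        "id": f"S{i:02d}",
        "stratum": "urban" if i % 2 else "rural",
        "classes": [f"4{c}" for c in "ABC"[: 1 + i % 3]],
    }
    for i in range(12)
]

picked = two_stage_sample(frame, schools_per_stratum=3)
for row in picked:
    print(row)
```

Because whole classrooms are drawn, students in the same class enter the sample together, which is why such designs need sampling-variance estimates that account for clustering.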

    For more information on the approach to sampling adopted please consult section 5 of the PIRLS 2001 user guide.

    Sampling deviation

    Although countries were expected to do everything possible to maximize coverage of the population by the sampling plan, schools could be excluded, where necessary, from the sampling frame for the following reasons:

    • They were in geographically remote regions.
    • They were extremely small in size.
    • They offered a curriculum or a school structure that was different from the mainstream educational system(s).
    • They provided instruction only to students in the categories defined as “within-school exclusions.”

    Within-school exclusions were limited to students who, because of some disability, were unable to take the PIRLS tests. NRCs were asked to define anticipated within-school exclusions. Because these definitions can vary internationally, they were also asked to follow certain rules adapted to their jurisdictions. In addition, they were to estimate the size of the included population so that their compliance with the 95 percent rule could be projected. The general PIRLS rules for defining within-school exclusions included the following three groups:

    • Educable mentally-disabled students. These are students who were considered, in the professional opinion of the school principal or other qualified staff members, to be educable mentally disabled – or who had been so diagnosed in psychological tests. This category included students who were emotionally or mentally unable to follow even the general instructions of the PIRLS test. It did not include students who merely exhibited poor academic performance or discipline problems.
    • Functionally-disabled students. These are students who were permanently physically disabled in such a way that they could not perform in the PIRLS tests. Functionally-disabled students who could perform were included in the testing.
    • Non-native-language speakers. These are students who could not read or speak the language of the test, and so could not overcome the language barrier of testing. Typically, a student who had received less than one year of instruction in the language of the test was excluded, but this definition was adapted in different countries.

    A major objective of PIRLS was that the effective target population, the population actually sampled by PIRLS, be as close as possible to the international desired population. Each country had to account for any exclusion of eligible students from the international desired population. This applied to school-level exclusions as well as within-school exclusions. See Appendix B of the PIRLS 2001 Technical Report for a detailed account of sample implementation in each country.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    PIRLS Background Questionnaires

    By gathering information about children’s experiences together with reading achievement on the PIRLS test, it is possible to identify the factors or combinations of factors that relate to high reading literacy. An important part of the PIRLS design is a set of questionnaires targeting factors related to reading literacy. PIRLS administered four questionnaires: to the tested students, to their parents, to their reading teachers, and to their school principals.

    Student Questionnaire

    Each student taking the PIRLS reading assessment completes the student questionnaire. The questionnaire asks about aspects of students’ home and school experiences – including instructional experiences and reading for homework, self-perceptions and attitudes towards reading, out-of-school reading habits, computer use, home literacy resources, and basic demographic information.

    Learning to Read (Home) Survey

    The learning to read survey is completed by the parents or primary caregivers of each student taking the PIRLS reading assessment. It addresses child-parent literacy interactions, home literacy resources, parents’ reading habits and attitudes, home-school connections, and basic demographic and socioeconomic indicators.

    Teacher Questionnaire

    The reading teacher of each fourth-grade class sampled for PIRLS completes a questionnaire designed to gather information about classroom contexts for developing reading literacy. This questionnaire asks teachers about characteristics of the class tested (such as size, reading levels of the students, and the language abilities of the students). It also asks about instructional time, materials and activities for teaching reading and promoting the development of their students’ reading literacy, and the grouping of students for reading instruction. Questions about classroom resources, assessment practices, and home-school connections also are included. The questionnaire also asks teachers for their views on opportunities for professional development and collaboration with other teachers, and for information about their education and training.

    School Questionnaire The principal of each

  13. Descriptive Statistics of Sample (N = 6091).

    • plos.figshare.com
    xls
    Updated May 30, 2023
    Cite
    Ann M. Swartz; Young Cho; Whitney A. Welch; Scott J. Strath (2023). Descriptive Statistics of Sample (N = 6091). [Dataset]. http://doi.org/10.1371/journal.pone.0150325.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Ann M. Swartz; Young Cho; Whitney A. Welch; Scott J. Strath
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Descriptive Statistics of Sample (N = 6091).

  14. Household Income and Expenditure Survey 2022 - Tuvalu

    • microdata.pacificdata.org
    Updated May 15, 2025
    Cite
    Central Statistics Division (2025). Household Income and Expenditure Survey 2022 - Tuvalu [Dataset]. https://microdata.pacificdata.org/index.php/catalog/880
    Explore at:
    Dataset updated
    May 15, 2025
    Dataset authored and provided by
    Central Statistics Division
    Time period covered
    2022 - 2023
    Area covered
    Tuvalu
    Description

    Abstract

    The main purpose of the Household Income and Expenditure Survey (HIES) was to provide high-quality, representative national household data on income and expenditure in order to update the Consumer Price Index (CPI), improve National Accounts statistics, and measure poverty within the country. These statistics are required for evidence-based policy-making to reduce poverty within the country and to monitor progress on the national strategic plan in place.

    Geographic coverage

    Urban (Funafuti) and rural areas (outer islands).

    Analysis unit

    Household and Individual.

    Universe

    Private households.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The sampling design of the Tuvalu 2022 HIES consists of the random selection of an appropriate number of households within each stratum (urban and rural), so that HIES results can be disaggregated at the stratum level in addition to the national level. The urban stratum of Tuvalu is the island of Funafuti (as a whole), and the rest of the country (all outer islands) makes up the rural stratum. The statistical unit used for this sampling analysis is the household. The sampling procedure is based on the following steps:
    - Assess the accuracy of the previous 2015 HIES in terms of per capita total expenditure (the variable of interest) and check whether the sample size at that time was appropriate and correctly distributed between the two strata,
    - Update this assessment using the most recent population count to get the new sample size and distribution,
    - Proceed to the random selection of households using this most recent population count.

    The sampling frame (the most recent household listing and population count) used for the update and the selection is the 2021 Tuvalu Household Listing conducted by the Central Statistics Division of Tuvalu. At the national level, the 2015 Tuvalu HIES reported good accuracy for per capita total expenditure (less than 5%), but the results disaggregated by stratum showed lower quality in urban Tuvalu. The Tuvalu 2021 household listing provides the most recent distribution of households across all the islands of Tuvalu. This step consists of updating the accuracy of the previous 2015 HIES using this recent household count and obtaining the appropriate RSE by changing the sample size. Because of budget constraints, the total sample size cannot be increased. The only parameter that can be modified is therefore the distribution of the sample across the strata.

    Sample size by stratum:
    - Urban: 350 (out of 1,010 urban households as per the 2021 listing)
    - Rural: 310 (out of 835 rural households as per the 2021 listing)
    - National: 660 (out of 1,845 total households as per the 2021 listing)

    2015 per capita mean total expenditure (AUD):
    - Urban: 3,190
    - Rural: 2,780
    - National: 3,000

    Relative Standard Error (RSE):
    - Urban: 5.1%
    - Rural: 4.1%
    - National: 3.3%
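The figures above can be combined into sampling fractions and implied standard errors, as in the following sketch. It assumes that the listed RSEs refer to the listed 2015 per capita means (RSE = standard error / mean), which the text does not state explicitly; the function names are illustrative only.

```python
# Figures copied from the listing above; helper names are hypothetical.
SAMPLE = {"Urban": 350, "Rural": 310, "National": 660}
FRAME = {"Urban": 1010, "Rural": 835, "National": 1845}  # 2021 household listing
MEAN_2015_AUD = {"Urban": 3190, "Rural": 2780, "National": 3000}
RSE = {"Urban": 0.051, "Rural": 0.041, "National": 0.033}


def sampling_fraction(stratum):
    """Share of listed households that are selected in a stratum."""
    return SAMPLE[stratum] / FRAME[stratum]


def implied_standard_error(stratum):
    """RSE = SE / mean, so SE = RSE * mean (assumes the RSE refers to this mean)."""
    return RSE[stratum] * MEAN_2015_AUD[stratum]


for stratum in SAMPLE:
    print(
        f"{stratum}: fraction={sampling_fraction(stratum):.1%}, "
        f"implied SE={implied_standard_error(stratum):.0f} AUD"
    )
```

Under these assumptions, roughly one household in three is selected in each stratum.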

    This new sample design results in a new distribution with an increased sample in urban Funafuti, mainly due to:
    - The low quality of the survey results from the 2015 HIES,
    - The number of households, which increased by more than 15% between 2015 and 2020 in the Tuvalu urban area.

    The household selection process is based on a simple random procedure within each stratum:
    - The 350 households in Funafuti are selected using the same probability of selection across all villages of the island.
    - The 310 households in rural Tuvalu are distributed proportionally to the size of each rural island of Tuvalu. This proportional allocation of the sample across rural Tuvalu islands generates the best accuracy at the stratum level.

    Distribution of sample across strata:
    Urban:
    - Funafuti 350
    Rural:
    - Nanumea 42
    - Nanumaga 37
    - Niutao 46
    - Nui 39
    - Vaitupu 75
    - Nukufetau 45
    - Nukulaelae 23
    - Niulakita 4

    Non-response is a problem in surveys, and it is crucial that the field teams interview the selected households (the location on the map and the name of the household head are used to help identify the selected households). During the first visit, interviewers must do their best to convince the household head to participate in the survey (and get his/her approval to proceed with the interview). It may happen in the field that the first visit results in:
    I. A refusal: the household head does not show any interest in the survey and is reluctant to participate,
    II. An empty house (household members away at the time of the visit).

    (I) Refusal: if the interviewer cannot convince the household head to participate, he has to liaise with the survey management, and the supervisor will help in the discussion to convince the household head to respond. In this case, it is important to mention that all responses are kept confidential and to stress the importance of the survey for the benefit of the Tuvalu population.
    (II) Empty house: the interviewer must investigate (checking with neighbours) whether or not the house is still inhabited by the family:
    o If it is not, the dwelling is vacant and the replacement procedure must be activated.
    o If the dwelling is still occupied, the interviewer must come back later the same day, or the day after at a different time.

    Only in extreme cases of persistent refusal or empty house (household members away during the time of the collection) the replacement procedure must be activated. The replacement procedure consists in changing the selected household to the closest neighbour who is available.

    Mode of data collection

    Computer Assisted Personal Interview [capi]

    Research instrument

    The 2022 Tuvalu Household Income and Expenditure Survey (HIES) questionnaire was developed in English and follows the Pacific Standard HIES questionnaire structure. It is administered on CAPI using Survey Solutions, and the diary is no longer part of the form. All transactions (food, non-food, home production and gifts) are collected through different recall sections during the same visit. The traditional 14-day diary is no longer recommended in the region. This new method of implementing the HIES presents some valuable advantages, such as cost savings, data quality, and reduced time for data processing and reporting. The 2022 HIES of Tuvalu was directly integrated into a census through a Long Form Census (LFC). The LFC was an experiment led by the World Bank and the Pacific Community to combine a census and a HIES collection. All households were enumerated as normal during the 2022 Census, and households selected to participate in the HIES were then asked the HIES questions.

    Below is a list of all modules in this questionnaire:
    - Household ID
    - Demographic characteristics
    - Education
    - Health
    - Functional difficulties
    - Communication
    - Alcohol
    - Other individual expenses
    - Labour force
    - Fisheries
    - Handicraft and home-processed food
    - Dwelling characteristics
    - Assets
    - Home maintenance
    - Vehicles
    - International trips
    - Domestic trips
    - Household services
    - Financial support
    - Other household expenditure
    - Ceremonies
    - Remittances
    - Food insecurity
    - Financial inclusion
    - Livestock & aquaculture
    - Agriculture parcel
    - Agriculture vegetables
    - Agriculture rootcrops
    - Agriculture fruits

    The survey questionnaire can be found in this documentation.

    Cleaning operations

    Data was edited, cleaned and imputed using the software Stata.

    Response rate

    A total of 662 households were in the original sample selection. Of these, 592 were contacted and 528 accepted the interview. The number of valid households is 464, or 70% of households before replacement. After replacement, a further 54 households were considered valid, bringing the final completion rate to 78% (73% in urban and 85% in rural areas).
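The completion rates quoted above follow from the household counts, as this short check shows. The variable names are illustrative, and the acceptance rate among contacted households is a derived figure that the source does not report.

```python
# Counts taken from the response-rate paragraph above; names are hypothetical.
SELECTED = 662            # households in the original selection
CONTACTED = 592
ACCEPTED = 528
VALID_BEFORE = 464        # valid households before replacement
VALID_REPLACEMENTS = 54   # valid households gained through replacement


def pct(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100 * part / whole


rate_before = pct(VALID_BEFORE, SELECTED)
rate_after = pct(VALID_BEFORE + VALID_REPLACEMENTS, SELECTED)
acceptance = pct(ACCEPTED, CONTACTED)  # derived here, not reported in the source

print(f"Before replacement: {rate_before:.1f}%")
print(f"After replacement:  {rate_after:.1f}%")
print(f"Accepted among contacted: {acceptance:.1f}%")
```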

  15. Enterprise Survey 2002 - Poland

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated Sep 26, 2013
    Cite
    European Bank for Reconstruction and Development (2013). Enterprise Survey 2002 - Poland [Dataset]. https://microdata.worldbank.org/index.php/catalog/383
    Explore at:
    Dataset updated
    Sep 26, 2013
    Dataset provided by
    European Bank for Reconstruction and Development (http://ebrd.com/)
    World Bank Group (http://www.worldbank.org/)
    Time period covered
    2002
    Area covered
    Poland
    Description

    Abstract

    This research was conducted in Poland from June 19 to July 31, 2002, as part of the second round of the Business Environment and Enterprise Performance Survey. The objective of the survey is to obtain feedback from enterprises on the state of the private sector as well as to help in building a panel of enterprise data that will make it possible to track changes in the business environment over time, thus allowing, for example, impact assessments of reforms. Through face-to-face interviews with firms in the manufacturing and services sectors, the survey assesses the constraints to private sector growth and creates statistically significant business environment indicators that are comparable across countries.

    The survey topics include company's characteristics, information about sales and suppliers, competition, infrastructure services, judiciary and law enforcement, security, government policies and regulations, bribery, sources of financing, overall business environment, performance and investment activities, and workforce composition.

    Geographic coverage

    National

    Analysis unit

    The primary sampling unit of the study is the establishment.

    Universe

    The manufacturing and services sectors are the primary business sectors of interest.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The information below is taken from "The Business Environment and Enterprise Performance Survey - 2002. A brief report on observations, experiences and methodology from the survey" prepared by MEMRB Custom Research Worldwide (now part of Synovate), a research company that implemented BEEPS II instrument.

    The general targeted distributional criteria of the sample in BEEPS II countries were to be as follows:

    1) Coverage of countries: The BEEPS II instrument was to be administered to approximately 6,500 enterprises in 28 transition economies: 16 from CEE (Albania, Bosnia and Herzegovina, Bulgaria, Croatia, Czech Republic, Estonia, FR Yugoslavia, FYROM, Hungary, Latvia, Lithuania, Poland, Romania, Slovak Republic, Slovenia and Turkey) and 12 from the CIS (Armenia, Azerbaijan, Belarus, Georgia, Kazakhstan, Kyrgyzstan, Moldova, Russia, Tajikistan, Turkmenistan, Ukraine and Uzbekistan).

    2) In each country, the sector composition of the total sample in terms of manufacturing versus services (including commerce) was to be determined by the relative contribution of GDP, subject to a 15% minimum for each category. Firms that operated in sectors subject to government price regulations and prudential supervision, such as banking, electric power, rail transport, and water and wastewater were excluded.

    Eligible enterprise activities were as follows (ISIC sections):
    - Mining and quarrying (Section C: 10-14), Construction (Section F: 45), Manufacturing (Section D: 15-37)
    - Transportation, storage and communications (Section I: 60-64), Wholesale, retail, repairs (Section G: 50-52), Real estate, business services (Section K: 70-74), Hotels and restaurants (Section H: 55), Other community, social and personal activities (Section O: selected groups).

    3) Size: At least 10% of the sample was to be in the small and 10% in the large size categories. A small firm was defined as an establishment with 2-49 employees, medium - with 50-249 workers, and large - with 250 - 9,999 employees. Companies with only one employee or more than 10,000 employees were excluded.
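The size bands in criterion 3 can be written as a small classifier. This is a sketch: the function name is hypothetical, and treating a firm with exactly 10,000 employees as excluded is an assumption, since the text assigns "large" up to 9,999 and excludes "more than 10,000".

```python
def size_class(employees):
    """Return the BEEPS II size band, or None if the firm is excluded.

    Bands from the text: small 2-49, medium 50-249, large 250-9,999.
    Assumption: exactly 10,000 employees is treated as excluded.
    """
    if employees < 2 or employees >= 10_000:
        return None  # one employee, or too large: excluded from the sample
    if employees <= 49:
        return "small"
    if employees <= 249:
        return "medium"
    return "large"


for n in (1, 2, 49, 50, 249, 250, 9_999, 10_001):
    print(n, size_class(n))
```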

    4) Ownership: At least 10% of the firms were to be under foreign control (more than 50% shareholding) and 10% under state control.

    5) Exporters: At least 10% of the firms were to be exporters. A firm should be regarded as an exporter if it exported 20% or more of its total sales.

    6) Location: At least 10% of firms were to be in the category "small city/countryside" (population under 50,000).

    7) Year of establishment: Enterprises which were established later than 2000 should be excluded.

    The sample structure for BEEPS II was designed to be as representative (self-weighted) as possible to the population of firms within the industry and service sectors subject to the various minimum quotas for the total sample. This approach ensured that there was sufficient weight in the tails of the distribution of firms by the various relevant controlled parameters (sector, size, location and ownership).

    As pertinent data on the actual population or data which would have allowed the estimation of the population of foreign-owned and exporting enterprises were not available, it was not feasible to build these two parameters into the design of the sample guidelines from the onset. The primary parameters used for the design of the sample were:
    - Total population of enterprises;
    - Ownership: private and state;
    - Size of enterprise: Small, medium and large;
    - Geographic location: Capital, over 1 million, 1 million-250,000, 250-50,000 and under 50,000;
    - Sub-sectors (e.g. mining, construction, wholesale, etc).

    For certain parameters where statistical information was not available, enterprise populations and distributions were estimated from other accessible demographic (e.g. human population concentrations in rural and urban areas) and socio-economic (e.g. employment levels) data.

    Sampling deviation

    The survey was discontinued in Turkmenistan due to concerns about Turkmen government interference with implementation of the study.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The current survey instruments are available: - Screener and Main Questionnaires.

    The survey topics include company's characteristics, information about sales and suppliers, competition, infrastructure services, judiciary and law enforcement, security, government policies and regulations, bribery, sources of financing, overall business environment, performance and investment activities, and workforce composition.

    Cleaning operations

    Data entry and first checking and validation of the results were undertaken locally. Final checking and validation of the results were made at MEMRB Custom Research Worldwide headquarters.

    Response rate

    Overall, in all BEEPS II countries, the implementing agency contacted 18,052 enterprises and achieved an interview completion rate of 36.93%.

    Respondents who either refused outright (i.e. not interested) or were unavailable to be interviewed (i.e. on holiday, etc) accounted for 38.34% of all contacts. Enterprises which were contacted but were non-eligible (i.e. business activity, year of establishment, etc) or quotas were already met (i.e. size, ownership etc) or to which “blind calls” were made to meet quotas (i.e. foreign ownership, exporters, etc) accounted for 24.73% of the total number of enterprises contacted.
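The three outcome shares above account for all contacts, and they can be turned into approximate counts. The counts below are reconstructions from rounded percentages, not figures reported by the source, and the category labels are shorthand introduced for this sketch.

```python
CONTACTED = 18_052  # enterprises contacted across all BEEPS II countries

# Percentages as reported above; the dictionary keys are hypothetical labels.
SHARES_PCT = {
    "completed": 36.93,
    "refused_or_unavailable": 38.34,
    "ineligible_or_quota_met": 24.73,
}

# Approximate counts, reconstructed from the rounded percentages.
approx_counts = {
    outcome: round(CONTACTED * share / 100)
    for outcome, share in SHARES_PCT.items()
}

for outcome, count in approx_counts.items():
    print(f"{outcome}: about {count:,}")
print("Total share:", round(sum(SHARES_PCT.values()), 2))
```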

  16. Parameter values, sample properties and demographic models for the...

    • plos.figshare.com
    xls
    Updated Jan 21, 2025
    Cite
    Zhendong Huang; Jerome Kelleher; Yao-ban Chan; David Balding (2025). Parameter values, sample properties and demographic models for the simulation study. [Dataset]. http://doi.org/10.1371/journal.pgen.1011537.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jan 21, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Zhendong Huang; Jerome Kelleher; Yao-ban Chan; David Balding
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Unless otherwise stated, 25 simulation replicates were generated in each scenario. Model Ga is used for inferences given true IBD and Model Gb is used for inferences from inferred IBD. The value of r is assumed known for all inferences, whereas μ, ϵ and N(g), g ≥ 0, are targets of inference.

  17. Labour Force Survey 2018 - Tonga

    • microdata.pacificdata.org
    Updated Jul 5, 2019
    Cite
    Tonga Statistics Department (TSD) (2019). Labour Force Survey 2018 - Tonga [Dataset]. https://microdata.pacificdata.org/index.php/catalog/256
    Explore at:
    Dataset updated
    Jul 5, 2019
    Dataset authored and provided by
    Tonga Statistics Department (TSD)
    Time period covered
    2018
    Area covered
    Tonga
    Description

    Abstract

    This is the fourth Labour Force Survey of Tonga. Earlier surveys were conducted in 1990, 1993/94, and 2003, and the results of those surveys were published by the Statistics Department.

    The objective of the LFS is to provide information not only on the well-known employment and unemployment indicators but also on other standard indicators characterizing the country's labour market. It covers those aged 10 and over in the whole Kingdom. Information includes age, sex, activity, current and usual employment status, hours worked and wages. In addition, the survey included a separate Food Insecurity Experiences Survey (FIES) questionnaire module at the household level.

    The conceptual framework used in this labour force survey in Tonga aligns closely with the standards and guidelines set out in Resolutions of International Conferences of Labour Statisticians.

    Geographic coverage

    National coverage.

    There are six statistical regions in Tonga, known as Divisions: Tongatapu urban area, Tongatapu rural area, Vava'u, Ha'apai, 'Eua and the Niuas. Tongatapu urban (the capital, Nuku'alofa) is the urban area, while the other five divisions are rural areas. Each Division is subdivided into political districts, each district into villages and each village into census enumeration areas known as Census Blocks. The sample for the 2018 Labour Force Survey (LFS) was designed to cover at least 2,500 employed persons aged 10 years and over from all the regions, mainly to have sufficient cases to provide information on the employed population.

    Analysis unit

    • Households (for food insecurity module questionnaire)
    • Individuals.

    Universe

    Population living in private households in Tonga. The labour force questionnaire is directed to the population aged 10 and above. The short set of disability questions is directed to all individuals aged 2 and above, and the food insecurity experience scale is directed to the head of household.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The 2018 Tonga Labour Force Survey aimed to estimate all the main ILO indicators at the island group level (the geographical strata). The sampling strategy is based on a two-stage stratified random sample.

    1. Computation of the survey parameters: Total sample size per strata, number of households to interview in each Primary Sampling Unit (PSU = census block) and number of PSUs to select The stratification of the survey is the geographical breakdown by island group (6 stratas Tongatapu urban, Tongatapu rural, Vava'u, Ha'apai, 'Eua, Niuas)
    2. The selection strategy is a 2 stages random survey where: Random selection of census blocks within each
    3. Census blocks are randomly selected in first place, using probability proportional to size
    4. 15 households per block are randomly selected using uniform probability

    5. The sampling frame used to select PSUs (census blocks) and household is the 2016 Tonga population census.
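
    The two selection stages described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not say which PPS algorithm was used, so systematic PPS is swapped in here, and the frame (`frame`), the function names and the block sizes are all hypothetical.

```python
import random


def pps_systematic(sizes, n_select, rng):
    """Systematic PPS: pick n_select unit indices with probability
    proportional to size. A unit larger than the sampling step could be
    hit more than once; the toy frame below avoids that case."""
    total = sum(sizes)
    step = total / n_select
    start = rng.random() * step
    targets = [start + k * step for k in range(n_select)]
    chosen, cumulative, t = [], 0.0, 0
    for idx, size in enumerate(sizes):
        cumulative += size
        while t < n_select and targets[t] < cumulative:
            chosen.append(idx)
            t += 1
    return chosen


def two_stage_sample(blocks, n_blocks, hh_per_block=15, seed=0):
    """blocks: dict mapping a census block id to its list of household ids
    (a hypothetical frame). Stage 1 selects blocks by PPS on household
    counts; stage 2 draws households with equal probability in each block."""
    rng = random.Random(seed)
    ids = list(blocks)
    sizes = [len(blocks[b]) for b in ids]
    sample = {}
    for idx in pps_systematic(sizes, n_blocks, rng):
        households = blocks[ids[idx]]
        k = min(hh_per_block, len(households))
        sample[ids[idx]] = rng.sample(households, k)
    return sample


# Toy frame for one stratum: 40 blocks holding 20 to 59 households each.
frame = {f"block_{i}": [f"hh_{i}_{j}" for j in range(20 + i)] for i in range(40)}
sample = two_stage_sample(frame, n_blocks=5)
```

    In a stratified design this would be run once per stratum, each with its own number of blocks to select.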

    The computation of the sample size required the use of:

    - the Tonga 2015 HIES dataset (labour force section)
    - the Tonga 2016 population census (distribution of households across the strata)

    The variable used to compute the sample size is the labour force participation rate from the 2015 HIES. The 2015 labour force section of the Tonga HIES allows the computation of the design effect of the labour force participation rate within each stratum. The design effect and sampling errors of the labour force participation rate estimated from the 2015 HIES, in combination with the 2016 household population distribution, make it possible to predict the minimum sample size required (per stratum) to get a robust estimate from the 2018 LFS.

    • Total sample size: 2,685 households
    • Geographical stratification: 6 island groups
    • Selection process: two-stage random survey in which census blocks (Primary Sampling Units) are first selected using Probability Proportional to Size, and households are then randomly selected within each selected block (15 households per block)
    • Non-response: the sample was increased by 10% in all strata to account for non-response
    • Sampling frame: the household listing from the 2016 population census was used as the sampling frame, and the 2015 labour force section of the HIES was used to compute the sample size (using the labour force participation rate)
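
    The source does not give the sample size formula, so the following is a sketch of one common approach, not the survey's own computation: a simple-random-sample size for a proportion, inflated by the design effect and by the 10% non-response allowance. The function names and the input values (`p`, `deff`, `margin`) are hypothetical; only the 15 households per block and the total of 2,685 households come from the text.

```python
import math


def required_households(p, deff, margin, z=1.96, nonresponse=0.10):
    """Households needed to estimate a proportion p within +/- margin.

    n_srs is the simple-random-sample size; it is multiplied by the design
    effect (deff) and then increased for expected non-response."""
    n_srs = z ** 2 * p * (1 - p) / margin ** 2
    return math.ceil(n_srs * deff * (1 + nonresponse))


def blocks_needed(n_households, per_block=15):
    """Number of census blocks when each block contributes per_block households."""
    return math.ceil(n_households / per_block)


# Hypothetical inputs for one stratum: p = 0.5, deff = 2.0, margin = 0.05.
n_stratum = required_households(p=0.5, deff=2.0, margin=0.05)

# From the reported total: 2,685 households at 15 per block.
n_blocks_total = blocks_needed(2685)
```

    If every selected block contributed exactly 15 households, the reported total of 2,685 households would correspond to 179 census blocks.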

    Sampling deviation

    No major deviation from the original sample has taken place.

    Mode of data collection

    Computer Assisted Personal Interview [capi]

    Research instrument

    The 2018 Tonga Labour Force Survey questionnaire included 15 sections:

    • IDENTIFICATION
    • SECTION B: INDIVIDUAL CHARACTERISTICS
    • SECTION C: EDUCATION (AGE 3+)
    • SECTIONS B & C: EMPLOYMENT IDENTIFICATION AND TEMPORARY ABSENCE (AGE 10+)
    • SECTION D: AGRICULTURE WORK AND MARKET DESTINATION
    • SECTION E1: MAIN EMPLOYMENT CHARACTERISTICS
    • SECTION E2: SECOND PAID JOB/ BUSINESS ACTIVITY CHARACTERISTICS
    • SECTION F: INCOME FROM EMPLOYMENT
    • SECTION G: WORKING TIME
    • SECTION H: JOB SEARCH
    • SECTION I: PREVIOUS WORK EXPERIENCE
    • SECTION J: MAIN ACTIVITY
    • SECTION K: OWN USE PRODUCTION WORK
    • FOOD INSECURITY EXPERIENCES
    • GPS + PHOTO

    The questionnaires were developed and administered in English and were translated into Tongan. The questionnaire is provided as an external resource.

    The draft questionnaire was pre-tested during the supervisors' training and the enumerators' training, and was finally tested during the pilot test. The pilot test was undertaken from 27 May to 1 June 2018 in the Tongatapu Urban and Rural areas. The questionnaire was revised rigorously in accordance with the feedback received from each test. At the same time, a field operations manual for supervisors and enumerators was prepared and modified accordingly for field operators to use as a reference during the field work.

    Cleaning operations

    The World Bank Survey Solutions software was used for data processing; Stata was used for data cleaning, tabulation and analysis.

    Editing and tabulation of the data will be undertaken in February/March 2019 in collaboration with SPC and ILO.

    Response rate

    A total of 2,685 households were selected for the sample. Of these, 2,584 were successfully interviewed, giving a household response rate of 96.2%.

    Response rates were higher in urban areas than in the rural area of Tongatapu.

    -1 Tongatapu urban: 97.30%
    -2 Tongatapu rural: 93.00%
    -3 Vava'u: 100.00%
    -4 Ha'apai: 100.00%
    -5 'Eua: 95.20%
    -6 Niuas: 80.00%
    -Total: 96.20%
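
    As a quick arithmetic check, the overall household response rate can be recomputed from the two counts reported above (2,584 interviewed out of 2,685 selected). The function name is illustrative only; per-stratum counts are not given in the source, so only the total is checked.

```python
def response_rate(interviewed, selected):
    """Household response rate as a percentage of selected households."""
    return 100.0 * interviewed / selected


overall = response_rate(2584, 2685)
print(f"{overall:.1f}%")  # prints 96.2%
```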

    Sampling error estimates

    Sampling errors were computed and are presented in the final report.

    The sampling errors were computed using the survey settings (svyset) in Stata. The Finite Population Correction was included in the sample design (optional in the svyset Stata command) as follows:

    - Fpc 1: total number of census blocks within the stratum (variable toteas)
    - Fpc 2:

    Some LF indicators are listed here with their sampling errors.

    RSE (relative standard error):

    • Labour force population: 2.2%
    • Employment - population in employment: 2.2%
    • Labour force participation rate (%): 1.7%
    • Unemployment rate (%): 13.5%
    • Composite rate of labour underutilization (%): 7.3%
    • Youth unemployment rate (%): 18.2%
    • Informal employment rate (%): 2.7%
    • Average monthly wages - employees (TOP): 12%

    95% interval:

    • Labour force population: 28,203 to 30,804
    • Employment - population in employment: 27,341 to 29,855
    • Labour force participation rate (%): 45.2% to 48.2%
    • Unemployment rate (%): 2.2% to 3.9%
    • Composite rate of labour underutilization (%): 16% to 21.4%
    • Youth unemployment rate (%): 5.7% to 12.1%
    • Informal employment rate (%): 44.3% to 49.4%
    • Average monthly wages - employees (TOP): 1,174 to 1,904
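
    The relative standard errors and the 95% intervals listed above are linked through the standard error. The sketch below assumes symmetric normal-approximation intervals (estimate ± 1.96 × SE); the source does not state how the intervals were actually built, and the function names are illustrative.

```python
def ci_from_rse(estimate, rse_percent, z=1.96):
    """95% interval implied by a point estimate and its RSE (in percent)."""
    se = estimate * rse_percent / 100.0
    return estimate - z * se, estimate + z * se


def rse_from_ci(lower, upper, z=1.96):
    """Back out the point estimate and RSE (in percent) from a symmetric interval."""
    estimate = (lower + upper) / 2.0
    se = (upper - lower) / (2.0 * z)
    return estimate, 100.0 * se / estimate


# Labour force participation rate: reported interval 45.2% to 48.2%.
lfpr_estimate, lfpr_rse = rse_from_ci(45.2, 48.2)
```

    Under this assumption the participation-rate interval implies a midpoint of about 46.7% and an RSE of about 1.6%, close to the reported 1.7%. The small gap is what rounding or a different interval method would be expected to produce.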

  18. Data from: Genomic insights on conservation priorities for North Sea houting...

    • data.niaid.nih.gov
    • search.dataone.org
    • +1 more
    zip
    Updated Apr 8, 2024
    Cite
    Aja Noersgaard Buur Tengstedt; Shenglin Liu; Magnus W. Jacobsen; Camilla Gundlund; Peter Rask Møller; Søren Berg; Dorte Bekkevold; Michael M. Hansen (2024). Genomic insights on conservation priorities for North Sea houting and European lake whitefish (Coregonus spp.) [Dataset]. http://doi.org/10.5061/dryad.qfttdz0r0
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 8, 2024
    Dataset provided by
    Technical University of Denmark
    Aarhus University
    University of Copenhagen
    Authors
    Aja Noersgaard Buur Tengstedt; Shenglin Liu; Magnus W. Jacobsen; Camilla Gundlund; Peter Rask Møller; Søren Berg; Dorte Bekkevold; Michael M. Hansen
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    North Sea
    Description

    Population genomics analysis holds great potential for informing conservation of endangered populations. We focused on a controversial case of European whitefish (Coregonus spp.) populations. The endangered North Sea houting is the only coregonid fish that tolerates oceanic salinities and was previously considered a species (C. oxyrhinchus) distinct from European lake whitefish (C. lavaretus). However, no firm evidence for genetic-based salinity adaptation has been available. Also, studies based on microsatellite and mitogenome data suggested surprisingly recent divergence (ca. 2,500 years bp) between houting and lake whitefish. These data types furthermore have provided no evidence for possible inbreeding. Finally, a controversial taxonomic revision recently classified all whitefish in the region as C. maraena, calling conservation priorities of houting into question. We used whole genome and ddRAD sequencing to analyze six lake whitefish populations and the only extant indigenous houting population. Demographic inference indicated postglacial expansion and divergence between lake whitefish and houting occurring not long after the Last Glaciation, implying deeper population histories than previous analyses. Runs of Homozygosity analysis suggested high inbreeding (FROH up to 30.6%) in some freshwater populations, but also FROH up to 10.6% in the houting, prompting conservation concerns. Finally, outlier scans provided evidence for adaptation to high salinities in the houting. Applying a framework for defining conservation units based on current and historical reproductive isolation and adaptive divergence led us to recommend that the houting be treated as a separate conservation unit regardless of species status. In total, the results underscore the potential of genomics to inform conservation practices, in this case clarifying conservation units and highlighting populations of concern.

    Methods

    A. Sampling and DNA extraction

    Samples of lake whitefish were collected in 1995-2012 from five locations in Denmark; brackish populations in the Ringkoebing fjord (RIN) and Nissum fjord (NIS), two lagoons connected with the North Sea, and freshwater populations from Lake Flynder (FLYN), Lake Glenstrup (GLEN) and Lake Nors (NORS); and a brackish population from one German location, Achterwasser (ACHT), a lagoon flowing into the Baltic Sea. Houting were collected from the single extant population in Vidaa (VID), a river with outlet into the Wadden Sea (Fig. 1A). Sampling was conducted by electrofishing (VID) and net fishing (remaining populations). Tissue samples consisted of adipose fin clips stored in ethanol at -20°C. DNA was extracted using either a phenol-chloroform method (Taggart et al., 1992) (ACHT, FLYN) or the E.Z.N.A.® Tissue DNA Kit (OMEGA, Bio-tek, CA, USA) following the manufacturer's recommendations (the remaining samples). In total, 35 individuals were whole-genome sequenced and 95 were ddRAD-sequenced (Table 1). A group of 23 individuals occur in both data sets and were consequently both ddRAD and whole-genome sequenced.

    B. Whole-genome sequencing, mapping, and variant calling

    Library construction (using insert size ~300 bp) and whole-genome sequencing was outsourced to BGI (Beijing Genomics Institute, Hong Kong, China). Paired-end Illumina sequencing was conducted using the Illumina HiSeq 2500 platform with a read length of 150 bp. The sequence reads were mapped to the Coregonus sp. “Balchen” Alpine whitefish reference genome (De-Kayne et al. (2020); GenBank accession: GCA_902810595.1) using BWA-MEM v.0.7.17 (Li, 2013; Li & Durbin, 2009a) with default parameters. SAM format files were sorted, indexed and converted into BAM files using SAMtools v.1.9 (Danecek et al., 2021). Variants were called using BCFtools v.1.2 (Danecek et al., 2021) function mpileup and call with a minimum mapping quality requirement of 20.
    We used the ‘--multiallelic-caller’ for calling combined with ‘--variants-only’ to output only variant sites. To produce an ‘all sites’ data set containing both monomorphic and polymorphic sites, we repeated the SNP calling process without the ‘--variants-only’ parameter in BCFtools call.

    C. WGS data set generation

    We filtered the resulting VCF file containing variant sites with VCFutils.pl (Li et al., 2009b) and VCFtools v.0.1.16 (Danecek et al., 2011) to remove indels, monomorphic sites, multi-allelic SNPs and SNPs with a variant quality <20 or extreme depth of coverage (lower than 400 or higher than 1000 across all individuals) determined from the coverage distribution of SNPs (Fig. S1). The bimodal coverage distribution with two distinct peaks suggested the presence of paralogous loci, a well-known issue in salmonid fishes due to their tetraploid origin. In addition to excluding the variants in the higher coverage peak, which was centered at approximately twice the depth of the lower peak and thus likely represented duplicated regions, we also used VCFtools to discard SNPs located within putative duplicated genomic regions identified by De-Kayne et al. (2020). Furthermore, as loci with an excess of heterozygotes can also represent duplicated genomic regions, we removed SNPs out of Hardy-Weinberg equilibrium (HWE) in one or more populations using a custom R script (https://github.com/shenglin-liu/VCF_HWF). Tests for HWE were conducted using the statistic F√n (Brown, 1970), where F is Wright’s fixation index within populations and n is the sample size. The statistic follows a standard normal distribution with a mean of 0 and a standard deviation of 1. Negative values denote heterozygote excess and positive values heterozygote deficit, and absolute values > 1.96 are significant at the 5 % level. The effects of the individual filtering steps are detailed in Supplementary Table S1.
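
    The HWE test described above can be illustrated with a short sketch. It is not the authors' R script: it assumes the test statistic has the form F·√n, with F computed as one minus the ratio of observed to expected heterozygosity for a biallelic SNP, and the function name and genotype counts are hypothetical.

```python
import math


def hwe_z(n_hom_ref, n_het, n_hom_alt):
    """Return F * sqrt(n) for one biallelic SNP in one population.

    F = 1 - H_obs / H_exp (fixation index within the population) and n is
    the number of genotyped individuals. Returns None when the site is
    monomorphic or has no genotypes, since F is then undefined."""
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        return None
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference allele frequency
    h_exp = 2 * p * (1 - p)                # expected heterozygosity under HWE
    if h_exp == 0:
        return None
    h_obs = n_het / n                      # observed heterozygosity
    f = 1 - h_obs / h_exp
    return f * math.sqrt(n)


def out_of_hwe(z, threshold=1.96):
    """Two-sided test at the 5% level under a standard normal approximation."""
    return z is not None and abs(z) > threshold


z_excess = hwe_z(0, 20, 0)  # all heterozygotes: strongly negative value
```

    A site where every individual is heterozygous gives a strongly negative value, the pattern expected when reads from duplicated regions are collapsed onto a single locus.
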
    The resulting data set, hereafter referred to as the ‘HW-filtered WGS data set’, contained 16,898,181 SNPs. Additionally, we produced a ‘LD-pruned WGS data set’ with the addition of 5 individuals of the alpine whitefish species C. arenicolus (AREN) as an outgroup (Extended methods S1) by pruning SNPs on the basis of linkage disequilibrium (LD) in the HW-filtered WGS data set. Pruning was performed with the indep-pairwise function in PLINK v.1.9 (Purcell et al., 2007), where SNPs with r² > 0.1 were removed from sliding windows of 50 SNPs with 10 SNPs of overlap. A total of 596,078 SNPs remained after pruning. The ‘all sites’ data set was filtered to remove indels and sites with extreme depth of coverage or located in putative duplicated regions and SNPs not in HWE, as detailed for the ‘variant sites’ data set above. No filtering for minor allele frequency or missing data was performed. After filtering, the VCF contained 1,181,919,736 sites with individuals exhibiting between x and y % missing genotypes.

    D. Filtering for ROH analyses

    We opted to further filter the HW-filtered WGS data set to ensure only the most reliable genotype calls were retained. Following the protocol implemented in Balboa et al. (2024), we estimated mappability of the genome assembly with GENMAP v.1.3.0 (Pockrandt et al., 2020) using 100 bp k-mers and allowing for up to two mismatches, and we identified repetitive elements in the assembly with RepeatMasker v.4.1.2 (Smit et al., 2013) using ‘rmblast’ as the search engine and ‘Actinopterygii’ (ray-finned fishes) as the query species. Repeat regions and sites with a mappability score <1 were excluded from the analyses. In addition to the extreme depth filters applied as previously described, we furthermore used VCFtools to change individual genotypes with very low (DP<10) or very high read depth (DP>40) and genotypes with low quality (GQ<30) to missing (./.).
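
    The genotype-level masking described above, together with the site-level rule stated in the next sentence (QUAL >30 and no missing data), can be sketched in a few lines. The authors used VCFtools for this step; the Python functions below are hypothetical stand-ins that only illustrate the thresholds, and the example values are invented.

```python
MISSING = "./."


def mask_genotype(gt, dp, gq, min_dp=10, max_dp=40, min_gq=30):
    """Set a genotype to missing when read depth is below min_dp or above
    max_dp, or when genotype quality is below min_gq."""
    if dp < min_dp or dp > max_dp or gq < min_gq:
        return MISSING
    return gt


def keep_site(qual, genotypes, min_qual=30):
    """Keep a SNP only if QUAL exceeds min_qual and no genotype is missing."""
    return qual > min_qual and all(g != MISSING for g in genotypes)


# One site, three individuals given as (GT, DP, GQ); the values are invented.
calls = [("0/1", 25, 60), ("0/0", 8, 45), ("1/1", 30, 99)]
masked = [mask_genotype(gt, dp, gq) for gt, dp, gq in calls]
site_kept = keep_site(qual=55.0, genotypes=masked)
```

    In this example the second individual has a depth of 8, so its genotype is masked and the whole site is dropped by the no-missing-data rule.
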
    Finally, only SNPs with variant quality (QUAL) >30 and no missing data were kept, resulting in a data set containing 2,646,198 SNPs.

    E. ddRAD sequencing, mapping, and loci assembly

    Samples were prepared using ddRADseq (Peterson et al., 2012). The ddRADseq libraries used PstI (6-base) and MspI (4-base) restriction enzymes. Two libraries of equal size were constructed (using insert size of 200-500 bp) and sequenced on an Illumina HiSeq2000 platform with 100 bp paired-end reads at BGI (Hong Kong, China). Raw reads were cleaned and demultiplexed with process_radtags in Stacks v.2.55 (Catchen et al., 2011; Catchen et al., 2013) in addition to being truncated to 90 bp (-t 90). Low-quality reads (phred score < 10 over a sliding window of 15% of the read length) were discarded. Mapping of reads to the Alpine whitefish reference genome (De-Kayne et al., 2020) progressed as described for the whole-genome sequencing data. Loci were assembled from the aligned and sorted reads using gstacks v.2.55 with default parameters.

    F. ddRADseq data set generation

    The populations program in Stacks (Catchen et al., 2011; Catchen et al., 2013) was used to generate a preliminary VCF file including only loci present in all six populations (-p 6; GLEN was not analyzed by ddRAD sequencing) and at least 70 percent of individuals within each population (-r 0.7). Exports were ordered (--ordered-export) to ensure that only a single representative of each overlapping site was included. Loci out of HWE in one or more populations were filtered out using a custom R script, as previously described. Based on this data set, five individuals (two from ACHT, one from each of the populations NIS, NORS, and RIN) with more than 10 % missing data were identified.
We then generated a new VCF file excluding these five individuals using populations with parameters as previously stated, yielding a total of 347,397 SNPs, and a second VCF file with data analysis restricted to one random SNP per locus, yielding 141,157 SNPs. Both files were filtered to remove SNPs located within potentially duplicated regions of the genome (De-Kayne et al., 2020) and SNPs out of HWE in one or more populations as described for WGS data. A total of 254,693 SNPs and 105,452 SNPs,

  19. S1 Data -

    • plos.figshare.com
    xlsx
    Updated Dec 7, 2023
    + more versions
    Cite
    Natalia Popierz-Rydlewska; Sylwia Merkiel-Pawłowska; Anna Łojko-Dankowska; Mieczysław Komarnicki; Wojciech Chalcarz (2023). S1 Data - [Dataset]. http://doi.org/10.1371/journal.pone.0295308.s001
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Dec 7, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Natalia Popierz-Rydlewska; Sylwia Merkiel-Pawłowska; Anna Łojko-Dankowska; Mieczysław Komarnicki; Wojciech Chalcarz
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: In the literature there is a lack of information on the influence of gender and time since autologous hematopoietic stem cell transplantation (HSCT) on the immune reconstitution in multiple myeloma (MM) patients.

    Objective: The aim of this study was to assess the diversity of the immune reconstitution according to gender in MM patients after autologous HSCT on the day of the clinic discharge and on the 29th day after discharge, as well as to investigate the changes in the immune system in females and males after staying at home for 28 days.

    Method: The studied population comprised 13 females and 13 males after autologous HSCT. On the day of the clinic discharge and on the 29th day after discharge blood samples were taken to analyse 22 immunological parameters. Statistical analysis was performed using STATISTICA 10 StatSoft Poland. For multiple comparisons, the Bonferroni correction was used.

    Results: No statistically significant differences were observed in the analysed immunological parameters between the studied females and males with MM on the day of the clinic discharge and on the 29th day after discharge. However, on the 29th day after the clinic discharge compared to the day of the clinic discharge, statistically significant differences were found in 8 immunological parameters among females and 6 immunological parameters among males.

    Conclusion and recommendation: Our results indicate that the immune reconstitution is similar but not the same in patients of both genders. Statistically significant differences in the immune response in the studied females and males imply that gender may play a role in the immune reconstitution and that the results obtained in MM patients should be analysed separately in females and males. In order to explain the observed changes in the immune system according to gender, further research should be carried out on a larger population. This would most probably make it possible to find their clinical application.

  20. Index of notation for parameter values, with default values, and variables...

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 17, 2023
    Cite
    D. B. Bonnéry; L. -S. Pretorius; A. E. C. Jooste; A. D. W. Geering; C. A. Gilligan (2023). Index of notation for parameter values, with default values, and variables used in computing an optimal sampling design for disease-free status. [Dataset]. http://doi.org/10.1371/journal.pone.0277725.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 17, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    D. B. Bonnéry; L. -S. Pretorius; A. E. C. Jooste; A. D. W. Geering; C. A. Gilligan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Index of notation for parameter values, with default values, and variables used in computing an optimal sampling design for disease-free status.
