Panel data possess several advantages over conventional cross-sectional and time-series data, including their power to isolate the effects of specific actions, treatments, and general policies often at the core of large-scale econometric development studies. While the concept of panel data alone provides the capacity for modeling the complexities of human behavior, the notion of universal panel data – in which time- and situation-driven variances leading to variations in tools, and thus results, are mitigated – can further enhance exploitation of the richness of panel information.
This Basic Information Document (BID) provides a brief overview of the Tanzania National Panel Survey (NPS), but focuses primarily on the theoretical development and application of panel data, as well as key elements of the universal panel survey instrument and datasets generated by the four rounds of the NPS. As this Basic Information Document (BID) for the UPD does not describe in detail the background, development, or use of the NPS itself, the round-specific NPS BIDs should supplement the information provided here.
The NPS Uniform Panel Dataset (UPD) consists of both survey instruments and datasets, meticulously aligned and engineered to facilitate the use of, and improve access to, the wealth of panel data offered by the NPS. The NPS-UPD provides a consistent and straightforward means not only of conducting user-driven analyses with convenient, standardized tools, but also of monitoring MKUKUTA, FYDP II, and other national-level development indicators reported by the NPS.
The design of the NPS-UPD combines the four completed rounds of the NPS – NPS 2008/09 (R1), NPS 2010/11 (R2), NPS 2012/13 (R3), and NPS 2014/15 (R4) – into pooled, module-specific survey instruments and datasets. The panel survey instruments offer the ease of comparability over time, with modifications and variances easily identifiable as well as those aspects of the questionnaire which have remained identical and offer consistent information. By providing all module-specific data over time within compact, pooled datasets, panel datasets eliminate the need for user-generated merges between rounds and present data in a clear, logical format, increasing both the usability and comprehension of complex data.
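As a rough illustration of why a pooled layout removes the need for merges between rounds, the sketch below stacks round-specific tables into one long table and then picks out the households seen in every round. The column names (`hh_id`, `hh_size`, `nps_round`) and the values are hypothetical and are not taken from the actual NPS-UPD files.

```python
import pandas as pd

# Hypothetical round-specific extracts; hh_id and hh_size are illustrative
# names and values, not actual NPS-UPD variables or data.
rounds = {
    "R1": pd.DataFrame({"hh_id": [1, 2, 3], "hh_size": [4, 6, 5]}),
    "R2": pd.DataFrame({"hh_id": [1, 2, 3], "hh_size": [5, 6, 5]}),
    "R3": pd.DataFrame({"hh_id": [1, 3], "hh_size": [5, 4]}),
}

# Stack the rounds into one long (pooled) table keyed by household and round.
pooled = pd.concat(
    [df.assign(nps_round=name) for name, df in rounds.items()],
    ignore_index=True,
)

# Households observed in every round form the balanced panel.
rounds_seen = pooled.groupby("hh_id")["nps_round"].nunique()
balanced_ids = sorted(rounds_seen[rounds_seen == len(rounds)].index)
```

With the data already in this long form, a change over time for one household is a filter on `hh_id` rather than a join across files.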
Designed for analysis of key indicators at four primary domains of inference, namely: Dar es Salaam, other urban, rural, Zanzibar.
The universe includes all households and individuals in Tanzania with the exception of those residing in military barracks or other institutions.
Sample survey data [ssd]
While the same sample of respondents was maintained over the first three rounds of the NPS, longitudinal surveys tend to suffer from bias introduced by households leaving the survey over time, i.e., attrition. Although the NPS maintains a highly successful recapture rate (roughly 96% retention at the household level), which limits the growth of this selection bias, the longitudinal cohorts were refreshed for the NPS 2014/15 to ensure that estimates remain representative while retaining a primary sample large enough to preserve cohesion in panel analysis. The Population and Housing Census (PHC) completed in 2012, which provided updated population figures along with changes in administrative boundaries, offered an opportunity to realign the NPS sample and reduce the cumulative bias potentially introduced through attrition.
To maintain the panel concept of the NPS, the sample design for NPS 2014/2015 consisted of a combination of the original NPS sample and a new NPS sample. A nationally representative sub-sample was selected to continue as part of the “Extended Panel” while an entirely new sample, “Refresh Panel”, was selected to represent national and sub-national domains. Similar to the sample in NPS 2008/2009, the sample design for the “Refresh Panel” allows analysis at four primary domains of inference, namely: Dar es Salaam, other urban areas on mainland Tanzania, rural mainland Tanzania, and Zanzibar. This new cohort in NPS 2014/2015 will be maintained and tracked in all future rounds between national censuses.
Face-to-face [f2f]
The format of the NPS-UPD survey instrument is similar to previously disseminated NPS survey instruments. Each module has a questionnaire and clearly identifies whether the module collects information at the individual or household level. Within each module-specific questionnaire of the NPS-UPD survey instrument, there are five distinct sections, arranged vertically: (1) the UPD - “U” on the survey instrument, (2) R4, (3) R3, (4) R2, and (5) R1 – the latter four sections presenting each questionnaire in its original form at the time of its respective dissemination.
The uppermost section of each module’s questionnaire (“U”) represents the model universal panel questionnaire, with questions generated from the comprehensive listing of questions across all four rounds of the NPS and codes generated from the comprehensive collection of codes. The following sections are arranged vertically by round, considering R4 as most recent. While not all rounds will have data reported for each question in the UPD and not each question will have reports for each of the UPD codes listed, the NPS-UPD survey instrument represents the visual, all-inclusive set of information collected by the NPS over time.
The four round-specific sections (R4, R3, R2, R1) are aligned with their UPD-equivalent question, visually presenting their contribution to compatibility with the UPD. Each round-specific section includes the original round-specific variable names, response codes and skip patterns (corresponding to their respective round-specific NPS data sets, and despite their variance from other rounds or from the comprehensive UPD code listing).
Panel data possess several advantages over conventional cross-sectional and time-series data, including their power to isolate the effects of specific actions, treatments, and general policies often at the core of large-scale econometric development studies. While the concept of panel data alone provides the capacity for modeling the complexities of human behavior, the notion of universal panel data – in which time- and situation-driven variances leading to variations in tools, and thus results, are mitigated – can further enhance exploitation of the richness of panel information.
The Basic Information Document (BID) provides a brief overview of the Nigerian General Household Survey (GHS) but focuses primarily on the theoretical development and application of panel data, as well as key elements of the universal panel survey instrument and datasets generated by the four rounds of the GHS. As the BID does not describe in detail the background, development, or use of the GHS itself, the wave-specific GHS BIDs should supplement the information provided here.
The Nigeria Universal Panel Data (NUPD) consists of both survey instruments and datasets from the two survey visits of the GHS - Post-Planting (PP) and Post-Harvest (PH) - meticulously aligned and engineered with the aim of facilitating the use of and improving access to the wealth of panel data offered by the GHS. The NUPD provides a consistent and straightforward means of conducting user-driven analyses using convenient, standardized tools.
The design of the NUPD combines the four completed Waves of the GHS Household Post-Planting and Post-Harvest Surveys – Wave 1 (2010/11), Wave 2 (2012/13), Wave 3 (2015/16), and Wave 4 (2018/19) – into pooled, module-specific survey instruments and datasets. The panel survey instruments offer the ease of comparability over time, with modifications and variances easily identifiable as well as those aspects of the questionnaire which have remained identical and offer consistent information. By providing all module-specific data over time within compact, pooled datasets, panel datasets eliminate the need for user-generated merges between rounds and present data in a clear, logical format, increasing both the usability and comprehension of complex data.
National
The survey covered all de jure households excluding prisons, hospitals, military barracks, and school dormitories.
Sample survey data [ssd]
Please see the GHS BIDs for each round for detailed descriptions of the sample design used in each round and their respective implementation efforts as this is a compilation of datasets from all previous waves.
Face-to-face [f2f]
The larger GHS-Panel project consists of three questionnaires (Household Questionnaire, Agriculture Questionnaire, Community Questionnaire) for each of the two visits (Post-Planting and Post-Harvest). The GHS-NUPD only consists of the Household Questionnaire.
GHS-Panel Household Questionnaire: The Household Questionnaire provides information on demographics; education; health (including anthropometric measurement for children); labor; food and non-food expenditure; household nonfarm income-generating activities; food security and shocks; safety nets; housing conditions; assets; information and communication technology; and other sources of household income.
The Household Questionnaire is slightly different for the two visits. Some information was collected only in the post-planting visit, some only in the post-harvest visit, and some in both visits.
Please see the GHS BIDs for each round for detailed descriptions of data editing and additional data processing efforts as this is a compilation of datasets from all previous waves.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 1. Datasets.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This paper extends the Common Correlated Effects Pooled (CCEP) estimator to homogeneous dynamic panels. In this setting CCEP suffers from a large bias when the time span (T) of the dataset is fixed. We develop a bias-corrected CCEP estimator that is consistent as the number of cross-sectional units (N) tends to infinity, for T fixed or growing large, provided that the specification is augmented with a sufficient number of cross-sectional averages, and lags thereof. Monte Carlo experiments show that the correction offers strong improvements in terms of bias and variance. We apply our approach to estimate the dynamic impact of temperature shocks on aggregate output growth.
This paper studies the empirical relevance of precautionary and other motives for household portfolio behaviour using recent panel data from the Netherlands. Dutch households' portfolios exhibit low degrees of risk taking and diversification. It is possible that this is the outcome of a rational, precautionary response to unavoidable exposure to background risk (stemming from the labour market, health conditions, etc.). We consider liquidity needs and habits as alternative explanations. The endogenous variable is the fraction of clearly safe assets in total financial assets at the household level. Parametric and semi-parametric censored regression models for pooled cross-sections, and random and fixed effects models for panel data, show that both heteroscedasticity and unobserved heterogeneity are of major importance in the data. With subjective indicators of income uncertainty we find a limited role for precautionary motives.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The COVID-19 outbreak caused a massive setback to the stability of the financial system because several other risks emerged alongside COVID, significantly affecting the continuity of profitable banking operations. This study therefore examines how differently liquidity risk and credit risk influenced banking profitability during COVID-19 (Q1 2020 to Q4 2021) than before it (Q1 2018 to Q4 2019). The study employs pooled OLS and fixed and random effects models to analyze panel data on a sample of 37 banks currently operating in Pakistan. The results show that liquidity risk has a positive and significant relationship with return on assets and return on equity, but an insignificant relationship with net interest margin. Credit risk has a negative and significant relationship with return on assets, return on equity, and net interest margin. The study also applies quantile regression to address non-normality in the data. The quantile regression results are consistent with the pooled OLS and fixed and random effects results. The study makes valuable suggestions for regulators, policymakers, and other users of financial institution data, and should help in setting policies for efficient management of liquidity risk (LR) and credit risk (CR).
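To make the difference between the pooled OLS and fixed effects estimators concrete, here is a minimal one-regressor sketch on an invented toy panel. The numbers are made up and are not the study's data; the random effects estimator is not shown.

```python
from collections import defaultdict
from statistics import mean

def ols_slope(x, y):
    """Slope of a one-regressor OLS fit with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def within_slope(unit, x, y):
    """Fixed effects (within) slope: demean x and y by unit, then run OLS."""
    xs, ys = defaultdict(list), defaultdict(list)
    for u, a, b in zip(unit, x, y):
        xs[u].append(a)
        ys[u].append(b)
    xd = [a - mean(xs[u]) for u, a in zip(unit, x)]
    yd = [b - mean(ys[u]) for u, b in zip(unit, y)]
    return ols_slope(xd, yd)

# Toy panel: two banks, three quarters each (invented numbers).
# Within each bank the outcome falls by 1 per unit of risk,
# but bank B sits higher on both variables.
bank = ["A", "A", "A", "B", "B", "B"]
risk = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
roa = [5.0, 4.0, 3.0, 9.0, 8.0, 7.0]

pooled_slope = ols_slope(risk, roa)        # ignores which bank a row belongs to
fe_slope = within_slope(bank, risk, roa)   # uses only within-bank variation
```

In this toy case the pooled slope is positive while the within slope is negative, which is the kind of sign reversal that unobserved bank-level differences can produce and that a fixed effects specification is meant to guard against.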
https://www.gesis.org/en/institute/data-usage-terms
English:
The HaSpaD project harmonizes and pools longitudinal data for the analysis of partnership biographies from nine German survey programs. These are in detail:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract The objective of this study was to analyze the determining factors that explain the capital structure decisions of small and medium-sized enterprises (SMEs) in the province of Cabinda, Angola. Debt maturity was also analyzed, so total indebtedness was broken down into short-, medium-, and long-term debt ratios. The study is motivated by the small number of studies on the determinants of the capital structure of SMEs in developing countries, and more specifically in Cabinda, Angola. The research is relevant for Corporate Finance, particularly regarding the capital structure of SMEs located in a developing country like Angola. It also corroborates previous studies on the applicability of the principles of the pecking-order theory to SMEs in developed countries. The research contributes to Corporate Finance by identifying the determinants of the capital structure of SMEs in a developing country, taking debt maturity into account through the analysis of total, short-, medium-, and long-term debt ratios. Based on a sample of 73 SMEs for the period between 2011 and 2016, we used panel data models (pooled OLS, fixed and random effects). The results show that tangibility, age, liquidity, and non-debt tax shield are determining factors in the capital structure decisions of SMEs in the province of Cabinda, Angola. Furthermore, they suggest that these firms follow the principles of pecking-order theory in capital structure decisions. The research adds to the body of studies in Corporate Finance, particularly concerning the determinants of the capital structure of SMEs located in a developing country.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Subtitle: 3-Year Weekly Multi-Channel FMCG Marketing Mix Panel for India
Grain: Week-ending Saturday × Geography × Brand × SKU
Span: 156 weeks (2 Jul 2022 – 27 Jun 2025)
Scope: 8 Indian geographies • 3 brands × 3 SKUs each (9 SKUs) • Full marketing, trade, price, distribution & macro controls • AI creative quality scores for digital banners.
This dataset is synthetic but behaviorally realistic, generated to help analysts experiment with Marketing Mix Modeling (MMM), media effectiveness, price/promo analytics, distribution effects, and hierarchical causal inference without using proprietary commercial data.
Real MMM training data is rarely public due to confidentiality. This synthetic panel:
| File | Description |
|---|---|
| synthetic_mmm_weekly_india_SAT.csv | Main dataset. 11,232 rows × 28 columns. Weekly (week-ending Saturday). |
(If you also upload the Monday version, note it clearly and point users to which to use.)
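
    import pandas as pd

    # Load the weekly panel; "Week" holds week-ending Saturday dates.
    df = pd.read_csv(
        "/kaggle/input/synthetic-india-fmcg-mmm/synthetic_mmm_weekly_india_SAT.csv",
        parse_dates=["Week"],
    )
    df.info()
    df.head()

    # Aggregate SKU rows to one row per week, geography, and brand.
    # Note: sum() also adds up per-unit columns such as prices across SKUs.
    geo_brand = (
        df.groupby(["Week", "Geo", "Brand"], as_index=False)
          .sum(numeric_only=True)
    )
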
Example: log-transform sales value, normalize media, build price index.

    import numpy as np

    m = geo_brand.copy()
    # log1p keeps zero-sales weeks finite.
    m["log_sales_val"] = np.log1p(m["Sales_Value"])
    # Price relative to each geo-brand's own average price.
    m["price_index"] = m["Net_Price"] / m.groupby(["Geo", "Brand"])["Net_Price"].transform("mean")

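The example above mentions normalizing media, but the snippet stops at the price index. One possible way to do it, sketched on a toy frame, is to scale each media series by its own geo-brand mean, mirroring the price index. `Media_Spend` is a hypothetical column name used only for illustration; substitute the media columns actually present in the file.

```python
import pandas as pd

# Toy geo-brand panel. `Media_Spend` is a hypothetical column name used only
# for illustration; check the dataset's column dictionary for the real ones.
m = pd.DataFrame({
    "Geo": ["NORTH", "NORTH", "SOUTH", "SOUTH"],
    "Brand": ["BrandA"] * 4,
    "Media_Spend": [100.0, 300.0, 10.0, 30.0],
})

# Scale each series by its own geo-brand mean so that 1.0 means "average week".
group_mean = m.groupby(["Geo", "Brand"])["Media_Spend"].transform("mean")
m["media_index"] = m["Media_Spend"] / group_mean
```

Mean-scaling is only one choice; min-max or z-score scaling would fit the same pattern, and which one suits a given model is a judgment call.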
Dates are week-ending Saturday (pandas frequency `W-SAT`). To derive a week-start (Sunday) date:

    df["Week_Start"] = df["Week"] - pd.Timedelta(days=6)

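A quick way to sanity-check the week convention is sketched below on three generated dates rather than the real file: confirm the week-ending dates fall on a Saturday, then check that subtracting six days lands on a Sunday.

```python
import pandas as pd

# Three illustrative week-ending dates generated with the W-SAT frequency.
weeks = pd.DataFrame(
    {"Week": pd.date_range("2022-07-02", periods=3, freq="W-SAT")}
)

# pandas numbers weekdays Monday=0 ... Saturday=5, Sunday=6.
all_saturdays = bool((weeks["Week"].dt.dayofweek == 5).all())

weeks["Week_Start"] = weeks["Week"] - pd.Timedelta(days=6)
all_sundays = bool((weeks["Week_Start"].dt.dayofweek == 6).all())
```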
| Column | Type | Description |
|---|---|---|
| Week | date | Week-ending Saturday timestamp. |
| Geo | categorical | 8 rollups: NORTH, SOUTH, EAST, WEST, CENTRAL, NORTHEAST, METRO_DELHI, METRO_MUMBAI. |
| Brand | categorical | BrandA / BrandB / BrandC. |
| SKU | categorical | Brand-level SKU IDs (3 per brand). |
| Column | Type | Notes |
|---|---|---|
| Sales_Units | float | Modeled weekly unit sales after macro, distribution, price, promo & media effects. Lognormal noise added. |
| Sales_Value | float | Sales_Units × Net_Price. Use for revenue MMM or ROI analyses. |
| Column | Type | Notes |
|---|---|---|
| MRP | float | Baseline list price (per-unit). Drifts with CPI & brand positioning. |
| Net_Price | float | Effective real... |
The LFS is a twice-yearly rotating panel household survey, specifically designed to measure the dynamics of employment and unemployment in South Africa. It measures a variety of issues related to the labour market, including unemployment rates (official and expanded), according to standard definitions of the International Labour Organisation (ILO).
All editions of the LFS have been updated (some more than once) since their release. These version changes are detailed in a document available from DataFirst (in the "external documents" section titled "LFS 2000-2008 Collated Version Notes on the South African LFS").
National coverage
Individuals
The LFS sample covers the non-institutional population except for workers' hostels. However, persons living in private dwelling units within institutions are also enumerated. For example, within a school compound, one would enumerate the schoolmaster's house and teachers' accommodation because these are private dwellings. Students living in a dormitory on the school compound would, however, be excluded.
Sample survey data [ssd]
Statistics South Africa uses a rotating panel methodology for the labour force survey. The rotating panel methodology involves visiting the same dwelling units on a number of occasions (in this instance, five at most). After the panel is established, a proportion of the dwelling units is replaced each round (in this instance, 20%). New dwelling units are added to the sample to replace those that are taken out.
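The rotation described above can be sketched as a small simulation. The rule for choosing which dwelling units leave is an assumption here (the longest-serving units rotate out first, which keeps every unit at five visits or fewer); the source does not spell out the selection rule, and the unit IDs are invented.

```python
def rotate_panel(current, replacements, share=0.2):
    """Drop the longest-serving `share` of dwelling units and add new ones.

    `current` is a list of (unit_id, visits_completed) tuples.
    Assumption: units with the most completed visits rotate out first.
    """
    n_out = round(len(current) * share)
    ordered = sorted(current, key=lambda unit: unit[1], reverse=True)
    # Units that stay are visited once more in the new round.
    staying = [(uid, visits + 1) for uid, visits in ordered[n_out:]]
    # Replacement units receive their first visit.
    incoming = [(uid, 1) for uid in replacements[:n_out]]
    return staying + incoming

# An established panel of ten units with 5, 5, 4, 4, ..., 1, 1 visits so far.
panel = [(f"DU{i:02d}", 5 - i // 2) for i in range(10)]

next_id = 10
for _ in range(3):  # simulate three further survey rounds
    new_ids = [f"DU{next_id + k:02d}" for k in range(2)]
    next_id += 2
    panel = rotate_panel(panel, new_ids)

max_visits = max(visits for _, visits in panel)
```

After any number of rounds the panel size stays constant and no unit exceeds five visits, matching the "five at most" and 20% figures in the description.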
Enumeration Areas (EAs) that had a household count of fewer than twenty-five were omitted from the census 2001 frame that was used to draw the sample of Primary Sampling Units (PSUs) for the new Master Sample. Other omissions from the Master Sample frame included all institution EAs except workers' hostels, convents and monasteries. EAs from census 2001 were pooled in two stages, before and after sampling. Before sampling, the criterion used to pool EAs was that they should contain a minimum of one hundred households. However, during listing it was discovered that there were discrepancies between the information on the database and what was on the ground.
Therefore, in the second stage of pooling, EAs that were found to have less than sixty dwelling units during listing were pooled. The Master Sample is a multi-stage stratified sample. The overall sample size of PSUs was 3000. The explicit strata were the 53 district councils/metros (DCs). The 3000 PSUs were allocated to these DCs using the power allocation method. The PSUs were then sampled using probability proportional to size principles. The measure of size used was the number of households in a PSU as calculated in the census. The sampled PSUs were listed with the dwelling unit as the listing unit. From these listings systematic samples of dwelling units per PSU were drawn. These samples of dwelling units form clusters. The size of the clusters differs depending on the specific survey requirements. The LFS uses one of the clusters that contain ten dwelling units.
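The two allocation steps named above, power allocation across strata and probability-proportional-to-size (PPS) selection within a stratum, can be sketched as follows. The exponent of 0.5, the stratum names, and all household counts are assumptions for illustration; the source does not state the power used, and systematic PPS is one common way to implement "PPS principles", not necessarily the one Statistics South Africa applied.

```python
import random
from itertools import accumulate

def power_allocation(stratum_sizes, total_n, power=0.5):
    """Allocate total_n PSUs across strata in proportion to size ** power.

    Rounding means the allocations may not sum exactly to total_n.
    """
    weights = {s: size ** power for s, size in stratum_sizes.items()}
    scale = total_n / sum(weights.values())
    return {s: round(w * scale) for s, w in weights.items()}

def pps_systematic(sizes, n, rng):
    """Systematic PPS selection; returns the indices of the n sampled units."""
    step = sum(sizes) / n
    start = rng.random() * step          # random start in [0, step)
    bounds = list(accumulate(sizes))     # cumulative measure of size
    picks, i = [], 0
    for k in range(n):
        point = start + k * step
        while i < len(bounds) - 1 and bounds[i] <= point:
            i += 1
        picks.append(i)
    return picks

# Hypothetical strata with household counts as the measure of size.
allocation = power_allocation({"DC1": 10_000, "DC2": 40_000, "DC3": 90_000}, 60)

# Hypothetical PSUs in one stratum, sized by number of households.
psu_households = [50, 10, 40, 100, 200]
sampled = pps_systematic(psu_households, 2, random.Random(0))
```

With a power of 0.5 the largest stratum above is nine times the smallest but receives only three times the PSUs, which is the point of power allocation: small strata keep a workable sample size.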
Face-to-face [f2f]
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract This paper measures the tax effort of a group of fifty-nine developed and developing countries over the period 1996-2015 by comparing a country’s actual tax/GDP ratio with the ratio predicted by an international tax function, which relates tax revenue to various measures of a country’s taxable capacity, such as the level of per capita income, the share of trade in GDP, the productive structure, and the level of financial deepening. The tax function is estimated using cross-section data, pooled time-series/cross-section data, and panel data with a fixed effects estimator. The results are compared and show a range of tax effort, with South Africa showing the highest effort and Switzerland the lowest. Implications for policy are drawn. The paper is critical of studies that include institutional variables (and other variables not related to the tax base of countries) to measure tax effort when they are really explanations of why the tax ratio differs between countries, not of tax effort itself.
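The comparison of actual and predicted ratios can be illustrated with a minimal sketch. It assumes the effort index is the actual tax/GDP ratio divided by the predicted ratio, a common convention that the abstract does not confirm, and it uses a single invented capacity variable with made-up figures in place of the paper's full tax function.

```python
from statistics import mean

def fit_line(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    mx, my = mean(x), mean(y)
    slope = (
        sum((a - mx) * (b - my) for a, b in zip(x, y))
        / sum((a - mx) ** 2 for a in x)
    )
    return my - slope * mx, slope

# Invented figures for four hypothetical countries.
capacity = [1.0, 2.0, 3.0, 4.0]        # stand-in for a taxable-capacity measure
tax_ratio = [12.0, 18.0, 20.0, 30.0]   # actual tax/GDP ratio, in percent

intercept, slope = fit_line(capacity, tax_ratio)
predicted = [intercept + slope * c for c in capacity]

# Assumed index: above 1 means more tax is collected than capacity predicts.
effort = [actual / pred for actual, pred in zip(tax_ratio, predicted)]
```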
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
a. Reference SNP panel identified with a MAF of 5 percent in data pooled from equal numbers of reads of all eight varieties.
b. 32 samples for ApeKI libraries, and 29 for PstI libraries (three samples were removed due to very low sequencing coverage).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract Purpose: This study aims to analyze the impact on human development of rates of innovative entrepreneurship and necessity entrepreneurship. Design/methodology/approach: Our empirical study is based on samples from countries with information about rates of entrepreneurship, human development, and social progress. The data are analyzed by means of pooled least squares and panel data techniques. Findings: Innovative entrepreneurship improves the quality of life in the dimensions measured by the Social Progress Index and Modified Human Development Index. Necessity entrepreneurship does not favor an increase of human development, at least in the dimensions measured by the two indexes, since this is a subsistence entrepreneurship type. Originality/value: This study presents new evidence that contributes to the knowledge on how entrepreneurship improves quality of life.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Estimated coefficient of the two-step system GMM, pooled OLS, medium quantile regression, and IV2SLS models for robustness check.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Panel long-run and dynamic short-run coefficients using PMG/ARDL model.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Pooled OLS, fixed effects, and random effects models.