42 datasets found
  1. National Panel Survey 2008-2015, Uniform Panel Dataset - Tanzania

    • microdata.worldbank.org
    • datacatalog.ihsn.org
    • +1more
    Updated Mar 17, 2021
    Cite
    National Bureau of Statistics (2021). National Panel Survey 2008-2015, Uniform Panel Dataset - Tanzania [Dataset]. https://microdata.worldbank.org/index.php/catalog/3814
    Explore at:
    Dataset updated
    Mar 17, 2021
    Dataset authored and provided by
    National Bureau of Statistics
    Time period covered
    2008 - 2015
    Area covered
    Tanzania
    Description

    Abstract

    Panel data possess several advantages over conventional cross-sectional and time-series data, including their power to isolate the effects of specific actions, treatments, and general policies often at the core of large-scale econometric development studies. While the concept of panel data alone provides the capacity for modeling the complexities of human behavior, the notion of universal panel data – in which time- and situation-driven variances leading to variations in tools, and thus results, are mitigated – can further enhance exploitation of the richness of panel information.

    This Basic Information Document (BID) provides a brief overview of the Tanzania National Panel Survey (NPS), but focuses primarily on the theoretical development and application of panel data, as well as key elements of the universal panel survey instrument and datasets generated by the four rounds of the NPS. Because this BID for the UPD does not describe in detail the background, development, or use of the NPS itself, the round-specific NPS BIDs should supplement the information provided here.

    The NPS Uniform Panel Dataset (UPD) consists of both survey instruments and datasets, meticulously aligned and engineered with the aim of facilitating the use of and improving access to the wealth of panel data offered by the NPS. The NPS-UPD provides a consistent and straightforward means not only of conducting user-driven analyses with convenient, standardized tools, but also of monitoring MKUKUTA, FYDP II, and other national-level development indicators reported by the NPS.

    The design of the NPS-UPD combines the four completed rounds of the NPS – NPS 2008/09 (R1), NPS 2010/11 (R2), NPS 2012/13 (R3), and NPS 2014/15 (R4) – into pooled, module-specific survey instruments and datasets. The panel survey instruments offer the ease of comparability over time, with modifications and variances easily identifiable as well as those aspects of the questionnaire which have remained identical and offer consistent information. By providing all module-specific data over time within compact, pooled datasets, panel datasets eliminate the need for user-generated merges between rounds and present data in a clear, logical format, increasing both the usability and comprehension of complex data.
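
    The pooled layout described above can be illustrated with a minimal pandas sketch. The column names (`hh_id`, `round`, `consumption`) and the tiny in-memory frames are hypothetical and are not NPS-UPD variable names; the point is only that stacking rounds into one long file replaces round-by-round merges.

```python
import pandas as pd

# Hypothetical round-specific extracts; real NPS files use different names.
r1 = pd.DataFrame({"hh_id": [1, 2, 3], "consumption": [100.0, 80.0, 120.0]})
r2 = pd.DataFrame({"hh_id": [1, 2, 4], "consumption": [110.0, 85.0, 95.0]})

# Stack the rounds into one long dataset, tagging each row with its round.
pooled = pd.concat(
    [r1.assign(round=1), r2.assign(round=2)],
    ignore_index=True,
)

# Households observed in every round form the balanced panel.
rounds_seen = pooled.groupby("hh_id")["round"].nunique()
balanced_ids = rounds_seen[rounds_seen == pooled["round"].nunique()].index
balanced = pooled[pooled["hh_id"].isin(balanced_ids)]
```

    With the data in long form, round-to-round comparisons reduce to a `groupby` on the household identifier.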

    Geographic coverage

    Designed for analysis of key indicators at four primary domains of inference, namely: Dar es Salaam, other urban, rural, Zanzibar.

    Analysis unit

    • Households
    • Individuals

    Universe

    The universe includes all households and individuals in Tanzania with the exception of those residing in military barracks or other institutions.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    While the same sample of respondents was maintained over the first three rounds of the NPS, longitudinal surveys tend to suffer from bias introduced by households leaving the survey over time, i.e. attrition. Although the NPS maintains a high recapture rate (roughly 96% retention at the household level), which limits the growth of this selection bias, the longitudinal cohorts were refreshed for the NPS 2014/15 to keep estimates representative while retaining a primary sample large enough to preserve cohesion in panel analysis. The Population and Housing Census (PHC) completed in 2012, which provided updated population figures along with changes in administrative boundaries, offered an opportunity to realign the NPS sample and reduce the cumulative bias potentially introduced through attrition.
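
    As a rough illustration of the retention figure quoted above, household-level retention can be computed as the share of baseline households found again at follow-up. This is a sketch with made-up identifiers, not the NPS tracking procedure.

```python
def retention_rate(baseline_ids, followup_ids):
    """Share of baseline households that were re-interviewed at follow-up."""
    baseline = set(baseline_ids)
    if not baseline:
        raise ValueError("baseline sample is empty")
    return len(baseline & set(followup_ids)) / len(baseline)

# Hypothetical IDs: 25 baseline households, one of which is lost.
baseline_ids = range(1, 26)
followup_ids = [i for i in range(1, 26) if i != 7]
rate = retention_rate(baseline_ids, followup_ids)
```

    Attrition is the complement, `1 - rate`.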

    To maintain the panel concept of the NPS, the sample design for NPS 2014/2015 consisted of a combination of the original NPS sample and a new NPS sample. A nationally representative sub-sample was selected to continue as part of the “Extended Panel” while an entirely new sample, “Refresh Panel”, was selected to represent national and sub-national domains. Similar to the sample in NPS 2008/2009, the sample design for the “Refresh Panel” allows analysis at four primary domains of inference, namely: Dar es Salaam, other urban areas on mainland Tanzania, rural mainland Tanzania, and Zanzibar. This new cohort in NPS 2014/2015 will be maintained and tracked in all future rounds between national censuses.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The format of the NPS-UPD survey instrument is similar to previously disseminated NPS survey instruments. Each module has a questionnaire and clearly identifies whether the module collects information at the individual or household level. Within each module-specific questionnaire of the NPS-UPD survey instrument, there are five distinct sections, arranged vertically: (1) the UPD (“U” on the survey instrument), (2) R4, (3) R3, (4) R2, and (5) R1. The latter four sections present each questionnaire in its original form at the time of its dissemination.

    The uppermost section of each module’s questionnaire (“U”) represents the model universal panel questionnaire, with questions generated from the comprehensive listing of questions across all four rounds of the NPS and codes generated from the comprehensive collection of codes. The following sections are arranged vertically by round, considering R4 as most recent. While not all rounds will have data reported for each question in the UPD and not each question will have reports for each of the UPD codes listed, the NPS-UPD survey instrument represents the visual, all-inclusive set of information collected by the NPS over time.

    The four round-specific sections (R4, R3, R2, R1) are aligned with their UPD-equivalent question, visually presenting their contribution to compatibility with the UPD. Each round-specific section includes the original round-specific variable names, response codes and skip patterns (corresponding to their respective round-specific NPS data sets, and despite their variance from other rounds or from the comprehensive UPD code listing).

  2. General Household Survey - Panel 2010-2019, Uniform Panel Data - Nigeria

    • microdata.worldbank.org
    • catalog.ihsn.org
    • +1more
    Updated Nov 3, 2025
    Cite
    National Bureau of Statistics (NBS) (2025). General Household Survey - Panel 2010-2019, Uniform Panel Data - Nigeria [Dataset]. https://microdata.worldbank.org/index.php/catalog/5835
    Explore at:
    Dataset updated
    Nov 3, 2025
    Dataset authored and provided by
    National Bureau of Statistics (NBS)
    Time period covered
    2010 - 2019
    Area covered
    Nigeria
    Description

    Abstract

    Panel data possess several advantages over conventional cross-sectional and time-series data, including their power to isolate the effects of specific actions, treatments, and general policies often at the core of large-scale econometric development studies. While the concept of panel data alone provides the capacity for modeling the complexities of human behavior, the notion of universal panel data – in which time- and situation-driven variances leading to variations in tools, and thus results, are mitigated – can further enhance exploitation of the richness of panel information.

    The Basic Information Document (BID) provides a brief overview of the Nigerian General Household Survey (GHS) but focuses primarily on the theoretical development and application of panel data, as well as key elements of the universal panel survey instrument and datasets generated by the four rounds of the GHS. As the BID does not describe in detail the background, development, or use of the GHS itself, the wave-specific GHS BIDs should supplement the information provided here.

    The Nigeria Universal Panel Data (NUPD) consists of both survey instruments and datasets from the two survey visits of the GHS - Post-Planting (PP) and Post-Harvest (PH) - meticulously aligned and engineered with the aim of facilitating the use of and improving access to the wealth of panel data offered by the GHS. The NUPD provides a consistent and straightforward means of conducting user-driven analyses using convenient, standardized tools.

    The design of the NUPD combines the four completed Waves of the GHS Household Post-Planting and Post-Harvest Surveys – Wave 1 (2010/11), Wave 2 (2012/13), Wave 3 (2015/16), and Wave 4 (2018/19) – into pooled, module-specific survey instruments and datasets. The panel survey instruments offer the ease of comparability over time, with modifications and variances easily identifiable as well as those aspects of the questionnaire which have remained identical and offer consistent information. By providing all module-specific data over time within compact, pooled datasets, panel datasets eliminate the need for user-generated merges between rounds and present data in a clear, logical format, increasing both the usability and comprehension of complex data.

    Geographic coverage

    National

    Analysis unit

    • Households
    • Individuals

    Universe

    The survey covered all de jure households excluding prisons, hospitals, military barracks, and school dormitories.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Please see the GHS BIDs for each round for detailed descriptions of the sample design used in each round and their respective implementation efforts as this is a compilation of datasets from all previous waves.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The larger GHS-Panel project consists of three questionnaires (Household Questionnaire, Agriculture Questionnaire, Community Questionnaire) for each of the two visits (Post-Planting and Post-Harvest). The GHS-NUPD only consists of the Household Questionnaire.

    GHS-Panel Household Questionnaire: The Household Questionnaire provides information on demographics; education; health (including anthropometric measurement for children); labor; food and non-food expenditure; household nonfarm income-generating activities; food security and shocks; safety nets; housing conditions; assets; information and communication technology; and other sources of household income.

    The Household Questionnaire is slightly different for the two visits. Some information was collected only in the post-planting visit, some only in the post-harvest visit, and some in both visits.

    Cleaning operations

    Please see the GHS BIDs for each round for detailed descriptions of data editing and additional data processing efforts as this is a compilation of datasets from all previous waves.

  3. MOESM1 of Does globalization accelerate economic growth? South Asian...

    • springernature.figshare.com
    • figshare.com
    xlsx
    Updated Jul 27, 2019
    Cite
    Md Hasan (2019). MOESM1 of Does globalization accelerate economic growth? South Asian experience using panel data [Dataset]. http://doi.org/10.6084/m9.figshare.9119216.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jul 27, 2019
    Dataset provided by
    figshare
    Authors
    Md Hasan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 1. Datasets.

  4. Data from: Bias-corrected Common Correlated Effects Pooled estimation in...

    • figshare.com
    pdf
    Updated Jun 2, 2023
    Cite
    Ignace De Vos; Gerdie Everaert (2023). Bias-corrected Common Correlated Effects Pooled estimation in dynamic panels [Dataset]. http://doi.org/10.6084/m9.figshare.9594299.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Ignace De Vos; Gerdie Everaert
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This paper extends the Common Correlated Effects Pooled (CCEP) estimator to homogeneous dynamic panels. In this setting CCEP suffers from a large bias when the time span (T) of the dataset is fixed. We develop a bias-corrected CCEP estimator that is consistent as the number of cross-sectional units (N) tends to infinity, for T fixed or growing large, provided that the specification is augmented with a sufficient number of cross-sectional averages, and lags thereof. Monte Carlo experiments show that the correction offers strong improvements in terms of bias and variance. We apply our approach to estimate the dynamic impact of temperature shocks on aggregate output growth.
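
    To make the role of cross-sectional averages concrete, here is a sketch of the plain static CCEP idea on simulated data. It deliberately omits the lagged dependent variable, the lagged averages, and the bias correction that the paper develops, so it is not the authors' estimator; the data-generating process and all names are assumptions for illustration.

```python
import numpy as np

def ccep(y, x):
    """Static CCEP slope for y, x of shape (N, T) with a single regressor."""
    n_periods = y.shape[1]
    # Cross-sectional averages proxy the unobserved common factor.
    h = np.column_stack([np.ones(n_periods), y.mean(axis=0), x.mean(axis=0)])
    # Residual-maker matrix that projects out the averages for every unit.
    m = np.eye(n_periods) - h @ np.linalg.pinv(h)
    y_t = y @ m  # m is symmetric, so this defactors each unit's series
    x_t = x @ m
    return float((x_t * y_t).sum() / (x_t * x_t).sum())

rng = np.random.default_rng(0)
n_units, n_periods, beta = 200, 30, 1.0
f = rng.normal(size=n_periods)                    # common factor
gamma = rng.normal(1.0, 0.5, size=(n_units, 1))   # factor loadings in y
lam = rng.normal(1.0, 0.5, size=(n_units, 1))     # factor loadings in x
x = lam * f + rng.normal(size=(n_units, n_periods))
y = beta * x + gamma * f + rng.normal(size=(n_units, n_periods))

beta_ccep = ccep(y, x)
beta_pooled = float((x * y).sum() / (x * x).sum())  # ignores the factor
```

    Because the factor drives both `x` and `y`, the naive pooled slope is pulled away from `beta`, while projecting out the averages brings the estimate back toward it in this static setting.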

  5. Precautionary motives and portfolio decisions (replication data)

    • resodate.org
    Updated Oct 2, 2025
    Cite
    Stefan Hochguertel (2025). Precautionary motives and portfolio decisions (replication data) [Dataset]. https://resodate.org/resources/aHR0cHM6Ly9qb3VybmFsZGF0YS56YncuZXUvZGF0YXNldC9wcmVjYXV0aW9uYXJ5LW1vdGl2ZXMtYW5kLXBvcnRmb2xpby1kZWNpc2lvbnM=
    Explore at:
    Dataset updated
    Oct 2, 2025
    Dataset provided by
    ZBW
    ZBW Journal Data Archive
    Journal of Applied Econometrics
    Authors
    Stefan Hochguertel
    Description

    This paper studies the empirical relevance of precautionary and other motives for household portfolio behaviour using recent panel data from the Netherlands. Dutch households' portfolios exhibit low degrees of risk taking and diversification. It is possible that this is the outcome of a rational, precautionary response to unavoidable exposure to background risk (stemming from the labour market or health conditions, etc.). We consider as alternative explanations liquidity needs and habits. The endogenous variable is the fraction of clearly safe in total financial assets at the household level. Parametric and semi-parametric censored regression models for pooled cross-sections and random and fixed effects models for panel data show that both heteroscedasticity and unobserved heterogeneity are of major importance in the data. With subjective indicators of income uncertainty we find a limited role for precautionary motives.

  6. Results of pooled OLS.

    • plos.figshare.com
    xls
    Updated Sep 9, 2024
    Cite
    Muhammad Haris; HongXing Yao; Hijab Fatima (2024). Results of pooled OLS. [Dataset]. http://doi.org/10.1371/journal.pone.0308356.t007
    Explore at:
    Available download formats: xls
    Dataset updated
    Sep 9, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Muhammad Haris; HongXing Yao; Hijab Fatima
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The COVID-19 outbreak caused a massive setback to the stability of the financial system because several other risks emerged alongside COVID, which significantly influenced the continuity of profitable banking operations. Therefore, this study examines how differently liquidity risk and credit risk influenced banking profitability during COVID-19 (Q1 2020 to Q4 2021) than before COVID (Q1 2018 to Q4 2019). The study employs pooled OLS and fixed- and random-effects models to analyze panel data on a sample of 37 banks currently operating in Pakistan. The results indicate that liquidity risk has a positive and significant relationship with return on assets and return on equity, but an insignificant relationship with net interest margin. Credit risk has a negative and significant relationship with return on assets, return on equity, and net interest margin. The study also applies quantile regression to address the normality issue in the data. The quantile regression results are consistent with the pooled OLS and fixed- and random-effects results. The study makes suggestions for regulators, policymakers, and other users of financial institution data, and should help in setting policies for efficient management of liquidity risk (LR) and credit risk (CR).
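
    The difference between pooled OLS and a fixed-effects (within) estimator can be sketched on simulated data. The numbers below are invented and only borrow the panel shape from the abstract (37 banks, 8 quarters); they are not the study's data or results.

```python
import numpy as np

rng = np.random.default_rng(1)
n_banks, n_quarters, beta = 37, 8, 0.5

# Hypothetical data: a bank effect that is correlated with the regressor.
alpha = rng.normal(size=(n_banks, 1))
x = alpha + rng.normal(size=(n_banks, n_quarters))
y = alpha + beta * x + 0.1 * rng.normal(size=(n_banks, n_quarters))

def slope(x, y):
    """OLS slope of y on x after removing the overall means."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / (xc * xc).sum())

# Pooled OLS treats all bank-quarters as one sample.
b_pooled = slope(x, y)

# Fixed effects: demean within each bank first, which removes alpha.
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
b_fe = slope(x_w, y_w)
```

    When the unit effect is correlated with the regressor, as assumed here, the pooled slope absorbs part of that effect, which is why the within estimate lands closer to `beta`.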

  7. S1 Data -

    • plos.figshare.com
    xlsx
    Updated Sep 9, 2024
    Cite
    Muhammad Haris; HongXing Yao; Hijab Fatima (2024). S1 Data - [Dataset]. http://doi.org/10.1371/journal.pone.0308356.s001
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Sep 9, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Muhammad Haris; HongXing Yao; Hijab Fatima
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The COVID-19 outbreak caused a massive setback to the stability of the financial system because several other risks emerged alongside COVID, which significantly influenced the continuity of profitable banking operations. Therefore, this study examines how differently liquidity risk and credit risk influenced banking profitability during COVID-19 (Q1 2020 to Q4 2021) than before COVID (Q1 2018 to Q4 2019). The study employs pooled OLS and fixed- and random-effects models to analyze panel data on a sample of 37 banks currently operating in Pakistan. The results indicate that liquidity risk has a positive and significant relationship with return on assets and return on equity, but an insignificant relationship with net interest margin. Credit risk has a negative and significant relationship with return on assets, return on equity, and net interest margin. The study also applies quantile regression to address the normality issue in the data. The quantile regression results are consistent with the pooled OLS and fixed- and random-effects results. The study makes suggestions for regulators, policymakers, and other users of financial institution data, and should help in setting policies for efficient management of liquidity risk (LR) and credit risk (CR).

  8. Harmonizing and synthesizing partnership histories from different research...

    • search.gesis.org
    Updated Jun 21, 2022
    Cite
    Schulz, Sonja; Weiß, Bernd; Sterl, Sebastian; Haensch, Anna-Carolina; Schmid, Lisa; May, Antonia (2022). Harmonizing and synthesizing partnership histories from different research data infrastructures: A model project for linking research data from various infrastructure (HaSpaD). [Dataset]. http://doi.org/10.7802/2429
    Explore at:
    Dataset updated
    Jun 21, 2022
    Dataset provided by
    GESIS, Köln
    GESIS search
    Authors
    Schulz, Sonja; Weiß, Bernd; Sterl, Sebastian; Haensch, Anna-Carolina; Schmid, Lisa; May, Antonia
    License

    https://www.gesis.org/en/institute/data-usage-terms

    Description

    English:
    The HaSpaD project harmonizes and pools longitudinal data for the analysis of partnership biographies from nine German survey programs. These are in detail:

    • The German Family Panel (pairfam), Data file Version 12.0.0
    • ALLBUS/GGSS 1980-2016 (Kumulierte Allgemeine Bevölkerungsumfrage der Sozialwissenschaften / Cumulated German General Social Survey 1980-2016)
    • Family Surveys 1988-2000 (Change and Development of Forms of Family Life in West Germany (Survey of Families), Family and Partner Relations in Eastern Germany (Survey of Families), Change and Development of Ways of Family Life - 2nd Wave (Survey of Families), Change and Development of Families` Way of Life - 3rd Wave (Family Survey))
    • Mannheim Divorce Study 1996
    • German Fertility and Family Survey (FFS) 1992
    • German Life History Studies (Courses of Life and Historical Change in East Germany (Life History Study LV DDR), Courses of Life and Social Change: Courses of Life and Welfare Development (Life History Study LV-West I), Courses of Life and Social Change: The Between-the-War Cohort in Transition to Retirement (Life History Study LV-West II A - Personal Interview), Courses of Life and Social Change: The Between-the-War Cohort in Transition to Retirement (Life History Study LV-West II T - Telephone Interview), Courses of Life and Social Change: Access to Occupation in Employment Crisis (Life History Study LV-West III), East German Life Courses After Unification (Life History Study LV-Ost Panel), East German Life Courses After Unification (Life History Study LV Ost 71), Education, Training, and Occupation: Life Courses of the 1964 and 1971 Birth Cohorts in West Germany (Life History Study LV-West 64/71), Early Careers and Starting a Family: Life Courses of the 1971 Birth Cohorts in East and West Germany (Life History Study LV-Panel 71))
    • Generations & Gender Survey (German Subsample) GGS Waves 1 and 2
    • The Survey of Health, Ageing and Retirement in Europe (SHARE), German Sample (Share Waves 1, 2, and 3) and
    • Socio-Economic Panel (SOEP), data for the years 1984-2018.

    The HaSpaD project does not distribute its own datasets. Instead, the HaSpaD syntax package allows users to harmonize and pool all German surveys with partnership biographical data that are available for secondary use via a research data repository. Users of the HaSpaD syntax must arrange access to these source data themselves. The scripts harmonize and pool the partnership biographical data, as well as additional variables on respondents and their partnerships. These include, for example, gender, religious affiliation, and nationality of the respondents. The pooled data set provides the opportunity to analyse previously unanswered questions on marriage and partnership stability from a historical and life course theoretical perspective, in particular on the long-term increase in divorce rates and on social changes in risk factors for separation. In addition, methodological developments of research syntheses will be facilitated.
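
    The harmonize-then-pool step described above might look roughly like this in pandas. The two source extracts, their variable names, and their codings are invented for illustration; the actual HaSpaD syntax package and its variable scheme are not reproduced here.

```python
import pandas as pd

# Hypothetical source extracts with different names and codings.
survey_a = pd.DataFrame({"pid": [1, 2], "sex": ["m", "f"], "marr_year": [1990, 1985]})
survey_b = pd.DataFrame({"id": [7, 8], "gender": [2, 1], "year_married": [2001, 1978]})

# Rename each source's columns to shared names, recode values to shared
# labels, and keep a source marker before stacking.
harmonized = pd.concat(
    [
        pd.DataFrame({
            "source": "A",
            "person_id": survey_a["pid"],
            "gender": survey_a["sex"].map({"m": "male", "f": "female"}),
            "marriage_year": survey_a["marr_year"],
        }),
        pd.DataFrame({
            "source": "B",
            "person_id": survey_b["id"],
            "gender": survey_b["gender"].map({1: "male", 2: "female"}),
            "marriage_year": survey_b["year_married"],
        }),
    ],
    ignore_index=True,
)
```

    Keeping the `source` column lets later analyses control for survey program or check harmonization decisions per source.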


    German version, translated:
    The HaSpaD project harmonizes and pools longitudinal data for the analysis of partnership biographies from nine German survey programs. These are in detail:
    • The German Family Panel (pairfam), Release 12.0
    • Cumulated German General Social Survey (ALLBUS / GGSS) 1980-2016
    • Family Surveys 1988-2000 (Change and Development of Forms of Family Life in West Germany (Survey of Families), Family and Partner Relations in Eastern Germany (Survey of Families), Change and Development of Ways of Family Life - 2nd Wave (Survey of Families), Change and Development of Ways of Family Life - 3rd Wave (Survey of Families))
    • Mannheim Divorce Study 1996
    • German Fertility and Family Survey 1992
    • Life History Studies (Courses of Life and Historical Change in East Germany (Life History Study LV-DDR), Courses of Life and Social Change: Courses of Life and Welfare Development (Life History Study LV-West I), Courses of Life and Social Change: The Between-the-War Cohort in Transition to Retirement (Life History Study LV-West II A - Personal Interview), Courses of Life and Social Change: The Between-the-War Cohort in Transition to Retirement (Life History Study LV-West II T - Telephone Interview), Lebensverläufe und gesellschaftlicher Wandel: Berufszugang in der Beschäftigungskr...

  9. Data from: What are the determining factors in the capital structure...

    • scielo.figshare.com
    xls
    Updated Jun 9, 2023
    Cite
    João Lussuamo; Zélia Serrasqueiro (2023). What are the determining factors in the capital structure decisions of small and medium-sized firms in Cabinda, Angola?, [Dataset]. http://doi.org/10.6084/m9.figshare.19905299.v1
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    SciELO (http://www.scielo.org/)
    Authors
    João Lussuamo; Zélia Serrasqueiro
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Cabinda, Angola
    Description

    Abstract: The objective of this study was to analyze the determining factors that explain the capital structure decisions of small and medium-sized enterprises (SMEs) in the province of Cabinda, Angola. Debt maturity was also analyzed, so total indebtedness was broken down into short-, medium-, and long-term debt ratios. The study is motivated by the small number of studies on the determinants of the capital structure of SMEs in developing countries, and more specifically in Cabinda, Angola. This research is relevant for Corporate Finance, particularly regarding the capital structure of SMEs located in a developing country like Angola. It also corroborates previous studies on the applicability of the principles of the pecking-order theory to SMEs in developed countries. The research contributes to Corporate Finance by identifying the determinants of the capital structure of SMEs in a developing country, taking debt maturity into account through the analysis of total, short-, medium-, and long-term debt ratios. Based on a sample of 73 SMEs for the period between 2011 and 2016, we used panel data models (pooled OLS, fixed and random effects). The results show that tangibility, age, liquidity, and non-debt tax shield are determining factors in the capital structure decisions of SMEs in the province of Cabinda, Angola. Furthermore, they suggest that these firms follow the principles of pecking-order theory in capital structure decisions. The research adds to the body of studies in Corporate Finance, particularly concerning the determinants of the capital structure of SMEs located in a developing country.

  10. MMM Weekly Data - Geo:India

    • kaggle.com
    zip
    Updated Jul 18, 2025
    Cite
    SubhagatoAdak (2025). MMM Weekly Data - Geo:India [Dataset]. https://www.kaggle.com/datasets/subhagatoadak/mmm-weekly-data-geoindia
    Explore at:
    Available download formats: zip (2463044 bytes)
    Dataset updated
    Jul 18, 2025
    Authors
    SubhagatoAdak
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    India
    Description

    Synthetic India FMCG MMM Dataset (Weekly, 3 Years, Multi-Geo / Multi-Channel)

    Subtitle: 3-Year Weekly Multi-Channel FMCG Marketing Mix Panel for India
    Grain: Week-ending Saturday × Geography × Brand × SKU
    Span: 156 weeks (2 Jul 2022 – 27 Jun 2025)
    Scope: 8 Indian geographies • 3 brands × 3 SKUs each (9 SKUs) • Full marketing, trade, price, distribution & macro controls • AI creative quality scores for digital banners.

    This dataset is synthetic but behaviorally realistic, generated to help analysts experiment with Marketing Mix Modeling (MMM), media effectiveness, price/promo analytics, distribution effects, and hierarchical causal inference without using proprietary commercial data.

    Why This Dataset?

    Real MMM training data is rarely public due to confidentiality. This synthetic panel:

    • Mirrors common FMCG (CPG) category dynamics in India (festive spikes, monsoon effects, geo scale differences).
    • Includes paid media channels (TV, YouTube, Facebook, Instagram, Print, Radio).
    • Captures promotions & trade levers (feature, display, temporary price reduction, trade spend).
    • Provides distribution & availability metrics (Weighted Distribution, Numeric Distribution, TDP, NOS).
    • Includes pricing (MRP, Net Price under TPR).
    • Adds macro signals (CPI, GDP, Festival Index, Rainfall Index) aligned to India’s seasonality.
    • Introduces AI Content Scores (Facebook & Instagram banner creative quality) — letting you explore creative × media interaction models.
    • Delivered at a granular panel (Geo × Brand × SKU) suitable for pooled, hierarchical, or Bayesian MMM workflows.

    Files

    File | Description
    synthetic_mmm_weekly_india_SAT.csv | Main dataset. 11,232 rows × 28 columns. Weekly (week-ending Saturday).

    (If you also upload the Monday version, note it clearly and point users to which to use.)

    Quick Start

    import pandas as pd
    
    df = pd.read_csv("/kaggle/input/synthetic-india-fmcg-mmm/synthetic_mmm_weekly_india_SAT.csv",
             parse_dates=["Week"])
    
    df.info()
    df.head()
    

    Aggregate to Geo-Brand Weekly

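    # Note: sum() applies to every numeric column, so prices and distribution
    # metrics become totals across SKUs rather than averages.
    geo_brand = (
      df.groupby(["Week","Geo","Brand"], as_index=False)
       .sum(numeric_only=True)
    )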
    

    Create Modeling-Friendly Features

    Example: log-transform sales value, normalize media, build price index.

    import numpy as np
    
    m = geo_brand.copy()
    m["log_sales_val"] = np.log1p(m["Sales_Value"])
    m["price_index"] = m["Net_Price"] / m.groupby(["Geo","Brand"])["Net_Price"].transform("mean")
    

    Calendar Notes

    • Week variable = week-ending Saturday (Pandas freq W-SAT).
    • First week: 2022-07-02; last week: 2025-06-27 (depending on 156-week span anchor).
    • To derive a week-start (Sunday) date:

      df["Week_Start"] = df["Week"] - pd.Timedelta(days=6)
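
    One way to sanity-check the calendar notes above is to build the weekly index that a gap-free 156-week span would imply and compare it with the `Week` column. The start date and week count are taken from the description; treating the panel as gap-free is an assumption.

```python
import pandas as pd

# Assumption: the panel starts on Saturday 2022-07-02 and has no gaps.
weeks = pd.date_range("2022-07-02", periods=156, freq="W-SAT")
week_start = weeks - pd.Timedelta(days=6)

first_week, last_week = weeks[0], weeks[-1]
# pandas numbers weekdays from Monday=0, so Saturday is 5 and Sunday is 6.
all_saturdays = bool((weeks.dayofweek == 5).all())
all_sunday_starts = bool((week_start.dayofweek == 6).all())
```

    Comparing `weeks` with the sorted unique values of `df["Week"]` shows which anchor the file actually uses.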
      

    Data Dictionary

    Key Dimensions

    Column | Type | Description
    Week | date | Week-ending Saturday timestamp.
    Geo | categorical | 8 rollups: NORTH, SOUTH, EAST, WEST, CENTRAL, NORTHEAST, METRO_DELHI, METRO_MUMBAI.
    Brand | categorical | BrandA / BrandB / BrandC.
    SKU | categorical | Brand-level SKU IDs (3 per brand).

    Commercial Outcomes

    Column | Type | Notes
    Sales_Units | float | Modeled weekly unit sales after macro, distribution, price, promo & media effects. Lognormal noise added.
    Sales_Value | float | Sales_Units × Net_Price. Use for revenue MMM or ROI analyses.

    Pricing

    Column | Type | Notes
    MRP | float | Baseline list price (per-unit). Drifts with CPI & brand positioning.
    Net_Price | float | Effective real...
  11. Labour Force Survey 2007, March - South Africa

    • microdata.worldbank.org
    • catalog.ihsn.org
    • +2more
    Updated May 1, 2014
    Cite
    Statistics South Africa (2014). Labour Force Survey 2007, March - South Africa [Dataset]. https://microdata.worldbank.org/index.php/catalog/955
    Dataset updated
    May 1, 2014
    Dataset authored and provided by
    Statistics South Africa (http://www.statssa.gov.za/)
    Time period covered
    2007
    Area covered
    South Africa
    Description

    Abstract

    The LFS is a twice-yearly rotating panel household survey, specifically designed to measure the dynamics of employment and unemployment in South Africa. It measures a variety of issues related to the labour market, including unemployment rates (official and expanded), according to standard definitions of the International Labour Organisation (ILO).

    All editions of the LFS have been updated (some more than once) since their release. These version changes are detailed in a document available from DataFirst (in the "external documents" section titled "LFS 2000-2008 Collated Version Notes on the South African LFS").

    Geographic coverage

    National coverage

    Analysis unit

    Individuals

    Universe

    The LFS sample covers the non-institutional population except for workers' hostels. However, persons living in private dwelling units within institutions are also enumerated. For example, within a school compound, one would enumerate the schoolmaster's house and teachers' accommodation because these are private dwellings. Students living in a dormitory on the school compound would, however, be excluded.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Statistics South Africa uses a rotating panel methodology for the labour force survey. The rotating panel methodology involves visiting the same dwelling units on a number of occasions (in this instance, five at most). After the panel is established, a proportion of the dwelling units is replaced each round (in this instance, 20%). New dwelling units are added to the sample to replace those that are taken out.
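
The rotation scheme described above can be sketched as follows. This is a minimal illustration, assuming the sample is split into five equal rotation groups and that the oldest group is retired each round; the function name and group ids are hypothetical and do not come from the Statistics South Africa documentation.

```python
from collections import deque

def rotate_panel(n_rounds, n_groups=5):
    """Return, for each round, the rotation-group ids in the sample."""
    groups = deque(range(n_groups))   # leftmost group is the oldest
    next_id = n_groups
    history = [list(groups)]
    for _ in range(n_rounds - 1):
        groups.popleft()              # retire the oldest fifth (20%)
        groups.append(next_id)        # replace it with a fresh group
        next_id += 1
        history.append(list(groups))
    return history

history = rotate_panel(n_rounds=7)
```

With five groups, consecutive rounds share four of five groups (80% overlap) and no group is visited more than five times, which matches the "five at most" and 20% figures quoted in the text.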

    Enumeration Areas (EAs) that had a household count of fewer than twenty-five were omitted from the census 2001 frame that was used to draw the sample of Primary Sampling Units (PSUs) for the new Master Sample. Other omissions from the Master Sample frame included all institution EAs except workers' hostels, convents and monasteries. EAs from census 2001 were pooled in two stages, before and after sampling. Before sampling, the criterion used to pool EAs was that they should contain a minimum of one hundred households. However, during listing it was discovered that there were discrepancies between the information on the database and what was on the ground.

    Therefore, in the second stage of pooling, EAs that were found to have fewer than sixty dwelling units during listing were pooled. The Master Sample is a multi-stage stratified sample. The overall sample size of PSUs was 3000. The explicit strata were the 53 district councils/metros (DCs). The 3000 PSUs were allocated to these DCs using the power allocation method. The PSUs were then sampled using probability proportional to size principles. The measure of size used was the number of households in a PSU as calculated in the census. The sampled PSUs were listed with the dwelling unit as the listing unit. From these listings, systematic samples of dwelling units per PSU were drawn. These samples of dwelling units form clusters. The size of the clusters differs depending on the specific survey requirements. The LFS uses one of the clusters that contain ten dwelling units.
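
The allocation and selection steps above can be illustrated with a rough sketch. Everything numeric here is an assumption: the source gives no exponent for the power allocation (0.5 is used purely for illustration), the stratum and PSU sizes are synthetic, and the function names are hypothetical. The selection step uses systematic sampling as one common way to implement probability proportional to size; the source does not say which variant was used.

```python
import numpy as np

def power_allocation(stratum_sizes, total_psus, power=0.5):
    """Allocate PSUs to strata in proportion to size**power (rounded)."""
    weights = np.asarray(stratum_sizes, dtype=float) ** power
    return np.round(total_psus * weights / weights.sum()).astype(int)

def pps_systematic(sizes, n, rng):
    """Systematic PPS: pick n unit indices, larger units more likely."""
    cum = np.cumsum(np.asarray(sizes, dtype=float))
    step = cum[-1] / n                      # sampling interval
    points = rng.uniform(0, step) + step * np.arange(n)
    idx = np.searchsorted(cum, points, side="right")
    return np.minimum(idx, len(cum) - 1)    # guard against float round-off

rng = np.random.default_rng(0)
allocation = power_allocation([100, 400, 900], total_psus=60)
households = rng.integers(100, 500, size=200)   # synthetic PSU sizes
selected = pps_systematic(households, n=10, rng=rng)
```

With an exponent below 1, small strata receive more PSUs than strict proportional allocation would give them, which is the usual reason for choosing a power allocation. The selection function assumes no single unit is larger than the sampling interval; otherwise a unit could be drawn twice.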

    Mode of data collection

    Face-to-face [f2f]

  12. Data from: The Determinants of Tax Revenue and Tax Effort in Developed and...

    • scielo.figshare.com
    xls
    Updated Jun 2, 2023
    Marcelo Piancastelli; A.P. Thirlwall (2023). The Determinants of Tax Revenue and Tax Effort in Developed and Developing Countries: Theory and New Evidence 1996-2015 [Dataset]. http://doi.org/10.6084/m9.figshare.14304647.v1
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    SciELO (http://www.scielo.org/)
    Authors
    Marcelo Piancastelli; A.P. Thirlwall
    License
    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract: This paper measures the tax effort of a group of fifty-nine developed and developing countries over the period 1996-2015 by comparing a country’s actual tax/GDP ratio with the predicted ratio derived from an international tax function, which relates tax revenue to various measures of a country’s taxable capacity, such as the level of per capita income, the share of trade in GDP, the productive structure, and the level of financial deepening. The tax function is estimated using cross-section data, pooled time-series/cross-section data, and panel data with a fixed effects estimator. The results are compared and show a range of tax effort, with South Africa showing the highest effort and Switzerland the lowest. Implications for policy are drawn. The paper is critical of studies that include institutional variables (and other variables not related to the tax base of countries) to measure tax effort when they are really explanations of why the tax ratio differs between countries, not of tax effort itself.

  13. The number of SNP positions from the reference panel that was identified in...

    • plos.figshare.com
    xls
    Updated Jun 7, 2023
    Stephen Byrne; Adrian Czaban; Bruno Studer; Frank Panitz; Christian Bendixen; Torben Asp (2023). The number of SNP positions from the reference panel that was identified in the pooled data set, which could be genotyped in at least 75 percent of individual samples with increasing minimum coverage thresholds. [Dataset]. http://doi.org/10.1371/journal.pone.0057438.t002
    Available download formats: xls
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Stephen Byrne; Adrian Czaban; Bruno Studer; Frank Panitz; Christian Bendixen; Torben Asp
    License
    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    a: Reference SNP panel identified with a MAF of 5 percent in data pooled from equal numbers of reads of all eight varieties.
    b: 32 samples for ApeKI libraries, and 29 for PstI libraries (three samples were removed due to very low sequencing coverage).

  14. Quantile regression (ROE).

    • plos.figshare.com
    xls
    Updated Sep 9, 2024
    Muhammad Haris; HongXing Yao; Hijab Fatima (2024). Quantile regression (ROE). [Dataset]. http://doi.org/10.1371/journal.pone.0308356.t011
    Available download formats: xls
    Dataset updated
    Sep 9, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Muhammad Haris; HongXing Yao; Hijab Fatima
    License
    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The COVID-19 outbreak caused a massive setback to the stability of the financial system due to the emergence of several other risks alongside COVID, which significantly influenced the continuity of profitable banking operations. Therefore, this study aims to examine how differently liquidity risk and credit risk influenced banking profitability during COVID-19 (Q1 2020 to Q4 2021) than before COVID (Q1 2018 to Q4 2019). The study employs pooled OLS and OLS fixed and random effects models to analyze panel data on a sample of 37 banks currently operating in Pakistan. The results show that liquidity risk has a positive and significant relationship with return on assets and return on equity, but an insignificant relationship with net interest margin. Credit risk has a negative and significant relationship with return on assets, return on equity, and net interest margin. The study also applies quantile regression to address the normality issue in the data. The quantile regression results are consistent with the pooled OLS and OLS fixed and random effects results. The study makes valuable suggestions for regulators, policymakers, and other users of financial institutional data. The current study will help to set policies for efficient management of liquidity risk (LR) and credit risk (CR).

  15. Data from: Entrepreneurship and Human Development: An International Analysis...

    • scielo.figshare.com
    xls
    Updated Jun 2, 2023
    José Antonio Camacho Ballesta; Bladimir José de la Hoz Rosales; Ignacio Tamayo Torres (2023). Entrepreneurship and Human Development: An International Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.14326823.v1
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    SciELO (http://www.scielo.org/)
    Authors
    José Antonio Camacho Ballesta; Bladimir José de la Hoz Rosales; Ignacio Tamayo Torres
    License
    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    Purpose: This study aims to analyze the impact on human development of rates of innovative entrepreneurship and necessity entrepreneurship.
    Design/methodology/approach: Our empirical study is based on samples from countries with information about rates of entrepreneurship, human development, and social progress. The data are analyzed by means of pooled least squares and panel data techniques.
    Findings: Innovative entrepreneurship improves the quality of life in the dimensions measured by the Social Progress Index and Modified Human Development Index. Necessity entrepreneurship does not favor an increase of human development, at least in the dimensions measured by the two indexes, since this is a subsistence entrepreneurship type.
    Originality/value: This study presents new evidence that contributes to the knowledge on how entrepreneurship improves quality of life.

  16. Estimated coefficient of the two-step system GMM, pooled OLS, medium...

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 3, 2025
    Muhammad Arslan Iqbal; Md Abdur Rouf Sarkar; Majed Alharthi; Md Jahid Ebn Jalal; Md. Naimur Rahman (2025). Estimated coefficient of the two-step system GMM, pooled OLS, medium quantile regression, and IV2SLS models for robustness check. [Dataset]. http://doi.org/10.1371/journal.pone.0324147.t004
    Available download formats: xls
    Dataset updated
    Jun 3, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Muhammad Arslan Iqbal; Md Abdur Rouf Sarkar; Majed Alharthi; Md Jahid Ebn Jalal; Md. Naimur Rahman
    License
    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Estimated coefficient of the two-step system GMM, pooled OLS, medium quantile regression, and IV2SLS models for robustness check.

  17. Panel long-run and dynamic short-run coefficients using PMG/ARDL model.

    • plos.figshare.com
    xls
    Updated Feb 1, 2024
    Habtamu Getachew Tegegne (2024). Panel long-run and dynamic short-run coefficients using PMG/ARDL model. [Dataset]. http://doi.org/10.1371/journal.pone.0297142.t003
    Available download formats: xls
    Dataset updated
    Feb 1, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Habtamu Getachew Tegegne
    License
    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Panel long-run and dynamic short-run coefficients using PMG/ARDL model.

  18. Augmented Dickey-Fuller (ADF) unit root test.

    • figshare.com
    xls
    Updated Sep 9, 2024
    Muhammad Haris; HongXing Yao; Hijab Fatima (2024). Augmented Dickey-Fuller (ADF) unit root test. [Dataset]. http://doi.org/10.1371/journal.pone.0308356.t003
    Available download formats: xls
    Dataset updated
    Sep 9, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Muhammad Haris; HongXing Yao; Hijab Fatima
    License
    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The COVID-19 outbreak caused a massive setback to the stability of the financial system due to the emergence of several other risks alongside COVID, which significantly influenced the continuity of profitable banking operations. Therefore, this study aims to examine how differently liquidity risk and credit risk influenced banking profitability during COVID-19 (Q1 2020 to Q4 2021) than before COVID (Q1 2018 to Q4 2019). The study employs pooled OLS and OLS fixed and random effects models to analyze panel data on a sample of 37 banks currently operating in Pakistan. The results show that liquidity risk has a positive and significant relationship with return on assets and return on equity, but an insignificant relationship with net interest margin. Credit risk has a negative and significant relationship with return on assets, return on equity, and net interest margin. The study also applies quantile regression to address the normality issue in the data. The quantile regression results are consistent with the pooled OLS and OLS fixed and random effects results. The study makes valuable suggestions for regulators, policymakers, and other users of financial institutional data. The current study will help to set policies for efficient management of liquidity risk (LR) and credit risk (CR).

  19. Pooled OLS, fixed effects, and random effects models.

    • plos.figshare.com
    xls
    Updated Mar 7, 2025
    Mohamed Ibrahim Nor (2025). Pooled OLS, fixed effects, and random effects models. [Dataset]. http://doi.org/10.1371/journal.pone.0318170.t002
    Available download formats: xls
    Dataset updated
    Mar 7, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Mohamed Ibrahim Nor
    License
    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Pooled OLS, fixed effects, and random effects models.

  20. Correlation matrix.

    • plos.figshare.com
    xls
    Updated Sep 9, 2024
    Muhammad Haris; HongXing Yao; Hijab Fatima (2024). Correlation matrix. [Dataset]. http://doi.org/10.1371/journal.pone.0308356.t005
    Available download formats: xls
    Dataset updated
    Sep 9, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Muhammad Haris; HongXing Yao; Hijab Fatima
    License
    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The COVID-19 outbreak caused a massive setback to the stability of the financial system due to the emergence of several other risks alongside COVID, which significantly influenced the continuity of profitable banking operations. Therefore, this study aims to examine how differently liquidity risk and credit risk influenced banking profitability during COVID-19 (Q1 2020 to Q4 2021) than before COVID (Q1 2018 to Q4 2019). The study employs pooled OLS and OLS fixed and random effects models to analyze panel data on a sample of 37 banks currently operating in Pakistan. The results show that liquidity risk has a positive and significant relationship with return on assets and return on equity, but an insignificant relationship with net interest margin. Credit risk has a negative and significant relationship with return on assets, return on equity, and net interest margin. The study also applies quantile regression to address the normality issue in the data. The quantile regression results are consistent with the pooled OLS and OLS fixed and random effects results. The study makes valuable suggestions for regulators, policymakers, and other users of financial institutional data. The current study will help to set policies for efficient management of liquidity risk (LR) and credit risk (CR).

National Bureau of Statistics (2021). National Panel Survey 2008-2015, Uniform Panel Dataset - Tanzania [Dataset]. https://microdata.worldbank.org/index.php/catalog/3814

National Panel Survey 2008-2015, Uniform Panel Dataset - Tanzania

3 scholarly articles cite this dataset.
Dataset updated
Mar 17, 2021
Dataset authored and provided by
National Bureau of Statistics
Time period covered
2008 - 2015
Area covered
Tanzania
Description

Abstract

Panel data possess several advantages over conventional cross-sectional and time-series data, including their power to isolate the effects of specific actions, treatments, and general policies often at the core of large-scale econometric development studies. While the concept of panel data alone provides the capacity for modeling the complexities of human behavior, the notion of universal panel data – in which time- and situation-driven variances leading to variations in tools, and thus results, are mitigated – can further enhance exploitation of the richness of panel information.

This Basic Information Document (BID) provides a brief overview of the Tanzania National Panel Survey (NPS), but focuses primarily on the theoretical development and application of panel data, as well as key elements of the universal panel survey instrument and datasets generated by the four rounds of the NPS. As this Basic Information Document (BID) for the UPD does not describe in detail the background, development, or use of the NPS itself, the round-specific NPS BIDs should supplement the information provided here.

The NPS Uniform Panel Dataset (UPD) consists of both survey instruments and datasets, meticulously aligned and engineered with the aim of facilitating the use of and improving access to the wealth of panel data offered by the NPS. The NPS-UPD provides a consistent and straightforward means of conducting not only user-driven analyses using convenient, standardized tools, but also for monitoring MKUKUTA, FYDP II, and other national level development indicators reported by the NPS.

The design of the NPS-UPD combines the four completed rounds of the NPS – NPS 2008/09 (R1), NPS 2010/11 (R2), NPS 2012/13 (R3), and NPS 2014/15 (R4) – into pooled, module-specific survey instruments and datasets. The panel survey instruments offer the ease of comparability over time, with modifications and variances easily identifiable as well as those aspects of the questionnaire which have remained identical and offer consistent information. By providing all module-specific data over time within compact, pooled datasets, panel datasets eliminate the need for user-generated merges between rounds and present data in a clear, logical format, increasing both the usability and comprehension of complex data.
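
To illustrate why a pooled, round-tagged layout removes the need for merges between rounds, here is a minimal sketch. The frames, the `hh_id` and `hh_size` columns, and their values are hypothetical stand-ins; they are not the actual NPS-UPD variable names or data.

```python
import pandas as pd

# Hypothetical round-level extracts; column names are illustrative only.
r1 = pd.DataFrame({"hh_id": [1, 2], "hh_size": [4, 6]})
r2 = pd.DataFrame({"hh_id": [1, 2], "hh_size": [5, 6]})

# Stack the rounds into one long table with a round identifier.
pooled = (
    pd.concat({"R1": r1, "R2": r2}, names=["round"])
      .reset_index(level="round")
      .reset_index(drop=True)
)

# With the round tag in place, change over time is a reshape, not a merge.
wide = pooled.pivot(index="hh_id", columns="round", values="hh_size")
change = wide["R2"] - wide["R1"]
```

In a pooled file of this shape, each row carries its own round label, so comparisons across rounds only require filtering or reshaping on that label.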

Geographic coverage

Designed for analysis of key indicators at four primary domains of inference, namely: Dar es Salaam, other urban, rural, Zanzibar.

Analysis unit

  • Households
  • Individuals

Universe

The universe includes all households and individuals in Tanzania with the exception of those residing in military barracks or other institutions.

Kind of data

Sample survey data [ssd]

Sampling procedure

While the same sample of respondents was maintained over the first three rounds of the NPS, longitudinal surveys tend to suffer from bias introduced by households leaving the survey over time, i.e. attrition. Although the NPS maintains a highly successful recapture rate (roughly 96% retention at the household level), which limits the growth of this selection bias, the longitudinal cohorts were refreshed for the NPS 2014/15 to ensure proper representativeness of estimates while keeping a primary sample large enough to preserve cohesion within panel analysis. A newly completed Population and Housing Census (PHC) in 2012, providing updated population figures along with changes in administrative boundaries, provided the opportunity to realign the NPS sample and reduce the cumulative bias potentially introduced through attrition.

To maintain the panel concept of the NPS, the sample design for NPS 2014/2015 consisted of a combination of the original NPS sample and a new NPS sample. A nationally representative sub-sample was selected to continue as part of the “Extended Panel” while an entirely new sample, “Refresh Panel”, was selected to represent national and sub-national domains. Similar to the sample in NPS 2008/2009, the sample design for the “Refresh Panel” allows analysis at four primary domains of inference, namely: Dar es Salaam, other urban areas on mainland Tanzania, rural mainland Tanzania, and Zanzibar. This new cohort in NPS 2014/2015 will be maintained and tracked in all future rounds between national censuses.

Mode of data collection

Face-to-face [f2f]

Research instrument

The format of the NPS-UPD survey instrument is similar to previously disseminated NPS survey instruments. Each module has a questionnaire and clearly identifies whether the module collects information at the individual or household level. Within each module-specific questionnaire of the NPS-UPD survey instrument, there are five distinct sections, arranged vertically: (1) the UPD (“U” on the survey instrument), (2) R4, (3) R3, (4) R2, and (5) R1, with the latter four sections presenting each questionnaire in its original form at the time of its respective dissemination.

The uppermost section of each module’s questionnaire (“U”) represents the model universal panel questionnaire, with questions generated from the comprehensive listing of questions across all four rounds of the NPS and codes generated from the comprehensive collection of codes. The following sections are arranged vertically by round, considering R4 as most recent. While not all rounds will have data reported for each question in the UPD and not each question will have reports for each of the UPD codes listed, the NPS-UPD survey instrument represents the visual, all-inclusive set of information collected by the NPS over time.

The four round-specific sections (R4, R3, R2, R1) are aligned with their UPD-equivalent question, visually presenting their contribution to compatibility with the UPD. Each round-specific section includes the original round-specific variable names, response codes and skip patterns (corresponding to their respective round-specific NPS data sets, and despite their variance from other rounds or from the comprehensive UPD code listing).
