  Symbolic Institutional Traps: Language Regimes, Legal Legacy, and Organizational Constraint in Postcolonial Economies

    Updated Apr 26, 2025
    Cite
    Scott Brown (2025). Symbolic Institutional Traps: Language Regimes, Legal Legacy, and Organizational Constraint in Postcolonial Economies [Dataset]. http://doi.org/10.5281/zenodo.15285179
    Dataset provided by: Zenodo (http://zenodo.org/)
    Author: Scott Brown
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/

    Description

    README: Symbolic Institutional Traps and the Liability of Foreignness

    Scott M. Brown (University of Puerto Rico)
    Email: scott.brown@upr.edu
    Data DOI: 10.5281/zenodo.15050209

    Overview

    This project empirically tests how language regimes embedded in legal and administrative systems create institutional traps that constrain multinational enterprise (MNE) operations and economic integration.
    The study combines national and subnational data across four key datasets to measure how symbolic misalignment (such as monolingualism in non-commercial languages) affects regulatory quality, business formation, and workforce access.

    📂 Datasets

    Upload the following four files to your Google Colab session before running the code. The script in this README reads three of them; 2020_Rankings.xlsx is listed here but is not loaded by the script.

    Uploaded File | Description
    /content/2020_Rankings.xlsx | World Bank Ease of Doing Business (EODB): global regulatory efficiency indicators (2020 edition)
    /content/DBNA 2022 Rank and Scores.xlsx | Doing Business North America (DBNA 2022): city-level institutional performance across 83 U.S. cities
    /content/Spanish_Speakers_All_States.xlsx | U.S. Census American Community Survey (ACS): state-level Spanish-speaking and English proficiency data
    /content/wgidataset.xlsx | World Governance Indicators (WGI): governance quality measures (Regulatory Quality, Government Effectiveness, etc.)

    📋 How to Run the Study

    1. Open Google Colab.

    2. Upload the four Excel files listed above.

    3. Copy and paste the Python code provided below into a Colab notebook cell.

    4. Run the code to automatically load the datasets, clean the data, and estimate key regression models.
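Before pasting the script, it can help to confirm that the uploads landed where the code expects them. The sketch below is not part of the original study: `EXPECTED_FILES` and `find_missing` are hypothetical names, and the paths are simply copied from the dataset list above.

```python
from pathlib import Path

# Paths copied from the dataset list; adjust them if you uploaded elsewhere.
EXPECTED_FILES = [
    "/content/2020_Rankings.xlsx",
    "/content/DBNA 2022 Rank and Scores.xlsx",
    "/content/Spanish_Speakers_All_States.xlsx",
    "/content/wgidataset.xlsx",
]


def find_missing(paths):
    """Return the paths that do not exist as regular files."""
    return [p for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    missing = find_missing(EXPECTED_FILES)
    if missing:
        print("Missing uploads:")
        for path in missing:
            print("  ", path)
    else:
        print("All four files are present.")
```

If any path is reported as missing, re-upload that file before running the regressions, since `pd.read_excel` would otherwise stop with a file-not-found error.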

    🚀 Required Python Code

    # --- 0. Imports ---
    import pandas as pd
    import statsmodels.formula.api as smf

    # --- 1. Load Clean Datasets ---
    dbna = pd.read_excel('/content/DBNA 2022 Rank and Scores.xlsx')
    acs = pd.read_excel('/content/Spanish_Speakers_All_States.xlsx')
    wgi = pd.read_excel('/content/wgidataset.xlsx') # Optional: Governance analysis

    # --- 2. Standardize Column Names ---
    dbna.columns = dbna.columns.str.strip().str.replace(' ', '_')
    acs.columns = acs.columns.str.strip().str.replace(' ', '_')
    wgi.columns = wgi.columns.str.strip().str.replace(' ', '_')

    # --- 3. Merge Datasets ---
    # Merge DBNA and ACS on 'State'
    merged_dbna = dbna.merge(acs, on='State', how='left')

    # --- 4. Regressions: Language vs Institutional Outcomes ---

    # H1: Language (% Spanish) and Starting a Business Score
    model1 = smf.ols('Starting_a_Business_Score ~ Percent_Spanish_Speakers', data=merged_dbna).fit()
    print("Regression: Starting a Business Score ~ Percent Spanish Speakers")
    print(model1.summary())

    # H2: Language (% Spanish) and Land and Space Use Score
    model2 = smf.ols('Land_and_Space_Use_Score ~ Percent_Spanish_Speakers', data=merged_dbna).fit()
    print("Regression: Land and Space Use Score ~ Percent Spanish Speakers")
    print(model2.summary())

    # H3: Language (% Spanish) and Getting Electricity Score
    model3 = smf.ols('Getting_Electricity_Score ~ Percent_Spanish_Speakers', data=merged_dbna).fit()
    print("Regression: Getting Electricity Score ~ Percent Spanish Speakers")
    print(model3.summary())

    # H4: Language (% Spanish) and Employing Workers Score
    model4 = smf.ols('Employing_Workers_Score ~ Percent_Spanish_Speakers', data=merged_dbna).fit()
    print("Regression: Employing Workers Score ~ Percent Spanish Speakers")
    print(model4.summary())

    # --- 5. (Optional) Governance Analysis: Percent Spanish vs. WGI Regulatory Quality ---
    # Run this only if WGI has a key column whose values match ACS 'State'; otherwise skip it.
    # The example assumes WGI has a 'Country' column and a 'Regulatory_Quality' column.

    # wgi_merged = wgi.merge(acs, left_on='Country', right_on='State', how='left')
    # model5 = smf.ols('Regulatory_Quality ~ Percent_Spanish_Speakers', data=wgi_merged).fit()
    # print("Regression: Regulatory Quality ~ Percent Spanish Speakers")
    # print(model5.summary())

    # --- 6. End ---
    print(" All regressions completed.")
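Because step 3 is a left join on 'State', any DBNA row whose state name has no exact match in the ACS file ends up with missing language data, and the formula interface then drops that row from the regression without a warning. A minimal sketch of a check using the `indicator=True` option of `DataFrame.merge`; the helper name `unmatched_states` and the toy frames are illustrative and are not taken from the actual files.

```python
import pandas as pd


def unmatched_states(left, right, key="State"):
    """Return sorted key values in `left` that have no match in `right`."""
    # Trim whitespace so 'Texas ' and 'Texas' compare equal.
    left_keys = left[[key]].assign(**{key: left[key].astype(str).str.strip()})
    right_keys = right[[key]].assign(**{key: right[key].astype(str).str.strip()})
    check = left_keys.merge(
        right_keys.drop_duplicates(), on=key, how="left", indicator=True
    )
    return sorted(check.loc[check["_merge"] == "left_only", key].unique())


# Toy frames standing in for dbna and acs (values are placeholders).
toy_dbna = pd.DataFrame({"City": ["A", "B", "C"], "State": ["Texas ", "Ohio", "PR"]})
toy_acs = pd.DataFrame({"State": ["Texas", "Ohio", "Puerto Rico"]})

print(unmatched_states(toy_dbna, toy_acs))
```

With the real data, calling `unmatched_states(dbna, acs)` right after step 2 would list the state labels that need harmonizing (abbreviations versus full names, for example) before the merge.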

    🧠 Key Concepts

    • Symbolic Institutional Traps: Language regimes act as hidden barriers, complicating regulatory navigation and labor market integration.

    • Symbolic Misalignment: Misfit between administrative languages and global commercial norms raises onboarding costs for MNEs.

    • Institutional Friction: Language encapsulation isolates economies and reduces foreign direct investment (FDI) attractiveness.

    📜 Data Documentation

    Each dataset has been:

    • Cleaned for consistent formatting.

    • Harmonized for cross-dataset integration.

    • Standardized to facilitate reproducible econometric analysis.

    Full codebooks and metadata are available in the appendix of the research paper.

    ⚡ Notes

    • The EF EPI (English Proficiency Index) dataset is not included in this upload. If you have it, you can run further regressions on symbolic distance.

    • If a column name does not match exactly (for example, a different spelling), run print(dbna.columns) and update the variable names in the regression formulas to match.
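The column check in the last note can be automated. The sketch below mirrors the script's own renaming step (strip, then replace spaces with underscores) and reports which formula variables are absent. `normalize` and `missing_columns` are hypothetical helper names, the required names are the ones used in the regression formulas, and the example header row is made up.

```python
def normalize(name):
    """Mirror the script's renaming: trim, then turn spaces into underscores."""
    return str(name).strip().replace(" ", "_")


def missing_columns(columns, required):
    """Return the required names not found among the normalized columns."""
    available = {normalize(c) for c in columns}
    return [r for r in required if r not in available]


# Variable names used by the merge and the regression formulas in the script.
REQUIRED = [
    "State",
    "Percent_Spanish_Speakers",
    "Starting_a_Business_Score",
    "Land_and_Space_Use_Score",
    "Getting_Electricity_Score",
    "Employing_Workers_Score",
]

# Hypothetical header row; with real data pass list(merged_dbna.columns).
example_header = ["State", " Starting a Business Score", "Percent Spanish"]
print(missing_columns(example_header, REQUIRED))
```

Any name this returns is one the formulas would fail on, so it is the list of variables to rename or to correct in the formulas.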

    📈 Planned Outputs

    The code is intended to generate:

    • Regression outputs on how Spanish-speaking prevalence correlates with:

      • Starting a business

      • Ease of Doing Business

      • Regulatory quality

    • Subnational institutional performance differences (Puerto Rico vs. U.S. states).
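The Puerto Rico comparison is not implemented in the script. One simple way to approach it is a difference in mean scores between Puerto Rico rows and all other rows. The sketch below assumes the merged data labels those rows as 'Puerto Rico' in the 'State' column, which should be verified against the real file; `group_gap` is a hypothetical helper and the numbers are placeholders, not study results.

```python
import pandas as pd


def group_gap(df, score_col, state_col="State", target="Puerto Rico"):
    """Mean score of the target group minus the mean score of all other rows."""
    is_target = df[state_col] == target
    if not is_target.any() or is_target.all():
        raise ValueError("need rows both inside and outside the target group")
    return df.loc[is_target, score_col].mean() - df.loc[~is_target, score_col].mean()


# Placeholder numbers for illustration only; they are not study results.
toy = pd.DataFrame(
    {
        "State": ["Puerto Rico", "Texas", "Ohio"],
        "Starting_a_Business_Score": [60.0, 80.0, 70.0],
    }
)
print(group_gap(toy, "Starting_a_Business_Score"))
```

A raw mean gap like this is descriptive only; it does not control for anything, so it complements the regressions and does not replace them.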

    🌍 License and Reuse

    • Open Data: CC BY 4.0 License

    • Citation Requested:
      Brown, S.M. (2025). Symbolic Institutional Traps and the Liability of Foreignness: Language Regimes as Hidden Barriers to Multinational Entry. University of Puerto Rico. DOI: 10.5281/zenodo.15050209
