33 datasets found

List of Top Schools of The Stata Journal sorted by citations
exaly.com
csv, json
Updated Nov 1, 2025
Cite
(2025). List of Top Schools of The Stata Journal sorted by citations [Dataset]. https://exaly.com/journal/22728/the-stata-journal/top-schools
Explore at:
Available download formats: csv, json
Dataset updated
Nov 1, 2025
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
List of Top Schools of The Stata Journal sorted by citations.
Data and code from: The impact of light-rail stations on income sorting in...
search.dataone.org
datadryad.org
Updated Oct 23, 2025
Cite
Erik Nelson (2025). Data and code from: The impact of light-rail stations on income sorting in US urban areas [Dataset]. http://doi.org/10.5061/dryad.q573n5tww
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.q573n5tww
Dataset updated
Oct 23, 2025
Dataset provided by
Dryad Digital Repository
Authors
Erik Nelson
Description
The impact of public transit (PT) on income sorting in U.S. cities has long been debated. Theory suggests that richer households may cluster near PT stations to minimize commute time, or avoid them in favor of more convenient automobile commuting. The equilibrium depends on factors such as PT speed relative to cars and the income gap between rich and poor households. Empirical evidence supports both possibilities, but prior multi-city studies suffer from identification flaws. Using data from 21 U.S. light-rail (LR) systems built or expanded since 1991, this study estimates the effect of new LR stations on nearby neighborhood incomes. My event-study design improves upon earlier work by constructing controls that match pre-treatment conditions and trends in treated station areas and by correcting for the bias that staggered treatment timing can introduce to event study estimates. Across the pooled sample, there is little evidence that new LR stations make surrounding neighborhoods poorer...

This README.txt file was generated on 2025-10-04 by Erik Nelson.

GENERAL INFORMATION

Title of Dataset: The impact of light-rail stations on income sorting in US urban areas.

Author Information Name: Erik Nelson Institution: Bowdoin College Address: 9700 College Station Brunswick, ME 04011-8497. Email: enelson2@bowdoin.edu

The Stata .do files in this depository generate the results that are plotted or presented in table format in the paper "The impact of light-rail stations on income sorting in US urban areas." All .do files load the needed datasets. All datasets are .xlsx format. Each Excel file contains data for the urban area that is part of the file's name. The data in each Excel file is in panel form. Each observation in a dataset represents a treated or control area i in urban area u in year t. We observe each area i's average nominal per capita and median HH income in year t = 1990, 2000, 2010, 2017, 2019, 2021, and 2022 (thes...,
Data from: Data files used to study change dynamics in software systems
figshare.swinburne.edu.au
pdf
Updated Jul 22, 2024
Cite
Rajesh Vasa (2024). Data files used to study change dynamics in software systems [Dataset]. http://doi.org/10.25916/sut.26288227.v1
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.25916/sut.26288227.v1
Dataset updated
Jul 22, 2024
Dataset provided by
Swinburne
Authors
Rajesh Vasa
License
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Description
It is a widely accepted fact that evolving software systems change and grow. However, it is less well-understood how change is distributed over time, specifically in object oriented software systems. The patterns and techniques used to measure growth permit developers to identify specific releases where significant change took place as well as to inform them of the longer term trend in the distribution profile. This knowledge assists developers in recording systemic and substantial changes to a release, as well as to provide useful information as input into a potential release retrospective. However, these analysis methods can only be applied after a mature release of the code has been developed. But in order to manage the evolution of complex software systems effectively, it is important to identify change-prone classes as early as possible. Specifically, developers need to know where they can expect change, the likelihood of a change, and the magnitude of these modifications in order to take proactive steps and mitigate any potential risks arising from these changes. Previous research into change-prone classes has identified some common aspects, with different studies suggesting that complex and large classes tend to undergo more changes and classes that changed recently are likely to undergo modifications in the near future. Though the guidance provided is helpful, developers need more specific guidance in order for it to be applicable in practice. Furthermore, the information needs to be available at a level that can help in developing tools that highlight and monitor evolution prone parts of a system as well as support effort estimation activities. The specific research questions that we address in this chapter are: (1) What is the likelihood that a class will change from a given version to the next? (a) Does this probability change over time? (b) Is this likelihood project specific, or general? 
(2) How is modification frequency distributed for classes that change? (3) What is the distribution of the magnitude of change? Are most modifications minor adjustments, or substantive modifications? (4) Does structural complexity make a class susceptible to change? (5) Does popularity make a class more change-prone? We make recommendations that can help developers to proactively monitor and manage change. These are derived from a statistical analysis of change in approximately 55000 unique classes across all projects under investigation. The analysis methods that we applied took into consideration the highly skewed nature of the metric data distributions. The raw metric data (4 .txt files and 4 .log files in a .zip file measuring ~2MB in total) is provided as a comma separated values (CSV) file, and the first line of the CSV file contains the header. A detailed output of the statistical analysis undertaken is provided as log files generated directly from Stata (statistical analysis software).
Fee vs Fine
zenodo.org
bin
Updated Aug 28, 2025
Cite
Rafael Nunes Teixeira (2025). Fee vs Fine [Dataset]. http://doi.org/10.5281/zenodo.16989639
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.16989639
Dataset updated
Aug 28, 2025
Dataset provided by
Zenodo http://zenodo.org/
Authors
Rafael Nunes Teixeira
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains data from an online experiment designed to test whether economically equivalent penalties—fees (paid before taking) and fines (paid after taking)—influence prosocial behaviour differently. Participants played a modified dictator game in which they could take points from another participant.

The dataset is provided in Excel format (Full-data.xlsx), along with a Stata do-file (submit.do) that reshapes, cleans, and analyses the data.

Data Collection

Platform: oTree

Recruitment: Prolific

Sample size: 201 participants

Design: Each participant played 20 rounds: 10 in the control condition and 10 in one treatment condition (fee or fine). Order of blocks was randomised.

Payment: 200 points = £1. One round was randomly selected for payment.

Variables

Identification

session – Session number

id – Participant ID

treatment – Assigned treatment (1 = Fee, 2 = Fine)

order – Order of blocks (0 = Control first, 1 = Treatment first)

Decision Rounds

For each round, participants made decisions in both control (c) and treatment (t) conditions.

c1, t1, c2, t2, … – Tokens available and/or allocated across control and treatment rounds.

takeX – Amount taken from the other participant in case X.

Norm Elicitation

Social norms were elicited after the taking task. Variables include empirical, normative, and responsibility measures at both extensive and intensive margins:

eyX, etX – Empirical expectations (beliefs about what others do)

nyX, ntX – Normative expectations (beliefs about what others think is appropriate)

ryX, rtX – Responsibility measures

casenormX – Case identifier for norm elicitation

Demographics

From survey responses:

Sex – Gender

Ethnicitysimplified – Simplified ethnicity category

Countryofresidence – Participant’s country of residence

Other

order, session – Experimental setup metadata

Stata Do-File (analysis.do)

The .do file performs the following steps:

Data Preparation

Import raw Excel file

Reshape from wide to long format (cases per participant)

Declare panel data (xtset id)

Variable Generation

Rename variables for clarity (e.g., take for amount taken)

Generate treatment dummies (treat)

Construct demographic dummies (gender, race, nationality)

Analysis Preparation

Create extensive and intensive margin variables

Generate expectation and norm measures

Output

Ready-to-analyse panel dataset for regression and statistical analysis
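
The wide-to-long reshape named in the data-preparation steps can be illustrated with a short sketch. This is not the authors' Stata do-file: it is a minimal Python illustration, the column names take1 and take2 are assumed from the takeX pattern in the variable list, and the sample values are invented.

```python
# Hypothetical wide-format rows: one row per participant, with
# take1, take2, ... columns following the takeX naming pattern.
# The values below are made up for illustration only.
wide_rows = [
    {"id": 1, "treatment": 1, "take1": 10, "take2": 0},
    {"id": 2, "treatment": 2, "take1": 5, "take2": 20},
]


def reshape_long(rows, stub="take", id_vars=("id", "treatment")):
    """Turn one row per participant into one row per participant and case."""
    long_rows = []
    for row in rows:
        # Collect the case numbers X from columns named <stub>X.
        cases = sorted(
            int(name[len(stub):])
            for name in row
            if name.startswith(stub) and name[len(stub):].isdigit()
        )
        for case in cases:
            record = {var: row[var] for var in id_vars}
            record["case"] = case
            record[stub] = row[f"{stub}{case}"]
            long_rows.append(record)
    return long_rows


long_rows = reshape_long(wide_rows)
for record in long_rows:
    print(record)
```

Each participant contributes one output row per case, which is the layout a panel declaration such as xtset id expects. In Stata the comparable step is usually written with reshape long, although the exact call used in the do-file is not shown in this listing.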
Mini EDHS 2019 data set in excel Stata form.
plos.figshare.com
xlsx
Updated Nov 18, 2024
Cite
Elsabeth Addisu; Niguss Cherie; Tesfaye Birhane; Zinet Abegaz; Abel Endawkie; Anissa Mohammed; Dagnachew Melak; Fekade Demeke Bayou; Ahmed Hussien Asfaw; Husniya Yasin; Aregash Abebayehu Zerga; Birhanu Wagaye; Fanos Yeshanew Ayele; Natnael Kebede; Asnakew Molla Mekonen; Mengistu Mera Mihiretu; Amare Muche; Yawkal Tsega (2024). Mini EDHS 2019 data set in excel Stata form. [Dataset]. http://doi.org/10.1371/journal.pone.0310901.s001
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0310901.s001
Dataset updated
Nov 18, 2024
Dataset provided by
PLOS http://plos.org/
Authors
Elsabeth Addisu; Niguss Cherie; Tesfaye Birhane; Zinet Abegaz; Abel Endawkie; Anissa Mohammed; Dagnachew Melak; Fekade Demeke Bayou; Ahmed Hussien Asfaw; Husniya Yasin; Aregash Abebayehu Zerga; Birhanu Wagaye; Fanos Yeshanew Ayele; Natnael Kebede; Asnakew Molla Mekonen; Mengistu Mera Mihiretu; Amare Muche; Yawkal Tsega
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background

Full antenatal care utilization is a key intervention that creates the opportunity to provide all the necessary health services during pregnancy, with the aim of reducing maternal and newborn morbidity and mortality. However, there is still a gap in the use of this service between rural and urban women. This study therefore aimed to identify the sources of variation in full antenatal care utilization between the rural and urban areas of Ethiopia.

Methods

The study used data from the nationally representative sample of the Mini Demographic and Health Survey (DHS) of Ethiopia. The data were collected from March 21, 2019, to June 28, 2019, in all regions of Ethiopia. A two-stage cluster sampling technique was used to select the study participants. This study included about 3,927 women (weighted sample) aged 15 to 49 years. A multivariate decomposition analysis was performed to observe the rural-urban disparities in full antenatal care utilization explained by residence difference in components of endowments and coefficients.

Results

The prevalence of full antenatal care utilization was 43.25% (95% CI: 41.7%, 44.8%). The prevalence differed between rural and urban women (rural prevalence was 27.73%, while in urban areas it was 15.52%). These results showed a statistically significant full antenatal care utilization gap between rural and urban resident women (-0.21807; 95% CI: -0.27397, -0.16217). The majority of the gap was explained by the covariate distribution, which accounted for 76.84%, and the rest, 23.16%, was due to the effect of covariate differences. Educational status, wealth status, religion, region, birth order, and parity differences between urban and rural women explain most of the full antenatal care utilization disparities.

Conclusion and recommendations

There is a significant full antenatal care utilization disparity between rural and urban women in Ethiopia. This variation in rural-urban full antenatal care utilization was explained by differences in characteristics (endowment). To decrease this gap, emphasis should be given to resource distribution targeting rural households, improvement of maternal education, and creating a platform to access information about the service and its relevance.
Dataset for Transient and persistent efficiency and spatial spillovers:...
data.mendeley.com
narcis.nl
Updated Jun 10, 2021
Cite
Samuel Faria (2021). Dataset for Transient and persistent efficiency and spatial spillovers: Evidence from the Portuguese wine industry [Dataset]. http://doi.org/10.17632/tcymhpxc86.1
Explore at:
Unique identifier
https://doi.org/10.17632/tcymhpxc86.1
Dataset updated
Jun 10, 2021
Authors
Samuel Faria
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Portugal
Description
This dataset consists of a full database of financial and operational data of Portuguese firms covering the period 2014-2019. In addition, geographical location data is shared in order to construct the spatial weights matrix. The Stata .do file is also shared, with the computed routines explained in the manuscript. Any questions or inquiries should be addressed to samuelf@utad.pt.
Replication Data for: Tabloid Media Campaigns and Public Opinion:...
search.dataone.org
dataverse.harvard.edu
Updated Nov 14, 2023
Cite
Foos, Florian; Bischof, Daniel (2023). Replication Data for: Tabloid Media Campaigns and Public Opinion: Quasi-Experimental Evidence on Euroscepticism in England [Dataset]. http://doi.org/10.7910/DVN/NYPOQD
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/NYPOQD
Dataset updated
Nov 14, 2023
Dataset provided by
Harvard Dataverse
Authors
Foos, Florian; Bischof, Daniel
Description
The files provided within this .zip file are meant to reproduce the tables and figures included in the article "Tabloid Media Campaigns and Public Opinion: Quasi-Experimental Evidence on Euroscepticism in England" by Florian Foos and Daniel Bischof in the APSR.

Notice:

- This is a fully reproducible archive written in Stata's project environment: https://www.statalist.org/forums/forum/general-stata-discussion/general/1302147-how-project-from-ssc-is-different-from-stata-built-in-project.

- As the code is written in a project environment we advise all users to carefully read the README.TXT in order to understand how reproduction in Stata's project environment works.

- The largest part of our analyses are based on yearly attitudinal data from the British Social Attitudes Survey (BSA): https://www.bsa.natcen.ac.uk. The BSA does not allow researchers to upload these data as part of their replication files; we are also not allowed to upload a recoded version of the data file. However, all yearly BSA surveys are available via the UK Data Service. In order to reproduce the results reported in this paper, you will need to a) register with the UK Data Service (https://beta.ukdataservice.ac.uk/myaccount/login) and b) access and download the relevant .dta files and place them into the replication archive (data_original/BSA/*YEAR*).
Data from: Conflicting identities and cooperation between groups:...
search.dataone.org
datadryad.org
Updated Jul 3, 2025
Cite
Antonio M. Espín; Maria Paz Espinosa; Maria J. Vázquez-De Francisco; Pablo Brañas-Garza (2025). Conflicting identities and cooperation between groups: Experimental evidence from a mentoring program [Dataset]. http://doi.org/10.5061/dryad.rr4xgxdkv
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.rr4xgxdkv
Dataset updated
Jul 3, 2025
Dataset provided by
Dryad Digital Repository
Authors
Antonio M. Espín; Maria Paz Espinosa; Maria J. Vázquez-De Francisco; Pablo Brañas-Garza
Description
We experimentally investigate cooperation in 14 centres of a mentoring program where participants have two possible natural identities: individuals raised under legal guardianship, suffering a negative stereotype (G; n=112), and users without such a social stigma (NG; n=82). Participants played a Prisoners' Dilemma game with an anonymous partner from the same centre (centre-ingroup) and from another centre (centre-outgroup). The folder contains the raw data in csv and dta (STATA) format and the script (STATA do file) used to define all the variables used and conduct all the analyses reported in the article. The analyses in the script appear in the exact order of appearance in the text, starting from the Methods section and then the Results section.

# Data from: Conflicting identities and cooperation between groups: Experimental evidence from a mentoring program

Associated article: Antonio M. Espín, María Paz Espinosa, María J. Vázquez-De Francisco, Pablo Brañas-Garza. Proceedings B. DOI: 10.1098/rspb.2025.1363

The folder contains the raw data in csv (Espin_et_al_ProcB_dataset.csv) and STATA (Espin_et_al_ProcB_dataset.dta) format, and the script (STATA do file: Espin_et_al_ProcB_script.do) used to define all the variables used and conduct all the analyses reported in the article. The analyses in the script appear in the exact order of appearance in the associated manuscript text, starting from the Methods section and then the Results section.

Variable description:

cod_center - code of the centre the participant belongs to

id - id code of the participant

id_incenter - id code of the participant within the centre

pop_propGcenter - proportion of G users (raised under legal guardianship) in the centre population based on administr...

We confirm that we received explicit consent from participants to publish the de-identified data in the public domain. All participants are identified by an anonymous numeric code in the data files.
Final data for in Stata.dta
figshare.com
bin
Updated Jun 19, 2024
Cite
Beker Ahmed (2024). Final data for in Stata.dta [Dataset]. http://doi.org/10.6084/m9.figshare.26065222.v1
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.26065222.v1
Dataset updated
Jun 19, 2024
Dataset provided by
Figshare http://figshare.com/
Authors
Beker Ahmed
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Antenatal care (ANC) is the care given to pregnant women by qualified medical experts in order to guarantee the optimal health conditions for the mother and the unborn child during pregnancy. Four or fewer antenatal care (ANC) visits are strongly linked to maternal and perinatal death. Because of this, the World Health Organization created a new model known as the minimum of eight antenatal care (ANC8+) contacts. This study focuses on the current antenatal care contact model, which has not previously been addressed. Therefore, the aim of this study is to investigate time to first antenatal care contact and its predictors among pregnant women at Bishoftu General Hospital, 2023/24.

Methods: An institution-based cross-sectional study was conducted among 347 study participants, who were selected by systematic random sampling. The data were collected using pretested, structured questionnaires. Data were entered into Epi Data version 4.6 and analyzed using STATA 15. Descriptive summary statistics such as median survival time, Kaplan-Meier survival curves, and the log-rank test were computed. Bivariate and multivariable Weibull regression models were fitted to identify the time to first antenatal care contact and its predictors. A hazard ratio with a 95% confidence interval was calculated, and p-values < 0.05 were considered statistically significant.

Ethical approval and informed consent

Ethical clearance was obtained from the institutional Research Ethics Review Board (IRB) of Arsi University (reference number A/CHS/18/2023). In addition, a letter of ethical approval was sent to Bishoftu General Hospital to obtain permission from the hospital’s administrators. Informed, voluntary, verbal consent was obtained from the head of the hospital and from the mothers. There are no study participants under the age of 18 years. Before conducting the interviews, information was given to the participants, and they were assured of voluntary participation, confidentiality, and freedom to withdraw from the study at any time. The nature and significance of the study were explained to the participants.

Data collection tool and procedures

To ensure the quality of data at the beginning, a data collection questionnaire was pre-tested on 5% of the calculated sample size at Chelelaka Health Center and necessary modifications will be made based on gaps identified in the questionnaire. Any error found during the process of checking will be corrected and modifications will be made to the final version of the data abstraction format. Training will be given to data collectors and supervisors for one day before the actual data collection task on the already existing records, with half a day of theoretical and half a day of practical training. Data quality will be controlled by designing proper data collection materials and through continuous supervision. All completed data collection forms will be examined for completeness and consistency during data management, storage, cleaning, and analysis. The data will be entered and cleaned by the principal investigator before analysis. Midwives who are working in the maternity ward will collect the data. The principal investigator of the study will control the overall activity.
Labor Force Survey, LFS 2006 - Egypt
erfdataportal.com
Updated Feb 5, 2023
Cite
Central Agency For Public Mobilization And Statistics (2023). Labor Force Survey, LFS 2006 - Egypt [Dataset]. https://www.erfdataportal.com/index.php/catalog/146
Explore at:
Dataset updated
Feb 5, 2023
Dataset provided by
Central Agency for Public Mobilization and Statistics https://www.capmas.gov.eg/
Economic Research Forum
Time period covered
2006
Area covered
Egypt
Description
Abstract

THE CLEANED AND HARMONIZED VERSION OF THE SURVEY DATA PRODUCED AND PUBLISHED BY THE ECONOMIC RESEARCH FORUM REPRESENTS 100% OF THE ORIGINAL SURVEY DATA COLLECTED BY THE CENTRAL AGENCY FOR PUBLIC MOBILIZATION AND STATISTICS (CAPMAS)

In any society, the human element represents the basis of the work force which exercises all the service and production activities. Therefore, it is mandatory to produce labor force statistics and studies related to the growth and distribution of manpower and the distribution of the labor force by different types and characteristics.

In this context, the Central Agency for Public Mobilization and Statistics conducts "Quarterly Labor Force Survey" which includes data on the size of manpower and labor force (employed and unemployed) and their geographical distribution by their characteristics.

By the end of each year, CAPMAS issues the annual aggregated labor force bulletin publication that includes the results of the quarterly survey rounds that represent the manpower and labor force characteristics during the year.

----> Historical Review of the Labor Force Survey:

1- The first Labor Force Survey was undertaken in 1957. The first round was conducted in November of that year, and the survey has continued to be conducted in successive rounds (quarterly, bi-annually, or annually) ever since.

2- Starting with the October 2006 round, the fieldwork of the labor force survey was developed to focus on the following two points: a. The importance of using the panel sample that is part of the survey sample to monitor the dynamic changes of the labor market. b. Improving the questionnaire to include more questions that help to better define the relationship to the labor force of each household member (employed, unemployed, out of labor force, etc.), in addition to reordering some of the existing questions in a more logical way.

3- Starting with the January 2008 round, the methodology was developed to collect a more representative sample during the survey year. This is done by distributing the sample of each governorate into five groups; the questionnaires are collected from each of them separately every 15 days for 3 months (in the middle and at the end of the month).

----> The survey aims at covering the following topics:

1- Measuring the size of the Egyptian labor force among civilians (for all governorates of the republic) by their different characteristics.

2- Measuring the employment rate at national level and different geographical areas.

3- Measuring the distribution of employed people by the following characteristics: gender, age, educational status, occupation, economic activity, and sector.

4- Measuring unemployment rate at different geographic areas.

5- Measuring the distribution of unemployed people by the following characteristics: gender, age, educational status, unemployment type "ever employed/never employed", occupation, economic activity, and sector for people who have ever worked.

The raw survey data provided by the Statistical Agency were cleaned and harmonized by the Economic Research Forum, in the context of a major project that started in 2009, during which extensive efforts have been exerted to acquire, clean, harmonize, preserve and disseminate micro data of existing labor force surveys in several Arab countries.

Geographic coverage

Covering a sample of urban and rural areas in all the governorates.

Analysis unit

1- Household/family. 2- Individual/person.

Universe

The survey covered a national sample of households and all individuals permanently residing in surveyed households.

Kind of data

Sample survey data [ssd]

Sampling procedure

----> Sample Design and Selection

The sample of the LFS 2006 survey is a simple systematic random sample.

----> Sample Size

The sample size varied by quarter (Q1 = 19429, Q2 = 19419, Q3 = 19119 and Q4 = 18835 households), for a total of 76802 households annually. These households are distributed at the governorate level (urban/rural).
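
As a quick consistency check, the quarterly figures can be summed and compared with the stated annual total. The snippet below is only an illustrative sketch in Python; the dictionary name is an assumption made for the example and is not part of the survey documentation.

```python
# Quarterly household counts as reported in the sample-size note.
quarterly_households = {"Q1": 19429, "Q2": 19419, "Q3": 19119, "Q4": 18835}

# Sum the four rounds to obtain the annual number of sampled households.
annual_total = sum(quarterly_households.values())
print(annual_total)
```

The sum matches the 76802 households reported for the year.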

A more detailed description of the different sampling stages and allocation of sample across governorates is provided in the Methodology document available among external resources in Arabic.

Mode of data collection

Face-to-face [f2f]

Research instrument

The questionnaire design follows the latest International Labor Organization (ILO) concepts and definitions of labor force, employment, and unemployment.

The questionnaire comprises 3 tables in addition to the identification and geographic data of household on the cover page.

----> Table 1- Demographic and employment characteristics and basic data for all household individuals

Including: gender, age, educational status, marital status, residence mobility and current work status

----> Table 2- Employment characteristics table

This table is filled in for individuals who were employed at the time of the survey or who were engaged to work during the reference week, and provides information on:
- Relationship to employer: employer, self-employed, waged worker, and unpaid family worker
- Economic activity
- Sector
- Occupation
- Effective working hours
- Work place
- Average monthly wage

----> Table 3- Unemployment characteristics table

This table is filled in for all unemployed individuals who satisfied the unemployment criteria, and provides information on:
- Type of unemployment (unemployed, unemployed ever worked)
- Economic activity and occupation in the last held job before being unemployed
- Last unemployment duration in months
- Main reason for unemployment

Cleaning operations

----> Raw Data

Office editing is one of the main stages of the survey. It started once the questionnaires were received from the field and was carried out by the selected work groups. It includes: a- Editing of coverage and completeness b- Editing of consistency

----> Harmonized Data

STATA is used to clean the datasets and SPSS is used to harmonize them.

The harmonization process starts with a cleaning process for all raw data files received from the Statistical Agency.

All cleaned data files are then merged to produce one data file on the individual level containing all variables subject to harmonization.

A country-specific program is generated for each dataset to generate/ compute/ recode/ rename/ format/ label harmonized variables.

A post-harmonization cleaning process is then conducted on the data.

Harmonized data is saved on the household as well as the individual level, in SPSS and then converted to STATA, to be disseminated.
Data for: Evaluation of intrinsic and extrinsic risk factors for dog...
data.mendeley.com
Updated Jul 4, 2020
Cite
Azzurra Carnio (2020). Data for: Evaluation of intrinsic and extrinsic risk factors for dog visceral hemangiosarcoma: a retrospective case-control study register-based in Lazio region, Italy [Dataset]. http://doi.org/10.17632/xpjwcswnpw.1
Explore at:
Unique identifier
https://doi.org/10.17632/xpjwcswnpw.1
Dataset updated
Jul 4, 2020
Authors
Azzurra Carnio
License
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Area covered
Italy, Lazio
Description
This dataset contains data on cases and controls extracted from the Animal Tumours Registry of the Lazio region (Italy). The Excel file contains three worksheets. The first is the raw dataset, the second is the STATA-coded dataset with the coding legend, and the third is the coded dataset, provided to allow statistical analysis in STATA.
Percentage distribution of socio-demographic characteristics among...
plos.figshare.com
xls
Updated Jun 16, 2023
Cite
Abiyu Abadi Tareke; Ermias Bekele Enyew; Berhanu Fikadie Endehabtu; Abiy Tasew Dubale; Habitu Birhan Eshetu; Sisay Maru Wubante (2023). Percentage distribution of socio-demographic characteristics among respondents, 2005, 2011 and 2016 EDHS. [Dataset]. http://doi.org/10.1371/journal.pone.0272701.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0272701.t001
Dataset updated
Jun 16, 2023
Dataset provided by
PLOS http://plos.org/
Authors
Abiyu Abadi Tareke; Ermias Bekele Enyew; Berhanu Fikadie Endehabtu; Abiy Tasew Dubale; Habitu Birhan Eshetu; Sisay Maru Wubante
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Percentage distribution of socio-demographic characteristics among respondents, 2005, 2011 and 2016 EDHS.
Experimenting exogenous sanctioning in a public good: when the order matters...
data.mendeley.com
Updated Nov 14, 2023
Cite
Adriana Alventosa (2023). Experimenting exogenous sanctioning in a public good: when the order matters [Dataset]. http://doi.org/10.17632/6szdbhrbg7.1
Explore at:
Unique identifier
https://doi.org/10.17632/6szdbhrbg7.1
Dataset updated
Nov 14, 2023
Authors
Adriana Alventosa
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the database and the Stata commands used to generate the results in the paper of the same name (currently under review).
Replication data for: What is the Active Prevalence of COVID-19?
search.dataone.org
dataverse.harvard.edu
Updated Nov 8, 2023
Cite
Yang, Mu-Jeung; Bertanha, Marinho; Seegert, Nathan; Gaulin, Maclean; Looney, Adam; Orleans, Brian; Pavia, Andrew T.; Stratford, Kristina; Samore, Matthew; Alder, Steven (2023). Replication data for: What is the Active Prevalence of COVID-19? [Dataset]. http://doi.org/10.7910/DVN/BCBUCE
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/BCBUCE
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Yang, Mu-Jeung; Bertanha, Marinho; Seegert, Nathan; Gaulin, Maclean; Looney, Adam; Orleans, Brian; Pavia, Andrew T.; Stratford, Kristina; Samore, Matthew; Alder, Steven
Description
What is the Active Prevalence of COVID-19? By Mu-Jeung Yang, Marinho Bertanha, Nathan Seegert, Maclean Gaulin, Adam Looney, Brian Orleans, Andrew T. Pavia, Kristina Stratford, Matthew Samore, Steven Alder

Code repository to recreate the figures and tables in “What is the Active Prevalence of COVID-19?”, Review of Economics and Statistics, 2023

Data
• Our primary data on COVID-19 positivity rates and case counts are publicly available from covidtracking.com
• Population data for Utah is publicly available from the Census Bureau.
• Our testing data used to calibrate our model contains sensitive private information, and is thus not available for distribution. However, researchers interested in replicating this part of the analysis can apply with an email to mjyang@ou.edu, for an anonymized and randomized subsample that replicates our main results. Decisions about data sharing will be made on a case-by-case basis.

Instructions
Code can generally be run in numerical order presented in filenames. All but one are stata files, run using Stata 17 (but should be generally compatible with other versions):
1. 1.0_load_data.do is run by other files, not individually.
2. 1.1_cache-load_lasso_data.do is used to create the dataset for lasso regressions, which use interactions. This file makes those interaction variables, and names them appropriately to be used in loops and with Stata’s * notation.
3. 2.1_cache_bootstrap_results.do caches the CIs from our SE bootstrap procedure, because it takes a long time to run. Caches bootstrap results to ./output/bootstrap/.
4. 3.0_table_1.do creates summary statistics and tex variables to be used in the paper.
5. 3.1_table_2.do creates table 2, which uses bootstrap SEs, so 2.1_cache_bootstrap_results.do should have been run first. Also saves off data to a temporary file for use in making figures below.
6. 3.2_table_3.do makes the state estimates in table 3.
7. 4.0_figure_1.ipynb uses python to generate Figure 1.
8. 4.1_figure_2.do makes both panels of figure 2, using the cached file from 3.1_table_2.do.
9. 5.0_appendix_c_table_1.do makes Table 1 in Appendix C.
10. 5.1_appendix_c_table_2.do makes Table 2 in Appendix C.
11. 6.0_appendix_b_figure_3.do makes figure 3 in Appendix B.
To run, extract this repo to ~/Desktop/RESTAT_CODE and execute the files in Stata or Python as per above.
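The run order described in the instructions can be derived from the file names themselves. The following is only a minimal sketch in Python, not part of the replication package: the file names are copied from the list above, while the helper names `run_order` and `tool_for` are hypothetical and introduced purely for illustration.

```python
import re

# File names as listed in the repository instructions (deliberately shuffled here).
FILES = [
    "4.1_figure_2.do",
    "1.0_load_data.do",
    "3.1_table_2.do",
    "6.0_appendix_b_figure_3.do",
    "2.1_cache_bootstrap_results.do",
    "1.1_cache-load_lasso_data.do",
    "3.0_table_1.do",
    "5.1_appendix_c_table_2.do",
    "3.2_table_3.do",
    "4.0_figure_1.ipynb",
    "5.0_appendix_c_table_1.do",
]

# Per the instructions, this file is called by other files and not run on its own.
HELPERS = {"1.0_load_data.do"}


def run_order(names):
    """Sort file names by their leading 'major.minor' prefix, compared as integers."""
    def key(name):
        major, minor = re.match(r"(\d+)\.(\d+)_", name).groups()
        return (int(major), int(minor))
    return sorted(names, key=key)


def tool_for(name):
    """Hypothetical label for the tool each file needs: notebooks use Python, the rest Stata."""
    return "python" if name.endswith(".ipynb") else "stata"


ORDER = run_order(FILES)
for name in ORDER:
    note = " (helper, not run individually)" if name in HELPERS else ""
    print(f"{tool_for(name):6s} {name}{note}")
```

Sorting on the numeric prefix places 2.1_cache_bootstrap_results.do before 3.1_table_2.do, which matches the dependency stated in the instructions.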
Rewards and cooperation in social dilemma games. Journal of Environmental...
datarepository.eur.nl
bin
Updated May 31, 2023
Cite
Jan Stoop; Daan van Soest; Jana Vyrastekova (2023). Rewards and cooperation in social dilemma games. Journal of Environmental Economics and Management_stata data and do file [Dataset]. http://doi.org/10.25397/eur.14636343.v1
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.25397/eur.14636343.v1
Dataset updated
May 31, 2023
Dataset provided by
Erasmus University Rotterdam (EUR)
Authors
Jan Stoop; Daan van Soest; Jana Vyrastekova
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Stata dta and do file of the paper "Stoop, J., Van Soest, D., and Vyrastekova, J. (2018). Rewards and cooperation in social dilemma games. Journal of Environmental Economics and Management, 88, 300-310". The do file shows the statistical analyses of this paper, in the order in which they appear in the paper.
Labor Force Survey, LFS 2017 - Palestine
erfdataportal.com
Updated Mar 22, 2021
Cite
Palestinian Central Bureau of Statistics (2021). Labor Force Survey, LFS 2017 - Palestine [Dataset]. https://www.erfdataportal.com/index.php/catalog/170
Explore at:
Dataset updated
Mar 22, 2021
Dataset provided by
Palestinian Central Bureau of Statistics https://pcbs.gov/
Economic Research Forum
Time period covered
2017
Area covered
Palestine
Description
Abstract

THE CLEANED AND HARMONIZED VERSION OF THE SURVEY DATA PRODUCED AND PUBLISHED BY THE ECONOMIC RESEARCH FORUM REPRESENTS 100% OF THE ORIGINAL SURVEY DATA COLLECTED BY THE PALESTINIAN CENTRAL BUREAU OF STATISTICS

The Palestinian Central Bureau of Statistics (PCBS) carried out four rounds of the Labor Force Survey 2017 (LFS). The survey rounds covered a total sample of about 23,120 households (5,780 households per quarter).

The main objective of collecting data on the labour force and its components, including employment, unemployment and underemployment, is to provide basic information on the size and structure of the Palestinian labour force. Data collected at different points in time provide a basis for monitoring current trends and changes in the labour market and in the employment situation. These data, supported with information on other aspects of the economy, provide a basis for the evaluation and analysis of macro-economic policies.

The raw survey data provided by the Statistical Agency were cleaned and harmonized by the Economic Research Forum, in the context of a major project that started in 2009, during which extensive efforts have been made to acquire, clean, harmonize, preserve and disseminate micro data of existing labor force surveys in several Arab countries.

Geographic coverage

Covering a representative sample on the region level (West Bank, Gaza Strip), the locality type (urban, rural, camp) and the governorates.

Analysis unit

1- Household/family. 2- Individual/person.

Universe

The survey covered all Palestinian households whose usual residence is in the Palestinian Territory.

Kind of data

Sample survey data [ssd]

Sampling procedure

The methodology was designed according to the context of the survey, international standards, data processing requirements and comparability of outputs with other related surveys.

---> Target Population: It consists of all individuals aged 10 years and above who were normally staying with their households in the State of Palestine during 2017.

---> Sampling Frame: The sampling frame consists of the master sample, which was updated in 2011: each enumeration area consists of buildings and housing units with an average of about 124 households. The master sample consists of 596 enumeration areas; we used 494 enumeration areas as a framework for the labor force survey sample in 2017 and these units were used as primary sampling units (PSUs).

---> Sampling Size: The estimated sample size is 5,780 households in each quarter of 2017.

---> Sample Design: The sample is a two-stage stratified cluster sample. First stage: a systematic random sample of 494 enumeration areas is selected for the whole round, excluding enumeration areas with fewer than 40 households. Second stage: a systematic random sample of households is selected from each enumeration area chosen in the first stage: 16 households from enumeration areas with 80 households or more, and 8 households from enumeration areas with fewer than 80 households.
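The second-stage rule can be illustrated with a short Python sketch. This is not PCBS code: the function names, the household numbering and the random seed are assumptions made for illustration, and the systematic-sampling routine is the generic textbook version (random start, fixed interval), not necessarily the exact procedure PCBS used.

```python
import random


def households_to_select(ea_size):
    """Number of households drawn from an enumeration area (EA) of the given size."""
    if ea_size < 40:
        return 0  # EAs with fewer than 40 households are excluded in the first stage
    return 16 if ea_size >= 80 else 8


def systematic_sample(units, n, rng):
    """Generic systematic random sample: random start, then every 'interval'-th unit."""
    interval = len(units) / n
    start = rng.random() * interval  # random start in [0, interval)
    last = len(units) - 1
    return [units[min(int(start + i * interval), last)] for i in range(n)]


# Hypothetical EA with 124 households (the average EA size quoted in the sampling frame).
rng = random.Random(2017)
ea_households = list(range(1, 125))
sample = systematic_sample(ea_households, households_to_select(len(ea_households)), rng)
print(sample)
```

Because the interval (124 / 16 = 7.75) is larger than one, no household can be drawn twice, and the selected households are spread evenly through the EA listing.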

---> Sample strata: The population was divided by: 1- Governorate (16 governorates) 2- Type of Locality (urban, rural, refugee camps).

---> Sample Rotation: Each round of the Labor Force Survey covers all of the 494 master sample enumeration areas. Basically, the areas remain fixed over time, but households in 50% of the EAs were replaced in each round. The same households remain in the sample for two consecutive rounds, are left out for the next two rounds, and are then selected for another two consecutive rounds before being dropped from the sample. An overlap of 50% is thus achieved both between consecutive rounds and between consecutive years (making the sample efficient for monitoring purposes).
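The 50% overlap follows directly from the two-in, two-out, two-in rotation pattern. The Python sketch below checks this under one simplifying assumption that is not stated in the source: one panel of equal size enters the sample in every round. The names `IN_SAMPLE_OFFSETS`, `panels_in_round` and `overlap` are hypothetical.

```python
# Rounds (counted from a panel's entry round) in which the panel is interviewed:
# in for two rounds (0, 1), out for two rounds (2, 3), in for two more rounds (4, 5).
IN_SAMPLE_OFFSETS = {0, 1, 4, 5}


def panels_in_round(t):
    """Entry rounds of all panels interviewed in round t (assumes one panel enters per round)."""
    return {t - k for k in IN_SAMPLE_OFFSETS}


def overlap(t1, t2):
    """Share of the panels interviewed in round t1 that are also interviewed in round t2."""
    a, b = panels_in_round(t1), panels_in_round(t2)
    return len(a & b) / len(a)


consecutive_rounds = overlap(10, 11)  # adjacent quarters
consecutive_years = overlap(10, 14)   # same quarter, one year (four rounds) apart
two_rounds_apart = overlap(10, 12)
print(consecutive_rounds, consecutive_years, two_rounds_apart)
```

Under this assumption the overlap is one half for adjacent rounds and for the same quarter one year later, while rounds two quarters apart share no panels.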

Mode of data collection

Face-to-face [f2f]

Research instrument

The survey questionnaire was designed according to the International Labour Organization (ILO) recommendations. The questionnaire includes four main parts:

---> 1. Identification Data: The main objective for this part is to record the necessary information to identify the household, such as, cluster code, sector, type of locality, cell, housing number and the cell code.

---> 2. Quality Control: This part comprises a set of controls used to monitor field and office operations and to keep the questionnaire stages in sequence (data collection, field and office coding, data entry, editing after entry, and data storage).

---> 3. Household Roster: This part involves demographic characteristics about the household, like number of persons in the household, date of birth, sex, educational level…etc.

---> 4. Employment Part: This part covers the major research indicators. One questionnaire was answered by every household member aged 15 years and over, in order to explore their labour force status and identify their major characteristics with respect to employment status, economic activity, occupation, place of work, and other employment indicators.

Cleaning operations

---> Raw Data: PCBS has collected data since the 1st quarter of 2017 using hand-held devices (HHD) in Palestine, excluding Jerusalem inside borders (J1) and the Gaza Strip. The program used on the HHD is based on SQL Server and Microsoft .NET and was developed by the General Directorate of Information Systems. Using HHD reduced the data processing stages: fieldworkers collect the data and send it directly to the server, and the project manager can retrieve the data at any time. In order to work in parallel with the Gaza Strip and Jerusalem inside borders (J1), an office program was developed using the same techniques and the same database as the HHD.

---> Harmonized Data
- The SPSS package is used to clean and harmonize the datasets.
- The harmonization process starts with a cleaning process for all raw data files received from the Statistical Agency.
- All cleaned data files are then merged to produce one data file on the individual level containing all variables subject to harmonization.
- A country-specific program is generated for each dataset to generate/ compute/ recode/ rename/ format/ label harmonized variables.
- A post-harmonization cleaning process is then conducted on the data.
- Harmonized data is saved on the household as well as the individual level, in SPSS and then converted to STATA, to be disseminated.

Response rate

The survey sample consists of about 30,230 households, of which 23,120 completed the interview: 14,682 households in the West Bank and 8,438 households in the Gaza Strip. Weights were modified to account for the non-response rate. The response rate reached 82.4% in the West Bank and 92.7% in the Gaza Strip.

Sampling error estimates

---> Sampling Errors Data of this survey may be affected by sampling errors due to use of a sample and not a complete enumeration. Therefore, certain differences can be expected in comparison with the real values obtained through censuses. Variances were calculated for the most important indicators: the variance table is attached with the final report. There is no problem in disseminating results at national or governorate level for the West Bank and Gaza Strip.

---> Non-Sampling Errors: Non-statistical errors are probable in all stages of the project, during data collection or processing. These include non-response errors, response errors, interviewing errors, and data entry errors. To avoid errors and reduce their effects, great efforts were made to train the fieldworkers intensively. Training covered how to carry out the interview, what to discuss and what to avoid, and a pilot survey, alongside the practical and theoretical training given during the training course. Data entry staff were also trained on the data entry program, which was examined before the data entry process started. To stay in contact with the progress of fieldwork activities and to limit obstacles, there was continuous contact with the fieldwork team through regular visits to the field and regular meetings with them during the different field visits. Problems faced by fieldworkers were discussed to clarify any issues. Non-sampling errors can occur at the various stages of survey implementation, whether in data collection or in data processing. They are generally difficult to evaluate statistically.

They cover a wide range of errors, including errors resulting from non-response, sampling frame coverage, coding and classification, data processing, and survey response (both respondent and interviewer-related). The use of effective training and supervision and the careful design of questions have a direct bearing on limiting the magnitude of non-sampling errors, and hence on enhancing the quality of the resulting data. The survey encountered non-response, with the cases "household was not present at home during the fieldwork visit" and "housing unit is vacant" accounting for the highest percentage of non-response cases. The total non-response rate reached 14.2%, which is very low compared to the household surveys conducted by PCBS. The refusal rate reached 3.0%, which is a very low percentage compared to the
Health Survey for England, 2000-2001: Small Area Estimation Teaching Dataset...
datacatalogue.ukdataservice.ac.uk
Updated Jul 29, 2011
Cite
University of Manchester, Cathie Marsh Centre for Census and Survey Research, ESDS Government (2011). Health Survey for England, 2000-2001: Small Area Estimation Teaching Dataset [Dataset]. http://doi.org/10.5255/UKDA-SN-6792-1
Explore at:
Unique identifier
https://doi.org/10.5255/UKDA-SN-6792-1
Dataset updated
Jul 29, 2011
Dataset provided by
UK Data Service https://ukdataservice.ac.uk/
Authors
University of Manchester, Cathie Marsh Centre for Census and Survey Research, ESDS Government
Area covered
England
Description
The Health Survey for England, 2000-2001: Small Area Estimation Teaching Dataset was prepared as a resource for those interested in learning introductory small area estimation techniques. It was first presented as part of a workshop entitled 'Introducing small area estimation techniques and applying them to the Health Survey for England using Stata'. The data are accompanied by a guide that includes a practical case study enabling users to derive estimates of disability for districts in the absence of survey estimates. This is achieved using various models that combine information from ESDS government surveys with other aggregate data that are reliably available for sub-national areas. Analysis is undertaken using Stata statistical software; all relevant syntax is provided in the accompanying '.do' files.

The data files included in this teaching resource contain HSE variables and data from the Census and Mid-year population estimates and projections that were developed originally by the National Statistical agencies, as follows:
The main data file, 'hse_data.dta', is a reduced version of the HSE for 2000 and 2001. In order to combine data from two years of the HSE in a consistent way some changes have been made to the weights in each year. Additionally, some recoding of the limiting long term illness (LLTI), disability and the age variable has also been undertaken.
File 'practical_1_task_5_data.dta' contains population counts and model mobility disability rates (estimated during practical 1) distinguishing single year of age and sex for the six case study districts.
File 'practical_2_data.dta' contains the aggregate data required for Practical 2, including age- and sex-specific rates of LLTI (Census) for six UK case study districts, age- and sex-specific rates of mobility disability for England (HSE), and population counts for the six districts.
File 'pop_data_practical_3.dta' contains population counts for the six districts (by age, sex and LLTI status) required for Practical 3.
The original HSEs for 2000 and 2001 are held at the UK Data Archive under SNs 4628 and 4912 respectively. Full details of the recoding of HSE variables and how the aggregate data was produced can be found in the data documentation.

This unrestricted access data collection is freely available to download under an Open Government Licence from the UK Data Service. Note that the files should be unzipped/saved to the C: drive of the computer to be used; all syntax assumes files are saved at this location.
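To give a feel for how national rates and district population counts can be combined, the sketch below shows one of the simplest small area approaches, a synthetic estimate, in Python rather than Stata. It is an illustration only and is not taken from the teaching guide: the function name `synthetic_rate`, the age and sex groups, and all numbers are hypothetical, and the models used in the practicals may differ.

```python
def synthetic_rate(national_rates, district_counts):
    """Synthetic district rate: national group-specific rates weighted by district population."""
    expected_cases = sum(national_rates[group] * count
                         for group, count in district_counts.items())
    return expected_cases / sum(district_counts.values())


# Hypothetical national mobility disability rates by (sex, age group).
national_rates = {
    ("F", "16-64"): 0.05,
    ("F", "65+"): 0.30,
    ("M", "16-64"): 0.04,
    ("M", "65+"): 0.25,
}

# Hypothetical population counts for one district, using the same groups.
district_counts = {
    ("F", "16-64"): 40_000,
    ("F", "65+"): 10_000,
    ("M", "16-64"): 40_000,
    ("M", "65+"): 10_000,
}

district_estimate = synthetic_rate(national_rates, district_counts)
print(f"Estimated district disability rate: {district_estimate:.3f}")
```

A district with an older population than the national average would receive a higher estimate, because the older groups carry more weight in the sum.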
Mexican Wealth Distribution 1810-1910
gimi9.com
researchdata.se
Updated Dec 2, 2023
Cite
(2023). Mexican Wealth Distribution 1810-1910 [Dataset]. https://gimi9.com/dataset/eu_https-doi-org-10-57804-q8sr-qz06
Explore at:
Dataset updated
Dec 2, 2023
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The zip files contain several files with wills from Mexico between 1810 and 1910, collected in order to measure Mexican wealth distribution in its first century of independence. The main file is wills_clean.xlsx, which contains the full collection of wills; in that file, you will find variables for year, state, wealth not excluding debts, debts, and net wealth. You can combine this file with the do file cleaningroutine_for_social_tables to produce the detailed social tables. The rest of the files consist of data files with the social tables (for comparison) and xlsx files with the wills from the main file divided by decade, to facilitate calculations using the do file inequality_analysis_ routine_clean.do, from which you will be able to reproduce the rest of the analysis (unbalanced sample and generalized beta, lognormal, etc.). Note: The calculation programs are .do files; thus, they require Stata to be executed. Some of the detailed social tables are dta files, and thus also Stata files. You can open them in R and work with them or convert them to any other data format. The wills come from 5 different Mexican archives: Archivo Histórico de Notarias de la Ciudad de México, Archivo General del Estado de Yucatán, Archivo Municipal de Saltillo, Archivo Histórico de la Ciudad de Morelia, and Testamentos del Colegio de Sonora.
Intimate partner violence against women living with and without HIV, and the...
plos.figshare.com
datasetcatalog.nlm.nih.gov
pdf
Updated May 31, 2023
Cite
Mengistu Meskele; Nelisiwe Khuzwayo; Myra Taylor (2023). Intimate partner violence against women living with and without HIV, and the associated factors in Wolaita Zone, Southern Ethiopia: A comparative cross-sectional study [Dataset]. http://doi.org/10.1371/journal.pone.0220919
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.1371/journal.pone.0220919
Dataset updated
May 31, 2023
Dataset provided by
PLOS http://plos.org/
Authors
Mengistu Meskele; Nelisiwe Khuzwayo; Myra Taylor
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Southern Nations, Nationalities and Peoples, Ethiopia
Description
Objectives
This study aimed to measure the prevalence and associated factors of Intimate Partner Violence (IPV) among women living with and without HIV in Wolaita Zone, Southern Ethiopia.

Methods
A comparative cross-sectional study design was used to interview the 816 women between 18–49 years of age (408 = HIV positive, 408 = HIV negative). Using a multistage sampling technique, participants were recruited from nine health facilities based on probability proportional to the number of clients. After data entry (EpiData version 4.4.2.0) the data were exported to STATA/SE 15 software. Binary and multivariable logistic regression analysis were undertaken and the odds ratio (OR) and 95% confidence interval (CI) are presented.

Results
The lifetime prevalence of IPV among all women was 59.7% [95% CI: 56.31%-63.05%]. IPV was slightly higher among women living with HIV, 250 (61.3%), than those who were HIV negative, 238 (58.1%). Lifetime prevalence of emotional violence 413 (50.6%), physical violence 349 (42.8%), sexual violence 219 (26.8%), and controlling behaviours by husbands/partners 489 (59.9%) were reported. Associations were found between IPV and controlling behaviour of husband/partner [AOR = 8.13; 95% CI: 4.93–13.42], income [AOR = 3.97; 95% CI: 1.81–8.72], bride price payment [AOR = 3.46; 95% CI: 1.74–6.87], women’s decision to refuse sex [AOR = 2.99; 95% CI: 1.39–6.41], age group of women [AOR = 2.86; 95% CI: 1.67–4.90], partner’s family choosing wife [AOR = 2.83; 95% CI: 1.70–4.69], alcohol consumption by partner [AOR = 2.36; 95% CI: 1.36–4.10], number of sexual partners [AOR = 2.35; 95% CI: 1.36–4.09], and if partner ever physically fought with another man [AOR = 1.83; 95% CI: 1.05–3.19].

Conclusions
There is a high prevalence of IPV against women both living with and without HIV. Policy priorities should therefore involve males in programs of gender-based violence prevention in order to change their violent behaviour, and interventions are required to improve the economic status of women. Both sexes should be advised to have a single partner and marriage arrangements should be by mutual consent rather than being made by parents.
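The confidence interval reported for the overall prevalence can be approximated with a standard normal-approximation (Wald) interval. The Python sketch below uses only the two figures given in the abstract (prevalence 59.7%, 816 women); whether the authors used this method is an assumption, and the function name `wald_ci` is hypothetical.

```python
import math


def wald_ci(p, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion p from n observations."""
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return p - z * se, p + z * se


# Figures from the abstract: lifetime IPV prevalence of 59.7% among 816 women.
low, high = wald_ci(0.597, 816)
print(f"95% CI: {low:.4f} to {high:.4f}")
```

The result, roughly 0.563 to 0.631, is close to the reported 56.31%-63.05%; small differences are expected because the published prevalence is rounded.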
Understanding Society, Waves 1-, 2008- : Safeguarded/Special Licence
datacatalogue.ukdataservice.ac.uk
Updated Jul 22, 2022
Cite
University of Essex, Institute for Social and Economic Research (2022). Understanding Society, Waves 1-, 2008- : Safeguarded/Special Licence [Dataset]. http://doi.org/10.5255/UKDA-SN-8987-1
Explore at:
Unique identifier
https://doi.org/10.5255/UKDA-SN-8987-1
Dataset updated
Jul 22, 2022
Dataset provided by
UK Data Service https://ukdataservice.ac.uk/
Authors
University of Essex, Institute for Social and Economic Research
Time period covered
Jan 1, 2020 - Dec 31, 2020
Area covered
United Kingdom
Description
Understanding Society (the UK Household Longitudinal Study), which began in 2009, is conducted by the Institute for Social and Economic Research (ISER) at the University of Essex, and the survey research organisations Verian Group (formerly Kantar Public) and NatCen. It builds on and incorporates, the British Household Panel Survey (BHPS), which began in 1991.

The Understanding Society: Calendar Year Dataset, 2020, is designed to enable cross-sectional analysis of individuals and households relating specifically to their annual interviews conducted in the year 2020, and, therefore, combine data collected in three waves (Waves 10, 11 and 12). It has been produced from the same data collected in the main Understanding Society study and released in the longitudinal datasets SN 6614 (End User Licence) and SN 6931 (Special Licence). Such cross-sectional analysis can, however, only involve variables that are collected in every wave in order to have data for the full sample panel. The 2020 dataset is the first of a series of planned Calendar Year Datasets to facilitate cross-sectional analysis of specific years. Full details of the Calendar Year Dataset sample structure (including why some individual interviews from 2021 are included), data structure and additional supporting information can be found in the document '8987_calendar_year_dataset_2020_user_guide'.

As a multi-topic study, Understanding Society aims to understand the short- and long-term effects of social and economic change in the UK at the household and individual levels. The study has a strong emphasis on domains of family and social ties, employment, education, financial resources, and health. Understanding Society is an annual survey of each adult member of a nationally representative sample. The same individuals are re-interviewed in each wave approximately 12 months apart. When individuals move they are followed within the UK, and anyone joining their households is also interviewed as long as they are living with them. The fieldwork period for a single wave is 24 months. Data collection uses computer-assisted personal interviewing (CAPI) and web interviews (from wave 7), and includes a telephone mop up. From March 2020 (the end of wave 10 and 2nd year of wave 11), due to the coronavirus pandemic, face-to-face interviews were suspended and the survey has been conducted by web and telephone only, but otherwise has continued as before. One person completes the household questionnaire. Each person aged 16 or older participates in the individual adult interview and self-completed questionnaire. Youths aged 10 to 15 are asked to respond to a paper self-completion questionnaire. In 2020 an additional frequent web survey was separately issued to sample members to capture data on the rapid changes in people’s lives due to the COVID-19 pandemic (see SN 8644). The COVID-19 Survey data are not included in this dataset.

Further information may be found on the Understanding Society main stage webpage (https://www.understandingsociety.ac.uk/documentation/mainstage) and links to publications based on the study can be found on the Understanding Society Latest Research webpage.
Co-funders
In addition to the Economic and Social Research Council, co-funders for the study included the Department of Work and Pensions, the Department for Education, the Department for Transport, the Department of Culture, Media and Sport, the Department for Community and Local Government, the Department of Health, the Scottish Government, the Welsh Assembly Government, the Northern Ireland Executive, the Department of Environment and Rural Affairs, and the Food Standards Agency.

End User Licence and Special Licence versions:
There are two versions of the Calendar Year 2020 data. One is available under the standard End User Licence (EUL) agreement, and the other is a Special Licence (SL) version. The SL version contains month and year of birth variables instead of just age, more detailed country and occupation coding for a number of variables, and various income variables that have not been top-coded (see xxxx_eul_vs_sl_variable_differences for more details). Users are advised to first obtain the standard EUL version of the data to see if they are sufficient for their research requirements. The SL data have more restrictive access conditions; prospective users of the SL version will need to complete an extra application form and demonstrate to the data owners exactly why they need access to the additional variables in order to get permission to use that version. The main longitudinal versions of the Understanding Society study may be found under SNs 6614 (EUL) and 6931 (SL).

Low- and Medium-level geographical identifiers produced for the mainstage longitudinal dataset can be used with this Calendar Year 2020 dataset, subject to SL access conditions. See the User Guide for further details.

Suitable data analysis software
These data are provided by the depositor in Stata format. Users are strongly advised to analyse them in Stata. Transfer to other formats may result in unforeseen issues. Stata SE or MP software is needed to analyse the larger files, which contain about 1,900 variables.
