This is Stata code for analysis of publicly available NHANES data.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These are the Stata data sheets imported from Excel. They are used directly for the meta-analysis.

General information: The data sets contain information on how often materials of studies available through GESIS: Data Archive for the Social Sciences were downloaded and/or ordered through one of the archive's platforms/services between 2004 and 2017.
Sources and platforms: Study materials are accessible through various GESIS platforms and services: Data Catalogue (DBK), histat, datorium, data service (and others).
Years available:
- Data Catalogue: 2012-2017
- data service: 2006-2017
- datorium: 2014-2017
- histat: 2004-2017
Data sets: Data set ZA6899_Datasets_only_all_sources contains information on how often data files, such as those with a dta (Stata) or sav (SPSS) extension, have been downloaded. Identification of data files is handled semi-automatically (depending on the platform/service). Multiple downloads of one file by the same user (identified through IP address, or username for registered users) on the same day are counted as only one download.
Data set ZA6899_Doc_and_Data_all_sources contains information on how often study materials have been downloaded. Multiple downloads of any file of the same study by the same user (identified through IP address, or username for registered users) on the same day are counted as only one download.
Both data sets are available in three formats: csv (quoted, semicolon-separated), dta (Stata v13, labeled) and sav (SPSS, labeled). All formats contain identical information.
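The counting rule described above can be sketched as follows. This is a minimal illustration only: the `(user, file_name, day)` record layout, the function name `count_downloads`, and the example records are hypothetical and do not reflect the archive's actual log format or processing.

```python
from collections import Counter


def count_downloads(events):
    """Count downloads per file, counting repeats by the same user on the same day once.

    `events` is an iterable of (user, file_name, day) tuples. This layout is
    a hypothetical stand-in, not the archive's real log format.
    """
    unique_events = set(events)  # drops exact (user, file, day) repeats
    return Counter(file_name for _user, file_name, _day in unique_events)


# Made-up example records, for illustration only.
events = [
    ("user_a", "ZA0001.dta", "2017-03-01"),
    ("user_a", "ZA0001.dta", "2017-03-01"),    # same user, same day: not counted again
    ("user_a", "ZA0001.dta", "2017-03-02"),    # same user, next day: counted
    ("10.0.0.7", "ZA0001.dta", "2017-03-01"),  # different user (IP address): counted
    ("user_a", "ZA0002.sav", "2017-03-01"),
]
counts = count_downloads(events)
print(counts["ZA0001.dta"], counts["ZA0002.sav"])
```

Keying the records on a study number instead of a file name would give the per-study rule described for ZA6899_Doc_and_Data_all_sources.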
Variables: Variables/columns in both data sets are identical.
za_nr: Archive study number
version: GESIS Archiv Version
doi: Digital Object Identifier
StudyNo: Study number of respective study
Title: English study title
Title_DE: German study title
Access: Access category (0, A, B, C, D, E)
PubYear: Publication year of last version of the study
inZACAT: Study is currently also available via ZACAT
inHISTAT: Study is currently also available via HISTAT
inDownloads: There are currently data files available for download for this study in DBK or datorium
Total: All downloads combined
downloads_2004: downloads/orders from all sources combined in 2004
[up to ...]
downloads_2017: downloads/orders from all sources combined in 2017
d_2004_dbk: downloads from source dbk in 2004
[up to ...]
d_2017_dbk: downloads from source dbk in 2017
d_2004_histat: downloads from source histat in 2004
[up to ...]
d_2017_histat: downloads from source histat in 2017
d_2004_dataservice: downloads/orders from source dataservice in 2004
[up to ...]
d_2017_dataservice: downloads/orders from source dataservice in 2017
More information is available within the codebook.
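As a rough sketch of how the csv format (quoted, semicolon-separated) might be read, the example below parses a small in-memory sample with Python's standard `csv` module. The sample rows and all values in them are invented, and only a few of the documented columns are shown.

```python
import csv
import io

# Made-up two-row sample in the documented layout (quoted, semicolon-separated).
# Only a few of the documented columns appear here; the values are invented.
sample = io.StringIO(
    '"za_nr";"Title";"Total";"downloads_2016";"downloads_2017"\n'
    '"ZA0001";"Example study; wave 1";"12";"5";"7"\n'
    '"ZA0002";"Another example";"3";"3";"0"\n'
)

reader = csv.DictReader(sample, delimiter=";", quotechar='"')
rows = list(reader)

for row in rows:
    # Sum the yearly columns that are present in this sample.
    yearly = sum(int(v) for k, v in row.items() if k.startswith("downloads_"))
    print(row["za_nr"], row["Title"], row["Total"], yearly)
```

Because fields are quoted, a semicolon inside a title does not split the column. For a real file, the in-memory sample would be replaced by `open(path, newline="")` with whatever encoding the codebook specifies.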

The dataset includes the Ct values for the 4 channels (HPV16, HPV18, HPV other, and beta-globin) as well as the result of the clinical HPV test and, where available, the cytology and histology. (DTA)

This dataset was created by iFinance Tutor

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We include Stata syntax (dummy_dataset_create.do) that creates a panel dataset for negative binomial time series regression analyses, as described in our paper "Examining methodology to identify patterns of consulting in primary care for different groups of patients before a diagnosis of cancer: an exemplar applied to oesophagogastric cancer". We also include a sample dataset for clarity (dummy_dataset.dta), and a sample of that data in a spreadsheet (Appendix 2).
The variables contained therein are defined as follows:
case: binary variable for case or control status (takes a value of 0 for controls and 1 for cases).
patid: a unique patient identifier.
time_period: a count variable denoting the time period. In this example, 0 denotes 10 months before diagnosis with cancer, and 9 denotes the month of diagnosis with cancer.
ncons: number of consultations per month.
period0 to period9: 10 unique inflection point variables (one for each month before diagnosis). These are used to test which aggregation period includes the inflection point.
burden: binary variable denoting membership of one of two multimorbidity burden groups.
We also include two Stata do-files for analysing the consultation rate, stratified by burden group, using the maximum likelihood method (1_menbregpaper.do and 2_menbregpaper_bs.do).
Note: In this example, for demonstration purposes we create a dataset for 10 months leading up to diagnosis. In the paper, we analyse 24 months before diagnosis. Here, we study consultation rates over time, but the method could be used to study any countable event, such as number of prescriptions.
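To make the panel layout concrete, here is a minimal Python sketch that builds one row per patient per month with the variables listed above. It is not a translation of dummy_dataset_create.do: the consultation counts are random placeholders, the function name `make_dummy_panel` is hypothetical, and the coding of period0 to period9 (1 from month k onwards, 0 before) is an assumption, since the description above does not spell it out.

```python
import random

N_PERIODS = 10  # time_period runs 0..9, with 9 the month of diagnosis


def make_dummy_panel(n_patients, seed=0):
    """Build one row per patient per month with the variables described above.

    Consultation counts are random placeholders, and the coding of
    period0..period9 (1 from month k onwards, else 0) is an assumption.
    """
    rng = random.Random(seed)
    rows = []
    for patid in range(1, n_patients + 1):
        case = rng.randint(0, 1)    # 0 = control, 1 = case
        burden = rng.randint(0, 1)  # multimorbidity burden group
        for time_period in range(N_PERIODS):
            row = {
                "patid": patid,
                "case": case,
                "burden": burden,
                "time_period": time_period,
                "ncons": rng.randint(0, 5),  # placeholder monthly consultations
            }
            for k in range(N_PERIODS):
                row[f"period{k}"] = int(time_period >= k)  # assumed coding
            rows.append(row)
    return rows


panel = make_dummy_panel(n_patients=4)
print(len(panel), sorted(panel[0])[:5])
```

Each patient contributes ten rows, which matches the long format that panel count models generally expect.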

wndknd/stata dataset hosted on Hugging Face and contributed by the HF Datasets community

This package contains two files designed to help read individual-level DHS data into Stata. The first file addresses the problem that versions of Stata before Version 7/SE will read in only up to 2047 variables, and most of the individual files have more variables than that. The file will read in the .do, .dct and .dat files and output new .do and .dct files with only a subset of the variables specified by the user. The second file deals with earlier DHS surveys in which .do and .dct files do not exist and only .sps and .sas files are provided. The file will read in the .sas and .sps files and output a .dct and a .do file. If necessary, the first file can then be run again to select a subset of variables.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These are the data used for the regression model, in Stata format.

Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
It is a widely accepted fact that evolving software systems change and grow. However, it is less well-understood how change is distributed over time, specifically in object oriented software systems. The patterns and techniques used to measure growth permit developers to identify specific releases where significant change took place as well as to inform them of the longer term trend in the distribution profile. This knowledge assists developers in recording systemic and substantial changes to a release, as well as to provide useful information as input into a potential release retrospective. However, these analysis methods can only be applied after a mature release of the code has been developed. But in order to manage the evolution of complex software systems effectively, it is important to identify change-prone classes as early as possible. Specifically, developers need to know where they can expect change, the likelihood of a change, and the magnitude of these modifications in order to take proactive steps and mitigate any potential risks arising from these changes.
Previous research into change-prone classes has identified some common aspects, with different studies suggesting that complex and large classes tend to undergo more changes and classes that changed recently are likely to undergo modifications in the near future. Though the guidance provided is helpful, developers need more specific guidance in order for it to be applicable in practice. Furthermore, the information needs to be available at a level that can help in developing tools that highlight and monitor evolution prone parts of a system as well as support effort estimation activities.
The specific research questions that we address in this chapter are:
(1) What is the likelihood that a class will change from a given version to the next? (a) Does this probability change over time? (b) Is this likelihood project specific, or general?
(2) How is modification frequency distributed for classes that change?
(3) What is the distribution of the magnitude of change? Are most modifications minor adjustments, or substantive modifications?
(4) Does structural complexity make a class susceptible to change?
(5) Does popularity make a class more change-prone?
We make recommendations that can help developers to proactively monitor and manage change. These are derived from a statistical analysis of change in approximately 55000 unique classes across all projects under investigation. The analysis methods that we applied took into consideration the highly skewed nature of the metric data distributions.
The raw metric data (4 .txt files and 4 .log files in a .zip file measuring ~2MB in total) is provided as a comma separated values (CSV) file, and the first line of the CSV file contains the header. A detailed output of the statistical analysis undertaken is provided as log files generated directly from Stata (statistical analysis software).
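Research question (1) can be illustrated with a small sketch that estimates the share of classes that changed between two consecutive versions. This is not the authors' analysis: treating "changed" as "any metric value differs" is an assumption made here for illustration, and the class names, metric tuples, and the function name `change_probability` are hypothetical rather than taken from the provided CSV files.

```python
def change_probability(prev_version, next_version):
    """Share of classes present in both versions whose metric values differ.

    Each version maps class name -> tuple of metric values. This layout is
    hypothetical and does not reflect the columns of the provided CSV files.
    """
    shared = prev_version.keys() & next_version.keys()
    if not shared:
        return 0.0
    changed = sum(1 for name in shared if prev_version[name] != next_version[name])
    return changed / len(shared)


# Invented metrics for two consecutive versions of a small system.
v1 = {"Parser": (12, 3), "Lexer": (8, 2), "Ast": (20, 5), "Util": (4, 1)}
v2 = {"Parser": (14, 3), "Lexer": (8, 2), "Ast": (20, 5), "Util": (4, 1), "Cache": (6, 1)}
print(change_probability(v1, v2))
```

Classes that exist in only one of the two versions (here `Cache`) are left out of the denominator, since they were added or removed rather than modified.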

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cleaned Dataset for the Project

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains data from an online experiment designed to test whether economically equivalent penalties—fees (paid before taking) and fines (paid after taking)—influence prosocial behaviour differently. Participants played a modified dictator game in which they could take points from another participant.
The dataset is provided in Excel format (Full-data.xlsx), along with a Stata do-file (submit.do) that reshapes, cleans, and analyses the data.
Platform: oTree
Recruitment: Prolific
Sample size: 201 participants
Design: Each participant played 20 rounds: 10 in the control condition and 10 in one treatment condition (fee or fine). Order of blocks was randomised.
Payment: 200 points = £1. One round was randomly selected for payment.
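The payment rule can be sketched as below. The point totals are invented, the function name `draw_payment` is hypothetical, and drawing the paid round uniformly at random is an assumption; the description only says that one round was randomly selected.

```python
import random

POINTS_PER_POUND = 200  # exchange rate stated above: 200 points = £1


def draw_payment(points_by_round, rng):
    """Select one round uniformly at random and convert its points to pounds.

    Returns (round_number, pounds), with rounds numbered from 1.
    """
    index = rng.randrange(len(points_by_round))
    return index + 1, points_by_round[index] / POINTS_PER_POUND


# Invented point totals for 20 rounds, for illustration only.
points_by_round = [100 + 10 * r for r in range(20)]
paid_round, pounds = draw_payment(points_by_round, random.Random(2024))
print(f"Round {paid_round} selected, payment £{pounds:.2f}")
```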
session – Session number
id – Participant ID
treatment – Assigned treatment (1 = Fee, 2 = Fine)
order – Order of blocks (0 = Control first, 1 = Treatment first)
For each round, participants made decisions in both control (c) and treatment (t) conditions.
c1, t1, c2, t2, … – Tokens available and/or allocated across control and treatment rounds.
takeX – Amount taken from the other participant in case X.
Social norms were elicited after the taking task. Variables include empirical, normative, and responsibility measures at both extensive and intensive margins:
eyX, etX – Empirical expectations (beliefs about what others do)
nyX, ntX – Normative expectations (beliefs about what others think is appropriate)
ryX, rtX – Responsibility measures
casenormX – Case identifier for norm elicitation
From survey responses:
Sex – Gender
Ethnicitysimplified – Simplified ethnicity category
Countryofresidence – Participant’s country of residence
order, session – Experimental setup metadata
The .do file (analysis.do) performs the following steps:
Data Preparation
Import raw Excel file
Reshape from wide to long format (cases per participant)
Declare panel data (xtset id)
Variable Generation
Rename variables for clarity (e.g., take for amount taken)
Generate treatment dummies (treat)
Construct demographic dummies (gender, race, nationality)
Analysis Preparation
Create extensive and intensive margin variables
Generate expectation and norm measures
Output
Ready-to-analyse panel dataset for regression and statistical analysis
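The wide-to-long reshape in the data preparation step can be pictured with the following Python sketch. It is not the do-file itself: the output column names (`condition`, `case`, `value`), the function name `wide_to_long`, and the example rows are hypothetical, and only the c/t columns are reshaped here.

```python
import re


def wide_to_long(wide_rows):
    """Turn one row per participant into one row per participant, condition and case.

    Columns named c<number> or t<number> are treated as control/treatment
    values for that case; all other columns are copied as identifiers.
    """
    pattern = re.compile(r"^([ct])(\d+)$")
    long_rows = []
    for row in wide_rows:
        id_cols = {k: v for k, v in row.items() if not pattern.match(k)}
        for key, value in row.items():
            match = pattern.match(key)
            if match:
                long_rows.append({
                    **id_cols,
                    "condition": "control" if match.group(1) == "c" else "treatment",
                    "case": int(match.group(2)),
                    "value": value,
                })
    # Sort so that each participant's cases appear together, as in a panel.
    return sorted(long_rows, key=lambda r: (r["id"], r["case"], r["condition"]))


# Invented wide-format rows with two cases per participant.
wide = [
    {"id": 1, "treatment": 1, "c1": 10, "t1": 4, "c2": 0, "t2": 6},
    {"id": 2, "treatment": 2, "c1": 3, "t1": 3, "c2": 8, "t2": 1},
]
long_rows = wide_to_long(wide)
print(len(long_rows), long_rows[0])
```

After a reshape of this kind, each participant has several rows indexed by case, which is the shape that a panel declaration such as `xtset id` works with.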

CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Archive of datasets for A Stata Companion to Political Analysis, 5th Edition (published by CQ/Sage in 2023). There are five primary datasets: Debate, GSS, NES, States, and World. There are three minor datasets used for chapter exercises: ch6ex, ch11ex, and ch14ex. Use these datasets to replicate demonstration examples from the book and chapter exercises.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These data and commands replicate all tables and figures in the paper titled "Chinese Agriculture in the Age of High-speed Rail: Effects on Agricultural Value Added and Food Output", published in Agribusiness, 2023, 39 (2), 387-405. If using the data in this paper, please cite Gao, Y., & Wang, X. (2023). Chinese agriculture in the age of high-speed rail: Effects on agricultural value added and food output. Agribusiness, 39, 387–405. https://doi.org/10.1002/agr.21771

Revised Stata do-file and dataset prepared for journal article resubmission.

Restricted Use data from the ILAB Philippines study

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This project points to an article in The Stata Journal describing a set of routines to preprocess nominal data (firm names and addresses), perform probabilistic linking of two datasets, and display candidate matches for clerical review. The ado files and supporting pattern files are downloadable within Stata.