100+ datasets found
  1. Example Stata syntax and data construction for negative binomial time series...

    • data.mendeley.com
    Updated Nov 2, 2022
    Cite
    Sarah Price (2022). Example Stata syntax and data construction for negative binomial time series regression [Dataset]. http://doi.org/10.17632/3mj526hgzx.2
    Dataset updated
    Nov 2, 2022
    Authors
    Sarah Price
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We include Stata syntax (dummy_dataset_create.do) that creates a panel dataset for negative binomial time series regression analyses, as described in our paper "Examining methodology to identify patterns of consulting in primary care for different groups of patients before a diagnosis of cancer: an exemplar applied to oesophagogastric cancer". We also include a sample dataset for clarity (dummy_dataset.dta), and a sample of that data in a spreadsheet (Appendix 2).

    The variables contained therein are defined as follows:

    case: binary variable for case or control status (takes a value of 0 for controls and 1 for cases).

    patid: a unique patient identifier.

    time_period: a count variable denoting the time period. In this example, 0 denotes 10 months before diagnosis with cancer, and 9 denotes the month of diagnosis with cancer.

    ncons: number of consultations per month.

    period0 to period9: 10 unique inflection point variables (one for each month before diagnosis). These are used to test which aggregation period includes the inflection point.

    burden: binary variable denoting membership of one of two multimorbidity burden groups.

    We also include two Stata do-files for analysing the consultation rate, stratified by burden group, using the maximum likelihood method (1_menbregpaper.do and 2_menbregpaper_bs.do).

    Note: In this example, for demonstration purposes we create a dataset for 10 months leading up to diagnosis. In the paper, we analyse 24 months before diagnosis. Here, we study consultation rates over time, but the method could be used to study any countable event, such as number of prescriptions.
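
The variable list above can be illustrated with a small long-format panel, one row per patient per month. This is only a sketch in Python, not the authors' dummy_dataset_create.do: the function name, the random placeholder counts, and especially the coding of period0 to period9 (here, an indicator for months at or after candidate inflection point K) are assumptions made for illustration.

```python
import random

def build_dummy_panel(n_patients=6, n_periods=10, seed=0):
    """Return a long-format panel: one dict per patient per time period."""
    rng = random.Random(seed)
    rows = []
    for patid in range(1, n_patients + 1):
        case = rng.randint(0, 1)     # 0 = control, 1 = case
        burden = rng.randint(0, 1)   # one of two multimorbidity burden groups
        for t in range(n_periods):
            row = {
                "patid": patid,
                "case": case,
                "burden": burden,
                "time_period": t,
                "ncons": rng.randint(0, 5),  # placeholder monthly consultation count
            }
            # ASSUMPTION: periodK flags months at or after candidate
            # inflection point K; the source does not define the coding.
            for k in range(n_periods):
                row[f"period{k}"] = int(t >= k)
            rows.append(row)
    return rows

panel = build_dummy_panel()
print(len(panel), "rows;", "first row:", panel[0])
```

With 6 patients and 10 periods the panel has 60 rows; the real dataset would replace the random counts with observed consultations.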

  2. Current Population Survey (CPS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the current population survey (cps) annual social and economic supplement (asec) with r

    the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no.

    despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.

    the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.

    this new github repository contains three scripts:

    2005-2012 asec - download all microdata.R
    • download the fixed-width file containing household, family, and person records
    • import by separating this file into three tables, then merge 'em together at the person-level
    • download the fixed-width file containing the person-level replicate weights
    • merge the rectangular person-level file with the replicate weights, then store it in a sql database
    • create a new variable - one - in the data table

    2012 asec - analysis examples.R
    • connect to the sql database created by the 'download all microdata' program
    • create the complex sample survey object, using the replicate weights
    • perform a boatload of analysis examples

    replicate census estimates - 2011.R
    • connect to the sql database created by the 'download all microdata' program
    • create the complex sample survey object, using the replicate weights
    • match the sas output shown in the png file below

    2011 asec replicate weight sas output.png: statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

    click here to view these three scripts

    for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
    • the census bureau's current population survey page
    • the bureau of labor statistics' current population survey page
    • the current population survey's wikipedia article

    notes:

    interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name.

    as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.

    confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
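
The idea of a rectangular file (household- and family-level information attached to each person record) and the rule of subtracting a year from the file name can be sketched as follows. The description's own scripts are written in R; this illustration uses Python instead, and the column names (hh_id, fam_id, state, fam_income) and the toy records are hypothetical, not actual CPS-ASEC variable names.

```python
def rectangularize(households, families, persons):
    """Attach household- and family-level fields to each person record."""
    hh_by_id = {h["hh_id"]: h for h in households}
    fam_by_id = {(f["hh_id"], f["fam_id"]): f for f in families}
    rows = []
    for p in persons:
        row = {}
        row.update(hh_by_id[p["hh_id"]])                  # household-level fields
        row.update(fam_by_id[(p["hh_id"], p["fam_id"])])  # family-level fields
        row.update(p)                                     # person-level fields
        row["one"] = 1   # constant column, handy for counting records
        rows.append(row)
    return rows

def reference_year(file_year):
    """The file labeled with a given year describes the previous calendar year."""
    return file_year - 1

# Toy records with hypothetical column names.
households = [{"hh_id": 1, "state": "NY"}]
families = [{"hh_id": 1, "fam_id": 1, "fam_income": 50000}]
persons = [
    {"hh_id": 1, "fam_id": 1, "person_id": 1, "age": 40},
    {"hh_id": 1, "fam_id": 1, "person_id": 2, "age": 38},
]
rect = rectangularize(households, families, persons)
print(rect[0], reference_year(2012))
```

The result has one row per person, so any household or family value is repeated for every member of that household or family.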

  3. External Evaluation of the In Their Hands Programme - Kenya., Round 2 -...

    • microdataportal.aphrc.org
    Updated Jun 14, 2022
    Cite
    Damazo Kadengye, PhD (2022). External Evaluation of the In Their Hands Programme - Kenya., Round 2 - Kenya [Dataset]. https://microdataportal.aphrc.org/index.php/catalog/128
    Dataset updated
    Jun 14, 2022
    Dataset provided by
    Yohannes Dibaba Wado, PhD
    Damazo Kadengye, PhD
    Time period covered
    2019
    Area covered
    Kenya
    Description

    Abstract

    Background: Adolescent girls in Kenya are disproportionately affected by early and unintended pregnancies, unsafe abortion and HIV infection. The In Their Hands (ITH) programme in Kenya aims to increase adolescents' use of high-quality sexual and reproductive health (SRH) services through targeted interventions. The ITH programme aims to 1) promote use of contraception and testing for sexually transmitted infections (STIs) including HIV or pregnancy, for sexually active adolescent girls; 2) provide information, products and services on the adolescent girl's terms; and 3) promote community support for girls and boys to access SRH services.

    Objectives: The objectives of the evaluation are to assess: a) to what extent and how the new Adolescent Reproductive Health (ARH) partnership model and integrated system of delivery is working to meet its intended objectives and the needs of adolescents; b) adolescent user experiences across key quality dimensions and outcomes; c) how ITH programme has influenced adolescent voice, decision-making autonomy, power dynamics and provider accountability; d) how community support for adolescent reproductive and sexual health initiatives has changed as a result of this programme.

    Methodology: The ITH programme is being implemented in two phases: a formative planning and experimentation phase in the first year, from April 2017 to March 2018, and a national roll-out and implementation phase from April 2018 to March 2020. This second phase is informed by an Annual Programme Review and thorough benchmarking and assessment, which informed critical changes to performance and capacity so that ITH is fit for scale. It is expected that ITH will cover approximately 250,000 adolescent girls aged 15-19 in Kenya by April 2020. The programme is implemented by a consortium of Marie Stopes Kenya (MSK), Well Told Story, and Triggerise. ITH's key implementation strategies seek to increase adolescent motivation for service use; create a user-defined ecosystem and platform to provide girls with a network of accessible, subsidized and discreet SRH services; and launch and sustain a national discourse campaign around adolescent sexuality and rights. The 3-year study will employ a mixed-methods approach with multiple data sources, including secondary data and qualitative and quantitative primary data with various stakeholders, to explore their perceptions of and attitudes towards adolescent SRH services. Quantitative data analysis will be done using STATA to provide descriptive statistics and statistical associations / correlations on key variables. All qualitative data will be analyzed using NVIVO software.

    Study Duration: 36 months - between 2018 and 2020.

    Geographic coverage

    Homa Bay, Kakamega, Nakuru and Nairobi counties

    Analysis unit

    Private health facilities that provide T-safe services under the In Their Hands (ITH) Program.

    Universe

    1. Adolescent girls aged 15-19 who enrolled on the T-safe platform and received services, and those who enrolled but did not receive services from the ITH facilities.
    2. Service providers in charge of provision of T-safe services in the ITH facilities.
    3. Mobilisers in charge of recruiting adolescent girls aged 15-19 into the T-safe program.

    Sampling procedure

    Qualitative Sampling

    IDI participants were selected purposively from ITH intervention areas and facilities located in the four ITH intervention counties selected for the midline survey: Homa Bay, Nakuru, Kakamega and Nairobi. Study participants were identified from selected intervention facilities. We interviewed one service provider of adolescent-friendly ITH services per facility. Additionally, we conducted IDIs with adolescent girls who were enrolled and using or had used the ITH platform to access reproductive health services, or who were enrolled but may not have accessed the services for other reasons.

    Sample coverage

    We successfully conducted a total of 122 in-depth interviews (IDIs): 54 with adolescents enrolled on the T-Safe platform (including those who received services and those who were enrolled but did not receive services), 39 with service providers and 29 with mobilisers. The distribution per county was 51 IDIs in Nairobi City County (24 with adolescent girls, 17 with service providers and 10 with mobilisers), 15 IDIs in Nakuru County (2 with adolescent girls, 8 with service providers and 5 with mobilisers), 34 IDIs in Homa Bay County (18 with adolescent girls, 8 with service providers and 8 with mobilisers) and 22 IDIs in Kakamega County (10 with adolescent girls, 6 with service providers and 6 with mobilisers).
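
As a quick consistency check on the interview counts reported above, the per-county figures can be tallied by county and by respondent group. The sketch below uses only the numbers given in the text; the dictionary layout and variable names are illustrative.

```python
# In-depth interview counts per county and respondent group, as reported above.
idis = {
    "Nairobi City": {"adolescents": 24, "providers": 17, "mobilisers": 10},
    "Nakuru": {"adolescents": 2, "providers": 8, "mobilisers": 5},
    "Homa Bay": {"adolescents": 18, "providers": 8, "mobilisers": 8},
    "Kakamega": {"adolescents": 10, "providers": 6, "mobilisers": 6},
}

county_totals = {county: sum(groups.values()) for county, groups in idis.items()}
group_totals = {
    group: sum(idis[county][group] for county in idis)
    for group in ("adolescents", "providers", "mobilisers")
}
grand_total = sum(county_totals.values())

print(county_totals)  # per-county totals: 51, 15, 34, 22
print(group_totals)   # per-group totals: 54, 39, 29
print(grand_total)    # 122
```

Both the row and column totals reproduce the reported figures (122 interviews overall).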

    Sampling deviation

    N/A

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The midline evaluation included qualitative in-depth interviews with adolescent T-Safe users, adolescents who were enrolled in the platform but did not use the services, providers and mobilizers, to assess the adolescent user experience and quality of services as well as provider accountability under the T-Safe program. Generally, the aim of the qualitative study was to assess adolescent T-Safe users' experience across quality dimensions as well as providers' experiences and accountability. The dimensions assessed include adolescents' journey with the platforms, experience with the platform, perceptions of quality of services and how the ITH platforms changed provider behavior and accountability.

    Adolescent in-depth interview included: Adolescent journey, Barriers to adolescents' access to SRH services, Community attitudes towards adolescent use of contraceptives, Decision making, Factors influencing decision to visit a clinic, Motivating factors for girls to join ITH, Notable changes since the introduction of ITH, Parental support, and Perceptions about T-Safe.

    Service providers in-depth interview included: Personal and professional background, Provider's experience with ITH/T-safe platform, Notable changes/influences since the introduction of ITH/T-safe, Influence/Impact on the preference of adolescent service users and health care providers as a result of the program, Impact/influence of ITH on quality of care, Facilitators and barriers for adolescents to access SRH services, Mechanisms to address the barriers, Challenges related to the facility, Feedback about facility from adolescents, Types of support needed to improve SRH services provided to adolescents, Scenarios of different clients accessing SRH services, and Free node.

    Mobilisers in-depth interview included: Mobilizer responsibilities and designation, Job description, Motivation for joining ITH, Personal and professional background, Training, Mobilizer roles in ITH, Mobilization process, Experience with ITH platform, Key messages shared with adolescents about ITH/Tsafe during enrollment, Motivating factors for adolescents to join ITH/Tsafe, Community's attitude towards ITH/Tsafe, Challenges faced by mobilizers when mobilizing adolescents for Tsafe, Adolescents' view regarding platform, Addressing the challenges, and Free node.

    Cleaning operations

    Qualitative interviews were audio-recorded, and the recordings were transmitted to the APHRC study team by uploading them to Google Drive, which was only accessible to the team. Related interview notes, participant description forms and informed consent forms were transported to APHRC offices in Nairobi at the end of data collection, where the data transcription and coding was conducted. Audio recordings from qualitative interviews were transcribed and saved in MS Word format. The transcripts were stored electronically on password-protected computers and were only accessible to the evaluation team working on the project. A qualitative software analysis program (NVIVO) was used to assist in coding and analyzing the data. A "thematic analysis" approach was used to organize and analyze the data, and to assist in the development of a codebook and coding scheme. Data was analyzed by first reading the full IDI transcripts, becoming familiar with the data and noting the themes and concepts that emerged. A thematic framework was developed from the identified themes and sub-themes, and this was then used to create codes and code the raw data.

    Response rate

    N/A

    Sampling error estimates

    N/A

  4. Substance Abuse and Mental Health Data Archive

    • rrid.site
    • dknet.org
    • +2more
    Updated Jan 29, 2022
    Cite
    (2022). Substance Abuse and Mental Health Data Archive [Dataset]. http://identifiers.org/RRID:SCR_007002
    Dataset updated
    Jan 29, 2022
    Description

    Database of the nation's substance abuse and mental health research data providing public use data files, file documentation, and access to restricted-use data files to support a better understanding of this critical area of public health. The goal is to increase the use of the data to most accurately understand and assess substance abuse and mental health problems and the impact of related treatment systems. The data include the U.S. general and special populations, annual series, and designs that produce nationally representative estimates. Some of the data acquired and archived have never before been publicly distributed. Each collection includes survey instruments (when provided), a bibliography of related literature, and related Web site links. All data may be downloaded free of charge in SPSS, SAS, STATA, and ASCII formats and most studies are available for use with the online data analysis system. This system allows users to conduct analyses ranging from cross-tabulation to regression without downloading data or relying on other software. Another feature, Quick Tables, provides the ability to select variables from drop down menus to produce cross-tabulations and graphs that may be customized and cut and pasted into documents. Documentation files, such as codebooks and questionnaires, can be downloaded and viewed online.

  5. Sensitivity analysis for missing data in cost-effectiveness analysis: Stata...

    • figshare.com
    bin
    Updated May 31, 2023
    Cite
    Baptiste Leurent; Manuel Gomes; Rita Faria; Stephen Morris; Richard Grieve; James R Carpenter (2023). Sensitivity analysis for missing data in cost-effectiveness analysis: Stata code [Dataset]. http://doi.org/10.6084/m9.figshare.6714206.v1
    Available download formats: bin
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Baptiste Leurent; Manuel Gomes; Rita Faria; Stephen Morris; Richard Grieve; James R Carpenter
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Stata do-files and data to support the tutorial "Sensitivity Analysis for Not-at-Random Missing Data in Trial-Based Cost-Effectiveness Analysis" (Leurent, B. et al. PharmacoEconomics (2018) 36: 889). Do-files should be similar to the code provided in the article's supplementary material. Dataset based on the 10 Top Tips trial, but modified to preserve confidentiality. Results will differ from those published.

  6. Multilevel modeling of time-series cross-sectional data reveals the dynamic...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Mar 6, 2020
    Cite
    Kodai Kusano (2020). Multilevel modeling of time-series cross-sectional data reveals the dynamic interaction between ecological threats and democratic development [Dataset]. http://doi.org/10.5061/dryad.547d7wm3x
    Available download formats: zip
    Dataset updated
    Mar 6, 2020
    Dataset provided by
    University of Nevada, Reno
    Authors
    Kodai Kusano
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    What is the relationship between environment and democracy? The framework of cultural evolution suggests that societal development is an adaptation to ecological threats. Pertinent theories assume that democracy emerges as societies adapt to ecological factors such as higher economic wealth, lower pathogen threats, less demanding climates, and fewer natural disasters. However, previous research confused within-country processes with between-country processes and erroneously interpreted between-country findings as if they generalize to within-country mechanisms. In this article, we analyze a time-series cross-sectional dataset to study the dynamic relationship between environment and democracy (1949-2016), accounting for previous misconceptions in levels of analysis. By separating within-country processes from between-country processes, we find that the relationship between environment and democracy not only differs by countries but also depends on the level of analysis. Economic wealth predicts increasing levels of democracy in between-country comparisons, but within-country comparisons show that democracy declines as countries become wealthier over time. This relationship is only prevalent among historically wealthy countries but not among historically poor countries, whose wealth also increased over time. By contrast, pathogen prevalence predicts lower levels of democracy in both between-country and within-country comparisons. Our longitudinal analyses identifying temporal precedence reveal that not only reductions in pathogen prevalence drive future democracy, but also democracy reduces future pathogen prevalence and increases future wealth. These nuanced results contrast with previous analyses using narrow, cross-sectional data. As a whole, our findings illuminate the dynamic process by which environment and democracy shape each other.
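
The distinction between within-country and between-country processes can be made concrete with a simple decomposition. The sketch below uses group-mean centering, which is one common way to separate the two components; it is an assumption that this matches the authors' specification, and the toy wealth values and function name are hypothetical.

```python
from collections import defaultdict

def within_between(rows, group_key, var):
    """Split var into a group mean (between) and a deviation from it (within)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[var]
        counts[row[group_key]] += 1
    means = {g: totals[g] / counts[g] for g in totals}
    out = []
    for row in rows:
        between = means[row[group_key]]
        out.append({
            **row,
            f"{var}_between": between,            # country's average level
            f"{var}_within": row[var] - between,  # year-specific deviation
        })
    return out

# Toy country-year records (hypothetical values).
toy = [
    {"country": "A", "year": 2000, "wealth": 1.0},
    {"country": "A", "year": 2001, "wealth": 3.0},
    {"country": "B", "year": 2000, "wealth": 10.0},
    {"country": "B", "year": 2001, "wealth": 20.0},
]
decomposed = within_between(toy, "country", "wealth")
for row in decomposed:
    print(row)
```

A model that includes both components can then estimate a between-country and a within-country coefficient separately, which is why the two can differ in sign.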

    Methods

    Our time-series cross-sectional data combine various online databases. Country names were first identified and matched using the R package "countrycode" (Arel-Bundock, Enevoldsen, & Yetman, 2018) before all datasets were merged. Occasionally, we modified unidentified country names to be consistent across datasets. We then transformed "wide" data into "long" data and merged them using R's Tidyverse framework (Wickham, 2014). Our analysis begins with the year 1949 because one of the key time-variant level-1 variables, pathogen prevalence, was only available from 1949 on. See our Supplemental Material for all data, Stata syntax, R-markdown for visualization, supplemental analyses and detailed results (available at https://osf.io/drt8j/).
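
The wide-to-long transformation and merge described here were done by the authors in R; the following is only a Python sketch of the same idea. The function names, the column names and the toy values are hypothetical, and the merge is assumed to be an inner join on country and year.

```python
def wide_to_long(wide_rows, id_col, value_name):
    """Turn one row per country (a column per year) into one row per country-year."""
    long_rows = []
    for row in wide_rows:
        for col, value in row.items():
            if col == id_col:
                continue
            long_rows.append({id_col: row[id_col], "year": int(col), value_name: value})
    return long_rows

def merge_long(left, right, keys=("country", "year")):
    """Inner-join two long tables on the given key columns."""
    index = {tuple(r[k] for k in keys): r for r in right}
    merged = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            merged.append({**row, **match})
    return merged

# Toy inputs (hypothetical values).
democracy_wide = [
    {"country": "A", "1949": 0.2, "1950": 0.3},
    {"country": "B", "1949": 0.7, "1950": 0.8},
]
pathogen_wide = [{"country": "A", "1949": 5, "1950": 4}]

merged = merge_long(
    wide_to_long(democracy_wide, "country", "democracy"),
    wide_to_long(pathogen_wide, "country", "pathogen"),
)
print(merged)
```

Country B drops out of the merged table because it has no pathogen record, which is the behaviour of an inner join; a different join type would keep it with missing values.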

  7. The Canada Trademarks Dataset

    • zenodo.org
    pdf, zip
    Updated Jul 19, 2024
    Cite
    Jeremy Sheff (2024). The Canada Trademarks Dataset [Dataset]. http://doi.org/10.5281/zenodo.4999655
    Available download formats: zip, pdf
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jeremy Sheff
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Canada
    Description

    The Canada Trademarks Dataset

    18 Journal of Empirical Legal Studies 908 (2021), prepublication draft available at https://papers.ssrn.com/abstract=3782655, published version available at https://onlinelibrary.wiley.com/share/author/CHG3HC6GTFMMRU8UJFRR?target=10.1111/jels.12303

    Dataset Selection and Arrangement (c) 2021 Jeremy Sheff

    Python and Stata Scripts (c) 2021 Jeremy Sheff

    Contains data licensed by Her Majesty the Queen in right of Canada, as represented by the Minister of Industry, the minister responsible for the administration of the Canadian Intellectual Property Office.

    This individual-application-level dataset includes records of all applications for registered trademarks in Canada since approximately 1980, and of many preserved applications and registrations dating back to the beginning of Canada’s trademark registry in 1865, totaling over 1.6 million application records. It includes comprehensive bibliographic and lifecycle data; trademark characteristics; goods and services claims; identification of applicants, attorneys, and other interested parties (including address data); detailed prosecution history event data; and data on application, registration, and use claims in countries other than Canada. The dataset has been constructed from public records made available by the Canadian Intellectual Property Office. Both the dataset and the code used to build and analyze it are presented for public use on open-access terms.

    Scripts are licensed for reuse subject to the Creative Commons Attribution License 4.0 (CC-BY-4.0), https://creativecommons.org/licenses/by/4.0/. Data files are licensed for reuse subject to the Creative Commons Attribution License 4.0 (CC-BY-4.0), https://creativecommons.org/licenses/by/4.0/, and also subject to additional conditions imposed by the Canadian Intellectual Property Office (CIPO) as described below.

    Terms of Use:

    As per the terms of use of CIPO's government data, all users are required to include the above-quoted attribution to CIPO in any reproductions of this dataset. They are further required to cease using any record within the datasets that has been modified by CIPO and for which CIPO has issued a notice on its website in accordance with its Terms and Conditions, and to use the datasets in compliance with applicable laws. These requirements are in addition to the terms of the CC-BY-4.0 license, which require attribution to the author (among other terms). For further information on CIPO’s terms and conditions, see https://www.ic.gc.ca/eic/site/cipointernet-internetopic.nsf/eng/wr01935.html. For further information on the CC-BY-4.0 license, see https://creativecommons.org/licenses/by/4.0/.

    The following attribution statement, if included by users of this dataset, is satisfactory to the author, but the author makes no representations as to whether it may be satisfactory to CIPO:

    The Canada Trademarks Dataset is (c) 2021 by Jeremy Sheff and licensed under a CC-BY-4.0 license, subject to additional terms imposed by the Canadian Intellectual Property Office. It contains data licensed by Her Majesty the Queen in right of Canada, as represented by the Minister of Industry, the minister responsible for the administration of the Canadian Intellectual Property Office. For further information, see https://creativecommons.org/licenses/by/4.0/ and https://www.ic.gc.ca/eic/site/cipointernet-internetopic.nsf/eng/wr01935.html.

    Details of Repository Contents:

    This repository includes a number of .zip archives which expand into folders containing either scripts for construction and analysis of the dataset or data files comprising the dataset itself. These folders are as follows:

    • /csv: contains the .csv versions of the data files
    • /do: contains Stata do-files used to convert the .csv files to .dta format and perform the statistical analyses set forth in the paper reporting this dataset
    • /dta: contains the .dta versions of the data files
    • /py: contains the python scripts used to download CIPO’s historical trademarks data via SFTP and generate the .csv data files

    If users wish to construct rather than download the datafiles, the first script that they should run is /py/sftp_secure.py. This script will prompt the user to enter their IP Horizons SFTP credentials; these can be obtained by registering with CIPO at https://ised-isde.survey-sondage.ca/f/s.aspx?s=59f3b3a4-2fb5-49a4-b064-645a5e3a752d&lang=EN&ds=SFTP. The script will also prompt the user to identify a target directory for the data downloads. Because the data archives are quite large, users are advised to create a target directory in advance and ensure they have at least 70GB of available storage on the media in which the directory is located.
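
Before starting the download, the storage advice above can be checked programmatically. This is a minimal sketch and not part of sftp_secure.py: the function name is hypothetical, and reading "70GB" as 70 × 10^9 bytes is an assumption.

```python
import shutil

# "At least 70GB"; interpreting GB as decimal gigabytes is an assumption.
REQUIRED_BYTES = 70 * 10**9

def has_enough_space(target_dir, required_bytes=REQUIRED_BYTES):
    """Return True if the volume holding target_dir has enough free space."""
    usage = shutil.disk_usage(target_dir)  # named tuple: total, used, free
    return usage.free >= required_bytes

if __name__ == "__main__":
    target = "."  # replace with the intended download directory
    if has_enough_space(target):
        print("Enough free space for the download.")
    else:
        print("Less than 70GB free; choose another target directory.")
```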

    The sftp_secure.py script will generate a new subfolder in the user’s target directory called /XML_raw. Users should note the full path of this directory, which they will be prompted to provide when running the remaining python scripts. Each of the remaining scripts, the filenames of which begin with “iterparse”, corresponds to one of the data files in the dataset, as indicated in the script’s filename. After running one of these scripts, the user’s target directory should include a /csv subdirectory containing the data file corresponding to the script; after running all the iterparse scripts the user’s /csv directory should be identical to the /csv directory in this repository. Users are invited to modify these scripts as they see fit, subject to the terms of the licenses set forth above.
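
The "iterparse" filenames suggest streaming XML parsing, which keeps memory use low on large archives. The sketch below shows that general technique with Python's standard library; it is not the repository's code, and the element names (Application, AppNo, Status) are hypothetical stand-ins for CIPO's actual schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

def xml_to_csv(xml_source, csv_target, record_tag, fields):
    """Stream records out of an XML file and write one CSV row per record."""
    writer = csv.DictWriter(csv_target, fieldnames=fields)
    writer.writeheader()
    count = 0
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == record_tag:
            writer.writerow({f: elem.findtext(f, default="") for f in fields})
            elem.clear()  # free the parsed children to keep memory bounded
            count += 1
    return count

# Tiny in-memory example with hypothetical tags.
sample = io.BytesIO(
    b"<Root>"
    b"<Application><AppNo>1</AppNo><Status>Registered</Status></Application>"
    b"<Application><AppNo>2</AppNo><Status>Pending</Status></Application>"
    b"</Root>"
)
out = io.StringIO()
n_records = xml_to_csv(sample, out, "Application", ["AppNo", "Status"])
print(n_records)
print(out.getvalue())
```

In real use the in-memory buffers would be replaced by files opened from the /XML_raw and /csv directories.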

    With respect to the Stata do-files, only one of them is relevant to construction of the dataset itself. This is /do/CA_TM_csv_cleanup.do, which converts the .csv versions of the data files to .dta format, and uses Stata's labeling functionality to reduce the size of the resulting files while preserving information. The other do-files generate the analyses and graphics presented in the paper describing the dataset (Jeremy N. Sheff, The Canada Trademarks Dataset, 18 J. Empirical Leg. Studies (forthcoming 2021), available at https://papers.ssrn.com/abstract=3782655). These do-files are also licensed for reuse subject to the terms of the CC-BY-4.0 license, and users are invited to adapt the scripts to their needs.
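
The size reduction from labeling comes from storing a small integer code per row plus a single lookup table, instead of repeating a string in every row. The sketch below illustrates that idea in Python; it is not a translation of CA_TM_csv_cleanup.do, and the function name and sample values are hypothetical.

```python
def encode_labels(values):
    """Replace repeated strings with integer codes plus a code-to-label table."""
    label_to_code = {}
    codes = []
    for value in values:
        if value not in label_to_code:
            label_to_code[value] = len(label_to_code) + 1  # codes start at 1
        codes.append(label_to_code[value])
    code_to_label = {code: label for label, code in label_to_code.items()}
    return codes, code_to_label

# Hypothetical status column.
status = ["Registered", "Abandoned", "Registered", "Pending"]
codes, labels = encode_labels(status)
print(codes)   # [1, 2, 1, 3]
print(labels)  # {1: 'Registered', 2: 'Abandoned', 3: 'Pending'}

# No information is lost: the original strings can be recovered.
decoded = [labels[c] for c in codes]
```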

    The python and Stata scripts included in this repository are separately maintained and updated on Github at https://github.com/jnsheff/CanadaTM.

    This repository also includes a copy of the current version of CIPO's data dictionary for its historical XML trademarks archive as of the date of construction of this dataset.

  8. Data from: Different Samples, Different Results? How Sampling Techniques...

    • figshare.com
    txt
    Updated Nov 8, 2019
    Cite
    Katrin Auspurg; Andreas Schneck; Fabian Thiel (2019). Different Samples, Different Results? How Sampling Techniques Affect the Results of Field Experiments on Ethnic Discrimination [Dataset]. http://doi.org/10.6084/m9.figshare.9890801.v2
    Available download formats: txt
    Dataset updated
    Nov 8, 2019
    Dataset provided by
    figshare
    Authors
    Katrin Auspurg; Andreas Schneck; Fabian Thiel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The file AuspurgSchneckThiel2019_Appendix.pdf contains further analyses not reported in the main article. The data are provided in Stata format (version 15.1). rssm_wide.dta is the Stata dataset. an_rssm_wide.do is the analysis file for reproducing the results reported in the article and the Appendix. The file discrim.do is an analysis routine for analyzing discrimination that is called in an_rssm_wide.do.

  9. Data from: The impact of state television on voter turnout

    • dataverse.harvard.edu
    application/x-stata +1
    Updated Feb 28, 2017
    Cite
    Harvard Dataverse (2017). The impact of state television on voter turnout [Dataset]. http://doi.org/10.7910/DVN/QGMHHQ
    Available download formats: application/x-stata (496075), application/x-stata (736011), application/x-stata-syntax (7664), application/x-stata (2422452), application/x-stata-syntax (34207)
    Dataset updated
    Feb 28, 2017
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    September 1, 2016

    REPLICATION FILES FOR «THE IMPACT OF STATE TELEVISION ON VOTER TURNOUT», TO BE PUBLISHED BY THE BRITISH JOURNAL OF POLITICAL SCIENCE

    The replication files consist of two datasets and corresponding STATA do-files. Please note the following:

    1. The data used in the current microanalysis are based on the National Election Surveys of 1965, 1969, and 1973. The Institute of Social Research (ISF) was responsible for the original studies, and data was made available by the NSD (Norwegian Center for Research Data). Neither ISF nor NSD is responsible for the analyses/interpretations of the data presented here.
    2. Some of the data used in the municipality-level analyses are taken from NSD's local government database ("Kommunedatabasen"). The NSD is not responsible for the analysis presented here or the interpretation offered in the BJPS paper.
    3. Note that the municipality identification has been anonymized to avoid identification of individual respondents.
    4. Most of the analyses generate Word files that are produced by the outreg2 facility in STATA. These tables can be compared with those presented in the paper. The graphs are directly comparable to those in the paper. In a few cases, the results are only generated in the STATA output window.

    The paper employs two sets of data:

    I. Municipal-level data are in STATA format (AggregateReplicationTVData.dta), with a corresponding dataset with map coordinates (muncoord.dta). The STATA code is in a do-file (ReplicationOfAggregateAnalysis.do).
    II. The survey data are in a STATA file (ReplicationofIndividualLevelPanel.dta) with a corresponding do-file (ReplicationOfIndividualLevelAnalysis 25.08.2016.do).

    Please remember to change the file reference (i.e. use-statement) to execute the do-files.

  10. m

    Fayosse and al. Obesity and disability - Tables and Figures STATA codes 2

    • data.mendeley.com
    Updated Mar 19, 2024
    + more versions
    Cite
    Aurore Fayosse (2024). Fayosse and al. Obesity and disability - Tables and Figures STATA codes 2 [Dataset]. http://doi.org/10.17632/5hj2b5g3ky.3
    Dataset updated
    Mar 19, 2024
    Authors
    Aurore Fayosse
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This is the programme used to produce the analyses and graphs for the paper "Cross-sectional and longitudinal associations of obesity with disability between age 50 and 90 in the SHARE study". It consists of STATA v17 code.

  11. d

    Replication Data for: Sweetening the Deal: The Strategic Value of Combining...

    • dataone.org
    Updated Sep 25, 2024
    Cite
    Kwon, Yewon (2024). Replication Data for: Sweetening the Deal: The Strategic Value of Combining Inducements with Militarized Compellent Threats [Dataset]. http://doi.org/10.7910/DVN/QXFTJF
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Kwon, Yewon
    Description

    Data Description for "Sweetening the Deal: The Strategic Value of Combining Inducements with Militarized Compellent Threats"

    This package contains replication data for the above-mentioned study. It includes:
    • Stata log file: Detailed record of all statistical analyses performed.
    • Stata 'do' file: Executable script for replicating the study's analysis.
    • Stata 'dta' file: Dataset used in the study.

    These files collectively provide all the necessary resources for replicating and understanding the study's methodologies and findings.

  12. g

    Stata Code and Log-Files for the article "Women's aversion to majors that...

    • search.gesis.org
    • datacatalogue.cessda.eu
    Updated Oct 25, 2023
    Cite
    Combet, Benita (2023). Stata Code and Log-Files for the article "Women's aversion to majors that (seemingly) require systemizing skills causes gendered field of study choice" [Dataset]. https://search.gesis.org/research_data/SDN-10.7802-2554
    Dataset updated
    Oct 25, 2023
    Dataset provided by
    GESIS search
    GESIS, Köln
    Authors
    Combet, Benita
    License

    https://www.gesis.org/en/institute/data-usage-terms

    Description

    Stata do-files and log files for the article "Women's aversion to majors that (seemingly) require systemizing skills causes gendered field of study choice", published in European Sociological Review, 2023. They cover data preparation and statistical analyses.

  13. H

    Area Resource File (ARF)

    • dataverse.harvard.edu
    Updated May 30, 2013
    Cite
    Anthony Damico (2013). Area Resource File (ARF) [Dataset]. http://doi.org/10.7910/DVN/8NMSFV
    Dataset updated
    May 30, 2013
    Dataset provided by
    Harvard Dataverse
    Authors
    Anthony Damico
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    analyze the area resource file (arf) with r

    the arf is fun to say out loud. it's also a single county-level data table with about 6,000 variables, produced by the united states health resources and services administration (hrsa). the file contains health information and statistics for over 3,000 us counties. like many government agencies, hrsa provides only a sas importation script and an ascii file.

    this new github repository contains two scripts:

    2011-2012 arf - download.R
    • download the zipped area resource file directly onto your local computer
    • load the entire table into a temporary sql database
    • save the condensed file as an R data file (.rda), comma-separated value file (.csv), and/or stata-readable file (.dta)

    2011-2012 arf - analysis examples.R
    • limit the arf to the variables necessary for your analysis
    • sum up a few county-level statistics
    • merge the arf onto other data sets, using both fips and ssa county codes
    • create a sweet county-level map

    click here to view these two scripts

    for more detail about the area resource file (arf), visit:
    • the arf home page
    • the hrsa data warehouse

    notes: the arf may not be a survey data set itself, but it's particularly useful to merge onto other survey data.

    confidential to sas, spss, stata, and sudaan users: time to put down the abacus. time to transition to r. :D

  14. f

    Pain medication used during surgical abortions.

    • plos.figshare.com
    xls
    Updated Jun 11, 2023
    Cite
    Tesfaye Hurissa Tufa; Sarah Prager; Mekitie Wondafrash; Shikur Mohammed; Nicole Byl; Jason Bell (2023). Pain medication used during surgical abortions. [Dataset]. http://doi.org/10.1371/journal.pone.0249529.t002
    Dataset updated
    Jun 11, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Tesfaye Hurissa Tufa; Sarah Prager; Mekitie Wondafrash; Shikur Mohammed; Nicole Byl; Jason Bell
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Pain medication used during surgical abortions.

  15. o

    Data and Code for Democracy and Aid Donorship

    • openicpsr.org
    delimited, stata, zip
    Updated Oct 25, 2021
    + more versions
    Cite
    Angelika J. Budjan; Andreas Fuchs (2021). Data and Code for Democracy and Aid Donorship [Dataset]. https://www.openicpsr.org/openicpsr/project/120068/version/V2/view?path=/openicpsr/120068/fcr:versions/V2/Analyse-data.do&type=file
    Dataset updated
    Oct 25, 2021
    Dataset provided by
    American Economic Association
    Authors
    Angelika J. Budjan; Andreas Fuchs
    Time period covered
    1950 - 2015
    Area covered
    global
    Description
    README TO Democracy and Aid Donorship, Budjan, Angelika J., and Andreas Fuchs, American Economic Journal: Economic Policy.

    AEA Data and Code Repository project ID: 120068

    The replication material consists of four Stata do files, 20 raw input data files, five analysis datasets, and two shapefiles contained in the “outputdata” folder. Analyses have been performed with Stata version 14.0. Running the master do file (“Democracy and Aid Donorship replication file MAIN.do”) will call the configuration do file (“config.do”), the data cleaning do file (“Prepare data.do”), and the data analysis do file (“Analyse data.do”). The configuration do file creates five new folders: the “ado” folder where necessary ado files are stored; the “outputdata” folder where the generated analysis datasets are stored; the “tables” folder where results tables are stored; the “figures” folder where generated figures are stored and the “tempdata” folder where temporary datasets are stored and which are automatically deleted by the end of the script.

    In order to run the master do file (“Democracy and Aid Donorship replication file MAIN.do”), insert the correct folder path in line 19.

    The data analysis do file (“Analyse data.do”) generates four regression datasets in the “outputdata” folder. We had to omit some raw databases from the “input” data folder due to copyright reasons (Marshall et al. 2016; Banks and Wilson 2016; FreedomHouse 2016; Bormann et al. 2017; Correlates of War Project 2017). Since several “input” datasets are omitted from the download package, the do file will neither run without error nor produce the complete datasets required for the analysis, which we nevertheless provide in their entirety in the “outputdata” folder. The four regression datasets are the following:
    • “new_donors_MAIN.dta” is needed to create Tables 1-3, Figures 2-4, and most tables and figures of the Online Appendix
    • “new_donors_limited.dta” and “new_donors_3yaverages.dta” are needed to create the robustness test of Table B3 in the Online Appendix
    • “new_donors_sample_firstaid.dta” is needed to create robustness tests of Table C2 in the Online Appendix.
    Figure 1 and Appendix Figure C1 were not produced with STATA. Data from our New Aid Donors Database was merged with country boundaries and saved in shapefile format in the output folder using R. This step can be replicated with the file “Prepare_figure1_figureC1.R.” To run the code, insert the correct folder path in line 9. To create the maps, open the resulting files in QGIS and format the layer “donoryear” as in the manuscript.

    Lines 510-544 of “Prepare_data.do” produce our main variable of interest “democracy” as a temporary datafile (“tempdata\acemoglu_democ.dta”), using the inputs Polity IV Project version 4 (Marshall et al. 2016), Bjørnskov-Rode regime data (Bjørnskov and Rode 2020), and Freedom in the World Country and Territory Ratings and Statuses (Freedom House 2016). This file is then merged to the final analysis datasets. Since our analysis was performed prior to the publication of Acemoglu et al. (2019) and since we require a longer time period for our analysis, the employed data is our own replication and extension of Acemoglu et al.’s democracy variable. To allow users to generate Figure A3 without having executed “Prepare_data.do” before, we also included “acemoglu_democ.dta” in the outdata folder.


  16. H

    Replication Data for: "Selective Attention? Human Rights Organizations and...

    • dataverse.harvard.edu
    Updated May 19, 2025
    Cite
    Arman Azedi (2025). Replication Data for: "Selective Attention? Human Rights Organizations and Anti-State Naming and Shaming, 1995-2018" [Dataset]. http://doi.org/10.7910/DVN/8QGEFZ
    Dataset updated
    May 19, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    Arman Azedi
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    This Stata dataset was used for the analyses in the article "Selective Attention? Human Rights Organizations and Anti-State Naming and Shaming, 1995-2018," published in Social Problems. It is a cross-national dataset in which the unit of analysis is the country-year. The study's dependent variables, which relate to anti-state shaming, were created using data from the Integrated Conflict Early Warning System (ICEWS) and transformed into cross-national data. The independent variables are gathered from various sources, such as the World Bank's World Development Indicators, Polity IV, the Cross-National Time-Series (CNTS) Data Archive, the Mass Mobilization Project, V-Dem, CIRIGHTS, and the Yearbook of International Associations (YIA).

  17. f

    Are senior high school students in Ghana meeting WHO’s recommended level of...

    • plos.figshare.com
    docx
    Updated Jun 1, 2023
    Cite
    Abdul-Aziz Seidu; Bright Opoku Ahinkorah; Ebenezer Agbaglo; Eugene Kofuor Maafo Darteh; Edward Kwabena Ameyaw; Eugene Budu; Hawa Iddrisu (2023). Are senior high school students in Ghana meeting WHO’s recommended level of physical activity? Evidence from the 2012 Global School-based Student Health Survey Data [Dataset]. http://doi.org/10.1371/journal.pone.0229012
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Abdul-Aziz Seidu; Bright Opoku Ahinkorah; Ebenezer Agbaglo; Eugene Kofuor Maafo Darteh; Edward Kwabena Ameyaw; Eugene Budu; Hawa Iddrisu
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Ghana
    Description

    Introduction: Physical activity (PA) has both short- and long-term importance. In this study we sought to assess the prevalence and correlates of PA among 1,542 Senior High School (SHS) students.

    Methods: A cross-sectional study was conducted in Ghana among SHS students using the 2012 version of the Ghana Global School-based Student Health Survey (GSHS) data, which utilised two-stage cluster sampling technique. The population for the study comprised SHS students. The outcome variable was physical activity. The data were analysed using STATA version 14.2 for Mac OS. Both bivariate and multivariate analyses were employed. At the bivariate level, Pearson chi-square test between each independent variable and PA was conducted and the level of statistical significance was set at 5%. All the significant variables from the chi-square test were selected for the multivariate analysis. In the multivariate analysis, Poisson regression with robust variance was performed to estimate crude and adjusted prevalence ratios (APR).

    Results: It was found that 25.0% (29.0% males and 21.9% females) of SHS students were physically active. Female students (APR = 0.78, 95% CI = 0.65, 0.94), students in SHS 2 (APR = 0.76, 95% CI = 0.577, 0.941) and SHS3 (APR = 0.79, 95% CI = 0.63, 0.93), and those who went hungry (APR = 0.77, 95% CI = 0.65, 0.92) were less likely to be physically active compared to males, those in SHS1 and those who did not go hungry respectively. On the other hand, students who actively commuted to school (APR = 2.40, 95% CI = 1.72, 2.42) and got support from their peers were more likely to be physically active (APR = 1.62, 95% CI = 1.09–2.41).

    Conclusion: Only a quarter of SHS students who participated in the 2012 version of the GSHS met the WHO’s recommended level of physical activity. Sex, grade/form and experience of hunger are associated with physical activity. Physical activity is a major component of any health promotion program. Policies and programmes targeting improvement in physical activity among SHS students should take these associated factors into consideration.

  18. H

    Replication Data for: "Genocidal Consolidation: Final Solutions to Elite...

    • dataverse.harvard.edu
    • search.datacite.org
    Updated May 7, 2020
    Cite
    Eelco van der Maat (2020). Replication Data for: "Genocidal Consolidation: Final Solutions to Elite rivalry" [Dataset]. http://doi.org/10.7910/DVN/VJTPJK
    Dataset updated
    May 7, 2020
    Dataset provided by
    Harvard Dataverse
    Authors
    Eelco van der Maat
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Replication files “Genocidal Consolidation: Final Solutions to Elite rivalry”
    E van der Maat
    16-12-19

    The paper contains three analyses. Each analysis has its own replication folder.

    First analysis (genocidal consolidation onset): IO_GC_replication_I
    The first analysis has 3 .R files, one Stata .do file and three data files. To replicate the models of the first analysis, first run:
    1) functions.R
    2) two-stage probit.R
    Then run the main latent.R file to replicate the models in the paper; it contains:
    • replication of Table 4 in the paper
    • replication of Tables A.4 and A.10 of Appendix C and G
    Next, run the analysisI.do file; it contains:
    • replication of the crosstabs in Table 3 (p 30)
    • replication of the effect estimates (p 31)
    • replication of Table A.2 of Appendix C

    Second analysis (elite purges): IO_GC_replication_II
    The second analysis folder has a single Stata .do file and six data files. To replicate the models, run the analysisII.do file; it contains:
    • replication of Crosstab Figure 3 (p 35)
    • replication of Table 5
    • replication of Table A.5 and A.6 of Appendix D
    • replication of Table A.11 of Appendix G

    Third analysis (leader fates): IO_GC_replication_III
    The third analysis folder contains a single Stata .do file, two data files, and a log file. To replicate the models, run the analysisIII.do file; it contains:
    • replication of main analysis (Figure 4: leader fates; p 41)
    • replication of Rosenbaum sensitivity analysis (footnote 136 & 137)
    • replication of balance checks for various specifications (Table A.7 of Appendix E)
    • replication of leader propensity scores and matches (Tables A.8 and A.15–A.25)
    • replication of alternative specifications (Table A.9 of Appendix E)
    • replication of balance checks for various HI specifications (Table A.12 of Appendix G)
    • replication of alternative specification with HI (Table A.14 of Appendix G)

    Note that this file may take a very long time to run because of a total of 150,000 bootstraps. I’ve added a log for easy reference of outcomes. To check replication results it’s probably easiest to make sure the code works on your machine; then run the file with a log; and check the log when it’s finished running. Good luck!

  19. f

    All relevant data used in this work.

    • plos.figshare.com
    xlsx
    Updated Feb 20, 2025
    + more versions
    Cite
    Guangrong Deng; Ling Tang; Qian Yang; Zhengyong Li (2025). All relevant data used in this work. [Dataset]. http://doi.org/10.1371/journal.pone.0319199.s003
    Dataset updated
    Feb 20, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Guangrong Deng; Ling Tang; Qian Yang; Zhengyong Li
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Background and purpose: The ability of the abbreviated burn severity index (ABSI) to predict death among patients with severe burns remains unclear. This meta-analysis aimed to identify the association between the ABSI and mortality in severely burned patients.

    Methods: The PubMed, EMBASE and Web of Science databases were searched up to September 15, 2024. The odds ratios (ORs) with 95% confidence intervals (CIs) were combined, and a subgroup analysis was conducted on the basis of age, ABSI grouping method and OR source. All the statistical analyses were performed with STATA version 15.0.

    Results: Sixteen studies with 4011 cases were included in the analysis. The pooled results demonstrated that an elevated ABSI was significantly related to an increased risk of mortality (OR = 1.72, 95% CI: 1.48–2.00; P < 0.001). In addition, subgroup analysis by age (adult: OR = 1.35, P < 0.001; child: OR = 68.40, P < 0.001), ABSI grouping method (dichotomous: OR = 16.14, P < 0.001; continuous: OR = 1.59, P < 0.001) and OR source (univariate: OR = 11.42, P = 0.015; multivariate: OR = 1.51, P < 0.001) yielded similar results.

    Conclusion: The ABSI serves as a reliable prognostic indicator in severely burned patients, and patients with an elevated ABSI are at increased risk of death.

  20. Understanding Society: Calendar Year Dataset, 2022

    • beta.ukdataservice.ac.uk
    Updated 2024
    + more versions
    Cite
    University of Essex, Institute for Social and Economic Research (2024). Understanding Society: Calendar Year Dataset, 2022 [Dataset]. http://doi.org/10.5255/ukda-sn-9333-1
    Dataset updated
    2024
    Dataset provided by
    UK Data Service (https://ukdataservice.ac.uk/)
    datacite
    Authors
    University of Essex, Institute for Social and Economic Research
    Description

    Understanding Society (UK Household Longitudinal Study), which began in 2009, is conducted by the Institute for Social and Economic Research (ISER) at the University of Essex and the survey research organisations Verian Group (formerly Kantar Public) and NatCen. It builds on and incorporates the British Household Panel Survey (BHPS), which began in 1991.

    The Understanding Society: Calendar Year Dataset, 2022, is designed for analysts to conduct cross-sectional analysis for the 2022 calendar year. The Calendar Year datasets combine data collected in a specific year from across multiple waves and these are released as separate calendar year studies, with appropriate analysis weights, starting with the 2020 Calendar Year dataset. Each subsequent year, an additional yearly study is released.

    The Calendar Year data is designed to enable timely cross-sectional analysis of individuals and households in a calendar year. Such analysis can, however, only involve variables that are collected in every wave (excluding rotating content, which is only collected in some of the waves). Due to overlapping fieldwork, the data files combine data collected in the three waves that make up a calendar year. Analysis cannot be restricted to data collected in one wave during a calendar year, as this subset will not be representative of the population. Further details and guidance on this study can be found in the document 9333_main_survey_calendar_year_user_guide_2022.

    These calendar year datasets should be used for cross-sectional analysis only. For those interested in longitudinal analyses using Understanding Society please access the main survey datasets: End User Licence version or Special Licence version.

    Understanding Society: the UK Household Longitudinal Study started in 2009 with a general population sample (GPS) of around 26,000 private households of UK residents and an ethnic minority boost sample (EMBS) of 4,000 households. All members of these responding households and their descendants became part of the core sample, who were eligible to be interviewed every year. Anyone who joined these households after this initial wave was also interviewed, as long as they lived with these core sample members, to provide the household context. At each annual interview, some basic demographic information is collected about every household member, information about the household is collected from one household member, all household members aged 16 and over are eligible for adult interviews, household members aged 10-15 are eligible for youth interviews, and some information is collected about 0-9 year-olds from their parents or guardians. From 1991 until 2008/9 a similar survey, the British Household Panel Survey (BHPS), was fielded. The surviving members of this survey sample were incorporated into Understanding Society in 2010. In 2015, an immigrant and ethnic minority boost sample (IEMBS) of around 2,500 households was added. In 2022, a GPS boost sample (GPS2) of around 5,700 households was added. To know more about the sample design, following rules, interview modes, incentives, consent, and questionnaire content, please see the study overview and user guide.

    Co-funders

    In addition to the Economic and Social Research Council, co-funders for the study included the Department of Work and Pensions, the Department for Education, the Department for Transport, the Department of Culture, Media and Sport, the Department for Community and Local Government, the Department of Health, the Scottish Government, the Welsh Assembly Government, the Northern Ireland Executive, the Department of Environment and Rural Affairs, and the Food Standards Agency.

    End User Licence and Special Licence versions:

    There are two versions of the Calendar Year 2022 data. One is available under the standard End User Licence (EUL) agreement (SN 9333), and the other is a Special Licence (SL) version (SN 9334). The SL version contains month and year of birth variables instead of just age and more detailed country and occupation coding for a number of variables, and various income variables have not been top-coded (see document 9333_eul_vs_sl_variable_differences for more details). Users are advised first to obtain the standard EUL version of the data to see if they are sufficient for their research requirements. The SL data have more restrictive access conditions; prospective users of the SL version will need to complete an extra application form and demonstrate to the data owners exactly why they need access to the additional variables in order to get permission to use that version. The main longitudinal versions of the Understanding Society study may be found under SNs 6614 (EUL) and 6931 (SL).

    Low- and Medium-level geographical identifiers produced for the mainstage longitudinal dataset can be used with this Calendar Year 2022 dataset, subject to SL access conditions. See the User Guide for further details.

    Suitable data analysis software

    These data are provided by the depositor in Stata format. Users are strongly advised to analyse them in Stata. Transfer to other formats may result in unforeseen issues. Stata SE or MP software is needed to analyse the larger files, which contain about 1,800 variables.

Cite
Sarah Price (2022). Example Stata syntax and data construction for negative binomial time series regression [Dataset]. http://doi.org/10.17632/3mj526hgzx.2

Example Stata syntax and data construction for negative binomial time series regression

Dataset updated
Nov 2, 2022
Authors
Sarah Price
License

Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically

Description

We include Stata syntax (dummy_dataset_create.do) that creates a panel dataset for negative binomial time series regression analyses, as described in our paper "Examining methodology to identify patterns of consulting in primary care for different groups of patients before a diagnosis of cancer: an exemplar applied to oesophagogastric cancer". We also include a sample dataset for clarity (dummy_dataset.dta), and a sample of that data in a spreadsheet (Appendix 2).

The variables contained therein are defined as follows:

case: binary variable for case or control status (takes a value of 0 for controls and 1 for cases).

patid: a unique patient identifier.

time_period: a count variable denoting the time period. In this example, 0 denotes 10 months before diagnosis with cancer, and 9 denotes the month of diagnosis with cancer.

ncons: number of consultations per month.

period0 to period9: 10 unique inflection point variables (one for each month before diagnosis). These are used to test which aggregation period includes the inflection point.

burden: binary variable denoting membership of one of two multimorbidity burden groups.

We also include two Stata do-files for analysing the consultation rate, stratified by burden group, using the maximum likelihood method (1_menbregpaper.do and 2_menbregpaper_bs.do).

Note: In this example, for demonstration purposes we create a dataset for 10 months leading up to diagnosis. In the paper, we analyse 24 months before diagnosis. Here, we study consultation rates over time, but the method could be used to study any countable event, such as number of prescriptions.
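
The panel layout described above can be sketched in a few lines. The following is a minimal Python illustration, not the authors' dummy_dataset_create.do, which is not reproduced here. It makes several assumptions: the function name build_dummy_panel is hypothetical, the consultation counts are random placeholders rather than real data, and each periodK variable is taken to be an indicator equal to 1 from time period K onwards. The description does not spell out how the inflection point variables are coded, so that last choice in particular is only a guess.

```python
import random


def build_dummy_panel(n_patients=4, n_periods=10, seed=0):
    """Return one row (a dict) per patient per time period.

    Hypothetical helper for illustration only; it is not part of the
    deposited Stata files.
    """
    rng = random.Random(seed)
    rows = []
    for patid in range(1, n_patients + 1):
        case = patid % 2           # 1 = case, 0 = control (arbitrary assignment)
        burden = (patid // 2) % 2  # one of two multimorbidity burden groups
        for t in range(n_periods):
            row = {
                "patid": patid,
                "case": case,
                "burden": burden,
                "time_period": t,            # 0 = earliest month, 9 = month of diagnosis
                "ncons": rng.randint(0, 5),  # placeholder consultation count
            }
            # ASSUMPTION: periodK is 1 from time period K onwards, else 0.
            for k in range(n_periods):
                row[f"period{k}"] = 1 if t >= k else 0
            rows.append(row)
    return rows


panel = build_dummy_panel()
```

Each patient contributes ten rows, one per month, which is the long format that a count regression on ncons would typically expect.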
