Stata “do-file” containing the code used to run the analyses in the manuscript using the data from “S1 Dataset.”
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We include Stata syntax (dummy_dataset_create.do) that creates a panel dataset for negative binomial time series regression analyses, as described in our paper "Examining methodology to identify patterns of consulting in primary care for different groups of patients before a diagnosis of cancer: an exemplar applied to oesophagogastric cancer". We also include a sample dataset for clarity (dummy_dataset.dta), and a sample of that data in a spreadsheet (Appendix 2).
The variables contained therein are defined as follows:
case: binary variable for case or control status (takes a value of 0 for controls and 1 for cases).
patid: a unique patient identifier.
time_period: a count variable denoting the time period. In this example, 0 denotes 10 months before diagnosis with cancer, and 9 denotes the month of diagnosis with cancer.
ncons: number of consultations per month.
period0 to period9: 10 unique inflection point variables (one for each month before diagnosis). These are used to test which aggregation period includes the inflection point.
burden: binary variable denoting membership of one of two multimorbidity burden groups.
We also include two Stata do-files for analysing the consultation rate, stratified by burden group, using the maximum likelihood method (1_menbregpaper.do and 2_menbregpaper_bs.do).
Note: In this example, for demonstration purposes we create a dataset for 10 months leading up to diagnosis. In the paper, we analyse 24 months before diagnosis. Here, we study consultation rates over time, but the method could be used to study any countable event, such as number of prescriptions.
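The panel layout described above (one row per patient per time period) can be sketched in Python as follows. This is only an illustration of the data shape, not a port of dummy_dataset_create.do: the function name, the number of patients, and the way ncons is drawn are assumptions. The period0 to period9 variables are left out because their exact coding is not spelled out here.

```python
import random


def make_dummy_panel(n_patients=6, n_periods=10, seed=0):
    """Return a list of row dicts, one row per patient per time period."""
    rng = random.Random(seed)
    rows = []
    for patid in range(1, n_patients + 1):
        case = patid % 2            # alternate controls (0) and cases (1)
        burden = rng.randint(0, 1)  # one of two multimorbidity burden groups
        for time_period in range(n_periods):
            # Illustrative counts only (assumption): cases may consult a
            # little more often in later periods. Not the paper's model.
            ceiling = 2 + (case * time_period) // 3
            rows.append({
                "patid": patid,
                "case": case,
                "burden": burden,
                "time_period": time_period,
                "ncons": rng.randint(0, ceiling),
            })
    return rows


panel = make_dummy_panel()
```

Each patient contributes exactly ten rows (time_period 0 to 9), which is the long format a panel count regression expects.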
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Revised Stata do-file and dataset prepared for journal article resubmission.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset is a Stata .dta file containing the analysis data for the study on eliciting preferences for place of delivery among women in peri-urban settings in Kenya.
General information: The data sets contain information on how often materials of studies available through GESIS: Data Archive for the Social Sciences were downloaded and/or ordered through one of the archive's platforms/services between 2004 and 2017.
Sources and platforms: Study materials are accessible through various GESIS platforms and services: Data Catalogue (DBK), histat, datorium, data service (and others).
Years available:
- Data Catalogue: 2012-2017
- data service: 2006-2017
- datorium: 2014-2017
- histat: 2004-2017
Data sets: Data set ZA6899_Datasets_only_all_sources contains information on how often data files such as those with dta- (Stata) or sav- (SPSS) extension have been downloaded. Identification of data files is handled semi-automatically (depending on the platform/service). Multiple downloads of one file by the same user (identified through IP address or username for registered users) on the same day are only counted as one download.
Data set ZA6899_Doc_and_Data_all_sources contains information on how often study materials have been downloaded. Multiple downloads of any file of the same study by the same user (identified through IP address or username for registered users) on the same day are only counted as one download.
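The counting rule described above (repeat downloads by the same user on the same day count once) can be sketched in Python. This is a minimal illustration, not the archive's actual procedure: the function name, the event tuple layout, and the sample file names are assumptions.

```python
from collections import Counter


def count_downloads(events):
    """Count downloads per file, collapsing same-user, same-day repeats.

    events: iterable of (user, file_id, day) tuples, where user is an
    IP address or a registered username (hypothetical layout).
    """
    unique = {(user, file_id, day) for user, file_id, day in events}
    return Counter(file_id for _user, file_id, _day in unique)


events = [
    ("10.0.0.1", "ZA0001.dta", "2017-03-01"),
    ("10.0.0.1", "ZA0001.dta", "2017-03-01"),  # repeat: same user, same day
    ("10.0.0.1", "ZA0001.dta", "2017-03-02"),  # same user, next day
    ("alice", "ZA0001.dta", "2017-03-01"),     # different user
    ("alice", "ZA0002.sav", "2017-03-01"),
]
counts = count_downloads(events)
```

For the study-level data set, the same idea would apply with a study number in place of file_id, so that any file of a study downloaded by the same user on the same day counts once.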
Both data sets are available in three formats: csv (quoted, semicolon-separated), dta (Stata v13, labeled) and sav (SPSS, labeled). All formats contain identical information.
Variables: Variables/columns in both data sets are identical.
za_nr: 'Archive study number'
version: 'GESIS Archiv Version'
doi: 'Digital Object Identifier'
StudyNo: 'Study number of respective study'
Title: 'English study title'
Title_DE: 'German study title'
Access: 'Access category (0, A, B, C, D, E)'
PubYear: 'Publication year of last version of the study'
inZACAT: 'Study is currently also available via ZACAT'
inHISTAT: 'Study is currently also available via HISTAT'
inDownloads: 'There are currently data files available for download for this study in DBK or datorium'
Total: 'All downloads combined'
downloads_2004: 'downloads/orders from all sources combined in 2004'
[up to ...]
downloads_2017: 'downloads/orders from all sources combined in 2017'
d_2004_dbk: 'downloads from source dbk in 2004'
[up to ...]
d_2017_dbk: 'downloads from source dbk in 2017'
d_2004_histat: 'downloads from source histat in 2004'
[up to ...]
d_2017_histat: 'downloads from source histat in 2017'
d_2004_dataservice: 'downloads/orders from source dataservice in 2004'
[up to ...]
d_2017_dataservice: 'downloads/orders from source dataservice in 2017'
More information is available within the codebook.
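Since the csv versions are described as quoted and semicolon-separated, a minimal Python sketch of reading such a file may help. The sample row below is invented for illustration; only the column names come from the variable list above, and the codebook remains the authority on how Total relates to the yearly columns.

```python
import csv
import io

# Invented two-line sample in the stated format: quoted fields, ";" separator.
sample = (
    '"za_nr";"Title";"Total";"downloads_2016";"downloads_2017"\n'
    '"ZA0001";"Example study; first wave";"12";"5";"7"\n'
)

reader = csv.DictReader(io.StringIO(sample), delimiter=";", quotechar='"')
rows = list(reader)

# Sum the yearly columns present in this sample row.
yearly = sum(
    int(value) for name, value in rows[0].items()
    if name.startswith("downloads_")
)
```

Quoting matters here: the title contains a semicolon, and it survives as one field only because the reader honours the quote character.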
This package contains two files designed to help read individual-level DHS data into Stata. The first file addresses the problem that versions of Stata before Version 7/SE will read in only up to 2047 variables, and most of the individual files have more variables than that. The file will read in the .do, .dct and .dat files and output new .do and .dct files with only a subset of the variables specified by the user. The second file deals with earlier DHS surveys in which .do and .dct files do not exist and only .sps and .sas files are provided. The file will read in the .sas and .sps files and output a .dct and .do file. If necessary the first file can then be run again to select a subset of variables.
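The idea of keeping only a user-chosen subset of variables from a fixed-width data file can be sketched in Python. This does not reproduce the package's Stata files or the real .dct syntax: the function name, the variable names, and the column positions are all hypothetical.

```python
def read_fixed_width(lines, layout, keep):
    """Extract only the variables in `keep` from fixed-width records.

    layout: {name: (start, width)} with 1-based start columns, as a
    dictionary file would describe them (hypothetical layout).
    """
    wanted = [(name, *layout[name]) for name in keep]
    records = []
    for line in lines:
        records.append({
            name: line[start - 1:start - 1 + width].strip()
            for name, start, width in wanted
        })
    return records


# Made-up layout and data: four variables, of which two are kept.
layout = {"caseid": (1, 6), "age": (7, 2), "region": (9, 1), "educ": (10, 1)}
data = ["  10012512", "  10023420"]
subset = read_fixed_width(data, layout, keep=["caseid", "age"])
```

Selecting columns before loading is what lets a file with more variables than the software limit be read at all.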
analyze the current population survey (cps) annual social and economic supplement (asec) with r
the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population. the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.
this new github repository contains three scripts:

2005-2012 asec - download all microdata.R
- download the fixed-width file containing household, family, and person records
- import by separating this file into three tables, then merge 'em together at the person-level
- download the fixed-width file containing the person-level replicate weights
- merge the rectangular person-level file with the replicate weights, then store it in a sql database
- create a new variable - one - in the data table

2012 asec - analysis examples.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- perform a boatload of analysis examples

replicate census estimates - 2011.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- match the sas output shown in the png file below

2011 asec replicate weight sas output.png
- statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

click here to view these three scripts

for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
- the census bureau's current population survey page
- the bureau of labor statistics' current population survey page
- the current population survey's wikipedia article

notes:
interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name.
as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
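the "merge 'em together at the person-level" step and the subtract-a-year note above can be sketched like this. it's python rather than the r used by the scripts, and it is not their code: the table layouts, key names, and values are made up for illustration.

```python
# hypothetical household, family, and person tables keyed by made-up ids
households = {"h1": {"state": "NY"}, "h2": {"state": "TX"}}
families = {
    ("h1", 1): {"family_income": 52000},
    ("h2", 1): {"family_income": 31000},
}
persons = [
    {"hh_id": "h1", "fam_id": 1, "person_id": 1, "age": 40},
    {"hh_id": "h1", "fam_id": 1, "person_id": 2, "age": 38},
    {"hh_id": "h2", "fam_id": 1, "person_id": 1, "age": 67},
]


def rectangularize(persons, families, households):
    """attach family- and household-level fields to every person record."""
    rows = []
    for person in persons:
        row = dict(person)
        row.update(families[(person["hh_id"], person["fam_id"])])
        row.update(households[person["hh_id"]])
        rows.append(row)
    return rows


def reference_year(file_year):
    """the file labeled with year Y describes experiences during Y - 1."""
    return file_year - 1


rectangular = rectangularize(persons, families, households)
```

the result has exactly one row per person, which is what "rectangular" means here.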
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supplementary online material (S2 File) containing the dataset (Stata format) used in the Justice Nyakura study (VL uptake among PBF on ART in Mazowe, Zimbabwe, 2017).
Stata file with all variables used in the logistic regression models (DTA format).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file contains the data set (in Stata format) and do-file associated with the paper entitled "Risk misperceptions of structured financial products with worst-of payout characteristics revisited".
Stata file with sequential serological data, wide format (DTA format).
The Uniform Appraisal Dataset (UAD) Appraisal-Level Public Use File (PUF) is the nation’s first publicly available appraisal-level dataset of appraisal records, giving the public new access to a selected set of data fields found in appraisal reports. The UAD Appraisal-Level PUF is based on a five percent nationally representative random sample of appraisals for single-family mortgages acquired by the Enterprises. The current release includes appraisals from 2013 through 2021. The UAD Appraisal-Level PUF is a resource for users capable of using statistical software to extract and analyze data. Users can download annual or combined files in CSV, R, SAS and Stata formats. All files are zipped for ease of download.
The BOP-HH Scientific Use File 202401 Version 01 data set continues the BOPSOCE Scientific Use File Version 1.0. It consists of the Stata files bophh_suf_202401_v02_wave01.dta to bophh_suf_202401_v02_wave48.dta. For more details, see the BOP-HH documentation on the website of the Deutsche Bundesbank. Self-administered questionnaire: web-based (Internet-based survey). Individuals in Germany aged 16 or older.
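For readers who want to loop over the wave files, the file names can be generated from the pattern quoted above. This is a small sketch that assumes the waves are numbered consecutively from 01 to 48 with two-digit zero padding, as the first and last file names suggest.

```python
# Build the 48 wave file names from the naming pattern given in the text.
wave_files = [
    f"bophh_suf_202401_v02_wave{wave:02d}.dta" for wave in range(1, 49)
]
```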
Stata file containing all data underlying the findings described in this study. (DTA)
Since the data utilized are protected by copyright, I provide a pseudo-dataset, including variable definitions. The commands used are summarized in the Stata do-file.
http://rdm.uva.nl/en/support/confidential-data.html
This project contains the data of a study on CEO life events, marriage and parenthood in particular, and their effects on firm growth. The dataset contains data on S&P100 firms and their CEOs (2003-2013). Sources include common databases such as Compustat, Execucomp, MSCI ESG, and Boardex. Data on the private life events were collected using news articles, company websites, and biographical websites such as Referenceforbusiness, Britannica, Notablebiographies, Marquis Who’s Who in Finance and Industry, and the Notable Names Database.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Canada Trademarks Dataset
18 Journal of Empirical Legal Studies 908 (2021), prepublication draft available at https://papers.ssrn.com/abstract=3782655, published version available at https://onlinelibrary.wiley.com/share/author/CHG3HC6GTFMMRU8UJFRR?target=10.1111/jels.12303
Dataset Selection and Arrangement (c) 2021 Jeremy Sheff
Python and Stata Scripts (c) 2021 Jeremy Sheff
Contains data licensed by Her Majesty the Queen in right of Canada, as represented by the Minister of Industry, the minister responsible for the administration of the Canadian Intellectual Property Office.
This individual-application-level dataset includes records of all applications for registered trademarks in Canada since approximately 1980, and of many preserved applications and registrations dating back to the beginning of Canada’s trademark registry in 1865, totaling over 1.6 million application records. It includes comprehensive bibliographic and lifecycle data; trademark characteristics; goods and services claims; identification of applicants, attorneys, and other interested parties (including address data); detailed prosecution history event data; and data on application, registration, and use claims in countries other than Canada. The dataset has been constructed from public records made available by the Canadian Intellectual Property Office. Both the dataset and the code used to build and analyze it are presented for public use on open-access terms.
Scripts are licensed for reuse subject to the Creative Commons Attribution License 4.0 (CC-BY-4.0), https://creativecommons.org/licenses/by/4.0/. Data files are licensed for reuse subject to the Creative Commons Attribution License 4.0 (CC-BY-4.0), https://creativecommons.org/licenses/by/4.0/, and also subject to additional conditions imposed by the Canadian Intellectual Property Office (CIPO) as described below.
Terms of Use:
As per the terms of use of CIPO's government data, all users are required to include the above-quoted attribution to CIPO in any reproductions of this dataset. They are further required to cease using any record within the datasets that has been modified by CIPO and for which CIPO has issued a notice on its website in accordance with its Terms and Conditions, and to use the datasets in compliance with applicable laws. These requirements are in addition to the terms of the CC-BY-4.0 license, which require attribution to the author (among other terms). For further information on CIPO’s terms and conditions, see https://www.ic.gc.ca/eic/site/cipointernet-internetopic.nsf/eng/wr01935.html. For further information on the CC-BY-4.0 license, see https://creativecommons.org/licenses/by/4.0/.
The following attribution statement, if included by users of this dataset, is satisfactory to the author, but the author makes no representations as to whether it may be satisfactory to CIPO:
The Canada Trademarks Dataset is (c) 2021 by Jeremy Sheff and licensed under a CC-BY-4.0 license, subject to additional terms imposed by the Canadian Intellectual Property Office. It contains data licensed by Her Majesty the Queen in right of Canada, as represented by the Minister of Industry, the minister responsible for the administration of the Canadian Intellectual Property Office. For further information, see https://creativecommons.org/licenses/by/4.0/ and https://www.ic.gc.ca/eic/site/cipointernet-internetopic.nsf/eng/wr01935.html.
Details of Repository Contents:
This repository includes a number of .zip archives which expand into folders containing either scripts for construction and analysis of the dataset or data files comprising the dataset itself. These folders are as follows:
If users wish to construct rather than download the datafiles, the first script that they should run is /py/sftp_secure.py. This script will prompt the user to enter their IP Horizons SFTP credentials; these can be obtained by registering with CIPO at https://ised-isde.survey-sondage.ca/f/s.aspx?s=59f3b3a4-2fb5-49a4-b064-645a5e3a752d&lang=EN&ds=SFTP. The script will also prompt the user to identify a target directory for the data downloads. Because the data archives are quite large, users are advised to create a target directory in advance and ensure they have at least 70GB of available storage on the media in which the directory is located.
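Before starting the download, it may be worth checking that the target directory has enough free space. The following is a minimal sketch and is not part of sftp_secure.py: the function name is hypothetical, and the 70GB figure from the text is read here as decimal gigabytes.

```python
import shutil
from pathlib import Path

# "at least 70GB" from the text, interpreted as decimal gigabytes (assumption).
REQUIRED_BYTES = 70 * 10**9


def has_room(target_dir, required=REQUIRED_BYTES):
    """Create target_dir if needed and report whether it has enough free space."""
    path = Path(target_dir)
    path.mkdir(parents=True, exist_ok=True)
    return shutil.disk_usage(path).free >= required
```

A call such as `has_room("/data/cipo")` (path chosen for illustration) returns True only if the volume holding that directory currently has the required space free.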
The sftp_secure.py script will generate a new subfolder in the user’s target directory called /XML_raw. Users should note the full path of this directory, which they will be prompted to provide when running the remaining python scripts. Each of the remaining scripts, the filenames of which begin with “iterparse”, corresponds to one of the data files in the dataset, as indicated in the script’s filename. After running one of these scripts, the user’s target directory should include a /csv subdirectory containing the data file corresponding to the script; after running all the iterparse scripts the user’s /csv directory should be identical to the /csv directory in this repository. Users are invited to modify these scripts as they see fit, subject to the terms of the licenses set forth above.
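The general technique the "iterparse" scripts are named after, streaming through a large XML file and writing one CSV row per record, can be sketched as follows. This is not the repository's code: the function name, the element tags, and the field names are hypothetical and do not reflect CIPO's actual XML schema.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_to_csv(xml_source, csv_out, record_tag, fields):
    """Stream records out of an XML source into CSV without loading it whole."""
    writer = csv.writer(csv_out)
    writer.writerow(fields)
    n_records = 0
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == record_tag:
            # Missing child elements become empty cells.
            writer.writerow([elem.findtext(f, default="") for f in fields])
            elem.clear()  # release the record's children to limit memory use
            n_records += 1
    return n_records


# Tiny in-memory example with made-up tags.
sample = io.BytesIO(
    b"<Marks>"
    b"<Mark><AppNo>100</AppNo><Status>Registered</Status></Mark>"
    b"<Mark><AppNo>101</AppNo></Mark>"
    b"</Marks>"
)
out = io.StringIO()
n_rows = xml_to_csv(sample, out, "Mark", ["AppNo", "Status"])
```

Processing on the "end" event and clearing each record afterwards is what keeps memory use modest when the archives run to many gigabytes.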
With respect to the Stata do-files, only one of them is relevant to construction of the dataset itself. This is /do/CA_TM_csv_cleanup.do, which converts the .csv versions of the data files to .dta format, and uses Stata’s labeling functionality to reduce the size of the resulting files while preserving information. The other do-files generate the analyses and graphics presented in the paper describing the dataset (Jeremy N. Sheff, The Canada Trademarks Dataset, 18 J. Empirical Leg. Studies (forthcoming 2021), available at https://papers.ssrn.com/abstract=3782655). These do-files are also licensed for reuse subject to the terms of the CC-BY-4.0 license, and users are invited to adapt the scripts to their needs.
The python and Stata scripts included in this repository are separately maintained and updated on Github at https://github.com/jnsheff/CanadaTM.
This repository also includes a copy of the current version of CIPO's data dictionary for its historical XML trademarks archive as of the date of construction of this dataset.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
September 1, 2016
REPLICATION FILES FOR «THE IMPACT OF STATE TELEVISION ON VOTER TURNOUT», TO BE PUBLISHED BY THE BRITISH JOURNAL OF POLITICAL SCIENCE
The replication files consist of two datasets and corresponding Stata do-files. Please note the following:
1. The data used in the current microanalysis are based on the National Election Surveys of 1965, 1969, and 1973. The Institute of Social Research (ISF) was responsible for the original studies, and data was made available by the NSD (Norwegian Center for Research Data). Neither ISF nor NSD is responsible for the analyses/interpretations of the data presented here.
2. Some of the data used in the municipality-level analyses are taken from NSD’s local government database (“Kommunedatabasen”). The NSD is not responsible for the analysis presented here or the interpretation offered in the BJPS paper.
3. Note that the municipality identification has been anonymized to avoid identification of individual respondents.
4. Most of the analyses generate Word files produced by the outreg2 facility in Stata. These tables can be compared with those presented in the paper. The graphs are directly comparable to those in the paper. In a few cases, the results are only generated in the Stata output window.
The paper employs two sets of data:
I. Municipal-level data in Stata format (AggregateReplicationTVData.dta), with a corresponding dataset of map coordinates (muncoord.dta). The Stata code is in a do-file (ReplicationOfAggregateAnalysis.do).
II. The survey data are in a Stata file (ReplicationofIndividualLevelPanel.dta) with a corresponding do-file (ReplicationOfIndividualLevelAnalysis 25.08.2016.do).
Please remember to change the file reference (i.e. the use statement) to execute the do-files.
Data format: Stata. Compatible with Stata SE 13 or more recent versions.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This package includes the Stata do-file and the main Stata datasets used to generate tables (Table 1, Table 3, Table 4, Table A1, and Table 7) for the article titled "The Geography of Investor Attention". Due to data restrictions and the use of multiple datasets in the paper, we provide a subsample of the main dataset, with pseudo firm identifiers, to aid in understanding both the code's structure and the main dataset employed.