7 datasets found
  1. The Canada Trademarks Dataset

    • zenodo.org
    pdf, zip
    Updated Jul 19, 2024
    Cite
    Jeremy Sheff (2024). The Canada Trademarks Dataset [Dataset]. http://doi.org/10.5281/zenodo.4999655
    Available download formats
    zip, pdf
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jeremy Sheff
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Canada Trademarks Dataset

    18 Journal of Empirical Legal Studies 908 (2021), prepublication draft available at https://papers.ssrn.com/abstract=3782655, published version available at https://onlinelibrary.wiley.com/share/author/CHG3HC6GTFMMRU8UJFRR?target=10.1111/jels.12303

    Dataset Selection and Arrangement (c) 2021 Jeremy Sheff

    Python and Stata Scripts (c) 2021 Jeremy Sheff

    Contains data licensed by Her Majesty the Queen in right of Canada, as represented by the Minister of Industry, the minister responsible for the administration of the Canadian Intellectual Property Office.

    This individual-application-level dataset includes records of all applications for registered trademarks in Canada since approximately 1980, and of many preserved applications and registrations dating back to the beginning of Canada’s trademark registry in 1865, totaling over 1.6 million application records. It includes comprehensive bibliographic and lifecycle data; trademark characteristics; goods and services claims; identification of applicants, attorneys, and other interested parties (including address data); detailed prosecution history event data; and data on application, registration, and use claims in countries other than Canada. The dataset has been constructed from public records made available by the Canadian Intellectual Property Office. Both the dataset and the code used to build and analyze it are presented for public use on open-access terms.

    Scripts are licensed for reuse subject to the Creative Commons Attribution License 4.0 (CC-BY-4.0), https://creativecommons.org/licenses/by/4.0/. Data files are licensed for reuse subject to the Creative Commons Attribution License 4.0 (CC-BY-4.0), https://creativecommons.org/licenses/by/4.0/, and also subject to additional conditions imposed by the Canadian Intellectual Property Office (CIPO) as described below.

    Terms of Use:

    As per the terms of use of CIPO's government data, all users are required to include the above-quoted attribution to CIPO in any reproductions of this dataset. They are further required to cease using any record within the datasets that has been modified by CIPO and for which CIPO has issued a notice on its website in accordance with its Terms and Conditions, and to use the datasets in compliance with applicable laws. These requirements are in addition to the terms of the CC-BY-4.0 license, which require attribution to the author (among other terms). For further information on CIPO’s terms and conditions, see https://www.ic.gc.ca/eic/site/cipointernet-internetopic.nsf/eng/wr01935.html. For further information on the CC-BY-4.0 license, see https://creativecommons.org/licenses/by/4.0/.

    The following attribution statement, if included by users of this dataset, is satisfactory to the author, but the author makes no representations as to whether it may be satisfactory to CIPO:

    The Canada Trademarks Dataset is (c) 2021 by Jeremy Sheff and licensed under a CC-BY-4.0 license, subject to additional terms imposed by the Canadian Intellectual Property Office. It contains data licensed by Her Majesty the Queen in right of Canada, as represented by the Minister of Industry, the minister responsible for the administration of the Canadian Intellectual Property Office. For further information, see https://creativecommons.org/licenses/by/4.0/ and https://www.ic.gc.ca/eic/site/cipointernet-internetopic.nsf/eng/wr01935.html.

    Details of Repository Contents:

    This repository includes a number of .zip archives which expand into folders containing either scripts for construction and analysis of the dataset or data files comprising the dataset itself. These folders are as follows:

    • /csv: contains the .csv versions of the data files
    • /do: contains Stata do-files used to convert the .csv files to .dta format and perform the statistical analyses set forth in the paper reporting this dataset
    • /dta: contains the .dta versions of the data files
    • /py: contains the python scripts used to download CIPO’s historical trademarks data via SFTP and generate the .csv data files

    If users wish to construct rather than download the datafiles, the first script that they should run is /py/sftp_secure.py. This script will prompt the user to enter their IP Horizons SFTP credentials; these can be obtained by registering with CIPO at https://ised-isde.survey-sondage.ca/f/s.aspx?s=59f3b3a4-2fb5-49a4-b064-645a5e3a752d&lang=EN&ds=SFTP. The script will also prompt the user to identify a target directory for the data downloads. Because the data archives are quite large, users are advised to create a target directory in advance and ensure they have at least 70GB of available storage on the media in which the directory is located.
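
    The storage advice above can be checked before starting a download. The following is a minimal sketch, not part of the repository's scripts: the function name `has_enough_space` is hypothetical, and the 70GB threshold is simply the figure quoted in the description.

```python
import shutil

REQUIRED_GB = 70  # figure quoted in the dataset description


def has_enough_space(target_dir: str, required_gb: float = REQUIRED_GB) -> bool:
    """Return True if the volume holding target_dir has at least required_gb free."""
    free_bytes = shutil.disk_usage(target_dir).free
    return free_bytes >= required_gb * 1024**3


if __name__ == "__main__":
    # "." stands in for the target directory the user created in advance.
    if has_enough_space("."):
        print("enough free space for the download")
    else:
        print("free up space before running the download script")
```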

    The sftp_secure.py script will generate a new subfolder in the user’s target directory called /XML_raw. Users should note the full path of this directory, which they will be prompted to provide when running the remaining python scripts. Each of the remaining scripts, the filenames of which begin with “iterparse”, corresponds to one of the data files in the dataset, as indicated in the script’s filename. After running one of these scripts, the user’s target directory should include a /csv subdirectory containing the data file corresponding to the script; after running all the iterparse scripts the user’s /csv directory should be identical to the /csv directory in this repository. Users are invited to modify these scripts as they see fit, subject to the terms of the licenses set forth above.
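
    The script names suggest a streaming XML-to-CSV conversion. The sketch below illustrates that general technique with Python's standard library only; it is not the repository's code. The tag name `Application`, the field names, and the helper names are all hypothetical; CIPO's real schema is documented in its data dictionary.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical tag and field names, chosen only for illustration.
RECORD_TAG = "Application"
FIELDS = ["ApplicationNumber", "FilingDate"]


def xml_to_rows(xml_source, record_tag=RECORD_TAG, fields=FIELDS):
    """Yield one dict per record element, streaming through the XML."""
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == record_tag:
            yield {f: (elem.findtext(f) or "") for f in fields}
            elem.clear()  # drop the parsed subtree to limit memory use


def write_csv(rows, out_file, fields=FIELDS):
    """Write rows to out_file as CSV and return the number of rows written."""
    writer = csv.DictWriter(out_file, fieldnames=fields)
    writer.writeheader()
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Tiny made-up document standing in for one raw XML file.
SAMPLE = (
    b"<Root><Application><ApplicationNumber>1</ApplicationNumber>"
    b"<FilingDate>1990-01-01</FilingDate></Application></Root>"
)

if __name__ == "__main__":
    buffer = io.StringIO()
    write_csv(xml_to_rows(io.BytesIO(SAMPLE)), buffer)
    print(buffer.getvalue())
```

    Processing each record on its "end" event and clearing it afterwards is what lets this approach handle archives far larger than available memory.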

    With respect to the Stata do-files, only one of them is relevant to construction of the dataset itself. This is /do/CA_TM_csv_cleanup.do, which converts the .csv versions of the data files to .dta format, and uses Stata’s labeling functionality to reduce the size of the resulting files while preserving information. The other do-files generate the analyses and graphics presented in the paper describing the dataset (Jeremy N. Sheff, The Canada Trademarks Dataset, 18 J. Empirical Leg. Studies (forthcoming 2021), available at https://papers.ssrn.com/abstract=3782655). These do-files are also licensed for reuse subject to the terms of the CC-BY-4.0 license, and users are invited to adapt the scripts to their needs.

    The python and Stata scripts included in this repository are separately maintained and updated on Github at https://github.com/jnsheff/CanadaTM.

    This repository also includes a copy of the current version of CIPO's data dictionary for its historical XML trademarks archive as of the date of construction of this dataset.

  2. Repeated information of benefits reduce COVID-19 vaccination hesitancy:...

    • data.niaid.nih.gov
    Updated Jun 17, 2022
    + more versions
    Cite
    Steimanis, Ivo (2022). Repeated information of benefits reduce COVID-19 vaccination hesitancy: Experimental evidence from Germany [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6242619
    Dataset updated
    Jun 17, 2022
    Dataset provided by
    Steimanis, Ivo
    Mayer, Matthias
    Burger, Max
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Germany
    Description

    This replication package contains the raw data and code to replicate the findings reported in the paper. The data are licensed under a Creative Commons Attribution 4.0 International Public License. The code is licensed under a Modified BSD License. See LICENSE.txt for details.

    Software requirements

    All analyses were done in Stata version 16.

    Add-on packages are included in scripts/libraries/stata and do not need to be installed by user. The names, installation sources, and installation dates of these packages are available in scripts/libraries/stata/stata.trk.

    Instructions

    Save the folder ‘replication_PLOS’ to your local drive.

    Open the master script ‘run.do’ and change the global pointing to the working directory (line 20) to the location where you saved the folder on your local drive.

    Run the master script ‘run.do’ to replicate the analysis and generate all tables and figures reported in the paper and supplementary online materials.

    Datasets

    Wave 1 – Survey experiment: ‘wave1_survey_experiment_raw.dta’

    Wave 2 – Follow-up Survey: ‘wave2_follow_up_raw.dta’

    Map: shape-files ‘plz2stellig.shp’ ‘OSM_PLZ.shp’, area codes ‘Postleitzahlengebiete-_OSM.csv’ (all links to the sources can be found in the script ‘04_figure2_germany_map.do’)

    Pretest: ‘pre-test_corona_raw.dta’

    For Appendix S7: ‘alter_geschlecht_zensus_det.xlsx’, ‘vaccination_landkreis_raw.dta’, ‘census2020_age_gender.csv’ (all links to the sources can be found in the script ‘06_AppendixS7.do’)

    For Appendix S10: ‘vaccination_landkreis_raw.dta’ (all links to the sources can be found in the script ‘07_AppendixS10.do’)

    Descriptions of scripts

    • 1_1_clean_wave1.do: This script processes the raw data from wave 1, the survey experiment.
    • 1_2_clean_wave2.do: This script processes the raw data from wave 2, the follow-up survey.
    • 1_3_merge_generate.do: This script creates the datasets used in the main analysis and for robustness checks by merging the cleaned data from wave 1 and 2, tests the exclusion criteria and creates additional variables.
    • 02_analysis.do: This script estimates regression models in Stata, creates figures and tables, saving them to results/figures and results/tables.
    • 03_robustness_checks_no_exclusion.do: This script runs the main analysis using the dataset without applying the exclusion criteria. Results are saved in results/tables.
    • 04_figure2_germany_map.do: This script creates Figure 2 in the main manuscript using publicly available data on vaccination numbers in Germany.
    • 05_figureS1_dogmatism_scale.do: This script creates Figure S1 using data from a pretest to adjust the dogmatism scale.
    • 06_AppendixS7.do: This script creates the figures and tables provided in Appendix S7 on the representativity of our sample compared to the German average using publicly available data about the age distribution in Germany.
    • 07_AppendixS10.do: This script creates the figures and tables provided in Appendix S10 on the external validity of vaccination rates in our sample using publicly available data on vaccination numbers in Germany.

  3. Dataset for Hepmarc: a 96 week randomised controlled feasibility trial of...

    • sussex.figshare.com
    • datasetcatalog.nlm.nih.gov
    bin
    Updated Aug 1, 2023
    Cite
    Daniel Bradshaw; Iga Abramowicz; Stephen Bremner; Sumita Verma; Yvonne Gilleece; Sarah Kirk; Mark Nelson; Rosalie Housman; Helena Miras; Chloe Orkin; Ashini Fox; Michael Curnock; Louise Jennings; Mark Gompels; Emily Clarke; Rachel Robinson; Pauline Lambert; David Chadwick; Nicky Perry (2023). Dataset for Hepmarc: a 96 week randomised controlled feasibility trial of add-on maraviroc in people with HIV and non-alcoholic fatty liver disease [Dataset]. http://doi.org/10.25377/sussex.22815686.v1
    Available download formats
    bin
    Dataset updated
    Aug 1, 2023
    Dataset provided by
    University of Sussex
    Authors
    Daniel Bradshaw; Iga Abramowicz; Stephen Bremner; Sumita Verma; Yvonne Gilleece; Sarah Kirk; Mark Nelson; Rosalie Housman; Helena Miras; Chloe Orkin; Ashini Fox; Michael Curnock; Louise Jennings; Mark Gompels; Emily Clarke; Rachel Robinson; Pauline Lambert; David Chadwick; Nicky Perry
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data for the paper published in PLOS ONE on 14.07.2023. These files were used for the statistical analysis of the Hepmarc feasibility trial using Stata software version 17. They are listed below, in Stata format and .csv format as appropriate. The .do file is a simple text file.

    • hepmarc_data minimum dataset: .csv, .dta: See doi:10.1136/bmjopen-2019-035596 for study protocol describing all data collected
    • hepmarc Data dictionary .xls, .dta; description of each data field in minimum dataset
    • hepmarc AE listing: Adverse events listing .csv, .dta
    • hepmarc SAP v1.0 240322_.xls .dta; description of each data field in minimum dataset
    • hepmarc data.do: Stata .do file used to perform the analysis

    Notes: Each participant's age has been altered by a random amount to preserve anonymity. There are two rows for two of the participants who each reported two adverse reactions.
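
    The note on altered ages describes a common disclosure-control step. The sketch below shows one way such a perturbation could be done; it is an illustration only. The source does not state the size or distribution of the random amount, so the uniform shift of up to 2 years and the function name `perturb_age` are assumptions.

```python
import random
from typing import Optional


def perturb_age(age: int, max_shift: int = 2,
                rng: Optional[random.Random] = None) -> int:
    """Shift age by a random whole number of years in [-max_shift, max_shift].

    max_shift=2 is an assumed value; the dataset notes do not give the range.
    """
    rng = rng or random.Random()
    return max(0, age + rng.randint(-max_shift, max_shift))


if __name__ == "__main__":
    rng = random.Random(2023)  # fixed seed so the example is repeatable
    true_ages = [34, 51, 67]   # made-up ages, not trial data
    print([perturb_age(a, rng=rng) for a in true_ages])
```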

    Abstract Objectives Maraviroc may reduce hepatic inflammation in people with HIV and non-alcoholic fatty liver disease (HIV-NAFLD) through CCR5-receptor antagonism, which warrants further exploration.

    Methods We performed an open-label 96-week randomised-controlled feasibility trial of maraviroc plus optimised background therapy (OBT) versus OBT alone, in a 1:1 ratio, for people with virologically-suppressed HIV-1 and NAFLD without cirrhosis. Dosing followed recommendations for HIV therapy in the Summary of Product Characteristics for maraviroc. The primary outcomes were safety, recruitment and retention rates, adherence and data completeness. Secondary outcomes included the change in Fibroscan-assessed liver stiffness measurements (LSM), controlled attenuation parameter (CAP) and Enhanced Liver Fibrosis (ELF) scores.

    Results Fifty-three participants (53/60, 88% of target) were recruited; 23 received maraviroc plus OBT; 89% were male; 19% had type 2 diabetes mellitus. The median baseline LSM, CAP & ELF scores were 6.2 (IQR 4.6-7.8) kPa, 325 (IQR 279-351) dB/m and 9.1 (IQR 8.6-9.6) respectively.

    Primary outcomes: all individuals eligible after screening were randomised; there was 92% (SD 6.6%) adherence to maraviroc [target >90%]; 83% (95%CI 70%-92%) participant retention [target >65%]; 5.5% of data were missing [target

  4. Area Resource File (ARF)

    • dataverse.harvard.edu
    Updated May 30, 2013
    Cite
    Anthony Damico (2013). Area Resource File (ARF) [Dataset]. http://doi.org/10.7910/DVN/8NMSFV
    Available download formats
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    May 30, 2013
    Dataset provided by
    Harvard Dataverse
    Authors
    Anthony Damico
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    analyze the area resource file (arf) with r

    the arf is fun to say out loud. it's also a single county-level data table with about 6,000 variables, produced by the united states health resources and services administration (hrsa). the file contains health information and statistics for over 3,000 us counties. like many government agencies, hrsa provides only a sas importation script and an ascii file.

    this new github repository contains two scripts:

    2011-2012 arf - download.R
    • download the zipped area resource file directly onto your local computer
    • load the entire table into a temporary sql database
    • save the condensed file as an R data file (.rda), comma-separated value file (.csv), and/or stata-readable file (.dta)

    2011-2012 arf - analysis examples.R
    • limit the arf to the variables necessary for your analysis
    • sum up a few county-level statistics
    • merge the arf onto other data sets, using both fips and ssa county codes
    • create a sweet county-level map

    click here to view these two scripts

    for more detail about the area resource file (arf), visit:
    • the arf home page
    • the hrsa data warehouse

    notes: the arf may not be a survey data set itself, but it's particularly useful to merge onto other survey data.

    confidential to sas, spss, stata, and sudaan users: time to put down the abacus. time to transition to r. :D
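
    the description mentions merging the arf onto other data sets by county code. the repository's scripts are written in r; the sketch below shows the same idea in plain python, as an illustration only. the function name `merge_on_fips`, the column names, and the sample rows are all made up. fips codes are kept as zero-padded strings so that leading zeros survive the join.

```python
def merge_on_fips(arf_rows, other_rows, key="fips"):
    """Left-join county attributes from arf_rows onto other_rows by county code."""
    arf_by_fips = {row[key]: row for row in arf_rows}
    merged = []
    for row in other_rows:
        county = arf_by_fips.get(row[key], {})  # unmatched rows keep their own fields only
        merged.append({**county, **row})
    return merged


if __name__ == "__main__":
    # Made-up rows standing in for an arf extract and a survey file.
    arf = [{"fips": "01001", "hospital_beds": 85}]
    survey = [
        {"fips": "01001", "respondent": "a"},
        {"fips": "99999", "respondent": "b"},
    ]
    for record in merge_on_fips(arf, survey):
        print(record)
```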

  5. Labor Force Survey, LFS 2006 - Egypt

    • erfdataportal.com
    Updated Feb 5, 2023
    + more versions
    Cite
    Central Agency For Public Mobilization And Statistics (2023). Labor Force Survey, LFS 2006 - Egypt [Dataset]. https://www.erfdataportal.com/index.php/catalog/146
    Dataset updated
    Feb 5, 2023
    Dataset provided by
    Central Agency for Public Mobilization and Statistics (https://www.capmas.gov.eg/)
    Economic Research Forum
    Time period covered
    2006
    Area covered
    Egypt
    Description

    Abstract

    THE CLEANED AND HARMONIZED VERSION OF THE SURVEY DATA PRODUCED AND PUBLISHED BY THE ECONOMIC RESEARCH FORUM REPRESENTS 100% OF THE ORIGINAL SURVEY DATA COLLECTED BY THE CENTRAL AGENCY FOR PUBLIC MOBILIZATION AND STATISTICS (CAPMAS)

    In any society, the human element is the basis of the work force, which carries out all service and production activities. It is therefore mandatory to produce labor force statistics and studies on the growth of manpower and on the distribution of the labor force by different types and characteristics.

    In this context, the Central Agency for Public Mobilization and Statistics conducts the "Quarterly Labor Force Survey", which includes data on the size of manpower and labor force (employed and unemployed) and their geographical distribution by their characteristics.

    By the end of each year, CAPMAS issues the annual aggregated labor force bulletin publication that includes the results of the quarterly survey rounds that represent the manpower and labor force characteristics during the year.

    ----> Historical Review of the Labor Force Survey:

    1- The first Labor Force Survey was undertaken in 1957. The first round was conducted in November of that year, and the survey has continued in successive rounds (quarterly, bi-annually, or annually) since then.

    2- Starting with the October 2006 round, the fieldwork of the labor force survey was developed to focus on the following two points: a. The importance of using the panel sample that is part of the survey sample to monitor the dynamic changes of the labor market. b. Improving the questionnaire by adding questions that better define each household member's relationship to the labor force (employed, unemployed, out of labor force, etc.), and by re-ordering some of the existing questions in a more logical way.

    3- Starting with the January 2008 round, the methodology was developed to collect a more representative sample during the survey year. This is done by distributing the sample of each governorate into five groups; the questionnaires are collected from each group separately every 15 days for 3 months (in the middle and at the end of the month).

    ----> The survey aims at covering the following topics:

    1- Measuring the size of the Egyptian labor force among civilians (for all governorates of the republic) by their different characteristics.
    2- Measuring the employment rate at national level and different geographical areas.
    3- Measuring the distribution of employed people by the following characteristics: gender, age, educational status, occupation, economic activity, and sector.
    4- Measuring unemployment rate at different geographic areas.
    5- Measuring the distribution of unemployed people by the following characteristics: gender, age, educational status, unemployment type "ever employed/never employed", occupation, economic activity, and sector for people who have ever worked.

    The raw survey data provided by the Statistical Agency were cleaned and harmonized by the Economic Research Forum, in the context of a major project that started in 2009, during which extensive efforts have been exerted to acquire, clean, harmonize, preserve and disseminate micro data of existing labor force surveys in several Arab countries.

    Geographic coverage

    Covering a sample of urban and rural areas in all the governorates.

    Analysis unit

    1- Household/family. 2- Individual/person.

    Universe

    The survey covered a national sample of households and all individuals permanently residing in surveyed households.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    THE CLEANED AND HARMONIZED VERSION OF THE SURVEY DATA PRODUCED AND PUBLISHED BY THE ECONOMIC RESEARCH FORUM REPRESENTS 100% OF THE ORIGINAL SURVEY DATA COLLECTED BY THE CENTRAL AGENCY FOR PUBLIC MOBILIZATION AND STATISTICS (CAPMAS)

    ----> Sample Design and Selection

    The sample of the LFS 2006 survey is a simple systematic random sample.
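
    For readers unfamiliar with the term, the sketch below shows textbook systematic random sampling: a random start followed by every k-th unit of an ordered frame. It is a generic illustration, not CAPMAS's procedure; the source gives no details on the sampling frame or interval, and the function name `systematic_sample` is hypothetical.

```python
import random
from typing import Optional, Sequence


def systematic_sample(units: Sequence, n: int,
                      rng: Optional[random.Random] = None) -> list:
    """Pick n units: a random start in the first interval, then every k-th unit."""
    if n < 1 or n > len(units):
        raise ValueError("n must be between 1 and the number of units")
    rng = rng or random.Random()
    k = len(units) // n        # sampling interval
    start = rng.randrange(k)   # random start within the first interval
    return [units[start + i * k] for i in range(n)]


if __name__ == "__main__":
    frame = list(range(1, 101))  # made-up frame of 100 household ids
    print(systematic_sample(frame, 10, random.Random(7)))
```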

    ----> Sample Size

    The sample size varied by quarter (Q1=19429, Q2=19419, Q3=19119 and Q4=18835 households), for a total of 76802 households annually. These households are distributed at the governorate level (urban/rural).
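
    The annual figure is the sum of the four quarterly samples, which can be verified directly. The variable names below are illustrative only.

```python
# Quarterly household counts as reported in the description.
quarterly_households = {"Q1": 19429, "Q2": 19419, "Q3": 19119, "Q4": 18835}

annual_total = sum(quarterly_households.values())
print(annual_total)  # 76802, matching the reported annual total
```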

    A more detailed description of the different sampling stages and allocation of sample across governorates is provided in the Methodology document available among external resources in Arabic.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaire design follows the latest International Labor Organization (ILO) concepts and definitions of labor force, employment, and unemployment.

    The questionnaire comprises 3 tables in addition to the identification and geographic data of household on the cover page.

    ----> Table 1- Demographic and employment characteristics and basic data for all household individuals

    Including: gender, age, educational status, marital status, residence mobility and current work status

    ----> Table 2- Employment characteristics table

    This table is filled in by individuals employed at the time of the survey or engaged to work during the reference week, and provides information on:
    - Relationship to employer: employer, self-employed, waged worker, and unpaid family worker
    - Economic activity
    - Sector
    - Occupation
    - Effective working hours
    - Work place
    - Average monthly wage

    ----> Table 3- Unemployment characteristics table

    This table is filled in by all unemployed individuals who satisfied the unemployment criteria, and provides information on:
    - Type of unemployment (unemployed, unemployed ever worked)
    - Economic activity and occupation in the last held job before being unemployed
    - Last unemployment duration in months
    - Main reason for unemployment

    Cleaning operations

    ----> Raw Data

    Office editing is one of the main stages of the survey. It started once the questionnaires were received from the field and was carried out by the selected work groups. It includes:
    a- Editing of coverage and completeness
    b- Editing of consistency

    ----> Harmonized Data

    • Stata is used to clean and SPSS is used to harmonize the datasets.
    • The harmonization process starts with a cleaning process for all raw data files received from the Statistical Agency.
    • All cleaned data files are then merged to produce one data file on the individual level containing all variables subject to harmonization.
    • A country-specific program is generated for each dataset to generate/ compute/ recode/ rename/ format/ label harmonized variables.
    • A post-harmonization cleaning process is then conducted on the data.
    • Harmonized data is saved on the household as well as the individual level, in SPSS and then converted to STATA, to be disseminated.
  6. Survey of Consumer Attitudes and Behavior, September 2005 - Version 2

    • search.gesis.org
    • datasearch.gesis.org
    + more versions
    Cite
    University of Michigan. Survey Research Center. Economic Behavior Program, Survey of Consumer Attitudes and Behavior, September 2005 - Version 2 [Dataset]. http://doi.org/10.3886/ICPSR35377.v2
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    GESIS search
    Authors
    University of Michigan. Survey Research Center. Economic Behavior Program
    License

    https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de471016

    Description

    Abstract (en): The Survey of Consumer Attitudes and Behavior series (also known as the Surveys of Consumers) was undertaken to measure changes in consumer attitudes and expectations, to understand why such changes occur, and to evaluate how they relate to consumer decisions to save, borrow, or make discretionary purchases. The data regularly include the Index of Consumer Sentiment, the Index of Current Economic Conditions, and the Index of Consumer Expectations. Since the 1940s, these surveys have been produced quarterly through 1977 and monthly thereafter. The surveys conducted in 2005 focused on topics such as evaluations and expectations about personal finances, employment, price changes, and the national business situation. Opinions were collected regarding respondents' appraisals of present market conditions for purchasing houses, automobiles, computers, and other durables. Also explored in this survey were respondents' types of savings and financial investments, loan use, family income, and retirement planning. Other topics in this series typically include ownership, lease, and use of automobiles, respondents' general feelings, and respondents' familiarity with and use of the Internet. Demographic information includes ethnic origin, sex, age, marital status, and education. The purpose of this survey series is to forecast changes in aggregate consumer behavior.

    The data are not weighted. However, this collection contains four weight variables, WT (HOUSEHOLD HEAD WEIGHT), WT_HH (HOUSEHOLD WEIGHT), WT_ADHD (ADULT HEAD WEIGHT), and WT_AD (ADULT WEIGHT), that must be used in any analysis. For more information on weights and sampling, please refer to the documentation and/or visit the Surveys of Consumers Web site.

    ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: created variable labels and/or value labels; checked for undocumented or out-of-range codes.

    Persons aged 18 years or older living in households with telephones within the United States.

    Smallest Geographic Unit: Census Division

    National sample of dwelling units selected by area probability sampling.

    2016-04-12: This collection has been fully curated, and was updated to include SPSS, SAS, and Stata data and setup files, a tab-delimited data file, an R data file, and a PDF codebook.

    Computer-assisted telephone interview (CATI).

    Information on the Index of Consumer Sentiment, the Index of Current Economic Conditions, and the Index of Consumer Expectations and how they were created can be found in the ICPSR Codebook. Additional information on the Survey of Consumers can be found by visiting the Surveys of Consumers Web site.
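
    Because the description says a weight variable must be used in any analysis, the sketch below shows what a weighted estimate looks like in its simplest form. It is a generic weighted mean, not ICPSR's or the Surveys of Consumers' procedure; the records, the `expects_improvement` column, and the weight values are made up, and the right weight variable for a given analysis should be taken from the codebook.

```python
def weighted_mean(values, weights):
    """Return sum(value * weight) / sum(weight)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


if __name__ == "__main__":
    # Made-up records: a 0/1 response and a household weight (WT_HH).
    records = [
        {"expects_improvement": 1, "WT_HH": 3.0},
        {"expects_improvement": 0, "WT_HH": 1.0},
    ]
    share = weighted_mean(
        [r["expects_improvement"] for r in records],
        [r["WT_HH"] for r in records],
    )
    print(share)  # 0.75, versus an unweighted share of 0.5
```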

  7. Labor Force Survey 2006, Harmonized Dataset - Egypt, Arab Rep.

    • catalog.ihsn.org
    Updated Dec 5, 2019
    Cite
    Central Agency For Public Mobilization And Statistics (2019). Labor Force Survey 2006, Harmonized Dataset - Egypt, Arab Rep. [Dataset]. https://catalog.ihsn.org/index.php/catalog/8141
    Dataset updated
    Dec 5, 2019
    Dataset provided by
    Central Agency for Public Mobilization and Statistics (https://www.capmas.gov.eg/)
    Economic Research Forum
    Time period covered
    2006
    Area covered
    Egypt
    Description

    Abstract

    The cleaned and harmonized version of the survey data produced and published by the Economic Research Forum represents 100% of the original survey data collected by the Central Agency for Public Mobilization and Statistics (CAPMAS)

    In any society, the human element is the basis of the work force, which carries out all service and production activities. It is therefore mandatory to produce labor force statistics and studies on the growth of manpower and on the distribution of the labor force by different types and characteristics.

    In this context, the Central Agency for Public Mobilization and Statistics conducts the "Quarterly Labor Force Survey", which includes data on the size of manpower and labor force (employed and unemployed) and their geographical distribution by their characteristics.

    By the end of each year, CAPMAS issues the annual aggregated labor force bulletin publication that includes the results of the quarterly survey rounds that represent the manpower and labor force characteristics during the year.

    ----> Historical Review of the Labor Force Survey:

    1- The first Labor Force Survey was undertaken in 1957. The first round was conducted in November of that year, and the survey has continued in successive rounds (quarterly, bi-annually, or annually) since then.

    2- Starting with the October 2006 round, the fieldwork of the labor force survey was developed to focus on the following two points: a. The importance of using the panel sample that is part of the survey sample to monitor the dynamic changes of the labor market. b. Improving the questionnaire by adding questions that better define each household member's relationship to the labor force (employed, unemployed, out of labor force, etc.), and by re-ordering some of the existing questions in a more logical way.

    3- Starting with the January 2008 round, the methodology was developed to collect a more representative sample during the survey year. This is done by distributing the sample of each governorate into five groups; the questionnaires are collected from each group separately every 15 days for 3 months (in the middle and at the end of the month).

    ----> The survey aims at covering the following topics:

    1- Measuring the size of the Egyptian labor force among civilians (for all governorates of the republic) by their different characteristics.
    2- Measuring the employment rate at national level and different geographical areas.
    3- Measuring the distribution of employed people by the following characteristics: gender, age, educational status, occupation, economic activity, and sector.
    4- Measuring unemployment rate at different geographic areas.
    5- Measuring the distribution of unemployed people by the following characteristics: gender, age, educational status, unemployment type "ever employed/never employed", occupation, economic activity, and sector for people who have ever worked.

    The raw survey data provided by the Statistical Agency were cleaned and harmonized by the Economic Research Forum in the context of a major project that started in 2009, during which extensive efforts have been made to acquire, clean, harmonize, preserve and disseminate microdata of existing labor force surveys in several Arab countries.

    Geographic coverage

    Covering a sample of urban and rural areas in all the governorates.

    Analysis unit

    1- Household/family. 2- Individual/person.

    Universe

    The survey covered a national sample of households and all individuals permanently residing in surveyed households.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The cleaned and harmonized version of the survey data produced and published by the Economic Research Forum represents 100% of the original survey data collected by the Central Agency for Public Mobilization and Statistics (CAPMAS)

    Sample Design and Selection

    The sample of the LFS 2006 survey is a simple systematic random sample.

    Sample Size

    The sample size varied by quarter: 19,429 households in Q1, 19,419 in Q2, 19,119 in Q3 and 18,835 in Q4, for a total of 76,802 households annually. These households are distributed at the governorate level (urban/rural).
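
The quarterly figures can be checked against the stated annual total with a few lines of Python. This is only an arithmetic check; the variable names are illustrative and do not come from the survey files.

```python
# Quarterly household sample sizes as reported for the LFS 2006 survey.
quarterly_households = {"Q1": 19429, "Q2": 19419, "Q3": 19119, "Q4": 18835}

# Sum across quarters to recover the annual sample size.
annual_total = sum(quarterly_households.values())

# Share of the annual sample contributed by each quarter.
quarterly_share = {q: n / annual_total for q, n in quarterly_households.items()}

print(annual_total)
```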

    A more detailed description of the different sampling stages and allocation of sample across governorates is provided in the Methodology document available among external resources in Arabic.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaire design follows the latest International Labor Organization (ILO) concepts and definitions of labor force, employment, and unemployment.

    The questionnaire comprises 3 tables, in addition to the identification and geographic data of the household on the cover page.

    ----> Table 1- Demographic and employment characteristics and basic data for all household individuals

    Including: gender, age, educational status, marital status, residence mobility and current work status

    ----> Table 2- Employment characteristics table

    This table is filled in by individuals who were employed at the time of the survey or were engaged in work during the reference week, and provides information on:
    - Relationship to employer: employer, self-employed, waged worker, and unpaid family worker
    - Economic activity
    - Sector
    - Occupation
    - Effective working hours
    - Work place
    - Average monthly wage

    ----> Table 3- Unemployment characteristics table

    This table is filled in by all unemployed individuals who satisfied the unemployment criteria, and provides information on:
    - Type of unemployment (unemployed, unemployed ever worked)
    - Economic activity and occupation in the last job held before becoming unemployed
    - Last unemployment duration in months
    - Main reason for unemployment

    Cleaning operations

    ----> Raw Data

    Office editing is one of the main stages of the survey. It started once the questionnaires were received from the field and was carried out by the selected work groups. It includes: a- Editing of coverage and completeness. b- Editing of consistency.

    ----> Harmonized Data

    • STATA is used to clean and SPSS is used to harmonize the datasets.
    • The harmonization process starts with a cleaning process for all raw data files received from the Statistical Agency.
    • All cleaned data files are then merged to produce one data file on the individual level containing all variables subject to harmonization.
    • A country-specific program is generated for each dataset to generate/ compute/ recode/ rename/ format/ label harmonized variables.
    • A post-harmonization cleaning process is then conducted on the data.
    • Harmonized data is saved on the household as well as the individual level, in SPSS and then converted to STATA, to be disseminated.
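
The rename and recode part of a country-specific harmonization program can be sketched as follows. This is a minimal illustration in plain Python, not the Economic Research Forum's actual STATA or SPSS code; every variable name and code table below (`q101`, `sex`, `work_status`, and so on) is hypothetical.

```python
# Hypothetical mapping from raw variable names to harmonized names.
RENAME = {"q101": "sex", "q102": "age", "q110": "work_status"}

# Hypothetical recode tables: raw codes -> harmonized codes.
RECODE = {
    "sex": {"1": 1, "2": 2},
    "work_status": {"1": 1, "2": 2, "3": 3},
}


def harmonize(record):
    """Rename and recode one raw individual-level record (a dict of strings)."""
    out = {}
    for raw_name, value in record.items():
        name = RENAME.get(raw_name)
        if name is None:
            continue  # variable is not subject to harmonization
        table = RECODE.get(name)
        if table is not None:
            # Unknown raw codes become None, to be reviewed in the
            # post-harmonization cleaning step.
            out[name] = table.get(value)
        else:
            # No recode table: treat the value as numeric, blank as missing.
            out[name] = int(value) if value.strip() else None
    return out


print(harmonize({"q101": "2", "q102": "34", "q110": "9", "extra": "x"}))
```

Keeping the rename and recode tables as data rather than as code makes it easier to write one such program per country while reusing the same function.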

Cite
Jeremy Sheff (2024). The Canada Trademarks Dataset [Dataset]. http://doi.org/10.5281/zenodo.4999655

The Canada Trademarks Dataset

8 scholarly articles cite this dataset
Available download formats
zip, pdf
Dataset updated
Jul 19, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Jeremy Sheff
License

Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

The Canada Trademarks Dataset

18 Journal of Empirical Legal Studies 908 (2021), prepublication draft available at https://papers.ssrn.com/abstract=3782655, published version available at https://onlinelibrary.wiley.com/share/author/CHG3HC6GTFMMRU8UJFRR?target=10.1111/jels.12303

Dataset Selection and Arrangement (c) 2021 Jeremy Sheff

Python and Stata Scripts (c) 2021 Jeremy Sheff

Contains data licensed by Her Majesty the Queen in right of Canada, as represented by the Minister of Industry, the minister responsible for the administration of the Canadian Intellectual Property Office.

This individual-application-level dataset includes records of all applications for registered trademarks in Canada since approximately 1980, and of many preserved applications and registrations dating back to the beginning of Canada’s trademark registry in 1865, totaling over 1.6 million application records. It includes comprehensive bibliographic and lifecycle data; trademark characteristics; goods and services claims; identification of applicants, attorneys, and other interested parties (including address data); detailed prosecution history event data; and data on application, registration, and use claims in countries other than Canada. The dataset has been constructed from public records made available by the Canadian Intellectual Property Office. Both the dataset and the code used to build and analyze it are presented for public use on open-access terms.

Scripts are licensed for reuse subject to the Creative Commons Attribution License 4.0 (CC-BY-4.0), https://creativecommons.org/licenses/by/4.0/. Data files are licensed for reuse subject to the Creative Commons Attribution License 4.0 (CC-BY-4.0), https://creativecommons.org/licenses/by/4.0/, and also subject to additional conditions imposed by the Canadian Intellectual Property Office (CIPO) as described below.

Terms of Use:

As per the terms of use of CIPO's government data, all users are required to include the above-quoted attribution to CIPO in any reproductions of this dataset. They are further required to cease using any record within the datasets that has been modified by CIPO and for which CIPO has issued a notice on its website in accordance with its Terms and Conditions, and to use the datasets in compliance with applicable laws. These requirements are in addition to the terms of the CC-BY-4.0 license, which require attribution to the author (among other terms). For further information on CIPO’s terms and conditions, see https://www.ic.gc.ca/eic/site/cipointernet-internetopic.nsf/eng/wr01935.html. For further information on the CC-BY-4.0 license, see https://creativecommons.org/licenses/by/4.0/.

The following attribution statement, if included by users of this dataset, is satisfactory to the author, but the author makes no representations as to whether it may be satisfactory to CIPO:

The Canada Trademarks Dataset is (c) 2021 by Jeremy Sheff and licensed under a CC-BY-4.0 license, subject to additional terms imposed by the Canadian Intellectual Property Office. It contains data licensed by Her Majesty the Queen in right of Canada, as represented by the Minister of Industry, the minister responsible for the administration of the Canadian Intellectual Property Office. For further information, see https://creativecommons.org/licenses/by/4.0/ and https://www.ic.gc.ca/eic/site/cipointernet-internetopic.nsf/eng/wr01935.html.

Details of Repository Contents:

This repository includes a number of .zip archives which expand into folders containing either scripts for construction and analysis of the dataset or data files comprising the dataset itself. These folders are as follows:

  • /csv: contains the .csv versions of the data files
  • /do: contains Stata do-files used to convert the .csv files to .dta format and perform the statistical analyses set forth in the paper reporting this dataset
  • /dta: contains the .dta versions of the data files
  • /py: contains the python scripts used to download CIPO’s historical trademarks data via SFTP and generate the .csv data files

If users wish to construct rather than download the data files, the first script that they should run is /py/sftp_secure.py. This script will prompt the user to enter their IP Horizons SFTP credentials; these can be obtained by registering with CIPO at https://ised-isde.survey-sondage.ca/f/s.aspx?s=59f3b3a4-2fb5-49a4-b064-645a5e3a752d&lang=EN&ds=SFTP. The script will also prompt the user to identify a target directory for the data downloads. Because the data archives are quite large, users are advised to create a target directory in advance and ensure they have at least 70GB of available storage on the media on which the directory is located.
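
Before starting the download, the storage advice above can be checked programmatically. The following is a minimal sketch using only the Python standard library; it is not part of the repository's scripts, and the function name and the decimal reading of "70GB" (70 * 10**9 bytes) are assumptions.

```python
import shutil
from pathlib import Path

# Assumption: "70GB" is read as 70 * 10**9 bytes.
REQUIRED_BYTES = 70 * 10**9


def has_enough_space(target_dir, required_bytes=REQUIRED_BYTES):
    """Return True if the filesystem holding target_dir has enough free space."""
    path = Path(target_dir)
    # Create the target directory in advance, as the instructions advise.
    path.mkdir(parents=True, exist_ok=True)
    return shutil.disk_usage(path).free >= required_bytes
```

A call such as `has_enough_space("/data/cipo")` (a hypothetical path) returns `False` when the volume is too small, so the check can run before entering credentials.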

The sftp_secure.py script will generate a new subfolder in the user’s target directory called /XML_raw. Users should note the full path of this directory, which they will be prompted to provide when running the remaining python scripts. Each of the remaining scripts, the filenames of which begin with “iterparse”, corresponds to one of the data files in the dataset, as indicated in the script’s filename. After running one of these scripts, the user’s target directory should include a /csv subdirectory containing the data file corresponding to the script; after running all the iterparse scripts the user’s /csv directory should be identical to the /csv directory in this repository. Users are invited to modify these scripts as they see fit, subject to the terms of the licenses set forth above.
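
The "iterparse" in the script names suggests streaming XML parsing, which reads one record at a time rather than loading a whole archive into memory. The sketch below illustrates that general technique with the standard library; it is not the repository's code. The function name, the record tag, and the field names are hypothetical, and real CIPO XML may use namespaces and nested elements that this simplified version does not handle.

```python
import csv
import xml.etree.ElementTree as ET


def xml_to_csv(xml_path, csv_path, record_tag, fields):
    """Stream records out of a large XML file into a CSV file.

    record_tag and fields are hypothetical names chosen by the caller;
    each field is read from a direct child element of the record.
    """
    rows = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(fields)
        # "end" events fire once an element has been fully read.
        for _event, elem in ET.iterparse(xml_path, events=("end",)):
            if elem.tag != record_tag:
                continue
            writer.writerow([elem.findtext(f, default="") for f in fields])
            rows += 1
            # Drop the record's children so memory use stays bounded.
            elem.clear()
    return rows
```

For example, `xml_to_csv("in.xml", "out.csv", "rec", ["id", "name"])` writes one CSV row per `<rec>` element and returns the number of records written.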

With respect to the Stata do-files, only one of them is relevant to construction of the dataset itself. This is /do/CA_TM_csv_cleanup.do, which converts the .csv versions of the data files to .dta format, and uses Stata’s labeling functionality to reduce the size of the resulting files while preserving information. The other do-files generate the analyses and graphics presented in the paper describing the dataset (Jeremy N. Sheff, The Canada Trademarks Dataset, 18 J. Empirical Leg. Studies (forthcoming 2021), available at https://papers.ssrn.com/abstract=3782655). These do-files are also licensed for reuse subject to the terms of the CC-BY-4.0 license, and users are invited to adapt the scripts to their needs.
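
The idea behind using labels to shrink files is that a repeated string is stored once in a lookup table and replaced in the data by a small integer code. The sketch below illustrates this in Python rather than Stata; it is not taken from the do-file, the function name is hypothetical, and the status strings in the example are invented for illustration.

```python
def encode_with_labels(values):
    """Replace repeated strings with integer codes plus a code-to-label table."""
    codes = {}    # label -> integer code
    encoded = []  # the column, with strings replaced by codes
    for value in values:
        if value not in codes:
            # Codes start at 1 and are assigned in order of first appearance.
            codes[value] = len(codes) + 1
        encoded.append(codes[value])
    labels = {code: label for label, code in codes.items()}
    return encoded, labels


# Invented example values, not drawn from the dataset.
encoded, labels = encode_with_labels(["REGISTERED", "ABANDONED", "REGISTERED"])
print(encoded, labels)
```

No information is lost, because mapping each code back through the label table reproduces the original column.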

The Python and Stata scripts included in this repository are separately maintained and updated on GitHub at https://github.com/jnsheff/CanadaTM.

This repository also includes a copy of the current version of CIPO's data dictionary for its historical XML trademarks archive as of the date of construction of this dataset.
