16 datasets found
  1. Data from: Supplemental data

    • figshare.com
    xlsx
    Updated Mar 15, 2024
    Cite
    T Miyakoshi; Yoichi M. Ito (2024). Supplemental data [Dataset]. http://doi.org/10.6084/m9.figshare.24596058.v1
    Available download formats: xlsx
    Dataset updated
    Mar 15, 2024
    Dataset provided by
    figshare
    Authors
    T Miyakoshi; Yoichi M. Ito
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset for the article "The current utilization status of wearable devices in clinical research".

    Analyses were performed using JMP Pro 16.10 and Microsoft Excel for Mac version 16 (Microsoft).

    The file extension "jrp" denotes a report file of the statistical analysis software JMP, which contains both the analysis code and the data set. In case JMP is not available, the data set is also provided as a "csv" file and the JMP script (the analysis code) is provided in "rtf" format. The "xlsx" file is a Microsoft Excel file that contains the data set and the data plotted or tabulated using Microsoft Excel functions.

    Supplementary Figure 1. NCT number duplication frequency
    Includes the Excel file used to create the figure (Supplementary Figure 1).
    • Sfig1_NCT number duplication frequency.xlsx

    Supplementary Figures 2-5. Simple and annual time series aggregation
    Includes the Excel file, JMP report files, csv datasets of the JMP report files and JMP scripts used to create the figures (Supplementary Figures 2-5).
    • Sfig2-5 Annual time series aggregation.xlsx
    • Sfig2 Study Type.jrp
    • Sfig4device type.jrp
    • Sfig3 Interventions Type.jrp
    • Sfig5Conditions type.jrp
    • Sfig2, 3 ,5_database.csv
    • Sfig2_JMP script_Study type.rtf
    • Sfig3_JMP script Interventions type.rtf
    • Sfig5_JMP script Conditions type.rtf
    • Sfig4_dataset.csv
    • Sfig4_JMP script_device type.rtf

    Supplementary Figures 6-11. Mosaic diagram of intervention by condition
    Supplementary Tables 4-9. Analysis of contingency table for intervention by condition
    JMP report files used to create the figures (Supplementary Figures 6-11) and tables (Supplementary Tables 4-9), including the csv dataset of the JMP report files and JMP scripts.
    • Sfig6-11 Stable4-9 Intervention devicetype_conditions.jrp
    • Sfig6-11_Stable4-9_dataset.csv
    • Sfig6-11_Stable4-9_JMP script.rtf

    Supplementary Figure 12. Distribution of enrollment
    Includes Excel file, JMP report file, csv dataset of the JMP report file and JMP scripts used to create the figure (Supplementary Figure 12).
    • Sfig12_Distribution of enrollment.jrp
    • Sfig12_Distribution of enrollment.csv
    • Sfig12_JMP script.rtf

  2. Excel, AL Population Pyramid Dataset: Age Groups, Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 22, 2025
    Cite
    Neilsberg Research (2025). Excel, AL Population Pyramid Dataset: Age Groups, Male and Female Population, and Total Population for Demographics Analysis // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/524b2436-f122-11ef-8c1b-3860777c1fe6/
    Available download formats: csv, json
    Dataset updated
    Feb 22, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Excel
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Total Population for Age Groups, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, and 9 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) male population, (b) female population and (c) total population, we first analyzed and categorized the data for each of the age groups. We divided ages between 0 and 85 into roughly 5-year buckets. For ages over 85, we aggregated the data into a single group. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the data for the Excel, AL population pyramid, which represents the Excel population distribution across age and gender, using estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It lists the male and female population for each age group, along with the total population for those age groups. Higher numbers at the bottom of the table suggest population growth, whereas higher numbers at the top indicate declining birth rates. Furthermore, the dataset can be utilized to understand the youth dependency ratio, old-age dependency ratio, total dependency ratio, and potential support ratio.

    Key observations

    • Youth dependency ratio, which is the number of children aged 0-14 per 100 persons aged 15-64, for Excel, AL, is 61.2.
    • Old-age dependency ratio, which is the number of persons aged 65 or over per 100 persons aged 15-64, for Excel, AL, is 26.9.
    • Total dependency ratio for Excel, AL is 88.1.
    • Potential support ratio, which is the number of working-age persons (aged 15-64) per person aged 65 or over, for Excel, AL is 3.7.
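
    The four ratios listed above follow from three age-band totals. A minimal sketch in Python, assuming hypothetical counts (the function name is illustrative, and the numbers are not taken from the dataset; they are chosen only so that the output matches the ratios listed):

```python
def dependency_ratios(children_0_14, working_15_64, seniors_65_plus):
    """Return the four ratios, each rounded to one decimal place."""
    youth = 100 * children_0_14 / working_15_64
    old_age = 100 * seniors_65_plus / working_15_64
    return {
        "youth": round(youth, 1),
        "old_age": round(old_age, 1),
        # The total dependency ratio is the sum of the youth and old-age ratios.
        "total": round(youth + old_age, 1),
        # Working-age persons per person aged 65 or over.
        "potential_support": round(working_15_64 / seniors_65_plus, 1),
    }


# Hypothetical counts, NOT taken from the dataset.
ratios = dependency_ratios(children_0_14=612, working_15_64=1000, seniors_65_plus=269)
print(ratios)
```

    Note that the potential support ratio is simply 100 divided by the old-age dependency ratio, which is why 26.9 and 3.7 appear together.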
    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Variables / Data Columns

    • Age Group: This column displays the age group for the Excel population analysis. There are 18 expected values, defined above in the age groups section.
    • Population (Male): The male population in Excel for the selected age group.
    • Population (Female): The female population in Excel for the selected age group.
    • Total Population: The total population of Excel for the selected age group.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Excel Population by Age.

  3. Immigration system statistics data tables

    • gov.uk
    Updated May 22, 2025
    Cite
    Home Office (2025). Immigration system statistics data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/immigration-system-statistics-data-tables
    Dataset updated
    May 22, 2025
    Dataset provided by
    GOV.UK
    Authors
    Home Office
    Description

    List of the data tables as part of the Immigration System Statistics Home Office release. Summary and detailed data tables covering the immigration system, including out-of-country and in-country visas, asylum, detention, and returns.

    If you have any feedback, please email MigrationStatsEnquiries@homeoffice.gov.uk.

    Accessible file formats

    The Microsoft Excel .xlsx files may not be suitable for users of assistive technology.
    If you use assistive technology (such as a screen reader) and need a version of these documents in a more accessible format, please email MigrationStatsEnquiries@homeoffice.gov.uk
    Please tell us what format you need. It will help us if you say what assistive technology you use.

    Related content

    Immigration system statistics, year ending March 2025
    Immigration system statistics quarterly release
    Immigration system statistics user guide
    Publishing detailed data tables in migration statistics
    Policy and legislative changes affecting migration to the UK: timeline
    Immigration statistics data archives

    Passenger arrivals

    Passenger arrivals summary tables, year ending March 2025 (MS Excel Spreadsheet, 66.5 KB): https://assets.publishing.service.gov.uk/media/68258d71aa3556876875ec80/passenger-arrivals-summary-mar-2025-tables.xlsx

    ‘Passengers refused entry at the border summary tables’ and ‘Passengers refused entry at the border detailed datasets’ have been discontinued. The latest published versions of these tables are from February 2025 and are available in the ‘Passenger refusals – release discontinued’ section. A similar data series, ‘Refused entry at port and subsequently departed’, is available within the Returns detailed and summary tables.

    Electronic travel authorisation

    Electronic travel authorisation detailed datasets, year ending March 2025 (MS Excel Spreadsheet, 56.7 KB): https://assets.publishing.service.gov.uk/media/681e406753add7d476d8187f/electronic-travel-authorisation-datasets-mar-2025.xlsx
    ETA_D01: Applications for electronic travel authorisations, by nationality
    ETA_D02: Outcomes of applications for electronic travel authorisations, by nationality

    Entry clearance visas granted outside the UK

    Entry clearance visas summary tables, year ending March 2025 (MS Excel Spreadsheet, 113 KB): https://assets.publishing.service.gov.uk/media/68247953b296b83ad5262ed7/visas-summary-mar-2025-tables.xlsx

    Entry clearance visa applications and outcomes detailed datasets, year ending March 2025 (MS Excel Spreadsheet, 29.1 MB): https://assets.publishing.service.gov.uk/media/682c4241010c5c28d1c7e820/entry-clearance-visa-outcomes-datasets-mar-2025.xlsx
    Vis_D01: Entry clearance visa applications, by nationality and visa type
    Vis_D02: Outcomes of entry clearance visa applications, by nationality, visa type, and outcome

    Additional dat

  4. Market Basket Analysis

    • kaggle.com
    Updated Dec 9, 2021
    Cite
    Aslan Ahmedov (2021). Market Basket Analysis [Dataset]. https://www.kaggle.com/datasets/aslanahmedov/market-basket-analysis
    Dataset updated
    Dec 9, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Aslan Ahmedov
    Description

    Market Basket Analysis

    Market basket analysis with Apriori algorithm

    The retailer wants to target customers with suggestions on the itemsets that a customer is most likely to purchase. I was given a dataset containing a retailer's transaction data, which covers all the transactions that happened over a period of time. The retailer will use the results to grow in its industry and to provide customers with itemset suggestions, so that we are able to increase customer engagement, improve customer experience and identify customer behavior. I will solve this problem using Association Rules, a type of unsupervised learning technique that checks for the dependency of one data item on another data item.

    Introduction

    Association rules are most often used when you want to find associations between different objects in a set. They work by finding frequent patterns in a transaction database. They can tell you which items customers frequently buy together, which allows the retailer to identify relationships between the items.

    An Example of Association Rules

    Assume there are 100 customers: 10 of them bought a Computer Mouse, 9 bought a Mat for Mouse and 8 bought both. For the rule "bought Computer Mouse => bought Mat for Mouse":
    - support = P(Mouse & Mat) = 8/100 = 0.08
    - confidence = support/P(Computer Mouse) = 0.08/0.10 = 0.80
    - lift = confidence/P(Mat for Mouse) = 0.80/0.09 = 8.9
    This is just a simple example. In practice, a rule needs the support of several hundred transactions before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.
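
    The arithmetic of this example can be checked with a few lines of Python. This is a minimal sketch using the standard definitions, in which the confidence of a rule A => B divides the support by P(A) and the lift divides the confidence by P(B); the function name is illustrative and not part of the original notebook:

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Support, confidence and lift for the rule antecedent => consequent."""
    support = n_both / n_total                    # P(A and B)
    confidence = n_both / n_antecedent            # P(B | A) = support / P(A)
    lift = confidence / (n_consequent / n_total)  # confidence / P(B)
    return support, confidence, lift


# Counts from the example: 100 customers, 10 bought a mouse,
# 9 bought a mat, 8 bought both.
support, confidence, lift = rule_metrics(
    n_total=100, n_antecedent=10, n_consequent=9, n_both=8
)
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

    Lift is symmetric in the two items, so it comes out at about 8.9 whichever direction the rule is read in; only the confidence depends on the direction.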

    Strategy

    • Data Import
    • Data Understanding and Exploration
    • Transformation of the data – so that it is ready to be consumed by the association rules algorithm
    • Running association rules
    • Exploring the rules generated
    • Filtering the generated rules
    • Visualization of rules

    Dataset Description

    • File name: Assignment-1_Data
    • List name: retaildata
    • File format: .xlsx
    • Number of rows: 522065
    • Number of Attributes: 7

      • BillNo: 6-digit number assigned to each transaction. Nominal.
      • Itemname: Product name. Nominal.
      • Quantity: The quantities of each product per transaction. Numeric.
      • Date: The day and time when each transaction was generated. Numeric.
      • Price: Product price. Numeric.
      • CustomerID: 5-digit number assigned to each customer. Nominal.
      • Country: Name of the country where each customer resides. Nominal.


    Libraries in R

    First, we need to load the required libraries. Below I briefly describe each library.

    • arules - Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules).
    • arulesViz - Extends package 'arules' with various visualization techniques for association rules and item-sets. The package also includes several interactive visualizations for rule exploration.
    • tidyverse - The tidyverse is an opinionated collection of R packages designed for data science.
    • readxl - Read Excel Files in R.
    • plyr - Tools for Splitting, Applying and Combining Data.
    • ggplot2 - A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.
    • knitr - Dynamic Report generation in R.
    • magrittr - Provides a mechanism for chaining commands with a new forward-pipe operator, %>%. This operator will forward a value, or the result of an expression, into the next function call/expression. There is flexible support for the type of right-hand side expressions.
    • dplyr - A fast, consistent tool for working with data frame like objects, both in memory and out of memory.
    • tidyverse - This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step.


    Data Pre-processing

    Next, we need to load Assignment-1_Data.xlsx into R to read the dataset. Now we can see our data in R.


    Next, we clean the data frame by removing missing values.


    To apply Association Rule mining, we need to convert the data frame into transaction data, so that all items that are bought together in one invoice will be in ...

  5. Hive Annotation Job Results - Cleaned and Audited

    • kaggle.com
    Updated Apr 28, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Brendan Kelley (2021). Hive Annotation Job Results - Cleaned and Audited [Dataset]. https://www.kaggle.com/brendankelley/hive-annotation-job-results-cleaned-and-audited/code
    Dataset updated
    Apr 28, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Brendan Kelley
    Description

    Context

    This notebook serves to showcase my problem solving ability, knowledge of the data analysis process, proficiency with Excel and its various tools and functions, as well as my strategic mindset and statistical prowess. This project consists of an auditing prompt provided by Hive Data, a raw Excel data set, a cleaned and audited version of the raw Excel data set, and my description of my thought process and knowledge used during completion of the project. The prompt can be found below:

    Hive Data Audit Prompt

    The raw data that accompanies the prompt can be found below:

    Hive Annotation Job Results - Raw Data

    ^ These are the tools I was given to complete my task. The rest of the work is entirely my own.

    To summarize broadly, my task was to audit the dataset and summarize my process and results. Specifically, I was to create a method for identifying which "jobs" - explained in the prompt above - needed to be rerun based on a set of "background facts," or criteria. The description of my extensive thought process and results can be found below in the Content section.

    Content

    Brendan Kelley April 23, 2021

    Hive Data Audit Prompt Results

    This paper explains the auditing process of the “Hive Annotation Job Results” data. It includes the preparation, analysis, visualization, and summary of the data. It is accompanied by the results of the audit in the Excel file “Hive Annotation Job Results – Audited”.

    Observation

    The “Hive Annotation Job Results” data comes in the form of a single Excel sheet. It contains 7 columns and 5,001 rows, including column headers. The data includes “file”, “object id”, and the pseudonyms for five questions that each client was instructed to answer about their respective table: “tabular”, “semantic”, “definition list”, “header row”, and “header column”. The “file” column includes non-unique (that is, there are multiple instances of the same value in the column) numbers separated by a dash. The “object id” column includes non-unique numbers ranging from 5 to 487539. The columns containing the answers to the five questions include Boolean values (TRUE or FALSE), which depend upon the yes/no worker judgement.

    Use of the COUNTIF() function reveals that there are no values other than TRUE or FALSE in any of the five question columns. The VLOOKUP() function reveals that the data does not include any missing values in any of the cells.
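
    The same two checks can be expressed outside Excel. A minimal sketch in Python, assuming the sheet has been exported to rows of strings; the sample rows are made up for illustration, the function name is hypothetical, and only the five question columns are inspected here:

```python
QUESTION_COLUMNS = ["tabular", "semantic", "definition list", "header row", "header column"]


def audit_values(rows):
    """Count missing cells and non-Boolean answers across the question columns."""
    missing = 0
    invalid = 0
    for row in rows:
        for column in QUESTION_COLUMNS:
            value = (row.get(column) or "").strip().upper()
            if value == "":
                missing += 1
            elif value not in ("TRUE", "FALSE"):
                invalid += 1
    return missing, invalid


# Made-up sample rows, NOT taken from the dataset.
sample_rows = [
    {"tabular": "TRUE", "semantic": "FALSE", "definition list": "FALSE",
     "header row": "TRUE", "header column": "FALSE"},
    {"tabular": "TRUE", "semantic": "", "definition list": "FALSE",
     "header row": "maybe", "header column": "FALSE"},
]
missing, invalid = audit_values(sample_rows)
print(missing, invalid)
```

    On a sheet as clean as the one described, both counts would be zero.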

    Assumptions

    Based on the clean state of the data and the guidelines of the Hive Data Audit Prompt, the assumption is that duplicate values in the “file” column are acceptable and should not be removed. Similarly, duplicated values in the “object id” column are acceptable and should not be removed. The data is therefore clean and is ready for analysis/auditing.

    Preparation

    The purpose of the audit is to analyze the accuracy of the yes/no worker judgement of each question according to the guidelines of the background facts. The background facts are as follows:

    • A table that is a definition list should automatically be tabular and also semantic
    • Semantic tables should automatically be tabular
    • If a table is NOT tabular, then it is definitely not semantic nor a definition list
    • A tabular table that has a header row OR header column should definitely be semantic
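
    Read literally, the four background facts are Boolean implications that can be tested row by row. The sketch below is one possible Python reading of them, not the author's Excel method; the function names are hypothetical, and treating any violated fact as a reason to rerun the job is an assumption:

```python
def violated_facts(tabular, semantic, definition_list, header_row, header_column):
    """Return the background facts that one row of answers contradicts."""
    violations = []
    if definition_list and not (tabular and semantic):
        violations.append("definition list must be tabular and semantic")
    if semantic and not tabular:
        violations.append("semantic must be tabular")
    if not tabular and (semantic or definition_list):
        violations.append("non-tabular cannot be semantic or a definition list")
    if tabular and (header_row or header_column) and not semantic:
        violations.append("tabular with a header row or column must be semantic")
    return violations


def needs_rerun(**answers):
    """Assumption: a job is flagged for rerun if any fact is violated."""
    return bool(violated_facts(**answers))


# A tabular table with a header row that was not marked semantic.
print(violated_facts(tabular=True, semantic=False, definition_list=False,
                     header_row=True, header_column=False))
```

    Some facts overlap (a semantic, non-tabular row breaks two of them), so the number of violations is less informative than whether the list is empty.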

    These background facts serve as instructions for how the answers to the five questions should interact with one another. These facts can be re-written to establish criteria for each question:

    For tabular column:
    - If the table is a definition list, it is also tabular
    - If the table is semantic, it is also tabular

    For semantic column:
    - If the table is a definition list, it is also semantic
    - If the table is not tabular, it is not semantic
    - If the table is tabular and has either a header row or a header column...

  6. Ten-year data tables by province, industry and substance – releases

    • open.canada.ca
    • ouvert.canada.ca
    csv, html
    Updated Dec 5, 2024
    Cite
    Environment and Climate Change Canada (2024). Ten-year data tables by province, industry and substance – releases [Dataset]. https://open.canada.ca/data/en/dataset/ea0dc8ae-d93c-4e24-9f61-946f1736a26f
    Available download formats: html, csv
    Dataset updated
    Dec 5, 2024
    Dataset provided by
    Environment and Climate Change Canada (https://www.canada.ca/en/environment-climate-change.html)
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Time period covered
    Jan 1, 2014 - Dec 31, 2023
    Description

    The National Pollutant Release Inventory (NPRI) is Canada's public inventory of pollutant releases (to air, water and land), disposals and transfers for recycling. Each file contains annual total releases for the past ten years by media (air, water or land), broken down by province, industry or substance. Files are in .CSV format.

    The results can be further broken down using the pre-defined search available at the bottom of the NPRI Data Search webpage. The results returned by the NPRI search engine may differ from the numbers contained in the downloadable files. The online search engine’s results display releases, disposals and transfers reported by facilities, but do not distinguish between media type (i.e. air, water, land). They also include facilities reporting only under Ontario Regulation 127/01 and facilities submitting “did not meet criteria” reports.

    Please consult the following resources to enhance your analysis:
    - Guide on using and Interpreting NPRI Data: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/using-interpreting-data.html
    - Access additional data from the NPRI, including datasets and mapping products: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/exploredata.html

    Supplemental Information
    More NPRI datasets and mapping products are available here: https://www.canada.ca/en/environment-climate-change/services/national-pollutant-release-inventory/tools-resources-data/access.html

    Supporting Projects: National Pollutant Release Inventory (NPRI)

  7. Analysis of CBCS publications for Open Access, data availability statements...

    • figshare.scilifelab.se
    • researchdata.se
    txt
    Updated Jan 15, 2025
    Cite
    Theresa Kieselbach (2025). Analysis of CBCS publications for Open Access, data availability statements and persistent identifiers for supplementary data [Dataset]. http://doi.org/10.17044/scilifelab.23641749.v1
    Available download formats: txt
    Dataset updated
    Jan 15, 2025
    Dataset provided by
    Umeå University
    Authors
    Theresa Kieselbach
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    General description
    This dataset contains some markers of Open Science in the publications of the Chemical Biology Consortium Sweden (CBCS) between 2010 and July 2023. The sample of CBCS publications during this period consists of 188 articles. Every publication was visited manually at its DOI URL to answer the following questions.
    1. Is the research article an Open Access publication?
    2. Does the research article have a Creative Commons license or a similar license?
    3. Does the research article contain a data availability statement?
    4. Did the authors submit data of their study to a repository such as EMBL, Genbank, Protein Data Bank PDB, Cambridge Crystallographic Data Centre CCDC, Dryad or a similar repository?
    5. Does the research article contain supplementary data?
    6. Do the supplementary data have a persistent identifier that makes them citable as a defined research output?

    Variables
    The data were compiled in a Microsoft Excel 365 document that includes the following variables.
    1. DOI URL of research article
    2. Year of publication
    3. Research article published with Open Access
    4. License for research article
    5. Data availability statement in article
    6. Supplementary data added to article
    7. Persistent identifier for supplementary data
    8. Authors submitted data to NCBI or EMBL or PDB or Dryad or CCDC

    Visualization
    Parts of the data were visualized in two figures as bar diagrams using Microsoft Excel 365. The first figure displays the number of publications during a year, the number of publications that are published with open access and the number of publications that contain a data availability statement (Figure 1). The second figure shows the number of publications per year and how many publications contain supplementary data. This figure also shows how many of the supplementary datasets have a persistent identifier (Figure 2).

    File formats and software
    The file formats used in this dataset are:
    • .csv (Text file)
    • .docx (Microsoft Word 365 file)
    • .jpg (JPEG image file)
    • .pdf/A (Portable Document Format for archiving)
    • .png (Portable Network Graphics image file)
    • .pptx (Microsoft Power Point 365 file)
    • .txt (Text file)
    • .xlsx (Microsoft Excel 365 file)
    All files can be opened with Microsoft Office 365 and likely also work with the older versions Office 2019 and 2016.

    MD5 checksums
    Here is a list of all files of this dataset and of their MD5 checksums.
    1. Readme.txt (MD5: 795f171be340c13d78ba8608dafb3e76)
    2. Manifest.txt (MD5: 46787888019a87bb9d897effdf719b71)
    3. Materials_and_methods.docx (MD5: 0eedaebf5c88982896bd1e0fe57849c2)
    4. Materials_and_methods.pdf (MD5: d314bf2bdff866f827741d7a746f063b)
    5. Materials_and_methods.txt (MD5: 26e7319de89285fc5c1a503d0b01d08a)
    6. CBCS_publications_until_date_2023_07_05.xlsx (MD5: 532fec0bd177844ac0410b98de13ca7c)
    7. CBCS_publications_until_date_2023_07_05.csv (MD5: 2580410623f79959c488fdfefe8b4c7b)
    8. Data_from_CBCS_publications_until_date_2023_07_05_obtained_by_manual_collection.xlsx (MD5: 9c67dd84a6b56a45e1f50a28419930e5)
    9. Data_from_CBCS_publications_until_date_2023_07_05_obtained_by_manual_collection.csv (MD5: fb3ac69476bfc57a8adc734b4d48ea2b)
    10. Aggregated_data_from_CBCS_publications_until_2023_07_05.xlsx (MD5: 6b6cbf3b9617fa8960ff15834869f793)
    11. Aggregated_data_from_CBCS_publications_until_2023_07_05.csv (MD5: b2b8dd36ba86629ed455ae5ad2489d6e)
    12. Figure_1_CBCS_publications_until_2023_07_05_Open_Access_and_data_availablitiy_statement.xlsx (MD5: 9c0422cf1bbd63ac0709324cb128410e)
    13. Figure_1.pptx (MD5: 55a1d12b2a9a81dca4bb7f333002f7fe)
    14. Image_of_figure_1.jpg (MD5: 5179f69297fbbf2eaaf7b641784617d7)
    15. Image_of_figure_1.png (MD5: 8ec94efc07417d69115200529b359698)
    16. Figure_2_CBCS_publications_until_2023_07_05_supplementary_data_and_PID_for_supplementary_data.xlsx (MD5: f5f0d6e4218e390169c7409870227a0a)
    17. Figure_2.pptx (MD5: 0fd4c622dc0474549df88cf37d0e9d72)
    18. Image_of_figure_2.jpg (MD5: c6c68b63b7320597b239316a1c15e00d)
    19. Image_of_figure_2.png (MD5: 24413cc7d292f468bec0ac60cbaa7809)
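
    The listed MD5 checksums make it possible to confirm that a downloaded file is intact. A minimal sketch in Python using the standard `hashlib` module; the helper names are illustrative, and the local file path is an assumption about where the file was saved:

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with a published checksum, ignoring case."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

    For example, with Readme.txt saved in the working directory, `matches_checksum("Readme.txt", "795f171be340c13d78ba8608dafb3e76")` should return True if the download is complete and unmodified.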

  8. Replication Package - How Do Requirements Evolve During Elicitation? An...

    • zenodo.org
    bin, zip
    Updated Apr 21, 2022
    Cite
    Alessio Ferrari; Paola Spoletini; Sourav Debnath (2022). Replication Package - How Do Requirements Evolve During Elicitation? An Empirical Study Combining Interviews and App Store Analysis [Dataset]. http://doi.org/10.5281/zenodo.6472498
    Available download formats: bin, zip
    Dataset updated
    Apr 21, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alessio Ferrari; Paola Spoletini; Sourav Debnath
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the replication package for the paper titled "How Do Requirements Evolve During Elicitation? An Empirical Study Combining Interviews and App Store Analysis", by Alessio Ferrari, Paola Spoletini and Sourav Debnath.

    The package contains the following folders and files.

    /R-analysis

    This is a folder containing all the R implementations of the statistical tests included in the paper, together with the source .csv file used to produce the results. Each R file has the same title as the associated .csv file. The titles of the files reflect the RQs as they appear in the paper. The association between R files and Tables in the paper is as follows:

    - RQ1-1-analyse-story-rates.R: Table 1, user story rates

    - RQ1-1-analyse-role-rates.R: Table 1, role rates

    - RQ1-2-analyse-story-category-phase-1.R: Table 3, user story category rates in phase 1 compared to original rates

    - RQ1-2-analyse-role-category-phase-1.R: Table 5, role category rates in phase 1 compared to original rates

    - RQ2.1-analysis-app-store-rates-phase-2.R: Table 8, user story and role rates in phase 2

    - RQ2.2-analysis-percent-three-CAT-groups-ph1-ph2.R: Table 9, comparison of the categories of user stories in phase 1 and 2

    - RQ2.2-analysis-percent-two-CAT-roles-ph1-ph2.R: Table 10, comparison of the categories of roles in phase 1 and 2.

    The .csv files used for statistical tests are also used to produce boxplots. The association between boxplot figures and files is as follows.

    - RQ1-1-story-rates.csv: Figure 4

    - RQ1-1-role-rates.csv: Figure 5

    - RQ1-2-categories-phase-1.csv: Figure 8

    - RQ1-2-role-category-phase-1.csv: Figure 9

    - RQ2-1-user-story-and-roles-phase-2.csv: Figure 13

    - RQ2.2-percent-three-CAT-groups-ph1-ph2.csv: Figure 14

    - RQ2.2-percent-two-CAT-roles-ph1-ph2.csv: Figure 17

    - IMG-only-RQ2.2-us-category-comparison-ph1-ph2.csv: Figure 15

    - IMG-only-RQ2.2-frequent-roles.csv: Figure 18

    NOTE: The last two .csv files do not have associated statistical tests, but are used solely to produce boxplots.

    /Data-Analysis

    This folder contains all the data used to answer the research questions.

    RQ1.xlsx: includes all the data associated to RQ1 subquestions, two tabs for each subquestion (one for user stories and one for roles). The names of the tabs are self-explanatory of their content.

    RQ2.1.xlsx: includes all the data for the RQ2.1 subquestion. Specifically, it includes the following tabs:

    - Data Source-US-category: for each category of user story, and for each analyst, there are two lines. The first one reports the number of user stories in that category for phase 1, and the second one reports the number of user stories in that category for phase 2, considering the specific analyst.

    - Data Source-role: for each category of role, and for each analyst, there are two lines. The first one reports the number of user stories in that role for phase 1, and the second one reports the number of user stories in that role for phase 2, considering the specific analyst.

    - RQ2.1 rates: reports the final rates for RQ2.1.

    NOTE: The other tabs are used to support the computation of the final rates.

    RQ2.2.xlsx: includes all the data for the RQ2.2 subquestion. Specifically, it includes the following tabs:

    - Data Source-US-category: same as RQ2.1.xlsx

    - Data Source-role: same as RQ2.1.xlsx

    - RQ2.2-category-group: comparison between groups of categories in the different phases, used to produce Figure 14

    - RQ2.2-role-group: comparison between role groups in the different phases, used to produce Figure 17

    - RQ2.2-specific-roles-diff: difference between specific roles, used to produce Figure 18

    NOTE: the other tabs are used to support the computation of the values reported in the tabs above.

    RQ2.2-single-US-category.xlsx: includes the data for the RQ2.2 subquestion associated to single categories of user stories.

    A separate tab is used given the complexity of the computations.

    - Data Source-US-category: same as RQ2.1.xlsx

    - Totals: total number of user stories for each analyst in phase 1 and phase 2

    - Results-Rate-Comparison: difference between rates of user stories in phase 1 and phase 2, used to produce the file "img/IMG-only-RQ2.2-us-category-comparison-ph1-ph2.csv", which is in turn used to produce Figure 15

    - Results-Analysts: number of analysts using each novel category produced in phase 2, used to produce Figure 16.

    NOTE: the other tabs are used to support the computation of the values reported in the tabs above.

    RQ2.3.xlsx: includes the data for the RQ2.3 subquestion. Specifically, it includes the following tabs:

    - Data Source-US-category: same as RQ2.1.xlsx

    - Data Source-role: same as RQ2.1.xlsx

    - RQ2.3-categories: novel categories produced in phase 2, used to produce Figure 19

    - RQ2-3-most-frequent-categories: most frequent novel categories

    /Raw-Data-Phase-I

    The folder contains one Excel file for each analyst, s1.xlsx...s30.xlsx, plus the file of the original user stories with annotations (original-us.xlsx). Each file contains two tabs:

    - Evaluation: includes the annotation of each user story as an existing user story in the original categories (annotated with "E"), a novel user story in an existing category (refinement, annotated with "N"), or a novel user story in a novel category (name of the category in column "New Feature"). **NOTE 1:** In the paper, the "refinement" case is said to be annotated with "R" (instead of "N", as in the files) to make the paper clearer and easier to read.

    - Roles: roles used in the user stories, and count of the user stories belonging to a certain role.

    /Raw-Data-Phaes-II

    The folder contains one Excel file for each analyst, s1.xlsx...s30.xlsx. Each file contains two tabs:

    - Analysis: includes the annotation of the user stories as belonging to an existing original category (X), or to categories introduced after interviews, or to categories introduced after app store inspired elicitation (name of category in "Cat. Created in PH1"), or to entirely novel categories (name of category in "New Category").

    - Roles: roles used in the user stories, and count of the user stories belonging to a certain role.

    /Figures

    This folder includes the figures reported in the paper. The boxplots are generated from the data using the tool http://shiny.chemgrid.org/boxplotr/. The histograms and other plots are produced with Excel, and are also reported in the Excel files listed above.

  9. m

    Metropolitan Lagos Dataset on Customers' Perception Ratings of Problems...

    • data.mendeley.com
    Updated Feb 22, 2021
    Jonathan Tsetimi (2021). Metropolitan Lagos Dataset on Customers' Perception Ratings of Problems Associated with Electricity Distribution [Dataset]. http://doi.org/10.17632/jddmfmy7ry.2
    Explore at:
    Dataset updated
    Feb 22, 2021
    Authors
    Jonathan Tsetimi
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Metropolitan Lagos dataset consists of the files (i) tsetimi_lagos_dataset.sav and (ii) tsetimi_lagos_dataset.xlxs. The two files contain the same number of records (377) and same information. The first file is in IBM SPSS database format while the second is in Microsoft Excel spreadsheet format. The SPSS database format can be accessed in the data view of SPSS. The fieldnames, field descriptions and field types are self-contained in the SPSS database file.

    The dataset is part of a nationwide survey on the problems associated with electricity distribution and generation in Nigeria. A pilot survey [1] of this research was conducted in Delta State, South-South, Nigeria. The files for the pilot survey are available in [2]. The survey for the Lagos dataset was conducted by means of a well-structured questionnaire administered by trained interviewers. The questionnaire collected information on respondents' bio-data, their experience with the services of their distribution companies, and the problems with electricity distribution observed during the fieldwork.

    The electricity customers' perception ratings of the services of distribution companies were on a five-point scale based on the following metrics adapted from [3]:

    i. Overall satisfaction with services of distribution company
    ii. Quality and reliability of power from distribution company
    iii. Reasonableness of bills from distribution company
    iv. Billing system of distribution company
    v. Corporate image of distribution company
    vi. Effectiveness of communication of distribution company with stakeholders
    vii. Customer service of the distribution company

    The respondents scored the metrics between 0 and 5 inclusive depending on their perception of the above metrics. The scores of the respondents on the observed problems were based on the following items:

    i. Low voltage
    ii. Incessant power outages
    iii. Load shedding
    iv. Inadequate number of meters
    v. Inadequate distribution lines
    vi. Unreasonable price of power
    vii. Illegal connections
    viii. Inadequate number of transformers
    ix. Stealing of distribution facilities

    The respondents assigned a score between 0 and 10 inclusive depending on their perception of the level of severity of the observed problems.
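    The two scoring ranges described above (0 to 5 for the perception metrics, 0 to 10 for the observed problems) lend themselves to a simple consistency check when loading the records. The following is a minimal sketch only: the field names are hypothetical, since the actual field names are defined inside the SPSS file and are not listed in this description.

```python
# Hypothetical field names; the real ones are stored in the SPSS file.
SATISFACTION_RANGE = (0, 5)   # perception metrics i-vii
PROBLEM_RANGE = (0, 10)       # observed problems i-ix


def out_of_range(record, fields, bounds):
    """Return the fields whose score is missing or outside the inclusive bounds."""
    low, high = bounds
    return [
        field for field in fields
        if record.get(field) is None or not (low <= record[field] <= high)
    ]


# One illustrative respondent record (made-up values, not taken from the dataset).
record = {"overall_satisfaction": 4, "billing_system": 6, "low_voltage": 9}

bad_metrics = out_of_range(
    record, ["overall_satisfaction", "billing_system"], SATISFACTION_RANGE
)
bad_problems = out_of_range(record, ["low_voltage"], PROBLEM_RANGE)
print(bad_metrics, bad_problems)
```

    Keeping the bounds as named constants makes it easy to apply the same check to every record of either file.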

    References

    [1] J. Tsetimi, A. O. Atonuje and E. J. Mamadu. An Analysis of a Pilot Survey of the Problems of Electricity Distribution in Delta State, Nigeria. Transactions of Nigerian Institution of Mathematical Physics. 2020; 12(7): 109-116.

    [2] J. Tsetimi. Customers' Problems with Electricity Distribution in Delta State Nigeria, [dataset], Mendeley Data, V1, doi: 10.17632/msrhyv489k.1. 2020. Accessed 16th February, 2021. Available: http://dx.doi.org/10.17632/msrhyv489k.1

    [3] D. Smith, S. Nayak, M. Karig, I. Kosnik, M. Konya, K. Lovett, Z. Liu, and H. Luvai. Assessing Residential Customer Satisfaction for Large Electric Utilities. UMSL, Department of Economics Working Papers. (2011).

  10. i

    Household Health Survey 2012-2013, Economic Research Forum (ERF)...

    • datacatalog.ihsn.org
    • catalog.ihsn.org
    Updated Jun 26, 2017
    Central Statistical Organization (CSO) (2017). Household Health Survey 2012-2013, Economic Research Forum (ERF) Harmonization Data - Iraq [Dataset]. https://datacatalog.ihsn.org/catalog/6937
    Explore at:
    Dataset updated
    Jun 26, 2017
    Dataset provided by
    Economic Research Forum
    Kurdistan Regional Statistics Office (KRSO)
    Central Statistical Organization (CSO)
    Time period covered
    2012 - 2013
    Area covered
    Iraq
    Description

    Abstract

    The harmonized data set on health, created and published by the ERF, is a subset of Iraq Household Socio Economic Survey (IHSES) 2012. It was derived from the household, individual and health modules, collected in the context of the above mentioned survey. The sample was then used to create a harmonized health survey, comparable with the Iraq Household Socio Economic Survey (IHSES) 2007 micro data set.

    ----> Overview of the Iraq Household Socio Economic Survey (IHSES) 2012:

    Iraq is considered a leader in household expenditure and income surveys: the first was conducted in 1946, followed by surveys in 1954 and 1961. After the establishment of the Central Statistical Organization, household expenditure and income surveys were carried out every 3-5 years (1971/1972, 1976, 1979, 1984/1985, 1988, 1993, 2002/2007). As part of the cooperation between the CSO and the WB, the Central Statistical Organization (CSO) and the Kurdistan Region Statistics Office (KRSO) launched fieldwork on IHSES on 1/1/2012. The survey was carried out over a full year, covering all governorates including those in Kurdistan Region.

    The survey has six main objectives. These objectives are:

    1. Provide data for poverty analysis and measurement, and monitor, evaluate and update the implementation of the Poverty Reduction National Strategy issued in 2009.
    2. Provide a comprehensive data system to assess household social and economic conditions and prepare the indicators related to human development.
    3. Provide data that meet the needs and requirements of national accounts.
    4. Provide detailed indicators on consumption expenditure that support decision-making related to production, consumption, export and import.
    5. Provide detailed indicators on the sources of household and individual income.
    6. Provide the data necessary for formulating a new consumer price index.

    The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum, to create a comparable version with the 2006/2007 Household Socio Economic Survey in Iraq. Harmonization at this stage only included unifying variables' names, labels and some definitions. See: Iraq 2007 & 2012- Variables Mapping & Availability Matrix.pdf provided in the external resources for further information on the mapping of the original variables on the harmonized ones, in addition to more indications on the variables' availability in both survey years and relevant comments.

    Geographic coverage

    National coverage: Covering a sample of urban, rural and metropolitan areas in all the governorates including those in Kurdistan Region.

    Analysis unit

    1- Household/family. 2- Individual/person.

    Universe

    The survey was carried out over a full year covering all governorates including those in Kurdistan Region.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    ----> Design:

    The sample size was 25488 households for the whole of Iraq: 216 households in each of the 118 districts, organized in 2832 clusters of 9 households each, distributed across districts and governorates for rural and urban areas.

    ----> Sample frame:

    The listing and numbering results of the 2009-2010 Population and Housing Survey were adopted in all the governorates, including Kurdistan Region, as a frame to select households. The sample was selected in two stages. Stage 1: Primary sampling units (blocks) within each stratum (district) for urban and rural areas were systematically selected with probability proportional to size to reach 2832 units (clusters). Stage 2: 9 households from each primary sampling unit were selected to create a cluster. The total sample size was thus 25488 households distributed across the governorates, with 216 households in each district.

    ----> Sampling Stages:

    In each district, the sample was selected in two stages. Stage 1: Based on the 2010 listing and numbering frame, 24 sample points were selected within each stratum through systematic sampling with probability proportional to size, in addition to the implicit urban and rural breakdown and the geographic breakdown (sub-district, quarter, street, county, village and block). Stage 2: Using households as secondary sampling units, 9 households were selected from each sample point using systematic equal probability sampling. Sampling frames for each stage can be developed based on the 2010 building listing and numbering without updating household lists. In some small districts, the random selection of primary sampling units may yield fewer than 24 units; in that case a sampling unit is selected more than once, and two or more clusters may be drawn from the same enumeration unit when necessary.
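    The sample-size figures quoted in the design, frame and stage descriptions are mutually consistent, which can be verified with a few lines of arithmetic. The sketch below uses only the numbers stated above (118 districts, 24 sample points per district, 9 households per sample point); the variable names are illustrative.

```python
districts = 118
points_per_district = 24      # primary sampling units selected per district
households_per_cluster = 9    # households selected per sample point

clusters = districts * points_per_district
households_per_district = points_per_district * households_per_cluster
total_households = clusters * households_per_cluster

print(clusters)                  # 2832
print(households_per_district)   # 216
print(total_households)          # 25488
```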

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    ----> Preparation:

    The questionnaire of the 2006 survey was adopted as the basis for the questionnaire of the 2012 survey, to which many revisions were made. Two rounds of pre-tests were carried out. Revisions were made based on the feedback of the fieldwork team, World Bank consultants and others, and further revisions were made before the final version was implemented in a pilot survey in September 2011. After the pilot survey, additional revisions were made based on the challenges and feedback that emerged during its implementation, leading to the final version used in the actual survey.

    ----> Questionnaire Parts:

    The questionnaire consists of four parts, each with several sections:

    Part 1: Socio-Economic Data:

    - Section 1: Household Roster
    - Section 2: Emigration
    - Section 3: Food Rations
    - Section 4: Housing
    - Section 5: Education
    - Section 6: Health
    - Section 7: Physical measurements
    - Section 8: Job seeking and previous job

    Part 2: Monthly, Quarterly and Annual Expenditures:

    - Section 9: Expenditures on Non-Food Commodities and Services (past 30 days)
    - Section 10: Expenditures on Non-Food Commodities and Services (past 90 days)
    - Section 11: Expenditures on Non-Food Commodities and Services (past 12 months)
    - Section 12: Expenditures on Non-food Frequent Food Stuff and Commodities (7 days)
    - Section 12, Table 1: Meals Had Within the Residential Unit
    - Section 12, Table 2: Number of Persons Participate in the Meals within Household Expenditure Other Than its Members

    Part 3: Income and Other Data:

    - Section 13: Job
    - Section 14: Paid jobs
    - Section 15: Agriculture, forestry and fishing
    - Section 16: Household non-agricultural projects
    - Section 17: Income from ownership and transfers
    - Section 18: Durable goods
    - Section 19: Loans, advances and subsidies
    - Section 20: Shocks and strategy of dealing in the households
    - Section 21: Time use
    - Section 22: Justice
    - Section 23: Satisfaction in life
    - Section 24: Food consumption during past 7 days

    Part 4: Diary of Daily Expenditures: The diary of expenditure is an essential component of this survey. It is left with the household to record all daily purchases, such as expenditures on food and frequent non-food items (gasoline, newspapers, etc.), during 7 days. Two pages were allocated for recording the expenditures of each day, so the roster consists of 14 pages.

    Cleaning operations

    ----> Raw Data:

    Data Editing and Processing: To ensure accuracy and consistency, the data were edited at the following stages:

    1. Interviewer: Checks all answers on the household questionnaire, confirming that they are clear and correct.
    2. Local Supervisor: Checks to make sure that questions have been correctly completed.
    3. Statistical analysis: After exporting data files from Excel to SPSS, the Statistical Analysis Unit uses program commands to identify irregular or non-logical values, in addition to auditing some variables.
    4. World Bank consultants in coordination with the CSO data management team: The World Bank technical consultants use additional programs in SPSS and STAT to examine and correct remaining inconsistencies within the data files. The software detects errors by analyzing questionnaire items according to the expected parameter for each variable.

    ----> Harmonized Data:

    • The SPSS package is used to harmonize the Iraq Household Socio Economic Survey (IHSES) 2007 with Iraq Household Socio Economic Survey (IHSES) 2012.
    • The harmonization process starts with raw data files received from the Statistical Office.
    • A program is generated for each dataset to create harmonized variables.
    • Data is saved on the household and individual level, in SPSS and then converted to STATA, to be disseminated.

    Response rate

    The Iraq Household Socio Economic Survey (IHSES) reached a total of 25488 households. The number of households that refused to respond was 305, and the response rate was 98.6%. The highest interview rates were in Ninevah and Muthanna (100%), while the lowest rate was in Sulaimaniya (92%).

  11. Enterprise Survey 2009-2019, Panel Data - Slovenia

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated Aug 6, 2020
    European Bank for Reconstruction and Development (EBRD) (2020). Enterprise Survey 2009-2019, Panel Data - Slovenia [Dataset]. https://microdata.worldbank.org/index.php/catalog/3762
    Explore at:
    Dataset updated
    Aug 6, 2020
    Dataset provided by
    World Bank Group http://www.worldbank.org/
    World Bank http://worldbank.org/
    European Bank for Reconstruction and Development http://ebrd.com/
    European Investment Bank (EIB)
    Time period covered
    2008 - 2019
    Area covered
    Slovenia
    Description

    Abstract

    The documentation covers Enterprise Survey panel datasets that were collected in Slovenia in 2009, 2013 and 2019.

    The Slovenia ES 2009 was conducted between 2008 and 2009. The Slovenia ES 2013 was conducted between March 2013 and September 2013. Finally, the Slovenia ES 2019 was conducted between December 2018 and November 2019. The objective of the Enterprise Survey is to gain an understanding of what firms experience in the private sector.

    As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.

    Geographic coverage

    National

    Analysis unit

    The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must take its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.

    Universe

    As it is standard for the ES, the Slovenia ES was based on the following size stratification: small (5 to 19 employees), medium (20 to 99 employees), and large (100 or more employees).

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The samples for Slovenia ES 2009, 2013 and 2019 were selected using stratified random sampling, following the methodology explained in the Sampling Manual for Slovenia 2009 ES and for Slovenia 2013 ES, and in the Sampling Note for 2019 Slovenia ES.

    Three levels of stratification were used in this country: industry, establishment size, and oblast (region). The original sample designs with specific information of the industries and regions chosen are included in the attached Excel file (Sampling Report.xls.) for Slovenia 2009 ES. For Slovenia 2013 and 2019 ES, specific information of the industries and regions chosen is described in the "The Slovenia 2013 Enterprise Surveys Data Set" and "The Slovenia 2019 Enterprise Surveys Data Set" reports respectively, Appendix E.

    For the Slovenia 2009 ES, industry stratification was designed as follows: the universe was stratified into manufacturing industries, services industries, and one residual (core) sector as defined in the sampling manual. Each industry had a target of 90 interviews. For the manufacturing industries, sample sizes were inflated by about 17% to account for potential non-response when requesting sensitive financial data, and also because of likely attrition in future surveys that would affect the construction of a panel. For the other industries (residuals), sample sizes were inflated by about 12% to account for under-sampling of firms in service industries.

    For Slovenia 2013 ES, industry stratification was designed as follows: the universe was stratified into one manufacturing industry and two service industries (retail, and other services).

    Finally, for Slovenia 2019 ES, three levels of stratification were used in this country: industry, establishment size, and region. The original sample design with specific information of the industries and regions chosen is described in "The Slovenia 2019 Enterprise Surveys Data Set" report, Appendix C. Industry stratification was done as follows: Manufacturing – combining all the relevant activities (ISIC Rev. 4.0 codes 10-33), Retail (ISIC 47), and Other Services (ISIC 41-43, 45, 46, 49-53, 55, 56, 58, 61, 62, 79, 95).

    For Slovenia 2009 and 2013 ES, size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers. This seems to be an appropriate definition of the labor force since seasonal/casual/part-time employment is not a common practice, except in the sectors of construction and agriculture.
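    The size stratification rule can be written as a small classifier. This is a minimal sketch, not part of the survey tooling: the function name is hypothetical, and because the text does not say how establishments with fewer than 5 employees are handled, the sketch simply returns None for them.

```python
from typing import Optional


def size_stratum(permanent_full_time_workers: int) -> Optional[str]:
    """Map a count of permanent full-time workers to an ES size stratum."""
    if permanent_full_time_workers < 5:
        return None  # assumption: below the smallest stratum described in the text
    if permanent_full_time_workers <= 19:
        return "small"
    if permanent_full_time_workers <= 99:
        return "medium"
    return "large"


for workers in (4, 5, 19, 20, 99, 100):
    print(workers, size_stratum(workers))
```

    The boundary values 19/20 and 99/100 are the ones worth checking, since "more than 99 employees" and "100 or more employees" describe the same cut-off.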

    For Slovenia 2009 ES, regional stratification was defined in 2 regions. These regions are Vzhodna Slovenija and Zahodna Slovenija. The Slovenia sample contains panel data. The wave 1 panel “Investment Climate Private Enterprise Survey implemented in Slovenia” consisted of 223 establishments interviewed in 2005. A total of 57 establishments have been re-interviewed in the 2008 Business Environment and Enterprise Performance Survey.

    For Slovenia 2013 ES, regional stratification was defined in 2 regions (city and the surrounding business area) throughout Slovenia.

    Finally, for Slovenia 2019 ES, regional stratification was done across two regions: Eastern Slovenia (NUTS code SI03) and Western Slovenia (SI04).

    Mode of data collection

    Computer Assisted Personal Interview [capi]

    Research instrument

    Questionnaires have common questions (core module) and, respectively, additional manufacturing- and services-specific questions. The eligible manufacturing industries have been surveyed using the Manufacturing questionnaire (includes the core module, plus manufacturing-specific questions). Retail firms have been interviewed using the Services questionnaire (includes the core module plus retail-specific questions) and the residual eligible services have been covered using the Services questionnaire (includes the core module). Each variation of the questionnaire is identified by the index variable, a0.

    Response rate

    Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.

    Item non-response was addressed by two strategies:

    a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect the refusal to respond as (-8).
    b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary.

    However, there were clear cases of low response.
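    When analysing the data, the refusal code (-8) has to be separated from genuine answers before computing any statistic. The following sketch shows one way to do this for a single item; the function name and the sample values are illustrative and not taken from the dataset.

```python
REFUSAL_CODE = -8  # refusal to respond, as recorded by enumerators


def summarize_item(values):
    """Split one item's responses into answers and refusals.

    Returns (answers, refusal_count, refusal_rate).
    """
    answers = [v for v in values if v != REFUSAL_CODE]
    refusal_count = len(values) - len(answers)
    refusal_rate = refusal_count / len(values) if values else 0.0
    return answers, refusal_count, refusal_rate


# Made-up responses to one sensitive question.
answers, refusal_count, refusal_rate = summarize_item([3, -8, 1, -8, 5])
print(answers, refusal_count, refusal_rate)
```

    Leaving -8 in place would silently bias means and totals downward, which is why the filter comes first.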

    For 2009 and 2013 Slovenia ES, the survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Up to 4 attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals. Further research is needed on survey non-response in the Enterprise Surveys regarding potential introduction of bias.

    For 2009, the number of contacted establishments per realized interview was 6.18. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey), and the quality of the sample frame, as represented by the presence of ineligible units. The relatively low ratio of contacted establishments per realized interview (6.18) suggests that the main source of error in estimates for Slovenia may be selection bias and not frame inaccuracy.

    For 2013, the number of realized interviews per contacted establishment was 25%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The number of rejections per contact was 44%.

    Finally, for 2019, the number of interviews per contacted establishments was 9.7%. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The share of rejections per contact was 75.2%.
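    The 2009 figure is reported as contacts per interview, while the 2013 and 2019 figures are reported as interviews per contact. The two measures are reciprocals of each other, so they can be placed on a common scale. The sketch below uses only the three values quoted above; the converted values are derived here and are not stated in the survey documentation.

```python
# Reported values: 2009 as contacts per interview, 2013 and 2019 as interviews per contact.
contacts_per_interview_2009 = 6.18
interviews_per_contact = {
    2009: 1 / contacts_per_interview_2009,
    2013: 0.25,
    2019: 0.097,
}

# Reciprocal view: how many establishments were contacted per realized interview.
contacts_per_interview = {
    year: 1 / share for year, share in interviews_per_contact.items()
}

for year in sorted(interviews_per_contact):
    print(year,
          f"{interviews_per_contact[year]:.1%}",
          f"{contacts_per_interview[year]:.2f}")
```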

  12. o

    Data from: Dataset from Probst C, Globig A, Knoll B, Conraths FJ, Depner K....

    • openagrar.de
    Updated May 9, 2017
    Carolina Probst; Anja Globig; B. Knoll; Franz Josef Conraths; Klaus Robert Depner (2017). Dataset from Probst C, Globig A, Knoll B, Conraths FJ, Depner K. 2017 Behaviour of free ranging wild boar towards their dead fellows: potential implications for the transmission of African swine fever. R. Soc. open sci. 4:170054. http://dx.doi.org/10.1098/rsos.170054 [Dataset]. https://www.openagrar.de/receive/openagrar_mods_00026443
    Explore at:
    Dataset updated
    May 9, 2017
    Authors
    Carolina Probst; Anja Globig; B. Knoll; Franz Josef Conraths; Klaus Robert Depner
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set includes details of wild boar visits at the carcass sites based on the evaluation of pictures (Excel sheet 1 “Visits”; Excel sheet 2 “Visits summary”). It also includes pictures displaying the typical behavior of wild boar at the carcass sites.

    Overview:

    - Supporting data Table 1 - Details pictures
    - ESM Figure 1 - Wild boar rooting at site 3
    - ESM Figure 2 - Wild boar are curious, but do not touch
    - ESM Figure 3 - Wild boar attracted by the soft ground
    - ESM Figure 4 - Wild boar chewing on bone
    - ESM Figure 5 - Wild boar rolling on soft ground on site 3
    - ESM Figure 6 - Ground underneath carcass 3 is stirred up
    - ESM Figure 7 - Wild boar feeding on wild ruminant
    - ESM Figure 8 - Skeletonization process of carcass 1

  13. Z

    Measuring Bulk Crystallographic Texture from Ti-6Al-4V Hot-Rolled Sample...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Feb 3, 2023
    Daniel, Christopher Stuart (2023). Measuring Bulk Crystallographic Texture from Ti-6Al-4V Hot-Rolled Sample Matrices using Synchrotron X-ray Diffraction (Analysis Dataset) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7437908
    Explore at:
    Dataset updated
    Feb 3, 2023
    Dataset provided by
    Daniel, Christopher Stuart
    Quinta da Fonseca, João
    Zeng, Xiaohan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A dataset of synchrotron X-ray diffraction (SXRD) analysis files, recording the refinement of crystallographic texture from a number of Ti-6Al-4V (Ti-64) sample matrices, containing a total of 93 hot-rolled samples, from three different orthogonal sample directions. The aim of the work was to accurately quantify bulk macro-texture for both the α (hexagonal close packed, hcp) and β (body-centred cubic, bcc) phases across a range of different processing conditions.

    Material

    Prior to the experiment, the Ti-64 materials had been hot-rolled at a range of different temperatures, and to different reductions, followed by air-cooling, using a rolling mill at The University of Manchester. Rectangular specimens (6 mm x 5 mm x 2 mm) were then machined from the centre of these rolled blocks, and from the starting material. The samples were cut along different orthogonal rolling directions and are referenced according to alignment of the rolling directions (RD – rolling direction, TD – transverse direction, ND – normal direction) with the long horizontal (X) axis and short vertical (Y) axis of the rectangular specimens. Samples of the same orientation were glued together to form matrices for the synchrotron analysis. The material, rolling conditions, sample orientations and experiment reference numbers used for the synchrotron diffraction analysis are included in the data as an excel spreadsheet.

    SXRD Data Collection

    Data was recorded using a high energy 90 keV synchrotron X-ray beam and a 5 second exposure at the detector for each measurement point. The slits were adjusted to give a 0.5 x 0.5 mm beam area, chosen to optimally resolve both the α and β phase peaks. The SXRD data was recorded by stage-scanning the beam in sequential X-Y positions at 0.5 mm increments across the rectangular sample matrices, containing a number of samples glued together, to analyse a total of 93 samples from the different processing conditions and orientations. Post-processing of the data was then used to sort the data into a rectangular grid of measurement points from each individual sample.

    Diffraction Pattern Averaging

    The stage-scan diffraction pattern images from each matrix were sorted into individual samples, and the images averaged together for each specimen, using the Python notebook sxrd-tiff-summer. The averaged .tiff images each capture average diffraction peak intensities from an area of about 30 mm² (equivalent to a total volume of ~60 mm³), with three different sample orientations then used to calculate the bulk crystallographic texture from each rolling condition.
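
The averaging step described above can be illustrated with a minimal sketch. This is not the code of the sxrd-tiff-summer notebook; the function name `average_patterns` and the synthetic frames are assumptions made for illustration, and reading the .tiff files is left out.

```python
import numpy as np

def average_patterns(images):
    """Average equally shaped 2-D detector images pixel by pixel."""
    # Stack the frames along a new leading axis and convert to float
    # so that integer detector counts are not truncated by the mean.
    stack = np.stack([np.asarray(img, dtype=np.float64) for img in images])
    return stack.mean(axis=0)

# Hypothetical data: three tiny 2x2 "frames" standing in for the
# stage-scan images that belong to a single specimen.
frames = [np.full((2, 2), value) for value in (1.0, 2.0, 6.0)]
averaged = average_patterns(frames)
```

In practice each frame would be one detector image from a measurement point inside the specimen, and the result would be written out as the averaged image for that specimen.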

    SXRD Data Analysis

    A new Fourier-based peak fitting method from the Continuous-Peak-Fit Python package was used to fit full diffraction pattern ring intensities, using a range of different lattice plane peaks for determining crystallographic texture in both the α and β phases. Bulk texture was calculated by combining the ring intensities from three different sample orientations.

    A .poni calibration file was created using Dioptas, through a refinement matching peak intensities from a LaB6 or CeO2 standard diffraction pattern image. Two calibrations were needed as some of the data was collected in July 2022 and some of the data was collected in August 2022. Dioptas was then used to determine peak bounds in 2θ for characterising a total of 22 α and 4 β lattice plane rings from the averaged Ti-64 diffraction pattern images, which were recorded in a .py input script. Using these two inputs, Continuous-Peak-Fit automatically converts full diffraction pattern rings into profiles of intensity versus azimuthal angle, for each 2θ section, which can also include multiple overlapping α and β peaks.

    The Continuous-Peak-Fit refinement can be launched in a notebook or from the terminal, to automatically calculate a full mathematical description, in the form of Fourier expansion terms, to match the intensity variation of each individual lattice plane ring. The results for peak position, intensity and half-width for all 22 α and 4 β lattice plane peaks were recorded at an azimuthal resolution of 1° and stored in a .fit output file. Details for setting up and running this analysis can be found in the continuous-peak-fit-analysis package. This package also includes a Python script for extracting lattice plane ring intensity distributions from the .fit files, matching the intensity values with spherical polar coordinates to parametrise the intensity distributions from each of the three different sample orientations, in the form of pole figures. The script can also be used to combine intensity distributions from different sample orientations. The final intensity variations are recorded for each of the lattice plane peaks as text files, which can be loaded into MTEX to plot and analyse both the α and β phase crystallographic texture.
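
To make the idea of describing a ring's intensity variation with Fourier expansion terms concrete, here is a minimal sketch using an ordinary least-squares fit. It does not use the Continuous-Peak-Fit API and is not the package's fitting routine; the function `fit_fourier`, the chosen order and the synthetic intensity profile are all assumptions for illustration.

```python
import numpy as np

def fit_fourier(azimuth_deg, intensity, order):
    """Least-squares fit of a truncated Fourier series to intensity vs azimuth.

    Returns the coefficients [a0, a1, b1, a2, b2, ...] and the fitted profile,
    where the model is a0 + sum_k (a_k cos(k*phi) + b_k sin(k*phi)).
    """
    phi = np.radians(azimuth_deg)
    columns = [np.ones_like(phi)]
    for k in range(1, order + 1):
        columns += [np.cos(k * phi), np.sin(k * phi)]
    design = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(design, intensity, rcond=None)
    return coeffs, design @ coeffs

# Hypothetical ring profile sampled at 1 degree steps, with a two-fold
# intensity variation around the ring.
azimuth = np.arange(0.0, 360.0, 1.0)
synthetic = 10.0 + 3.0 * np.cos(2 * np.radians(azimuth))
coeffs, fitted = fit_fourier(azimuth, synthetic, order=3)
```

The recovered coefficients give a compact description of the ring: here `coeffs[0]` is the mean intensity and `coeffs[3]` is the amplitude of the cos(2φ) term.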

    Metadata

    An accompanying YAML text file contains associated SXRD beamline metadata for each measurement. The raw data is in the form of synchrotron diffraction pattern .tiff images which were too large to upload to Zenodo and are instead stored on The University of Manchester's Research Database Storage (RDS) repository. The raw data can therefore be obtained by emailing the authors.

    The material data folder documents the machining of the samples and the sample orientations.

    The associated processing metadata for the Continuous-Peak-Fit analyses records information about the different packages used to process the data, along with details about the different files contained within this analysis dataset.

  14. Supplement 1. Tab-delimited text file for the Excel database that was used...

    • wiley.figshare.com
    html
    Updated Jun 3, 2023
    Cite
    José Herrera; Ravin Poudel; Katherine A. Nebel; Scott L. Collins (2023). Supplement 1. Tab-delimited text file for the Excel database that was used to create Table 1 described in the main text. [Dataset]. http://doi.org/10.6084/m9.figshare.3563736.v1
    Explore at:
    html (available download formats)
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Wiley
    Authors
    José Herrera; Ravin Poudel; Katherine A. Nebel; Scott L. Collins
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    File List

    Herrera-et-al-supplemental-information-data-set-used-in-analyses.txt (MD5: d99808cfe4ebc5bff6c4c8b75fdfaf56)

    Description

    Database used for statistical and graphical analysis described in the main text. It includes all the computational data for Fisher’s alpha, Shannon diversity, Morisita-Horn, and Sorensen’s (both qualitative and quantitative) similarity.

      Column definitions:

       1. Taxonomic order of OTU
       2. Genus and species epithet of OTU
       3. OTU number
       4, 6, 8, 10, 12. Identity of plant in 5 mm plots (SSc = Sevilleta Sporobolus cryptandrus; # = plot number).
       5, 7, 9, 11, 13. Proportion of sequences of that OTU in that plant (column) in July.
       14. Total number of sequences of that fungal OTU in the 5 mm plots in July
       15. Contig number in Sequencher. 1 = presence of OTU in a contig
       16. Singletons. 1 = only one sequence of this OTU obtained
       17. Proportion of sequences of row OTU in all plants in 5 mm plots in July
       18. Total number of sequences of that OTU in 5 mm plots squared in July
       19. Proportion of sequences of row OTU in all plants in 5 mm plots
       20. Proportion multiplied by log of that same proportion.
       21–37. Repeat information from columns 3–20 except the information describes plants and OTUs sampled from 20 mm plots.
       38–50. Repeat information from columns 3–20 except the information describes plants and OTUs sampled from 0 mm (control) plots.
       51. July summary of information of total number of sequences of that fungal OTU in all three plots (0, 5, 20 mm)
       52. July summary of information from all three plots regarding whether that fungal OTU is present in a contig.
       53. July summary of information from all three plots regarding whether that fungal OTU is present as a singleton.
       54. July summary of information of total number of sequences of that fungal OTU in all three plots (0, 5, 20 mm) squared
       55. July summary of the proportion of sequences of row OTU in all plants in all three plots (0, 5, 20 mm).
       56. Proportion from column 55 multiplied by log of that same proportion.
       57–109. Repeats information except values now represent data obtained in August
       110–115. Repeats information except values now represent summary data obtained for both July and August.
       116. Blank column
       117–131. Multiplied squared values from pair-wise columns and rows used to calculate similarity indices. J = June, A = August; 0, 5, 20 correspond to mm of water added to each plot. For example, J5/J20 (column 117) corresponds to the product of the squared values for that OTU (row) obtained from plants in June from the 5 mm and 20 mm plots.
       132–133. Blank columns
       134–141. Matrix of Morisita-Horn and Sorensen’s (qualitative and quantitative) similarity indices comparing pair-wise communities of fungi obtained from different plants: 0, 5, 20 mm plots obtained in July and August.

      Checksum values are:

       Columns 5, 7, and 17 = 1: the proportions of sequences from all OTUs sum to 1 in two plants (columns 5 and 7) and overall for July (column 17).
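
The columns above hold the intermediate quantities (proportions, proportion times log proportion, squared totals and pair-wise products) that feed the diversity and similarity indices. The following is a minimal sketch of how such indices are commonly computed from per-OTU sequence counts. The textbook formulas, the natural logarithm and the example counts are assumptions; the dataset description does not state the exact variants or log base the authors used.

```python
import math

def proportions(counts):
    """Proportion of sequences per OTU; these sum to 1 (the checksum)."""
    total = sum(counts)
    return [c / total for c in counts]

def shannon(counts):
    """Shannon diversity H = -sum(p * ln p), skipping absent OTUs."""
    return -sum(p * math.log(p) for p in proportions(counts) if p > 0)

def morisita_horn(x, y):
    """Morisita-Horn similarity between two communities of OTU counts."""
    total_x, total_y = sum(x), sum(y)
    cross = sum(a * b for a, b in zip(x, y))
    dx = sum(a * a for a in x) / total_x ** 2
    dy = sum(b * b for b in y) / total_y ** 2
    return 2 * cross / ((dx + dy) * total_x * total_y)

def sorensen_qualitative(x, y):
    """Sorensen similarity based on presence/absence of each OTU."""
    shared = sum(1 for a, b in zip(x, y) if a > 0 and b > 0)
    present_x = sum(1 for a in x if a > 0)
    present_y = sum(1 for b in y if b > 0)
    return 2 * shared / (present_x + present_y)

# Hypothetical sequence counts for four OTUs in two plots.
plot_a = [10, 5, 0, 5]
plot_b = [10, 5, 0, 5]
checksum = sum(proportions(plot_a))
```

Two identical communities give a Morisita-Horn value of 1, and the proportions for any single plant or plot should sum to 1, which is what the checksum columns verify.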
    
  15. Dataset for social support paper in Excel format.

    • figshare.com
    xlsx
    Updated Jul 30, 2024
    + more versions
    Cite
    Alphanso L. Blake; Nadia R. Bennett; Joette A. McKenzie; Marshall K. Tulloch-Reid; Ishtar Govia; Shelly R. McFarlane; Renee Walters; Damian K. Francis; Rainford J. Wilks; David R. Williams; Novie O. Younger-Coleman; Trevor S. Ferguson (2024). Dataset for social support paper in Excel format. [Dataset]. http://doi.org/10.1371/journal.pgph.0003466.s014
    Explore at:
    xlsx (available download formats)
    Dataset updated
    Jul 30, 2024
    Dataset provided by
    PLOS Global Public Health
    Authors
    Alphanso L. Blake; Nadia R. Bennett; Joette A. McKenzie; Marshall K. Tulloch-Reid; Ishtar Govia; Shelly R. McFarlane; Renee Walters; Damian K. Francis; Rainford J. Wilks; David R. Williams; Novie O. Younger-Coleman; Trevor S. Ferguson
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Recent studies have suggested that high levels of social support can encourage better health behaviours and result in improved cardiovascular health. In this study we evaluated the association between social support and ideal cardiovascular health among urban Jamaicans. We conducted a cross-sectional study among urban residents in Jamaica’s south-east health region. Socio-demographic data and information on cigarette smoking, physical activity, dietary practices, blood pressure, body size, cholesterol, and glucose were collected by trained personnel. The outcome variable, ideal cardiovascular health, was defined as having optimal levels of ≥5 of these characteristics (ICH-5) according to the American Heart Association definitions. Social support exposure variables included number of friends (network size), number of friends willing to provide loans (instrumental support) and number of friends providing advice (informational support). Principal component analysis was used to create a social support score using these three variables. Survey-weighted logistic regression models were used to evaluate the association between ICH-5 and social support score. Analyses included 841 participants (279 males, 562 females) with mean age of 47.6 ± 18.42 years. ICH-5 prevalence was 26.6% (95%CI 22.3, 31.0) with no significant sex difference (male 27.5%, female 25.7%). In sex-specific, multivariable logistic regression models, social support score was inversely associated with ICH-5 among males (OR 0.67 [95%CI 0.51, 0.89], p = 0.006) but directly associated among females (OR 1.26 [95%CI 1.04, 1.53], p = 0.020) after adjusting for age and community SES. Living in poorer communities was also significantly associated with higher odds of ICH-5 among males, while living in communities with high property value was associated with higher odds of ICH among females.

    In this study, a higher level of social support was associated with better cardiovascular health among women, but poorer cardiovascular health among men in urban Jamaica. Further research should explore these associations and identify appropriate interventions to promote cardiovascular health.

  16. Data from: Data and analysis files

    • figshare.com
    xlsx
    Updated May 26, 2024
    Cite
    Megan Murphy (2024). Data and analysis files [Dataset]. http://doi.org/10.6084/m9.figshare.25904785.v2
    Explore at:
    xlsx (available download formats)
    Dataset updated
    May 26, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Megan Murphy
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    In this study, we raised crickets on four diets that differed in macronutrient availability: high lipid, high carb, high protein, and a control diet. For the first two weeks of our experiment, we collected weights, number of eggs laid, and survivorship. After two weeks, half of the crickets received an immune challenge while the others received a sham challenge. For the next week, we measured survivorship and egg-laying. Finally, we measured their lytic activity to examine the effect of diet and immune challenges.

    The file "MacroDataOrg" is our full dataset, including the raw data as it was collected and any/all transformations that we completed prior to analysis. The Excel file has several sheets, which are briefly explained below:

    Weight: Shows the weight for each individual in the experiment, at the start of our experiment (roughly 1 week post-adult molt), week 1 of our experiment, week 2 of our experiment, and week 3 of our experiment (after the immune challenge). The individual's diet group is noted in the "Individual ID."

    EggCount: Shows the egg count for each individual in the experiment, at week 1 of our experiment, week 2 of our experiment, and week 3 of our experiment (after the immune challenge). The individual's diet group is noted in the "Individual ID."

    Survivorship: In columns A-C, shows the total number of days that each individual lived during our experiment (started approximately one week post-adult molt) and the number of days they lived after the immune challenge; in columns E-I, the data is converted to show what proportion of crickets in each diet treatment were alive on each day of the experiment (this converted data was used to produce Figure 5).

    Lytic Activity: Shows the individual ID (including diet group), immune challenge status, and calculated lytic activity for each individual.

    Assay Wells: Shows the data from the plate reader, including absorbance readings for each cell at each minute of the lytic assay; below (in rows 73-92), it shows the layout of each plate.

    Diet Macronutrient: Shows the compiled nutrition information for each component of the diets (high lipid, high protein, high carbohydrate, and control), as well as the calculations that we used to compile Tables 1 and 2.

    The file "Data Sheet" was used for all ANOVA analyses, as well as making the figures. It is referenced and used in the attached R scripts "ANOVA Script" and "Figures Script." The data was formatted differently for survivorship analyses, which is saved in "Survivorship Data" and referenced and used in the R script "SurvivorshipAnalysis Script."
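
The Survivorship sheet converts days lived per individual into the proportion of crickets alive per diet on each day. A minimal sketch of one way to do that conversion follows. It is not the authors' spreadsheet formula or R script; the function name, the example records and the convention that an individual counts as alive on day t when its days lived is at least t are all assumptions.

```python
from collections import defaultdict

def proportion_alive(records, n_days):
    """Convert (diet, days_lived) records into daily survival proportions.

    Returns {diet: [proportion alive on day 0, day 1, ..., day n_days]}.
    An individual is counted as alive on day t if days_lived >= t.
    """
    by_diet = defaultdict(list)
    for diet, days_lived in records:
        by_diet[diet].append(days_lived)
    return {
        diet: [
            sum(1 for d in lived if d >= day) / len(lived)
            for day in range(n_days + 1)
        ]
        for diet, lived in by_diet.items()
    }

# Hypothetical records: two crickets on each of two diets.
records = [
    ("control", 3),
    ("control", 1),
    ("high lipid", 2),
    ("high lipid", 2),
]
curves = proportion_alive(records, n_days=3)
```

Each list in `curves` is a step-wise survival curve for one diet group, which is the kind of series a survivorship figure would plot.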


