18 datasets found
  1. Global Burden of Disease analysis dataset of noncommunicable disease...

    • data.mendeley.com
    Updated Apr 6, 2023
    + more versions
    Cite
    David Cundiff (2023). Global Burden of Disease analysis dataset of noncommunicable disease outcomes, risk factors, and SAS codes [Dataset]. http://doi.org/10.17632/g6b39zxck4.10
    Explore at:
    Dataset updated
    Apr 6, 2023
    Authors
    David Cundiff
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This formatted dataset (AnalysisDatabaseGBD) originates from raw data files from the Institute of Health Metrics and Evaluation (IHME) Global Burden of Disease Study (GBD2017) affiliated with the University of Washington. We are volunteer collaborators with IHME and not employed by IHME or the University of Washington.

    The population weighted GBD2017 data are on male and female cohorts ages 15-69 years including noncommunicable diseases (NCDs), body mass index (BMI), cardiovascular disease (CVD), and other health outcomes and associated dietary, metabolic, and other risk factors. The purpose of creating this population-weighted, formatted database is to explore the univariate and multiple regression correlations of health outcomes with risk factors. Our research hypothesis is that we can successfully model NCDs, BMI, CVD, and other health outcomes with their attributable risks.

    These Global Burden of Disease data relate to the preprint: The EAT-Lancet Commission Planetary Health Diet compared with Institute of Health Metrics and Evaluation Global Burden of Disease Ecological Data Analysis. The data include the following:

    1. Analysis database of population weighted GBD2017 data that includes over 40 health risk factors, noncommunicable disease deaths/100k/year of male and female cohorts ages 15-69 years from 195 countries (the primary outcome variable that includes over 100 types of noncommunicable diseases) and over 20 individual noncommunicable diseases (e.g., ischemic heart disease, colon cancer, etc.).
    2. A text file to import the analysis database into SAS.
    3. The SAS code to format the analysis database to be used for analytics.
    4. SAS code for deriving Tables 1, 2, 3 and Supplementary Tables 5 and 6.
    5. SAS code for deriving the multiple regression formula in Table 4.
    6. SAS code for deriving the multiple regression formula in Table 5.
    7. SAS code for deriving the multiple regression formula in Supplementary Table 7.
    8. SAS code for deriving the multiple regression formula in Supplementary Table 8.
    9. The Excel files that accompanied the above SAS code to produce the tables.

    For questions, please email davidkcundiff@gmail.com. Thanks.

  2. Census Data

    • catalog.data.gov
    • data.globalchange.gov
    • +3more
    Updated Mar 1, 2024
    Cite
    U.S. Bureau of the Census (2024). Census Data [Dataset]. https://catalog.data.gov/dataset/census-data
    Explore at:
    Dataset updated
    Mar 1, 2024
    Dataset provided by
    U.S. Bureau of the Census
    Description

    The Bureau of the Census has released Census 2000 Summary File 1 (SF1) 100-Percent data. The file includes the following population items: sex, age, race, Hispanic or Latino origin, household relationship, and household and family characteristics. Housing items include occupancy status and tenure (whether the unit is owner or renter occupied). SF1 does not include information on incomes, poverty status, overcrowded housing or age of housing. These topics will be covered in Summary File 3. Data are available for states, counties, county subdivisions, places, census tracts, block groups, and, where applicable, American Indian and Alaskan Native Areas and Hawaiian Home Lands.

    The SF1 data are available on the Bureau's web site and may be retrieved from American FactFinder as tables, lists, or maps. Users may also download a set of compressed ASCII files for each state via the Bureau's FTP server. There are over 8000 data items available for each geographic area. The full listing of these data items is available here as a downloadable compressed database file named TABLES.ZIP. The uncompressed file is in FoxPro database file (dbf) format and may be imported into ACCESS, EXCEL, and other software formats.

    While all of this information is useful, the Office of Community Planning and Development has downloaded selected information for all states and areas and is making this information available on the CPD web pages. The tables and data items selected are those items used in the CDBG and HOME allocation formulas plus topics most pertinent to the Comprehensive Housing Affordability Strategy (CHAS), the Consolidated Plan, and similar overall economic and community development plans. The information is contained in five compressed (zipped) dbf tables for each state. When uncompressed, the tables are ready for use with FoxPro and can be imported into ACCESS, EXCEL, and other spreadsheet, GIS and database software. The data are at the block group summary level.

    The first two characters of the file name are the state abbreviation. The next two letters are BG for block group. Each record is labeled with the code and name of the city and county in which it is located so that the data can be summarized to higher-level geography. The last part of the file name describes the contents. The GEO file contains standard Census Bureau geographic identifiers for each block group, such as the metropolitan area code and congressional district code. The only data included in this table are total population and total housing units. POP1 and POP2 contain selected population variables, and selected housing items are in the HU file. The MA05 table data are only for use by State CDBG grantees for the reporting of the racial composition of beneficiaries of Area Benefit activities. The complete package for a state consists of the dictionary file named TABLES and the five data files for the state. The logical record number (LOGRECNO) links the records across tables.
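
    The file-naming convention and the LOGRECNO link described above can be sketched in Python. This is a minimal illustration, not the Bureau's or CPD's tooling: the example file name ALBGPOP1.dbf, the field names TOTAL_POP and OWNER_OCC, and both helper functions are assumptions made up for the sketch. Only the layout (state abbreviation, BG, contents) and the LOGRECNO key come from the description.

```python
from typing import Dict, Iterable


def parse_file_name(name: str) -> Dict[str, str]:
    """Split a name such as 'ALBGPOP1.dbf' (hypothetical) into its parts."""
    stem = name.split(".")[0].upper()
    # Characters 1-2: state abbreviation; 3-4: 'BG'; the rest: table contents.
    if len(stem) <= 4 or stem[2:4] != "BG":
        raise ValueError(f"not a block-group file name: {name!r}")
    return {"state": stem[:2], "level": stem[2:4], "contents": stem[4:]}


def link_on_logrecno(*tables: Iterable[dict]) -> Dict[str, dict]:
    """Merge records from several tables that share the same LOGRECNO."""
    merged: Dict[str, dict] = {}
    for table in tables:
        for record in table:
            merged.setdefault(record["LOGRECNO"], {}).update(record)
    return merged


# Illustrative records only; field names other than LOGRECNO are made up.
geo = [{"LOGRECNO": "0000001", "TOTAL_POP": 1200}]
hu = [{"LOGRECNO": "0000001", "OWNER_OCC": 300}]
linked = link_on_logrecno(geo, hu)
```

    Keying the merge on LOGRECNO means the five tables for a state can be read in any order before being combined.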

  3. Fire statistics data tables

    • gov.uk
    • s3.amazonaws.com
    Updated Oct 23, 2025
    + more versions
    Cite
    Ministry of Housing, Communities and Local Government (2025). Fire statistics data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/fire-statistics-data-tables
    Explore at:
    Dataset updated
    Oct 23, 2025
    Dataset provided by
    GOV.UK
    Authors
    Ministry of Housing, Communities and Local Government
    Description

    On 1 April 2025 responsibility for fire and rescue transferred from the Home Office to the Ministry of Housing, Communities and Local Government.

    This information covers fires, false alarms and other incidents attended by fire crews, and the statistics include the numbers of incidents, fires, fatalities and casualties as well as information on response times to fires. The Ministry of Housing, Communities and Local Government (MHCLG) also collect information on the workforce, fire prevention work, health and safety and firefighter pensions. All data tables on fire statistics are below.

    MHCLG has responsibility for fire services in England. The vast majority of data tables produced by the Ministry of Housing, Communities and Local Government are for England, but some tables (0101, 0103, 0201, 0501, 1401) are for Great Britain split by nation. In the past the Department for Communities and Local Government (which previously had responsibility for fire services in England) produced data tables for Great Britain and at times the UK. Similar information for the devolved administrations is available at Scotland: Fire and Rescue Statistics (https://www.firescotland.gov.uk/about/statistics/), Wales: Community safety (https://statswales.gov.wales/Catalogue/Community-Safety-and-Social-Inclusion/Community-Safety) and Northern Ireland: Fire and Rescue Statistics (https://www.nifrs.org/home/about-us/publications/).

    If you use assistive technology (for example, a screen reader) and need a version of any of these documents in a more accessible format, please email alternativeformats@communities.gov.uk. Please tell us what format you need. It will help us if you say what assistive technology you use.

    Related content

    Fire statistics guidance
    Fire statistics incident level datasets

    Incidents attended

    FIRE0101: Incidents attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 143 KB): https://assets.publishing.service.gov.uk/media/68f0f810e8e4040c38a3cf96/FIRE0101.xlsx (see also: Previous FIRE0101 tables)

    FIRE0102: Incidents attended by fire and rescue services in England, by incident type and fire and rescue authority (MS Excel Spreadsheet, 2.12 MB): https://assets.publishing.service.gov.uk/media/68f0ffd528f6872f1663ef77/FIRE0102.xlsx (see also: Previous FIRE0102 tables)

    FIRE0103: Fires attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 197 KB): https://assets.publishing.service.gov.uk/media/68f20a3e06e6515f7914c71c/FIRE0103.xlsx (see also: Previous FIRE0103 tables)

    FIRE0104: Fire false alarms by reason for false alarm, England (MS Excel Spreadsheet, 443 KB): https://assets.publishing.service.gov.uk/media/68f20a552f0fc56403a3cfef/FIRE0104.xlsx (see also: Previous FIRE0104 tables)

    Dwelling fires attended

    FIRE0201: Dwelling fires attended by fire and rescue services by motive, population and nation (MS Excel Spreadsheet, 192 KB): https://assets.publishing.service.gov.uk/media/68f100492f0fc56403a3cf94/FIRE0201.xlsx (see also: Previous FIRE0201 tables)


  4. Salary by Profession and Country Over Time

    • kaggle.com
    zip
    Updated Dec 4, 2022
    Cite
    The Devastator (2022). Salary by Profession and Country Over Time [Dataset]. https://www.kaggle.com/datasets/thedevastator/uncovering-global-data-professional-salary-trend/code
    Explore at:
    Available download formats: zip (682944 bytes)
    Dataset updated
    Dec 4, 2022
    Authors
    The Devastator
    Description

    Salary by Profession and Country Over Time

    Salary Differences by Country and Profession

    By Kelly Garrett [source]

    About this dataset

    This dataset contains survey responses from 882 data professionals from 46 countries who took part in the 2021 Global Data Professional Salary Survey. Our goal was to understand how much database administrators, data analysts, data architects, developers and data scientists make across the world in 2017-2021.

    The survey covers three years of salary trends, allowing you to compare and contrast movements over time. It also includes an optional postal code field which can be used to identify global regions with specific salary trends. In addition, all questions asked this year were also asked in 2017 and 2018 so that you can easily track changes in compensation over three years.

    The spreadsheet contains anonymized responses which are provided as public domain making it available for any purpose without attribution or mention of anyone else. With this dataset at your disposal you'll have access to the detailed salary information needed to make informed decisions about your career development!


    How to use the dataset

    • Start by familiarizing yourself with the columns in this dataset. The columns range from age of respondent to country of residence. It also includes salary information for each year (average annual income for 2017, 2018, and 2019). Read through each column header carefully to understand what you're looking at.

    • Explore some basic summary statistics about the sample group. Median salary by profession or average age by nationality are good ways to get acquainted with the data set quickly. Excel's built-in statistical tools can be used if you are working with an Excel version of the file; otherwise, use any programming language or statistics software that can import a CSV (comma-separated values) file into a spreadsheet or table structure.

    • You'll then want to identify which factors might be influencing salaries, such as experience level, gender and geographical location, and test how those features correlate with salary across job roles or countries over time, as far as the survey responses allow without external datasets. Subsets may also prove relevant for deeper statistical testing, for example isolating respondents from Ireland alone versus looking at the whole Europe/Middle East/Africa region.

    • Finally, look at how these factors have changed over time. Changes in national economic conditions during the years tracked may explain part of the difference between yearly cohorts, so compare country by country as well as across individual profiles, and keep in mind that differences in respondent mix between cohorts (for example, education level) can affect year-over-year comparisons.
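
    The summary-statistics step above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: the column names Country, JobTitle and SalaryUSD and the four sample rows are invented for the sketch and are not taken from the survey files, so check the real CSV headers before reusing it.

```python
import csv
import io
from collections import defaultdict
from statistics import median

# Invented sample rows; the real files may use different column names.
SAMPLE = """Country,JobTitle,SalaryUSD
Ireland,DBA,70000
Ireland,Analyst,50000
Germany,DBA,80000
Germany,DBA,90000
"""


def median_salary_by(rows, key, salary_field="SalaryUSD"):
    """Group rows by `key` and return the median salary of each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(float(row[salary_field]))
    return {name: median(values) for name, values in groups.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
by_job = median_salary_by(rows, "JobTitle")
by_country = median_salary_by(rows, "Country")
```

    To isolate a subset such as Ireland, filter `rows` on the country column before calling `median_salary_by`.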

    Research Ideas

    • Analyzing regional salary gaps amongst data professionals within the same country, or between countries.
    • Evaluating trends in salary rates over time by reviewing changes in year over year responses.
    • Generating employer profiles by comparing the salary range of employees at different organizations and industries, as well as storing demographic info of individuals who participated in the survey (e.g., age range, gender, etc.)

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    Unknown License - Please check the dataset description for more information.

    Columns

    File: 2019_Data_Professional_Salary_Survey_Responses.csv

    File: Data_Professional_Salary_Survey_Responses.csv

    Acknowledgements

    If you use this dataset in your research, please credit Kelly Garrett.

  5. Forward-looking factories

    • datasets.ai
    33, 8
    Updated Dec 7, 2021
    Cite
    Plateforme ouverte des données publiques françaises (2021). Forward-looking factories [Dataset]. https://datasets.ai/datasets/61afa88d8cd81d1bcb7e7242
    Explore at:
    Available download formats: 8, 33
    Dataset updated
    Dec 7, 2021
    Dataset authored and provided by
    Plateforme ouverte des données publiques françaises
    Description

    Presentation

    The forward-looking factories (Fabriques Prospectives) are one of the service offerings of the National Agency for Territorial Cohesion. They support territories, individually and collectively, as they work on a transition (ecological, demographic, economic...) of national and territorial interest.

    Data

    The dataset contains: — the list of forward-looking factories

    Name      Description
    id_fabp   id_fabp
    lib_fabp  label of the prospective factory
    yeare     year of initiation of the device in the territory
    partner   device partner

    — the list of municipalities accompanied by the forward-looking factories

    Name       Description
    insee_com  Insee_com
    lib_com    town label
    id_fabp    id_fabp
    lib_fabp   label of the prospective factory

    — the list of groups accompanied by the forward-looking factories

    Name            Description
    siren_grouping  siren code of the group
    lib_grouping    group label
    legal_nature    legal nature
    id_fabp         id_fabp
    lib_fabp        label of the prospective factory

    Useful Links

    • crossing with other ANCT devices (data.gouv)
    • detailed presentation of the forward-looking factories (ANCT)

    Opening the data file

    If you are using Microsoft Excel, a particular operation is required to open the data file:

    1. Create a new Excel workbook.
    2. Click on the Data tab in the ribbon, then click From Text.
    3. Choose the location of the csv file and click Import.
    4. In the window that opens, choose the option Delimited and, in File Origin, choose 65001: Unicode (UTF-8). Click Next.
    5. Select only the Comma separator. Click Next.
    6. Choose the right column data format by referring to the dataset documentation. Click Finish.
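
    Outside Excel, the same import settings (comma separator, UTF-8, which is what code page 65001 denotes) can be applied in a few lines of Python. This is a minimal sketch: the file name fabriques_prospectives.csv in the comment and the sample row are assumptions for illustration, while the column names id_fabp and lib_fabp come from the tables above.

```python
import csv
import io


def read_rows(stream):
    """Read comma-delimited rows into dictionaries keyed by the header line."""
    return list(csv.DictReader(stream, delimiter=","))


# With a real file (hypothetical name), open it as UTF-8 text:
#   with open("fabriques_prospectives.csv", encoding="utf-8", newline="") as f:
#       rows = read_rows(f)

# Invented one-row sample so the sketch runs without the data file.
sample = io.StringIO("id_fabp,lib_fabp\n1,Fabrique d'été\n")
rows = read_rows(sample)
```

    Passing `encoding="utf-8"` explicitly avoids the garbled accents that appear when a UTF-8 file is opened with a platform default encoding.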

  6. Extended 1.0 Dataset of "Concentration and Geospatial Modelling of Health...

    • zenodo.org
    bin, csv, pdf
    Updated Sep 23, 2024
    Cite
    Peter Domjan; Viola Angyal; Istvan Vingender (2024). Extended 1.0 Dataset of "Concentration and Geospatial Modelling of Health Development Offices' Accessibility for the Total and Elderly Populations in Hungary" [Dataset]. http://doi.org/10.5281/zenodo.13826993
    Explore at:
    Available download formats: bin, pdf, csv
    Dataset updated
    Sep 23, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Peter Domjan; Viola Angyal; Istvan Vingender
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Sep 23, 2024
    Area covered
    Hungary
    Description

    Introduction

    We are enclosing the database used in our research titled "Concentration and Geospatial Modelling of Health Development Offices' Accessibility for the Total and Elderly Populations in Hungary", along with our statistical calculations. For the sake of reproducibility, further information can be found in the file Short_Description_of_Data_Analysis.pdf and Statistical_formulas.pdf

    The sharing of data is part of our aim to strengthen the base of our scientific research. As of March 7, 2024, the detailed submission and analysis of our research findings to a scientific journal has not yet been completed.

    The dataset was expanded on 23rd September 2024 to include SPSS statistical analysis data, a heatmap, and buffer zone analysis around the Health Development Offices (HDOs) created in QGIS software.

    Short Description of Data Analysis and Attached Files (datasets):

    Our research utilised data from 2022, serving as the basis for statistical standardisation. The 2022 Hungarian census provided an objective basis for our analysis, with age group data available at the county level from the Hungarian Central Statistical Office (KSH) website. The 2022 demographic data provided an accurate picture compared to the data available from the 2023 microcensus. The calculation used is based on our standardisation of the 2022 data. For xlsx files, we used MS Excel 2019 (version: 1808, build: 10406.20006) with the SOLVER add-in.

    Hungarian Central Statistical Office served as the data source for population by age group, county, and regions: https://www.ksh.hu/stadat_files/nep/hu/nep0035.html, (accessed 04 Jan. 2024.) with data recorded in MS Excel in the Data_of_demography.xlsx file.

    In 2022, 108 Health Development Offices (HDOs) were operational, and it's noteworthy that no developments have occurred in this area since 2022. The availability of these offices and the demographic data from the Central Statistical Office in Hungary are considered public interest data, freely usable for research purposes without requiring permission.

    The contact details for the Health Development Offices were sourced from the following page (Hungarian National Population Centre (NNK)): https://www.nnk.gov.hu/index.php/efi (n=107). The Semmelweis University Health Development Centre was not listed by NNK, hence it was separately recorded as the 108th HDO. More information about the office can be found here: https://semmelweis.hu/egeszsegfejlesztes/en/ (n=1). (accessed 05 Dec. 2023.)

    Geocoordinates were determined using Google Maps (N=108): https://www.google.com/maps. (accessed 02 Jan. 2024.) Recording of geocoordinates (latitude and longitude according to WGS 84 standard), address data (postal code, town name, street, and house number), and the name of each HDO was carried out in the: Geo_coordinates_and_names_of_Hungarian_Health_Development_Offices.csv file.

    The foundational software for geospatial modelling and display (QGIS 3.34), an open-source software, can be downloaded from: https://qgis.org/en/site/forusers/download.html. (accessed 04 Jan. 2024.)

    The HDOs_GeoCoordinates.gpkg QGIS project file contains Hungary's administrative map and the recorded addresses of the HDOs from the Geo_coordinates_and_names_of_Hungarian_Health_Development_Offices.csv file, imported via .csv file.

    The OpenStreetMap tileset is directly accessible from www.openstreetmap.org in QGIS. (accessed 04 Jan. 2024.)

    The Hungarian county administrative boundaries were downloaded from the following website: https://data2.openstreetmap.hu/hatarok/index.php?admin=6 (accessed 04 Jan. 2024.)

    HDO_Buffers.gpkg is a QGIS project file that includes the administrative map of Hungary, the county boundaries, as well as the HDO offices and their corresponding buffer zones with a radius of 7.5 km.
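
    As a rough illustration of what a 7.5 km buffer means, the sketch below tests whether a point lies within 7.5 km of any office using the haversine great-circle distance on WGS 84 latitude/longitude. This is an assumption-laden approximation, not the QGIS workflow behind HDO_Buffers.gpkg: the office coordinate is an invented placeholder, not a value from the dataset, and QGIS buffers are usually computed in a projected coordinate system.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_buffer(point, offices, radius_km=7.5):
    """True if `point` is within `radius_km` of at least one office."""
    return any(haversine_km(*point, *office) <= radius_km for office in offices)


# Placeholder (latitude, longitude) pair, for illustration only.
offices = [(47.50, 19.04)]
near = within_buffer((47.55, 19.04), offices)  # about 5.6 km north
far = within_buffer((47.60, 19.04), offices)   # about 11.1 km north
```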

    Heatmap.gpkg is a QGIS project file that includes the administrative map of Hungary, the county boundaries, as well as the HDO offices and their corresponding heatmap (Kernel Density Estimation).

    A brief description of the statistical formulas applied is included in the Statistical_formulas.pdf.

    Recording of our base data for statistical concentration and diversification measurement was done using MS Excel 2019 (version: 1808, build: 10406.20006) in .xlsx format.

    • Aggregated number of HDOs by county: Number_of_HDOs.xlsx
    • Standardised data (Number of HDOs per 100,000 residents): Standardized_data.xlsx
    • Calculation of the Lorenz curve: Lorenz_curve.xlsx
    • Calculation of the Gini index: Gini_Index.xlsx
    • Calculation of the LQ index: LQ_Index.xlsx
    • Calculation of the Herfindahl-Hirschman Index: Herfindahl_Hirschman_Index.xlsx
    • Calculation of the Entropy index: Entropy_Index.xlsx
    • Regression and correlation analysis calculation: Regression_correlation.xlsx
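
    The measures listed above can be sketched in Python for readers without Excel. The functions below use common textbook definitions (Gini coefficient from sorted values, Herfindahl-Hirschman Index as the sum of squared shares, Shannon entropy with the natural logarithm, location quotient as a ratio of shares). Whether these match the exact variants in Statistical_formulas.pdf is an assumption, and the input values in the usage note are invented.

```python
import math


def per_100k(count, population):
    """Standardise a count to a rate per 100,000 residents."""
    return count * 100_000 / population


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly even)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


def hhi(values):
    """Herfindahl-Hirschman Index: sum of squared shares (1/n up to 1)."""
    total = sum(values)
    return sum((v / total) ** 2 for v in values)


def entropy(values):
    """Shannon entropy of the shares, natural logarithm."""
    total = sum(values)
    return -sum((v / total) * math.log(v / total) for v in values if v > 0)


def location_quotient(local_part, local_total, national_part, national_total):
    """Local share divided by national share; above 1 means over-represented."""
    return (local_part / local_total) / (national_part / national_total)
```

    For example, four counties with one office each give `gini([1, 1, 1, 1]) == 0.0` and an HHI of 0.25, while concentrating all four offices in one county gives a Gini of 0.75 and an HHI of 1.0.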

    Using the SPSS 29.0.1.0 program, we performed the following statistical calculations with the databases Data_HDOs_population_without_outliers.sav and Data_HDOs_population.sav:

    • Regression curve estimation with elderly population and number of HDOs, excluding outlier values (Types of analyzed equations: Linear, Logarithmic, Inverse, Quadratic, Cubic, Compound, Power, S, Growth, Exponential, Logistic, with summary and ANOVA analysis table): Curve_estimation_elderly_without_outlier.spv
    • Pearson correlation table between the total population, elderly population, and number of HDOs per county, excluding outlier values such as Budapest and Pest County: Pearson_Correlation_populations_HDOs_number_without_outliers.spv.
    • Dot diagram including total population and number of HDOs per county, excluding outlier values such as Budapest and Pest Counties: Dot_HDO_total_population_without_outliers.spv.
    • Dot diagram including elderly (64<) population and number of HDOs per county, excluding outlier values such as Budapest and Pest Counties: Dot_HDO_elderly_population_without_outliers.spv
    • Regression curve estimation with total population and number of HDOs, excluding outlier values (Types of analyzed equations: Linear, Logarithmic, Inverse, Quadratic, Cubic, Compound, Power, S, Growth, Exponential, Logistic, with summary and ANOVA analysis table): Curve_estimation_without_outlier.spv
    • Dot diagram including elderly (64<) population and number of HDOs per county: Dot_HDO_elderly_population.spv
    • Dot diagram including total population and number of HDOs per county: Dot_HDO_total_population.spv
    • Pearson correlation table between the total population, elderly population, and number of HDOs per county: Pearson_Correlation_populations_HDOs_number.spv
    • Regression curve estimation with total population and number of HDOs, (Types of analyzed equations: Linear, Logarithmic, Inverse, Quadratic, Cubic, Compound, Power, S, Growth, Exponential, Logistic, with summary and ANOVA analysis table): Curve_estimation_total_population.spv
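
    The Pearson correlation with and without outliers can be reproduced outside SPSS along the following lines. This is a sketch only: the record layout (name, population, hdos) and all numbers are invented for illustration, and only the idea of excluding Budapest and Pest County comes from the file descriptions above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def exclude(records, names):
    """Drop records whose 'name' is in `names` (for example, outliers)."""
    return [r for r in records if r["name"] not in names]


# Invented county records, for illustration only.
counties = [
    {"name": "A", "population": 100, "hdos": 2},
    {"name": "B", "population": 200, "hdos": 4},
    {"name": "C", "population": 300, "hdos": 6},
    {"name": "Budapest", "population": 1700, "hdos": 5},
]
kept = exclude(counties, {"Budapest", "Pest"})
r = pearson_r([c["population"] for c in kept], [c["hdos"] for c in kept])
```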

    For easier readability, the files have been provided in both SPV and PDF formats.

    The translation of these supplementary files into English was completed on 23rd Sept. 2024.

    If you have any further questions regarding the dataset, please contact the corresponding author: domjan.peter@phd.semmelweis.hu

  7. Housebuilding starts per 1000 households - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Mar 17, 2016
    + more versions
    Cite
    ckan.publishing.service.gov.uk (2016). Housebuilding starts per 1000 households - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/housebuilding-starts-per-1000-households
    Explore at:
    Dataset updated
    Mar 17, 2016
    Dataset provided by
    CKAN (https://ckan.org/)
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    This data set gives annual figures for the number of house building starts per 1000 households. House building data are collected at local authority district level and are available on ODC here. Figures for annual house building starts are derived from live table 253. A dwelling is counted as started on the date that work begins on the laying of the foundation, including 'slabbing' for houses that require it, but not including site preparation. Household figures are derived from 2012-based household projections by district, available on ODC here, or for download from published live tables as an Excel spreadsheet. The assumptions underlying national household and population projections are based on demographic trends. They are not forecasts as, for example, they do not attempt to predict the impact of future Government policies, changing economic circumstances or other factors that might influence household growth. The projections show the household numbers that would result if the assumptions based on previous demographic trends in the population and rates of household formation were to be realised in practice.
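
    The indicator itself is simple arithmetic: annual starts divided by the projected number of households, times 1000. A minimal sketch, with a made-up function name and made-up figures in the usage note:

```python
def starts_per_1000_households(starts: int, households: int) -> float:
    """House building starts per 1000 households for one district and year."""
    if households <= 0:
        raise ValueError("households must be positive")
    return starts * 1000 / households
```

    For instance, 250 starts in a district with 50,000 projected households gives `starts_per_1000_households(250, 50000) == 5.0`.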

  8. Galatanet dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Oct 1, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Labatut, Vincent; Balasque, Jean-Michel (2024). Galatanet dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6811541
    Explore at:
    Dataset updated
    Oct 1, 2024
    Dataset provided by
    Galatasaray University, Computer Science Departement
    Galatasaray University, Business Science Department
    Authors
    Labatut, Vincent; Balasque, Jean-Michel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description. This project contains the dataset relative to the Galatanet survey, conducted in 2009 and 2010 at the Galatasaray University in Istanbul (Turkey). The goal of this survey was to retrieve information regarding the social relationships between students, their feeling regarding the university in general, and their purchase behavior. The survey was conducted during two phases: the first one in 2009 and the second in 2010.

    The dataset includes two kinds of data. First, the answers to most of the questions are contained in a large table, available in both CSV and MS Excel formats. A description file explains the meaning of each field appearing in the table. Note that the survey form is also contained in the archive, for reference (it is in French and Turkish only, though). Second, the social network of students is available in both Pajek and Graphml formats. Having both individual (nodal attributes) and relational (links) information in the same dataset is, to our knowledge, rare and difficult to find in public sources, and this makes (in our opinion) this dataset interesting and valuable.
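
    To show how nodal attributes and links sit together in a GraphML file, the sketch below parses a tiny hand-written GraphML document with the Python standard library. The node ids, the attribute name gender and the single edge are invented for illustration and are not taken from the Galatanet archive; a dedicated graph library would be the more usual choice for real analysis.

```python
import xml.etree.ElementTree as ET

NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hand-written sample in GraphML syntax; contents are invented.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="node" attr.name="gender" attr.type="string"/>
  <graph id="G" edgedefault="directed">
    <node id="101"><data key="d0">F</data></node>
    <node id="102"><data key="d0">M</data></node>
    <edge source="101" target="102"/>
  </graph>
</graphml>"""


def load_graphml(text):
    """Return (nodes, edges): node attributes by id and a list of links."""
    root = ET.fromstring(text)
    # Map key ids such as 'd0' to readable attribute names.
    keys = {k.get("id"): k.get("attr.name") for k in root.findall("g:key", NS)}
    nodes = {}
    for node in root.findall("g:graph/g:node", NS):
        nodes[node.get("id")] = {
            keys.get(d.get("key"), d.get("key")): d.text
            for d in node.findall("g:data", NS)
        }
    edges = [(e.get("source"), e.get("target"))
             for e in root.findall("g:graph/g:edge", NS)]
    return nodes, edges


nodes, edges = load_graphml(SAMPLE)
```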

    All data are completely anonymous: students' names have been replaced by random numbers. Note that the survey is not exactly the same between the two phases: some small adjustments were applied thanks to the feedback from the first phase (but the datasets have been normalized since then). Also, the electronic form was very much improved for the second phase, which explains why the answers are much more complete than in the first phase.

    The data were used in our following publications:

    Labatut, V. & Balasque, J.-M. (2010). Business-oriented Analysis of a Social Network of University Students. In: International Conference on Advances in Social Network Analysis and Mining, 25-32. Odense, DK : IEEE. ⟨hal-00633643⟩ - DOI: 10.1109/ASONAM.2010.15

    An extended version of the original article: Labatut, V. & Balasque, J.-M. (2013). Informative Value of Individual and Relational Data Compared Through Business-Oriented Community Detection. Özyer, T.; Rokne, J.; Wagner, G. & Reuser, A. H. (Eds.), The Influence of Technology on Social Network Analysis and Mining, Springer, 2013, chap.6, 303-330. ⟨hal-00633650⟩ - DOI: 10.1007/978-3-7091-1346-2_13

    A more didactic article using some of these data just for illustration purposes: Labatut, V. & Balasque, J.-M. (2012). Detection and Interpretation of Communities in Complex Networks: Methods and Practical Application. Abraham, A. & Hassanien, A.-E. (Eds.), Computational Social Networks: Tools, Perspectives and Applications, Springer, chap.4, 81-113. ⟨hal-00633653⟩ - DOI: 10.1007/978-1-4471-4048-1_4

    Citation. If you use this data, please cite article [1] above:

    @InProceedings{Labatut2010,
      author    = {Labatut, Vincent and Balasque, Jean-Michel},
      title     = {Business-oriented Analysis of a Social Network of University Students},
      booktitle = {International Conference on Advances in Social Networks Analysis and Mining},
      year      = {2010},
      pages     = {25-32},
      address   = {Odense, DK},
      publisher = {IEEE Publishing},
      doi       = {10.1109/ASONAM.2010.15},
    }

    Contact. 2009-2010 by Jean-Michel Balasque (jmbalasque@gsu.edu.tr) & Vincent Labatut (vlabatut@gsu.edu.tr)

    License. This dataset is open data: you can redistribute it and/or use it under the terms of the Creative Commons Zero license (see license.txt).

  9. Additional resources for Kiva Crowdfunding

    • kaggle.com
    zip
    Updated Apr 12, 2018
    Cite
    Luke (2018). Additional resources for Kiva Crowdfunding [Dataset]. https://www.kaggle.com/datasets/lucian18/mpi-on-regions
    Explore at:
    Available download formats: zip (104671314 bytes)
    Dataset updated
    Apr 12, 2018
    Authors
    Luke
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This dataset contains the locations found in the Kiva datasets included in an administrative or geographical region. You can also find poverty data about this region. This facilitates answering some of the tough questions about a region's poverty.

    Content

    To preserve the original names and spelling of the locations, countries, and regions, all the data is in Excel format and has no preview. (I think only the Kaggle-recommended file types have a preview; if anyone can show me how to do this for an xlsx file, it will be greatly appreciated.)

    The Tables datasets contain the most recent analysis of the MPI on countries and regions. These datasets are updated regularly. In unique regions_names_from_google_api you will find 3 levels of inclusion for every geocode provided in Kiva datasets. (village/town, administrative region, sub-national region - which can be administrative or geographical). These are the results from the Google API Geocoding process.

    Files:

    • all_kiva_loans.csv

    Dropped multiple columns and kept all the rows from loans.csv with names, tags, and descriptions, giving a csv file of 390 MB instead of 2.13 GB. It is basically a simplified version of loans.csv (originally included in the analysis by beluga).

    • country_stats.csv
    1. population source: https://en.wikipedia.org/wiki/List_of_countries_by_population_(United_Nations)
    2. population_below_poverty_line: Percentage
    3. hdi: Human Development Index
    4. life_expectancy: Life expectancy at birth
    5. expected_years_of_schooling: Expected years of schooling
    6. mean_years_of_schooling: Mean years of schooling
    7. gni: Gross national income (GNI) per capita

    This dataset was originally created by beluga.
    • all_loan_theme_merged_with_geo_mpi_regions.xlsx

    This is loan_themes_by_region left joined with Tables_5.3_Contribution_of_Deprivations: it keeps all the original entries from loan_themes and only the matching entries from Tables_5. For regions that lack MPI data, you will find NaN.

    These are the columns in the database:

    1. Partner ID
    2. Field Partner
    3. Name
    4. sector
    5. Loan Theme ID
    6. Loan Theme Type
    7. Country
    8. forkiva
    9. number
    10. amount
    11. geo
    12. rural_pct
    13. City
    14. Administrative region
    15. Sub-national region
    16. ISO
    17. World region
    18. Population Share of the Region (%)
    19. region MPI
    20. Education (%)
    21. Health (%)
    22. Living standards (%)
    23. Schooling (%)
    24. Child school attendance (%)
    25. Child Mortality (%)
    26. Nutrition (%)
    27. Electricity (%)
    28. Improved sanitation (%)
    29. Drinking water (%)
    30. Floor (%)
    31. Cooking fuel (%)
    32. Asset ownership (%)
    • mpi_on_regions.xlsx

    Matched the loans in loan_themes_by_region with the regions that have info regarding MPI. This dataset brings together the amount invested in a region and the biggest problems the said region has to deal with. It is a join between the loan_themes_by_region provided by Kiva and Tables 5.3 Contribution_of_Deprivations.

    It is a subset of the all_loan_theme_merged_with_geo_mpi_regions.xlsx, which contains only the entries that I could match with poverty decomposition data. It has the same columns.
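The relationship between the two files can be sketched as a left join followed by a filter. This is a minimal illustration using only the Python standard library, not the author's actual procedure: the rows, the `region` join key, and the helper `left_join` are hypothetical, and only the NaN-for-unmatched behaviour comes from the description above.

```python
import math

# Hypothetical rows standing in for loan_themes_by_region (the left table).
loan_themes = [
    {"Loan Theme ID": "a1", "region": "North", "amount": 1000},
    {"Loan Theme ID": "a2", "region": "South", "amount": 2500},
    {"Loan Theme ID": "a3", "region": "Unknown", "amount": 400},
]

# Hypothetical rows standing in for Tables 5.3, keyed by region name.
mpi_by_region = {
    "North": {"region MPI": 0.21},
    "South": {"region MPI": 0.35},
}

def left_join(left_rows, right_lookup, key, right_columns):
    """Keep every left row; fill right-hand columns with NaN when unmatched."""
    joined = []
    for row in left_rows:
        match = right_lookup.get(row[key])
        extra = {c: (match[c] if match else math.nan) for c in right_columns}
        joined.append({**row, **extra})
    return joined

# Analogue of all_loan_theme_merged_with_geo_mpi_regions: every loan theme kept.
merged = left_join(loan_themes, mpi_by_region, "region", ["region MPI"])

# Analogue of mpi_on_regions: only the rows that found MPI data.
matched = [r for r in merged if not math.isnan(r["region MPI"])]
```

Both results share the same columns; the second simply drops the rows whose MPI fields are NaN.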

    • Tables_5_SubNational_Decomposition_MPI_2017-18.xlsx

    Multidimensional poverty index decomposition for over 1000 regions in 79 countries.

    Table 5.3: Contribution of deprivations to the MPI, by sub-national regions
    This table shows which dimensions and indicators contribute most to a region's MPI, which is useful for understanding the major source(s) of deprivation in a sub-national region.

    Source: http://ophi.org.uk/multidimensional-poverty-index/global-mpi-2016/

    • Tables_7_MPI_estimations_country_levels.xlsx

    MPI decomposition for 120 countries.

    Table 7 All Published MPI Results since 2010
    The table presents an archive of all MPI estimations published over the past 5 years, together with MPI, H, A and censored headcount ratios. For comparisons over time please use Table 6, which is strictly harmonised. The full set of data tables for each year published (Column A), is found on the 'data tables' page under 'Archive'.

    The data in this file is shown in interactive plots on Oxford Poverty and Human Development Initiative website. http://www.dataforall.org/dashboard/ophi/index.php/

    • unique_regions_from_kiva_loan_themes.xlsx

    These are all the regions corresponding to the geocodes found in Kiva's loan_themes_by_region. There are 718 unique entries, that you can join with any database from Kiva that has either a coordinates or region column.
    Columns:

    • geo: pair of Lat, Lon (from loan_themes_by_region)

    • City: name of the city (has the most NaN's)

    • Administrative region: first level of administrative inclusion for the city/location; (the equivalent of county for US)

    • Sub-national region: second lev...

  10. London Borough Profiles and Atlas

    • data.europa.eu
    • data.wu.ac.at
    csv, unknown, zip
    Updated Nov 1, 2021
    + more versions
    Cite
    Greater London Authority (2021). London Borough Profiles and Atlas [Dataset]. https://data.europa.eu/data/datasets/london-borough-profiles-1?locale=sk
    Explore at:
    Available download formats: unknown, csv, zip
    Dataset updated
    Nov 1, 2021
    Dataset authored and provided by
    Greater London Authority (http://www.london.gov.uk/)
    Area covered
    London
    Description

    The London Borough Profiles help paint a general picture of an area by presenting a range of headline indicator data in both spreadsheet and map form to help show statistics covering demographic, economic, social and environmental datasets for each borough, alongside relevant comparator areas. The London Borough Atlas does the same but provides further detailed breakdowns and time-series data for each borough. The full datasets and more information for each of the indicators are usually available on the London Datastore. A link to each of the datasets is contained in the spreadsheet and map.

    London Borough Profiles

    On opening the Microsoft Excel version, a simple drop down box allows you to choose which borough profile you are interested in. Selecting this will display data for that borough, plus either Inner or Outer London, London and a national comparator (usually England where data is available). To see the full set of data for all 33 local authorities in London plus the comparator areas in Excel, click the 'Data' worksheet. A chart and a map are also available to help visualise the data for all boroughs (macros must be enabled for the Excel map to function). The data is set out across 11 themes covering most of the key indicators relating to demographic, economic, social and environmental data. Sources are provided in the spreadsheet. Notes about the indicator are provided in comment boxes attached to the indicator names. For a geographical and bar chart representation of the profile data, choose the InstantAtlas version. Choose indicators from the left hand side. Click on the comparators to make them appear on the chart and map. Sources, links to data, and notes are all contained in the box in the bottom right hand corner.


    These profiles include data relating to: Population, Households (census), Demographics, Migrant population, Ethnicity, Language, Employment, NEET, DWP Benefits (client group), Housing Benefit, Qualifications, Earnings, Volunteering, Jobs density, Business Survival, Crime, Fires, House prices, New homes, Tenure, Greenspace, Recycling, Carbon Emissions, Cars, Public Transport Accessibility (PTAL), Indices of Multiple Deprivation, GCSE results, Children looked after, Children in out-of-work families, Life Expectancy, Teenage conceptions, Happiness levels, Political control, and Election turnout.

    London Borough Atlas

    To access even more data at local authority level, use the London Borough Atlas. It contains data about the same topics as the profiles but provides further detailed breakdowns and time-series data for each borough. There is also an InstantAtlas version available.


    The London boroughs are: City of London, Barking and Dagenham, Barnet, Bexley, Brent, Bromley, Camden, Croydon, Ealing, Enfield, Greenwich, Hackney, Hammersmith and Fulham, Haringey, Harrow, Havering, Hillingdon, Hounslow, Islington, Kensington and Chelsea, Kingston upon Thames, Lambeth, Lewisham, Merton, Newham, Redbridge, Richmond upon Thames, Southwark, Sutton, Tower Hamlets, Waltham Forest, Wandsworth, Westminster. You may also find our small area profiles useful - Ward, LSOA, and MS...

  11. Data from: The Impairment Attention Capture by Topological Change in...

    • scidb.cn
    Updated Jul 1, 2024
    Cite
    Xi Huanjun; Xu Huilin; Duan Tao; Li Jing; Li Dandan; Wang Kai; Zhu Chunyan (2024). The Impairment Attention Capture by Topological Change in Children with Autism Spectrum Disorder [Dataset]. http://doi.org/10.57760/sciencedb.09651
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 1, 2024
    Dataset provided by
    Science Data Bank
    Authors
    Xi Huanjun; Xu Huilin; Duan Tao; Li Jing; Li Dandan; Wang Kai; Zhu Chunyan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset originates from children with Autism Spectrum Disorder (ASD) and typically developing (TD) control children. The data is divided into two main parts: demographic information and eye movement indicators. The demographic information was measured using a scale questionnaire and stored electronically in Excel; the original eye movement data was recorded by the Tobii Pro X3-120 eye tracker, with each subject's data exported in Excel format, and the eye movement indicators were calculated per condition. Data collection spanned 2021 to 2023. ASD children were recruited from Anhui Provincial Children's Hospital and Hefei Kanghua Hospital; TD children were recruited from the affiliated kindergarten of Anhui Medical University. The data is in table format, stored as a .csv file. The demographic information table has 56 rows, with each row representing a subject and each column representing a measured indicator. The second column, labeled 'group', indicates the type of subject: '0' represents ASD, and '1' represents TD. TD children did not undergo the scale measurements (ABC, SRS, and RBS-R), so those values are missing. The eye movement indicator table also has 56 rows, with each row representing a subject and each column representing the eye movement indicators measured under the different conditions (topological transformation and non-topological transformation).
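As a rough illustration of the coding described above, the sketch below reads a small demographic table, maps the 'group' codes to labels, and treats the empty scale cells of TD children as missing. Only the 'group' coding and the scale names (ABC, SRS, RBS-R) come from the description; the `subject` column, the sample values, and the function name are hypothetical.

```python
import csv
import io

# Hypothetical excerpt of the demographic table; values are made up.
SAMPLE = """subject,group,ABC,SRS,RBS-R
S01,0,62,98,25
S02,1,,,
S03,0,55,87,19
"""

# Coding taken from the description: '0' is ASD, '1' is TD.
GROUP_LABELS = {"0": "ASD", "1": "TD"}

def load_demographics(text):
    """Parse the table, labelling groups and turning empty cells into None."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {"subject": raw["subject"], "group": GROUP_LABELS[raw["group"]]}
        for scale in ("ABC", "SRS", "RBS-R"):
            # Empty cells (expected for TD children) become None.
            row[scale] = float(raw[scale]) if raw[scale] else None
        rows.append(row)
    return rows

rows = load_demographics(SAMPLE)
asd = [r for r in rows if r["group"] == "ASD"]
td = [r for r in rows if r["group"] == "TD"]
```

Keeping the missing scale values as None rather than 0 avoids accidentally including TD children in any average of the scale scores.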

  12. Estimates of the population for the UK, England, Wales, Scotland, and...

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Sep 26, 2025
    + more versions
    Cite
    Office for National Statistics (2025). Estimates of the population for the UK, England, Wales, Scotland, and Northern Ireland [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/datasets/populationestimatesforukenglandandwalesscotlandandnorthernireland
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Sep 26, 2025
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Area covered
    Ireland, United Kingdom, England
    Description

    National and subnational mid-year population estimates for the UK and its constituent countries by administrative area, age and sex (including components of population change, median age and population density).

  13. National Family Survey 2019-2021 - India

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated May 12, 2022
    + more versions
    Cite
    International Institute for Population Sciences (IIPS) (2022). National Family Survey 2019-2021 - India [Dataset]. https://microdata.worldbank.org/index.php/catalog/4482
    Explore at:
    Dataset updated
    May 12, 2022
    Dataset provided by
    International Institute for Population Sciences (IIPS)
    Ministry of Health and Family Welfare (MoHFW)
    Time period covered
    2019 - 2021
    Area covered
    India
    Description

    Abstract

    The National Family Health Survey 2019-21 (NFHS-5), the fifth in the NFHS series, provides information on population, health, and nutrition for India, each state/union territory (UT), and for 707 districts.

    The primary objective of the 2019-21 round of National Family Health Surveys is to provide essential data on health and family welfare, as well as data on emerging issues in these areas, such as levels of fertility, infant and child mortality, maternal and child health, and other health and family welfare indicators by background characteristics at the national and state levels. Similar to NFHS-4, NFHS-5 also provides information on several emerging issues including perinatal mortality, high-risk sexual behaviour, safe injections, tuberculosis, noncommunicable diseases, and the use of emergency contraception.

    The information collected through NFHS-5 is intended to assist policymakers and programme managers in setting benchmarks and examining progress over time in India’s health sector. Besides providing evidence on the effectiveness of ongoing programmes, NFHS-5 data will help to identify the need for new programmes in specific health areas.

    The clinical, anthropometric, and biochemical (CAB) component of NFHS-5 is designed to provide vital estimates of the prevalence of malnutrition, anaemia, hypertension, high blood glucose levels, and waist and hip circumference, Vitamin D3, HbA1c, and malaria parasites through a series of biomarker tests and measurements.

    Geographic coverage

    National coverage

    Analysis unit

    • Household
    • Individual
    • Children age 0-5
    • Woman age 15-49
    • Man age 15 to 54

    Universe

    The survey covered all de jure household members (usual residents), all women aged 15-49, all men age 15-54, and all children aged 0-5 resident in the household.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A uniform sample design, which is representative at the national, state/union territory, and district level, was adopted in each round of the survey. Each district is stratified into urban and rural areas. Each rural stratum is sub-stratified into smaller substrata which are created considering the village population and the percentage of the population belonging to scheduled castes and scheduled tribes (SC/ST). Within each explicit rural sampling stratum, a sample of villages was selected as Primary Sampling Units (PSUs); before the PSU selection, PSUs were sorted according to the literacy rate of women age 6+ years. Within each urban sampling stratum, a sample of Census Enumeration Blocks (CEBs) was selected as PSUs. Before the PSU selection, PSUs were sorted according to the percentage of SC/ST population. In the second stage of selection, a fixed number of 22 households per cluster was selected with an equal probability systematic selection from a newly created list of households in the selected PSUs. The list of households was created as a result of the mapping and household listing operation conducted in each selected PSU before the household selection in the second stage. In all, 30,456 Primary Sampling Units (PSUs) were selected across the country in NFHS-5 drawn from 707 districts as on March 31st 2017, of which fieldwork was completed in 30,198 PSUs.

    For further details on sample design, see Section 1.2 of the final report.
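The second-stage step, an equal-probability systematic selection of a fixed number of households from a listing, can be sketched as follows. This is a generic textbook version under stated assumptions, not the NFHS-5 field implementation: the listing, the seed, and the function name are hypothetical, and only the figure of 22 households per cluster comes from the text.

```python
import random

def systematic_sample(households, n, rng):
    """Equal-probability systematic selection of n units from an ordered list."""
    if n >= len(households):
        return list(households)
    interval = len(households) / n      # fractional sampling interval
    start = rng.random() * interval     # random start in [0, interval)
    # Step through the list at a fixed interval from the random start.
    return [households[int(start + i * interval)] for i in range(n)]

rng = random.Random(2019)                         # seeded for repeatability
listing = [f"HH{i:03d}" for i in range(1, 151)]   # hypothetical listing of 150 households
selected = systematic_sample(listing, 22, rng)
```

Because the start is random and the interval is fixed, every household in the listing has the same selection probability (22 out of 150 here), and the selected units are spread evenly across the list.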

    Mode of data collection

    Computer Assisted Personal Interview [capi]

    Research instrument

    Four survey schedules/questionnaires: Household, Woman, Man, and Biomarker were canvassed in 18 local languages using Computer Assisted Personal Interviewing (CAPI).

    Cleaning operations

    Electronic data collected in the 2019-21 National Family Health Survey were received on a daily basis via the SyncCloud system at the International Institute for Population Sciences, where the data were stored on a password-protected computer. Secondary editing of the data, which required resolution of computer-identified inconsistencies and coding of open-ended questions, was conducted in the field by the Field Agencies and at the Field Agencies central office, and IIPS checked the secondary edits before the dataset was finalized.

    Field-check tables were produced by IIPS and the Field Agencies on a regular basis to identify certain types of errors that might have occurred in eliciting information and recording question responses. Information from the field-check tables on the performance of each fieldwork team and individual investigator was promptly shared with the Field Agencies during the fieldwork so that the performance of the teams could be improved, if required.

    Response rate

    A total of 664,972 households were selected for the sample, of which 653,144 were occupied. Among the occupied households, 636,699 were successfully interviewed, for a response rate of 98 percent.

    In the interviewed households, 747,176 eligible women age 15-49 were identified for individual women’s interviews. Interviews were completed with 724,115 women, for a response rate of 97 percent. In all, there were 111,179 eligible men age 15-54 in households selected for the state module. Interviews were completed with 101,839 men, for a response rate of 92 percent.

  14. Building Consumer Loyalty: Understanding E-Satisfaction in Fast-Fashion...

    • zenodo.org
    Updated Sep 29, 2025
    Cite
    Hendra Wijaya; Sandy Setiawan (2025). Building Consumer Loyalty: Understanding E-Satisfaction in Fast-Fashion Purchases on E-Commerce Platform [Dataset]. http://doi.org/10.5281/zenodo.17222864
    Explore at:
    Dataset updated
    Sep 29, 2025
    Dataset provided by
    Zenodo
    Authors
    Hendra Wijaya; Sandy Setiawan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The research's raw data is present in an Excel file with twelve sheets. The Form Response sheet has the main dataset which contains complete responses from 441 subjects that completed the questionnaire survey. The data consists of demographic variables such as gender, age range, domicile, occupation, marital status, and income level. In addition, it captures respondents' shopping behavior on Shopee regarding their frequency of purchases of fast-fashion items and respondents' evaluations of Shopee responses through Likert-scale items (1–5). These Likert-scale evaluations include information quality, service quality, price perceptions, overall satisfaction, and loyalty intentions to Shopee as an e-commerce platform.

    The demographic distribution of respondents is summarized in the Respondent Profile sheet, with counts by gender (135 males and 306 females) and other demographics presented as a recap table.

    Several sheets contain outputs of the measurement model evaluation of the PLS-SEM procedure. For example, the Outer Loadings sheet reports the indicator reliability values for the constructs of Information Quality (IQ), Perceived Value (PV), and Loyalty (LO). The Construct Reliability sheet provides the Cronbach’s Alpha, Composite Reliability, and AVE values which indicated that the majority of models are above the threshold levels for reliability and evidence of convergent validity.
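For readers unfamiliar with the reliability statistics named above, here is a minimal sketch of Cronbach's alpha computed from raw Likert answers, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). It is not the PLS-SEM software's output: the answer matrix and function name are hypothetical, and sample variances are assumed.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: one row per respondent, each a list of item scores."""
    k = len(responses[0])                                   # number of items
    item_vars = [variance(col) for col in zip(*responses)]  # per-item sample variance
    total_var = variance([sum(row) for row in responses])   # variance of summed scores
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical 1-5 Likert answers for a three-item construct.
answers = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [3, 4, 3],
    [1, 2, 2],
]
alpha = cronbach_alpha(answers)
```

Values closer to 1 indicate that the items of a construct move together; a commonly quoted rule of thumb treats 0.7 as the minimum acceptable level.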

    The Discriminant Validity and Fornell Larcker sheets demonstrate that the constructs are different from one another. The Model Fit sheet reports indices of the model's goodness-of-fit (SRMR, d_ULS, d_G, Chi-square and NFI).

    Lastly, the Path Coefficients and Hypothesis Testing sheets provide the results of the structural model, including the magnitude and significance of the relationships between constructs as shown through path coefficients, t-statistics, and p-values, as well as a note about whether each hypothesis is accepted or rejected.

    Overall, while the raw data file contains the original typos questionnaire, it also includes the statistical analyses necessary to confirm the measurement instruments and test the proposed structural model of consumer loyalty in fast-fashion purchases on the Shopee platform.

  15. Additional file 2 of Genomic data integration and user-defined sample-set...

    • springernature.figshare.com
    xlsx
    Updated Jun 4, 2023
    Cite
    Tommaso Alfonsi; Anna Bernasconi; Arif Canakoglu; Marco Masseroli (2023). Additional file 2 of Genomic data integration and user-defined sample-set extraction for population variant analysis [Dataset]. http://doi.org/10.6084/m9.figshare.21251615.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Tommaso Alfonsi; Anna Bernasconi; Arif Canakoglu; Marco Masseroli
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 2. Example of transformed metadata: In this .xlsx (MS Excel) file, we list all the output metadata categories generated for each sample from the transformation of the 1KGP input datasets. The output metadata include information collected from all the four 1KGP metadata files considered. Some categories are not reported in the source metadata files—they are identified by the label manually_curated_...—and were added by the developed pipeline to store technical details (e.g., download date, the md5 hash of the source file, file size, etc.) and information derived from the knowledge of the source, such as the species, the processing pipeline used in the source and the health status. For every information category, the table reports a possible value. The third column (cardinality > 1) tells whether the same key can appear multiple times in the output GDM metadata file. This is used to represent multi-valued metadata categories; for example, in a GDM metadata file, the key manually_curated_chromosome appears once for every chromosome mutated by the variants of the sample.
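The "cardinality > 1" idea can be illustrated with a small parser that keeps every value seen for a key instead of overwriting it. This is only a sketch: a tab-separated key/value layout is assumed for the metadata file, and the sample lines (including the key `manually_curated_species` and all values) are hypothetical; only `manually_curated_chromosome` is named in the description.

```python
from collections import defaultdict

# Hypothetical metadata lines; a tab-separated key/value layout is assumed.
META_TEXT = (
    "manually_curated_species\tHomo sapiens\n"
    "manually_curated_chromosome\tchr1\n"
    "manually_curated_chromosome\tchr7\n"
)

def parse_metadata(text):
    """Collect every value per key so that repeated keys are preserved."""
    values = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, value = line.split("\t", 1)
        values[key].append(value)
    return dict(values)

meta = parse_metadata(META_TEXT)
# Keys that occur more than once correspond to "cardinality > 1".
multi_valued = sorted(k for k, v in meta.items() if len(v) > 1)
```

Storing a list per key means a reader never has to know in advance which categories are multi-valued.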

  16. Die Volkszählung in der Sowjetunion von 1989. USSR 1989 Population Census -...

    • demo-b2find.dkrz.de
    Updated Jul 9, 2011
    + more versions
    Cite
    (2011). Die Volkszählung in der Sowjetunion von 1989. USSR 1989 Population Census - Dataset - B2FIND [Dataset]. http://demo-b2find.dkrz.de/dataset/294fa717-5d2c-5973-825e-6fe32b5c535c
    Explore at:
    Dataset updated
    Jul 9, 2011
    Area covered
    Soviet Union
    Description

    The 1989 Soviet census (conducted 12-19 January 1989) was the last and also the most complete census in the former USSR. The following version of the results is a summary of the data in the form of Excel tables, organized by the twelve subject areas of the original version, which has been available as a published original edition since 1992. “The present publication is the CD-ROM version of the results of the 1989 USSR Population Census. As such, it contains the entire contents of the printed (microfiche) edition of this publication, which was first published in the latter half of 1992. The major change has been to transform all the data in the printed (microfiche) edition into a set of tables, or files. The CD-ROM edition presents the data in twelve subject areas, corresponding to each of the twelve original volumes in the printed (microfiche) edition. Each of the general subject areas is subdivided into a number of specific subjects, which in turn correspond to a unique table in the printed (microfiche) edition. Statistical and demographic data on general subject areas: Vol. 1 Statistical and demographic data on; Vol. 2 Population Size and Distribution; Vol. 3 Age and Marital Status; Vol. 4 Family/Household Size and Structure; Vol. 5 Number of Children born; Vol. 6 Housing Conditions; Vol. 7 Education Level; Vol. 8 Nationality Composition; Vol. 8 Means of Livelihood; Vol. 9 Social Composition; Vol. 10 Employment by Economic Sector; Vol. 11 Occupations; Vol. 12 Migration. The data may also be approached from the point of view of geographic unit. Geographic units are: Russia; Ukraine; Belarus; Moldova; Uzbekistan; Kazakhstan; Kyrgyzstan; Tajikistan; Turkmenistan; Georgia; Azerbaijan; Armenia; Estonia; Latvia; Lithuania. Finally, the 1989 USSR Population Census data may also be approached from the point of view of nationality. Nationalities: approximately 130 nationalities” (East View (ed.), 1996: The 1989 USSR CENSUS. Minneapolis).

  17. Employee_Management_Dataset

    • kaggle.com
    Updated Oct 5, 2025
    Cite
    Yash Yennewar (2025). Employee_Management_Dataset [Dataset]. https://www.kaggle.com/datasets/yashyennewar/employee-management-dataset/discussion
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 5, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Yash Yennewar
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    👨‍💼 Employee Management Dataset (100,000 Records)

    📌 Overview

    This synthetic dataset contains 100,000 employee records designed to simulate real-world HR and organizational data. It includes employee demographics, job roles, salaries, manager relationships, and performance scores. It’s ideal for HR analytics, employee lifecycle analysis, organizational modeling, dashboarding, and machine learning projects.

    📂 Dataset Structure

    • Rows: 100,000
    • Columns: 14

    Features

    1. id – Unique employee identifier
    2. fullname – Employee full name
    3. email – Email address
    4. phoneno – Contact number
    5. address – Residential address
    6. salary – Annual salary (USD)
    7. department_id – Reference ID to department
    8. designation_id – Reference ID to job designation
    9. hire_date – Date of joining the organization
    10. manager_id – Manager assigned (FK to manager table)
    11. status – Employment status (Active, On Leave, etc.)
    12. gender – Employee gender
    13. dob – Date of birth
    14. performance_score – Employee performance rating

    Related Table

    Manager Table (4 rows) – Contains id and name for managers. employee.manager_id is a foreign key linking to manager.id.
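The foreign-key relationship can be exercised with a small lookup that resolves each employee's manager_id against manager.id and flags references that do not resolve. This is an illustrative sketch: the rows, the `manager_name` field, and the function name are hypothetical; only the column names id, name, fullname, and manager_id come from the description.

```python
# Hypothetical rows; only the column names come from the dataset description.
managers = [{"id": 1, "name": "Manager A"}, {"id": 2, "name": "Manager B"}]
employees = [
    {"id": 101, "fullname": "Employee One", "manager_id": 1},
    {"id": 102, "fullname": "Employee Two", "manager_id": 2},
    {"id": 103, "fullname": "Employee Three", "manager_id": 9},  # dangling reference
]

# Index the manager table by its primary key for constant-time lookups.
manager_by_id = {m["id"]: m for m in managers}

def attach_manager(employee_rows, lookup):
    """Resolve manager_id against manager.id; unmatched rows get None."""
    resolved = []
    for emp in employee_rows:
        mgr = lookup.get(emp["manager_id"])
        resolved.append({**emp, "manager_name": mgr["name"] if mgr else None})
    return resolved

resolved = attach_manager(employees, manager_by_id)
# Employees whose manager_id has no matching manager row.
orphans = [e["id"] for e in resolved if e["manager_name"] is None]
```

An empty `orphans` list is what a referentially intact table would produce; the dangling row here is included only to show the check firing.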

    🎯 Potential Use Cases

    • HR Analytics: Workforce demographics, salary distribution, performance trends.
    • Organizational Modeling: Department/designation hierarchy, manager-employee relationships.
    • Attrition & Retention Studies: Analyze patterns in employee status and lifecycle.
    • Data Engineering: Practice ETL, SQL, and relational modeling.
    • Dashboarding: Build HR dashboards in Power BI, Tableau, or Excel.
    • Machine Learning: Predict performance, salary trends, or attrition risk.

    🏷️ Tags

    human resources · employee management · HR analytics · salary analysis · performance

    📜 License

    This dataset is synthetic and created for educational and analytical purposes only. It is freely available under the CC BY 4.0 License.

    🙌 Acknowledgments

    Generated to provide a realistic base for HR analytics, SQL practice, and BI dashboard projects.

  18. Survey_national_identity

    • kaggle.com
    zip
    Updated Apr 29, 2025
    Cite
    willian oliveira (2025). Survey_national_identity [Dataset]. https://www.kaggle.com/datasets/willianoliveiragibin/survey-national-identity
    Explore at:
    Available download formats: zip (516 bytes)
    Dataset updated
    Apr 29, 2025
    Authors
    willian oliveira
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset is based on a structured questionnaire survey that studies the national identity and related social attitudes of Chinese citizens and residents in Hong Kong and Macao. Data collection uses a five-point Likert scale (1-5) across 56 questions covering multiple dimensions such as national belonging, cultural identity, immigration policy attitudes, and evaluations of democracy and economic development. The questionnaire was distributed online or offline, and the respondents included college students in Guangdong (mainland China), the Hong Kong Special Administrative Region and the Macao Special Administrative Region, with a total of 66 records ("National Identity Score Total.xlsx" contains 36 questionnaire score records, and "Information Results.xlsx" contains 30 demographic records). Data entry and preliminary processing were done in Microsoft Excel, including total score calculation (full score 265 points), conversion to a standardized 10-point scale, and standard deviation analysis; the statistical indicators are generated automatically by formulas.

    The spatial and temporal scope of the data focuses on recent years, covering Guangdong, Hong Kong and Macau. The spatial resolution is at the individual level, and the regions are divided according to the place of residence. In the questionnaire data table, each row represents a respondent (such as YA01, B01, etc.), column headings B to BH correspond to 56 ordered scoring questions (such as "Are you Chinese?" "Proud of the operation of democracy"), and columns BI to BL count the total score, average score, standardized score and dispersion; the demographic information table includes the interview number, place of residence, gender, grade and years of living in the mainland (unit: year).
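The four summary columns (total score, average score, standardized score, and dispersion) could be reproduced outside Excel along these lines. This is a sketch under loud assumptions: the description does not give the formulas, so a linear rescaling of the total onto a 10-point scale and a population standard deviation are assumed here, and the answer vector is a shortened hypothetical example. Only the full score of 265 comes from the description.

```python
from statistics import mean, pstdev

FULL_SCORE = 265  # full score as stated in the description

def summarize(scores, full_score=FULL_SCORE):
    """Return total, average, 10-point score and dispersion for one respondent."""
    total = sum(scores)
    return {
        "total": total,
        "average": mean(scores),
        # Assumed conversion: linear rescaling of the total onto 0-10.
        "ten_point": round(total / full_score * 10, 2),
        # Assumed dispersion measure: population standard deviation.
        "std_dev": pstdev(scores),
    }

# Hypothetical respondent with a shortened answer vector (not 56 items).
example = summarize([5, 4, 3, 4, 5, 2])
```

If the spreadsheet formulas use a different conversion or the sample standard deviation, the two assumed lines are the only ones that would need to change.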

    Each data set was analyzed using NVivo 12 qualitative analysis software, and no data were excluded. Potential errors may come from manual data entry. All data files are in Excel format (.xlsx) and can be opened with Microsoft Excel or compatible tools (such as WPS or LibreOffice). The statistical indicators can be fully analyzed once formula calculation is enabled. The data set has been anonymized and is suitable for cross-regional comparisons in the social sciences, analysis of factors affecting national identity, and research on policy attitudes.


David Cundiff (2023). Global Burden of Disease analysis dataset of noncommunicable disease outcomes, risk factors, and SAS codes [Dataset]. http://doi.org/10.17632/g6b39zxck4.10

Global Burden of Disease analysis dataset of noncommunicable disease outcomes, risk factors, and SAS codes

3 scholarly articles cite this dataset
Dataset updated
Apr 6, 2023
Authors
David Cundiff
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This formatted dataset (AnalysisDatabaseGBD) originates from raw data files from the Global Burden of Disease Study (GBD2017) of the Institute for Health Metrics and Evaluation (IHME), which is affiliated with the University of Washington. We are volunteer collaborators with IHME and are not employed by IHME or the University of Washington.

The population-weighted GBD2017 data cover male and female cohorts aged 15-69 years and include noncommunicable diseases (NCDs), body mass index (BMI), cardiovascular disease (CVD), and other health outcomes, together with associated dietary, metabolic, and other risk factors. The purpose of creating this population-weighted, formatted database is to explore the univariate and multiple regression correlations of health outcomes with risk factors. Our research hypothesis is that we can successfully model NCDs, BMI, CVD, and other health outcomes with their attributable risks.

These Global Burden of Disease data relate to the preprint: The EAT-Lancet Commission Planetary Health Diet compared with Institute of Health Metrics and Evaluation Global Burden of Disease Ecological Data Analysis. The data include the following:
1. Analysis database of population-weighted GBD2017 data that includes over 40 health risk factors, noncommunicable disease deaths/100k/year of male and female cohorts ages 15-69 years from 195 countries (the primary outcome variable, which includes over 100 types of noncommunicable diseases), and over 20 individual noncommunicable diseases (e.g., ischemic heart disease, colon cancer)
2. A text file to import the analysis database into SAS
3. The SAS code to format the analysis database to be used for analytics
4. SAS code for deriving Tables 1, 2, 3 and Supplementary Tables 5 and 6
5. SAS code for deriving the multiple regression formula in Table 4
6. SAS code for deriving the multiple regression formula in Table 5
7. SAS code for deriving the multiple regression formula in Supplementary Table 7
8. SAS code for deriving the multiple regression formula in Supplementary Table 8
9. The Excel files that accompanied the above SAS code to produce the tables

For questions, please email davidkcundiff@gmail.com. Thanks.
