31 datasets found
  1. Equity Report Data: Geography

    • data.sandiegocounty.gov
    Updated May 21, 2025
    Cite
    Various (2025). Equity Report Data: Geography [Dataset]. https://data.sandiegocounty.gov/dataset/Equity-Report-Data-Geography/p6uw-qxpv
    Explore at:
    Available download formats: application/geo+json, csv, kmz, kml, xlsx, xml
    Dataset updated
    May 21, 2025
    Dataset authored and provided by
    Various
    Description

    This dataset contains the geographic data used to create maps for the San Diego County Regional Equity Indicators Report led by the Office of Equity and Racial Justice (OERJ). The full report can be found here: https://data.sandiegocounty.gov/stories/s/7its-kgpt

    Demographic data from the report can be found here: https://data.sandiegocounty.gov/dataset/Equity-Report-Data-Demographics/q9ix-kfws

    Filter by the Indicator column to select data for a particular indicator map.

    Export notes: Dataset may not automatically open correctly in Excel due to geospatial data. To export the data for geospatial analysis, select Shapefile or GEOJSON as the file type. To view the data in Excel, export as a CSV but do not open the file. Then, open a blank Excel workbook, go to the Data tab, select “From Text/CSV,” and follow the prompts to import the CSV file into Excel. Alternatively, use the exploration options in "View Data" to hide the geographic column prior to exporting the data.
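
    The filtering and export advice above can be sketched in Python. This is a minimal illustration, not the publisher's method: only the "Indicator" column name comes from the description, while the geometry column name ("the_geom"), the other columns, and the sample rows are assumptions.

```python
import csv
import io

# Hypothetical sample standing in for the exported CSV. Only the "Indicator"
# column name comes from the dataset description; "the_geom" and the other
# columns are assumed names used for illustration.
SAMPLE_CSV = """Indicator,Geography,Percent,the_geom
Poverty,ZCTA,12.5,"MULTIPOLYGON (((...)))"
Employment,PUMA,94.1,"MULTIPOLYGON (((...)))"
Poverty,ZCTA,8.3,"MULTIPOLYGON (((...)))"
"""


def rows_for_indicator(csv_text, indicator, drop_columns=("the_geom",)):
    """Return rows whose Indicator matches, without the geometry column."""
    selected = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Indicator"] == indicator:
            # Drop the (assumed) geographic column so the rows stay tabular.
            selected.append({k: v for k, v in row.items() if k not in drop_columns})
    return selected


poverty_rows = rows_for_indicator(SAMPLE_CSV, "Poverty")
print(poverty_rows)
```

    For a real export, the same function could be fed the text of the downloaded CSV file instead of the sample string.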

    USER NOTES: 4/7/2025 - The maps and data have been removed for the Health Professional Shortage Areas indicator due to inconsistencies with the data source leading to some missing health professional shortage areas. We are working to fix this issue, including exploring possible alternative data sources.

    5/21/2025 - The following changes were made to the 2023 report data (Equity Report Year = 2023).

    • Self-Sufficiency Wage - a typo in the indicator name was fixed (changed sufficienct to sufficient) and the percent for one PUMA corrected from 56.9 to 59.9 (PUMA = San Diego County (Northwest)--Oceanside City & Camp Pendleton). Notes were made consistent for all rows where geography = ZCTA. A note was added to all rows where geography = PUMA.
    • Voter registration - label "92054, 92051" was renamed to be in numerical order and is now "92051, 92054". Removed data from the percentile column because the categories are not true percentiles.
    • Employment - Data was corrected to show the percent of the labor force that are employed (ages 16 and older). Previously, the data was the percent of the population 16 years and older that are in the labor force.
    • 3- and 4-Year-Olds Enrolled in School - percents are now rounded to one decimal place.
    • Poverty - the last two categories/percentiles changed because the 80th percentile cutoff was corrected by 0.01 and one ZCTA was reassigned to a different percentile as a result.
    • Low Birthweight - the 33th percentile label was corrected to be written as the 33rd percentile.
    • Life Expectancy - Corrected the category and percentile assignment for SRA CENTRAL SAN DIEGO.
    • Parks and Community Spaces - corrected the category assignment for six SRAs.

    5/21/2025 - Data was uploaded for Equity Report Year 2025. The following changes were made relative to the 2023 report year. Adverse Childhood Experiences - added geographic data for 2025 report. No calculation of bins nor corresponding percentiles due to small number of geographic areas. Low Birthweight - no calculation of bins nor corresponding percentiles due to small number of geographic areas.

    Prepared by: Office of Evaluation, Performance, and Analytics and the Office of Equity and Racial Justice, County of San Diego, in collaboration with the San Diego Regional Policy & Innovation Center (https://www.sdrpic.org).

  2. Waterworks — water supply system_reporting

    • data.europa.eu
    • gimi9.com
    unknown
    Updated Feb 7, 2022
    Cite
    (2022). Waterworks — water supply system_reporting [Dataset]. https://data.europa.eu/88u/dataset/https-data-norge-no-node-1495
    Explore at:
    Available download formats: unknown
    Dataset updated
    Feb 7, 2022
    License

    http://spdx.org/licenses/NLOD-2.0

    Description

    The data sets provide an overview of selected data on waterworks registered with the Norwegian Food Safety Authority. The information has been reported by the waterworks through application processing or other reporting to the Norwegian Food Safety Authority. Drinking water regulations require, among other things, annual reporting. The Norwegian Food Safety Authority has created a separate form service for such reporting. The data sets include public or private waterworks that supply 50 people or more. In addition, all municipally owned businesses with their own water supply are included regardless of size. The data sets also contain decommissioned facilities, for those who wish to view historical data, i.e. data for previous years.

    There are data sets for the following supervisory objects:

    1. Water supply system. It also includes analysis of drinking water.
    2. Transport system
    3. Treatment facility
    4. Entry point. It also includes analysis of the water source.

    Below you will find datasets for:

    1. Water supply system_reporting

    In addition, there is a file (information.txt) that provides an overview of when the extracts were produced and how many lines there are in the individual files. The extracts are produced weekly. Furthermore, for the data sets water supply system, transport system and intake point, it is possible to see historical data on what is included in the annual reporting. To make use of that information, the file must be linked to the “moder” file to get names and other static information. These files have the _reporting ending in the file name. Descriptions of the data fields (i.e. metadata) in the individual data sets appear in separate files. These are available in pdf format.

    If you double-click the csv file and it opens directly in Excel, the characters æøå will not display correctly. To see the character set correctly in Excel, you must:

    • start Excel and a new spreadsheet
    • select Data and then From Text, and press Import
    • select delimited data and file origin 65001: Unicode (UTF-8), tick "My data has headers", and press Next
    • remove tab as separator and select semicolon as separator, then press Next
    • otherwise, complete the remaining steps

    The data sets can be imported into a separate database and compiled as desired. There are link keys in the files that make it possible to link the files together. The waterworks are responsible for the quality of the datasets.
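
    As an alternative to the Excel import, linking a _reporting file to its master file can be sketched in Python. This is a sketch under stated assumptions: the field names, the link key ("system_id"), and the sample rows are hypothetical, since the real keys are documented in the publisher's PDF metadata. Only the semicolon separator and UTF-8 encoding come from the description.

```python
import csv
import io

# Hypothetical extracts: the real field names and link keys are described in
# the publisher's PDF metadata. "system_id" is an assumed link key.
MASTER = "system_id;name\n1;Østre vannverk\n2;Ås vannverk\n"
REPORTING = "system_id;year;persons_supplied\n1;2020;1200\n1;2021;1250\n2;2021;80\n"


def read_semicolon_csv(text):
    """Parse semicolon-separated text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


def link_reporting_to_master(master_rows, reporting_rows, key="system_id"):
    """Attach the static master fields to each reporting row via the link key."""
    master_by_key = {row[key]: row for row in master_rows}
    linked = []
    for row in reporting_rows:
        static = master_by_key.get(row[key], {})
        linked.append({**static, **row})
    return linked


linked_rows = link_reporting_to_master(
    read_semicolon_csv(MASTER), read_semicolon_csv(REPORTING)
)
for row in linked_rows:
    print(row["name"], row["year"], row["persons_supplied"])
```

    When reading the real files from disk, opening them with `open(path, encoding="utf-8", newline="")` keeps æøå intact.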

    Purpose: Make data for drinking water supply available to the public.

  3. Dataset for the Paper: Understanding the Issues, Their Causes and Solutions...

    • data.niaid.nih.gov
    Updated Jul 10, 2023
    Cite
    Muhammad Waseem; Peng Liang; Aakash Ahmad; Arif Ali Khan; Mojtaba Shahin; Pekka Abrahamsson; Ali Rezaei Nasab; Tommi Mikkonen (2023). Dataset for the Paper: Understanding the Issues, Their Causes and Solutions in Microservices Systems: An Empirical Study [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7602413
    Explore at:
    Dataset updated
    Jul 10, 2023
    Dataset provided by
    Tampere University
    Wuhan University
    Lancaster University Leipzig
    University of Oulu
    University of Jyväskylä
    Shiraz University
    RMIT University
    Authors
    Muhammad Waseem; Peng Liang; Aakash Ahmad; Arif Ali Khan; Mojtaba Shahin; Pekka Abrahamsson; Ali Rezaei Nasab; Tommi Mikkonen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the dataset for the paper: Understanding the Issues, Their Causes and Solutions in Microservices Systems: An Empirical Study. The dataset is recorded in an MS Excel file which contains the following Excel sheets, and the description of each sheet is briefly presented below.

    (1) Selected Systems

    contains the 15 selected open source microservices systems with the color code and URL of each system.

    (2) Raw Data

    contains information on the 10,222 initially retrieved issues, including issue titles, issue links, issue open dates, issue closed dates, and the number of participants in each issue discussion.

    (3) Screened Issues

    contains the issues that meet the initial selection criteria (i.e., 5,115 issues) and the issues that do not meet the initial selection criteria (i.e., 5,107 issues).

    (4) Selected Issues (Round 1)

    contains the list of 5,115 issues that meet the initial selection criteria.

    (5) Selected Issues (Round 2)

    contains the issues related to RQs (i.e., 2,641 issues) and the issues not related to RQs (i.e., 2,474 issues).

    (6) Selected Issues

    contains the list of selected 2,641 issues, which were used to answer the RQs.

    (7) Initial Codes

    contains the initial codes for identifying the types of issues, causes, and solutions. We used these codes to further generate the subcategories and categories of issues, causes, and solutions.

    (8) Interview Questionnaire

    contains the interview questions we asked microservices practitioners to identify any missing issues, causes, and solutions, as well as to improve the proposed taxonomies.

    (9) Interview Results

    contains the results of interviews that we conducted to confirm and improve the developed taxonomies of issues, causes, and solutions.

    (10) Survey Questionnaire

    contains the survey questions we asked microservices practitioners through a Web-based survey to validate our taxonomies of issues, causes, and solutions.

    (11) Issue Taxonomy

    contains the detailed issue taxonomy consisting of 19 categories, 54 subcategories, and 402 types of issues.

    (12) Cause Taxonomy

    contains the detailed cause taxonomy consisting of 8 categories, 26 subcategories, and 228 types of causes.

    (13) Solution Taxonomy

    contains the detailed solution taxonomy consisting of 8 categories, 32 subcategories, and 177 types of solutions.
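
    The issue counts reported for sheets (2) to (6) can be cross-checked with a few lines of arithmetic. The sketch below only restates the numbers given in the sheet descriptions; the constant and function names are illustrative.

```python
# Counts as stated in the sheet descriptions above.
RETRIEVED = 10_222          # (2) Raw Data
MET_CRITERIA = 5_115        # (3)/(4) issues meeting the initial criteria
NOT_MET_CRITERIA = 5_107    # (3) issues not meeting the initial criteria
RELATED_TO_RQS = 2_641      # (5)/(6) issues related to the RQs
NOT_RELATED_TO_RQS = 2_474  # (5) issues not related to the RQs


def funnel_is_consistent():
    """Check that each screening stage partitions the stage before it."""
    stage_one_ok = MET_CRITERIA + NOT_MET_CRITERIA == RETRIEVED
    stage_two_ok = RELATED_TO_RQS + NOT_RELATED_TO_RQS == MET_CRITERIA
    return stage_one_ok and stage_two_ok


print(funnel_is_consistent())  # True
```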

  4. 3.07 AZ Merit Data (summary)

    • datasets.ai
    • data.tempe.gov
    • +13 more
    15, 21, 25, 3, 57, 8
    Updated Sep 2, 2022
    Cite
    City of Tempe (2022). 3.07 AZ Merit Data (summary) [Dataset]. https://datasets.ai/datasets/3-07-az-merit-data-summary-55307
    Explore at:
    Available download formats: 8, 15, 3, 21, 57, 25
    Dataset updated
    Sep 2, 2022
    Dataset authored and provided by
    City of Tempe
    Description

    This page provides data for the 3rd Grade Reading Level Proficiency performance measure.


    The dataset includes the student performance results on the English/Language Arts section of the AzMERIT from Fall 2017 and Spring 2018. Data is representative of students in third grade in public elementary schools in Tempe. This includes schools from both Tempe Elementary and Kyrene districts. Results are by school and provide the total number of students tested, total percentage passing and percentage of students scoring at each of the four levels of proficiency.


    The performance measure dashboard is available at 3.07 3rd Grade Reading Level Proficiency.


    Additional Information

    Source: Arizona Department of Education
    Contact: Ann Lynn DiDomenico
    Contact E-Mail: Ann_DiDomenico@tempe.gov
    Data Source Type: Excel/ CSV
    Preparation Method: Filters on original dataset: within "Schools" Tab School District [select Tempe School District and Kyrene School District]; School Name [deselect Kyrene SD not in Tempe city limits]; Content Area [select English Language Arts]; Test Level [select Grade 3]; Subgroup/Ethnicity [select All Students]; Remove irrelevant fields; Add Fiscal Year
    Publish Frequency: Annually as data becomes available
    Publish Method: Manual
    Data Dictionary
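
    The filters listed under Preparation Method can be sketched as a small Python routine. The column names follow the Preparation Method text, but the sample rows, the "Percent Passing" column, the list of excluded schools, and the fiscal-year label are assumptions made for illustration.

```python
import csv
import io

# Hypothetical rows; column names follow the Preparation Method text, while
# school names, "Percent Passing", and the exclusion list are placeholders.
SAMPLE = """School District,School Name,Content Area,Test Level,Subgroup/Ethnicity,Percent Passing
Tempe School District,School A,English Language Arts,Grade 3,All Students,41
Kyrene School District,School B,English Language Arts,Grade 3,All Students,63
Kyrene School District,School C,English Language Arts,Grade 3,All Students,70
Tempe School District,School A,Mathematics,Grade 3,All Students,45
Tempe School District,School A,English Language Arts,Grade 4,All Students,39
"""

DISTRICTS = {"Tempe School District", "Kyrene School District"}
OUTSIDE_TEMPE = {"School C"}  # assumed: Kyrene schools outside Tempe city limits


def prepare(csv_text, fiscal_year):
    """Apply the listed filters, keep the relevant fields, add the fiscal year."""
    kept = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if (row["School District"] in DISTRICTS
                and row["School Name"] not in OUTSIDE_TEMPE
                and row["Content Area"] == "English Language Arts"
                and row["Test Level"] == "Grade 3"
                and row["Subgroup/Ethnicity"] == "All Students"):
            kept.append({
                "School Name": row["School Name"],
                "Percent Passing": row["Percent Passing"],
                "Fiscal Year": fiscal_year,
            })
    return kept


prepared = prepare(SAMPLE, "2017/18")
print(prepared)
```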

  5. Asthma ED Visit Rates by ZIP

    • kaggle.com
    Updated Jan 22, 2023
    Cite
    The Devastator (2023). Asthma ED Visit Rates by ZIP [Dataset]. https://www.kaggle.com/datasets/thedevastator/asthma-ed-visit-rates-by-zip
    Explore at:
    Croissant (Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.)
    Dataset updated
    Jan 22, 2023
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    License

    Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    Asthma ED Visit Rates by ZIP

    Counts and Rates by Age Group in California

    By Health [source]

    About this dataset

    This dataset presents a comprehensive look into the prevalence of asthma among Californian residents in terms of emergency department visits. Using age-adjusted rates and county FIPS codes, it offers a snapshot of rates per 10,000 people and provides insights into how this condition affects certain age groups by ZIP Code. With its easy-to-use associated map view, this dataset allows users to quickly learn about this important health issue and craft solutions to address it.

    How to use the dataset

    This dataset contains counts and rates of asthma related emergency department visits by ZIP Code and age group in California. This data can be useful when doing research on asthma related trends or attempting to find correlations between environmental factors, prevalence of disease and geography.

    • Select a year for analysis - the latest year for which data is available is the default selection, but other years are also listed in the dropdown menu.
    • Select an age group to analyze - use the provided dropdown menus to select one or more age groups (all ages, 0-17, 18+), for example to compare two different age groups.
    • Define a geographical area - select the ZIP code or county FIPS code for which you wish to obtain data, based on its availability or importance to your research question.
    • View and download relevant data - after selecting all of the desired criteria (year, age group(s), ZIP code/county FIPS code), click “View Data” and then “Download” at the bottom right corner of the window that opens.
    • Analyze the information found - use software such as Microsoft Excel or open source programs like OpenOffice Calc to gain insight into your downloaded dataset through statistical calculations, graphs, etc. In particular, look out for anomalies that could warrant further investigation.
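
    The selection and analysis steps above can also be done in code once the CSV is downloaded. In this sketch, "Year" and "ZIP code" are column names from the dataset's column listing, while "Age Group", "Rate", and the sample values are assumptions for illustration.

```python
import csv
import io

# Hypothetical sample; "Year" and "ZIP code" appear in the column listing,
# while "Age Group", "Rate", and all values are assumed for illustration.
SAMPLE = """Year,ZIP code,Age Group,Rate
2015,90001,All Ages,38.5
2015,90001,0-17,52.1
2016,90001,All Ages,36.0
2015,90210,All Ages,6.4
"""


def select_rows(csv_text, year, age_group):
    """Keep rows for one year and age group, highest rate first."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Year"] == str(year) and row["Age Group"] == age_group:
            rows.append(row)
    # Sorting by rate puts unusually high values at the top for inspection.
    return sorted(rows, key=lambda r: float(r["Rate"]), reverse=True)


ranked = select_rows(SAMPLE, 2015, "All Ages")
for row in ranked:
    print(row["ZIP code"], row["Rate"])
```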

    Research Ideas

    • Identifying the geographic clusters of asthma sufferers by analyzing the rate of emergency department visits with geographic mapping.
    • Developing outreach initiatives to areas with a high rate of ED visits for asthma to provide education, interventions and resources designed towards increasing preventive care and reducing preventable complications due to lack of access or knowledge about available services in these communities.
    • Assessing disparities in ED visit rates for asthma between age groups as well as between urban and rural areas or different socio-economic groups within counties or ZIP codes in order to identify areas where there is a need for increased interventions, services and other resources related to asthma care in order to reduce the burden or severity of this chronic condition among particularly vulnerable population groups

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Open Database License (ODbL) v1.0

    You are free to:
    • Share - copy and redistribute the material in any medium or format.
    • Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit - provide a link to the license, and indicate if changes were made.
    • ShareAlike - distribute your contributions under the same license as the original.
    • Keep intact - all notices that refer to this license, including copyright notices.
    • No additional restrictions - you may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

    Columns

    File: Asthma_Emergency_Department_Visit_Rates_by_ZIP_Code.csv

    | Column name | Description |
    |:------------|:------------|
    | Year | The year the data was collected. (Integer) |
    | ZIP code | The ZIP code of the area the data was collected from. (String...

  6. Mayor Election 2014 Düsseldorf

    • data.europa.eu
    csv, json
    Updated May 30, 2025
    Cite
    Düsseldorf (2025). Mayor Election 2014 Düsseldorf [Dataset]. https://data.europa.eu/data/datasets/851e793b-50ac-4e57-91fc-a10418b8bb56?locale=en
    Explore at:
    Available download formats: json, csv(33995), csv(272), csv(928), csv(5542), csv(510), csv(51497), csv(3583), csv(1575)
    Dataset updated
    May 30, 2025
    Dataset authored and provided by
    Düsseldorf
    License

    http://dcat-ap.de/def/licenses/other-closed

    Area covered
    Düsseldorf
    Description

    The data set contains the results of the mayor’s election on 25 May 2014 and the mayor’s runoff election on 15 June 2014 of the City of Düsseldorf.

    The local elections took place on 25 May 2014. Because no clear majority was reached, there was a runoff election of the mayor on 15 June 2014.

    An authority may set up different territorial levels to present the election results, from the lowest level (voting districts) to constituencies and districts to the level of the city or municipality, district and constituency. However, not all levels are necessary for each type of election. For each of the territorial levels that an authority has set up, there is a file containing the overview of those areas with quick messages already received.

    Further data sets contain information on the division of electoral areas for local elections and the division of voting districts.

    Information on terms in the field of ‘Elections’ can be found in the Election ABC of the interactive learning platform for election workers of the City of Düsseldorf.

    The files are encoded in UTF-8. By default, Excel does not display the umlauts in the files correctly. You can avoid this as follows:

    Excel 2003: From the menu, select ‘Data’ -> ‘Import external data’ -> ‘Import data’. The ‘Select data source’ dialog opens. Select the file you want to open and press the ‘Open’ button. Then set the file origin to '65001: Unicode (UTF-8)' and continue with the ‘Next’ button. In the next dialog, set the separator to ‘Semicolon’ instead of ‘Tab’ and continue with the ‘Next’ button again. Then select the ‘Text’ option as the data format of the columns and exit the wizard with the ‘Finish’ button. Use the ‘OK’ button to finish the procedure; the data is then displayed UTF-8 encoded in Microsoft Excel.

    Excel 2010: From the tab ‘Data’ in the section ‘Retrieve external data’, select the option ‘From text’. The dialog ‘Import text file’ opens. Select the file you want to open and press the ‘Open’ button. Then set the file origin to '65001: Unicode (UTF-8)' and continue with the ‘Next’ button. In the next dialog, set the separator to ‘Semicolon’ instead of ‘Tab’ and continue with the ‘Next’ button again. Then select the ‘Text’ option as the data format of the columns and exit the wizard with the ‘Finish’ button. Use the ‘OK’ button to finish the procedure; the data is then displayed UTF-8 encoded in Microsoft Excel.

    The files contain the following column information:

    • Number: Constituency number
    • Name: Name of the constituency
    • MaxQuick Messages: Maximum number of quick messages
    • AnzQuick Messages: Number of quick messages already recorded
    • Eligible voters: Number of eligible voters
    • Filed under: Number of ballot papers submitted
    • Turnout: Voter turnout at the respective view level
    • valid Voting List: Number of valid ballot papers
    • valid: Number of valid votes cast
    • invalid Voting List: Number of invalid ballot papers
    • invalid: Number of invalid votes cast

    In addition, the following fields are available for each party (example of one party called ‘A Party’):

    • A Party: Number of total votes of the party
    • A-Party_Proz: Percentage of total votes of the party from the total result
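
    How a per-party percentage relates to the vote counts can be illustrated with a short Python sketch. The sample rows are invented, the column names are simplified from the listing above, and the choice of valid votes as the denominator is an assumption, since the description does not state how the _Proz fields are calculated.

```python
import csv
import io

# Hypothetical semicolon-separated sample; names are simplified from the
# column listing and the numbers are invented for illustration.
SAMPLE = (
    "Number;Name;valid;A Party;B Party\n"
    "1;District One;1000;550;450\n"
    "2;District Two;800;300;500\n"
)


def party_share(csv_text, party):
    """Percent of valid votes won by one party, per area (assumed formula)."""
    shares = {}
    for row in csv.DictReader(io.StringIO(csv_text), delimiter=";"):
        valid = int(row["valid"])
        # Guard against areas that have not reported any valid votes yet.
        shares[row["Name"]] = round(100 * int(row[party]) / valid, 1) if valid else None
    return shares


shares = party_share(SAMPLE, "A Party")
print(shares)
```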

  7. Statistical Abstract of the United States, 2011

    • dataverse-staging.rdmc.unc.edu
    Updated Oct 28, 2011
    Cite
    UNC Dataverse (2011). Statistical Abstract of the United States, 2011 [Dataset]. https://dataverse-staging.rdmc.unc.edu/dataset.xhtml?persistentId=hdl:1902.29/CD-10849
    Explore at:
    Dataset updated
    Oct 28, 2011
    Dataset provided by
    UNC Dataverse
    License

    https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=hdl:1902.29/CD-10849

    Description

    "The Statistical Abstract of the United States, published since 1878, is the standard summary of statistics on the social, political, and economic organization of the United States. It is designed to serve as a convenient volume for statistical reference and as a guide to other statistical publications and sources. The latter function is served by the introductory text to each section, the source note appearing below each table, and Appendix I, which comprises the Guide to Sources of Statisti cs, the Guide to State Statistical Abstracts, and the Guide to Foreign Statistical Abstracts. The Statistical Abstract sections and tables are compiled into one Adobe PDF named StatAbstract2009.pdf. This PDF is bookmarked by section and by table and can be searched using the Acrobat Search feature. The Statistical Abstract on CD-ROM is best viewed using Adobe Acrobat 5, or any subsequent version of Acrobat or Acrobat Reader. The Statistical Abstract tables and the metropolitan areas tables from Appendix II are available as Excel(.xls or .xlw) spreadsheets. In most cases, these spreadsheet files offer the user direct access to more data than are shown either in the publication or Adobe Acrobat. These files usually contain more years of data, more geographic areas, and/or more categories of subjects than those shown in the Acrobat version. The extensive selection of statistics is provided for the United States, with selected data for regions, divisions, states, metropolitan areas, cities, and foreign countries from reports and records of government and private agencies. Software on the disc can be used to perform full-text searches, view official statistics, open tables as Lotus worksheets or Excel workbooks, and link directly to source agencies and organizations for supporting information. Except as indicated, figures are for the United States as presently constituted. 
Although emphasis in the Statistical Abstract is primarily given to national data, many tables present data for regions and individual states and a smaller number for metropolitan areas and cities. Statistics for the Commonwealth of Puerto Rico and for island areas of the United States are included in many state tables and are supplemented by information in Section 29. Additional information for states, cities, counties, metropolitan areas, and other small units, as well as more historical data are available in various supplements to the Abstract. Statistics in this edition are generally for the most recent year or period available by summer 2006. Each year over 1,400 tables and charts are reviewed and evaluated; new tables and charts of current interest are added, continuing series are updated, and less timely data are condensed or eliminated. Text notes and appendices are revised as appropriate. This year we have introduced 72 new tables covering a wide range of subject areas. These cover a variety of topics including: learning disability for children, people impacted by the hurricanes in the Gulf Coast area, employees with alternative work arrangements, adult computer and Internet users by selected characteristics, North America cruise industry, women- and minority-owned businesses, and the percentage of the adult population considered to be obese. 
Some of the annually surveyed topics are population; vital statistics; health and nutrition; education; law enforcement, courts and prison; geography and environment; elections; state and local government; federal government finances and employment; national defense and veterans affairs; social insurance and human services; labor force, employment, and earnings; income, expenditures, and wealth; prices; business enterprise; science and technology; agriculture; natural resources; energy; construction and housing; manufactures; domestic trade and services; transportation; information and communication; banking, finance, and insurance; arts, entertainment, and recreation; accommodation, food services, and other services; foreign commerce and aid; outlying areas; and comparative international statistics." Note to Users: This CD is part of a collection located in the Data Archive of the Odum Institute for Research in Social Science, at the University of North Carolina at Chapel Hill. The collection is located in Room 10, Manning Hall. Users may check the CDs out subscribing to the honor system. Items can be checked out for a period of two weeks. Loan forms are located adjacent to the collection.

  8. Bulk Data Provider | Verified B2B & B2C Databases in Excel – India’s Trusted...

    • bulkdataprovider.com
    Updated Jul 4, 2025
    Cite
    Bulk data Provider (2025). Bulk Data Provider | Verified B2B & B2C Databases in Excel – India’s Trusted Source [Dataset]. https://bulkdataprovider.com/blog/articles/bulk-data-provider-verified-b2b-b2c-databases-in-excel-india-s-trusted-source/
    Explore at:
    Dataset updated
    Jul 4, 2025
    Dataset authored and provided by
    Bulk data Provider
    Description

    🚀 Bulk Data Provider – Your Trusted Source for Verified B2B & B2C Databases in India

    Meta Description: Looking for a reliable bulk data provider? Get verified B2B and B2C databases for marketing, telecalling, and lead generation from India’s leading source, Bulk Data…

  9. HelpSteer: AI Alignment Dataset

    • kaggle.com
    zip
    Updated Nov 22, 2023
    Cite
    The Devastator (2023). HelpSteer: AI Alignment Dataset [Dataset]. https://www.kaggle.com/datasets/thedevastator/helpsteer-ai-alignment-dataset
    Explore at:
    Available download formats: zip (16614333 bytes)
    Dataset updated
    Nov 22, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    HelpSteer: AI Alignment Dataset

    Real-World Helpfulness Annotated for AI Alignment

    By Huggingface Hub [source]

    About this dataset

    HelpSteer is an open-source dataset designed to support AI alignment through fair, team-oriented annotation. The dataset provides 37,120 samples, each containing a prompt and a response along with five human-annotated attributes scored from 0 to 4, with higher scores indicating better quality. Combining machine learning and natural language processing methods with expert annotation, HelpSteer aims to provide a set of standardized values that can be used to measure alignment between human and machine interactions. With responses rated for correctness, coherence, complexity, helpfulness and verbosity, HelpSteer sets out to assist organizations in building reliable AI models that give more accurate results and a better user experience.

    How to use the dataset

    How to Use HelpSteer: An Open-Source AI Alignment Dataset

    HelpSteer is an open-source dataset designed to help researchers create models with AI Alignment. The dataset consists of 37,120 different samples each containing a prompt, a response and five human-annotated attributes used to measure these responses. This guide will give you a step-by-step introduction on how to leverage HelpSteer for your own projects.

    Step 1 - Choosing the Data File

    HelpSteer contains two data files, one for training and one for validation. To start exploring the dataset, download train.csv and validation.csv from the Kaggle page linked above or from the Google Drive repository attached here: [link], then select the file you would like to use. Each sample has 7 columns with information about a single response: prompt (given), response (submitted), helpfulness, correctness, coherence, complexity and verbosity. The five attributes take values between 0 and 4, where higher means better in the respective category.

    Step 2 - Exploratory Data Analysis (EDA)

    Once you have loaded your file into your workspace or preferred software environment (e.g. libraries such as Pandas/NumPy, or even Microsoft Excel), explore it further by running basic EDA commands that summarize each feature's distribution and note potential trends or points of interest. For example, which traits polarize the responses most? Are there outliers that might signal something interesting? Plotting these results often reveals patterns across the dataset that can inform feature engineering during the later modeling phase.
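
    A first pass at the EDA described in Step 2 could look like the sketch below. It uses only the Python standard library; the attribute column names come from Step 1, while the three sample rows are invented stand-ins for train.csv.

```python
import csv
import io
import statistics

# Attribute columns named in the dataset description.
ATTRIBUTES = ["helpfulness", "correctness", "coherence", "complexity", "verbosity"]

# Invented sample rows standing in for train.csv.
SAMPLE = """prompt,response,helpfulness,correctness,coherence,complexity,verbosity
p1,r1,3,4,4,2,1
p2,r2,1,1,3,1,2
p3,r3,4,4,4,2,3
"""


def summarize(csv_text):
    """Return mean, min, and max for each annotated attribute."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    for name in ATTRIBUTES:
        values = [int(row[name]) for row in rows]
        summary[name] = {
            "mean": statistics.mean(values),
            "min": min(values),
            "max": max(values),
        }
    return summary


summary = summarize(SAMPLE)
for name, stats in summary.items():
    print(name, stats)
```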

    Step 3 - Data Preprocessing

    The EDA should leave you with some hypotheses about which features matter most for accurately estimating the attribute scores of unseen responses. Before starting any modeling, preprocessing such as cleaning up missing entries and handling outliers is highly recommended. If you are unsure about the allowed value range of a specific attribute, refer back to the description on the Kaggle page; having the correct numerical ranges at hand lightens the modeling workload later. Do not rush this stage, because low-quality data can lead to poor results when you later aim for high accuracy at model deployment.
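
    The preprocessing in Step 3 can be sketched as a simple cleaning pass. This is one possible approach, not a prescribed one: it assumes that rows with missing text, missing scores, or scores outside the documented 0 to 4 range should be dropped, and the sample rows are invented.

```python
import csv
import io

ATTRIBUTES = ["helpfulness", "correctness", "coherence", "complexity", "verbosity"]

# Invented sample rows with a missing score, an out-of-range score, and a
# missing response.
SAMPLE = """prompt,response,helpfulness,correctness,coherence,complexity,verbosity
p1,r1,3,4,4,2,1
p2,r2,,1,3,1,2
p3,r3,4,7,4,2,3
p4,,2,2,2,2,2
p5,r5,0,0,1,1,0
"""


def clean(csv_text):
    """Drop rows with missing text or scores outside the 0-4 range."""
    kept, dropped = [], 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            scores = {name: int(row[name]) for name in ATTRIBUTES}
        except (TypeError, ValueError):
            dropped += 1  # missing or non-numeric score
            continue
        in_range = all(0 <= score <= 4 for score in scores.values())
        if not row["prompt"] or not row["response"] or not in_range:
            dropped += 1
            continue
        kept.append({"prompt": row["prompt"], "response": row["response"], **scores})
    return kept, dropped


kept_rows, dropped_count = clean(SAMPLE)
print(len(kept_rows), dropped_count)
```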

    Research Ideas

    • Designating and measuring conversational AI engagement goals: Researchers can utilize the HelpSteer dataset to design evaluation metrics for AI engagement systems.
    • Identifying conversational trends: By analyzing the annotations and data in HelpSteer, organizations can gain insights into what makes conversations more helpful, cohesive, complex or consistent across datasets or audiences.
    • Training Virtual Assistants: Train artificial intelligence algorithms on this dataset to develop virtual assistants that respond effectively to customer queries with helpful answers

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    **License: [CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication](https://creativecommons.org/pu...

  10. C

    Verden Source LLC

    • data.cityofchicago.org
    Updated Dec 2, 2025
    Cite
    City of Chicago (2025). Verden Source LLC [Dataset]. https://data.cityofchicago.org/Community-Economic-Development/Verden-Source-LLC/qkv5-pk99
    Explore at:
    application/geo+json, csv, xlsx, kml, kmz, xmlAvailable download formats
    Dataset updated
    Dec 2, 2025
    Authors
    City of Chicago
    Description

    This dataset contains all current and active business licenses issued by the Department of Business Affairs and Consumer Protection. It contains a large number of records (rows) and may not be viewable in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu and open the file in a plain-text editor, such as Notepad or WordPad, to view and search.

    Data fields requiring description are detailed below.

    APPLICATION TYPE: 'ISSUE' is the record associated with the initial license application. 'RENEW' is a subsequent renewal record. All renewal records are created with a term start date and term expiration date. 'C_LOC' is a change of location record. It means the business moved. 'C_CAPA' is a change of capacity record. Only a few license types may file this type of application. 'C_EXPA' only applies to businesses that have liquor licenses. It means the business location expanded.

    LICENSE STATUS: 'AAI' means the license was issued.
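    Because the export may be too large for Excel, one option is to stream the CSV row by row and tally the fields described above. The sketch below assumes the exported header names are exactly `APPLICATION TYPE` and `LICENSE STATUS`; the sample rows, the `LICENSE ID` column and the `XXX` status code are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_application_types(lines, status="AAI"):
    """Stream CSV rows and count APPLICATION TYPE values for one LICENSE STATUS."""
    counts = Counter()
    for row in csv.DictReader(lines):
        if row["LICENSE STATUS"] == status:
            counts[row["APPLICATION TYPE"]] += 1
    return counts


# Made-up rows standing in for the real export (assumption, not real data).
SAMPLE = """LICENSE ID,APPLICATION TYPE,LICENSE STATUS
1,ISSUE,AAI
2,RENEW,AAI
3,RENEW,AAI
4,C_LOC,AAI
5,RENEW,XXX
"""

counts = count_application_types(io.StringIO(SAMPLE))
print(counts.most_common())
```

    For the downloaded file, pass an open file handle instead of the `io.StringIO` object; only one row is held in memory at a time.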

    Business license owners may be accessed at: http://data.cityofchicago.org/Community-Economic-Development/Business-Owners/ezma-pppn To identify the owner of a business, you will need the account number or legal name.

    Data Owner: Business Affairs and Consumer Protection

    Time Period: Current

    Frequency: Data is updated daily

  11. Uniquely Popular Businesses

    • kaggle.com
    zip
    Updated Jan 22, 2023
    Cite
    The Devastator (2023). Uniquely Popular Businesses [Dataset]. https://www.kaggle.com/datasets/thedevastator/uniquely-popular-businesses
    Explore at:
    zip(48480 bytes)Available download formats
    Dataset updated
    Jan 22, 2023
    Authors
    The Devastator
    Description

    Uniquely Popular Businesses

    Rankings of Business Categories in Seattle & NYC Neighborhoods

    By data.world's Admin [source]

    About this dataset

    This dataset contains data used to analyze the uniquely popular business types in the neighborhoods of Seattle and New York City. We used publicly available neighborhood-level shapefiles to identify neighborhoods, and then crossed that information against Yelp's Business Category API to find businesses operating within each neighborhood. The ratio of businesses from each category was studied in comparison to their ratios in the entire city to determine any significant differences between each borough.

    Any business with more than one category was counted once for each of its categories, but never more than once within a single category. Moreover, a business type that made up less than 1% of a neighborhood's businesses was removed from the analysis altogether.

    The data available here is free to use under MIT license, with appropriate attribution given back to Yelp for providing this information. It is an invaluable resource for researchers across different disciplines looking into consumer behavior or clustering within urban areas!

    How to use the dataset

    To get started using this dataset:
    - Download the appropriate file for the area you're researching, either top5_Seattle.csv or top5_NewYorkCity.csv, from the Kaggle site which hosts this dataset (https://www.kaggle.com/puddingmagazine/uniquely-popular-businesses).
    - Read through the information for each column under the Columns section of this Kaggle description.
    - Take note of the columns relevant to your analysis, such as nCount (the number of businesses of a given type in a neighborhood), rank (how popular that business type is overall) and neighborhoodTotal (the total number of businesses in a neighborhood).
    - Load your selected file into an application designed for data analysis, such as Jupyter Notebook, Microsoft Excel or Power BI.
    - Begin your analysis, for example by subsetting rows for specific neighborhoods to see where certain business types are most common, or by running regression-based analyses of how a business type's rank varies across neighborhoods.

    If you have any questions about interpreting data from this source, please reach out!

    Research Ideas

    • Analyzing the unique business trends in Seattle and New York City to identify potential investment opportunities.
    • Creating a tool that helps businesses understand what local competition they face by neighborhood.
    • Exploring the distinctions between neighborhoods by plotting out the different businesses they have in comparison with each other and other cities

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    See the dataset description for more information.

    Columns

    File: top5_Seattle.csv

    | Column name | Description |
    |:--|:--|
    | neighborhood | Name of the neighborhood. (String) |
    | yelpAlias | The Yelp-specified Alias for the business type. (String) |
    | yelpTitle | The Title given to this business type by Yelp. (String) |
    | nCount | Number of businesses with this type within a particular neighborhood. (Integer) |
    | neighborhoodTotal | Total number of businesses located within that particular region. (Integer) |
    | cCount | Number of businesses with this storefront within an entire city. (Integer) |
    | cityTotal | Total number of all types of storefronts within an entire city. (Integer) |

    ...
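    The comparison of neighborhood ratios to city-wide ratios can be sketched from the columns listed above. This is one plausible reading, not necessarily the authors' exact calculation: it divides a category's share within the neighborhood (nCount / neighborhoodTotal) by its share within the city (cCount / cityTotal) and applies the 1% neighborhood threshold mentioned in the description. The sample rows are made up.

```python
import csv
import io


def over_representation(rows, min_share_percent=1):
    """Return (yelpAlias, ratio) pairs, highest ratio first.

    ratio = (nCount / neighborhoodTotal) / (cCount / cityTotal),
    computed by cross-multiplication to limit floating-point noise.
    Categories below `min_share_percent` of the neighborhood are skipped.
    """
    results = []
    for row in rows:
        n, n_total = int(row["nCount"]), int(row["neighborhoodTotal"])
        c, c_total = int(row["cCount"]), int(row["cityTotal"])
        if n * 100 < min_share_percent * n_total:
            continue  # under the 1% threshold described in the dataset notes
        results.append((row["yelpAlias"], (n * c_total) / (n_total * c)))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Made-up rows using the documented column names (assumption, not real data).
SAMPLE = """neighborhood,yelpAlias,yelpTitle,nCount,neighborhoodTotal,cCount,cityTotal
Ballard,breweries,Breweries,12,400,60,10000
Ballard,coffee,Coffee & Tea,20,400,500,10000
Ballard,florists,Florists,2,400,80,10000
"""

ranked = over_representation(csv.DictReader(io.StringIO(SAMPLE)))
print(ranked)
```

    A ratio above 1 means the category is more common in the neighborhood than in the city as a whole; a ratio near 1 means it is about as common as everywhere else.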

  12. Yelp Reviews Sentiment Dataset

    • kaggle.com
    zip
    Updated Nov 25, 2022
    Cite
    The Devastator (2022). Yelp Reviews Sentiment Dataset [Dataset]. https://www.kaggle.com/datasets/thedevastator/yelp-reviews-sentiment-dataset/code
    Explore at:
    zip(169587518 bytes)Available download formats
    Dataset updated
    Nov 25, 2022
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Yelp Reviews Sentiment Dataset

    A Challenge for Natural Language Processing

    By Huggingface Hub [source]

    About this dataset

    The Yelp Reviews Polarity dataset is a collection of Yelp reviews that have been labeled as positive or negative. It is well suited to natural language processing tasks such as sentiment analysis.

    How to use the dataset

    This Yelp reviews dataset is a good natural language processing dataset for anyone looking to get started with text classification. The data is split into two files: train.csv and test.csv. The training set contains 7,000 reviews with labels (0 = negative, 1 = positive), and the test set contains 3,000 unlabeled reviews.

    To get started with this dataset, download the two CSV files and put them in the same directory. Then, open up train.csv in your favorite text editor or spreadsheet software (I like using Microsoft Excel). Next, take a look at the first few rows of data to get a feel for what you're working with:

    | text | label |
    |:--|:--|
    | So there is no way for me to plug it in here in the US unless I go by... | 0 |
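    As an alternative to opening the file in a spreadsheet, the first few rows and the label balance can be inspected with a short script. The sketch below assumes the header row is `text,label` as in the Columns section; the three sample reviews are made up.

```python
import csv
import io
from collections import Counter


def peek(lines, n=5):
    """Return the first n rows and a count of each label, reading row by row."""
    head, label_counts = [], Counter()
    for row in csv.DictReader(lines):
        if len(head) < n:
            head.append(row)
        label_counts[row["label"]] += 1
    return head, label_counts


# Made-up reviews standing in for train.csv (assumption, not real data).
SAMPLE = """text,label
"Great tacos, friendly staff.",1
"Waited an hour and left hungry.",0
"Would come back again.",1
"""

head, label_counts = peek(io.StringIO(SAMPLE), n=2)
for row in head:
    print(row["label"], row["text"])
print(dict(label_counts))
```

    Checking the label counts early shows whether the two classes are balanced before any model is trained.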

    Research Ideas

    • This dataset could be used to train a machine learning model to classify Yelp reviews as positive or negative.
    • This dataset could be used to train a machine learning model to predict the star rating of a Yelp review based on the text of the review.
    • This dataset could be used to build a natural language processing system that generates fake Yelp reviews

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: train.csv

    | Column name | Description |
    |:--|:--|
    | text | The text of the review. (string) |
    | label | The label of the review. (string) |

    File: test.csv

    | Column name | Description |
    |:--|:--|
    | text | The text of the review. (string) |
    | label | The label of the review. (string) |

    Acknowledgements

    If you use this dataset in your research, please credit Huggingface Hub.

  13. Student Performance Factors (Excel Analysis)

    • kaggle.com
    zip
    Updated Nov 17, 2025
    Cite
    Kino (2025). Student Performance Factors (Excel Analysis) [Dataset]. https://www.kaggle.com/datasets/kinozyne/student-performance-factors-excel-analysis
    Explore at:
    zip(973447 bytes)Available download formats
    Dataset updated
    Nov 17, 2025
    Authors
    Kino
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    📊 Student Performance Analysis

    Project: Data Analysis using Excel Pivot Tables & Charts

    Executive Summary

    Based on the analysis of 6,607 students, this project identifies that active student habits (Attendance, Tutoring) are stronger predictors of success than environmental factors (Income, Resources).

    Key Insights

    1. Show Up: Attendance is the #1 driver of success.
    2. Get Help: Students attending 6 tutoring sessions/week scored 5 points higher on average.
    3. Sleep Myth: Sleep duration showed no correlation with exam scores.

    Tools Used

    • Microsoft Excel: Pivot Tables, Advanced Charting, Statistical Analysis, Data Cleaning.

    Source of Dataset(.csv)

  14. Google Certificate BellaBeats Capstone Project

    • kaggle.com
    zip
    Updated Jan 5, 2023
    Cite
    Jason Porzelius (2023). Google Certificate BellaBeats Capstone Project [Dataset]. https://www.kaggle.com/datasets/jasonporzelius/google-certificate-bellabeats-capstone-project
    Explore at:
    zip(169161 bytes)Available download formats
    Dataset updated
    Jan 5, 2023
    Authors
    Jason Porzelius
    Description

    Introduction: I have chosen to complete a data analysis project for the second course option, Bellabeats, Inc., using a locally hosted database program, Excel for both my data analysis and visualizations. This choice was made primarily because I live in a remote area and have limited bandwidth and inconsistent internet access. Therefore, completing a capstone project using web-based programs such as R Studio, SQL Workbench, or Google Sheets was not a feasible choice. I was further limited in which option to choose as the datasets for the ride-share project option were larger than my version of Excel would accept. In the scenario provided, I will be acting as a Junior Data Analyst in support of the Bellabeats, Inc. executive team and data analytics team. This combined team has decided to use an existing public dataset in hopes that the findings from that dataset might reveal insights which will assist in Bellabeat's marketing strategies for future growth. My task is to provide data driven insights to business tasks provided by the Bellabeats, Inc.'s executive and data analysis team. In order to accomplish this task, I will complete all parts of the Data Analysis Process (Ask, Prepare, Process, Analyze, Share, Act). In addition, I will break each part of the Data Analysis Process down into three sections to provide clarity and accountability. Those three sections are: Guiding Questions, Key Tasks, and Deliverables. For the sake of space and to avoid repetition, I will record the deliverables for each Key Task directly under the numbered Key Task using an asterisk (*) as an identifier.

    Section 1 - Ask:

    A. Guiding Questions:

    1. Who are the key stakeholders and what are their goals for the data analysis project?
    2. What is the business task that this data analysis project is attempting to solve?

    B. Key Tasks:

    1. Identify key stakeholders and their goals for the data analysis project.
      *The key stakeholders for this project are as follows:
      -Urška Sršen and Sando Mur - co-founders of Bellabeats, Inc.
      -Bellabeats marketing analytics team. I am a member of this team.

    2. Identify the business task.
      *The business task is: As provided by co-founder Urška Sršen, the business task for this project is to gain insight into how consumers are using their non-BellaBeats smart devices in order to guide upcoming marketing strategies for the company which will help drive future growth. Specifically, the researcher was tasked with applying insights driven by the data analysis process to 1 BellaBeats product and presenting those insights to BellaBeats stakeholders.

    Section 2 - Prepare:

    A. Guiding Questions:

    1. Where is the data stored and organized?
    2. Are there any problems with the data?
    3. How does the data help answer the business question?

    B. Key Tasks:

    1. Research and communicate the source of the data, and how it is stored/organized to stakeholders. *The data source used for our case study is FitBit Fitness Tracker Data. This dataset is stored on Kaggle and was made available through user Mobius in an open-source format. Therefore, the data is public and available to be copied, modified, and distributed, all without asking the user for permission. These datasets were generated by respondents to a distributed survey via Amazon Mechanical Turk, reportedly (see credibility section directly below) between 03/12/2016 and 05/12/2016.
      *Reportedly (see credibility section directly below), thirty eligible Fitbit users consented to the submission of personal tracker data, including output related to steps taken, calories burned, time spent sleeping, heart rate, and distance traveled. This data was broken down into minute, hour, and day level totals. This data is stored in 18 CSV documents. I downloaded all 18 documents onto my local laptop and decided to use 2 documents for the purposes of this project, as they merged activity and sleep data from the other documents. All unused documents were permanently deleted from the laptop. The 2 files used were: -sleepDay_merged.csv -dailyActivity_merged.csv

    2. Identify and communicate to stakeholders any problems found with the data related to credibility and bias. *As will be more specifically presented in the Process section, the data seems to have credibility issues related to the reported time frame of the data collected. The metadata seems to indicate that the data collected covered roughly 2 months of FitBit tracking. However, upon my initial data processing, I found that only 1 month of data was reported. *As will be more specifically presented in the Process section, the data has credibility issues related to the number of individuals who reported FitBit data. Specifically, the metadata communicates that 30 individual users agreed to report their tracking data. My initial data processing uncovered 33 individual ...
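    The two credibility checks described above (how many distinct users reported data, and what date range the records cover) were done in Excel by the author. The same checks can be sketched in a few lines of Python; the column names `Id` and `ActivityDate`, the date format and the sample rows below are assumptions for illustration and should be adjusted to the actual headers in dailyActivity_merged.csv.

```python
import csv
import io
from datetime import datetime


def coverage(lines, id_col="Id", date_col="ActivityDate", fmt="%m/%d/%Y"):
    """Return (number of distinct users, first date, last date) for a daily file."""
    ids, dates = set(), []
    for row in csv.DictReader(lines):
        ids.add(row[id_col])
        dates.append(datetime.strptime(row[date_col], fmt).date())
    return len(ids), min(dates), max(dates)


# Made-up rows; the header names are hypothetical stand-ins.
SAMPLE = """Id,ActivityDate,TotalSteps
1001,04/12/2016,13162
1001,04/13/2016,10735
1002,04/12/2016,8163
1002,05/12/2016,9107
"""

n_users, first_day, last_day = coverage(io.StringIO(SAMPLE))
print(n_users, first_day, last_day, (last_day - first_day).days + 1)
```

    Comparing `n_users` and the day span against the figures stated in the metadata is a quick way to surface the kind of discrepancies noted in the task above.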

  15. g

    Soil Temperature Station Data from Permafrost Regions of Russia (Selection...

    • data.globalchange.gov
    • data.wu.ac.at
    Updated Feb 17, 2011
    Cite
    (2011). Soil Temperature Station Data from Permafrost Regions of Russia (Selection of Five Stations), 1880s - 2000 [Dataset]. https://data.globalchange.gov/dataset/nsidc-g02189
    Explore at:
    Dataset updated
    Feb 17, 2011
    Description

    This data set includes soil temperature data from boreholes located at five stations in Russia: Yakutsk, Verkhoyansk, Pokrovsk, Isit', and Churapcha. The data have been compiled into five Microsoft Excel files, one for each station. Each Excel file contains three worksheets:

    • G02189info worksheet: Contains the same content in each Excel file - lat/lon info and notes on the stations
    • Jan soil & surface temp worksheet: Contains winter (January) soil temperature and air temperature (except for the Churapcha Excel file that only contains soil temperature - air temperature was not available)
    • Jul soil & surface temp worksheet: Contains summer (July) soil temperature and air temperature (except for the Churapcha Excel file)

    There are two different versions of the Excel files: a complete version and a subsetted version. Both versions exist for each of the five stations for a total of 10 files. The complete versions of the files reside in the directory called complete and have the word full in their filename. These files contain borehole temperature data at all available standard depths: 0.2 m, 0.4 m, 0.6 m, 0.8 m, 1.2 m, 1.6 m, 2.0 m, 2.4 m, and 3.2 m. The subsetted versions of the files reside in the subset directory and have subset in their filename. These files contain data from the 0.8 m and 3.2 m depths only. Missing data are indicated by the value -999.0. The complete version is more applicable to scientific investigation. The subset version is provided for K-12 teachers and is featured in a classroom activity called "How Permanent is Permafrost?"

    We have included air temperature measured at these five stations when it is available. There are two sources for the surface air temperature data: NCAR World Monthly Surface Station Climatology, 1738-cont and NOAA Global Historical Climatology Network (GHCN) Monthly data set. These two sources both draw on the same single original source: data from the World Meteorological Organization (WMO) station network. The complete files have data from one or both sources, while the subset files only include data from the source with the most complete record.

    These data are being offered as is. NOAA@NSIDC believes these data to be of value but is unable to research and document these data as we do most data sets we publish.
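    Because missing data are coded as -999.0, any average computed directly on the raw values will be badly skewed. The sketch below shows one way to mask the sentinel before summarizing; the temperature values are made up and the helper names are hypothetical.

```python
import math

MISSING = -999.0  # sentinel for missing data, as stated in the description


def mask_missing(values, sentinel=MISSING):
    """Replace the missing-data sentinel with None."""
    return [None if math.isclose(v, sentinel) else v for v in values]


def mean_ignoring_missing(values):
    """Average only the values that are actually present."""
    present = [v for v in mask_missing(values) if v is not None]
    return sum(present) / len(present) if present else None


# Made-up January soil temperatures in degrees Celsius (assumption, not real data).
january_soil = [-12.0, -999.0, -10.0, -11.0]

naive_mean = sum(january_soil) / len(january_soil)
masked_mean = mean_ignoring_missing(january_soil)
print(naive_mean, masked_mean)
```

    When reading the worksheets with Pandas instead, passing `na_values=[-999.0]` to `read_excel` should achieve the same masking at load time.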

  16. p

    Business Activity Survey 2009 - Samoa

    • microdata.pacificdata.org
    Updated Jul 2, 2019
    Cite
    Samoa Bureau of Statistics (2019). Business Activity Survey 2009 - Samoa [Dataset]. https://microdata.pacificdata.org/index.php/catalog/253
    Explore at:
    Dataset updated
    Jul 2, 2019
    Dataset authored and provided by
    Samoa Bureau of Statistics
    Time period covered
    2009
    Area covered
    Samoa
    Description

    Abstract

    The intention is to collect data for the calendar year 2009 (or the nearest year for which each business keeps its accounts). The survey is considered a one-off survey, although for accurate NAs, such a survey should be conducted at least every five years to enable regular updating of the ratios, etc., needed to adjust the ongoing indicator data (mainly VAGST) to NA concepts. The questionnaire will be drafted by FSD, largely following the previous BAS, updated to current accounting terminology where necessary. The questionnaire will be pilot tested, using some accountants who are likely to complete a number of the forms on behalf of their business clients, and a small sample of businesses. Consultations will also include Ministry of Finance, Ministry of Commerce, Industry and Labour, Central Bank of Samoa (CBS), Samoa Tourism Authority, Chamber of Commerce, and other business associations (hotels, retail, etc.).

    The questionnaire will collect a number of items of information about the business ownership, locations at which it operates and each establishment for which detailed data can be provided (in the case of complex businesses), contact information, and other general information needed to clearly identify each unique business. The main body of the questionnaire will collect data on income and expenses, to enable value added to be derived accurately. The questionnaire will also collect data on capital formation, and will contain supplementary pages for relevant industries to collect volume of production data for selected commodities and to collect information to enable an estimate of value added generated by key tourism activities.

    The principal user of the data will be FSD which will incorporate the survey data into benchmarks for the NA, mainly on the current published production measure of GDP. The information on capital formation and other relevant data will also be incorporated into the experimental estimates of expenditure on GDP. The supplementary data on volumes of production will be used by FSD to redevelop the industrial production index which has recently been transferred under the SBS from the CBS. The general information about the business ownership, etc., will be used to update the Business Register.

    Outputs will be produced in a number of formats, including a printed report containing descriptive information of the survey design, data tables, and analysis of the results. The report will also be made available on the SBS website in “.pdf” format, and the tables will be available on the SBS website as Excel tables. Data by region may also be produced, although at a higher level of aggregation than the national data. All data will be fully confidentialised, to protect the anonymity of all respondents. Consideration may also be given to providing confidentialised unit record files (CURFs) to selected analytical users.

    A high level of accuracy is needed because the principal purpose of the survey is to develop revised benchmarks for the NA. The initial plan was that the survey will be conducted as a stratified sample survey, with full enumeration of large establishments and a sample of the remainder.

    Geographic coverage

    National Coverage

    Analysis unit

    The main statistical unit to be used for the survey is the establishment. For simple businesses that undertake a single activity at a single location there is a one-to-one relationship between the establishment and the enterprise. For large and complex enterprises, however, it is desirable to separate each activity of an enterprise into establishments to provide the most detailed information possible for industrial analysis. The business register will need to be developed in such a way that records the links between establishments and their parent enterprises. The business register will be created from administrative records and may not have enough information to recognize all establishments of complex enterprises. Large businesses will be contacted prior to the survey post-out to determine if they have separate establishments. If so, the extended structure of the enterprise will be recorded on the business register and a questionnaire will be sent to the enterprise to be completed for each establishment.

    SBS has decided to follow the New Zealand simplified version of its statistical units model for the 2009 BAS. Future surveys may consider location units and enterprise groups if they are found to be useful for statistical collections.

    It should be noted that while establishment data may enable the derivation of detailed benchmark accounts, it may be necessary to aggregate up to enterprise level data for the benchmarks if the ongoing data used to extrapolate the benchmark forward (mainly VAGST) are only available at the enterprise level.

    Universe

    The BAS covered all employing units and excluded small non-employing units such as market sellers. The survey also excluded central government agencies engaged in public administration (ministries, public education and health, etc.). It covers only businesses that pay the VAGST (threshold SAT$75,000 and upwards).

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    - Total sample size was 1240.
    - Out of the 1240, 902 successfully completed the questionnaire.
    - The remaining 338 either never responded or were omitted (some businesses were omitted from the sample as they did not meet the requirement to be surveyed).
    - Selection was all employing units paying VAGST (threshold SAT$75,000 upwards).
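    The sample figures above are internally consistent, which a short calculation confirms. Note that the rate computed here is completed questionnaires over the full selected sample; because the source does not say how many of the 338 were omitted as ineligible, it should not be read as a response rate among eligible units.

```python
total_sample = 1240   # total sample size stated in the source
completed = 902       # questionnaires successfully completed

not_completed = total_sample - completed
completion_rate = completed / total_sample

print(not_completed)              # non-respondents plus omitted units
print(f"{completion_rate:.1%}")   # share of the selected sample that completed
```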

    WILL CONFIRM LATER!!

    OSO LE MEA E LE FAASA...AEA :-)

    Mode of data collection

    Mail Questionnaire [mail]

    Research instrument

    1. General instructions, authority for the survey, etc;
    2. Business demography information on ownership, contact details, structure, etc.;
    3. Employment;
    4. Income;
    5. Expenses;
    6. Inventories;
    7. Profit or loss and reconciliation to business accounts' profit and loss;
    8. Fixed assets - purchases, disposals, book values;
    9. Thank you and signature of respondent.

    Supplementary Pages

    Additional pages have been prepared to collect data for a limited range of industries.

    1. Production data. To rebase and redevelop the Industrial Production Index (IPI), it is intended to collect volume of production information from a selection of large manufacturing businesses. The selection of businesses and products is critical to the usefulness of the IPI. The products must be homogeneous, and be of enough importance to the economy to justify collecting the data. Significance criteria should be established for the selection of products to include in the IPI, and the 2009 BAS provides an opportunity to collect benchmark data for a range of products known to be significant (based on information in the existing IPI, CPI weights, export data, etc.) as well as open questions for respondents to provide information on other significant products.
    2. Tourism. There is a strong demand for estimates of tourism value added. To estimate tourism value added using the international standard Tourism Satellite Account methodology requires the use of an input-output table, which is beyond the capacity of SBS at present. However, some indicative estimates of the main parts of the economy influenced by tourism can be derived if the necessary data are collected. Tourism is a demand concept, based on defining tourists (the international standard includes both international and domestic tourists), what products are characteristically purchased by tourists, and which industries supply those products. Some questions targeted at those industries that have significant involvement with tourists (hotels, restaurants, transport and tour operators, vehicle hire, etc.), on how much of their income is sourced from tourism would provide valuable indicators of the size of the direct impact of tourism.

    Cleaning operations

    Partial imputation was done at the time of receipt of questionnaires, after follow-up procedures to obtain fully completed questionnaires have been followed. Imputation followed a process, i.e., apply ratios from responding units in the imputation cell to the partial data that was supplied. Procedures were established during the editing stage (a) to preserve the integrity of the questionnaires as supplied by respondents, and (b) to record all changes made to the questionnaires during editing. If SBS staff writes on the form, for example, this should only be done in red pen, to distinguish the alterations from the original information.

    Additional edit checks were developed, including checking against external data at enterprise/establishment level. External data to be checked against include VAGST and SNPF for turnover and purchases, and salaries and wages and employment data respectively. Editing and imputation processes were undertaken by FSD using Excel.
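    The ratio-based partial imputation described above can be illustrated with a small sketch. The source only states that ratios from responding units in the imputation cell were applied to the partial data; the choice of a ratio of totals (rather than, say, a mean of unit-level ratios), the field names `income` and `expenses`, and the figures are all assumptions made for this example. The actual work was done by FSD in Excel.

```python
def impute_from_cell(partial, respondents, known, target):
    """Estimate a missing `target` item for a partially completed record.

    Applies the cell-level ratio sum(target) / sum(known), taken over fully
    responding units in the same imputation cell, to the value that the
    partial respondent did supply.
    """
    total_known = sum(unit[known] for unit in respondents)
    total_target = sum(unit[target] for unit in respondents)
    return partial[known] * total_target / total_known


# Made-up imputation cell: two full respondents and one partial record.
cell_respondents = [
    {"income": 100, "expenses": 60},
    {"income": 300, "expenses": 180},
]
partial_record = {"income": 200}  # expenses were not reported

estimated_expenses = impute_from_cell(
    partial_record, cell_respondents, known="income", target="expenses"
)
print(estimated_expenses)
```

    In keeping with the editing rules above, an imputed value like this would be recorded as a change rather than overwriting what the respondent supplied.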

    Sampling error estimates

    NOT APPLICABLE!!

  17. e

    Waterworks — intake point_reporting

    • data.europa.eu
    • gimi9.com
    unknown
    Updated Feb 7, 2022
    Cite
    (2022). Waterworks — intake point_reporting [Dataset]. https://data.europa.eu/data/datasets/https-data-norge-no-node-1499/
    Explore at:
    unknownAvailable download formats
    Dataset updated
    Feb 7, 2022
    License

    http://spdx.org/licenses/NLOD-2.0

    Description

    The data sets provide an overview of selected data on waterworks registered with the Norwegian Food Safety Authority. The information has been reported by the waterworks through application processing or other reporting to the Norwegian Food Safety Authority. Drinking water regulations require, among other things, annual reporting. The Norwegian Food Safety Authority has created a separate form service for such reporting. The data sets include public or private waterworks that supply 50 people or more. In addition, all municipally owned businesses with their own water supply are included regardless of size. The data sets also contain decommissioned facilities, for those who wish to view historical data, i.e. data for previous years.

    There are data sets for the following supervisory objects:

    1. Water supply system. It also includes analysis of drinking water.
    2. Transport system
    3. Treatment facility
    4. Intake point. It also includes analysis of the water source.

    Below you will find datasets for: 4. Intake point_reporting.

    In addition, there is a file (information.txt) that provides an overview of when the extracts were produced and how many lines there are in the individual files. The extracts are produced weekly. Furthermore, for the data sets water supply system, transport system and intake point it is possible to see historical data on what is included in the annual reporting. To make use of that information, the file must be linked to the "moder" (parent) file to get names and other static information. These files have the _reporting ending in the file name. Descriptions of the data fields (i.e. metadata) in the individual data sets appear in separate files. These are available in pdf format.

    If you double-click the csv file and it opens directly in Excel, the characters æøå will not display correctly. To see the character set correctly in Excel, you must:

    • start Excel and a new spreadsheet
    • select Data and then From Text, and press Import
    • select delimited data and file origin 65001: Unicode (UTF-8), tick "My data has headers" and press Next
    • remove tab as separator and select semicolon as separator, then press Next
    • complete the remaining steps of the import

    The data sets can also be imported into a separate database and compiled as desired. There are link keys in the files that make it possible to link the files together. The waterworks are responsible for the quality of the datasets.

    Purpose: Make data for drinking water supply available to the public.
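
As a rough sketch of the two steps the description mentions (reading the semicolon-separated, UTF-8 encoded files and linking a _reporting file to its parent file through a link key), the Python below may help. The column names, including the link key `system_id`, and the sample rows are made up for illustration; the real field names are documented in the PDF metadata files.

```python
import csv
import io

# Made-up sample rows standing in for a parent file and its _reporting file.
# All column names, including the link key "system_id", are hypothetical.
PARENT_CSV = "system_id;name\n1;Ålesund vannverk\n2;Bodø vannverk\n"
REPORTING_CSV = (
    "system_id;year;persons_supplied\n"
    "1;2020;5200\n"
    "1;2021;5300\n"
    "2;2021;870\n"
)


def read_semicolon_csv(text_stream):
    """Parse a semicolon-separated stream with a header row into dicts."""
    return list(csv.DictReader(text_stream, delimiter=";"))


def link_to_parent(reporting_rows, parent_rows, key):
    """Attach the parent's static fields to each reporting row via the key."""
    parents = {row[key]: row for row in parent_rows}
    return [{**parents.get(row[key], {}), **row} for row in reporting_rows]


parent_rows = read_semicolon_csv(io.StringIO(PARENT_CSV))
reporting_rows = read_semicolon_csv(io.StringIO(REPORTING_CSV))
linked = link_to_parent(reporting_rows, parent_rows, key="system_id")

for row in linked:
    print(row["name"], row["year"], row["persons_supplied"])
```

For a file on disk, the same reader can be given `open(path, encoding="utf-8", newline="")`; the `utf-8-sig` codec is an alternative if the file happens to start with a byte order mark.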

  18. Data from: S1 Dataset -

    • figshare.com
    xlsx
    Updated Apr 19, 2024
    Cite
    David Mahwera; Erick Killel; Ninael Jonas; Adam Hancy; Anna Zangira; Aika Lekey; Rose Msaki; Doris Katana; Rogath Kishimba; Debora Charwe; Fatma Abdallah; Geofrey Chiduo; Ray Masumo; Germana Leyna; Geofrey Mchau (2024). S1 Dataset - [Dataset]. http://doi.org/10.1371/journal.pone.0299025.s001
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Apr 19, 2024
    Dataset provided by
    PLOS ONE
    Authors
    David Mahwera; Erick Killel; Ninael Jonas; Adam Hancy; Anna Zangira; Aika Lekey; Rose Msaki; Doris Katana; Rogath Kishimba; Debora Charwe; Fatma Abdallah; Geofrey Chiduo; Ray Masumo; Germana Leyna; Geofrey Mchau
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: The evaluation of surveillance systems has been recommended by the World Health Organization (WHO) to identify the performance and areas for improvement. Universal salt iodization (USI) as one of the surveillance systems in Tanzania needs periodic evaluation for its optimal function. This study aimed at evaluating the universal salt iodization (USI) surveillance system in Tanzania from January to December 2021 to find out if the system meets its intended objectives by evaluating its attributes as this was the first evaluation of the USI surveillance system since its establishment in 2010. The USI surveillance system is key for monitoring the performance towards the attainment of universal salt iodization (90%).

    Methodology: This evaluation was guided by the Center for Disease Control Guidelines for Evaluating Public Health Surveillance Systems, (MMWR) to evaluate USI 2021 data. The study was conducted in Kigoma region in March 2022. Both Purposive and Convenient sampling was used to select the region, district, and ward for the study. The study involved reviewing documents used in the USI system and interviewing the key informants in the USI program. Data analysis was done by Microsoft Excel and presented in tables and graphs.

    Results: A total of 1715 salt samples were collected in the year 2021 with 279 (16%) of non-iodized salt identified. The majority of the system attributes 66.7% had a good performance with a score of three, 22.2% had a moderate performance with a score of two and one attribute with poor performance with a score of one. Data quality, completeness and sensitivity were 100%, acceptability 91.6%, simplicity 83% were able to collect data on a single sample in < 2 minutes, the system stability in terms of performance was >75% and the usefulness of the system had poor performance.

    Conclusion: Although the system attributes were found to be working overall well, for proper surveillance of the USI system, the core attributes need to be strengthened. Key variables that measure the system performance must be included from the primary data source and well-integrated with the Local Government (district and regions) to Ministry of Health information systems.

  19. Supporting data: Behaviour-related scalar habitat selection by Cape buffalo...

    • zenodo.org
    Updated Jan 21, 2020
    Cite
    Emily Bennitt; Mpaphi Casper Bonyongo; Stephen Harris (2020). Supporting data: Behaviour-related scalar habitat selection by Cape buffalo (Syncerus caffer caffer) [Dataset]. http://doi.org/10.5281/zenodo.18871
    Explore at:
    Dataset updated
    Jan 21, 2020
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Emily Bennitt; Mpaphi Casper Bonyongo; Stephen Harris
    Description

    Excel files containing source data for habitat selection

  20. 1.05 Feeling of Safety in Your Neighborhood (summary)

    • catalog.data.gov
    • performance.tempe.gov
    • +7 more
    Updated Nov 1, 2025
    Cite
    City of Tempe (2025). 1.05 Feeling of Safety in Your Neighborhood (summary) [Dataset]. https://catalog.data.gov/dataset/1-05-feeling-of-safety-in-your-neighborhood-summary-8efc2
    Explore at:
    Dataset updated
    Nov 1, 2025
    Dataset provided by
    City of Tempe
    Description

    Tempe’s trust data for this measure is collected every month and comes from the “Safety” result from the monthly administered Police Sentiment Survey. There is one question which feeds into these results: "When it comes to the threat of crime, how safe do you feel in your neighborhood?" Benchmark data is from cohorts of communities with similar characteristics, such as size, population density, and region. This data is collected every month and quarter via a recurring report.

    This page provides data for the Feeling of Safety in Your Neighborhood performance measure. The performance measure dashboard is available at 1.05 Feeling of Safety in Your Neighborhood.

    Data Dictionary

    Additional Information
    Source: Zencity
    Contact: Amber Asburry
    Contact email: strategic_management_innovation@tempe.gov
    Data Source Type: Excel, CSV
    Preparation Method: Take the "Safety" score from the Police Sentiment Survey. This score includes the average of the top two results from the question underneath this area on the report. These months are then averaged to get the quarterly score.
    Publish Frequency: Monthly
    Publish Method: Manual
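
The last step of the preparation method, averaging the monthly scores into a quarterly score, can be illustrated with a small Python sketch. The monthly values below are made up, and the sketch assumes a plain unweighted mean of the three months; it does not model how the monthly "Safety" score itself is derived from the survey responses.

```python
from statistics import mean

# Made-up monthly "Safety" scores for one quarter, for illustration only.
monthly_scores = {"2025-01": 62.0, "2025-02": 65.0, "2025-03": 68.0}


def quarterly_score(scores_by_month):
    """Average the monthly scores (assumed: simple unweighted mean)."""
    return mean(scores_by_month.values())


q1_score = quarterly_score(monthly_scores)
print(q1_score)
```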

Cite
Various (2025). Equity Report Data: Geography [Dataset]. https://data.sandiegocounty.gov/dataset/Equity-Report-Data-Geography/p6uw-qxpv

Equity Report Data: Geography

Explore at:
Available download formats: application/geo+json, csv, kmz, kml, xlsx, xml
Dataset updated
May 21, 2025
Dataset authored and provided by
Various
Description

This dataset contains the geographic data used to create maps for the San Diego County Regional Equity Indicators Report led by the Office of Equity and Racial Justice (OERJ). The full report can be found here: https://data.sandiegocounty.gov/stories/s/7its-kgpt

Demographic data from the report can be found here: https://data.sandiegocounty.gov/dataset/Equity-Report-Data-Demographics/q9ix-kfws

Filter by the Indicator column to select data for a particular indicator map.

Export notes: Dataset may not automatically open correctly in Excel due to geospatial data. To export the data for geospatial analysis, select Shapefile or GEOJSON as the file type. To view the data in Excel, export as a CSV but do not open the file. Then, open a blank Excel workbook, go to the Data tab, select “From Text/CSV,” and follow the prompts to import the CSV file into Excel. Alternatively, use the exploration options in "View Data" to hide the geographic column prior to exporting the data.
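
For readers working outside Excel, the filtering and column-hiding steps described above could look roughly like this in Python. Only the `Indicator` column name comes from the description; the other column names (including the geographic column, here called `the_geom`) and the sample rows are made up for illustration.

```python
import csv
import io

# Made-up sample of a CSV export. Only "Indicator" is a documented column;
# "Geography", "Percent" and "the_geom" are hypothetical names.
SAMPLE_EXPORT = (
    "Indicator,Geography,Percent,the_geom\n"
    'Poverty,ZCTA,12.5,"POLYGON ((0 0, 1 0, 1 1, 0 0))"\n'
    'Employment,PUMA,94.1,"POLYGON ((2 2, 3 2, 3 3, 2 2))"\n'
    'Poverty,ZCTA,8.3,"POLYGON ((4 4, 5 4, 5 5, 4 4))"\n'
)


def select_indicator(rows, indicator, drop_columns=()):
    """Keep rows for one indicator and leave out the listed columns."""
    return [
        {name: value for name, value in row.items() if name not in drop_columns}
        for row in rows
        if row["Indicator"] == indicator
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))
poverty_rows = select_indicator(rows, "Poverty", drop_columns=("the_geom",))

for row in poverty_rows:
    print(row)
```

Dropping the geographic column before writing the result back out keeps the file small and avoids the long geometry strings that the export notes warn about.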

USER NOTES: 4/7/2025 - The maps and data have been removed for the Health Professional Shortage Areas indicator due to inconsistencies with the data source leading to some missing health professional shortage areas. We are working to fix this issue, including exploring possible alternative data sources.

5/21/2025 - The following changes were made to the 2023 report data (Equity Report Year = 2023). Self-Sufficiency Wage - a typo in the indicator name was fixed (changed sufficienct to sufficient) and the percent for one PUMA corrected from 56.9 to 59.9 (PUMA = San Diego County (Northwest)--Oceanside City & Camp Pendleton). Notes were made consistent for all rows where geography = ZCTA. A note was added to all rows where geography = PUMA. Voter registration - label "92054, 92051" was renamed to be in numerical order and is now "92051, 92054". Removed data from the percentile column because the categories are not true percentiles. Employment - Data was corrected to show the percent of the labor force that are employed (ages 16 and older). Previously, the data was the percent of the population 16 years and older that are in the labor force. 3- and 4-Year-Olds Enrolled in School - percents are now rounded to one decimal place. Poverty - the last two categories/percentiles changed because the 80th percentile cutoff was corrected by 0.01 and one ZCTA was reassigned to a different percentile as a result. Low Birthweight - the 33th percentile label was corrected to be written as the 33rd percentile. Life Expectancy - Corrected the category and percentile assignment for SRA CENTRAL SAN DIEGO. Parks and Community Spaces - corrected the category assignment for six SRAs.

5/21/2025 - Data was uploaded for Equity Report Year 2025. The following changes were made relative to the 2023 report year. Adverse Childhood Experiences - added geographic data for 2025 report. No calculation of bins nor corresponding percentiles due to small number of geographic areas. Low Birthweight - no calculation of bins nor corresponding percentiles due to small number of geographic areas.

Prepared by: Office of Evaluation, Performance, and Analytics and the Office of Equity and Racial Justice, County of San Diego, in collaboration with the San Diego Regional Policy & Innovation Center (https://www.sdrpic.org).
