23 datasets found
  1. Equity Report Data: Geography

    • data.sandiegocounty.gov
    Updated May 21, 2025
    Cite
    Various (2025). Equity Report Data: Geography [Dataset]. https://data.sandiegocounty.gov/dataset/Equity-Report-Data-Geography/p6uw-qxpv
    Explore at:
    Available download formats: application/rssxml, application/rdfxml, csv, tsv, xml, application/geo+json, kmz, kml
    Dataset updated
    May 21, 2025
    Dataset authored and provided by
    Various
    Description

    This dataset contains the geographic data used to create maps for the San Diego County Regional Equity Indicators Report led by the Office of Equity and Racial Justice (OERJ). The full report can be found here: https://data.sandiegocounty.gov/stories/s/7its-kgpt

    Demographic data from the report can be found here: https://data.sandiegocounty.gov/dataset/Equity-Report-Data-Demographics/q9ix-kfws

    Filter by the Indicator column to select data for a particular indicator map.

    Export notes: Dataset may not automatically open correctly in Excel due to geospatial data. To export the data for geospatial analysis, select Shapefile or GEOJSON as the file type. To view the data in Excel, export as a CSV but do not open the file. Then, open a blank Excel workbook, go to the Data tab, select “From Text/CSV,” and follow the prompts to import the CSV file into Excel. Alternatively, use the exploration options in "View Data" to hide the geographic column prior to exporting the data.
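    The filtering and export notes above can also be applied outside Excel. The following is a minimal sketch, assuming a CSV export; only the Indicator column is named in the description, so the other column names (Geography, Percent, the_geom) and all sample values are hypothetical.

```python
import csv
import io

# Hypothetical sample standing in for a CSV export of the dataset.
# Only "Indicator" is named in the description; every other column
# name and value here is an illustrative assumption.
SAMPLE_CSV = """Indicator,Geography,Percent,the_geom
Poverty,Area A,12.3,"MULTIPOLYGON (((...)))"
Poverty,Area B,9.8,"MULTIPOLYGON (((...)))"
Employment,Area A,94.1,"MULTIPOLYGON (((...)))"
"""


def rows_for_indicator(csv_text, indicator, geo_column="the_geom"):
    """Return the rows for one indicator, without the geographic column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    selected = []
    for row in reader:
        if row["Indicator"] == indicator:
            row.pop(geo_column, None)  # drop the bulky geometry text
            selected.append(row)
    return selected


poverty_rows = rows_for_indicator(SAMPLE_CSV, "Poverty")
for row in poverty_rows:
    print(row)
```

    Dropping the geometry column mirrors the suggestion to hide the geographic column before exporting.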

    USER NOTES: 4/7/2025 - The maps and data have been removed for the Health Professional Shortage Areas indicator due to inconsistencies with the data source leading to some missing health professional shortage areas. We are working to fix this issue, including exploring possible alternative data sources.

    5/21/2025 - The following changes were made to the 2023 report data (Equity Report Year = 2023).
    Self-Sufficiency Wage - a typo in the indicator name was fixed (changed sufficienct to sufficient) and the percent for one PUMA corrected from 56.9 to 59.9 (PUMA = San Diego County (Northwest)--Oceanside City & Camp Pendleton). Notes were made consistent for all rows where geography = ZCTA. A note was added to all rows where geography = PUMA.
    Voter registration - label "92054, 92051" was renamed to be in numerical order and is now "92051, 92054". Removed data from the percentile column because the categories are not true percentiles.
    Employment - Data was corrected to show the percent of the labor force that are employed (ages 16 and older). Previously, the data was the percent of the population 16 years and older that are in the labor force.
    3- and 4-Year-Olds Enrolled in School - percents are now rounded to one decimal place.
    Poverty - the last two categories/percentiles changed because the 80th percentile cutoff was corrected by 0.01 and one ZCTA was reassigned to a different percentile as a result.
    Low Birthweight - the 33th percentile label was corrected to be written as the 33rd percentile.
    Life Expectancy - Corrected the category and percentile assignment for SRA CENTRAL SAN DIEGO.
    Parks and Community Spaces - corrected the category assignment for six SRAs.
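    The Employment correction noted above swaps one ratio for another. As a rough illustration of the difference, the sketch below computes both measures from hypothetical counts; the function names and numbers are assumptions, not values from the report.

```python
def employment_rate(employed, labor_force):
    """Percent of the labor force (ages 16 and older) that is employed."""
    return round(100 * employed / labor_force, 1)


def labor_force_participation_rate(labor_force, population_16_plus):
    """Percent of the population 16 and older that is in the labor force."""
    return round(100 * labor_force / population_16_plus, 1)


# Hypothetical counts for a single geographic area.
population_16_plus = 10_000
labor_force = 6_500
employed = 6_110

corrected_value = employment_rate(employed, labor_force)
previous_value = labor_force_participation_rate(labor_force, population_16_plus)
print(corrected_value, previous_value)
```

    With these made-up counts the two measures differ by roughly 29 percentage points, which is why the earlier values could not be compared with the corrected ones.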

    5/21/2025 - Data was uploaded for Equity Report Year 2025. The following changes were made relative to the 2023 report year. Adverse Childhood Experiences - added geographic data for 2025 report. No calculation of bins nor corresponding percentiles due to small number of geographic areas. Low Birthweight - no calculation of bins nor corresponding percentiles due to small number of geographic areas.

    Prepared by: Office of Evaluation, Performance, and Analytics and the Office of Equity and Racial Justice, County of San Diego, in collaboration with the San Diego Regional Policy & Innovation Center (https://www.sdrpic.org).

  2. Waterworks — intake point_analysis

    • data.europa.eu
    unknown
    Updated Feb 7, 2022
    Cite
    (2022). Waterworks — intake point_analysis [Dataset]. https://data.europa.eu/data/datasets/https-data-norge-no-node-1439
    Explore at:
    Available download formats: unknown
    Dataset updated
    Feb 7, 2022
    License

    https://data.norge.no/nlod/en/2.0/

    Description

    The data sets provide an overview of selected data on waterworks registered with the Norwegian Food Safety Authority. The information has been reported by the waterworks through application processing or other reporting to the Norwegian Food Safety Authority. Drinking water regulations require, among other things, annual reporting. The Norwegian Food Safety Authority has created a separate form service for such reporting.

    The data sets include public or private waterworks that supply 50 people or more. In addition, all municipally owned businesses with their own water supply are included regardless of size. The data sets also contain decommissioned facilities, for those who wish to view historical data, i.e. data for previous years or earlier.

    There are data sets for the following supervisory objects:
    1. Water supply system. It also includes analysis of drinking water.
    2. Transport system
    3. Treatment facility
    4. Intake point. It also includes analysis of the water source.

    Below you will find the data sets for the 4th object, intake point_analysis. In addition, there is a file (information.txt) that provides an overview of when the extracts were produced and how many lines there are in the individual files. The extracts are produced weekly.

    Furthermore, for the data sets water supply system, transport system and intake point it is possible to see historical data on what is included in the annual reporting. To make use of that information, the file must be linked to the “moder” file to get names and other static information. These files have the _reporting ending in the file name.

    Descriptions of the data fields (i.e. metadata) in the individual data sets appear in separate files. These are available in pdf format.

    If you double-click the csv file and it opens directly in Excel, the characters æøå will not display correctly. To see the character set correctly in Excel, you must:
    1. Start Excel and a new spreadsheet.
    2. Select Data and then From Text, and press Import.
    3. Select delimited data and file origin 65001: Unicode (UTF-8), tick "My data has headers", and press Next.
    4. Remove tab as separator and select semicolon as separator, then press Next.
    5. Complete the remaining import steps.

    The data sets can be imported into a separate database and compiled as desired. There are link keys in the files that make it possible to link the files together. The waterworks are responsible for the quality of the datasets.

    Purpose: Make information on the supply of drinking water available to the public.
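    The Excel import steps above amount to two settings: UTF-8 decoding and a semicolon separator. A minimal sketch of the same settings outside Excel follows; the field names and the two sample rows are hypothetical and are not taken from the actual extracts.

```python
import csv
import io

# Hypothetical extract, encoded as UTF-8 with semicolon separators.
# Field names and values are illustrative assumptions only.
SAMPLE_BYTES = (
    "id;navn;kommune\n"
    "1;Eksempel vannverk Å;Bømlo\n"
    "2;Eksempel vassverk Ø;Ås\n"
).encode("utf-8")


def read_extract(raw_bytes):
    """Decode a semicolon-separated UTF-8 extract into a list of dicts."""
    text = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="utf-8", newline="")
    return list(csv.DictReader(text, delimiter=";"))


rows = read_extract(SAMPLE_BYTES)
for row in rows:
    print(row["navn"], "-", row["kommune"])
```

    For a file on disk, open(path, encoding="utf-8", newline="") plays the role of the wrapper used here. If a file turns out to start with a byte-order mark, "utf-8-sig" may be the better encoding choice.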

  3. 3.07 AZ Merit Data (summary)

    • catalog.data.gov
    • data-academy.tempe.gov
    • +12 more
    Updated Jan 17, 2025
    Cite
    City of Tempe (2025). 3.07 AZ Merit Data (summary) [Dataset]. https://catalog.data.gov/dataset/3-07-az-merit-data-summary-55307
    Dataset updated
    Jan 17, 2025
    Dataset provided by
    City of Tempe
    Area covered
    Arizona
    Description

    This page provides data for the 3rd Grade Reading Level Proficiency performance measure.

    The dataset includes the student performance results on the English/Language Arts section of the AzMERIT from Fall 2017 and Spring 2018. Data is representative of students in third grade in public elementary schools in Tempe. This includes schools from both the Tempe Elementary and Kyrene districts. Results are by school and provide the total number of students tested, the total percentage passing, and the percentage of students scoring at each of the four levels of proficiency.

    The performance measure dashboard is available at 3.07 3rd Grade Reading Level Proficiency.

    Additional Information
    Source: Arizona Department of Education
    Contact: Ann Lynn DiDomenico
    Contact E-Mail: Ann_DiDomenico@tempe.gov
    Data Source Type: Excel/CSV
    Preparation Method: Filters on original dataset: within "Schools" Tab School District [select Tempe School District and Kyrene School District]; School Name [deselect Kyrene SD not in Tempe city limits]; Content Area [select English Language Arts]; Test Level [select Grade 3]; Subgroup/Ethnicity [select All Students]; Remove irrelevant fields; Add Fiscal Year
    Publish Frequency: Annually as data becomes available
    Publish Method: Manual
    Data Dictionary
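    The preparation method described above can be sketched as a sequence of row filters. The filter columns and selected values come from the description; the sample rows, the school names, the "Percent Passing" field, the set of schools outside Tempe, and the fiscal year label are hypothetical.

```python
# Hypothetical rows standing in for the "Schools" tab of the source file.
ROWS = [
    {"School District": "Tempe School District", "School Name": "School A",
     "Content Area": "English Language Arts", "Test Level": "Grade 3",
     "Subgroup/Ethnicity": "All Students", "Percent Passing": "41"},
    {"School District": "Kyrene School District", "School Name": "School B",
     "Content Area": "English Language Arts", "Test Level": "Grade 3",
     "Subgroup/Ethnicity": "All Students", "Percent Passing": "58"},
    {"School District": "Kyrene School District", "School Name": "School C",
     "Content Area": "English Language Arts", "Test Level": "Grade 3",
     "Subgroup/Ethnicity": "All Students", "Percent Passing": "62"},
    {"School District": "Tempe School District", "School Name": "School A",
     "Content Area": "Mathematics", "Test Level": "Grade 3",
     "Subgroup/Ethnicity": "All Students", "Percent Passing": "39"},
]

DISTRICTS = {"Tempe School District", "Kyrene School District"}
OUTSIDE_TEMPE = {"School C"}  # assumed list of Kyrene schools outside city limits


def prepare(rows, fiscal_year):
    """Apply the described filters, keep selected fields, add Fiscal Year."""
    prepared = []
    for row in rows:
        keep = (
            row["School District"] in DISTRICTS
            and row["School Name"] not in OUTSIDE_TEMPE
            and row["Content Area"] == "English Language Arts"
            and row["Test Level"] == "Grade 3"
            and row["Subgroup/Ethnicity"] == "All Students"
        )
        if keep:
            prepared.append({
                "School Name": row["School Name"],
                "Percent Passing": row["Percent Passing"],
                "Fiscal Year": fiscal_year,
            })
    return prepared


prepared = prepare(ROWS, fiscal_year="2017/2018")  # label format is assumed
for row in prepared:
    print(row)
```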

  4. 1.05 Feeling of Safety in Your Neighborhood (summary)

    • catalog.data.gov
    Updated Sep 27, 2025
    Cite
    City of Tempe (2025). 1.05 Feeling of Safety in Your Neighborhood (summary) [Dataset]. https://catalog.data.gov/dataset/1-05-feeling-of-safety-in-your-neighborhood-summary-8efc2
    Dataset updated
    Sep 27, 2025
    Dataset provided by
    City of Tempe
    Description

    Tempe’s trust data for this measure is collected every month and comes from the “Safety” result of the monthly Police Sentiment Survey. One question feeds into these results: "When it comes to the threat of crime, how safe do you feel in your neighborhood?" Benchmark data is from cohorts of communities with similar characteristics, such as size, population density, and region. This data is collected every month and quarter via a recurring report.

    This page provides data for the Feeling of Safety in Your Neighborhood performance measure. The performance measure dashboard is available at 1.05 Feeling of Safety in Your Neighborhood.

    Data Dictionary

    Additional Information
    Source: Zencity
    Contact: Amber Asburry
    Contact email: strategic_management_innovation@tempe.gov
    Data Source Type: Excel, CSV
    Preparation Method: Take the "Safety" score from the Police Sentiment Survey. This score includes the average of the top two results from the question underneath this area on the report. These months are then averaged to get the quarterly score.
    Publish Frequency: Monthly
    Publish Method: Manual
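    The last step of the preparation method, averaging monthly scores into a quarterly score, can be sketched as follows. The monthly values and the "YYYY-MM" key format are hypothetical; the actual report layout is not described in the source.

```python
from statistics import mean

# Hypothetical monthly "Safety" scores keyed by "YYYY-MM".
MONTHLY_SCORES = {
    "2025-01": 70.0, "2025-02": 73.0, "2025-03": 73.0,
    "2025-04": 75.0, "2025-05": 74.0, "2025-06": 76.0,
}


def quarter_of(month_key):
    """Map 'YYYY-MM' to 'YYYY-Qn' using calendar quarters (an assumption)."""
    year, month = month_key.split("-")
    return f"{year}-Q{(int(month) - 1) // 3 + 1}"


def quarterly_scores(monthly):
    """Average the monthly scores that fall in each quarter."""
    buckets = {}
    for month_key, score in monthly.items():
        buckets.setdefault(quarter_of(month_key), []).append(score)
    return {quarter: round(mean(scores), 1) for quarter, scores in buckets.items()}


quarterly = quarterly_scores(MONTHLY_SCORES)
print(quarterly)
```

    Calendar quarters are used here for simplicity; a fiscal-year quarter definition would only change quarter_of.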

  5. Data files used to plot Figures 1-6, Supplemental Figures 1 and 2 in...

    • figshare.com
    txt
    Updated Aug 18, 2023
    Cite
    Rory Cottrell (2023). Data files used to plot Figures 1-6, Supplemental Figures 1 and 2 in Cottrell et al 2023, "No Late Cretaceous true polar wander oscillation and implications for stability of Earth relative to the rotation axis" [Dataset]. http://doi.org/10.6084/m9.figshare.23508861.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Aug 18, 2023
    Dataset provided by
    figshare
    Authors
    Rory Cottrell
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Earth
    Description

    Also saved as Readme.txt

    SupplTable1_June2023.xlsx: Supplementary Table 1: Directional data as reported by Mitchell et al 2021. This table has been filtered for repeated samples, and data not made available from the original data source.
    Column headings:
    Section - stratigraphic section
    Sample Name - sample identification of Mitchell et al 2021
    Bin Age (Ma) - age assigned by Mitchell et al 2021
    Age (Ma) - age of samples as reported by Mitchell et al 2021
    GDec (°), GInc (°) - geographic directions as reported by Mitchell et al 2021
    SDec (°), SInc (°) - stratigraphic directions as reported by Mitchell et al 2021
    SPDec (°), SPInc (°) - stratigraphic directions reported as northern hemisphere directions by Mitchell et al 2021

    SupplTable2_June2023.xlsx: Supplementary Table 2: 1 million year binned average directions and poles for Apiro and Furlo sections.
    Column headings:
    Section - stratigraphic section
    Age (Ma) - binned ages from Suppl. Table 1
    dec (°) and inc (°) - Fisher directions of age bins
    n - number of individual directions in each bin
    alpha95 (°) - circle of confidence radius for Fisher means
    Plon (°), Plat (°), A95 (°) - poles calculated from directions in Suppl. Table 1 using section locations as reported in Mitchell et al 2021, binned and averaged and presented here. These poles are used for Supplementary Figures 1 and 2 in the Cottrell et al 2023 manuscript.

    Figure1data.xlsx: Table of published paleomagnetic poles as presented in Mitchell et al 2021. The published age bin (included for comparison) is corrected in the last column for bin age assignment based on data provided in the Mitchell et al 2021 publication. This data set forms the basis of Figure 1 in the Cottrell et al 2023 manuscript.

    Figure2data.xlsx: Excel file of select data columns of two samples (C10AD253 and C10AD268) presented as reverse and normal polarity examples from the data set of Mitchell et al 2021. Each file is presented on a separate tab in a format compatible with MagIC database format. Original data can be downloaded via the link provided in Mitchell et al 2021. Original data files were filtered using python and bash scripts to select demagnetization step, magnetization moment, and stratigraphic corrected directions.
    Column headings:
    Sample - sample name
    Demag Step (°C) - demagnetization step in degrees C
    Moment (emu) - magnetization moment in electromagnetic units
    SDec (°) - declination corrected for geographic direction and strike/bedding dip
    SInc (°) - inclination corrected for geographic direction and strike/bedding dip

    Figure3data.zip: Zipped folder of hysteresis data presented in Figure 3 of Cottrell et al 2023. Data were collected on a Princeton Measurements Alternating Gradient Force Magnetometer Model 2900 with a P1 probe. The probe and diamagnetic/paramagnetic adjusted hysteresis data file and first order reversal curve files for each sample are provided. The PDF and text output of forcsensei (publicly available python code for evaluating first order reversal curves) are also provided.

    Figure4data.xlsx: Excel file of data presented in Figure 4. The original demagnetization data files can be found in the link provided in Mitchell et al 2021. Magnetization moment of ~580 degrees C of each data file (580 degrees specifically was not always used as a demagnetization step by the original authors) was used to calculate the percent natural remanent magnetization remaining as normalized by the zero demagnetization step magnetization moment. Any sample line designated excursion or transition was removed from the analysis. Samples were grouped based on Chron designation into Normal (32n, 33n, 34n) or Reverse (32r, 33r). Histograms of % NRM remaining after demagnetization to 580 degrees were plotted in Figure 4.

    Figure5data.xlsx: Data file for plotting Figure 5. Originally presented in Mitchell et al 2021, and filtered for repeated data lines and files not made available for download. See Supplementary Data Table 1 for full details.
    Column headings:
    Section - sedimentary section
    Sample Name - sample designation assigned by Mitchell et al 2021
    Bin Age - bin age in millions of years as assigned by Mitchell et al 2021
    Age - age in millions of years as determined by Mitchell et al 2021
    Chron - chron assignment as determined by Mitchell et al 2021
    SDec - stratigraphic declination as presented by Mitchell et al 2021
    SInc - stratigraphic inclination as presented by Mitchell et al 2021
    NRMleft580 - percent NRM remaining after demagnetization to ~580 degrees. See Methods in Cottrell et al 2023 for details.

    Figure6data.xlsx: Excel data file of filtered directional data presented in Mitchell et al 2021. Characteristic remanent magnetization directions and low temperature component fits as presented in Mitchell et al 2021, and filtered for excluded data lines, repeated measurement lines, and only for the Apiro sedimentary section. Chron 33r and 33n data are presented in separate tabs.
    Column headings:
    Section - sedimentary section
    Sample Name - sample name as presented in Mitchell et al 2021
    Bin Age - as presented in Mitchell et al 2021
    Age - as presented in Mitchell et al 2021
    Chron - chron assignment as presented in Mitchell et al 2021
    GDec - geographic declination direction of the characteristic remanent magnetization, as presented in Mitchell et al 2021
    GInc - geographic inclination direction of the characteristic remanent magnetization, as presented in Mitchell et al 2021
    NRMleft580 - percent natural remanent magnetization remaining after demagnetization to ~580 degrees
    LTGDec - low temperature geographic declination direction as presented by Mitchell et al 2021
    LTGInc - low temperature geographic inclination direction as presented by Mitchell et al 2021
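    The NRMleft580 quantity described for Figure 4 is a simple normalization. The sketch below is one possible reading of it, not the authors' script: it assumes the first listed step is the zero demagnetization step and picks the step nearest 580 degrees C, since the description notes that 580 was not always an actual step. The sample values are hypothetical.

```python
def percent_nrm_remaining(steps, target_c=580.0):
    """Percent of the initial moment left at the step nearest target_c.

    steps: list of (temperature_c, moment_emu) pairs in measurement order.
    The first pair is treated as the zero demagnetization step (assumption).
    """
    initial_moment = steps[0][1]
    # Choosing the nearest step is an assumption about how "~580" was handled.
    _, moment_near_target = min(steps, key=lambda step: abs(step[0] - target_c))
    return 100.0 * moment_near_target / initial_moment


# Hypothetical demagnetization sequence for one sample.
SAMPLE_STEPS = [
    (0.0, 2.0e-5),
    (200.0, 1.6e-5),
    (400.0, 1.1e-5),
    (575.0, 5.0e-6),
    (600.0, 3.0e-6),
]

nrm_left = percent_nrm_remaining(SAMPLE_STEPS)
print(round(nrm_left, 1))
```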

    Figure 7 is a statistical model based on input parameters; there is no data associated with it.
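    Supplementary Table 2 above reports Fisher mean directions with n and alpha95 for each age bin. For readers unfamiliar with those columns, here is a minimal sketch using the standard Fisher (1953) formulas; it is not taken from the authors' code, and the input directions are hypothetical.

```python
import math


def fisher_mean(directions):
    """Fisher mean of (declination, inclination) pairs given in degrees.

    Requires at least two directions that are not all identical.
    """
    n = len(directions)
    x = y = z = 0.0
    for dec, inc in directions:
        d, i = math.radians(dec), math.radians(inc)
        x += math.cos(i) * math.cos(d)  # north component
        y += math.cos(i) * math.sin(d)  # east component
        z += math.sin(i)                # down component
    r = math.sqrt(x * x + y * y + z * z)  # resultant vector length
    mean_dec = math.degrees(math.atan2(y, x)) % 360.0
    mean_inc = math.degrees(math.asin(z / r))
    k = (n - 1) / (n - r)  # precision parameter estimate
    cos_a95 = 1.0 - ((n - r) / r) * ((1.0 / 0.05) ** (1.0 / (n - 1)) - 1.0)
    alpha95 = math.degrees(math.acos(max(-1.0, min(1.0, cos_a95))))
    return {"dec": mean_dec, "inc": mean_inc, "n": n, "k": k, "alpha95": alpha95}


# Hypothetical directions for one age bin.
BIN_DIRECTIONS = [(350.0, 45.0), (10.0, 45.0), (0.0, 40.0), (0.0, 50.0)]
result = fisher_mean(BIN_DIRECTIONS)
print({key: round(value, 2) for key, value in result.items()})
```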

  6. Verden Source LLC

    • data.cityofchicago.org
    Updated Oct 7, 2025
    Cite
    City of Chicago (2025). Verden Source LLC [Dataset]. https://data.cityofchicago.org/Community-Economic-Development/Verden-Source-LLC/qkv5-pk99
    Explore at:
    Available download formats: application/geo+json, csv, xlsx, kml, kmz, xml
    Dataset updated
    Oct 7, 2025
    Authors
    City of Chicago
    Description

    This dataset contains all current and active business licenses issued by the Department of Business Affairs and Consumer Protection. This dataset contains a large number of records/rows of data and may not be viewable in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in a plain-text editor, such as Notepad or Wordpad, to view and search.

    Data fields requiring description are detailed below.

    APPLICATION TYPE: 'ISSUE' is the record associated with the initial license application. 'RENEW' is a subsequent renewal record. All renewal records are created with a term start date and term expiration date. 'C_LOC' is a change of location record. It means the business moved. 'C_CAPA' is a change of capacity record. Only a few license types may file this type of application. 'C_EXPA' only applies to businesses that have liquor licenses. It means the business location expanded.

    LICENSE STATUS: 'AAI' means the license was issued.
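    Because the file may be too large for Excel, the codes described above can also be tallied by streaming the CSV row by row. This is a minimal sketch: the code meanings come from the field descriptions, while the exact header spellings in the export ("LICENSE ID", "APPLICATION TYPE", "LICENSE STATUS") and the sample rows are assumptions.

```python
import csv
import io
from collections import Counter

# Meanings taken from the field descriptions above.
APPLICATION_TYPES = {
    "ISSUE": "initial license application",
    "RENEW": "subsequent renewal",
    "C_LOC": "change of location",
    "C_CAPA": "change of capacity",
    "C_EXPA": "business location expanded (liquor licenses only)",
}

# Hypothetical rows; header spellings in the real export are assumed.
SAMPLE_CSV = """LICENSE ID,APPLICATION TYPE,LICENSE STATUS
1001,ISSUE,AAI
1002,RENEW,AAI
1003,RENEW,AAI
1004,C_LOC,AAI
"""


def count_application_types(lines):
    """Count records per application type without loading the whole file."""
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row["APPLICATION TYPE"]] += 1
    return counts


counts = count_application_types(io.StringIO(SAMPLE_CSV))
for code, total in counts.items():
    print(code, APPLICATION_TYPES.get(code, "unknown code"), total)
```

    Passing an open file object instead of the StringIO buffer keeps memory use flat regardless of file size.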

    Business license owners may be accessed at: http://data.cityofchicago.org/Community-Economic-Development/Business-Owners/ezma-pppn To identify the owner of a business, you will need the account number or legal name.

    Data Owner: Business Affairs and Consumer Protection

    Time Period: Current

    Frequency: Data is updated daily

  7. Waterworks — water supply system

    • gimi9.com
    Cite
    Waterworks — water supply system [Dataset]. https://gimi9.com/dataset/eu_https-data-norge-no-node-1422/
    Description

    The data sets provide an overview of selected data on waterworks registered with the Norwegian Food Safety Authority. The information has been reported by the waterworks through application processing or other reporting to the Norwegian Food Safety Authority. Drinking water regulations require, among other things, annual reporting. The Norwegian Food Safety Authority has created a separate form service for such reporting.

    The data sets include public or private waterworks that supply 50 people or more. In addition, all municipally owned businesses with their own water supply are included regardless of size. The data sets also contain decommissioned facilities, for those who wish to view historical data, i.e. data for previous years or earlier.

    There are data sets for the following supervisory objects:
    1. Water supply system. It also includes analysis of drinking water.
    2. Transport system
    3. Treatment facility
    4. Intake point. It also includes analysis of the water source.

    Below you will find the data sets for the 1st object, water supply system. In addition, there is a file (information.txt) that provides an overview of when the extracts were produced and how many lines there are in the individual files. The extracts are produced weekly.

    Furthermore, for the data sets water supply system, transport system and intake point it is possible to see historical data on what is included in the annual reporting. To make use of that information, the file must be linked to the “moder” file to get names and other static information. These files have the _reporting ending in the file name.

    Descriptions of the data fields (i.e. metadata) in the individual data sets appear in separate files. These are available in pdf format.

    If you double-click the csv file and it opens directly in Excel, the characters æøå will not display correctly. To see the character set correctly in Excel, you must:
    1. Start Excel and a new spreadsheet.
    2. Select Data and then From Text, and press Import.
    3. Select delimited data and file origin 65001: Unicode (UTF-8), tick "My data has headers", and press Next.
    4. Remove tab as separator and select semicolon as separator, then press Next.
    5. Complete the remaining import steps.

    The data sets can be imported into a separate database and compiled as desired. There are link keys in the files that make it possible to link the files together. Waterworks are responsible for the quality of the datasets.

    Purpose: Make information on drinking water supply available to the public.
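    Linking a _reporting file to its “moder” file through a link key is a lookup join. The following sketch shows the idea under stated assumptions: the key name (system_id), the other field names, and all values are hypothetical, since the real field definitions live in the separate pdf files.

```python
import csv
import io

# Hypothetical semicolon-separated extracts; all names and values are made up.
MAIN_CSV = (
    "system_id;name;municipality\n"
    "10;Example Waterworks A;Oslo\n"
    "11;Example Waterworks B;Bergen\n"
)
REPORTING_CSV = (
    "system_id;year;persons_supplied\n"
    "10;2020;1200\n"
    "10;2021;1250\n"
    "11;2021;300\n"
)


def read_rows(text):
    """Parse a semicolon-separated extract into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


def link_reporting(main_rows, reporting_rows, key="system_id"):
    """Attach static fields from the main file to each reporting row."""
    main_by_key = {row[key]: row for row in main_rows}
    linked = []
    for report in reporting_rows:
        parent = main_by_key.get(report[key])
        if parent is not None:  # skip reporting rows with no matching parent
            linked.append({**parent, **report})
    return linked


linked = link_reporting(read_rows(MAIN_CSV), read_rows(REPORTING_CSV))
for row in linked:
    print(row["name"], row["year"], row["persons_supplied"])
```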

  8. Google Capstone Project - BellaBeats

    • kaggle.com
    Updated Jan 5, 2023
    Cite
    Jason Porzelius (2023). Google Capstone Project - BellaBeats [Dataset]. https://www.kaggle.com/datasets/jasonporzelius/google-capstone-project-bellabeats
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 5, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jason Porzelius
    Description

    Introduction: I have chosen to complete a data analysis project for the second course option, Bellabeats, Inc., using a locally hosted database program, Excel, for both my data analysis and visualizations. This choice was made primarily because I live in a remote area and have limited bandwidth and inconsistent internet access. Therefore, completing a capstone project using web-based programs such as R Studio, SQL Workbench, or Google Sheets was not a feasible choice. I was further limited in which option to choose as the datasets for the ride-share project option were larger than my version of Excel would accept. In the scenario provided, I will be acting as a Junior Data Analyst in support of the Bellabeats, Inc. executive team and data analytics team. This combined team has decided to use an existing public dataset in hopes that the findings from that dataset might reveal insights which will assist in Bellabeat's marketing strategies for future growth. My task is to provide data-driven insights to business tasks provided by the Bellabeats, Inc. executive and data analysis team. In order to accomplish this task, I will complete all parts of the Data Analysis Process (Ask, Prepare, Process, Analyze, Share, Act). In addition, I will break each part of the Data Analysis Process down into three sections to provide clarity and accountability. Those three sections are: Guiding Questions, Key Tasks, and Deliverables. For the sake of space and to avoid repetition, I will record the deliverables for each Key Task directly under the numbered Key Task using an asterisk (*) as an identifier.

    Section 1 - Ask: A. Guiding Questions: Who are the key stakeholders and what are their goals for the data analysis project? What is the business task that this data analysis project is attempting to solve?

    B. Key Tasks:
    Identify key stakeholders and their goals for the data analysis project.
    *The key stakeholders for this project are as follows:
    -Urška Sršen and Sando Mur - co-founders of Bellabeats, Inc.
    -Bellabeats marketing analytics team. I am a member of this team.
    Identify the business task.
    *The business task is:
    -As provided by co-founder Urška Sršen, the business task for this project is to gain insight into how consumers are using their non-BellaBeats smart devices in order to guide upcoming marketing strategies for the company which will help drive future growth. Specifically, the researcher was tasked with applying insights driven by the data analysis process to 1 BellaBeats product and presenting those insights to BellaBeats stakeholders.

    Section 2 - Prepare: A. Guiding Questions: Where is the data stored and organized? Are there any problems with the data? How does the data help answer the business question?

    B. Key Tasks:
    Research and communicate the source of the data, and how it is stored/organized, to stakeholders.
    *The data source used for our case study is FitBit Fitness Tracker Data. This dataset is stored in Kaggle and was made available through user Mobius in an open-source format. Therefore, the data is public and available to be copied, modified, and distributed, all without asking the user for permission. These datasets were generated by respondents to a distributed survey via Amazon Mechanical Turk reportedly (see credibility section directly below) between 03/12/2016 thru 05/12/2016.
    *Reportedly (see credibility section directly below), thirty eligible Fitbit users consented to the submission of personal tracker data, including output related to steps taken, calories burned, time spent sleeping, heart rate, and distance traveled. This data was broken down into minute, hour, and day level totals. This data is stored in 18 CSV documents. I downloaded all 18 documents onto my local laptop and decided to use 2 documents for the purposes of this project as they were files which had merged activity and sleep data from the other documents. All unused documents were permanently deleted from the laptop. The 2 files used were:
    -sleepDay_merged.csv
    -dailyActivity_merged.csv
    Identify and communicate to stakeholders any problems found with the data related to credibility and bias.
    *As will be more specifically presented in the Process section, the data seems to have credibility issues related to the reported time frame of the data collected. The metadata seems to indicate that the data collected covered roughly 2 months of FitBit tracking. However, upon my initial data processing, I found that only 1 month of data was reported.
    *As will be more specifically presented in the Process section, the data has credibility issues related to the number of individuals who reported FitBit data. Specifically, the metadata communicates that 30 individual users agreed to report their tracking data. My initial data processing uncovered 33 individual IDs in the dailyActivity_merged dataset.
    *Due to the small number of participants (...
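    The two credibility checks described above (number of distinct IDs and the date range actually covered) were done in Excel by the author. An equivalent check can be sketched in a few lines; the column names Id and ActivityDate and the sample rows below are assumptions for illustration, not a copy of the real file.

```python
import csv
import io
from datetime import datetime

# Hypothetical rows shaped like a daily activity export; IDs are made up.
SAMPLE_CSV = """Id,ActivityDate,TotalSteps
1001,4/12/2016,13162
1001,4/13/2016,10735
1002,4/12/2016,8163
1003,5/12/2016,0
"""


def summarize(csv_text):
    """Count distinct participant IDs and find the first and last day."""
    ids = set()
    days = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        ids.add(row["Id"])
        days.append(datetime.strptime(row["ActivityDate"], "%m/%d/%Y").date())
    return {
        "participants": len(ids),
        "first_day": min(days),
        "last_day": max(days),
    }


summary = summarize(SAMPLE_CSV)
print(summary)
```

    Comparing participants against the stated 30 users, and the day span against the stated two months, reproduces the kind of discrepancy the author reports.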

  9. March Madness Historical DataSet (2002 to 2025)

    • kaggle.com
    Updated Apr 22, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Jonathan Pilafas (2025). March Madness Historical DataSet (2002 to 2025) [Dataset]. https://www.kaggle.com/datasets/jonathanpilafas/2024-march-madness-statistical-analysis
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 22, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jonathan Pilafas
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This Kaggle dataset comes from an output dataset that powers my March Madness Data Analysis dashboard in Domo.
    - Click here to view this dashboard: Dashboard Link
    - Click here to view this dashboard featured in a Domo blog post: Hoops, Data, and Madness: Unveiling the Ultimate NCAA Dashboard

    This dataset offers one of the most robust resources you will find to discover key insights through data science and data analytics using historical NCAA Division 1 men's basketball data. This data, sourced from KenPom, goes as far back as 2002 and is updated with the latest 2025 data. This dataset is meticulously structured to provide every piece of information that I could pull from this site as an open-source tool for analysis for March Madness.

    Key features of the dataset include:
    - Historical Data: Provides all historical KenPom data from 2002 to 2025 from the Efficiency, Four Factors (Offense & Defense), Point Distribution, Height/Experience, and Misc. Team Stats endpoints from KenPom's website. Please note that the Height/Experience data only goes as far back as 2007, but every other source contains data from 2002 onward.
    - Data Granularity: This dataset features an individual line item for every NCAA Division 1 men's basketball team in every season that contains every KenPom metric that you can possibly think of. This dataset has the ability to serve as a single source of truth for your March Madness analysis and provide you with the granularity necessary to perform any type of analysis you can think of.
    - 2025 Tournament Insights: Contains all seed and region information for the 2025 NCAA March Madness tournament. Please note that I will continually update this dataset with the seed and region information for previous tournaments as I continue to work on this dataset.

    These datasets were created by downloading the raw CSV files for each season from the various sections of KenPom's website (Efficiency, Offense, Defense, Point Distribution, Summary, Miscellaneous Team Stats, and Height). All of the raw files were uploaded to Domo and imported into a dataflow using Domo's Magic ETL. In these dataflows, the column headers for each previous season are standardized to the current 2025 naming structure so that all historical data can be viewed under the same field names. The cleaned datasets are then appended together, and some additional clean-up takes place before the intermediate (INT) datasets uploaded to this Kaggle dataset are created.

    Once all of the INT datasets were created, I joined the tables on team name and season so that all of these metrics can be viewed in a single view. I then joined an NCAAM Conference & ESPN Team Name Mapping table to add each conference's full name and acronym, as well as the team name that ESPN currently uses. Please note that this reference table is an aggregated view of all the conferences a team has been part of since 2002 and of the team names that KenPom has used historically, so it is needed to map every team properly and to distinguish historical conferences from current ones. Next, I joined a reference table of all current NCAAM coaches and their active coaching lengths, because active coaching length typically correlates with a team's success in the March Madness tournament. I also joined a reference table of the historical post-season teams in the March Madness, NIT, CBI, and CIT tournaments, and another reference table that flags the teams ranked in the top 12 of the AP Top 25 during week 6 of the respective NCAA season. After some additional data clean-up, all of this cleaned data is exported into the "DEV _ March Madness" file, which contains the consolidated view of all of this data.
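    The standardize, append, and join steps described above were done in Domo's Magic ETL, not in code. As a rough illustration only, the same idea can be sketched in pandas. Every column name, the rename mapping, and all values below are invented placeholders and are not taken from the actual KenPom files.

```python
import pandas as pd

# Hypothetical per-season extracts whose headers differ between years.
eff_2024 = pd.DataFrame({"TeamName": ["Duke", "Kansas"], "AdjEM": [24.1, 21.7]})
eff_2025 = pd.DataFrame({"Team": ["Duke", "Kansas"], "Adj EM": [38.2, 19.5]})

# Hypothetical mapping from historical header names to the current naming.
RENAME = {"TeamName": "Team", "AdjEM": "Adj EM"}


def standardize(df: pd.DataFrame, season: int) -> pd.DataFrame:
    """Rename old headers to the current names and tag rows with the season."""
    out = df.rename(columns=RENAME)
    out["Season"] = season
    return out


# Append all seasons into one intermediate (INT-style) table.
efficiency = pd.concat(
    [standardize(eff_2024, 2024), standardize(eff_2025, 2025)],
    ignore_index=True,
)

# A second hypothetical INT-style table that only covers one season.
height = pd.DataFrame(
    {"Team": ["Duke", "Kansas"], "Season": [2025, 2025], "AvgHgt": [78.1, 77.4]}
)

# Join on team name and season; a left join keeps seasons with no height data.
combined = efficiency.merge(height, on=["Team", "Season"], how="left")
print(combined)
```

    A left join is used here so that rows without a match (for example, seasons before a source starts) are kept with missing values rather than dropped.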

    This dataset provides users with the flexibility to export data for further analysis in platforms such as Domo, Power BI, Tableau, Excel, and more. This dataset is designed for users who wish to conduct their own analysis, develop predictive models, or simply gain a deeper understanding of the intricacies that result in the excitement that Division 1 men's college basketball provides every year in March. Whether you are using this dataset for academic research, personal interest, or professional interest, I hope this dataset serves as a foundational tool for exploring the vast landscape of college basketball's most riveting and anticipated event of its season.

  10. Data from: S1 Dataset -

    • figshare.com
    xlsx
    Updated Apr 19, 2024
    Cite
    David Mahwera; Erick Killel; Ninael Jonas; Adam Hancy; Anna Zangira; Aika Lekey; Rose Msaki; Doris Katana; Rogath Kishimba; Debora Charwe; Fatma Abdallah; Geofrey Chiduo; Ray Masumo; Germana Leyna; Geofrey Mchau (2024). S1 Dataset - [Dataset]. http://doi.org/10.1371/journal.pone.0299025.s001
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Apr 19, 2024
    Dataset provided by
    PLOS ONE
    Authors
    David Mahwera; Erick Killel; Ninael Jonas; Adam Hancy; Anna Zangira; Aika Lekey; Rose Msaki; Doris Katana; Rogath Kishimba; Debora Charwe; Fatma Abdallah; Geofrey Chiduo; Ray Masumo; Germana Leyna; Geofrey Mchau
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: The evaluation of surveillance systems has been recommended by the World Health Organization (WHO) to identify their performance and areas for improvement. Universal salt iodization (USI), as one of the surveillance systems in Tanzania, needs periodic evaluation to function optimally. This study evaluated the USI surveillance system in Tanzania from January to December 2021 to find out whether the system meets its intended objectives by assessing its attributes; this was the first evaluation of the USI surveillance system since its establishment in 2010. The USI surveillance system is key for monitoring progress towards the attainment of universal salt iodization (90%).

    Methodology: This evaluation was guided by the Centers for Disease Control and Prevention (CDC) Guidelines for Evaluating Public Health Surveillance Systems (MMWR) to evaluate USI 2021 data. The study was conducted in Kigoma region in March 2022. Both purposive and convenience sampling were used to select the region, district, and ward for the study. The study involved reviewing documents used in the USI system and interviewing key informants in the USI program. Data analysis was done in Microsoft Excel and presented in tables and graphs.

    Results: A total of 1715 salt samples were collected in 2021, of which 279 (16%) were identified as non-iodized. The majority of the system attributes (66.7%) had a good performance with a score of three, 22.2% had a moderate performance with a score of two, and one attribute had a poor performance with a score of one. Data quality, completeness and sensitivity were 100%; acceptability was 91.6%; for simplicity, 83% were able to collect data on a single sample in < 2 minutes; system stability in terms of performance was >75%; and the usefulness of the system had poor performance.

    Conclusion: Although the system attributes were found to be working well overall, the core attributes need to be strengthened for proper surveillance of the USI system. Key variables that measure system performance must be included from the primary data source and well integrated from the Local Government (district and regions) to the Ministry of Health information systems.

  11. Covid-19 Food Insecurity Data

    • kaggle.com
    Updated Sep 13, 2021
    Cite
    The citation is currently not available for this dataset.
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 13, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jack Ogozaly
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    What's in the Data?

    This dataset tracks food insecurity across different demographics starting 4/23/2020 to 8/23/2021. It contains fields such as Race, Education, Sex, State, Income, etc. If you're looking for a dataset to examine Covid-19's impact on food insecurity for different demographics, then here you are!

    Data Source

    This data is from the United States Census Bureau's Pulse Survey. The Pulse Survey is a frequently updated survey designed to collect data on how people's lives have been impacted by the coronavirus. Specifically, this dataset is a cleaned-up version of the "Food Sufficiency for Households, in the Last 7 Days, by Select Characteristics" tables.

    The original form of this data can be found at: https://www.census.gov/programs-surveys/household-pulse-survey/data.html

    What was done to this data?

    The original form of this data was split into 36 Excel files containing ~67 sheets each. The data was in a non-tidy format, and the questions were not entirely standard. This dataset is my attempt to combine all these different files, tidy the data up, and merge slightly different questions together.

    Why are there so many NA's?

    The large number of NA's is a consequence of how awful the data was originally and of forcing the data into a tidy format. Just filter the NA's out for the question you want to analyze and you'll be fine.
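    As a minimal sketch of the NA filtering described above, the snippet below selects one question and drops rows with no value for it. The column names ("State", "Question", "Households") and all values are hypothetical and may not match the actual file, so adjust them to the real headers.

```python
import pandas as pd

# Hypothetical tidy layout: one row per (group, question) pair.
df = pd.DataFrame(
    {
        "State": ["CA", "CA", "TX", "TX"],
        "Question": [
            "Enough food",
            "Often not enough",
            "Enough food",
            "Often not enough",
        ],
        "Households": [1200.0, None, 950.0, 80.0],
    }
)


def rows_for_question(data: pd.DataFrame, question: str,
                      value_col: str = "Households") -> pd.DataFrame:
    """Keep rows for one question and drop those with a missing value."""
    subset = data[data["Question"] == question]
    return subset.dropna(subset=[value_col])


often_not_enough = rows_for_question(df, "Often not enough")
print(often_not_enough)
```

    Filtering per question, rather than calling dropna() on the whole table, avoids discarding rows that are only missing values for unrelated questions.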

  12. Google Data Analytics Capstone

    • kaggle.com
    Updated Nov 29, 2021
    Cite
    Haimanot Tadross (2021). Google Data Analytics Capstone [Dataset]. https://www.kaggle.com/haimanottadross/how-does-a-bikeshare-navigate-speedy-success/metadata
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 29, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Haimanot Tadross
    Description

    Google Data Analytics How Does a Bike-Share Navigate Speedy Success?

    This is a case study project completed for the Google Data Analytics Certification. In this project I followed the steps of the data analysis process: ask, prepare, process, analyze, share, and act. In 2016, Cyclistic launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system at any time. The company's director of marketing has set a clear goal: to convert casual riders into annual members, which will make the company earn more profits. To do that, the analyst team needs to better understand how annual members and casual riders differ, why casual riders would buy a membership, and how digital media could affect the marketing tactics. How do annual members and casual riders use Cyclistic bikes differently?

    Ask

    Three questions will guide the future marketing program:
    1. How do annual members and casual riders use Cyclistic bikes differently?
    2. Why would casual riders buy Cyclistic annual memberships?
    3. How can Cyclistic use digital media to influence casual riders to become members?

    Prepare

    In this part of the data analysis process we try to answer some guiding questions about our data source and data quality, and perform the tasks below:
    1. Download the data and store it appropriately.
    2. Identify how it is organized.
    3. Sort and filter the data.
    4. Determine the credibility of the data.

    Data Source: https://divvy-tripdata.s3.amazonaws.com/index.html
    Data License Agreement: https://www.divvybikes.com/data-license-agreement

    Process

    For the data processing part of this project I used Excel, R, MS SQL, T-SQL and Tableau:
    • Excel: used to check data integrity and to sort and filter the individual monthly data.
    • SQL / T-SQL: I chose to work on the 12-month dataset from 202011 to 202110, which was too big to process in Excel, so I used SQL for data cleaning and processing.
    • R: I also used R programming for data cleaning, visualizations and report generation.
    • Tableau: used the output dataset from SQL and R to generate visualizations in Tableau.

    Analyze

    1. Aggregate the data so it is useful and accessible.
    2. Organize and format the data.
    3. Perform calculations.
    4. Identify trends and relationships.
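    The analysis itself was done in SQL, R and Tableau. Purely as an illustration of the aggregate-and-calculate steps, here is a small pandas sketch that derives ride length and compares rider types. The column names (started_at, ended_at, member_casual) are assumed to follow the public trip-data layout, and the four rows are made up.

```python
import pandas as pd

# Made-up trip records; column names are an assumption about the source files.
trips = pd.DataFrame(
    {
        "started_at": [
            "2021-06-05 10:00:00",
            "2021-06-05 11:00:00",
            "2021-06-07 08:00:00",
            "2021-06-07 17:30:00",
        ],
        "ended_at": [
            "2021-06-05 10:45:00",
            "2021-06-05 11:35:00",
            "2021-06-07 08:12:00",
            "2021-06-07 17:40:00",
        ],
        "member_casual": ["casual", "casual", "member", "member"],
    }
)

# Parse timestamps, then derive ride length in minutes and the start weekday.
trips["started_at"] = pd.to_datetime(trips["started_at"])
trips["ended_at"] = pd.to_datetime(trips["ended_at"])
trips["ride_minutes"] = (
    trips["ended_at"] - trips["started_at"]
).dt.total_seconds() / 60
trips["weekday"] = trips["started_at"].dt.day_name()

# Aggregate: number of rides and mean ride length per rider type.
summary = trips.groupby("member_casual")["ride_minutes"].agg(["count", "mean"])
print(summary)
```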

    Share

    This is a case study project completed for the Google Data Analytics Certification and has been published on Kaggle.

    Act

    Based on my analysis, I recommend that the Cyclistic marketing team:
    • Focus on weekend events and use social media to advertise.
    • Offer discounts to casual riders, since they ride for longer periods of time.
    • Encourage casual riders to become members.

  13. Data and Code for "Comparing the Effects of Euclidean Distance Matching and...

    • zenodo.org
    bin, csv, png +1
    Updated Oct 9, 2024
    Cite
    Anonymous; Anonymous (2024). Data and Code for "Comparing the Effects of Euclidean Distance Matching and Dynamic Time Warping in the Clustering of COVID-19 Evolution" [Dataset]. http://doi.org/10.5281/zenodo.13905791
    Explore at:
    Available download formats: png, text/x-python, csv, bin
    Dataset updated
    Oct 9, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anonymous; Anonymous
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains the datasets and data sources, analysis code, and workflow associated with the manuscript "Comparing the Effects of Euclidean Distance Matching and Dynamic Time Warping in the Clustering of COVID-19 Evolution". The following resources are provided:

    • Data Files:

      • time_series_data.csv: A curated time series dataset with dates as rows and NUTS 2 regions as columns. Each column is labeled using a 4-letter abbreviation format "CC.RR", where "CC" represents the country code and "RR" represents the region code. This same abbreviation is also included in the accompanying GeoJSON file.
      • geometry_data.geojson: A GeoJSON file representing the spatial boundaries of the NUTS 2 regions, with the same 4-letter abbreviations used in the CSV file. EPSG:4326.
      • COVID19_data_sources.xlsx: This Excel file contains important metadata regarding the sources of COVID-19 data used in this study. It includes:
        • Source of the data for each country
        • Official website(s)
        • The agency responsible for the data
        • Description of the processing steps used to curate the data into the final time series.
    • Code:

      • analysis.py: A Python script used to process and analyze the data. This code can be run using Python 3.x. The libraries required to run this script are listed in the first lines of the code. The code is organized in numbered sections (1), (2), ... and sub-sections (1a), (1b), ... Run the script one (sub-)section at a time so that the output stays manageable and does not arrive all at once.
    • Workflow:

      • workflow.png : A detailed workflow according to the Knowledge Discovery in Databases (KDD) process, outlining the steps involved in processing and analyzing the data, including the methods used. This workflow provides a comprehensive guide to reproducing the analysis presented in the paper.
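
    To illustrate the "CC.RR" column labels described above, the following sketch loads a tiny made-up table in the same shape and groups the region columns by country code. It is not taken from analysis.py. The index column name "date", the region codes, and all values are assumptions.

```python
import io

import pandas as pd

# Made-up sample in the described layout: dates as rows, "CC.RR" columns.
csv_text = """date,BE.BR,BE.VL,NL.NH
2020-03-01,1,0,2
2020-03-02,3,1,4
"""
ts = pd.read_csv(io.StringIO(csv_text), index_col="date", parse_dates=True)


def split_label(label: str) -> tuple[str, str]:
    """Split a 'CC.RR' label into (country code, region code)."""
    country, region = label.split(".")
    return country, region


# Group the region columns by their country code.
columns_by_country: dict[str, list[str]] = {}
for col in ts.columns:
    country, _region = split_label(col)
    columns_by_country.setdefault(country, []).append(col)

# Sum the regions of each country into one series per country.
country_totals = pd.DataFrame(
    {c: ts[cols].sum(axis=1) for c, cols in columns_by_country.items()}
)
print(country_totals)
```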
  14. Graph Data Integration Platform Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 22, 2025
    Cite
    Growth Market Reports (2025). Graph Data Integration Platform Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/graph-data-integration-platform-market
    Explore at:
    Available download formats: pdf, pptx, csv
    Dataset updated
    Aug 22, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Graph Data Integration Platform Market Outlook



    According to our latest research, the global graph data integration platform market size reached USD 2.1 billion in 2024, reflecting robust adoption across industries. The market is projected to grow at a CAGR of 18.4% from 2025 to 2033, reaching approximately USD 10.7 billion by 2033. This significant growth is fueled by the increasing need for advanced data management and analytics solutions that can handle complex, interconnected data across diverse organizational ecosystems. The rapid digital transformation and the proliferation of big data have further accelerated the demand for graph-based data integration platforms.
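
    The growth figures above follow the usual compound annual growth rate (CAGR) relationship, final = base × (1 + rate)^periods. The sketch below applies it under the assumption of nine yearly steps from the 2024 value. The report does not state its exact base value or period count, so the result (about USD 9.6 billion) is not expected to reproduce the quoted USD 10.7 billion.

```python
def project_value(base: float, cagr: float, periods: int) -> float:
    """Compound a base value forward by `periods` steps at rate `cagr`."""
    return base * (1 + cagr) ** periods


def implied_cagr(base: float, final: float, periods: int) -> float:
    """Growth rate that takes `base` to `final` in `periods` steps."""
    return (final / base) ** (1 / periods) - 1


# Assumption: 9 yearly steps (2024 -> 2033) starting from USD 2.1 billion.
projected = project_value(2.1, 0.184, 9)
rate = implied_cagr(2.1, 10.7, 9)
print(f"projected 2033 value: {projected:.2f} billion USD")
print(f"rate implied by the quoted endpoints: {rate:.1%}")
```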




    The primary growth factor driving the graph data integration platform market is the exponential increase in data complexity and volume within enterprises. As organizations collect vast amounts of structured and unstructured data from multiple sources, traditional relational databases often struggle to efficiently process and analyze these data sets. Graph data integration platforms, with their ability to map, connect, and analyze relationships between data points, offer a more intuitive and scalable solution. This capability is particularly valuable in sectors such as BFSI, healthcare, and telecommunications, where real-time data insights and dynamic relationship mapping are crucial for decision-making and operational efficiency.




    Another significant driver is the growing emphasis on advanced analytics and artificial intelligence. Modern enterprises are increasingly leveraging AI and machine learning to extract actionable insights from their data. Graph data integration platforms enable the creation of knowledge graphs and support complex analytics, such as fraud detection, recommendation engines, and risk assessment. These platforms facilitate seamless integration of disparate data sources, enabling organizations to gain a holistic view of their operations and customers. As a result, investment in graph data integration solutions is rising, particularly among large enterprises seeking to enhance their analytics capabilities and maintain a competitive edge.




    The surge in regulatory requirements and compliance mandates across various industries also contributes to the expansion of the graph data integration platform market. Organizations are under increasing pressure to ensure data accuracy, lineage, and transparency, especially in highly regulated sectors like finance and healthcare. Graph-based platforms excel in tracking data provenance and relationships, making it easier for companies to comply with regulations such as GDPR, HIPAA, and others. Additionally, the shift towards hybrid and multi-cloud environments further underscores the need for robust data integration tools capable of operating seamlessly across different infrastructures, further boosting market growth.




    From a regional perspective, North America currently dominates the graph data integration platform market, accounting for the largest share due to early adoption of advanced data technologies, a strong presence of key market players, and significant investments in digital transformation initiatives. However, Asia Pacific is expected to witness the fastest growth over the forecast period, driven by rapid industrialization, expanding IT infrastructure, and increasing adoption of cloud-based solutions among enterprises in countries like China, India, and Japan. Europe also remains a significant contributor, supported by stringent data privacy regulations and a mature digital economy.





    Component Analysis



    The component segment of the graph data integration platform market is bifurcated into software and services. The software segment currently commands the largest market share, reflecting the critical role of robust graph database engines, visualization tools, and integration frameworks in managing and analyzing complex data relationships. These software solutions are designed to deliver high scalability, flexibility, and real-time proces

  15. Emissions & Generation Resource Integrated Database (eGRID), eGRID2010

    • data.wu.ac.at
    csv
    Updated Jan 1, 2014
    + more versions
    Cite
    U.S. Environmental Protection Agency (2014). Emissions & Generation Resource Integrated Database (eGRID), eGRID2010 [Dataset]. https://data.wu.ac.at/odso/data_gov/MGZkOTI0ZjctN2U3ZS00NGI0LTkyNDAtN2VhNmViN2JiYjQw
    Explore at:
    Available download formats: csv
    Dataset updated
    Jan 1, 2014
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    The Emissions & Generation Resource Integrated Database (eGRID) is a comprehensive source of data on the environmental characteristics of almost all electric power generated in the United States. These environmental characteristics include air emissions for nitrogen oxides, sulfur dioxide, carbon dioxide, methane, and nitrous oxide; emissions rates; net generation; resource mix; and many other attributes.

    eGRID2010 contains the complete release of year 2007 data, as well as data for years 2005 and 2004. Excel spreadsheets, full documentation, summary data, eGRID subregion and NERC region representational maps, and GHG emission factors are included in this data set. The archived data in eGRID2002 contain data for years 1996 through 2000.

    For year 2007 data, the first Microsoft Excel workbook, Plant, contains boiler, generator, and plant spreadsheets. The second Microsoft Excel workbook, Aggregation, contains data aggregated at the state, electric generating company, parent company, power control area, eGRID subregion, NERC region, and U.S. total levels. The third Microsoft Excel workbook, ImportExport, contains state import-export data, as well as U.S. generation and consumption data for years 2007, 2005, and 2004. For eGRID data for years 2005 and 2004, a user-friendly web application, eGRIDweb, is available to select, view, print, and export specified data.

  16. Time series

    • data.open-power-system-data.org
    csv, sqlite, xlsx
    Updated Oct 6, 2020
    + more versions
    Cite
    Jonathan Muehlenpfordt (2020). Time series [Dataset]. http://doi.org/10.25832/time_series/2020-10-06
    Explore at:
    Available download formats: csv, sqlite, xlsx
    Dataset updated
    Oct 6, 2020
    Dataset provided by
    Open Power System Data
    Authors
    Jonathan Muehlenpfordt
    Time period covered
    Jan 1, 2015 - Oct 1, 2020
    Variables measured
    utc_timestamp, DE_wind_profile, DE_solar_profile, DE_wind_capacity, DK_wind_capacity, SE_wind_capacity, CH_solar_capacity, DE_solar_capacity, DK_solar_capacity, AT_price_day_ahead, and 290 more
    Description

    Load, wind and solar, prices in hourly resolution. This data package contains different kinds of time series data relevant for power system modelling, namely electricity prices, electricity consumption (load), and wind and solar power generation and capacities. The data is aggregated by country, control area or bidding zone. Geographical coverage includes the EU and some neighbouring countries. All variables are provided in hourly resolution. Where original data is available in higher resolution (half-hourly or quarter-hourly), it is provided in separate files. This package version only contains data provided by TSOs and power exchanges via ENTSO-E Transparency, covering the period from 2015 to mid-2020. See previous versions for historical data from a broader range of sources. All data processing is conducted in Python/pandas and has been documented in the Jupyter notebooks linked below.
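
    As a small pandas sketch of how the higher-resolution files relate to the hourly ones, the snippet below averages quarter-hourly values into hourly values. It is not the package's own processing code. The column names come from the "Variables measured" list above, but the values are made up, and whether the quarter-hourly files carry exactly these columns is an assumption.

```python
import io

import pandas as pd

# Made-up quarter-hourly sample indexed by the utc_timestamp column.
csv_text = """utc_timestamp,DE_wind_profile,DE_solar_profile
2020-01-01T00:00:00Z,0.40,0.00
2020-01-01T00:15:00Z,0.44,0.00
2020-01-01T00:30:00Z,0.48,0.00
2020-01-01T00:45:00Z,0.52,0.00
2020-01-01T01:00:00Z,0.50,0.02
2020-01-01T01:15:00Z,0.50,0.02
2020-01-01T01:30:00Z,0.50,0.02
2020-01-01T01:45:00Z,0.50,0.02
"""
quarter_hourly = pd.read_csv(
    io.StringIO(csv_text), index_col="utc_timestamp", parse_dates=True
)

# Average each block of four quarter-hour values into one hourly value.
hourly = quarter_hourly.resample("60min").mean()
print(hourly)
```

    Averaging suits profile-type and power-type columns; energy totals would need a sum instead, so the aggregation should be chosen per variable.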

  17. O*NET Database

    • onetcenter.org
    excel, mysql, oracle +2
    Updated Aug 26, 2025
    Cite
    National Center for O*NET Development (2025). O*NET Database [Dataset]. https://www.onetcenter.org/database.html
    Explore at:
    Available download formats: oracle, sql server, text, mysql, excel
    Dataset updated
    Aug 26, 2025
    Dataset provided by
    Occupational Information Network
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Dataset funded by
    US Department of Labor, Employment and Training Administration
    Description

    The O*NET Database contains hundreds of standardized and occupation-specific descriptors on almost 1,000 occupations covering the entire U.S. economy. The database, which is available to the public at no cost, is continually updated by a multi-method data collection program. Sources of data include: job incumbents, occupational experts, occupational analysts, employer job postings, and customer/professional association input.

    Data content areas include:

    • Worker Characteristics (e.g., Abilities, Interests, Work Styles)
    • Worker Requirements (e.g., Education, Knowledge, Skills)
    • Experience Requirements (e.g., On-the-Job Training, Work Experience)
    • Occupational Requirements (e.g., Detailed Work Activities, Work Context)
    • Occupation-Specific Information (e.g., Job Titles, Tasks, Technology Skills)

  18. Census Data

    • catalog.data.gov
    • data.globalchange.gov
    • +2more
    Updated Mar 1, 2024
    Cite
    U.S. Bureau of the Census (2024). Census Data [Dataset]. https://catalog.data.gov/dataset/census-data
    Explore at:
    Dataset updated
    Mar 1, 2024
    Dataset provided by
    U.S. Bureau of the Census
    Description

    The Bureau of the Census has released Census 2000 Summary File 1 (SF1) 100-Percent data. The file includes the following population items: sex, age, race, Hispanic or Latino origin, household relationship, and household and family characteristics. Housing items include occupancy status and tenure (whether the unit is owner or renter occupied). SF1 does not include information on incomes, poverty status, overcrowded housing or age of housing. These topics will be covered in Summary File 3. Data are available for states, counties, county subdivisions, places, census tracts, block groups, and, where applicable, American Indian and Alaskan Native Areas and Hawaiian Home Lands.

    The SF1 data are available on the Bureau's web site and may be retrieved from American FactFinder as tables, lists, or maps. Users may also download a set of compressed ASCII files for each state via the Bureau's FTP server. There are over 8000 data items available for each geographic area. The full listing of these data items is available here as a downloadable compressed database file named TABLES.ZIP. The uncompressed file is in FoxPro database file (dbf) format and may be imported into ACCESS, EXCEL, and other software formats.

    While all of this information is useful, the Office of Community Planning and Development has downloaded selected information for all states and areas and is making this information available on the CPD web pages. The tables and data items selected are those used in the CDBG and HOME allocation formulas plus topics most pertinent to the Comprehensive Housing Affordability Strategy (CHAS), the Consolidated Plan, and similar overall economic and community development plans. The information is contained in five compressed (zipped) dbf tables for each state. When uncompressed, the tables are ready for use with FoxPro and can be imported into ACCESS, EXCEL, and other spreadsheet, GIS and database software. The data are at the block group summary level.

    The first two characters of the file name are the state abbreviation. The next two letters are BG for block group. Each record is labeled with the code and name of the city and county in which it is located so that the data can be summarized to higher-level geography. The last part of the file name describes the contents. The GEO file contains standard Census Bureau geographic identifiers for each block group, such as the metropolitan area code and congressional district code. The only data included in this table are total population and total housing units. POP1 and POP2 contain selected population variables, and selected housing items are in the HU file. The MA05 table data are only for use by State CDBG grantees for reporting the racial composition of beneficiaries of Area Benefit activities. The complete package for a state consists of the dictionary file named TABLES and the five data files for the state. The logical record number (LOGRECNO) links the records across tables.
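
    Because LOGRECNO links the records across tables, the state files can be combined with a key join once they are loaded. The sketch below shows this in pandas with two tiny stand-in tables. Apart from LOGRECNO, every column name and value is a hypothetical placeholder, so consult the TABLES dictionary file for the real field names.

```python
import pandas as pd

# Stand-in for a state's GEO table (identifiers plus the two totals).
geo = pd.DataFrame(
    {
        "LOGRECNO": [101, 102, 103],
        "COUNTY": ["001", "001", "003"],  # hypothetical field name
        "TOTPOP": [1200, 800, 500],       # hypothetical field name
    }
)

# Stand-in for the POP1 table with one selected population variable.
pop1 = pd.DataFrame(
    {
        "LOGRECNO": [101, 102, 103],
        "POP_UNDER18": [300, 200, 100],   # hypothetical field name
    }
)

# Join block-group records on LOGRECNO; validate guards against duplicate keys.
merged = geo.merge(pop1, on="LOGRECNO", how="inner", validate="one_to_one")

# Summarize block groups up to a higher-level geography (here, county).
by_county = merged.groupby("COUNTY")[["TOTPOP", "POP_UNDER18"]].sum()
print(by_county)
```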

  19. Household Income and Expenditure Survey 2005-2006 - Solomon Islands

    • microdata.pacificdata.org
    • catalog.ihsn.org
    • +1more
    Updated Apr 1, 2019
    + more versions
    Cite
    Solomon Islands Statistics Office (SISO) (2019). Household Income and Expenditure Survey 2005-2006 - Solomon Islands [Dataset]. https://microdata.pacificdata.org/index.php/catalog/146
    Explore at:
    Dataset updated
    Apr 1, 2019
    Dataset authored and provided by
    Solomon Islands Statistics Office (SISO)
    Time period covered
    2005 - 2006
    Area covered
    Solomon Islands
    Description

    Abstract

    The 2005/6 Household Income and Expenditure Survey is the second nationwide survey of households undertaken by Solomon Islands Statistics Office (SISO) since 1992.

    The primary objectives of the HIES include:
    • Re-basing of the weights of the current basket of goods and services in the Consumer Price Index (CPI). The survey also aimed to provide data on household consumption expenditure patterns that will help form weights reflecting the relative importance that consumers attach to commodities and services;
    • Obtaining relevant data for updating the series of national accounts aggregates, particularly the Gross Domestic Product.

    The secondary objectives of the HIES were to:
    • Obtain data on housing and general demographic characteristics of households;
    • Obtain data on poverty measures, income and income inequality measures;
    • Obtain relevant data for the Millennium Development Goals (MDG), particularly health and education; and
    • Obtain other relevant data where necessary.

    The field data collection exercise was undertaken from October 2005 to March 2006; seasonality effects on expenditure were not fully considered.

    Geographic coverage

    National. The HIES operation covered both urban and rural areas, focusing on Honiara, other urban areas and the rural areas of the nine (9) provinces, and aimed to produce estimates at the national and provincial levels only.

    Analysis unit

    • Households
    • Individuals

    Universe

    The survey targeted private households, whilst collective households in hospitals, hotels, prisons and educational institutions were excluded. A household is considered in scope for the survey if it has resided in the Solomon Islands for the last 12 months or more or, if not, intends to live in the Solomon Islands for the next 12 months.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Survey Design

    The survey was based on a two-stage sampling strategy using probability proportional to size (PPS) selection and random selection. The strategy for selection in each area type is slightly different, depending also on the enumerator workload schedule and the need to accommodate estimates at the National and Provincial level as well as Urban and Rural splits.

    The Survey was designed to collect data for national and provincial level estimates and covered both urban and rural areas. The survey covered Honiara, provincial centers and rural areas within these provinces.

    The sampling scheme used was a stratified two-stage design with the Enumeration Areas (EA) as the Primary Sampling Unit (PSU) and the households within the sample areas as the Secondary Sampling Unit (SSU). In the first stage the EAs were selected with probability proportional to their population size based on the 1999 population census. In the second stage households were selected using systematic sampling with a random start. The next step was allocating the sample to each province proportional to the square root of its population, so that estimates for each province would have roughly the same level of accuracy. The sample was then split for each province between the provincial centers (considered to be urban) and the remaining rural population. Given the need for urban and rural estimates, the sample was split between the two areas proportional to the square root of the population based on the 1999 census. The last step in the process involved modifying the final counts to accommodate the workloads for interviewers during the fieldwork. The interviewers were expected to be in the field for six months and could accommodate 10 households per month (60 households in total). It was desirable to have the total workload for each province divisible by 60 to give each interviewer an evenly sized workload and to spread the sample evenly across the months.

    Since Honiara (the capital of Solomon Islands) contains a mix of high-income, middle-income and low-income areas, it was advisable to group the EAs by the income class that best described them. For Honiara, the EA list was therefore sorted by income group category before selection. The number of EAs to select from Honiara is simply the desired sample size (480 households) divided by the number of households to be selected from each EA. It was decided that 10 households should be selected from each selected EA, so the number of EAs selected was 480 / 10 = 48.
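    The two selection stages can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually run for the survey: the EA sizes are randomly generated, the function names `pps_systematic` and `systematic_sample` are hypothetical, and PPS is implemented here as systematic selection over cumulative sizes, which is one common way to do it.

```python
import random


def pps_systematic(sizes, n_select, rng):
    """Pick n_select units with probability proportional to size by stepping
    through the cumulative size list at a fixed interval from a random start."""
    interval = sum(sizes) / n_select
    start = rng.random() * interval
    selected, cumulative, idx = [], 0, 0
    for k in range(n_select):
        point = start + k * interval
        # Advance to the unit whose cumulative size range contains the point.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= point:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


def systematic_sample(n_units, n_select, rng):
    """Pick n_select of n_units positions at a fixed step from a random start."""
    step = n_units / n_select
    start = rng.random() * step
    return [min(int(start + k * step), n_units - 1) for k in range(n_select)]


rng = random.Random(0)
households_per_ea = 10
n_eas = 480 // households_per_ea  # 48 EAs, as in the text

# Hypothetical frame: 200 EAs, assumed already sorted by income group.
ea_sizes = [rng.randint(50, 300) for _ in range(200)]
chosen_eas = pps_systematic(ea_sizes, n_eas, rng)

# Second stage for one EA assumed to list 120 households.
chosen_households = systematic_sample(120, households_per_ea, rng)
print(len(chosen_eas), chosen_households)
```

    Sorting the frame before systematic selection spreads the sample across the sorted categories (implicit stratification). An EA larger than the sampling interval can be hit more than once in this sketch.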

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The HIES is a relatively complex survey. Its data collection instruments were implemented through the following questionnaires and associated sections:
    • Household Control Form – household composition and particulars
    • Household Expenditure Form – housing amenities, facilities and major household, expenditure on tenure, fixed capital, land, property etc.
    • Personal Income Form – income pattern of household members and other income-earning activities
    • Household Diary – daily expenditure by type of goods and services
    • An additional health module – health facility utilization, immunization, motherhood, mortality, breastfeeding and family planning, malaria and miscellaneous

    Cleaning operations

    The Statistics Programme at the Secretariat of the Pacific Community (SPC) provided assistance with data processing. A HIES data entry program was set up in CSPro version 2.6, and data entry ran from November 2005, soon after the first workload was registered in the Statistics Office, until May 2006. Logic procedures for data editing were prepared in Microsoft Access. Data editing for all questionnaires was done in CSPro, except for the Diary, which was edited in Microsoft Excel. Data management queries were run in Microsoft Access and tables were produced in Microsoft Excel. This report was prepared in Microsoft Word. Five per cent of the data was verified to check the accuracy of data input, and edit checks were carried out for completeness, consistency and accuracy, including outliers. Data anomalies were amended appropriately.

    Response rate

    Response Rates: A sample of 4,320 households was planned for the country, and about 3,822 households (88.5%) responded favorably, satisfying the survey requirements.
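    As a quick arithmetic check, the quoted response rate can be reproduced from the two household counts given in the text. The helper name `response_rate` is only illustrative.

```python
def response_rate(responded, planned):
    """Return the response rate as a percentage of the planned sample."""
    return responded / planned * 100


planned_households = 4320
responding_households = 3822

rate = response_rate(responding_households, planned_households)
non_responding = planned_households - responding_households

print(f"Response rate: {rate:.1f}%")
print(f"Households not responding: {non_responding}")
```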

    Non-Response: Despite the efforts of the enumerators and follow-up attempts by the supervisors in most cases, some non-response was encountered during the survey.

    The main reasons for household non-response were:
    • The household was out of scope of the survey
    • The dwelling was vacant or not being lived in
    • The household could not be contacted after a number of attempts
    • The household was excluded for other reasons, such as a death in the family, refusal or customary reasons

    Sampling error estimates

    Error Measurements: No formal measures of sampling error have been calculated for the survey results.

    Non-sampling errors cannot be readily measured. They include:
    • Response difficulties caused by households and interviewers misunderstanding what the survey and its instruments required
    • The questionnaires were in English, which is at best a second language for interviewers and respondents
    • Some expenditures are seasonal and would not have been picked up in the survey period
    • Remote areas and institutions were excluded from the sampling frame

  20. d

    Statistics on Obesity, Physical Activity and Diet (replaced by Statistics on...

    • digital.nhs.uk
    pdf, xlsx, zip
    Updated Apr 4, 2018
    + more versions
    Cite
    (2018). Statistics on Obesity, Physical Activity and Diet (replaced by Statistics on Public Health) [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/statistics-on-obesity-physical-activity-and-diet
    Explore at:
    pdf (113.4 kB), xlsx (349.5 kB), pdf (684.8 kB), pdf (323.8 kB), pdf (239.3 kB), zip (173.5 kB)
    Available download formats
    Dataset updated
    Apr 4, 2018
    License

    https://digital.nhs.uk/about-nhs-digital/terms-and-conditions

    Time period covered
    Mar 31, 2016 - Dec 31, 2017
    Area covered
    England
    Description

    This statistical report presents information on obesity, physical activity and diet, drawn together from a variety of sources. The topics covered include: Obesity related hospital admissions. Prescription items for the treatment of obesity. Adult obesity prevalence. Childhood obesity prevalence. Physical activity levels among adults and children. Diet among adults and children, including trends in purchases, and consumption of food and drink and energy intake. Each section provides an overview of the key findings from these sources, as well as providing sources of further information and links to relevant documents and sources. Some of the data have been published previously by NHS Digital. A data visualisation tool at the link below allows users to select obesity related hospital admissions data for any Local Authority (as contained in Excel tables 3, 7 and 11 of this publication), along with time series data from 2013/14. Regional and national comparisons are also provided.

Various (2025). Equity Report Data: Geography [Dataset]. https://data.sandiegocounty.gov/dataset/Equity-Report-Data-Geography/p6uw-qxpv

Equity Report Data: Geography

Explore at:
application/rssxml, application/rdfxml, csv, tsv, xml, application/geo+json, kmz, kml
Available download formats
Dataset updated
May 21, 2025
Dataset authored and provided by
Various
Description

This dataset contains the geographic data used to create maps for the San Diego County Regional Equity Indicators Report led by the Office of Equity and Racial Justice (OERJ). The full report can be found here: https://data.sandiegocounty.gov/stories/s/7its-kgpt

Demographic data from the report can be found here: https://data.sandiegocounty.gov/dataset/Equity-Report-Data-Demographics/q9ix-kfws

Filter by the Indicator column to select data for a particular indicator map.

Export notes: Dataset may not automatically open correctly in Excel due to geospatial data. To export the data for geospatial analysis, select Shapefile or GEOJSON as the file type. To view the data in Excel, export as a CSV but do not open the file. Then, open a blank Excel workbook, go to the Data tab, select “From Text/CSV,” and follow the prompts to import the CSV file into Excel. Alternatively, use the exploration options in "View Data" to hide the geographic column prior to exporting the data.
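For users working with the CSV export outside Excel, the filtering and column-hiding steps described above might look like the sketch below. Only the `Indicator` column is named in the dataset description; the geographic column name `the_geom`, the other column names and the sample rows are assumptions made for illustration, so check the header of the real export before reusing this.

```python
import csv
import io


def filter_indicator(csv_text, indicator, geo_column="the_geom"):
    """Keep rows for one indicator and drop the (assumed) geographic column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        if row.get("Indicator") == indicator:
            row.pop(geo_column, None)  # no error if the column is absent
            rows.append(row)
    return rows


# Hypothetical two-row export; values are placeholders, not real data.
sample_csv = (
    "Indicator,Geography,Percent,the_geom\n"
    'Poverty,ZCTA,12.3,"MULTIPOLYGON (((0 0, 1 0, 1 1, 0 0)))"\n'
    'Employment,PUMA,94.1,"MULTIPOLYGON (((2 2, 3 2, 3 3, 2 2)))"\n'
)

poverty_rows = filter_indicator(sample_csv, "Poverty")
print(poverty_rows)
```

With a downloaded file, the same function could be fed the file contents read with UTF-8 encoding; the standard `csv` module handles the quoted geometry strings that contain commas.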

USER NOTES: 4/7/2025 - The maps and data have been removed for the Health Professional Shortage Areas indicator due to inconsistencies with the data source leading to some missing health professional shortage areas. We are working to fix this issue, including exploring possible alternative data sources.

5/21/2025 - The following changes were made to the 2023 report data (Equity Report Year = 2023). Self-Sufficiency Wage - a typo in the indicator name was fixed (changed sufficienct to sufficient) and the percent for one PUMA corrected from 56.9 to 59.9 (PUMA = San Diego County (Northwest)--Oceanside City & Camp Pendleton). Notes were made consistent for all rows where geography = ZCTA. A note was added to all rows where geography = PUMA. Voter registration - label "92054, 92051" was renamed to be in numerical order and is now "92051, 92054". Removed data from the percentile column because the categories are not true percentiles. Employment - Data was corrected to show the percent of the labor force that are employed (ages 16 and older). Previously, the data was the percent of the population 16 years and older that are in the labor force. 3- and 4-Year-Olds Enrolled in School - percents are now rounded to one decimal place. Poverty - the last two categories/percentiles changed because the 80th percentile cutoff was corrected by 0.01 and one ZCTA was reassigned to a different percentile as a result. Low Birthweight - the 33th percentile label was corrected to be written as the 33rd percentile. Life Expectancy - Corrected the category and percentile assignment for SRA CENTRAL SAN DIEGO. Parks and Community Spaces - corrected the category assignment for six SRAs.

5/21/2025 - Data was uploaded for Equity Report Year 2025. The following changes were made relative to the 2023 report year. Adverse Childhood Experiences - added geographic data for 2025 report. No calculation of bins nor corresponding percentiles due to small number of geographic areas. Low Birthweight - no calculation of bins nor corresponding percentiles due to small number of geographic areas.

Prepared by: Office of Evaluation, Performance, and Analytics and the Office of Equity and Racial Justice, County of San Diego, in collaboration with the San Diego Regional Policy & Innovation Center (https://www.sdrpic.org).
