100+ datasets found
  1. Data from: stereoset

    • huggingface.co
    • opendatalab.com
    Updated Dec 8, 2021
    Cite
    McGill NLP Group (2021). stereoset [Dataset]. https://huggingface.co/datasets/McGill-NLP/stereoset
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 8, 2021
    Dataset authored and provided by
    McGill NLP Group
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for StereoSet

      Dataset Summary
    

    StereoSet is a dataset that measures stereotype bias in language models. StereoSet consists of 17,000 sentences that measure model preferences across gender, race, religion, and profession.

      Supported Tasks and Leaderboards
    

    multiple-choice question answering

      Languages
    

    English (en)

      Dataset Structure
    
    
    
    
    
      Data Instances
    

    intersentence

    {'bias_type': 'race', 'context': 'Many people live in… See the full description on the dataset page: https://huggingface.co/datasets/McGill-NLP/stereoset.

  2. Statewide Live Birth Profiles

    • data.ca.gov
    • data.chhs.ca.gov
    • +6 more
    csv, zip
    Updated Jun 26, 2025
    Cite
    California Department of Public Health (2025). Statewide Live Birth Profiles [Dataset]. https://data.ca.gov/dataset/statewide-live-birth-profiles
    Explore at:
    Available download formats: csv, zip
    Dataset updated
    Jun 26, 2025
    Dataset authored and provided by
    California Department of Public Health https://www.cdph.ca.gov/
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains counts of live births for California as a whole based on information entered on birth certificates. Final counts are derived from static data and include out of state births to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all births that occurred during the time period.

    The final data tables include both births that occurred in California regardless of the place of residence (by occurrence) and births to California residents (by residence), whereas the provisional data table only includes births that occurred in California regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by parent giving birth's age, parent giving birth's race-ethnicity, and birth place type. See temporal coverage for more information on which strata are available for which years.

  3. Wildfire Risk to Communities Housing Unit Count

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    • +4 more
    Updated Apr 21, 2025
    + more versions
    Cite
    U.S. Forest Service (2025). Wildfire Risk to Communities Housing Unit Count [Dataset]. https://catalog.data.gov/dataset/wildfire-risk-to-communities-housing-unit-count-image-service
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    U.S. Department of Agriculture Forest Service http://fs.fed.us/
    Description

    The data included in this publication depict components of wildfire risk specifically for populated areas in the United States. These datasets represent where people live in the United States and the in situ risk from wildfire, i.e., the risk at the location where the adverse effects take place.

    National wildfire hazard datasets of annual burn probability and fire intensity, generated by the USDA Forest Service, Rocky Mountain Research Station and Pyrologix LLC, form the foundation of the Wildfire Risk to Communities data. Vegetation and wildland fuels data from LANDFIRE 2020 (version 2.2.0) were used as input to two different but related geospatial fire simulation systems. Annual burn probability was produced with the USFS geospatial fire simulator (FSim) at a relatively coarse cell size of 270 meters (m). To bring the burn probability raster data down to a finer resolution more useful for assessing hazard and risk to communities, we upsampled them to the native 30 m resolution of the LANDFIRE fuel and vegetation data. In this upsampling process, we also spread values of modeled burn probability into developed areas represented in LANDFIRE fuels data as non-burnable. Burn probability rasters represent landscape conditions as of the end of 2020.

    Fire intensity characteristics were modeled at 30 m resolution using a process that performs a comprehensive set of FlamMap runs spanning the full range of weather-related characteristics that occur during a fire season and then integrates those runs into a variety of results based on the likelihood of those weather types occurring. Before the fire intensity modeling, the LANDFIRE 2020 data were updated to reflect fuels disturbances occurring in 2021 and 2022. As such, the fire intensity datasets represent landscape conditions as of the end of 2022.

    The data products in this publication that represent where people live, reflect 2021 estimates of housing unit and population counts from the U.S. Census Bureau, combined with building footprint data from Onegeo and USA Structures, both reflecting 2022 conditions.

    The specific raster datasets included in this publication include:

    • Building Count: Building Count is a 30-m raster representing the count of buildings in the building footprint dataset located within each 30-m pixel.
    • Building Density: Building Density is a 30-m raster representing the density of buildings in the building footprint dataset (buildings per square kilometer [km²]).
    • Building Coverage: Building Coverage is a 30-m raster depicting the percentage of habitable land area covered by building footprints.
    • Population Count (PopCount): PopCount is a 30-m raster with pixel values representing residential population count (persons) in each pixel.
    • Population Density (PopDen): PopDen is a 30-m raster of residential population density (people/km²).
    • Housing Unit Count (HUCount): HUCount is a 30-m raster representing the number of housing units in each pixel.
    • Housing Unit Density (HUDen): HUDen is a 30-m raster of housing-unit density (housing units/km²).
    • Housing Unit Exposure (HUExposure): HUExposure is a 30-m raster that represents the expected number of housing units within a pixel potentially exposed to wildfire in a year. This is a long-term annual average and not intended to represent the actual number of housing units exposed in any specific year.
    • Housing Unit Impact (HUImpact): HUImpact is a 30-m raster that represents the relative potential impact of fire to housing units at any pixel, if a fire were to occur. It is an index that incorporates the general consequences of fire on a home as a function of fire intensity and uses flame length probabilities from wildfire modeling to capture likely intensity of fire.
    • Housing Unit Risk (HURisk): HURisk is a 30-m raster that integrates all four primary elements of wildfire risk - likelihood, intensity, susceptibility, and exposure - on pixels where housing unit density is greater than zero.
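
    The raster arithmetic described in this entry can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, nearest-neighbor repetition stands in for the unspecified upsampling method (it does not spread values into non-burnable areas), the density conversion is a plain per-pixel division (the published density rasters may be computed differently), and the exposure formula is an assumed reading of "expected number of housing units exposed in a year".

```python
# Sketch only: function names and formulas are illustrative assumptions,
# not the Forest Service's published processing code.

PIXEL_SIZE_M = 30.0                            # 30-m raster cell, per the description
PIXEL_AREA_KM2 = (PIXEL_SIZE_M / 1000.0) ** 2  # roughly 0.0009 km^2 per pixel


def upsample_nearest(grid, factor):
    """Repeat each coarse cell `factor` times along both axes."""
    fine = []
    for row in grid:
        fine_row = [value for value in row for _ in range(factor)]
        fine.extend(list(fine_row) for _ in range(factor))
    return fine


def density_per_km2(count_in_pixel):
    """Convert a per-pixel count to a per-square-kilometer density."""
    return count_in_pixel / PIXEL_AREA_KM2


def expected_exposure(housing_units, burn_probability):
    """Assumed form: housing units weighted by annual burn probability."""
    return housing_units * burn_probability


coarse = [[0.01, 0.02]]                      # made-up row of 270-m burn-probability cells
fine = upsample_nearest(coarse, 270 // 30)   # factor 9: each coarse cell becomes 9 x 9 cells
print(len(fine), len(fine[0]))
print(density_per_km2(1))
print(expected_exposure(4, fine[0][0]))
```

    Going from 270 m to 30 m is a factor of 9 per axis, so every coarse cell maps to 81 fine cells; the burn-probability values themselves are unchanged by this kind of resampling.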

  4. Catholic Carbon Footprint Summary Dashboard

    • catholic-geo-hub-cgisc.hub.arcgis.com
    Updated Oct 8, 2019
    + more versions
    Cite
    burhansm2 (2019). Catholic Carbon Footprint Summary Dashboard [Dataset]. https://catholic-geo-hub-cgisc.hub.arcgis.com/items/456fa8d2472541529a006719bd8e3745
    Dataset updated
    Oct 8, 2019
    Dataset authored and provided by
    burhansm2
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0) https://creativecommons.org/licenses/by-nd/4.0/
    License information was derived automatically

    Description

    PerCapita_CO2_Footprint_InDioceses_FULL

    Burhans, Molly A., Cheney, David M., Gerlt, R. “PerCapita_CO2_Footprint_InDioceses_FULL”. Scale not given. Version 1.0. MO and CT, USA: GoodLands Inc., Environmental Systems Research Institute, Inc., 2019.

    Methodology

    This is the first global Carbon footprint of the Catholic population. We will continue to improve and develop these data with our research partners over the coming years. While it is helpful, it should also be viewed and used as a "beta" prototype that we and our research partners will build from and improve. The years of carbon data are (2010) and (2015 - SHOWN). The year of Catholic data is 2018. The year of population data is 2016. Care should be taken during future developments to harmonize the years used for Catholic, population, and CO2 data.

    1. Zonal Statistics: Esri Population Data and Dioceses --> Population per dioceses, non Vatican based numbers
    2. Zonal Statistics: FFDAS and Dioceses and Population dataset --> Mean CO2 per Diocese
    3. Field Calculation: Population per Diocese and Mean CO2 per diocese --> CO2 per Capita
    4. Field Calculation: CO2 per Capita * Catholic Population --> Catholic Carbon Footprint

    Assumption: PerCapita CO2

    Deriving per-capita CO2 from mean CO2 in a geography assumes that people's footprint accounts for their personal lifestyle and involvement in local business and industries that contribute CO2.

    Catholic CO2

    Assumes that Catholics and non-Catholics have similar CO2 footprints from their lifestyles.

    Derived from:

    A multiyear, global gridded fossil fuel CO2 emission data product: Evaluation and analysis of results
    http://ffdas.rc.nau.edu/About.html

    • Rayner et al., JGR, 2010 - This is the first FFDAS paper describing the version 1.0 methods and results published in the Journal of Geophysical Research.
    • Asefi et al., 2014 - This is the paper describing the methods and results of the FFDAS version 2.0 published in the Journal of Geophysical Research.
    • Readme version 2.2 - A simple readme file to assist in using the 10 km x 10 km, hourly gridded Vulcan version 2.2 results.
    • Liu et al., 2017 - A paper exploring the carbon cycle response to the 2015-2016 El Nino through the use of carbon cycle data assimilation with FFDAS as the boundary condition for FFCO2.

    "S. Asefi‐Najafabady, P. J. Rayner, K. R. Gurney, A. McRobert, Y. Song, K. Coltin, J. Huang, C. Elvidge, K. Baugh
    First published: 10 September 2014 https://doi.org/10.1002/2013JD021296 Cited by: 30
    Link to FFDAS data retrieval and visualization: http://hpcg.purdue.edu/FFDAS/index.php

    Abstract

    High‐resolution, global quantification of fossil fuel CO2 emissions is emerging as a critical need in carbon cycle science and climate policy. We build upon a previously developed fossil fuel data assimilation system (FFDAS) for estimating global high‐resolution fossil fuel CO2 emissions. We have improved the underlying observationally based data sources, expanded the approach through treatment of separate emitting sectors including a new pointwise database of global power plants, and extended the results to cover a 1997 to 2010 time series at a spatial resolution of 0.1°. Long‐term trend analysis of the resulting global emissions shows subnational spatial structure in large active economies such as the United States, China, and India. These three countries, in particular, show different long‐term trends and exploration of the trends in nighttime lights, and population reveal a decoupling of population and emissions at the subnational level. Analysis of shorter‐term variations reveals the impact of the 2008–2009 global financial crisis with widespread negative emission anomalies across the U.S. and Europe. We have used a center of mass (CM) calculation as a compact metric to express the time evolution of spatial patterns in fossil fuel CO2 emissions. The global emission CM has moved toward the east and somewhat south between 1997 and 2010, driven by the increase in emissions in China and South Asia over this time period. Analysis at the level of individual countries reveals per capita CO2 emission migration in both Russia and India. The per capita emission CM holds potential as a way to succinctly analyze subnational shifts in carbon intensity over time. Uncertainties are generally lower than the previous version of FFDAS due mainly to an improved nightlight data set."

    Global Diocesan Boundaries:

    Burhans, M., Bell, J., Burhans, D., Carmichael, R., Cheney, D., Deaton, M., Emge, T., Gerlt, B., Grayson, J., Herries, J., Keegan, H., Skinner, A., Smith, M., Sousa, C., Trubetskoy, S. “Diocesean Boundaries of the Catholic Church” [Feature Layer]. Scale not given. Version 1.2. Redlands, CA, USA: GoodLands Inc., Environmental Systems Research Institute, Inc., 2016.

    Using: ArcGIS. 10.4. Version 10.0. Redlands, CA: Environmental Systems Research Institute, Inc., 2016.

    Boundary Provenance

    Statistics and Leadership Data

    Cheney, D.M. “Catholic Hierarchy of the World” [Database]. Date Updated: August 2019. Catholic Hierarchy. Using: Paradox. Retrieved from Original Source.

    Catholic Hierarchy

    Annuario Pontificio per l’Anno .. Città del Vaticano: Tipografia Poliglotta Vaticana, Multiple Years.

    The data for these maps was extracted from the gold standard of Church data, the Annuario Pontificio, published yearly by the Vatican. The collection and data development of the Vatican Statistics Office are unknown. GoodLands is not responsible for errors within this data. We encourage people to document and report errant information to us at data@good-lands.org or directly to the Vatican.

    Additional information about regular changes in bishops and sees comes from a variety of public diocesan and news announcements.

    GoodLands’ polygon data layers, version 2.0 for global ecclesiastical boundaries of the Roman Catholic Church:

    Although care has been taken to ensure the accuracy, completeness and reliability of the information provided, due to this being the first developed dataset of global ecclesiastical boundaries curated from many sources it may have a higher margin of error than established geopolitical administrative boundary maps. Boundaries need to be verified with appropriate Ecclesiastical Leadership. The current information is subject to change without notice. No parties involved with the creation of this data are liable for indirect, special or incidental damage resulting from, arising out of or in connection with the use of the information. We referenced 1960 sources to build our global datasets of ecclesiastical jurisdictions. Often, they were isolated images of dioceses, historical documents and information about parishes that were cross checked. These sources can be viewed here:
    https://docs.google.com/spreadsheets/d/11ANlH1S_aYJOyz4TtG0HHgz0OLxnOvXLHMt4FVOS85Q/edit#gid=0

    To learn more or contact us please visit: https://good-lands.org/

    Esri Gridded Population Data 2016

    Description

    This layer is a global estimate of human population for 2016. Esri created this estimate by modeling a footprint of where people live as a dasymetric settlement likelihood surface, and then assigned 2016 population estimates stored on polygons of the finest level of geography available onto the settlement surface. Where people live means where their homes are, as in where people sleep most of the time, and this is opposed to where they work. Another way to think of this estimate is a night-time estimate, as opposed to a day-time estimate.

    Knowledge of population distribution helps us understand how humans affect the natural world and how natural events such as storms and earthquakes, and other phenomena affect humans. This layer represents the footprint of where people live, and how many people live there.

    Dataset Summary

    Each cell in this layer has an integer value with the estimated number of people likely to live in the geographic region represented by that cell. Esri additionally produced several additional layers:

    • World Population Estimate Confidence 2016: the confidence level (1-5) per cell for the probability of people being located and estimated correctly.
    • World Population Density Estimate 2016: this layer is represented as population density in units of persons per square kilometer.
    • World Settlement Score 2016: the dasymetric likelihood surface used to create this layer by apportioning population from census polygons to the settlement score raster.

    To use this layer in analysis, there are several properties or geoprocessing environment settings that should be used:

    • Coordinate system: WGS_1984. This service and its underlying data are WGS_1984. We do this because projecting population count data actually will change the populations due to resampling and either collapsing or splitting cells to fit into another coordinate system.
    • Cell Size: 0.0013474728 degrees (approximately 150-meters) at the equator.
    • No Data: -1
    • Bit Depth: 32-bit signed

    This layer has query, identify, pixel, and export image functions enabled, and is restricted to a maximum analysis size of 30,000 x 30,000 pixels - an area about the size of Africa.

    Frye, C. et al., (2018). Using Classified and Unclassified Land Cover Data to Estimate the Footprint of Human Settlement. Data Science Journal. 17, p.20. DOI: http://doi.org/10.5334/dsj-2018-020.

    What can you do with this layer?

    This layer is unsuitable for mapping or cartographic use, and thus it does not include a convenient legend. Instead, this layer is useful for analysis, particularly for estimating counts of people living within watersheds, coastal areas, and other areas that do not have standard boundaries. Esri recommends using the Zonal Statistics tool or the Zonal Statistics to Table tool where you provide input zones as either polygons, or raster data, and the tool will summarize the count of population within those zones.
    https://www.esri.com/arcgis-blog/products/arcgis-living-atlas/data-management/2016-world-population-estimate-services-are-now-available/
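
    The zonal-statistics and field-calculation steps in this entry's methodology can be approximated in plain Python. The sketch below is not the ArcGIS workflow the authors used: the grids, zone ids, and CO2 figure are made up, the function names are hypothetical, and the only values taken from the description are the NoData marker (-1) and the step 3 and step 4 formulas.

```python
# Sketch only: grids, zone ids, and numbers are made up for illustration.

NO_DATA = -1  # NoData value stated for the Esri population layer


def zonal_sum(values, zones):
    """Sum cell values per zone id, skipping NoData cells."""
    totals = {}
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if value == NO_DATA:
                continue
            totals[zone] = totals.get(zone, 0) + value
    return totals


def catholic_footprint(co2, population, catholics):
    """Steps 3 and 4: per-capita CO2, then scaled by Catholic population."""
    per_capita = co2 / population
    return per_capita * catholics


population_grid = [[10, 20, -1],
                   [5, -1, 15]]
diocese_grid = [["A", "A", "B"],
                ["A", "B", "B"]]

population = zonal_sum(population_grid, diocese_grid)
print(population)
print(catholic_footprint(co2=700.0, population=population["A"], catholics=7))
```

    Skipping NoData cells matters here: treating -1 as a real count would silently lower every zone total that touches an unpopulated cell.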

  5. New York City Bus Data

    • kaggle.com
    Updated May 18, 2018
    Cite
    MichaelStone (2018). New York City Bus Data [Dataset]. https://www.kaggle.com/stoney71/new-york-city-transport-statistics/tasks
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 18, 2018
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    MichaelStone
    Area covered
    New York
    Description

    Context

    I wanted to find a better way to provide live traffic updates. We don't all have access to the data from traffic monitoring sensors or whatever gets uploaded from people's smartphones to Apple, Google, etc., and I question how accurate the traffic congestion shown on Google Maps or other apps is. Since buses are in the same traffic and many buses stream their GPS location and other data live, I figured they would be an ideal source for traffic data. I investigated the data streams available from many bus companies around the world and found MTA in NYC to be very reliable.

    Content

    This dataset is from the NYC MTA buses data stream service. In roughly 10-minute increments, the bus location, route, bus stop and more are included in each row. The scheduled arrival time from the bus schedule is also included, to give an indication of where the bus should be (how much behind schedule, or on time, or even ahead of schedule).
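
    Comparing a recorded time against the scheduled arrival time is simple timestamp arithmetic. The sketch below assumes both values are full date-time strings in the format shown; the function name, the format, and the sample timestamps are hypothetical and are not taken from the dataset's actual columns.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout, not confirmed by the source


def minutes_behind_schedule(recorded_at, scheduled_arrival):
    """Return minutes late (positive) or ahead of schedule (negative)."""
    recorded = datetime.strptime(recorded_at, TIME_FORMAT)
    scheduled = datetime.strptime(scheduled_arrival, TIME_FORMAT)
    return (recorded - scheduled).total_seconds() / 60.0


# Made-up timestamps for illustration only.
late = minutes_behind_schedule("2018-05-18 08:10:00", "2018-05-18 08:04:30")
early = minutes_behind_schedule("2018-05-18 08:00:00", "2018-05-18 08:02:00")
print(late, early)
```

    If the scheduled time in the real file is stored without a date, it would first need to be combined with the service date before this subtraction makes sense.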

    Acknowledgements

    Data is recorded from the MTA SIRI Real Time data feed and the MTA GTFS Schedule data.

    Inspiration

    I want to see what explorations and discoveries people come up with from this data. Feel free to download this dataset for your own use; however, I would appreciate as many Kernels included on Kaggle as we can get.

    Based on the interest this generates I plan to collect more data for subsequent months down the track.

  6. Live Birth Profiles by County

    • data.chhs.ca.gov
    • data.ca.gov
    • +4 more
    csv, zip
    Updated Jun 26, 2025
    + more versions
    Cite
    California Department of Public Health (2025). Live Birth Profiles by County [Dataset]. https://data.chhs.ca.gov/dataset/live-birth-profiles-by-county
    Explore at:
    Available download formats: csv(1911), csv(8256822), csv(9986780), zip, csv(456184)
    Dataset updated
    Jun 26, 2025
    Dataset authored and provided by
    California Department of Public Health https://www.cdph.ca.gov/
    Description

    This dataset contains counts of live births for California counties based on information entered on birth certificates. Final counts are derived from static data and include out of state births to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all births that occurred during the time period.

    The final data tables include both births that occurred in California regardless of the place of residence (by occurrence) and births to California residents (by residence), whereas the provisional data table only includes births that occurred in California regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by parent giving birth's age, parent giving birth's race-ethnicity, and birth place type. See temporal coverage for more information on which strata are available for which years.

  7. FiveThirtyEight Police Locals Dataset

    • kaggle.com
    Updated Mar 26, 2019
    Cite
    FiveThirtyEight (2019). FiveThirtyEight Police Locals Dataset [Dataset]. https://www.kaggle.com/fivethirtyeight/fivethirtyeight-police-locals-dataset/code
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 26, 2019
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    FiveThirtyEight
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Content

    Police Residence

    This folder contains data behind the story Most Police Don’t Live In The Cities They Serve.

    Includes the cities with the 75 largest police forces, with the exception of Honolulu for which data is not available. All calculations are based on data from the U.S. Census.

    The Census Bureau numbers are potentially going to differ from other counts for three reasons:

    1. The census category for police officers also includes sheriffs, transit police and others who might not be under the same jurisdiction as a city’s police department proper. The census category won’t include private security officers.
    2. The census data is estimated from 2006 to 2010; police forces may have changed in size since then.
    3. There is always a margin of error in census numbers; they are estimates, not complete counts.

    How to read police-locals.csv

    Header             Definition
    city               U.S. city
    police_force_size  Number of police officers serving that city
    all                Percentage of the total police force that lives in the city
    white              Percentage of white (non-Hispanic) police officers who live in the city
    non-white          Percentage of non-white police officers who live in the city
    black              Percentage of black police officers who live in the city
    hispanic           Percentage of Hispanic police officers who live in the city
    asian              Percentage of Asian police officers who live in the city

    Note: When a cell contains ** it means that there are fewer than 100 police officers of that race serving that city.
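
    Because some cells hold ** instead of a number, a reader of police-locals.csv has to treat that marker as missing data. The sketch below shows one way to do it with Python's standard csv module. The column names come from the header table above; the sample rows are invented, and storing percentages as decimal fractions is an assumption about the file.

```python
import csv
import io

PERCENT_COLUMNS = ["all", "white", "non-white", "black", "hispanic", "asian"]

# Made-up rows for illustration; these are not values from police-locals.csv.
SAMPLE = """city,police_force_size,all,white,non-white,black,hispanic,asian
Exampleville,1200,0.45,0.40,0.55,0.60,0.50,**
Sampleton,800,0.30,0.25,**,**,**,**
"""


def parse_rows(text):
    """Read rows, turning '**' cells into None and the rest into numbers."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {"city": raw["city"],
               "police_force_size": int(raw["police_force_size"])}
        for column in PERCENT_COLUMNS:
            cell = raw[column].strip()
            row[column] = None if cell == "**" else float(cell)
        rows.append(row)
    return rows


rows = parse_rows(SAMPLE)
print(rows[0]["asian"], rows[1]["all"])
```

    Keeping the suppressed cells as None, instead of 0, prevents them from dragging down any average computed across cities.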

    Context

    This is a dataset from FiveThirtyEight hosted on their GitHub. Explore FiveThirtyEight data using Kaggle and all of the data sources available through the FiveThirtyEight organization page!

    • Update Frequency: This dataset is updated daily.

    Acknowledgements

    This dataset is maintained using GitHub's API and Kaggle's API.

    This dataset is distributed under the Attribution 4.0 International (CC BY 4.0) license.

  8. Population Mid-Year Estimates - Datasets - Lincolnshire Open Data

    • lincolnshire.ckan.io
    Updated Aug 10, 2017
    + more versions
    Cite
    lincolnshire.ckan.io (2017). Population Mid-Year Estimates - Datasets - Lincolnshire Open Data [Dataset]. https://lincolnshire.ckan.io/dataset/population-mid-year-estimates
    Dataset updated
    Aug 10, 2017
    Dataset provided by
    CKAN https://ckan.org/
    License

    Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Population Mid-year Estimates from the Office for National Statistics (ONS). These are the official estimates of the resident population in Lincolnshire. ONS uses information from the census and other data to produce these official mid-year population estimates every year between each census. These figures show how many people live in each local area and the population age-sex structure. This data is updated annually. Although the ONS data shows exact numbers, they are estimates so some rounding should be applied. For current Armed forces populations, two Ministry of Defence links are also shown below. The ONS 2021 Census link has Veterans data. Population Projections data sourced from ONS is also available on this platform. The Source link shown below is to the ONS Nomis website. It has user-friendly data query tools for a broad range of ONS and other datasets from official sources.

  9. Anti Spoofing Selfie Live Dataset - 5,000+ files

    • kaggle.com
    Updated May 30, 2024
    Cite
    Axon Labs (2024). Anti Spoofing Selfie Live Dataset - 5,000+ files [Dataset]. https://www.kaggle.com/datasets/axondata/anti-spoofing-live-dataset
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 30, 2024
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    Axon Labs
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Anti Spoofing Selfie Live dataset - Selfie collection

    What is inside this dataset?

    The Biometric Attack dataset consists of more than 5,000 selfie images of people from more than 50 countries. Each participant provided one real-life selfie image. Live selfies help facial recognition models identify real faces and detect spoofing attempts, decreasing false negative results for Liveness detection tests.

    Dataset parameters:

    • Key nationalities are covered (Caucasians, Black, Asian, Hispanic, etc.)
    • Variety of lighting conditions and capturing devices
    • Different demographic parameters (broad range of Age, balanced gender and race distribution)

    Full version of dataset is available for commercial usage - leave a request on our website Axonlabs to purchase the dataset 💰

    How does the Live selfie dataset help Liveness models?

    Selfies provide a diverse range of facial features, lighting conditions, and capturing devices, which are essential for training robust facial recognition models that can accurately distinguish between real and spoofed faces

    Potential Use Cases:

    Liveness detection: This dataset is ideal for training and evaluating liveness detection models, enabling researchers to distinguish between real and spoof data with high accuracy

    Keywords: Real life data, Live data, Selfie data, Antispoofing for AI, Liveness Detection dataset for AI, Spoof Detection dataset, Facial Recognition dataset, Biometric Authentication dataset, AI Dataset, Anti-Spoofing Technology, Facial Biometrics, Machine Learning Dataset, Deep Learning

  10. Pension Insurance Data Tables

    • catalog.data.gov
    • datadiscoverystudio.org
    • +3 more
    Updated Nov 12, 2020
    Cite
    Pension Benefit Guaranty Corporation (2020). Pension Insurance Data Tables [Dataset]. https://catalog.data.gov/dataset/pension-insurance-data-tables
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    Pension Benefit Guaranty Corporation http://www.pbgc.gov/
    Description

    Find out about retirement trends in PBGC's data tables. The tables include statistics on the people and pensions that PBGC protects, including how many Americans are in PBGC-insured pension plans, how many get PBGC benefits, and where they live. This data set will be updated periodically. (Updated annually)

  11. Malnutrition: Underweight Women, Children & Others

    • kaggle.com
    Updated Aug 17, 2023
    Cite
    Sarthak Bose (2023). Malnutrition: Underweight Women, Children & Others [Dataset]. https://www.kaggle.com/datasets/sarthakbose/malnutrition-underweight-women-children-and-others
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 17, 2023
    Dataset provided by
    Kaggle
    Authors
    Sarthak Bose
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0) https://creativecommons.org/licenses/by-nd/4.0/
    License information was derived automatically

    Description


    This dataset includes malnutrition indicators and some of the features that might impact malnutrition. The detailed description of the dataset is given below:

    • Percentage-of-underweight-children-data: Percentage of children aged 5 years or below who are underweight by country.

    • Prevalence of Underweight among Female Adults (Age Standardized Estimate): Percentage of female adults whose BMI is less than 18.

    • GDP per capita (constant 2015 US$): GDP per capita is gross domestic product divided by midyear population. GDP is the sum of gross value added by all resident producers in the economy plus any product taxes and minus any subsidies not included in the value of the products. It is calculated without making deductions for depreciation of fabricated assets or for depletion and degradation of natural resources. Data are in constant 2015 U.S. dollars.

    • Domestic general government health expenditure (% of GDP): Public expenditure on health from domestic sources as a share of the economy as measured by GDP.

    • Maternal mortality ratio (modeled estimate, per 100,000 live births): Maternal mortality ratio is the number of women who die from pregnancy-related causes while pregnant or within 42 days of pregnancy termination per 100,000 live births. The data are estimated with a regression model using information on the proportion of maternal deaths among non-AIDS deaths in women ages 15-49, fertility, birth attendants, and GDP measured using purchasing power parities (PPPs).

    • Mean-age-at-first-birth-of-women-aged-20-50-data: Average age at which women of age 20-50 years have their first child.

    • School enrollment, secondary, female (% gross): Gross enrollment ratio is the ratio of total enrollment, regardless of age, to the population of the age group that officially corresponds to the level of education shown. Secondary education completes the provision of basic education that began at the primary level, and aims at laying the foundations for lifelong learning and human development, by offering more subject- or skill-oriented instruction using more specialized teachers.
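    Several of the indicators above are plain ratios. The following is a minimal sketch of that arithmetic, not the dataset author's code: the function names and the inputs in the test values are hypothetical, and the BMI formula (weight in kilograms divided by height in metres squared) is the standard definition rather than something the description states.

```python
def gdp_per_capita(gdp: float, midyear_population: float) -> float:
    """GDP divided by midyear population, as described above."""
    return gdp / midyear_population


def maternal_mortality_ratio(maternal_deaths: float, live_births: float) -> float:
    """Maternal deaths per 100,000 live births."""
    return maternal_deaths / live_births * 100_000


def gross_enrollment_ratio(total_enrolled: float, official_age_population: float) -> float:
    """Total enrollment, regardless of age, as a percentage of the
    population in the official age group for that level of education."""
    return total_enrolled / official_age_population * 100


def is_underweight(weight_kg: float, height_m: float, threshold: float = 18.0) -> bool:
    """True when BMI (kg / m^2) is below the threshold used in the description."""
    return weight_kg / height_m ** 2 < threshold
```

    Because gross enrollment counts students of any age, the ratio can exceed 100%.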

  12. United States Age Group Population Dataset: A complete breakdown of United...

    • neilsberg.com
    csv, json
    Updated Sep 16, 2023
    Cite
    Neilsberg Research (2023). United States Age Group Population Dataset: A complete breakdown of United States age demographics from 0 to 85 years, distributed across 18 age groups [Dataset]. https://www.neilsberg.com/research/datasets/5fd2b2bb-3d85-11ee-9abe-0aa64bf2eeb2/
    Explore at:
    json, csv (available download formats)
    Dataset updated
    Sep 16, 2023
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Variables measured
    Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we first analyzed and categorized the data for each age group. We divided ages 0 to 85 into buckets of roughly 5 years each and aggregated all ages over 85 into a single group. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the United States population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total United States population. The dataset can be used to understand the population distribution of the United States by age. For example, using this dataset, we can identify the largest age group in the United States.

    Key observations

    The largest age group in the United States was 25-29 years, with a population of 22,854,328 (6.93%), according to the 2021 American Community Survey. The smallest age group was 80-84 years, with a population of 5,932,196 (1.80%). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Variables / Data Columns

    • Age Group: This column displays the age group in consideration
    • Population: The population for the specific age group in the United States is shown in this column.
    • % of Total Population: This column displays the population of each age group as a percentage of the total United States population. Please note that the percentages may not sum to exactly 100% due to rounding of values.
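    The relationship between the Population and % of Total Population columns can be sketched as follows. This is an illustration only: the function name is hypothetical, the three counts are made-up placeholders rather than values from the dataset, and rounding to two decimals is an assumption based on the figures quoted in the key observations.

```python
from typing import Dict


def population_shares(populations: Dict[str, int]) -> Dict[str, float]:
    """Return each age group's share of the total, as a percentage
    rounded to two decimal places."""
    total = sum(populations.values())
    return {group: round(100 * count / total, 2) for group, count in populations.items()}


# Hypothetical counts, chosen so that rounding is visible.
example = {"Under 5 years": 1, "5 to 9 years": 1, "10 to 14 years": 1}
shares = population_shares(example)

# Each share rounds down slightly, so the rounded values
# no longer add up to exactly 100.
rounded_total = sum(shares.values())
```

    This is the rounding effect the column note warns about.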

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for United States Population by Age.

  13. World Soccer live data feed

    • kaggle.com
    Updated Jan 28, 2019
    Cite
    Mohammad Ghahramani (2019). World Soccer live data feed [Dataset]. https://www.kaggle.com/datasets/analystmasters/world-soccer-live-data-feed/discussion
    Dataset updated
    Jan 28, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mohammad Ghahramani
    Description

    Context

    This is the first live data stream on Kaggle providing a simple yet rich source of all soccer matches around the world 24/7 in real-time.

    What makes it unique compared to other datasets?

    • It is the first live data feed on Kaggle and it is totally free
    • Unlike “Churn rate” datasets you do not have to wait months to evaluate your predictions; simply check the match’s outcome in a couple of hours
    • You can use your predictions/analysis for your own benefit instead of spending your time and resources on helping a company maximize its profit
    • A five-year-old laptop can do the calculations and you do not need high-end GPUs
    • Couldn’t make it to the top 3 submissions? Never mind, you still have the chance to get your prize on your own
    • You can’t get accurate results on all samples? Do not worry, just filter out the hard ones (e.g. ignore international friendly) and simply choose the ones you are sure of.
    • Need help from human experts for each sample? Every sample comes with at least two opinions from experts
    • You wish you could add your complementary data? Just contact us and we will try to facilitate it.
    • Couldn’t win “Warren Buffett's 2018 March Madness Bracket Contest”? Here is your chance to make your accumulative profit.

    Simply train your algorithm on the first version of the training dataset, which contains approximately 11.5k matches, and predict the data provided in the following data feed.

    Fetch the data stream

    The CSV file is updated every 30 minutes, at minutes 20 and 50 of every hour. Please do not download it more than twice per hour, as each download incurs additional cost.
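    One way to respect this schedule is to compute the next refresh time and wait until then before downloading again. The sketch below is only an illustration: the function name is hypothetical, and it assumes the clock passed in is in the same time zone the feed uses for its schedule.

```python
from datetime import datetime, timedelta

# The feed is described as refreshing at minutes 20 and 50 of every hour.
REFRESH_MINUTES = (20, 50)


def next_refresh(now: datetime) -> datetime:
    """Return the first scheduled refresh strictly after `now`."""
    base = now.replace(second=0, microsecond=0)
    for minute in REFRESH_MINUTES:
        candidate = base.replace(minute=minute)
        if candidate > now:
            return candidate
    # Past minute 50: the next refresh is at minute 20 of the following hour.
    next_hour = base.replace(minute=0) + timedelta(hours=1)
    return next_hour.replace(minute=REFRESH_MINUTES[0])
```

    Polling once after each returned time keeps the download count at two per hour.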

    You may download the CSV data file from the Amazon S3 server using the following link, after changing FOLDER_NAME as described below:

    https://s3.amazonaws.com/FOLDER_NAME/amasters.csv

    *. Substitute FOLDER_NAME with "analyst-masters".
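    The substitution can be sketched in a few lines. The function names, the timeout, and the UTF-8 decoding are assumptions made for illustration; whether the feed is still served at this address has not been verified here.

```python
from urllib.request import urlopen

# Link template exactly as given in the description.
URL_TEMPLATE = "https://s3.amazonaws.com/FOLDER_NAME/amasters.csv"


def feed_url(folder_name: str = "analyst-masters") -> str:
    """Fill the FOLDER_NAME placeholder in the link template."""
    return URL_TEMPLATE.replace("FOLDER_NAME", folder_name)


def fetch_feed(timeout: float = 30.0) -> str:
    """Download the CSV as text. The encoding is an assumption;
    undecodable bytes are replaced rather than raising an error."""
    with urlopen(feed_url(), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

    Calling fetch_feed() performs a real download, so it should be used no more than twice per hour, as requested above.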

    Content

    Our goal is to identify the outcome of a match as Home, Draw or Away. The variety of sources and the nature of the information provided in this data stream make it a unique database. Currently, five servers are collecting data from soccer matches around the world, communicating with each other and finally aggregating the data based on the dominant features learned from 400,000 matches over 7 years. I describe every column and the data collection below in two categories, Category I – Current situation and Category II – Head-to-Head History. Hence, we divide the type of data we have from each team into 4 modes:

    • Mode 1: we have both Category I and Category II available
    • Mode 2: we only have Category I available
    • Mode 3: we only have Category II available
    • Mode 4: neither Category I nor Category II is available
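    The four modes follow directly from which categories are present. A minimal sketch, assuming a hypothetical helper that is told whether each category is available (how availability is detected in the CSV is not specified in the description):

```python
def data_mode(has_category_1: bool, has_category_2: bool) -> int:
    """Map category availability to the mode numbers listed above."""
    if has_category_1 and has_category_2:
        return 1  # both Category I and Category II
    if has_category_1:
        return 2  # only Category I (current situation)
    if has_category_2:
        return 3  # only Category II (head-to-head history)
    return 4      # neither category
```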

    Below you can find a full illustration of each category.

    I. Current situation

    Col 1 to 3:

    Votes_for_Home Votes_for_Draw Votes_for_Away
    

    The most distinctive parts of the database are these 3 columns. We are releasing the opinions of over 100 professional soccer analysts predicting the outcome of a match. Their votes are the result of every piece of information they receive on players, team line-ups, injuries and a team's urgency to win a match to stay in the league. They are spread around the world in various time zones and are experts on soccer teams from various regions. Our servers aggregate their opinions to update the CSV file until kickoff. Therefore, even if 40 users predict Real-Madrid wins against Real-Sociedad in Santiago Bernabeu on January 6th, 2019, if 5 users predict that Real-Sociedad (the away team) will be the winner, you should doubt the home win. Here, the “majority of votes” works in conjunction with other features.
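    A simple way to look at these three columns is as vote shares rather than raw counts. The sketch below is only an illustration: the function name is hypothetical, and the draw count of 0 in the test values is made up to complete the 40-versus-5 example from the text.

```python
from typing import Dict, Optional


def vote_shares(votes_home: int, votes_draw: int, votes_away: int) -> Optional[Dict[str, float]]:
    """Convert the three vote counts into fractions of all votes cast.
    Returns None when no votes were recorded."""
    total = votes_home + votes_draw + votes_away
    if total == 0:
        return None
    return {
        "Home": votes_home / total,
        "Draw": votes_draw / total,
        "Away": votes_away / total,
    }
```

    With 40 home votes and 5 away votes, the away share is about 11%, a minority opinion that the description suggests should still lower confidence in the home win.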

    Col 4 to 9:

    Weekday Day Month  Year  Hour  Minute
    

    There are over 60,000 matches during a year, and approximately 400 are usually held per day on weekends. More critical and exciting matches, which are usually less predictable, are held toward the evening in Europe. We are currently providing time in Central European Time (CET), equivalent to GMT +01:00.

    *. Please note that the 2nd row of the CSV file represents the time at which data values from all servers were saved to the file.
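    The date and time columns can be combined into a timezone-aware value. This is a minimal sketch under stated assumptions: the function name is hypothetical, the Year column is assumed to hold a four-digit year, and the offset is fixed at +01:00 as the description says, with no daylight-saving adjustment.

```python
from datetime import datetime, timedelta, timezone

# Fixed GMT+01:00 offset, as stated in the description.
CET = timezone(timedelta(hours=1), "CET")


def kickoff_utc(day: int, month: int, year: int, hour: int, minute: int) -> datetime:
    """Build a CET timestamp from the Day, Month, Year, Hour and Minute
    columns and convert it to UTC."""
    local = datetime(year, month, day, hour, minute, tzinfo=CET)
    return local.astimezone(timezone.utc)
```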

    Col 10 to 13:

    Total_Bettors   Bet_Perc_on_Home    Bet_Perc_on_Draw   Bet_Perc_on_Away
    

    This data is recorded a few hours before the match, because people tend to place bets emotionally as kickoff approaches. “Total_Bettors” is the overall number of bettors, and the other three columns give the percentage of those bettors backing the “Home,” “Draw” and “Away” outcomes.

    Col 14 to 15:

    Team_1 Team_2   
    

    The team playing “Home” is “Team_1” and the opponent playing “Away” is “Team_2”.

    Col 16 to 36:

    League_Rank_1  League_Rank_2  Total_teams     Points_1  Points_2  Max_points Min_points Won_1  Draw_1 Lost_1 Won_2  Draw_2 Lost_2 Goals_Scored_1 Goals_Scored_2 Goals_Rec_1 Goal_Rec_2 Goals_Diff_1  Goals_Diff_2
    

    If the match is betw...

  14. Premium GIS Data | Asia/ MENA | Latest Estimates on Population, Consuming...

    • datarade.ai
    .json, .csv
    Updated Nov 23, 2024
    Cite
    GapMaps (2024). Premium GIS Data | Asia/ MENA | Latest Estimates on Population, Consuming Class, Retail Spend, Demographics | Map Data | Demographic Data [Dataset]. https://datarade.ai/data-products/gapmaps-premium-demographics-gis-data-asia-mena-150m-x-1-gapmaps
    Explore at:
    .json, .csv (available download formats)
    Dataset updated
    Nov 23, 2024
    Dataset authored and provided by
    GapMaps
    Area covered
    Indonesia, Philippines, India, Singapore, Malaysia, Saudi Arabia, Asia
    Description

    Sourcing accurate and up-to-date demographics GIS data across Asia and MENA has historically been difficult for retail brands looking to expand their store networks in these regions. Either the data does not exist or it isn't readily accessible or updated regularly.

    GapMaps uses known population data combined with billions of mobile device location points to provide highly accurate and globally consistent geodemographic datasets across Asia and MENA at 150m x 150m grid levels in major cities and 1km grids outside of major cities.

    With this information, brands can get a detailed understanding of who lives in a catchment, where they work and their spending potential, which allows them to:

    • Better understand your customers
    • Identify optimal locations to expand your retail footprint
    • Define sales territories for franchisees
    • Run targeted marketing campaigns.

    Premium demographics GIS data for Asia and MENA includes the latest estimates (updated annually) on:

    1. Population (how many people live in your local catchment)
    2. Demographics (who lives within your local catchment)
    3. Worker population (how many people work within your local catchment)
    4. Consuming Class and Premium Consuming Class (who can afford to buy goods & services beyond their basic needs and/or shop at premium retailers)
    5. Retail Spending (Food & Beverage, Grocery, Apparel, Other). How much are consumers spending on retail goods and services by category.

    Primary Use Cases for GapMaps Demographics GIS Data:

    1. Retail (e.g. Fast Food/QSR, Cafe, Fitness, Supermarket/Grocery)
       • Customer Profiling: get a detailed understanding of the demographic profile of your customers, where they work and their spending potential
       • Analyse your trade areas at granular 150m x 150m grid levels using all the key metrics
       • Site Selection: Identify optimal locations for future expansion and benchmark performance across existing locations.
       • Target Marketing: Develop effective marketing strategies to acquire more customers.
       • Integrate GapMaps demographic data with your existing GIS or BI platform to generate powerful visualizations.

    2. Commercial Real-Estate (Brokers, Developers, Investors, Single & Multi-tenant O/O)
       • Tenant Recruitment
       • Target Marketing
       • Market Potential / Gap Analysis

    3. Marketing / Advertising (Billboards/OOH, Marketing Agencies, Indoor Screens)
       • Customer Profiling
       • Target Marketing
       • Market Share Analysis

  15. Gender Detection & Classification - Face Dataset

    • kaggle.com
    Updated Oct 31, 2023
    Cite
    Training Data (2023). Gender Detection & Classification - Face Dataset [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/gender-detection-and-classification-image-dataset
    Dataset updated
    Oct 31, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Training Data
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Gender Detection & Classification - face recognition dataset

    The dataset is created on the basis of Face Mask Detection dataset

    Dataset Description:

    The dataset comprises a collection of photos of people, organized into folders labeled "women" and "men." Each folder contains a significant number of images to facilitate training and testing of gender detection algorithms or models.

    The dataset contains a variety of images capturing female and male individuals from diverse backgrounds, age groups, and ethnicities.

    This labeled dataset can be utilized as training data for machine learning models, computer vision applications, and gender detection algorithms.

    💴 For Commercial Usage: Full version of the dataset includes 376 000+ photos of people, leave a request on TrainingData to buy the dataset

    Metadata for the full dataset:

    • assignment_id - unique identifier of the media file
    • worker_id - unique identifier of the person
    • age - age of the person
    • true_gender - gender of the person
    • country - country of the person
    • ethnicity - ethnicity of the person
    • photo_1_extension, photo_2_extension, photo_3_extension, photo_4_extension - photo extensions in the dataset
    • photo_1_resolution, photo_2_resolution, photo_3_resolution, photo_4_resolution - photo resolution in the dataset

    💴 Buy the Dataset: This is just an example of the data. Leave a request on https://trainingdata.pro/datasets to learn about the price and buy the dataset

    Content

    The dataset is split into train and test folders. Each folder includes:

    • folders women and men: folders with images of people of the corresponding gender
    • .csv file: contains information about the images and people in the dataset

    File with the extension .csv

    • file: link to access the file,
    • gender: gender of a person in the photo (woman/man),
    • split: classification on train and test

    TrainingData provides high-quality data annotation tailored to your needs

    keywords: biometric system, biometric system attacks, biometric dataset, face recognition database, face recognition dataset, face detection dataset, facial analysis, gender detection, supervised learning dataset, gender classification dataset, gender recognition dataset

  16. 2023 Census population change by ethnic group and statistical area 2

    • datafinder.stats.govt.nz
    csv, dwg, geodatabase +6
    Cite
    Stats NZ, 2023 Census population change by ethnic group and statistical area 2 [Dataset]. https://datafinder.stats.govt.nz/layer/119483-2023-census-population-change-by-ethnic-group-and-statistical-area-2/
    Explore at:
    shapefile, pdf, kml, geodatabase, mapinfo tab, csv, dwg, mapinfo mif, geopackage / sqlite (available download formats)
    Dataset provided by
    Statistics New Zealand (http://www.stats.govt.nz/)
    Authors
    Stats NZ
    License

    https://datafinder.stats.govt.nz/license/attribution-4-0-international/

    Area covered
    Description

    Dataset contains ethnic group census usually resident population counts from the 2013, 2018, and 2023 Censuses, as well as the percentage change in the ethnic group population count between the 2013 and 2018 Censuses, and between the 2018 and 2023 Censuses. Data is available by statistical area 2.

    The ethnic groups are:

    • European
    • Māori
    • Pacific peoples
    • Asian
    • Middle Eastern/Latin American/African
    • Other ethnicity

    Map shows percentage change in the census usually resident population count for ethnic groups between the 2018 and 2023 Censuses.

    Download lookup file from Stats NZ ArcGIS Online or embedded attachment in Stats NZ geographic data service. Download data table (excluding the geometry column for CSV files) using the instructions in the Koordinates help guide.

    Footnotes

    Geographical boundaries

    Statistical standard for geographic areas 2023 (updated December 2023) has information about geographic boundaries as of 1 January 2023. Address data from 2013 and 2018 Censuses was updated to be consistent with the 2023 areas. Due to the changes in area boundaries and coding methodologies, 2013 and 2018 counts published in 2023 may be slightly different to those published in 2013 or 2018.

    Subnational census usually resident population

    The census usually resident population count of an area (subnational count) is a count of all people who usually live in that area and were present in New Zealand on census night. It excludes visitors from overseas, visitors from elsewhere in New Zealand, and residents temporarily overseas on census night. For example, a person who usually lives in Christchurch city and is visiting Wellington city on census night will be included in the census usually resident population count of Christchurch city. 

    Caution using time series

    Time series data should be interpreted with care due to changes in census methodology and differences in response rates between censuses. The 2023 and 2018 Censuses used a combined census methodology (using census responses and administrative data), while the 2013 Census used a full-field enumeration methodology (with no use of administrative data).

    About the 2023 Census dataset

    For information on the 2023 dataset see Using a combined census model for the 2023 Census. We combined data from the census forms with administrative data to create the 2023 Census dataset, which meets Stats NZ's quality criteria for population structure information. We added real data about real people to the dataset where we were confident the people who hadn’t completed a census form (which is known as admin enumeration) will be counted. We also used data from the 2018 and 2013 Censuses, administrative data sources, and statistical imputation methods to fill in some missing characteristics of people and dwellings.

    Data quality

    The quality of data in the 2023 Census is assessed using the quality rating scale and the quality assurance framework to determine whether data is fit for purpose and suitable for release. Data quality assurance in the 2023 Census has more information.

    Quality rating of a variable

    The quality rating of a variable provides an overall evaluation of data quality for that variable, usually at the highest levels of classification. The quality ratings shown are for the 2023 Census unless stated. There is variability in the quality of data at smaller geographies. Data quality may also vary between censuses, for subpopulations, or when cross tabulated with other variables or at lower levels of the classification. Data quality ratings for 2023 Census variables has more information on quality ratings by variable.

    Ethnicity concept quality rating

    Ethnicity is rated as high quality.

    Ethnicity – 2023 Census: Information by concept has more information, for example, definitions and data quality.

    Using data for good

    Stats NZ expects that, when working with census data, it is done so with a positive purpose, as outlined in the Māori Data Governance Model (Data Iwi Leaders Group, 2023). This model states that “data should support transformative outcomes and should uplift and strengthen our relationships with each other and with our environments. The avoidance of harm is the minimum expectation for data use. Māori data should also contribute to iwi and hapū tino rangatiratanga”.

    Confidentiality

    The 2023 Census confidentiality rules have been applied to 2013, 2018, and 2023 data. These rules protect the confidentiality of individuals, families, households, dwellings, and undertakings in 2023 Census data. Counts are calculated using fixed random rounding to base 3 (FRR3) and suppression of ‘sensitive’ counts less than six, where tables report multiple geographic variables and/or small populations. Individual figures may not always sum to stated totals. Applying confidentiality rules to 2023 Census data and summary of changes since 2018 and 2013 Censuses has more information about 2023 Census confidentiality rules.

    Symbol

    -998 Not applicable

    -999 Confidential

    Percentages

    To calculate percentages, divide the figure for the category of interest by the figure for ‘Total stated’ where this applies.
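    The percentage rule and the two symbol codes can be combined in a short sketch. This is an illustration rather than Stats NZ code: the function names are hypothetical, and treating -998 and -999 as missing values (returning None) is an assumption about how a user would want to handle them.

```python
from typing import Optional

NOT_APPLICABLE = -998  # symbol code listed above
CONFIDENTIAL = -999    # symbol code listed above


def clean_count(value: int) -> Optional[int]:
    """Turn the two symbol codes into None so they are never used as counts."""
    return None if value in (NOT_APPLICABLE, CONFIDENTIAL) else value


def percentage_of_total_stated(category: int, total_stated: int) -> Optional[float]:
    """Category count divided by 'Total stated', as a percentage."""
    c, t = clean_count(category), clean_count(total_stated)
    if c is None or t is None or t == 0:
        return None
    return c / t * 100


def percentage_change(earlier: int, later: int) -> Optional[float]:
    """Percentage change in a count between two censuses."""
    e, l = clean_count(earlier), clean_count(later)
    if e is None or l is None or e == 0:
        return None
    return (l - e) / e * 100
```

    Because published counts are randomly rounded to base 3, percentages computed this way are approximate for small populations.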

  17. Fish Dataset

    • kaggle.com
    Updated May 20, 2021
    Cite
    Alin Cijov (2021). Fish Dataset [Dataset]. https://www.kaggle.com/alincijov/fish-dataset/tasks
    Dataset updated
    May 20, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Alin Cijov
    Description

    Dataset

    Camper dataset from https://stats.idre.ucla.edu/r/dae/zip/. The dataset contains data on 250 groups that went to a park. Each group was questioned about how many fish they caught (count), how many children were in the group (child), how many people were in the group (persons), if they used a live bait and whether or not they brought a camper to the park (camper). The data is split into train and test datasets.
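    A train/test split of the 250 groups could be reproduced along these lines. The description does not say how the split was made, so the function name, the 20% test fraction, and the fixed seed are all assumptions chosen for illustration.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_test_split(rows: Sequence[T], test_fraction: float = 0.2, seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of the rows reproducibly and cut off a test portion."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Row indices stand in for the 250 groups described above.
train_rows, test_rows = train_test_split(range(250))
```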

    Acknowledgements

    University of California, Los Angeles (UCLA) Dataset.

  18. Tables on homelessness

    • gov.uk
    Updated Apr 30, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ministry of Housing, Communities and Local Government (2025). Tables on homelessness [Dataset]. https://www.gov.uk/government/statistical-data-sets/live-tables-on-homelessness
    Dataset updated
    Apr 30, 2025
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Ministry of Housing, Communities and Local Government
    Description

    Statutory homelessness live tables

    Statutory homelessness England Level Time Series

    Statutory homelessness England level time series "live tables" (ODS, 309 KB): https://assets.publishing.service.gov.uk/media/680f5de9dbea49d6a3305ec5/StatHomeless_202412.ods

    This file is in an OpenDocument format (https://www.gov.uk/guidance/using-open-document-formats-odf-in-your-organisation).

    Detailed local authority-level tables

    For quarterly local authority-level tables prior to the latest financial year, see the Statutory homelessness release pages.

    Statutory homelessness in England: October to December 2024 (ODS, 1.19 MB): https://assets.publishing.service.gov.uk/media/680f5e5c172df773f0305ec9/Detailed_LA_202412.ods

    This file is in an OpenDocument format (https://www.gov.uk/guidance/using-open-document-formats-odf-in-your-organisation).

  19. Access to Mental Health

    • hub.arcgis.com
    • share-open-data-njtpa.hub.arcgis.com
    Updated Dec 3, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Urban Observatory by Esri (2018). Access to Mental Health [Dataset]. https://hub.arcgis.com/maps/07f70065653b4386b5c87cbe9b50b314
    Dataset updated
    Dec 3, 2018
    Dataset provided by
    Esri (http://esri.com/)
    Authors
    Urban Observatory by Esri
    Area covered
    Description

    This map shows the access to mental health providers in every county and state in the United States according to the 2024 County Health Rankings & Roadmaps data for counties, states, and the nation. It translates the numbers to explain how many additional mental health providers are needed in each county and state. According to the data, there are 319 people per mental health provider in the United States overall. The maps clearly illustrate that access to mental health providers varies widely across the country.

    The data comes from this County Health Rankings 2024 layer. An updated layer is usually published each year, which allows comparisons from year to year. This map contains layers for 2024 and also for 2022 as a comparison.

    County Health Rankings & Roadmaps (CHR&R), a program of the University of Wisconsin Population Health Institute with support provided by the Robert Wood Johnson Foundation, draws attention to why there are differences in health within and across communities by measuring the health of nearly all counties in the nation. This map's layers contain 2024 CHR&R data for nation, state, and county levels. The CHR&R Annual Data Release is compiled using county-level measures from a variety of national and state data sources. CHR&R provides a snapshot of the health of nearly every county in the nation. A wide range of factors influence how long and how well we live, including: opportunities for education, income, safe housing and the right to shape policies and practices that impact our lives and futures. Health Outcomes tell us how long people live on average within a community, and how people experience physical and mental health in a community. Health Factors represent the things we can improve to support longer and healthier lives. They are indicators of the future health of our communities.

    Some example measures are:

    • Life Expectancy
    • Access to Exercise Opportunities
    • Uninsured
    • Flu Vaccinations
    • Children in Poverty
    • School Funding Adequacy
    • Severe Housing Cost Burden
    • Broadband Access

    To see a full list of variables, definitions and descriptions, explore the Fields information by clicking the Data tab here in the Item Details of this layer. For full documentation, visit the Measures page on the CHR&R website.

    Notable changes in the 2024 CHR&R Annual Data Release:

    • Measures of birth and death now provide more detailed race categories including a separate category for ‘Native Hawaiian or Other Pacific Islander’ and a ‘Two or more races’ category where possible. Find more information on the CHR&R website.
    • Ranks are no longer calculated nor included in the dataset. CHR&R introduced a new graphic to the County Health Snapshots on their website that shows how a county fares relative to other counties in a state and nation.

    Data Processing:

    County Health Rankings data and metadata were prepared and formatted for Living Atlas use by the CHR&R team. 2021 U.S. boundaries are used in this dataset for a total of 3,143 counties. Analytic data files can be downloaded from the CHR&R website.

  20. Effect of suicide rates on life expectancy dataset

    • zenodo.org
    • data.niaid.nih.gov
    csv
    Updated Apr 16, 2021
    Cite
    Filip Zoubek; Filip Zoubek (2021). Effect of suicide rates on life expectancy dataset [Dataset]. http://doi.org/10.5281/zenodo.4694270
    Explore at:
    csv (available download formats)
    Dataset updated
    Apr 16, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Filip Zoubek; Filip Zoubek
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    Effect of suicide rates on life expectancy dataset

    Abstract
    In 2015, approximately 55 million people died worldwide, of which 8 million committed suicide. In the USA, suicide is one of the main causes of death. This experiment therefore deals with the question of how much suicide rates affect the statistics of average life expectancy.
    The experiment takes two datasets, one with the number of suicides and the other with life expectancy, and combines them into one dataset. Subsequently, I try to find any patterns and correlations among the variables and perform a statistical test using simple regression to confirm my assumptions.

    Data

    The experiment uses two datasets, WHO Suicide Statistics[1] and WHO Life Expectancy[2], which were first preprocessed. The final merged dataset for the experiment has 13 variables, with country and year used as the index: Country; Year; Suicides number; Life expectancy; Adult Mortality (probability of dying between 15 and 60 years per 1000 population); Infant deaths (number of infant deaths per 1000 population); Alcohol (recorded per capita (15+) alcohol consumption); Under-five deaths (number of under-five deaths per 1000 population); HIV/AIDS (deaths per 1,000 live births from HIV/AIDS); GDP (Gross Domestic Product per capita); Population; Income composition of resources (Human Development Index in terms of income composition of resources); and Schooling (number of years of schooling).
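The merge on country and year described above can be sketched as an inner join keyed on the (Country, Year) pair. This is a minimal sketch using only the standard library; the function name `merge_on_country_year`, the row layout, and the sample values are assumptions for illustration, not the author's preprocessing code or real figures.

```python
def merge_on_country_year(suicide_rows, life_rows):
    """Inner-join two lists of dict rows on (Country, Year).

    Returns a dict indexed by (country, year) whose values hold the
    remaining columns from both inputs.
    """
    # Index the life-expectancy rows by their key for constant-time lookup.
    life_by_key = {(row["Country"], row["Year"]): row for row in life_rows}

    merged = {}
    for row in suicide_rows:
        key = (row["Country"], row["Year"])
        match = life_by_key.get(key)
        if match is None:
            continue  # keep only country-years present in both sources
        combined = {**match, **row}
        # Country and Year become the index, so drop them from the columns.
        merged[key] = {
            name: value
            for name, value in combined.items()
            if name not in ("Country", "Year")
        }
    return merged


# Hypothetical sample rows (made-up values, not from the WHO data).
suicide_rows = [
    {"Country": "A", "Year": 2010, "Suicides number": 120},
    {"Country": "B", "Year": 2010, "Suicides number": 45},
]
life_rows = [
    {"Country": "A", "Year": 2010, "Life expectancy": 78.1, "Schooling": 12.5},
    {"Country": "C", "Year": 2010, "Life expectancy": 70.3, "Schooling": 9.0},
]
merged = merge_on_country_year(suicide_rows, life_rows)
```

An inner join is one possible choice here; it silently drops country-years missing from either source, which may or may not match how the original merged dataset was built.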

    LICENSE

    The experiment uses two datasets, WHO Suicide Statistics and WHO Life Expectancy, which were collected from the WHO and United Nations websites. Therefore, all datasets are under the Attribution-NonCommercial-ShareAlike 3.0 IGO license (https://creativecommons.org/licenses/by-nc-sa/3.0/igo/).

    [1] https://www.kaggle.com/szamil/who-suicide-statistics

    [2] https://www.kaggle.com/kumarajarshi/life-expectancy-who

