100+ datasets found
  1. Illinois annual income distribution by work experience and gender dataset...

    • neilsberg.com
    csv, json
    Updated Jan 9, 2024
    Cite
    Neilsberg Research (2024). Illinois annual income distribution by work experience and gender dataset (Number of individuals ages 15+ with income, 2022) [Dataset]. https://www.neilsberg.com/research/datasets/23ca91c7-981b-11ee-99cf-3860777c1fe6/
    Available download formats
    csv, json
    Dataset updated
    Jan 9, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Illinois
    Variables measured
    Income for Male Population, Income for Female Population, Income for Male Population working full time, Income for Male Population working part time, Income for Female Population working full time, Income for Female Population working part time, Number of males working full time for a given income bracket, Number of males working part time for a given income bracket, Number of females working full time for a given income bracket, Number of females working part time for a given income bracket
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2022 1-Year Estimates. To portray the number of individuals of each gender (male and female) within each income bracket, we conducted an initial analysis and categorization of the American Community Survey data. Households are categorized, and median incomes are reported, based on the self-identified gender of the head of the household. For additional information about these estimations, please contact us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents a detailed breakdown of the count of individuals within distinct income brackets, categorized by gender (men and women) and employment type - full-time (FT) and part-time (PT). It can be used to study gender-based income distribution within the Illinois population and to support data analysis and decision-making.

    Key observations

    • Employment patterns: Within Illinois, among individuals aged 15 years and older with income, there were 4.57 million men and 4.51 million women in the workforce. Among them, 2.58 million men were engaged in full-time, year-round employment, while 2.00 million women were in full-time, year-round roles.
    • Annual income under $24,999: Of the male population working full-time, 6.54% fell within the income range of under $24,999, while 9.50% of the female population working full-time was represented in the same income bracket.
    • Annual income above $100,000: 29.19% of men in full-time roles earned incomes exceeding $100,000, while 17.91% of women in full-time positions earned within this income bracket.
    • Refer to the research insights for key observations on the other income brackets (annual income under $24,999, between $25,000 and $49,999, between $50,000 and $74,999, between $75,000 and $99,999, and above $100,000) and employment types (full-time year-round and part-time).

    Figure: Illinois gender and employment-based income distribution analysis (Ages 15+). https://i.neilsberg.com/ch/illinois-income-distribution-by-gender-and-employment-type.jpeg

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2022 1-Year Estimates.

    Income brackets:

    • $1 to $2,499 or loss
    • $2,500 to $4,999
    • $5,000 to $7,499
    • $7,500 to $9,999
    • $10,000 to $12,499
    • $12,500 to $14,999
    • $15,000 to $17,499
    • $17,500 to $19,999
    • $20,000 to $22,499
    • $22,500 to $24,999
    • $25,000 to $29,999
    • $30,000 to $34,999
    • $35,000 to $39,999
    • $40,000 to $44,999
    • $45,000 to $49,999
    • $50,000 to $54,999
    • $55,000 to $64,999
    • $65,000 to $74,999
    • $75,000 to $99,999
    • $100,000 or more

    Variables / Data Columns

    • Income Bracket: This column lists the 20 income brackets, ranging from $1 to $100,000 or more.
    • Full-Time Males: The count of males employed full-time year-round and earning within a specified income bracket
    • Part-Time Males: The count of males employed part-time and earning within a specified income bracket
    • Full-Time Females: The count of females employed full-time year-round and earning within a specified income bracket
    • Part-Time Females: The count of females employed part-time and earning within a specified income bracket

    Employment type classifications include:

    • Full-time, year-round: A full-time, year-round worker is a person who worked full time (35 or more hours per week) and 50 or more weeks during the previous calendar year.
    • Part-time: A part-time worker is a person who worked less than 35 hours per week during the previous calendar year.
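The percentage figures in the key observations can be reproduced from the bracket counts. The following is a minimal sketch, assuming the CSV has been loaded into a list of dictionaries keyed by the column names listed above; the helper names and the sample counts are hypothetical placeholders, not values from the dataset.

```python
import re


def bracket_upper_bound(label):
    """Return the upper dollar bound of a bracket label, or None for the open-ended top bracket."""
    if "or more" in label:
        return None
    amounts = re.findall(r"\$([\d,]+)", label)
    # The last dollar amount in the label is the upper bound, e.g. "$2,499".
    return int(amounts[-1].replace(",", ""))


def percent_below(rows, column, limit):
    """Percentage of the people counted in `column` whose bracket lies entirely below `limit`."""
    total = sum(row[column] for row in rows)
    if total == 0:
        return 0.0
    below = 0
    for row in rows:
        upper = bracket_upper_bound(row["Income Bracket"])
        if upper is not None and upper < limit:
            below += row[column]
    return 100.0 * below / total


# Hypothetical counts for illustration only (not taken from the dataset).
sample_rows = [
    {"Income Bracket": "$1 to $2,499 or loss", "Full-Time Males": 10},
    {"Income Bracket": "$22,500 to $24,999", "Full-Time Males": 10},
    {"Income Bracket": "$25,000 to $29,999", "Full-Time Males": 30},
    {"Income Bracket": "$100,000 or more", "Full-Time Males": 50},
]

print(percent_below(sample_rows, "Full-Time Males", 25_000))
```

Parsing the upper bound from the label avoids hard-coding which of the 20 brackets fall under a given threshold, at the cost of assuming every label contains at least one dollar amount.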

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Illinois median household income by gender.

  2. World Top Incomes Database - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Oct 28, 2023
    Cite
    (2023). World Top Incomes Database - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/dfc6e1ca-ae47-561c-b49a-a735d4943793
    Dataset updated
    Oct 28, 2023
    Area covered
    World
    Description

    The World Top Incomes Database provides statistical information on the shares of top income groups for 30 countries. The construction of this database was possible thanks to the research of over thirty contributing authors.

    There has been a marked revival of interest in the study of the distribution of top incomes using tax data. Beginning with the research by Thomas Piketty of the long-run distribution of top incomes in France, a succession of studies has constructed top income share time series over the long run for more than twenty countries to date. These projects have generated a large volume of data, which are intended as a research resource for further analysis. In using data from income tax records, these studies use similar sources and methods as the pioneering work by Kuznets for the United States.

    The findings of recent research are of added interest, since the new data provide estimates covering nearly all of the twentieth century, a length of time series unusual in economics. In contrast to existing international databases, generally restricted to the post-1970 or post-1980 period, the top income data cover a much longer period, which is important because structural changes in income and wealth distributions often span several decades. The data series is fairly homogeneous across countries, annual, long-run, and broken down by income source for several cases.

    Users should also be aware of the limitations of the data. Firstly, the series measure only top income shares and hence are silent on how inequality evolves elsewhere in the distribution. Secondly, the series are largely concerned with gross incomes before tax. Thirdly, the definition of income and the unit of observation (the individual vs. the family) vary across countries, making comparability of levels across countries more difficult. Even within a country, there are breaks in comparability that arise because of changes in tax legislation affecting the definition of income, although most studies try to correct for such changes to create homogeneous series. Finally and perhaps most important, the series might be biased because of tax avoidance and tax evasion.

    The first theme of the research programme is the assembly and analysis of historical evidence from fiscal records on the long-run development of economic inequality. “Long run” is a relative term, and here it means evidence dating back before the Second World War, and extending where possible back into the nineteenth century. The time span is determined by the sources used, which are based on taxes on incomes, earnings, wealth and estates. Perspective on current concerns is provided by the past, but also by comparison with other countries. The second theme of the research programme is that of cross-country comparisons. The research is not limited to OECD countries and will draw on evidence globally. In order to understand the drivers of inequality, it is necessary to consider the sources of economic advantage. The third theme is the analysis of the sources of income, considering separately the roles of earned incomes and property income, and examining the historical and comparative evolution of earned and property income, and their joint distribution. The fourth theme is the long-run trend in the distribution of wealth and its transmission through inheritance. Here again there are rich fiscal data on the passing of estates at death.

    The top income share series are constructed, in most of the cases presented in this database, using tax statistics (China is an exception; for the time being the estimates come from household surveys). The use of tax data is often regarded by economists with considerable disbelief. These doubts are well justified for at least two reasons. The first is that tax data are collected as part of an administrative process, which is not tailored to the scientists' needs, so that the definition of income, income unit, etc., are not necessarily those that we would have chosen. This causes particular difficulties for comparisons across countries, but also for time-series analysis where there have been substantial changes in the tax system, such as the moves to and from the joint taxation of couples. Secondly, it is obvious that those paying tax have a financial incentive to present their affairs in a way that reduces tax liabilities. There is tax avoidance and tax evasion. The rich, in particular, have a strong incentive to understate their taxable incomes. Those with wealth take steps to ensure that the return comes in the form of asset appreciation, typically taxed at lower rates or not at all. Those with high salaries seek to ensure that part of their remuneration comes in forms, such as fringe benefits or stock options, which receive favorable tax treatment. Both groups may make use of tax havens that allow income to be moved beyond the reach of the national tax net. These shortcomings limit what can be said from tax data, but this does not mean that the data are worthless. Like all economic data, they measure with error the 'true' variable in which we are interested.

    References

    • Atkinson, Anthony B. and Thomas Piketty (2007). Top Incomes over the Twentieth Century: A Contrast between Continental European and English-Speaking Countries (Volume 1). Oxford: Oxford University Press, 585 pp.
    • Atkinson, Anthony B. and Thomas Piketty (2010). Top Incomes over the Twentieth Century: A Global Perspective (Volume 2). Oxford: Oxford University Press, 776 pp.
    • Atkinson, Anthony B., Thomas Piketty and Emmanuel Saez (2011). Top Incomes in the Long Run of History, Journal of Economic Literature, 49(1), pp. 3-71.
    • Kuznets, Simon (1953). Shares of Upper Income Groups in Income and Savings. New York: National Bureau of Economic Research, 707 pp.
    • Piketty, Thomas (2001). Les Hauts Revenus en France au 20ème siècle. Paris: Grasset, 807 pp.
    • Piketty, Thomas (2003). Income Inequality in France, 1901-1998, Journal of Political Economy, 111(5), pp. 1004-42.

  3. Income Distribution by Quintile: Mean Household Income in Florence, SC //...

    • neilsberg.com
    csv, json
    Updated Mar 3, 2025
    Cite
    Neilsberg Research (2025). Income Distribution by Quintile: Mean Household Income in Florence, SC // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/florence-sc-median-household-income/
    Available download formats
    csv, json
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Florence, South Carolina
    Variables measured
    Income Level, Mean Household Income
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It delineates income distributions across income quintiles (mentioned above) following an initial analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). For additional information about these estimations, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents the mean household income for each of the five quintiles in Florence, SC, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.

    Key observations

    • Income disparities: The mean income of the lowest quintile (the 20% of households with the lowest income) is $10,346, while the mean income of the highest quintile (the 20% of households with the highest income) is $208,770. The mean income of the highest quintile is therefore about 20 times that of the lowest quintile.
    • Top 5%: The mean household income for the wealthiest population (top 5%) is $368,659, which is 76.59% higher than the mean for the highest quintile (176.59% of it) and 3,463.30% higher than the mean for the lowest quintile (3,563.30% of it).
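The comparisons in these observations are plain ratios of the three reported means. The sketch below recomputes them and separates "X% of" from "X% higher than", which differ by 100 percentage points; the function names are illustrative and not part of the dataset.

```python
# Mean household incomes reported in the key observations.
lowest_quintile = 10_346
highest_quintile = 208_770
top_5_percent = 368_659


def percent_of(value, base):
    """`value` expressed as a percentage of `base`."""
    return value / base * 100


def percent_higher(value, base):
    """How much larger `value` is than `base`, as a percentage of `base`."""
    return (value - base) / base * 100


print(f"{highest_quintile / lowest_quintile:.2f}")                # 20.18
print(f"{percent_of(top_5_percent, highest_quintile):.2f}")      # 176.59
print(f"{percent_higher(top_5_percent, highest_quintile):.2f}")  # 76.59
print(f"{percent_of(top_5_percent, lowest_quintile):.2f}")       # 3563.30
print(f"{percent_higher(top_5_percent, lowest_quintile):.2f}")   # 3463.30
```

A ratio of about 1.77 therefore means the top 5% mean is 176.59% of the highest-quintile mean, or 76.59% higher than it.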
    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Income Levels:

    • Lowest Quintile
    • Second Quintile
    • Third Quintile
    • Fourth Quintile
    • Highest Quintile
    • Top 5 Percent

    Variables / Data Columns

    • Income Level: This column showcases the income levels (As mentioned above).
    • Mean Household Income: Mean household income, in 2023 inflation-adjusted dollars for the specific income level.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Florence median household income.

  4. Income of individuals by age group, sex and income source, Canada, provinces...

    • www150.statcan.gc.ca
    • ouvert.canada.ca
    • +2more
    Updated May 1, 2025
    Cite
    Government of Canada, Statistics Canada (2025). Income of individuals by age group, sex and income source, Canada, provinces and selected census metropolitan areas [Dataset]. http://doi.org/10.25318/1110023901-eng
    Dataset updated
    May 1, 2025
    Dataset provided by
    Statistics Canada, https://statcan.gc.ca/en
    Area covered
    Canada
    Description

    Income of individuals by age group, sex and income source, Canada, provinces and selected census metropolitan areas, annual.

  5. Geodatabase of the available top and bottom surface datasets that represent...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Oct 5, 2024
    Cite
    U.S. Geological Survey (2024). Geodatabase of the available top and bottom surface datasets that represent the Mississippian aquifer, Alabama, Illinois, Indiana, Iowa, Kentucky, Maryland, Missouri, Ohio, Pennsylvania, Tennessee, Virginia and West Virginia [Dataset]. https://catalog.data.gov/dataset/geodatabase-of-the-available-top-and-bottom-surface-datasets-that-represent-the-mississipp
    Dataset updated
    Oct 5, 2024
    Dataset provided by
    United States Geological Survey, http://www.usgs.gov/
    Area covered
    Iowa, Missouri, Illinois, Alabama, Tennessee, Virginia, West Virginia, Pennsylvania
    Description

    This geodatabase includes spatial datasets that represent the Mississippian aquifer in the States of Alabama, Illinois, Indiana, Iowa, Kentucky, Maryland, Missouri, Ohio, Pennsylvania, Tennessee, Virginia and West Virginia. The aquifer is divided into three subareas, based on the data availability. In subarea 1 (SA1), which is the aquifer extent in Iowa, data exist of the aquifer top altitude and aquifer thickness. In subarea 2 (SA2), which is the aquifer extent in Missouri, data exist of the aquifer top and bottom aquifer surface altitudes. In subarea 3 (SA3), which is the aquifer area of the remaining States, no altitude or thickness data exist.

    Included in this geodatabase are:

    (1) a feature dataset "ds40MSSPPI_altitude_and_thickness_contours" that includes aquifer altitude and thickness contours used to generate the surface rasters for SA1 and SA2,
    (2) a feature dataset "ds40MSSPPI_extents" that includes a polygon dataset that represents the subarea extents, a polygon dataset that represents the combined overall aquifer extent, and a polygon dataset of the Ft. Dodge Fault and Manson Anomaly,
    (3) raster datasets that represent the altitude of the top and the bottom of the aquifer in SA1 and SA2, and
    (4) georeferenced images of the figures that were digitized to create the aquifer top- and bottom-altitude contours or aquifer thickness contours for SA1 and SA2.

    The images and digitized contours are supplied for reference. The extent of the Mississippian aquifer for all subareas was produced from the digital version of the HA-730 Mississippian aquifer extent (USGS HA-730). For the two subareas with vertical-surface information, SA1 and SA2, data were retrieved from the sources as described below.

    1. The aquifer-altitude contours for the top and the aquifer-thickness contours for the top-to-bottom thickness of SA1 were received in digital format from the Iowa Geologic Survey. The URL for the top was ftp://ftp.igsb.uiowa.edu/GIS_Library/IA_State/Hydrologic/Ground_Waters/ Mississippian_aquifer/mississippian_topography.zip. The URL for the thickness was ftp://ftp.igsb.uiowa.edu/GIS_Library/IA_State/Hydrologic/Ground_Waters/ Mississippian_aquifer/mississippian_isopach.zip. Reference for the top map is Altitude and Configuration, in feet above mean sea level, of the Mississipian Aquifer modified from a scanned image of Map 1, Sheet 1, Miscellaneous Map Series 3, Mississippian Aquifer of Iowa by P.J. Horick and W.L. Steinhilber, Iowa Geological Survey, 1973; IGS MMS-3, Map 1, Sheet 1. Reference for the thickness map is Distribution and isopach thickness, in feet, of the Mississipian Aquifer, modified from a scanned image of Map 1, Sheet 2, Miscellaneous Map Series 3, Mississippian Aquifer of Iowa by P.J. Horick and W.L. Steinhilber, Iowa Geological Survey, 1973; IGS MMS-3, Map 1, Sheet 2.

    The altitude contours for the top and bottom of SA2 were digitized from georeferenced figures of altitude contours in U.S. Geological Survey Professional Paper 1305 (USGS PP1305), figure 6 (for the top surface) and figure 9 (for the bottom surface).

    The altitude contours for SA1 and SA2 were interpolated into surface rasters within a GIS using tools that create hydrologically correct surfaces from contour data, derive the altitude from the thickness (depth from the land surface), and merge the subareas into a single surface. The primary tool was an enhanced version of "Topo to Raster" used in ArcGIS ArcMap (Esri, 2014. ArcGIS Desktop: Release 10.2. Redlands, CA: Environmental Systems Research Institute). The raster surfaces were corrected in areas where the altitude of the top of the aquifer exceeded the land surface, and where the bottom of an aquifer exceeded the altitude of the corrected top of the aquifer.
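The final correction step can be illustrated on small grids. This is only a sketch under a stated assumption: the description does not say how the rasters were corrected, so the code assumes the simplest rule, clamping the top to the land surface and then clamping the bottom to the corrected top. The function name and the sample altitudes are hypothetical.

```python
def correct_surfaces(land, top, bottom):
    """Clamp the aquifer top to the land surface, then the bottom to the corrected top.

    Each argument is a grid (list of rows) of altitudes on the same cell layout.
    Clamping is an assumed correction rule, not one documented in the source.
    """
    corrected_top = [
        [min(t, g) for t, g in zip(top_row, land_row)]
        for top_row, land_row in zip(top, land)
    ]
    corrected_bottom = [
        [min(b, t) for b, t in zip(bottom_row, top_row)]
        for bottom_row, top_row in zip(bottom, corrected_top)
    ]
    return corrected_top, corrected_bottom


# Hypothetical altitudes: the first cell's top rises above the land surface,
# and the second cell's bottom rises above its top.
land = [[100, 100]]
top = [[120, 90]]
bottom = [[95, 95]]

new_top, new_bottom = correct_surfaces(land, top, bottom)
print(new_top)     # [[100, 90]]
print(new_bottom)  # [[95, 90]]
```

The order matters: the bottom is compared against the corrected top, matching the wording of the description.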

  6. Meta Kaggle Code

    • kaggle.com
    zip
    Updated Aug 7, 2025
    Cite
    Kaggle (2025). Meta Kaggle Code [Dataset]. https://www.kaggle.com/datasets/kaggle/meta-kaggle-code/code
    Available download formats
    zip (151993733346 bytes)
    Dataset updated
    Aug 7, 2025
    Dataset authored and provided by
    Kaggle, http://kaggle.com/
    License

    Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Explore our public notebook content!

    Meta Kaggle Code is an extension to our popular Meta Kaggle dataset. This extension contains all the raw source code from hundreds of thousands of public, Apache 2.0 licensed Python and R notebook versions on Kaggle used to analyze Datasets, make submissions to Competitions, and more. This represents nearly a decade of data spanning a period of tremendous evolution in the ways ML work is done.

    Why we’re releasing this dataset

    By collecting all of this code created by Kaggle’s community in one dataset, we hope to make it easier for the world to research and share insights about trends in our industry. With the growing significance of AI-assisted development, we expect this data can also be used to fine-tune models for ML-specific code generation tasks.

    Meta Kaggle for Code is also a continuation of our commitment to open data and research. This new dataset is a companion to Meta Kaggle which we originally released in 2016. On top of Meta Kaggle, our community has shared nearly 1,000 public code examples. Research papers written using Meta Kaggle have examined how data scientists collaboratively solve problems, analyzed overfitting in machine learning competitions, compared discussions between Kaggle and Stack Overflow communities, and more.

    The best part is Meta Kaggle enriches Meta Kaggle for Code. By joining the datasets together, you can easily understand which competitions code was run against, the progression tier of the code’s author, how many votes a notebook had, what kinds of comments it received, and much, much more. We hope the new potential for uncovering deep insights into how ML code is written feels just as limitless to you as it does to us!

    Sensitive data

    While we have made an attempt to filter out notebooks containing potentially sensitive information published by Kaggle users, the dataset may still contain such information. Research, publications, applications, etc. relying on this data should only use or report on publicly available, non-sensitive information.

    Joining with Meta Kaggle

    The files contained here are a subset of the KernelVersions in Meta Kaggle. The file names match the ids in the KernelVersions csv file. Whereas Meta Kaggle contains data for all interactive and commit sessions, Meta Kaggle Code contains only data for commit sessions.

    File organization

    The files are organized into a two-level directory structure. Each top-level folder contains up to 1 million files; for example, folder 123 contains all versions from 123,000,000 to 123,999,999. Each subfolder contains up to 1 thousand files; for example, 123/456 contains all versions from 123,456,000 to 123,456,999. In practice, each folder will have many fewer than 1 thousand files due to private and interactive sessions.
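The two-level layout can be derived from a version id with integer division. The sketch below is a minimal illustration, assuming folder names are not zero-padded (the description only shows 123/456, so padding for smaller numbers is unknown); the function name is hypothetical.

```python
def version_folder(version_id: int) -> str:
    """Return the 'top/sub' folder that should hold a given KernelVersions id.

    Assumes folder names are plain integers without zero padding.
    """
    top = version_id // 1_000_000        # one top-level folder per million ids
    sub = (version_id // 1_000) % 1_000  # one subfolder per thousand ids
    return f"{top}/{sub}"


print(version_folder(123_456_789))  # 123/456
print(version_folder(123_456_000))  # 123/456
print(version_folder(123_999_999))  # 123/999
```

Since the file names match the ids in the KernelVersions csv file, this mapping can be used to locate the source for a row of that file.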

    The ipynb files in this dataset hosted on Kaggle do not contain the output cells. If the outputs are required, the full set of ipynbs with the outputs embedded can be obtained from this public GCS bucket: kaggle-meta-kaggle-code-downloads. Note that this is a "requester pays" bucket. This means you will need a GCP account with billing enabled to download. Learn more here: https://cloud.google.com/storage/docs/requester-pays

    Questions / Comments

    We love feedback! Let us know in the Discussion tab.

    Happy Kaggling!

  7. Dataset for Upper Arlington, OH Census Bureau Income Distribution by Gender

    • neilsberg.com
    Updated Jan 9, 2024
    Cite
    Neilsberg Research (2024). Dataset for Upper Arlington, OH Census Bureau Income Distribution by Gender [Dataset]. https://www.neilsberg.com/research/datasets/b3d9102d-abcb-11ee-8b96-3860777c1fe6/
    Dataset updated
    Jan 9, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Upper Arlington, Ohio
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates Upper Arlington household income by gender. It can be used to understand the gender-based distribution of income in Upper Arlington.

    Content

    This dataset includes the following datasets, when applicable:

    Please note: The 2020 1-Year ACS estimates data was not reported by the Census Bureau due to the impact on survey collection and analysis caused by COVID-19. Consequently, median household income data for 2020 is unavailable for large cities (population 65,000 and above).

    • Upper Arlington, OH annual median income by work experience and sex dataset : Aged 15+, 2010-2022 (in 2022 inflation-adjusted dollars)
    • Upper Arlington, OH annual income distribution by work experience and gender dataset (Number of individuals ages 15+ with income, 2021)

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Interested in deeper insights and visual analysis?

    Explore our comprehensive data analysis and visual representations for a deeper understanding of Upper Arlington income distribution by gender.

  8. China Average Yearly Wages

    • tradingeconomics.com
    • de.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Updated Jun 15, 2025
    Cite
    TRADING ECONOMICS (2025). China Average Yearly Wages [Dataset]. https://tradingeconomics.com/china/wages
    Available download formats
    json, xml, csv, excel
    Dataset updated
    Jun 15, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 31, 1952 - Dec 31, 2024
    Area covered
    China
    Description

    Wages in China increased to 120698 CNY/Year in 2023 from 114029 CNY/Year in 2022. This dataset provides China Average Yearly Wages: actual values, historical data, forecast, chart, statistics, economic calendar and news.

  9. Aquifer framework datasets used to represent the Willamette Lowland...

    • gimi9.com
    Updated Sep 26, 2024
    Cite
    (2024). Aquifer framework datasets used to represent the Willamette Lowland basin-fill aquifers, Oregon, Washington | gimi9.com [Dataset]. https://gimi9.com/dataset/data-gov_aquifer-framework-datasets-used-to-represent-the-willamette-lowland-basin-fill-aquifers-or/
    Dataset updated
    Sep 26, 2024
    Area covered
    Willamette Valley, Oregon, Washington County
    Description

    The Willamette Lowland basin-fill aquifers (hereinafter referred to as the Willamette aquifer) are located in Oregon and in southern Washington. The aquifer is composed of unconsolidated deposits of sand and gravel, which are interlayered with clay units. The aquifer thickness varies from less than 100 feet to 800 feet. The aquifer is underlain by basaltic rock. Cities such as Portland, Oregon, depend on the aquifer for public and industrial use (HA 730-H). This product provides source data for the Willamette aquifer framework, including:

    Georeferenced images:
    1. i_08WLMLWD_bot.tif: Georeferenced figure of altitude contour lines representing the bottom of the Willamette aquifer. The original figure was from Professional Paper 1424-A, Plate 2 (1424-A-P2). The contour lines from this figure were digitized to make the file c_08WLMLWD_bot.shp, and the fault lines were digitized to make f_08WLMLWD_bot.shp.

    Extent shapefiles:
    1. p_08WLMLWD.shp: Polygon shapefile containing the areal extent of the Willamette aquifer (Willamette_AqExtent). The original shapefile was modified to create the shapefile included in this data release. It was modified to only include the Willamette Lowland portion of the aquifer. The extent file contains no aquifer subunits.

    Contour line shapefiles:
    1. c_08WLMLWD_bot.shp: Contour line dataset containing altitude values, in feet, referenced to National Geodetic Vertical Datum of 1929 (NGVD29), across the bottom of the Willamette aquifer. These data were used to create the ra_08WLMLWD_bot.tif raster dataset.

    Fault line shapefiles:
    1. f_08WLMLWD_bot.shp: Fault line dataset containing fault lines across the bottom of the Willamette aquifer. These data were not used in raster creation but were included as supplementary information.

    Altitude raster files:
    1. ra_08WLMLWD_top.tif: Altitude raster dataset of the top of the Willamette aquifer. The altitude values are in meters referenced to North American Vertical Datum of 1988 (NAVD88). The top of the aquifer is assumed to be land surface based on available data and was interpolated from the digital elevation model (DEM) dataset (NED, 100-meter).
    2. ra_08WLMLWD_bot.tif: Altitude raster dataset of the bottom of the Willamette aquifer. The altitude values are in meters referenced to NAVD88. This raster was interpolated from the c_08WLMLWD_bot.shp dataset.

    Depth raster files:
    1. rd_08WLMLWD_top.tif: Depth raster dataset of the top of the Willamette aquifer. The depth values are in meters below land surface (NED, 100-meter). The top of the aquifer is assumed to be land surface based on available data.
    2. rd_08WLMLWD_bot.tif: Depth raster dataset of the bottom of the Willamette aquifer. The depth values are in meters below land surface (NED, 100-meter).

  10. Data from: Global Soil Types, 0.5-Degree Grid (Modified Zobler)

    • s.cnmilf.com
    • data.globalchange.gov
    • +6 more
    Updated Jun 28, 2025
    Cite
    ORNL_DAAC (2025). Global Soil Types, 0.5-Degree Grid (Modified Zobler) [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/global-soil-types-0-5-degree-grid-modified-zobler-9dd94
    Explore at:
    Dataset updated
    Jun 28, 2025
    Dataset provided by
    Oak Ridge National Laboratory Distributed Active Archive Center
    Description

    A global data set of soil types is available at 0.5-degree latitude by 0.5-degree longitude resolution. There are 106 soil units, based on Zobler's (1986) assessment of the FAO/UNESCO Soil Map of the World. This data set is a conversion of the Zobler 1-degree resolution version to a 0.5-degree resolution. The resolution of the data set was not actually increased. Rather, the 1-degree squares were divided into four 0.5-degree squares with the necessary adjustment of continental boundaries and islands. The computer code used to convert the original 1-degree data to 0.5-degree is provided as a companion file. A JPG image of the data is provided in this document.

    The Zobler data (1-degree resolution) as distributed by Webb et al. (1993) [http://www.ngdc.noaa.gov/seg/eco/cdroms/gedii_a/datasets/a12/wr.htm#top] contains two columns, one column for continent and one column for soil type. The Soil Map of the World consists of 9 maps that represent parts of the world. The texture data that Webb et al. (1993) provided allowed for the fact that a soil type in one part of the world may have different properties than the same soil in a different part of the world. This continent-specific information is retained in this 0.5-degree resolution data set, as well as the soil type information, which is the second column.

    A code was written (one2half.c) to take the file CONTIZOB.LER distributed by Webb et al. (1993) [http://www.ngdc.noaa.gov/seg/eco/cdroms/gedii_a/datasets/a12/wr.htm#top] and simply divide the 1-degree cells into quarters. This code also reads in a land/water file (land.wave) that specifies the cells that are land at 0.5 degrees. The code checks for consistency between the newly quartered map and the land/water map to which the quartered map is to be registered. If there is a discrepancy between the two, the code attempts to make them consistent using the following logic. If the cell is supposed to be water, it is forced to be water. If it is supposed to be land but was resolved to water at 1 degree, the code looks at the surrounding 8 cells, picks the most frequent soil type, and assigns it to the cell. If there are no surrounding land cells, the cell is kept as water in the hope that on the next pass one or more of the surrounding cells might be converted from water to a soil type. The whole map is iterated 5 times. The remaining cells that should be land but couldn't be determined from surrounding cells (mostly islands that are resolved at 0.5 degree but not at 1 degree) are printed out with coordinate information. A temporary map is output with -9 indicating where data is required. This is repeated for the continent code in CONTIZOB.LER as well. A separate map of the temporary continent codes is produced with -9 indicating required data. A nearly identical code (one2half.c) does the same for the continent codes.

    The printout allows one to consult the printed versions of the soil map and look up the soil type with the largest coverage in the 0.5-degree cell. The program manfix.c then will go through the temporary map and prompt for input to correct both the soil codes and the continent codes for the map. This can be done manually or by preparing a file of changes (new_fix.dat) and redirecting stdin. A new complete version of the map is output. This is in the form of the original CONTIZOB.LER file (contizob.half) but four times larger.

    Original documentation and computer codes prepared by Post et al. (1996) are provided as companion files with this data set. Image of 106 global soil types available at 0.5-degree by 0.5-degree resolution. Additional documentation from Zobler's assessment of FAO soil units is available from the NASA Center for Scientific Information.
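
    The quartering and gap-filling logic described above can be sketched in a few lines of Python. This is only an illustration, not the one2half.c source: the water code `WATER = 0`, the function names, and the choice to compute each pass from a copy of the previous pass are all assumptions, and longitude wrap-around is ignored.

```python
from collections import Counter

WATER = 0      # assumed code for water cells (hypothetical)
MISSING = -9   # marker for land cells that still need data, as in the description

def quarter(grid):
    """Split every 1-degree cell into four 0.5-degree cells with the same value."""
    fine = []
    for row in grid:
        wide = [value for value in row for _ in range(2)]
        fine.append(wide)
        fine.append(list(wide))
    return fine

def reconcile(soil, land_mask, passes=5):
    """Make a quartered soil grid agree with a 0.5-degree land/water mask."""
    rows, cols = len(soil), len(soil[0])
    # Cells that are supposed to be water are forced to water.
    soil = [[soil[r][c] if land_mask[r][c] else WATER for c in range(cols)]
            for r in range(rows)]
    for _ in range(passes):
        updated = [row[:] for row in soil]
        for r in range(rows):
            for c in range(cols):
                if land_mask[r][c] and soil[r][c] == WATER:
                    # Collect soil types of the (up to) 8 surrounding land cells.
                    neighbours = [
                        soil[rr][cc]
                        for rr in range(max(r - 1, 0), min(r + 2, rows))
                        for cc in range(max(c - 1, 0), min(c + 2, cols))
                        if (rr, cc) != (r, c) and soil[rr][cc] != WATER
                    ]
                    if neighbours:
                        updated[r][c] = Counter(neighbours).most_common(1)[0][0]
        soil = updated
    # Land cells that could not be filled are flagged for manual correction.
    unresolved = [(r, c) for r in range(rows) for c in range(cols)
                  if land_mask[r][c] and soil[r][c] == WATER]
    for r, c in unresolved:
        soil[r][c] = MISSING
    return soil, unresolved

coarse = [[3, 0],
          [0, 0]]
mask = [[1, 1, 1, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 1]]
fixed, todo = reconcile(quarter(coarse), mask)
```

    In this toy case the isolated land cell in the bottom-right corner has no land neighbours, so it is left flagged with -9, which mirrors the islands that the description says had to be corrected by hand.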

  11. T

    United States Corporate Profits

    • tradingeconomics.com
    • jp.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Updated Jun 26, 2025
    Cite
    TRADING ECONOMICS (2025). United States Corporate Profits [Dataset]. https://tradingeconomics.com/united-states/corporate-profits
    Explore at:
    Available download formats: excel, xml, json, csv
    Dataset updated
    Jun 26, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 31, 1947 - Mar 31, 2025
    Area covered
    United States
    Description

    Corporate Profits in the United States decreased to 3203.60 USD Billion in the first quarter of 2025 from 3312 USD Billion in the fourth quarter of 2024. This dataset provides the latest reported value for - United States Corporate Profits - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
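
    As a quick check on the size of the decline quoted above, the quarter-over-quarter change can be computed directly. This is just a worked arithmetic sketch using the two figures from the description; the variable names are illustrative.

```python
# Values in USD billion, taken from the description above.
q4_2024 = 3312.0
q1_2025 = 3203.60

change = q1_2025 - q4_2024            # absolute change in USD billion
pct_change = change / q4_2024 * 100   # relative change in percent

print(f"Change: {change:.1f} USD billion ({pct_change:.2f}%)")
```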

  12. d

    Galilee geological model 25-05-15

    • data.gov.au
    • researchdata.edu.au
    zip
    Updated Apr 13, 2022
    Cite
    Bioregional Assessment Program (2022). Galilee geological model 25-05-15 [Dataset]. https://data.gov.au/data/dataset/bd1c35a0-52c4-421b-ac7d-651556670eb9
    Explore at:
    Available download formats: zip (122560650)
    Dataset updated
    Apr 13, 2022
    Dataset authored and provided by
    Bioregional Assessment Program
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    This dataset was derived by the Bioregional Assessment Programme. The parent datasets are identified in the Lineage statement in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

    This dataset comprises interpreted elevation surfaces and contours for the major Triassic and Upper Permian units of the Galilee Geological Basin.

    Purpose

    This dataset was created to provide formation extents for aquifers in the Galilee geological basin.

    Dataset History

    A Quality Assurance (QA) and validation process was conducted on the original well and bore data to choose wells/bores that are within 25 kilometres of the BA Galilee Region extent.

    The QA/Validation process is as follows:

    1. Well data

      a. Obtained excel file "QPED_July_2013_galilee.xlsx" from GA

      b. Based on stratigraphic information in "BH_costrat" tab formation names were regularised and simplified based on current naming conventions.

      c. Simplified names added to QPED_July_2013_galileet.xlsx as "Steve_geo" and "Steve_group"

      d. Produced new file "GSQ_Geology.xlsx" containing decimal latitude and longitude, KB elevation, top of unit in metres from KB, top of unit in metres AHD, bottom of unit in metres from KB, bottom of unit in metres AHD, original geology, simplified geology, simplified Group geology.

       i.     KB obtained from "BH_wellhist"
      
       ii.    Where no KB information was available (i.e. KB=0), sampled the 1S DEM at the well's location to obtain height and set KB=DEM+10. Marked well as having lower reliability.
      
       iii.    Calculated Top_m_AHD = KB - Top_m_KB
      
       iv.    Calculated Bottom_m_AHD = KB - Bottom_m_KB
      

      e. Brought GSQ_Geology.xlsx into ArcGIS

      f. Selected wells based on "Steve_geo" field for each model layer to produce a geodatabase for each layer.

       i.     GSQ_basement_wells
      
       ii.    GSQ_top_joe_joe_group
      
       iii.    GSQ_top_bandanna_merge
      
       iv.    GSQ_rewan_group
      
       v.     GSQ_clematis
      
       vi.    GSQ_moolyember
      

      g. Additional wells and reinterpreted tops added to appropriate geodatabase based on well completion reports

      h. Additional wells added to coverages to help model building process

       i.     Well_name listed as Fake
      
       ii.    Exception being GSQ_top_basement_fake which was created as a separate geodatabase
      
    2. Bore data

      a. Obtained QLD_DNRM_GroundwaterDatabaseExtract_20131111 from GA

      b. Used files REGISTRATIONS.txt, ELEVATIONS.txt and AQUIFER.txt to build GW_stratigraphy.xlsx

       i.     Based on RN
      
       ii.    Latitude from GIS_LAT (REGISTRATIONS.txt)
      
       iii.    Longitude from GIS_LNG (REGISTRATIONS.txt)
      
       iv.    Elevation from (ELEVATIONS.txt)
      
       v.     FORM_DESC from (AQUIFER.txt)
      
       vi.    Top from (AQUIFER.txt)
      
       vii.    Bottom from (AQUIFER.txt)
      

      c. Brought GW_stratigraphy.xlsx into ArcGIS

      d. Created gw_bores_galilee_dem

       i.     Sampled 1S DEM to obtain ground level elevation column RASTERVALU
      
       ii.    Created column top_m_AHD by RASTERVALU - Top
      

      e. Selected bores based on "FORM_DESC" field for each model layer to produce a geodatabase for each layer.

       i.     Gw_basement
      
       ii.    GW_bores_joe_joe_group
      
       iii.    GW_bores_bandanna
      
       iv.    Gw_bores_rewan
      
       v.     Gw_bores_clematis
      
       vi.    Gw_bores_moolyember
      
    3. Georectified seismic surfaces

      a. Extracted interpreted seismic surfaces for base Permian (interpreted as basement) and top Bandanna (in time) from the following seismic surveys

       i.     Y80A, W81A, Carmichael, Pendine, T81A, Quilpie, Ward and Powell Creek seismic survey downloaded https://qdexguest.deedi.qld.gov.au/portal/site/qdex/search?searchType=general 
      
       ii.    Brought TIF images into ArcGIS and georectified
      
       iii.    Digitised shape of contours and faults into geodatabase
      
           1.   Basement_contours and basement_faults
      
           2.   bandanna_contours_new_data and bandanna_faults
      
       iv.    Added field "contour" to geodatabase
      
       v.     Converted contours to depth in "contour" field based on well and bore data (top_m_AHD) and contour progression
      
       vi.    Use the shape and depth derived from OZ SEEBASE to help to add additional contours and faults to basement and bandanna datasets
      
    4. Additional contour and fault surfaces were built derived from underlying surfaces and wells/bore data

      a. Joejoe_contours and joejoe_faults

      b. Rewan_contour_clip (used bandanna_faults as fault coverage)

      c. Clematis_contour and clematis_faults

      d. Moolyember_contour (used clematis_faults as fault coverage)

    5. Surface geology

      a. Extracted surface geology from QUEENSLAND GEOLOGY_AUGUST_2012 using Galilee BA region boundary with 25 kilometre boundary to form geodatabase QLD_geology_galilee

      b. Selected relevant surface geology from QLD_geology_galilee based on field "Name" for each model layer and created new geodatabase layers

       i.     Basement_geology: Argentine Metamorphics,Running River Metamorphics,Charters Towers Metamorphics; Bimurra Volcanics, Foyle Volcanics, Mount Wyatt Formation, Saint Anns Formation, Silver Hills Volcanics, Stones Creek Volcanics; Bulliwallah Formation, Ducabrook Formation, Mount Rankin Formation, Natal Formation, Star of Hope Formation; Cape River Metamorphics; Einasleigh Metamorphics; Gem Park Granite; Macrossan Province Cambrian-Ordovician intrusives; Macrossan Province Ordovician-Silurian intrusives; Macrossan Province Ordovician intrusives; Mount Formartine, unnamed plutonic units; Pama Province Silurian-Devonian intrusives; Seventy Mile Range Group; and Kirk River beds, Les Jumelles beds.
      
       ii.    Joe_joe_geology: Joe Joe Group
      
       iii.    Galilee_permian_geology: Back Creek Group, Betts Creek Group, Blackwater Group
      
       iv.    Rewan_geology: Rewan Group
      
          1.    Later also made dunda_beds_geology to be included in Rewan model: Dunda beds
      
       v.     Clematis_geology: Clematis Group
      
          1.    Later also made warang_sandstone_geology to be included in Clematis model: Warang Sandstone
      
       vi.    Moolyember_surface_geology: Moolyember Formation
      
    6. DEM for each model layer

      a. Using surface geology geodatabase extent extract grid from dem_s_1s to represent the top of the model layer at the surface

       i.     Basement_dem
      
       ii.    Joejoe_dem
      
       iii.    Bandanna_dem
      
       iv.    Rewan_dem and dunda_dem
      
       v.     Clematis_dem and warang_dem
      
       vi.    Moolyember_surface_dem
      

      b. Used Contour tool in ArcGIS to obtain a 25 metre contour geodatabase from the relevant model DEM

       i.     Basement_dem_contours
      
       ii.    Joejoe_dem_contours
      
       iii.    Bandanna_dem_contours
      
       iv.    Rewan_dem_contours and dunda_dem_contours
      
       v.     Clematis_dem_contours and warang_dem_contours
      
       vi.    Moolyember_dem_contours
      

      c. To guide the model building process, additional fields were added to each DEM contour geodatabase based on average thickness derived from groundwater bores and petroleum wells.

       i.     Basement_dem_contours: Joejoe, bandanna, rewan, clematis, moolyember
      
       ii.    Joejoe_dem_contours: basement, bandanna
      
       iii.    Bandanna_dem_contours: joejoe, rewan
      
       iv.    Rewan_dem_contours and dunda_dem_contours: clematis, rewan
      
       v.     Clematis_dem_contours and warang_dem_contours: moolyember, rewan
      
       vi.  Moolyember_dem_contours: clematis
      

    The model building process is as follows:

    1. Used the Topo to Raster tool to create surfaces based on the following rules

      a. Environment

          i.  Extent
      
             1. Top: -19.7012030024424
      
             2. Right: 148.891511819054
      
             3. Bottom: -27.5812030024424
      
             4. Left: 139.141511819054
      
          ii. Output cell size: 0.01 degrees
      
          iii. Drainage enforcement: No_enforce
      

      b. Input

          i.  Basement
      
             1. Basement_dem_contour; field - contour; type - contour
      
             2. Joejoe_dem_contour; field - basement; type - contour
      
             3. Basement_contour; field - contour; type - contour
      
             4. GSQ_basement_wells; field - top_m_AHD; type - point elevation
      
             5. GW_basement; field - top_m_AHD; type - point elevation
      
             6. GSQ_top_basement_fake; field - top_m_AHD; type - point elevation
      
             7. Basement_faults; type - cliff
      
         ii.  Joe Joe Group
      
             1. Joejoe_dem_contour; field - basement; type - contour
      
             2. Basement_dem_contour; field - joejoe; type - contour
      
             3. permian_dem_contour; field - joejoe; type - contour
      
             4. joejoe_contour; field - joejoe; type - contour
      
             5. GSQ_top_joejoe_group; field - top_m_AHD; type - point elevation
      
             6. GW_bores_joe_joe_group; field - top_m_AHD; type - point elevation
      
             7. joejoe_faults; type - cliff
      
         iii.  Bandanna Group
      
             1. Permian_dem_contour; field - contour; type - contour
      
             2. Joejoe_dem_contour; field - bandanna; type - contour
      
             3. Rewan_dem_contour; field - bandanna; type - contour
      
             4. Dunda_dem_contour; field - bandanna; type - contour
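
    The elevation conversion in step 1d of the QA process (Top_m_AHD = KB - Top_m_KB, Bottom_m_AHD = KB - Bottom_m_KB, with KB = DEM + 10 when no KB is recorded) can be sketched as below. This is a minimal illustration under stated assumptions: the function name, argument names, and the returned dictionary are hypothetical and do not come from the dataset's own tooling.

```python
def to_ahd(kb_elevation, top_m_kb, bottom_m_kb, dem_elevation=None):
    """Convert unit depths measured from the kelly bushing (KB) to metres AHD.

    kb_elevation  : KB elevation in metres AHD (0 means "not recorded")
    top_m_kb      : depth to top of unit, metres below KB
    bottom_m_kb   : depth to bottom of unit, metres below KB
    dem_elevation : ground elevation sampled from the DEM, used only as a fallback
    """
    reliable = True
    if kb_elevation == 0:
        if dem_elevation is None:
            raise ValueError("KB is missing and no DEM elevation was supplied")
        # Fallback described in step 1d(ii): KB = DEM + 10, flagged as less reliable.
        kb_elevation = dem_elevation + 10
        reliable = False
    return {
        "Top_m_AHD": kb_elevation - top_m_kb,
        "Bottom_m_AHD": kb_elevation - bottom_m_kb,
        "reliable": reliable,
    }

with_kb = to_ahd(250.0, 100.0, 180.0)
without_kb = to_ahd(0, 100.0, 180.0, dem_elevation=240.0)
```

    Both calls give the same elevations here because the fallback KB (240 + 10) happens to equal the recorded one; only the reliability flag differs.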
      
  13. Breast cancer dataset

    • kaggle.com
    Updated Jul 30, 2025
    Cite
    Wasiq Ali (2025). Breast cancer dataset [Dataset]. https://www.kaggle.com/datasets/wasiqaliyasir/breast-cancer-dataset
    Explore at:
    Croissant (Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.)
    Dataset updated
    Jul 30, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Wasiq Ali
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Breast Cancer Dataset

    Description

    The Breast Cancer Dataset hosted on Kaggle is a powerful resource for researchers, data scientists, and machine learning enthusiasts looking to explore and develop predictive models for breast cancer diagnosis. This dataset, accessible via Kaggle, is designed for binary classification tasks to predict whether a breast tumor is benign or malignant. It provides a rich collection of features derived from digitized images of fine needle aspirates (FNA) of breast masses, making it an essential tool for advancing healthcare analytics and computational pathology. Below is a comprehensive, human-crafted description of the dataset, complete with examples and key highlights to make it engaging and informative.

    Overview

    The dataset originates from the Breast Cancer Wisconsin (Diagnostic) Data Set, a widely used benchmark in machine learning for medical diagnostics. It contains detailed measurements of cell nuclei from breast tissue samples, enabling the classification of tumors as either benign (non-cancerous) or malignant (cancerous). This dataset is particularly valuable for developing and testing machine learning models, such as logistic regression, support vector machines, or deep neural networks, to aid in early and accurate breast cancer detection.

    • Purpose: Binary classification to predict tumor type (benign or malignant).
    • Source: University of Wisconsin, provided through Kaggle.
    • Link: Breast Cancer Dataset on Kaggle.
    • Application: Ideal for medical research, machine learning model development, and educational purposes.

    Dataset Structure

    The dataset comprises 569 instances (rows) and 32 columns, including an ID column, a diagnosis label, and 30 numerical features describing cell nuclei characteristics. Each instance represents a single breast mass sample, with features computed from digitized FNA images.

    Key Columns:

    • ID: A unique identifier for each sample (e.g., 842302).
    • Diagnosis: The target variable, labeled as M (Malignant), indicating a cancerous tumor, or B (Benign), indicating a non-cancerous tumor.
    • Features (30 columns): Numerical measurements of cell nuclei, such as radius, texture, perimeter, and area, derived from image analysis.

    Feature Categories:

    The 30 features are grouped into three main categories based on the characteristics of cell nuclei:

    • Mean: Average values of measurements (e.g., mean radius, mean texture).
    • Standard Error (SE): Variability of measurements (e.g., standard error of radius, standard error of area).
    • Worst: Largest (worst) values of measurements (e.g., worst radius, worst smoothness).

    Each category includes 10 specific measurements:

    1. Radius (mean of distances from center to points on the perimeter)
    2. Texture (standard deviation of grayscale values)
    3. Perimeter
    4. Area
    5. Smoothness (local variation in radius lengths)
    6. Compactness (perimeter² / area - 1.0)
    7. Concavity (severity of concave portions of the contour)
    8. Concave points (number of concave portions of the contour)
    9. Symmetry
    10. Fractal dimension ("coastline approximation" - 1)
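
    To make the 3 x 10 feature layout concrete, the sketch below builds the 30 feature names from the three categories and ten measurements listed above. The exact spellings (lowercase, `_se` and `_worst` suffixes, `concave_points`) are assumptions for illustration; the actual column headers in the Kaggle CSV may differ.

```python
# Ten measurements per category, following the numbered list above.
MEASUREMENTS = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave_points", "symmetry",
    "fractal_dimension",
]
# Three categories: mean, standard error, and worst (largest) value.
CATEGORIES = ["mean", "se", "worst"]

def feature_columns():
    """Return the 30 feature names as <measurement>_<category>."""
    return [f"{m}_{c}" for c in CATEGORIES for m in MEASUREMENTS]

# Full schema: ID, diagnosis label, then the 30 numerical features.
columns = ["id", "diagnosis"] + feature_columns()
print(len(columns))
```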

    Example Data Point: Here’s a simplified example of a single row in the dataset:

    ID Diagnosis Radius_mean Texture_mean Perimeter_mean Area_mean Smoothness_mean ...

    842302 M 17.99 10.38 122.80 1001.0 0.11840 ...

    Interpretation: This sample (ID 842302) is malignant (M), with a mean radius of 17.99 units, a mean texture of 10.38, and so on. The remaining 25 columns, not shown here, provide additional measurements (e.g., standard error and worst values).

    Key Highlights

    • Balanced Classes: The dataset includes 357 benign and 212 malignant cases, offering a relatively balanced distribution for training robust models.
    • No Missing Values: The dataset is clean and preprocessed, with no missing or null values, making it ready for immediate analysis.
    • High Dimensionality: With 30 numerical features, the dataset supports complex modeling techniques, including feature selection and dimensionality reduction.
    • Real-World Impact: The dataset is widely used in research to improve diagnostic accuracy, contributing to early breast cancer detection and better patient outcomes.
    • Open Access: Freely available on Kaggle, encouraging collaboration and innovation in the data science community.
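
    The class counts quoted above (357 benign, 212 malignant) translate into class shares as follows. The sketch also computes the accuracy of a trivial classifier that always predicts the majority class, which is a useful floor when judging any trained model; the variable names are illustrative.

```python
# Class counts as stated in the description.
counts = {"B": 357, "M": 212}
total = sum(counts.values())

# Share of each class in the dataset.
shares = {label: n / total for label, n in counts.items()}

# A model that always predicts the most common class reaches this accuracy.
majority_baseline = max(shares.values())

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"Majority-class baseline accuracy: {majority_baseline:.1%}")
```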

    Potential Use Cases

    • Machine Learning: Train classification models (e.g., Random Forest, SVM, or Neural Networks) to predict tumor malignancy.
    • Feature Engineering: Explore correlations between features (e.g., radius and area) to identify key predictors of malignancy.
    • Data Visualization: Create visualizations (e.g., scatter plots, heatmaps) to understand feature distributions and relationships.
    • Medical Research: Support computational pathology studies by analyzing nuclear characteristics for diagnostic insights.
    • Educational Tool: Perfect for teaching data science concepts, such as preprocessing...
  14. State of Nature layers for Water Availability and Water Pollution to support...

    • zenodo.org
    zip
    Updated Jul 12, 2024
    Cite
    Rafael Camargo; Sara Walker; Elizabeth Saccoccia; Richard McDowell; Allen Townsend; Ariane Laporte-Bisquit; Samantha McCraine; Varsha Vijay (2024). State of Nature layers for Water Availability and Water Pollution to support SBTN Step 1: Assess and Step 2: Interpret & Prioritize [Dataset]. http://doi.org/10.5281/zenodo.7797979
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Rafael Camargo; Sara Walker; Elizabeth Saccoccia; Richard McDowell; Allen Townsend; Ariane Laporte-Bisquit; Samantha McCraine; Varsha Vijay
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    There are multiple well-recognized and peer-reviewed global datasets that can be used to assess water availability and water pollution. Each of these datasets is based on different inputs, modeling approaches, and assumptions. Therefore, in SBTN Step 1: Assess and Step 2: Interpret & Prioritize, companies are required to consult different global datasets for a robust and comprehensive State of Nature (SoN) assessment for water availability and water pollution.

    To streamline this process, WWF, the World Resources Institute (WRI), and SBTN worked together to develop two ready-to-use unified layers of SoN – one for water availability and one for water pollution – in line with the Technical Guidance for Step 1: Assess and Step 2: Interpret & Prioritize. The result is a single file (shapefile) containing the maximum value both for water availability and for water pollution, as well as the datasets’ raw values (as references). This data is publicly available for download from this repository.

    These unified layers will make it easier for companies to implement a robust approach, and they will lead to more aligned and comparable results between companies. A temporary App is available at https://arcg.is/0z9mOD0 to help companies assess the SoN for water availability and water pollution around their operations and supply chain locations. In the future, these layers will become available both in the WRI’s Aqueduct and in the WWF Risk Filter Suite.

    For the SoN for water availability, the following datasets were considered:

    For the SoN for water pollution, the following datasets were considered:

    In general, the same processing steps were performed for all datasets:

    1. Compute the area-weighted median of each dataset at a common spatial resolution, i.e. HydroSHEDS HydroBasins Level 6 in this case.

    2. Classify datasets to a common range by reclassifying raw values to classes 1-5, where 0 (zero) is used for cells or features with no data. See the documentation for more details.

    3. Identify the maximum value between the classified datasets, separately, for Water Availability and for Water Pollution.
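
    The three processing steps above can be sketched for a single basin as follows. This is a simplified illustration, not the code from the linked repository: the function names, the lower-weighted-median convention, and the class break values are all assumptions chosen for the example.

```python
def weighted_median(values, weights):
    """Step 1: area-weighted median (lower weighted median convention)."""
    pairs = sorted(zip(values, weights))
    half = sum(w for _, w in pairs) / 2
    cumulative = 0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value

def classify(value, breaks):
    """Step 2: map a raw value to classes 1-5; 0 means no data.

    `breaks` holds four ascending thresholds (hypothetical values here).
    """
    if value is None:
        return 0
    return 1 + sum(value > b for b in breaks)

def unified_class(classes):
    """Step 3: take the maximum class across the classified datasets."""
    return max(classes)

# One basin covered by three raster cells of dataset A (values with cell areas).
median_a = weighted_median([0.1, 0.5, 0.9], [1, 1, 8])
breaks = [0.2, 0.4, 0.6, 0.8]
class_a = classify(median_a, breaks)
class_b = classify(0.3, breaks)     # a second dataset's basin value
class_c = classify(None, breaks)    # a third dataset with no data here
son_class = unified_class([class_a, class_b, class_c])
```

    Taking the maximum means that a basin is rated by the most severe signal among the contributing datasets, while a dataset with no data (class 0) never lowers the result.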


    For transparency and reproducibility, the code is publicly available at https://github.com/rafaexx/sbtn-SoN-water

  15. Population Over Time (US Cities)

    • kaggle.com
    Updated Nov 30, 2022
    Cite
    The Devastator (2022). Population Over Time (US Cities) [Dataset]. https://www.kaggle.com/datasets/thedevastator/explore-the-growing-population-of-america-s-majo/versions/2
    Explore at:
    Croissant (Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.)
    Dataset updated
    Nov 30, 2022
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Area covered
    United States
    Description

    Population Over Time (US Cities)

    Population size over time for the top US cities

    By Bob Burggraaf [source]

    About this dataset

    This dataset provides the total population of US cities in 2015, so you can explore and analyze city populations across the United States. The data has been cleaned to make it easy to use with visualization tools: city names were cleaned up and all cities were joined into one formatted table. This lets you quickly visualize aspects such as population trends or city demographics and get an informed picture of how the country is growing. With this knowledge, engaging in discussions about city planning recommendations becomes easier.

    How to Use this Dataset

    This dataset contains information about the population of the major cities in the United States. The columns in this dataset include city, summary level, place Fips code, state, state Fips code and total population.

    Using this dataset you can explore a variety of topics related to urbanization including population growth over time and comparative analysis between cities. You can also use it to study specific social or demographic trends such as age distribution or race/ethnicity among other key metrics. With the right analysis you could even predict which areas may experience significant growth or decline in their populations over time. Lastly if you want to compare American cities with other global metropolises then you could easily create aggregate tables that include those data points too!

    Research Ideas

    • Use the data to calculate and demonstrate population growth for cities in the USA over time, providing a strong visual of population changes such as migration and birth/death rates, and even showing how urbanization plays a role in the US's population change.
    • Analyze correlations between population size and economic indicators (such as GDP) across various cities to examine job opportunities or comparative housing prices.
    • Compare city populations by state to contrast disparate areas of the country and determine how much citizens from one state may be attracted to another based on economic advantages or cultural ties.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: Dataset copyright by authors

    You are free to:
    • Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit - Provide a link to the license, and indicate if changes were made.
    • ShareAlike - You must distribute your contributions under the same license as the original.
    • Keep intact - all notices that refer to this license, including copyright notices.

    Columns

    File: Total_Population_By_City_Acs_2015_5_E_AgeSex.csv

    | Column name | Description |
    |:-----------------|:----------------------------------------------------------------------|
    | City | Name of the city. (String) |
    | Summary_Level | Level of detail of the data. (Integer) |
    | Place_Fips | Federal Information Processing Standard code for the city. (Integer) |
    | State | Name of the state. (String) |
    | State_Fips | Federal Information Processing Standard code for the state. (Integer) |
    | Total_Population | Total population of the city. (Integer) |

    Acknowledgements

    If you use this dataset in your research, please credit Bob Burggraaf.

  16. Data from: FISBe: A real-world benchmark dataset for instance segmentation...

    • zenodo.org
    • data.niaid.nih.gov
    bin, json +3
    Updated Apr 2, 2024
    Cite
    Lisa Mais; Peter Hirsch; Claire Managan; Ramya Kandarpa; Josef Lorenz Rumberger; Annika Reinke; Lena Maier-Hein; Gudrun Ihrke; Dagmar Kainmueller (2024). FISBe: A real-world benchmark dataset for instance segmentation of long-range thin filamentous structures [Dataset]. http://doi.org/10.5281/zenodo.10875063
    Explore at:
    Available download formats: zip, text/x-python, bin, json, txt
    Dataset updated
    Apr 2, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Lisa Mais; Peter Hirsch; Claire Managan; Ramya Kandarpa; Josef Lorenz Rumberger; Annika Reinke; Lena Maier-Hein; Gudrun Ihrke; Dagmar Kainmueller
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Feb 26, 2024
    Description

    General

    For more details and the most up-to-date information please consult our project page: https://kainmueller-lab.github.io/fisbe.

    Summary

    • A new dataset for neuron instance segmentation in 3d multicolor light microscopy data of fruit fly brains
      • 30 completely labeled (segmented) images
      • 71 partly labeled images
      • altogether comprising ∼600 expert-labeled neuron instances (labeling a single neuron takes between 30-60 min on average, yet a difficult one can take up to 4 hours)
    • To the best of our knowledge, the first real-world benchmark dataset for instance segmentation of long thin filamentous objects
    • A set of metrics and a novel ranking score for respective meaningful method benchmarking
    • An evaluation of three baseline methods in terms of the above metrics and score

    Abstract

    Instance segmentation of neurons in volumetric light microscopy images of nervous systems enables groundbreaking research in neuroscience by facilitating joint functional and morphological analyses of neural circuits at cellular resolution. Yet said multi-neuron light microscopy data exhibits extremely challenging properties for the task of instance segmentation: Individual neurons have long-ranging, thin filamentous and widely branching morphologies, multiple neurons are tightly inter-weaved, and partial volume effects, uneven illumination and noise inherent to light microscopy severely impede local disentangling as well as long-range tracing of individual neurons. These properties reflect a current key challenge in machine learning research, namely to effectively capture long-range dependencies in the data. While respective methodological research is buzzing, to date methods are typically benchmarked on synthetic datasets. To address this gap, we release the FlyLight Instance Segmentation Benchmark (FISBe) dataset, the first publicly available multi-neuron light microscopy dataset with pixel-wise annotations. In addition, we define a set of instance segmentation metrics for benchmarking that we designed to be meaningful with regard to downstream analyses. Lastly, we provide three baselines to kick off a competition that we envision to both advance the field of machine learning regarding methodology for capturing long-range data dependencies, and facilitate scientific discovery in basic neuroscience.

    Dataset documentation:

    We provide detailed documentation of our dataset, following the Datasheet for Datasets questionnaire:

    >> FISBe Datasheet

    Our dataset originates from the FlyLight project, where the authors released a large image collection of nervous systems of ~74,000 flies, available for download under CC BY 4.0 license.

    Files

    • fisbe_v1.0_{completely,partly}.zip
      • contains the image and ground truth segmentation data; there is one zarr file per sample, see below for more information on how to access zarr files.
    • fisbe_v1.0_mips.zip
      • maximum intensity projections of all samples, for convenience.
    • sample_list_per_split.txt
      • a simple list of all samples and the subset they are in, for convenience.
    • view_data.py
      • a simple python script to visualize samples, see below for more information on how to use it.
    • dim_neurons_val_and_test_sets.json
      • a list of instance ids per sample that are considered to be of low intensity/dim; can be used for extended evaluation.
    • Readme.md
      • general information

    How to work with the image files

    Each sample consists of a single 3d MCFO image of neurons of the fruit fly.
    For each image, we provide a pixel-wise instance segmentation for all separable neurons.
    Each sample is stored as a separate zarr file (zarr is a file storage format for chunked, compressed, N-dimensional arrays based on an open-source specification).
    The image data ("raw") and the segmentation ("gt_instances") are stored as two arrays within a single zarr file.
    The segmentation mask for each neuron is stored in a separate channel.
    The order of dimensions is CZYX.
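
    The CZYX layout described above can be illustrated with a small sketch. It uses synthetic numpy arrays as stand-ins for the "raw" and "gt_instances" arrays; the shapes, the values, and the convention that a nonzero voxel belongs to the neuron of that channel are assumptions made for illustration, not properties confirmed by the dataset description.

```python
import numpy as np

# Synthetic stand-ins (made-up shapes and values), both in CZYX order.
rng = np.random.default_rng(0)
raw = rng.integers(0, 255, size=(3, 4, 8, 8), dtype=np.uint8)  # 3 colour channels
gt_instances = np.zeros((2, 4, 8, 8), dtype=np.uint8)          # 2 neuron masks

# Assumption: a nonzero voxel in channel c belongs to neuron c.
gt_instances[0, :, 0:4, 0:4] = 1
gt_instances[1, :, 2:6, 2:6] = 2  # partially overlaps neuron 0

# Maximum intensity projection along Z (axis 1 in CZYX) -> shape (C, Y, X).
mip = raw.max(axis=1)

# Number of voxels per neuron: reduce over Z, Y and X, keep the channel axis.
voxels_per_instance = (gt_instances > 0).sum(axis=(1, 2, 3))

# Voxels claimed by more than one neuron. One mask per channel makes such
# overlaps representable, which a single label volume could not do.
overlap_voxels = int(((gt_instances > 0).sum(axis=0) > 1).sum())

print(mip.shape, voxels_per_instance, overlap_voxels)
```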

    We recommend working in a virtual environment, e.g., by using conda:

    conda create -y -n flylight-env -c conda-forge python=3.9
    conda activate flylight-env

    How to open zarr files

    1. Install the python zarr package:
      pip install zarr
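    2. Open a zarr file with:

      import zarr
      # path_to_zarr is a placeholder for the path to one sample's zarr file
      raw = zarr.open(path_to_zarr, mode='r', path="volumes/raw")
      seg = zarr.open(path_to_zarr, mode='r', path="volumes/gt_instances")

      # optional: explicitly convert to a numpy array
      import numpy as np
      raw_np = np.array(raw)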

    Zarr arrays are read lazily on-demand.
    Many functions that expect numpy arrays also work with zarr arrays.
    Optionally, the arrays can also explicitly be converted to numpy arrays.

    How to view zarr image files

    We recommend using napari to view the image data.

    1. Install napari:
      pip install "napari[all]"
    2. Save the following Python script:

      import zarr, sys, napari

      raw = zarr.load(sys.argv[1], mode='r', path="volumes/raw")
      gts = zarr.load(sys.argv[1], mode='r', path="volumes/gt_instances")

      viewer = napari.Viewer(ndisplay=3)
      for idx, gt in enumerate(gts):
          viewer.add_labels(
              gt, rendering='translucent', blending='additive', name=f'gt_{idx}')
      viewer.add_image(raw[0], colormap="red", name='raw_r', blending='additive')
      viewer.add_image(raw[1], colormap="green", name='raw_g', blending='additive')
      viewer.add_image(raw[2], colormap="blue", name='raw_b', blending='additive')
      napari.run()

    3. Execute:
      python view_data.py <path_to_zarr_file>

    Metrics

    • S: Average of avF1 and C
    • avF1: Average F1 Score
    • C: Average ground truth coverage
    • clDice_TP: Average true positives clDice
    • FS: Number of false splits
    • FM: Number of false merges
    • tp: Relative number of true positives

    For more information on our selected metrics and formal definitions please see our paper.
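
    As a rough sketch of how the ranking score combines the metrics listed above: S is stated to be the average of avF1 and C, so it can be computed directly once those two values are known. The function names are hypothetical, the input values are made up, and the F1 helper is the standard textbook formula. How avF1 aggregates F1 scores is defined in the paper and is not reproduced here.

```python
def f1_score(tp, fp, fn):
    """Standard F1 = 2*TP / (2*TP + FP + FN); 0.0 if there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def ranking_score(av_f1, coverage):
    """S as listed above: the plain average of avF1 and C."""
    return 0.5 * (av_f1 + coverage)


# Made-up numbers for illustration only.
example_f1 = f1_score(tp=8, fp=2, fn=2)
s = ranking_score(av_f1=0.4, coverage=0.6)
print(example_f1, s)
```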

    Baseline

    To showcase the FISBe dataset together with our selection of metrics, we provide evaluation results for three baseline methods, namely PatchPerPix (ppp), Flood Filling Networks (FFN) and a non-learnt application-specific color clustering from Duan et al.
    For detailed information on the methods and the quantitative results please see our paper.

    License

    The FlyLight Instance Segmentation Benchmark (FISBe) dataset is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

    Citation

    If you use FISBe in your research, please use the following BibTeX entry:

    @misc{mais2024fisbe,
     title =    {FISBe: A real-world benchmark dataset for instance
             segmentation of long-range thin filamentous structures},
     author =    {Lisa Mais and Peter Hirsch and Claire Managan and Ramya
             Kandarpa and Josef Lorenz Rumberger and Annika Reinke and Lena
             Maier-Hein and Gudrun Ihrke and Dagmar Kainmueller},
     year =     2024,
     eprint =    {2404.00130},
     archivePrefix ={arXiv},
     primaryClass = {cs.CV}
    }

    Acknowledgments

    We thank Aljoscha Nern for providing unpublished MCFO images as well as Geoffrey W. Meissner and the entire FlyLight Project Team for valuable
    discussions.
    P.H., L.M. and D.K. were supported by the HHMI Janelia Visiting Scientist Program.
    This work was co-funded by Helmholtz Imaging.

    Changelog

    There have been no changes to the dataset so far.
    All future changes will be listed on the changelog page.

    Contributing

    If you would like to contribute, have encountered any issues or have any suggestions, please open an issue for the FISBe dataset in the accompanying github repository.

    All contributions are welcome!

  17. Dominican Republic Number Dataset

    • listtodata.com
    .csv, .xls, .txt
    Updated Jul 17, 2025
    Cite
    List to Data (2025). Dominican Republic Number Dataset [Dataset]. https://listtodata.com/dominican-republic-dataset
    Explore at:
    Available download formats: .csv, .xls, .txt
    Dataset updated
    Jul 17, 2025
    Dataset authored and provided by
    List to Data
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2025 - Dec 31, 2025
    Area covered
    Dominican Republic
    Variables measured
    phone numbers, email address, full name, address, city, state, gender, age, income, IP address
    Description

    The Dominican Republic number dataset helps businesses generate revenue in many ways. It is a valuable directory that you can buy from us at minimal cost. It creates many business opportunities because the country is rich in multiple sectors, and it helps businesses become better known, more competitive, and more useful. For instance, the dataset opens new opportunities to do business in the places you select. Vendors can run sales promotions and earn money from these leads, and they can connect with a selected group of clients quickly. Overall, it supports the long-term success of your company or business.

    Dominican Republic phone data is a powerful way to connect with many clients. It can help you get speedy feedback from the public. Our expert unit supplies the data carefully, according to your needs, and the List To Data website is a source of up-to-date sales leads. Check out the packages to find the one that works best for you. The phone data is suited to sending text messages or making phone calls to potential new clients to make deals. With it, you can easily reach people in this area and get positive results from your marketing. The library holds millions of phone numbers from different businesses and people.

    The Dominican Republic phone number list helps make your business a profitable venture. Finding real contacts is very important, and the list helps you reach a genuine audience, saving you time. List To Data helps you connect with many people quickly and boosts your marketing efforts. The list is also a source of earnings from B2B and B2C platforms. The Dominican Republic's economy is strong and diverse, with important sectors like technology, finance, and tourism, and it continues to grow. Buy our contact data to earn a profit from your targeted locations.

  18. A stakeholder-centered determination of High-Value Data sets: the use-case...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Oct 27, 2021
    Cite
    Anastasija Nikiforova (2021). A stakeholder-centered determination of High-Value Data sets: the use-case of Latvia [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5142816
    Explore at:
    Dataset updated
    Oct 27, 2021
    Dataset authored and provided by
    Anastasija Nikiforova
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Latvia
    Description

    The data in this dataset were collected through a survey of Latvian society (2021) aimed at identifying high-value data sets for Latvia, i.e. data sets that, in the view of Latvian society, could create value for the Latvian economy and society. The survey was created for both individuals and businesses. It is being made public both to act as supplementary data for the paper "Towards enrichment of the open government data: a stakeholder-centered determination of High-Value Data sets for Latvia" (author: Anastasija Nikiforova, University of Latvia) and so that other researchers can use these data in their own work.

    The survey was distributed among Latvian citizens and organisations. The structure of the survey is available in the supplementary file (see Survey_HighValueDataSets.odt).

    Description of the data in this data set: structure of the survey and pre-defined answers (if any)

    1. Have you ever used open (government) data? - {(1) yes, once; (2) yes, there has been a little experience; (3) yes, continuously; (4) no, it wasn’t needed for me; (5) no, have tried but has failed}
    2. How would you assess the value of open government data that are currently available for your personal use or your business? - 5-point Likert scale, where 1 – any to 5 – very high
    3. If you ever used the open (government) data, what was the purpose of using them? - {(1) Have not had to use; (2) to identify the situation for an object or an event (e.g. Covid-19 current state); (3) data-driven decision-making; (4) for the enrichment of my data, i.e. by supplementing them; (5) for better understanding of decisions of the government; (6) awareness of governments’ actions (increasing transparency); (7) forecasting (e.g. trends etc.); (8) for developing data-driven solutions that use only the open data; (9) for developing data-driven solutions, using open data as a supplement to existing data; (10) for training and education purposes; (11) for entertainment; (12) other (open-ended question)}
    4. What category(ies) of “high value datasets” is, in your opinion, able to create added value for society or the economy? - {(1) Geospatial data; (2) Earth observation and environment; (3) Meteorological; (4) Statistics; (5) Companies and company ownership; (6) Mobility}
    5. To what extent do you think the current data catalogue of Latvia’s Open data portal corresponds to the needs of data users/ consumers? - 10-point Likert scale, where 1 – no data are useful, but 10 – fully correspond, i.e. all potentially valuable datasets are available
    6. Which of the current data categories in Latvia’s open data portals, in your opinion, most corresponds to the “high value dataset”? - {(1) Foreign affairs; (2) business economy; (3) energy; (4) citizens and society; (5) education and sport; (6) culture; (7) regions and municipalities; (8) justice, internal affairs and security; (9) transports; (10) public administration; (11) health; (12) environment; (13) agriculture, food and forestry; (14) science and technologies}
    7. Which of them form your TOP-3? - {(1) Foreign affairs; (2) business economy; (3) energy; (4) citizens and society; (5) education and sport; (6) culture; (7) regions and municipalities; (8) justice, internal affairs and security; (9) transports; (10) public administration; (11) health; (12) environment; (13) agriculture, food and forestry; (14) science and technologies}
    8. How would you assess the value of the following data categories?
      8.1. sensor data - 5-point Likert scale, where 1 – not needed to 5 – highly valuable
      8.2. real-time data - 5-point Likert scale, where 1 – not needed to 5 – highly valuable
      8.3. geospatial data - 5-point Likert scale, where 1 – not needed to 5 – highly valuable
    9. What would be these datasets? I.e. what (sub)topic could these data be associated with? - open-ended question
    10. Which of the data sets currently available could be valuable and useful for society and businesses? - open-ended question
    11. Which of the data sets currently NOT available in Latvia’s open data portal could, in your opinion, be valuable and useful for society and businesses? - open-ended question
    12. How did you define them? - {(1) Subjective opinion; (2) experience with data; (3) filtering out the most popular datasets, i.e. based on public opinion; (4) other (open-ended question)}
    13. How high could the value of these data sets be for you or your business? - 5-point Likert scale, where 1 – not valuable, 5 – highly valuable
    14. Do you represent any company/ organization (are you working anywhere)? (if “yes”, please, fill out the survey twice, i.e. as an individual user AND a company representative) - {yes; no; I am an individual data user; other (open-ended)}
    15. What industry/ sector does your company/ organization belong to? (if you do not work at the moment, please, choose the last option) - {Information and communication services; Financial and insurance activities; Accommodation and catering services; Education; Real estate operations; Wholesale and retail trade; repair of motor vehicles and motorcycles; transport and storage; construction; water supply; waste water; waste management and recovery; electricity, gas supply, heating and air conditioning; manufacturing industry; mining and quarrying; agriculture, forestry and fisheries; professional, scientific and technical services; operation of administrative and service services; public administration and defence; compulsory social insurance; health and social care; art, entertainment and recreation; activities of households as employers; CSO/NGO; I am not a representative of any company}
    16. To which category does your company/ organization belong in terms of its size? - {small; medium; large; self-employed; I am not a representative of any company}
    17. What is the age group that you belong to? (if you are an individual user, not a company representative) - {11..15, 16..20, 21..25, 26..30, 31..35, 36..40, 41..45, 46+, “do not want to reveal”}
    18. Please, indicate the education or scientific degree that corresponds most to you (if you are an individual user, not a company representative) - {master degree; bachelor’s degree; Dr. and/ or PhD; student (bachelor level); student (master level); doctoral candidate; pupil; do not want to reveal these data}

    Format of the file: .xls, .csv (for the first spreadsheet only), .odt

    Licenses or restrictions: CC-BY

  19. Aquifer framework datasets used to represent the Arbuckle-Simpson aquifer,...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Sep 26, 2024
    + more versions
    Cite
    U.S. Geological Survey (2024). Aquifer framework datasets used to represent the Arbuckle-Simpson aquifer, Oklahoma [Dataset]. https://catalog.data.gov/dataset/aquifer-framework-datasets-used-to-represent-the-arbuckle-simpson-aquifer-oklahoma
    Explore at:
    Dataset updated
    Sep 26, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Oklahoma
    Description

    The Arbuckle-Simpson aquifer covers an area of about 800 square miles in the Arbuckle Mountains and Arbuckle Plains of South-Central Oklahoma. The aquifer is in the Central Lowland Physiographic Province and is composed of the Simpson and Arbuckle Groups of Ordovician and Cambrian age. The aquifer is as thick as 9,000 feet in some areas. The aquifer provides relatively small, but important, amounts of water depended on for public supply, agricultural, and industrial use (HA 730-E).

    This product provides source data for the Arbuckle-Simpson aquifer framework, including:

    Georeferenced images:
    1. i_46ARBSMP_bot.tif: Digitized figure of depth contour lines below land surface representing the base of fresh water in the Arbuckle-Simpson aquifer. The base of fresh water is considered to be the bottom of the Arbuckle-Simpson aquifer. The original figure is from the "Reconnaissance of the water resources of the Ardmore and Sherman Quadrangles, southern Oklahoma" report, map HA-3, page 2, prepared by the Oklahoma Geological Survey in cooperation with the U.S. Geological Survey (HA3_P2).

    Extent shapefiles:
    1. p_46ABKSMP.shp: Polygon shapefile containing the areal extent of the Arbuckle-Simpson aquifer (Arbuckle-Simpson_AqExtent). The extent file contains no aquifer subunits.

    Contour line shapefiles:
    1. c_46ABKSMP_bot.shp: Contour line dataset containing depth values, in feet below land surface, across the bottom of the Arbuckle-Simpson aquifer. This dataset is a digitized version of the map published in HA3_P2 and was used to create the rd_46ABKSMP_bot.tif raster dataset. The map generalized depth values into zoned areas with associated ranges of depth. The edge of each zone was treated as the minimum value of the assigned range, thus creating the depth contour lines. This interpretation was chosen because it allowed the resulting raster to be created. The map was used because more detailed point or contour data for the area are unavailable.

    Altitude raster files:
    1. ra_46ABKSMP_top.tif: Altitude raster dataset of the top of the Arbuckle-Simpson aquifer. The altitude values are in meters referenced to the North American Vertical Datum of 1988 (NAVD88). The top of the aquifer is assumed to be at land surface (NED, 100-meter) based on available data. This raster was interpolated from the Digital Elevation Model (DEM) dataset (NED, 100-meter).
    2. ra_46ABKSMP_bot.tif: Altitude raster dataset of the bottom of the Arbuckle-Simpson aquifer. The altitude values are in meters referenced to NAVD88.

    Depth raster files:
    1. rd_46ABKSMP_top.tif: Depth raster dataset of the top of the Arbuckle-Simpson aquifer. The depth values are in meters below land surface (NED, 100-meter). The top of the aquifer is assumed to be at land surface (NED, 100-meter) based on available data.
    2. rd_46ABKSMP_bot.tif: Depth raster dataset of the bottom of the Arbuckle-Simpson aquifer. The depth values are in meters below land surface (NED, 100-meter). This raster was interpolated from the contour line dataset c_46ABKSMP_bot.shp.
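
    The relationship between the depth and altitude rasters described above can be sketched as follows. This is an illustration under stated assumptions, not the documented USGS workflow: it assumes that the altitude of the aquifer bottom equals the land-surface altitude minus the depth below land surface, and that contour depths given in feet are converted to meters first. The 2x2 grids and variable names are made up; real rasters would be read with a GIS library.

```python
import numpy as np

FEET_TO_METERS = 0.3048  # exact by definition of the international foot

# Made-up 2x2 grids standing in for real rasters.
land_surface_altitude_m = np.array([[300.0, 310.0],
                                    [320.0, 330.0]])  # meters, NAVD88
bottom_depth_ft = np.array([[1000.0, 2000.0],
                            [3000.0, 9000.0]])        # feet below land surface

# Convert contour depths from feet to meters.
bottom_depth_m = bottom_depth_ft * FEET_TO_METERS

# Assumed relationship: altitude of bottom = land surface altitude - depth.
bottom_altitude_m = land_surface_altitude_m - bottom_depth_m

print(bottom_altitude_m)
```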

  20. Data from: Best Fiends Dataset

    • universe.roboflow.com
    zip
    Updated Apr 23, 2025
    Cite
    Alexey (2025). Best Fiends Dataset [Dataset]. https://universe.roboflow.com/alexey-5dz4x/best-fiends/dataset/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 23, 2025
    Dataset authored and provided by
    Alexey
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Gold Bounding Boxes
    Description

    Best Fiends

    ## Overview
    
    Best Fiends is a dataset for object detection tasks - it contains Gold annotations for 364 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
Cite
Neilsberg Research (2024). Illinois annual income distribution by work experience and gender dataset (Number of individuals ages 15+ with income, 2022) [Dataset]. https://www.neilsberg.com/research/datasets/23ca91c7-981b-11ee-99cf-3860777c1fe6/

Illinois annual income distribution by work experience and gender dataset (Number of individuals ages 15+ with income, 2022)

Explore at:
Available download formats: csv, json
Dataset updated
Jan 9, 2024
Dataset authored and provided by
Neilsberg Research
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered
Illinois
Variables measured
Income for Male Population, Income for Female Population, Income for Male Population working full time, Income for Male Population working part time, Income for Female Population working full time, Income for Female Population working part time, Number of males working full time for a given income bracket, Number of males working part time for a given income bracket, Number of females working full time for a given income bracket, Number of females working part time for a given income bracket
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2022 1-Year Estimates. To portray the number of individuals for both the genders (Male and Female), within each income bracket we conducted an initial analysis and categorization of the American Community Survey data. Households are categorized, and median incomes are reported based on the self-identified gender of the head of the household. For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents a detailed breakdown of the count of individuals within distinct income brackets, categorized by gender (men and women) and employment type, full-time (FT) and part-time (PT), offering insight into the diverse income landscapes within Illinois. The dataset can be used to analyze gender-based income distribution within the Illinois population, aiding data analysis and decision-making.

Key observations

  • Employment patterns: Within Illinois, among individuals aged 15 years and older with income, there were 4.57 million men and 4.51 million women in the workforce. Among them, 2.58 million men were engaged in full-time, year-round employment, while 2.00 million women were in full-time, year-round roles.
  • Annual income under $24,999: Of the male population working full-time, 6.54% fell within the income range of under $24,999, while 9.50% of the female population working full-time was represented in the same income bracket.
  • Annual income above $100,000: 29.19% of men in full-time roles earned incomes exceeding $100,000, while 17.91% of women in full-time positions earned within this income bracket.
  • Refer to the research insights for key observations on more income brackets (annual income under $24,999, between $25,000 and $49,999, between $50,000 and $74,999, between $75,000 and $99,999, and above $100,000) and employment types (full-time year-round and part-time).

Figure: Illinois gender and employment-based income distribution analysis (Ages 15+), https://i.neilsberg.com/ch/illinois-income-distribution-by-gender-and-employment-type.jpeg

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2022 1-Year Estimates.

Income brackets:

  • $1 to $2,499 or loss
  • $2,500 to $4,999
  • $5,000 to $7,499
  • $7,500 to $9,999
  • $10,000 to $12,499
  • $12,500 to $14,999
  • $15,000 to $17,499
  • $17,500 to $19,999
  • $20,000 to $22,499
  • $22,500 to $24,999
  • $25,000 to $29,999
  • $30,000 to $34,999
  • $35,000 to $39,999
  • $40,000 to $44,999
  • $45,000 to $49,999
  • $50,000 to $54,999
  • $55,000 to $64,999
  • $65,000 to $74,999
  • $75,000 to $99,999
  • $100,000 or more

Variables / Data Columns

  • Income Bracket: This column lists 20 income brackets ranging from $1 to $100,000+.
  • Full-Time Males: The count of males employed full-time year-round and earning within a specified income bracket
  • Part-Time Males: The count of males employed part-time and earning within a specified income bracket
  • Full-Time Females: The count of females employed full-time year-round and earning within a specified income bracket
  • Part-Time Females: The count of females employed part-time and earning within a specified income bracket
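
As a minimal sketch of how percentages like those in the key observations could be derived from these columns, the following computes the share of full-time workers who fall into a chosen set of income brackets. The column headers are assumed to match the names listed above, the counts are made up, and the helper name bracket_share is hypothetical; the real file may use different headers.

```python
import csv
import io

# Made-up counts for three brackets only; not taken from the real dataset.
sample_csv = """Income Bracket,Full-Time Males,Part-Time Males,Full-Time Females,Part-Time Females
"$22,500 to $24,999",50,30,80,40
"$25,000 to $29,999",150,20,220,30
"$100,000 or more",300,10,200,5
"""


def bracket_share(rows, column, brackets):
    """Percentage of the column total that falls into the given brackets."""
    total = sum(int(r[column]) for r in rows)
    selected = sum(int(r[column]) for r in rows if r["Income Bracket"] in brackets)
    return 100.0 * selected / total if total else 0.0


rows = list(csv.DictReader(io.StringIO(sample_csv)))
low_brackets = {"$22,500 to $24,999"}
male_share = bracket_share(rows, "Full-Time Males", low_brackets)
female_share = bracket_share(rows, "Full-Time Females", low_brackets)
print(male_share, female_share)
```

With the full file, low_brackets would contain all ten brackets up to "$22,500 to $24,999" to reproduce an "under $24,999" share.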

Employment type classifications include:

  • Full-time, year-round: A full-time, year-round worker is a person who worked full time (35 or more hours per week) and 50 or more weeks during the previous calendar year.
  • Part-time: A part-time worker is a person who worked less than 35 hours per week during the previous calendar year.
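
The two classifications above can be expressed as a small rule. This is only a sketch of the stated definitions: the function name and the "other" label are hypothetical, and the source does not say how a person who worked 35 or more hours per week for fewer than 50 weeks is classified.

```python
def employment_type(hours_per_week, weeks_worked):
    """Classify a worker using the two definitions given above."""
    if hours_per_week >= 35 and weeks_worked >= 50:
        return "full-time, year-round"
    if hours_per_week < 35:
        return "part-time"
    # Full-time hours but fewer than 50 weeks: not covered by either
    # definition, so it is labelled "other" here (an assumption).
    return "other"


print(employment_type(40, 52))
print(employment_type(20, 52))
print(employment_type(40, 30))
```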

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for Illinois median household income by gender.
