Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents a detailed breakdown of the count of individuals within distinct income brackets, categorized by gender (men and women) and employment type, full-time (FT) and part-time (PT), offering valuable insights into the diverse income landscape within Illinois. The dataset can be utilized to gain insights into gender-based income distribution within the Illinois population, aiding in data analysis and decision-making.
Key observations
[Chart] Illinois gender and employment-based income distribution analysis (Ages 15+): https://i.neilsberg.com/ch/illinois-income-distribution-by-gender-and-employment-type.jpeg
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2022 1-Year Estimates.
Income brackets:
Variables / Data Columns
Employment type classifications include:
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Illinois median household income by gender. You can refer to it here.
The World Top Incomes Database provides statistical information on the shares of top income groups for 30 countries. The construction of this database was possible thanks to the research of over thirty contributing authors.

There has been a marked revival of interest in the study of the distribution of top incomes using tax data. Beginning with the research by Thomas Piketty on the long-run distribution of top incomes in France, a succession of studies has constructed top income share time series over the long run for more than twenty countries to date. These projects have generated a large volume of data, which are intended as a research resource for further analysis. In using data from income tax records, these studies use similar sources and methods as the pioneering work by Kuznets for the United States.

The findings of recent research are of added interest, since the new data provide estimates covering nearly all of the twentieth century, a length of time series unusual in economics. In contrast to existing international databases, generally restricted to the post-1970 or post-1980 period, the top income data cover a much longer period, which is important because structural changes in income and wealth distributions often span several decades. The data series is fairly homogeneous across countries, annual, long-run, and broken down by income source in several cases.

Users should also be aware of the data's limitations. Firstly, the series measure only top income shares and hence are silent on how inequality evolves elsewhere in the distribution. Secondly, the series are largely concerned with gross incomes before tax. Thirdly, the definition of income and the unit of observation (the individual vs. the family) vary across countries, making comparability of levels across countries more difficult. Even within a country, there are breaks in comparability that arise because of changes in tax legislation affecting the definition of income, although most studies try to correct for such changes to create homogeneous series. Finally, and perhaps most important, the series might be biased because of tax avoidance and tax evasion.

The first theme of the research programme is the assembly and analysis of historical evidence from fiscal records on the long-run development of economic inequality. "Long run" is a relative term, and here it means evidence dating back before the Second World War, and extending where possible back into the nineteenth century. The time span is determined by the sources used, which are based on taxes on incomes, earnings, wealth and estates. Perspective on current concerns is provided by the past, but also by comparison with other countries. The second theme of the research programme is that of cross-country comparisons. The research is not limited to OECD countries and will draw on evidence globally. In order to understand the drivers of inequality, it is necessary to consider the sources of economic advantage. The third theme is the analysis of the sources of income, considering separately the roles of earned income and property income, and examining the historical and comparative evolution of earned and property income, and their joint distribution. The fourth theme is the long-run trend in the distribution of wealth and its transmission through inheritance. Here again there are rich fiscal data on the passing of estates at death.
The top income share series are constructed, in most of the cases presented in this database, using tax statistics (China is an exception; for the time being the estimates come from household surveys). The use of tax data is often regarded by economists with considerable disbelief. These doubts are well justified for at least two reasons. The first is that tax data are collected as part of an administrative process, which is not tailored to the scientists' needs, so that the definition of income, income unit, etc., are not necessarily those that we would have chosen. This causes particular difficulties for comparisons across countries, but also for time-series analysis where there have been substantial changes in the tax system, such as the moves to and from the joint taxation of couples. Secondly, it is obvious that those paying tax have a financial incentive to present their affairs in a way that reduces tax liabilities. There is tax avoidance and tax evasion. The rich, in particular, have a strong incentive to understate their taxable incomes. Those with wealth take steps to ensure that the return comes in the form of asset appreciation, typically taxed at lower rates or not at all. Those with high salaries seek to ensure that part of their remuneration comes in forms, such as fringe benefits or stock options, which receive favorable tax treatment. Both groups may make use of tax havens that allow income to be moved beyond the reach of the national tax net. These shortcomings limit what can be said from tax data, but this does not mean that the data are worthless. Like all economic data, they measure with error the 'true' variable in which we are interested.

References

Atkinson, Anthony B. and Thomas Piketty (2007). Top Incomes over the Twentieth Century: A Contrast between Continental European and English-Speaking Countries (Volume 1). Oxford: Oxford University Press, 585 pp.

Atkinson, Anthony B. and Thomas Piketty (2010). Top Incomes over the Twentieth Century: A Global Perspective (Volume 2). Oxford: Oxford University Press, 776 pp.

Atkinson, Anthony B., Thomas Piketty and Emmanuel Saez (2011). Top Incomes in the Long Run of History, Journal of Economic Literature, 49(1), pp. 3-71.

Kuznets, Simon (1953). Shares of Upper Income Groups in Income and Savings. New York: National Bureau of Economic Research, 707 pp.

Piketty, Thomas (2001). Les Hauts Revenus en France au 20ème siècle. Paris: Grasset, 807 pp.

Piketty, Thomas (2003). Income Inequality in France, 1901-1998, Journal of Political Economy, 111(5), pp. 1004-42.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in Florence, SC, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Income Levels:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Florence median household income. You can refer to it here.
Income of individuals by age group, sex and income source, Canada, provinces and selected census metropolitan areas, annual.
This geodatabase includes spatial datasets that represent the Mississippian aquifer in the States of Alabama, Illinois, Indiana, Iowa, Kentucky, Maryland, Missouri, Ohio, Pennsylvania, Tennessee, Virginia and West Virginia. The aquifer is divided into three subareas based on data availability. In subarea 1 (SA1), the aquifer extent in Iowa, data exist for the aquifer top altitude and aquifer thickness. In subarea 2 (SA2), the aquifer extent in Missouri, data exist for the aquifer top and bottom surface altitudes. In subarea 3 (SA3), the aquifer area of the remaining States, no altitude or thickness data exist.

Included in this geodatabase are: (1) a feature dataset "ds40MSSPPI_altitude_and_thickness_contours" that includes aquifer altitude and thickness contours used to generate the surface rasters for SA1 and SA2; (2) a feature dataset "ds40MSSPPI_extents" that includes a polygon dataset that represents the subarea extents, a polygon dataset that represents the combined overall aquifer extent, and a polygon dataset of the Ft. Dodge Fault and Manson Anomaly; (3) raster datasets that represent the altitude of the top and the bottom of the aquifer in SA1 and SA2; and (4) georeferenced images of the figures that were digitized to create the aquifer top- and bottom-altitude contours or aquifer thickness contours for SA1 and SA2. The images and digitized contours are supplied for reference.

The extent of the Mississippian aquifer for all subareas was produced from the digital version of the HA-730 Mississippian aquifer extent (USGS HA-730). For the two subareas with vertical-surface information, SA1 and SA2, data were retrieved from the sources described below.

1. The aquifer-altitude contours for the top and the aquifer-thickness contours for the top-to-bottom thickness of SA1 were received in digital format from the Iowa Geological Survey. The URL for the top was ftp://ftp.igsb.uiowa.edu/GIS_Library/IA_State/Hydrologic/Ground_Waters/Mississippian_aquifer/mississippian_topography.zip. The URL for the thickness was ftp://ftp.igsb.uiowa.edu/GIS_Library/IA_State/Hydrologic/Ground_Waters/Mississippian_aquifer/mississippian_isopach.zip. The reference for the top map is "Altitude and Configuration, in feet above mean sea level, of the Mississippian Aquifer," modified from a scanned image of Map 1, Sheet 1, Miscellaneous Map Series 3, Mississippian Aquifer of Iowa by P.J. Horick and W.L. Steinhilber, Iowa Geological Survey, 1973 (IGS MMS-3, Map 1, Sheet 1). The reference for the thickness map is "Distribution and isopach thickness, in feet, of the Mississippian Aquifer," modified from a scanned image of Map 1, Sheet 2, Miscellaneous Map Series 3, Mississippian Aquifer of Iowa by P.J. Horick and W.L. Steinhilber, Iowa Geological Survey, 1973 (IGS MMS-3, Map 1, Sheet 2).

2. The altitude contours for the top and bottom of SA2 were digitized from georeferenced figures of altitude contours in U.S. Geological Survey Professional Paper 1305 (USGS PP1305), figure 6 (for the top surface) and figure 9 (for the bottom surface).

The altitude contours for SA1 and SA2 were interpolated into surface rasters within a GIS using tools that create hydrologically correct surfaces from contour data, derive the altitude from the thickness (depth from the land surface), and merge the subareas into a single surface. The primary tool was an enhanced version of "Topo to Raster" in ArcGIS ArcMap (Esri, 2014, ArcGIS Desktop: Release 10.2. Redlands, CA: Environmental Systems Research Institute).
The raster surfaces were corrected in areas where the altitude of the top of the aquifer exceeded the land surface, and where the bottom of an aquifer exceeded the altitude of the corrected top of the aquifer.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Meta Kaggle Code is an extension to our popular Meta Kaggle dataset. This extension contains all the raw source code from hundreds of thousands of public, Apache 2.0 licensed Python and R notebook versions on Kaggle used to analyze Datasets, make submissions to Competitions, and more. This represents nearly a decade of data spanning a period of tremendous evolution in the ways ML work is done.
By collecting all of this code created by Kaggle’s community in one dataset, we hope to make it easier for the world to research and share insights about trends in our industry. With the growing significance of AI-assisted development, we expect this data can also be used to fine-tune models for ML-specific code generation tasks.
Meta Kaggle for Code is also a continuation of our commitment to open data and research. This new dataset is a companion to Meta Kaggle which we originally released in 2016. On top of Meta Kaggle, our community has shared nearly 1,000 public code examples. Research papers written using Meta Kaggle have examined how data scientists collaboratively solve problems, analyzed overfitting in machine learning competitions, compared discussions between Kaggle and Stack Overflow communities, and more.
The best part is Meta Kaggle enriches Meta Kaggle for Code. By joining the datasets together, you can easily understand which competitions code was run against, the progression tier of the code’s author, how many votes a notebook had, what kinds of comments it received, and much, much more. We hope the new potential for uncovering deep insights into how ML code is written feels just as limitless to you as it does to us!
While we have made an attempt to filter out notebooks containing potentially sensitive information published by Kaggle users, the dataset may still contain such information. Research, publications, applications, etc. relying on this data should only use or report on publicly available, non-sensitive information.
The files contained here are a subset of the KernelVersions in Meta Kaggle. The file names match the ids in the KernelVersions csv file. Whereas Meta Kaggle contains data for all interactive and commit sessions, Meta Kaggle Code contains only data for commit sessions.
The files are organized into a two-level directory structure. Each top-level folder contains up to 1 million files, e.g. folder 123 contains all versions from 123,000,000 to 123,999,999. Each subfolder contains up to 1 thousand files, e.g. 123/456 contains all versions from 123,456,000 to 123,456,999. In practice, each folder will have many fewer than 1 thousand files due to private and interactive sessions. A small sketch of this layout follows (our illustration, not part of the official dataset tooling).
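The sketch below derives the expected path for a given KernelVersions id. The `.py` extension and the absence of zero-padding in folder names are assumptions; notebooks may also be stored as `.ipynb`, `.r`, or `.rmd`:

```python
# Map a KernelVersions id to its two-level directory path:
# top folder = id // 1_000_000, subfolder = (id // 1_000) % 1_000.
def kernel_version_path(version_id: int, extension: str = "py") -> str:
    top = version_id // 1_000_000        # up to 1 million files per top folder
    sub = (version_id // 1_000) % 1_000  # up to 1 thousand files per subfolder
    return f"{top}/{sub}/{version_id}.{extension}"

print(kernel_version_path(123_456_789))  # -> "123/456/123456789.py"
```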
The ipynb files in this dataset hosted on Kaggle do not contain the output cells. If the outputs are required, the full set of ipynbs with the outputs embedded can be obtained from this public GCS bucket: kaggle-meta-kaggle-code-downloads. Note that this is a "requester pays" bucket. This means you will need a GCP account with billing enabled to download. Learn more here: https://cloud.google.com/storage/docs/requester-pays
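For example, a download from a requester-pays bucket with gsutil looks roughly like the following, where `your-gcp-project` is a placeholder for a project of yours with billing enabled and `<path-to-file>` is a placeholder for an object in the bucket:

```
gsutil -u your-gcp-project cp gs://kaggle-meta-kaggle-code-downloads/<path-to-file> .
```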
We love feedback! Let us know in the Discussion tab.
Happy Kaggling!
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates Upper Arlington household income by gender. The dataset can be utilized to understand the gender-based income distribution in Upper Arlington.
The dataset includes the following datasets, when applicable.
Please note: The 2020 1-Year ACS estimates data was not reported by the Census Bureau due to the impact on survey collection and analysis caused by COVID-19. Consequently, median household income data for 2020 is unavailable for large cities (population 65,000 and above).
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
Explore our comprehensive data analysis and visual representations for a deeper understanding of Upper Arlington income distribution by gender. You can refer to it here.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Wages in China increased to 120698 CNY/Year in 2023 from 114029 CNY/Year in 2022. This dataset provides - China Average Yearly Wages - actual values, historical data, forecast, chart, statistics, economic calendar and news.
The Willamette Lowland basin-fill aquifers (hereinafter referred to as the Willamette aquifer) are located in Oregon and southern Washington. The aquifer is composed of unconsolidated deposits of sand and gravel, which are interlayered with clay units. The aquifer thickness varies from less than 100 feet to 800 feet. The aquifer is underlain by basaltic rock. Cities such as Portland, Oregon, depend on the aquifer for public and industrial use (HA 730-H). This product provides source data for the Willamette aquifer framework, including:

Georeferenced images:
1. i_08WLMLWD_bot.tif: Georeferenced figure of altitude contour lines representing the bottom of the Willamette aquifer. The original figure was from Professional Paper 1424-A, Plate 2 (1424-A-P2). The contour lines from this figure were digitized to make the file c_08WLMLWD_bot.shp, and the fault lines were digitized to make f_08WLMLWD_bot.shp.

Extent shapefiles:
1. p_08WLMLWD.shp: Polygon shapefile containing the areal extent of the Willamette aquifer (Willamette_AqExtent). The original shapefile was modified to create the shapefile included in this data release; it was modified to include only the Willamette Lowland portion of the aquifer. The extent file contains no aquifer subunits.

Contour line shapefiles:
1. c_08WLMLWD_bot.shp: Contour line dataset containing altitude values, in feet, referenced to the National Geodetic Vertical Datum of 1929 (NGVD29), across the bottom of the Willamette aquifer. These data were used to create the ra_08WLMLWD_bot.tif raster dataset.

Fault line shapefiles:
1. f_08WLMLWD_bot.shp: Fault line dataset containing fault lines across the bottom of the Willamette aquifer. These data were not used in raster creation but were included as supplementary information.

Altitude raster files:
1. ra_08WLMLWD_top.tif: Altitude raster dataset of the top of the Willamette aquifer. The altitude values are in meters referenced to the North American Vertical Datum of 1988 (NAVD88). The top of the aquifer is assumed to be land surface based on available data and was interpolated from the digital elevation model (DEM) dataset (NED, 100-meter).
2. ra_08WLMLWD_bot.tif: Altitude raster dataset of the bottom of the Willamette aquifer. The altitude values are in meters referenced to NAVD88. This raster was interpolated from the c_08WLMLWD_bot.shp dataset.

Depth raster files:
1. rd_08WLMLWD_top.tif: Depth raster dataset of the top of the Willamette aquifer. The depth values are in meters below land surface (NED, 100-meter). The top of the aquifer is assumed to be land surface based on available data.
2. rd_08WLMLWD_bot.tif: Depth raster dataset of the bottom of the Willamette aquifer. The depth values are in meters below land surface (NED, 100-meter).
A global data set of soil types is available at 0.5-degree latitude by 0.5-degree longitude resolution. There are 106 soil units, based on Zobler's (1986) assessment of the FAO/UNESCO Soil Map of the World. This data set is a conversion of the Zobler 1-degree resolution version to a 0.5-degree resolution. The resolution of the data set was not actually increased. Rather, the 1-degree squares were divided into four 0.5-degree squares with the necessary adjustment of continental boundaries and islands. The computer code used to convert the original 1-degree data to 0.5-degree is provided as a companion file. A JPG image of the data is provided in this document.

The Zobler data (1-degree resolution) as distributed by Webb et al. (1993) [http://www.ngdc.noaa.gov/seg/eco/cdroms/gedii_a/datasets/a12/wr.htm#top] contains two columns, one column for continent and one column for soil type. The Soil Map of the World consists of 9 maps that represent parts of the world. The texture data that Webb et al. (1993) provided allowed for the fact that a soil type in one part of the world may have different properties than the same soil in a different part of the world. This continent-specific information is retained in this 0.5-degree resolution data set, as well as the soil type information, which is the second column.

A code was written (one2half.c) to take the file CONTIZOB.LER distributed by Webb et al. (1993) and simply divide the 1-degree cells into quarters. This code also reads in a land/water file (land.wave) that specifies the cells that are land at 0.5 degrees. The code checks for consistency between the newly quartered map and the land/water map to which the quartered map is to be registered. If there is a discrepancy between the two, an attempt is made to make them consistent using the following logic. If the cell is supposed to be water, it is forced to be water. If it is supposed to be land but was resolved to water at 1 degree, the code looks at the surrounding 8 cells, picks the most frequent soil type, and assigns it to the cell. If there are no surrounding land cells, then it is kept as water in the hope that on the next pass one or more of the surrounding cells might be converted from water to a soil type. The whole map is iterated 5 times. The remaining cells that should be land but couldn't be determined from surrounding cells (mostly islands that are resolved at 0.5 degree but not at 1 degree) are printed out with coordinate information. A temporary map is output with -9 indicating where data is required. This is repeated for the continent code in CONTIZOB.LER as well, and a separate map of the temporary continent codes is produced with -9 indicating required data. A nearly identical code (one2half.c) does the same for the continent codes.

The printout allows one to consult the printed versions of the soil map and look up the soil type with the largest coverage in the 0.5-degree cell. The program manfix.c then goes through the temporary map and prompts for input to correct both the soil codes and the continent codes for the map. This can be done manually or by preparing a file of changes (new_fix.dat) and redirecting stdin. A new complete version of the map is output in the form of the original CONTIZOB.LER file (contizob.half) but four times larger. Original documentation and computer codes prepared by Post et al. (1996) are provided as companion files with this data set.
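The quartering-and-repair logic described above is straightforward to express in array form. Below is a minimal sketch of the idea (our illustration in Python rather than the original one2half.c; the -9 sentinel follows the description above, while array names, integer soil codes, and a boolean land mask are assumptions):

```python
import numpy as np

def quarter_and_fix(soil_1deg, land_half, n_iter=5, missing=-9):
    """Divide 1-degree soil cells into four 0.5-degree cells, then
    reconcile with a 0.5-degree land/water mask (water encoded as 0)."""
    # Step 1: quarter each 1-degree cell (no new information is added).
    half = np.repeat(np.repeat(soil_1deg, 2, axis=0), 2, axis=1)
    # Step 2: force water wherever the 0.5-degree mask says water.
    half[~land_half] = 0
    # Step 3: iteratively fill land cells that were water at 1 degree
    # with the most frequent soil type among the 8 neighbours.
    for _ in range(n_iter):
        for i, j in zip(*np.where(land_half & (half == 0))):
            neigh = half[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
            types = neigh[neigh > 0]
            if types.size:
                half[i, j] = np.bincount(types).argmax()
    # Remaining unresolved land cells need manual fixing (cf. manfix.c).
    half[land_half & (half == 0)] = missing
    return half
```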
Image of 106 global soil types available at 0.5-degree by 0.5-degree resolution. Additional documentation from Zobler's assessment of FAO soil units is available from the NASA Center for Scientific Information.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Corporate Profits in the United States decreased to 3203.60 USD Billion in the first quarter of 2025 from 3312 USD Billion in the fourth quarter of 2024. This dataset provides the latest reported value for - United States Corporate Profits - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was derived by the Bioregional Assessment Programme. The parent datasets are identified in the Lineage statement in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.
This dataset comprises interpreted elevation surfaces and contours for the major Triassic and Upper Permian units of the Galilee Geological Basin.
This dataset was created to provide formation extents for aquifers in the Galilee geological basin.
A Quality Assurance (QA) and validation process was conducted on the original well and bore data to choose wells/bores that are within 25 kilometres of the BA Galilee Region extent.
The QA/Validation process is as follows:
Well data
a. Obtained the Excel file "QPED_July_2013_galilee.xlsx" from GA
b. Based on stratigraphic information in the "BH_costrat" tab, formation names were regularised and simplified based on current naming conventions.
c. Simplified names were added to QPED_July_2013_galilee.xlsx as "Steve_geo" and "Steve_group"
d. Produced a new file "GSQ_Geology.xlsx" containing decimal latitude and longitude, KB elevation, top of unit in metres from KB, top of unit in metres AHD, bottom of unit in metres from KB, bottom of unit in metres AHD, original geology, simplified geology, and simplified Group geology.
i. KB obtained from "BH_wellhist"
ii. Where no KB information was available, i.e. KB=0, sampled the 1S DEM at the well's location to obtain height, with KB=DEM+10. Marked the well as having lower reliability.
iii. Calculated Top_m_AHD = KB - Top_m_KB
iv. Calculated Bottom_m_AHD = KB - Bottom_m_KB (steps ii-iv are sketched in code after the "Well data" steps below)
e. Brought GSQ_Geology.xlsx into ArcGIS
f. Selected wells based on "Steve_geo" field for each model layer to produce a geodatabase for each layer.
i. GSQ_basement_wells
ii. GSQ_top_joe_joe_group
iii. GSQ_top_bandanna_merge
iv. GSQ_rewan_group
v. GSQ_clematis
vi. GSQ_moolyember
g. Additional wells and reinterpreted tops added to appropriate geodatabase based on well completion reports
h. Additional wells added to coverages to help model building process
i. Well_name listed as Fake
ii. Exception being GSQ_top_basement_fake which was created as a separate geodatabase
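As a compact illustration of steps d.ii-iv above, the following sketch (ours; the use of pandas and the exact column names, such as "KB", "Top_m_KB" and "dem_height", are assumptions based on the description) converts KB-relative depths to AHD elevations:

```python
import pandas as pd

# Columns as described for GSQ_Geology.xlsx (names assumed)
df = pd.read_excel("GSQ_Geology.xlsx")

# Step ii: where KB is missing (0), fall back to the sampled 1S DEM
# height + 10 m and flag the well as lower reliability.
no_kb = df["KB"] == 0
df.loc[no_kb, "KB"] = df.loc[no_kb, "dem_height"] + 10
df["lower_reliability"] = no_kb

# Steps iii-iv: elevations in metres AHD from depths below KB.
df["Top_m_AHD"] = df["KB"] - df["Top_m_KB"]
df["Bottom_m_AHD"] = df["KB"] - df["Bottom_m_KB"]
```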
Bore data
a. Obtained QLD_DNRM_GroundwaterDatabaseExtract_20131111 from GA
b. Used files REGISTRATIONS.txt, ELEVATIONS.txt and AQUIFER.txt to build GW_stratigraphy.xlsx
i. Based on RN
ii. Latitude from GIS_LAT (REGISTRATIONS.txt)
iii. Longitude from GIS_LNG (REGISTRATIONS.txt)
iv. Elevation from (ELEVATIONS.txt)
v. FORM_DESC from (AQUIFER.txt)
vi. Top from (AQUIFER.txt)
vii. Bottom from (AQUIFER.txt)
c. Brought GW_stratigraphy.xlsx into ArcGIS
d. Created gw_bores_galilee_dem
i. Sampled 1S DEM to obtain ground level elevation column RASTERVALU
ii. Created column top_m_AHD by RASTERVALU - Top
e. Selected bores based on "FORM_DESC" field for each model layer to produce a geodatabase for each layer.
i. Gw_basement
ii. GW_bores_joe_joe_group
iii. GW_bores_bandanna
iv. Gw_bores_rewan
v. Gw_bores_clematis
vi. Gw_bores_moolyember
Georectified seismic surfaces
a. Extracted interpreted seismic surfaces for base Permian (interpreted as basement) and top Bandanna (in time) from the following seismic surveys
i. Y80A, W81A, Carmichael, Pendine, T81A, Quilpie, Ward and Powell Creek seismic surveys, downloaded from https://qdexguest.deedi.qld.gov.au/portal/site/qdex/search?searchType=general
ii. Brought TIF images into ArcGIS and georectified
iii. Digitised shape of contours and faults into geodatabase
1. Basement_contours and basement_faults
2. bandanna_contours_new_data and bandanna_faults
iv. Added field "contour" to geodatabase
v. Converted contours to depth in "contour" field based on well and bore data (top_m_AHD) and contour progression
vi. Used the shape and depth derived from OZ SEEBASE to help add additional contours and faults to the basement and bandanna datasets
Additional contour and fault surfaces were built, derived from underlying surfaces and well/bore data
a. Joejoe_contours and joejoe_faults
b. Rewan_contour_clip (used bandanna_faults as fault coverage)
c. Clematis_contour and clematis_faults
d. Moolyember_contour (used clematis_faults as fault coverage)
Surface geology
a. Extracted surface geology from QUEENSLAND GEOLOGY_AUGUST_2012 using Galilee BA region boundary with 25 kilometre boundary to form geodatabase QLD_geology_galilee
b. Selected relevant surface geology from QLD_geology_galilee based on field "Name" for each model layer and created new geodatabase layers
i. Basement_geology: Argentine Metamorphics,Running River Metamorphics,Charters Towers Metamorphics; Bimurra Volcanics, Foyle Volcanics, Mount Wyatt Formation, Saint Anns Formation, Silver Hills Volcanics, Stones Creek Volcanics; Bulliwallah Formation, Ducabrook Formation, Mount Rankin Formation, Natal Formation, Star of Hope Formation; Cape River Metamorphics; Einasleigh Metamorphics; Gem Park Granite; Macrossan Province Cambrian-Ordovician intrusives; Macrossan Province Ordovician-Silurian intrusives; Macrossan Province Ordovician intrusives; Mount Formartine, unnamed plutonic units; Pama Province Silurian-Devonian intrusives; Seventy Mile Range Group; and Kirk River beds, Les Jumelles beds.
ii. Joe_joe_geology: Joe Joe Group
iii. Galilee_permian_geology: Back Creek Group, Betts Creek Group, Blackwater Group
iv. Rewan_geology: Rewan Group
1. Later also made dunda_beds_geology to be included in Rewan model: Dunda beds
v. Clematis_geology: Clematis Group
1. Later also made warang_sandstone_geology to be included in Clematis model: Warang Sandstone
vi. Moolyember_surface_geology: Moolyember Formation
DEM for each model layer
a. Using the surface geology geodatabase extent, extracted a grid from dem_s_1s to represent the top of the model layer at the surface
i. Basement_dem
ii. Joejoe_dem
iii. Bandanna_dem
iv. Rewan_dem and dunda_dem
v. Clematis_dem and warang_dem
vi. Moolyember_surface_dem
b. Used the Contour tool in ArcGIS to obtain a 25-metre contour geodatabase from the relevant model DEM
i. Basement_dem_contours
ii. Joejoe_dem_contours
iii. Bandanna_dem_contours
iv. Rewan_dem_contours and dunda_dem_contours
v. Clematis_dem_contours and warang_dem_contours
vi. Moolyember_dem_contours
c. For the purpose of guiding the model building process, additional fields were added to each DEM contour geodatabase based on average thicknesses derived from groundwater bores and petroleum wells.
i. Basement_dem_contours: Joejoe, bandanna, rewan, clematis, moolyember
ii. Joejoe_dem_contours: basement, bandanna
iii. Bandanna_dem_contours: joejoe, rewan
iv. Rewan_dem_contours and dunda_dem_contours: clematis, rewan
v. Clematis_dem_contours and warang_dem_contours: moolyember, rewan
vi. Moolyember_dem_contours: clematis
The model building process is as follows:
Used the Topo to Raster tool to create surfaces based on the following rules (a code sketch of this step follows the input lists below)
a. Environment
i. Extent
1. Top: -19.7012030024424
2. Right: 148.891511819054
3. Bottom: -27.5812030024424
4. Left: 139.141511819054
ii. Output cell size: 0.01 degrees
iii. Drainage enforcement: No_enforce
b. Input
i. Basement
1. Basement_dem_contour; field - contour; type - contour
2. Joejoe_dem_contour; field - basement; type - contour
3. Basement_contour; field - contour; type - contour
4. GSQ_basement_wells; field - top_m_AHD; type - point elevation
5. GW_basement; field - top_m_AHD; type - point elevation
6. GSQ_top_basement_fake; field - top_m_AHD; type - point elevation
7. Basement_faults; type - cliff
ii. Joe Joe Group
1. Joejoe_dem_contour; field - basement; type - contour
2. Basement_dem_contour; field - joejoe; type - contour
3. permian_dem_contour; field - joejoe; type - contour
4. joejoe_contour; field - joejoe; type - contour
5. GSQ_top_joejoe_group; field - top_m_AHD; type - point elevation
6. GW_bores_joe_joe_group; field - top_m_AHD; type - point elevation
7. joejoe_faults; type - cliff
iii. Bandanna Group
1. Permian_dem_contour; field - contour; type - contour
2. Joejoe_dem_contour; field - bandanna; type - contour
3. Rewan_dem_contour; field - bandanna; type - contour
4. Dunda_dem_contour; field - bandanna; type - contour
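For orientation, the basement step above could be expressed in arcpy roughly as follows. This is a minimal sketch, assuming ArcGIS with the Spatial Analyst extension; the layer and field names come from the basement input list, while the output name and the exact call style are our assumptions:

```python
import arcpy
from arcpy.sa import (TopoToRaster, TopoContour,
                      TopoPointElevation, TopoCliff)

arcpy.CheckOutExtension("Spatial")

# Environment: extent from the rules above (left, bottom, right, top)
arcpy.env.extent = arcpy.Extent(139.141511819054, -27.5812030024424,
                                148.891511819054, -19.7012030024424)

# Inputs for the basement surface
inputs = [
    TopoContour([["Basement_dem_contour", "contour"],
                 ["Joejoe_dem_contour", "basement"],
                 ["Basement_contour", "contour"]]),
    TopoPointElevation([["GSQ_basement_wells", "top_m_AHD"],
                        ["GW_basement", "top_m_AHD"],
                        ["GSQ_top_basement_fake", "top_m_AHD"]]),
    TopoCliff(["Basement_faults"]),
]

# Output cell size: 0.01 degrees; drainage enforcement: No_enforce
basement_surface = TopoToRaster(inputs, cell_size=0.01, enforce="NO_ENFORCE")
basement_surface.save("basement_top")  # output name assumed
```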
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The Breast Cancer Dataset hosted on Kaggle is a powerful resource for researchers, data scientists, and machine learning enthusiasts looking to explore and develop predictive models for breast cancer diagnosis. This dataset, accessible via Kaggle, is designed for binary classification tasks to predict whether a breast tumor is benign or malignant. It provides a rich collection of features derived from digitized images of fine needle aspirates (FNA) of breast masses, making it an essential tool for advancing healthcare analytics and computational pathology. Below is a comprehensive, human-crafted description of the dataset, complete with examples and key highlights to make it engaging and informative.
The dataset originates from the Breast Cancer Wisconsin (Diagnostic) Data Set, a widely used benchmark in machine learning for medical diagnostics. It contains detailed measurements of cell nuclei from breast tissue samples, enabling the classification of tumors as either benign (non-cancerous) or malignant (cancerous). This dataset is particularly valuable for developing and testing machine learning models, such as logistic regression, support vector machines, or deep neural networks, to aid in early and accurate breast cancer detection.
Purpose: Binary classification to predict tumor type (benign or malignant). Source: University of Wisconsin, provided through Kaggle. Link: Breast Cancer Dataset on Kaggle. Application: Ideal for medical research, machine learning model development, and educational purposes.
##### Dataset Structure
The dataset comprises 569 instances (rows) and 32 columns, including an ID column, a diagnosis label, and 30 numerical features describing cell nuclei characteristics. Each instance represents a single breast mass sample, with features computed from digitized FNA images.
Key Columns:
- ID: A unique identifier for each sample (e.g., 842302).
- Diagnosis: The target variable, labeled as:
  - M (Malignant): Indicates a cancerous tumor.
  - B (Benign): Indicates a non-cancerous tumor.
Features (30 columns): Numerical measurements of cell nuclei, such as radius, texture, perimeter, and area, derived from image analysis.
The 30 features are grouped into three main categories based on the characteristics of cell nuclei:
- Mean: Average values of measurements (e.g., mean radius, mean texture).
- Standard Error (SE): Variability of measurements (e.g., standard error of radius, standard error of area).
- Worst: Largest (worst) values of measurements (e.g., worst radius, worst smoothness).
Each category includes 10 specific measurements:
Example Data Point: Here’s a simplified example of a single row in the dataset:
| ID | Diagnosis | Radius_mean | Texture_mean | Perimeter_mean | Area_mean | Smoothness_mean | ... |
|:---|:----------|:------------|:-------------|:---------------|:----------|:----------------|:----|
| 842302 | M | 17.99 | 10.38 | 122.80 | 1001.0 | 0.11840 | ... |
Interpretation: This sample (ID 842302) is malignant (M), with a mean radius of 17.99 units, a mean texture of 10.38, and so on. The remaining 27 columns provide additional measurements (e.g., standard error and worst values).
Key Highlights
- Balanced Classes: The dataset includes 357 benign and 212 malignant cases, offering a relatively balanced distribution for training robust models.
- No Missing Values: The dataset is clean and preprocessed, with no missing or null values, making it ready for immediate analysis.
- High Dimensionality: With 30 numerical features, the dataset supports complex modeling techniques, including feature selection and dimensionality reduction.
- Real-World Impact: The dataset is widely used in research to improve diagnostic accuracy, contributing to early breast cancer detection and better patient outcomes.
- Open Access: Freely available on Kaggle, encouraging collaboration and innovation in the data science community.
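Since this is the Wisconsin Diagnostic data, which also ships with scikit-learn, a baseline classifier takes only a few lines. A minimal sketch (ours; it uses the scikit-learn copy of the data rather than the Kaggle CSV):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# 569 samples, 30 features; target: 0 = malignant, 1 = benign
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Scale features, then fit a logistic regression baseline
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
print(f"Test accuracy: {clf.score(X_test, y_test):.3f}")
```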
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
There are multiple well-recognized and peer-reviewed global datasets that can be used to assess water availability and water pollution. Each of these datasets is based on different inputs, modeling approaches, and assumptions. Therefore, in SBTN Step 1: Assess and Step 2: Interpret & Prioritize, companies are required to consult different global datasets for a robust and comprehensive State of Nature (SoN) assessment for water availability and water pollution.
To streamline this process, WWF, the World Resources Institute (WRI), and SBTN worked together to develop two ready-to-use unified layers of SoN – one for water availability and one for water pollution – in line with the Technical Guidance for Steps 1: Assess and Step 2: Interpret & Prioritize. The result is a single file (shapefile) containing the maximum value both for water availability and for water pollution, as well as the datasets’ raw values (as references). This data is publicly available for download from this repository.
These unified layers will make it easier for companies to implement a robust approach, and they will lead to more aligned and comparable results between companies. A temporary App is available at https://arcg.is/0z9mOD0 to help companies assess the SoN for water availability and water pollution around their operations and supply chain locations. In the future, these layers will become available both in the WRI’s Aqueduct and in the WWF Risk Filter Suite.
For the SoN for water availability, the following datasets were considered:
Baseline water stress (Hofste et al. 2019), data available here
Water depletion (Brauman et al. 2016), data available here
Blue water scarcity (Mekonnen & Hoekstra 2016), data upon request to the authors
For the SoN for water pollution, the following datasets were considered:
Coastal Eutrophication Potential (Hofste et al. 2019), data available here
Nitrate-Nitrite Concentration (Damania et al. 2019), data available here
Periphyton Growth Potential (McDowell et al. 2020), data available here
In general, the same processing steps were performed for all datasets:
Compute the area-weighted median of each dataset at a common spatial resolution, i.e. HydroSHEDS HydroBasins Level 6 in this case.
Classify datasets to a common range by reclassifying raw values to a 1-5 scale, where 0 (zero) was used for cells or features with no data. See the documentation for more details.
Identify the maximum value between the classified datasets, separately, for Water Availability and for Water Pollution.
For transparency and reproducibility, the code is publicly available at https://github.com/rafaexx/sbtn-SoN-water
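As a rough illustration of the three processing steps, here is a small sketch (our own, not code from that repository; the column names and class thresholds are illustrative):

```python
import numpy as np
import pandas as pd

def area_weighted_median(values, areas):
    """Median of raw values weighted by their overlap area within a basin."""
    v = np.asarray(values, dtype=float)
    w = np.asarray(areas, dtype=float)
    order = np.argsort(v)
    v, w = v[order], w[order]
    cum = np.cumsum(w) / w.sum()
    return v[np.searchsorted(cum, 0.5)]

def classify_1_to_5(raw, thresholds):
    """Reclassify raw values to classes 1-5; no data (NaN) becomes 0."""
    raw = np.asarray(raw, dtype=float)
    cls = np.digitize(raw, thresholds) + 1  # 4 thresholds -> classes 1..5
    return np.where(np.isnan(raw), 0, cls)

# Step 3: per-basin maximum across the classified layers, e.g. baseline
# water stress, water depletion and blue water scarcity for availability.
basins = pd.DataFrame({"bws_class": [3, 5],
                       "wdp_class": [4, 2],
                       "bwsc_class": [1, 0]})
basins["water_availability_max"] = basins.max(axis=1)
```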
By Bob Burggraaf [source]
This dataset reveals the faces of America's urbanization by providing the total population of USA cities in 2015. Through this dataset, you can explore and analyze the populations of cities across the United States. The dataset has undergone a series of data-cleaning steps, such as cleaning up city names and joining all cities into one formatted table, to make it easy to use with visualization tools. This allows you to quickly visualize various aspects, like population trends or city demographics, and gain an informative understanding of how the country is growing. With this knowledge, engaging in discussions related to city-planning recommendations is easier than ever!
How to Use this Dataset
This dataset contains information about the population of the major cities in the United States. The columns in this dataset include city, summary level, place Fips code, state, state Fips code and total population.
Using this dataset, you can explore a variety of topics related to urbanization, including population growth over time and comparative analysis between cities. You can also use it to study specific social or demographic trends such as age distribution or race/ethnicity, among other key metrics. With the right analysis, you could even predict which areas may experience significant growth or decline in their populations over time. Lastly, if you want to compare American cities with other global metropolises, you could easily create aggregate tables that include those data points too!
- Use the data to calculate and demonstrate population growth for cities in the USA over time, providing a strong visual of population changes such as migration and birth/death rates, and even showing how urbanization is playing a role in the US's population change.
- Analyze correlations between population size and economic indicators (such as GDP) across various cities to examine job opportunities or comparative housing prices.
- Compare different city populations by state to contrast disparate areas of the country and gauge how strongly citizens of one state may be attracted to another based on economic advantages or cultural ties.
If you use this dataset in your research, please credit the original authors. Data Source
License: Dataset copyright by authors
- You are free to:
  - Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
  - Adapt - remix, transform, and build upon the material for any purpose, even commercially.
- You must:
  - Give appropriate credit - Provide a link to the license, and indicate if changes were made.
  - ShareAlike - You must distribute your contributions under the same license as the original.
  - Keep intact - all notices that refer to this license, including copyright notices.
File: Total_Population_By_City_Acs_2015_5_E_AgeSex.csv

| Column name | Description |
|:-----------------|:------------------------------------------------------------------------|
| City | Name of the city. (String) |
| Summary_Level | Level of detail of the data. (Integer) |
| Place_Fips | Federal Information Processing Standard code for the city. (Integer) |
| State | Name of the state. (String) |
| State_Fips | Federal Information Processing Standard code for the state. (Integer) |
| Total_Population | Total population of the city. (Integer) |
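For a quick start, here is a short sketch of loading the file and aggregating by state (ours; it assumes the CSV sits in the working directory and uses the column names from the table above):

```python
import pandas as pd

df = pd.read_csv("Total_Population_By_City_Acs_2015_5_E_AgeSex.csv")

# Total 2015 population by state, largest first
by_state = (df.groupby("State")["Total_Population"]
              .sum()
              .sort_values(ascending=False))
print(by_state.head(10))
```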
If you use this dataset in your research, please credit Bob Burggraaf.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For more details and the most up-to-date information please consult our project page: https://kainmueller-lab.github.io/fisbe.
Instance segmentation of neurons in volumetric light microscopy images of nervous systems enables groundbreaking research in neuroscience by facilitating joint functional and morphological analyses of neural circuits at cellular resolution. Yet said multi-neuron light microscopy data exhibits extremely challenging properties for the task of instance segmentation: Individual neurons have long-ranging, thin filamentous and widely branching morphologies, multiple neurons are tightly inter-weaved, and partial volume effects, uneven illumination and noise inherent to light microscopy severely impede local disentangling as well as long-range tracing of individual neurons. These properties reflect a current key challenge in machine learning research, namely to effectively capture long-range dependencies in the data. While respective methodological research is buzzing, to date methods are typically benchmarked on synthetic datasets. To address this gap, we release the FlyLight Instance Segmentation Benchmark (FISBe) dataset, the first publicly available multi-neuron light microscopy dataset with pixel-wise annotations. In addition, we define a set of instance segmentation metrics for benchmarking that we designed to be meaningful with regard to downstream analyses. Lastly, we provide three baselines to kick off a competition that we envision to both advance the field of machine learning regarding methodology for capturing long-range data dependencies, and facilitate scientific discovery in basic neuroscience.
We provide detailed documentation of our dataset, following the Datasheet for Datasets questionnaire:
Our dataset originates from the FlyLight project, where the authors released a large image collection of nervous systems of ~74,000 flies, available for download under CC BY 4.0 license.
Each sample consists of a single 3d MCFO image of neurons of the fruit fly.
For each image, we provide a pixel-wise instance segmentation for all separable neurons.
Each sample is stored as a separate zarr file (zarr is a file storage format for chunked, compressed, N-dimensional arrays based on an open-source specification).
The image data ("raw") and the segmentation ("gt_instances") are stored as two arrays within a single zarr file.
The segmentation mask for each neuron is stored in a separate channel.
The order of dimensions is CZYX.
We recommend working in a virtual environment, e.g., by using conda:
conda create -y -n flylight-env -c conda-forge python=3.9
conda activate flylight-env
pip install zarr
import zarr
# open one sample ("sample.zarr" is a placeholder for a downloaded file)
raw = zarr.open("sample.zarr", mode='r', path="volumes/raw")
seg = zarr.open("sample.zarr", mode='r', path="volumes/gt_instances")
# optional: load into memory as numpy arrays
import numpy as np
raw_np = np.array(raw)
Zarr arrays are read lazily on-demand.
Many functions that expect numpy arrays also work with zarr arrays.
Optionally, the arrays can also explicitly be converted to numpy arrays.
We recommend using napari to view the image data.
pip install "napari[all]"
import zarr, sys, napari
raw = zarr.load(sys.argv[1], mode='r', path="volumes/raw")
gts = zarr.load(sys.argv[1], mode='r', path="volumes/gt_instances")
viewer = napari.Viewer(ndisplay=3)
for idx, gt in enumerate(gts):
    viewer.add_labels(
        gt, rendering='translucent', blending='additive', name=f'gt_{idx}')
viewer.add_image(raw[0], colormap="red", name='raw_r', blending='additive')
viewer.add_image(raw[1], colormap="green", name='raw_g', blending='additive')
viewer.add_image(raw[2], colormap="blue", name='raw_b', blending='additive')
napari.run()
python view_data.py <path-to-sample-zarr>
For more information on our selected metrics and formal definitions please see our paper.
To showcase the FISBe dataset together with our selection of metrics, we provide evaluation results for three baseline methods, namely PatchPerPix (ppp), Flood Filling Networks (FFN) and a non-learnt, application-specific color clustering from Duan et al.
For detailed information on the methods and the quantitative results please see our paper.
The FlyLight Instance Segmentation Benchmark (FISBe) dataset is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
If you use FISBe in your research, please use the following BibTeX entry:
@misc{mais2024fisbe,
title = {FISBe: A real-world benchmark dataset for instance
segmentation of long-range thin filamentous structures},
author = {Lisa Mais and Peter Hirsch and Claire Managan and Ramya
Kandarpa and Josef Lorenz Rumberger and Annika Reinke and Lena
Maier-Hein and Gudrun Ihrke and Dagmar Kainmueller},
year = 2024,
eprint = {2404.00130},
archivePrefix = {arXiv},
primaryClass = {cs.CV}
}
We thank Aljoscha Nern for providing unpublished MCFO images as well as Geoffrey W. Meissner and the entire FlyLight Project Team for valuable discussions.
P.H., L.M. and D.K. were supported by the HHMI Janelia Visiting Scientist Program.
This work was co-funded by Helmholtz Imaging.
There have been no changes to the dataset so far.
All future changes will be listed on the changelog page.
If you would like to contribute, have encountered any issues, or have any suggestions, please open an issue for the FISBe dataset in the accompanying GitHub repository.
All contributions are welcome!
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Dominican Republic number dataset can help businesses profit in many ways. It is a valuable directory that you can buy from us at minimal cost. It creates many business opportunities, because the country is strong in multiple sectors, and it can make businesses more visible, competitive, and effective. For instance, the dataset opens new opportunities to do business in the places you select. Vendors can run sales promotions and earn from these leads, reaching a chosen group of clients quickly. Overall, it supports the long-term success of your company or business.

Dominican Republic phone data is a powerful way to connect with many clients, and it can help you get speedy feedback from the public. Our expert team compiles it carefully according to your needs. The List To Data website is a good source for up-to-date sales leads, so check out the packages, find the one that works best for you, and watch your business succeed. The phone data is well suited to sending text messages or making phone calls to potential new clients to close deals. With it, you can easily reach people in this area and get positive results from your marketing. The library retains millions of phone numbers from different businesses and people.

The Dominican Republic phone number list can transform your business into a profitable venture. Finding real contacts is very important, and the list helps you reach a genuine audience, saving you time. List To Data helps you connect with many people quickly and boosts your marketing efforts. The phone number list is also a good source of earnings on B2B and B2C platforms. The Dominican Republic's economy is strong and diverse, with important sectors like technology, finance, and tourism, and it continues to grow. In short, consider buying our contact data to earn substantial profit in your targeted locations.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data in this dataset were collected as a result of a survey of Latvian society (2021) aimed at identifying high-value data sets for Latvia, i.e. data sets that, in the view of Latvian society, could create value for the Latvian economy and society. The survey was created for both individuals and businesses. It is being made public both to act as supplementary data for the paper "Towards enrichment of the open government data: a stakeholder-centered determination of High-Value Data sets for Latvia" (author: Anastasija Nikiforova, University of Latvia) and to allow other researchers to use these data in their own work.
The survey was distributed among Latvian citizens and organisations. The structure of the survey is available in the supplementary file (see Survey_HighValueDataSets.odt).
Description of the data in this data set: structure of the survey and pre-defined answers (if any)

1. Have you ever used open (government) data? - {(1) yes, once; (2) yes, there has been a little experience; (3) yes, continuously; (4) no, it wasn't needed for me; (5) no, have tried but have failed}
2. How would you assess the value of open government data that are currently available for your personal use or your business? - 5-point Likert scale, where 1 - any to 5 - very high
3. If you ever used the open (government) data, what was the purpose of using them? - {(1) have not had to use; (2) to identify the situation for an object or an event (e.g. Covid-19 current state); (3) data-driven decision-making; (4) for the enrichment of my data, i.e. by supplementing them; (5) for better understanding of decisions of the government; (6) awareness of governments' actions (increasing transparency); (7) forecasting (e.g. trends etc.); (8) for developing data-driven solutions that use only the open data; (9) for developing data-driven solutions, using open data as a supplement to existing data; (10) for training and education purposes; (11) for entertainment; (12) other (open-ended question)}
4. What category(ies) of "high value datasets" is, in your opinion, able to create added value for society or the economy? - {(1) Geospatial data; (2) Earth observation and environment; (3) Meteorological; (4) Statistics; (5) Companies and company ownership; (6) Mobility}
5. To what extent do you think the current data catalogue of Latvia's Open data portal corresponds to the needs of data users/consumers? - 10-point Likert scale, where 1 - no data are useful, but 10 - fully correspond, i.e. all potentially valuable datasets are available
6. Which of the current data categories in Latvia's open data portal, in your opinion, most corresponds to the "high value dataset"? - {(1) Foreign affairs; (2) business economy; (3) energy; (4) citizens and society; (5) education and sport; (6) culture; (7) regions and municipalities; (8) justice, internal affairs and security; (9) transport; (10) public administration; (11) health; (12) environment; (13) agriculture, food and forestry; (14) science and technologies}
7. Which of them form your TOP-3? - {(1) Foreign affairs; (2) business economy; (3) energy; (4) citizens and society; (5) education and sport; (6) culture; (7) regions and municipalities; (8) justice, internal affairs and security; (9) transport; (10) public administration; (11) health; (12) environment; (13) agriculture, food and forestry; (14) science and technologies}
8. How would you assess the value of the following data categories?
8.1. sensor data - 5-point Likert scale, where 1 - not needed to 5 - highly valuable
8.2. real-time data - 5-point Likert scale, where 1 - not needed to 5 - highly valuable
8.3. geospatial data - 5-point Likert scale, where 1 - not needed to 5 - highly valuable
9. What would be these datasets? I.e. what (sub)topic could these data be associated with? - open-ended question
10. Which of the data sets currently available could be valuable and useful for society and businesses? - open-ended question
11. Which of the data sets currently NOT available in Latvia's open data portal could, in your opinion, be valuable and useful for society and businesses? - open-ended question
12. How did you define them? - {(1) subjective opinion; (2) experience with data; (3) filtering out the most popular datasets, i.e. basing them on public opinion; (4) other (open-ended question)}
13. How high could the value of these data sets be for you or your business? - 5-point Likert scale, where 1 - not valuable, 5 - highly valuable
14. Do you represent any company/organization (are you working anywhere)? (if "yes", please fill out the survey twice, i.e. as an individual user AND a company representative) - {yes; no; I am an individual data user; other (open-ended)}
15. What industry/sector does your company/organization belong to? (if you do not work at the moment, please choose the last option) - {information and communication services; financial and insurance activities; accommodation and catering services; education; real estate operations; wholesale and retail trade, repair of motor vehicles and motorcycles; transport and storage; construction; water supply, waste water, waste management and recovery; electricity, gas supply, heating and air conditioning; manufacturing industry; mining and quarrying; agriculture, forestry and fisheries; professional, scientific and technical services; operation of administrative and service services; public administration and defence, compulsory social insurance; health and social care; art, entertainment and recreation; activities of households as employers; CSO/NGO; I am not a representative of any company}
16. To which category does your company/organization belong in terms of its size? - {small; medium; large; self-employed; I am not a representative of any company}
17. What is the age group that you belong to? (if you are an individual user, not a company representative) - {11..15, 16..20, 21..25, 26..30, 31..35, 36..40, 41..45, 46+, "do not want to reveal"}
18. Please indicate your education or the scientific degree that corresponds most to you (if you are an individual user, not a company representative) - {master's degree; bachelor's degree; Dr. and/or PhD; student (bachelor level); student (master level); doctoral candidate; pupil; do not want to reveal these data}
Format of the files: .xls, .csv (for the first spreadsheet only), .odt
Licenses or restrictions: CC-BY
The Arbuckle-Simpson aquifer covers an area of about 800 square miles in the Arbuckle Mountains and Arbuckle Plains of south-central Oklahoma. The aquifer is in the Central Lowland Physiographic Province and is composed of the Simpson and Arbuckle Groups of Ordovician and Cambrian age. The aquifer is as thick as 9,000 feet in some areas. The aquifer provides relatively small, but important, amounts of water depended on for public supply, agricultural, and industrial use (HA 730-E). This product provides source data for the Arbuckle-Simpson aquifer framework, including:

Georeferenced images:
1. i_46ARBSMP_bot.tif: Digitized figure of depth contour lines below land surface representing the base of fresh water in the Arbuckle-Simpson aquifer. The base of fresh water is considered to be the bottom of the Arbuckle-Simpson aquifer. The original figure is from the "Reconnaissance of the water resources of the Ardmore and Sherman Quadrangles, southern Oklahoma" report, map HA-3, page 2, prepared by the Oklahoma Geological Survey in cooperation with the U.S. Geological Survey (HA3_P2).

Extent shapefiles:
1. p_46ABKSMP.shp: Polygon shapefile containing the areal extent of the Arbuckle-Simpson aquifer (Arbuckle-Simpson_AqExtent). The extent file contains no aquifer subunits.

Contour line shapefiles:
1. c_46ABKSMP_bot.shp: Contour line dataset containing depth values, in feet below land surface, across the bottom of the Arbuckle-Simpson aquifer. This dataset is a digitized version of the map published in HA3_P2 and was used to create the rd_46ABKSMP_bot.tif raster dataset. The map generalized depth values into zoned areas with associated ranges of depth. The edge of each zone was treated as the minimum value of the assigned range, thus creating the depth contour lines. This interpretation was favorable as it allowed for the creation of the resulting raster. This map was used because more detailed point or contour data for the area are unavailable.

Altitude raster files:
1. ra_46ABKSMP_top.tif: Altitude raster dataset of the top of the Arbuckle-Simpson aquifer. The altitude values are in meters referenced to the North American Vertical Datum of 1988 (NAVD88). The top of the aquifer is assumed to be at land surface (NED, 100-meter) based on available data. This raster was interpolated from the Digital Elevation Model (DEM) dataset (NED, 100-meter).
2. ra_46ABKSMP_bot.tif: Altitude raster dataset of the bottom of the Arbuckle-Simpson aquifer. The altitude values are in meters referenced to NAVD88.

Depth raster files:
1. rd_46ABKSMP_top.tif: Depth raster dataset of the top of the Arbuckle-Simpson aquifer. The depth values are in meters below land surface (NED, 100-meter). The top of the aquifer is assumed to be at land surface (NED, 100-meter) based on available data.
2. rd_46ABKSMP_bot.tif: Depth raster dataset of the bottom of the Arbuckle-Simpson aquifer. The depth values are in meters below land surface (NED, 100-meter). This raster was interpolated from the contour line dataset c_46ABKSMP_bot.shp.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Best Fiends is a dataset for object detection tasks - it contains Gold annotations for 364 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).