73 datasets found
  1. Income Distribution by Quintile: Mean Household Income in Middle Inlet, Wisconsin

    • neilsberg.com
    csv, json
    Updated Jan 11, 2024
    Cite
    Neilsberg Research (2024). Income Distribution by Quintile: Mean Household Income in Middle Inlet, Wisconsin [Dataset]. https://www.neilsberg.com/research/datasets/94c785c2-7479-11ee-949f-3860777c1fe6/
    Explore at:
    json, csv (available download formats)
    Dataset updated
    Jan 11, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Wisconsin, Middle Inlet
    Variables measured
    Income Level, Mean Household Income
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across income quintiles (mentioned above) following an initial analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). For additional information about these estimations, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents the mean household income for each of the five quintiles in Middle Inlet, Wisconsin, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.

    Key observations

    • Income disparities: The mean income of the lowest quintile (the 20% of households with the lowest income) is $21,360, while the mean income of the highest quintile (the 20% of households with the highest income) is $162,915. The top earners therefore make about 8 times as much as the lowest earners.
    • Top 5%: The mean household income of the wealthiest households (the top 5%) is $282,509, roughly 1.7 times the highest-quintile mean and 13.2 times the lowest-quintile mean.

    [Chart] Mean household income by quintiles in Middle Inlet, Wisconsin (in 2022 inflation-adjusted dollars): https://i.neilsberg.com/ch/middle-inlet-wi-mean-household-income-by-quintiles.jpeg

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Income Levels:

    • Lowest Quintile
    • Second Quintile
    • Third Quintile
    • Fourth Quintile
    • Highest Quintile
    • Top 5 Percent

    Variables / Data Columns

    • Income Level: The income bracket, one of the six levels listed above.
    • Mean Household Income: Mean household income for that income level, in 2022 inflation-adjusted dollars (a loading sketch follows this list).
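    A minimal sketch of reading the CSV export and reproducing the quintile ratio from the key observations above; the local file name is an assumption, while the column names and row labels come from the lists above:

        import pandas as pd

        # Assumed local file name for this dataset's CSV export.
        df = pd.read_csv("middle-inlet-wi-mean-household-income-by-quintiles.csv")
        income = df.set_index("Income Level")["Mean Household Income"]

        ratio = income["Highest Quintile"] / income["Lowest Quintile"]
        print(f"Highest quintile mean is {ratio:.1f}x the lowest")  # about 7.6x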

    Good to know

    Margin of Error

    The figures in this dataset are estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Middle Inlet town median household income, which you can refer to here.

  2. Income of individuals by age group, sex and income source, Canada, provinces and selected census metropolitan areas

    • www150.statcan.gc.ca
    • ouvert.canada.ca
    • +2 more
    Updated May 1, 2025
    Cite
    Government of Canada, Statistics Canada (2025). Income of individuals by age group, sex and income source, Canada, provinces and selected census metropolitan areas [Dataset]. http://doi.org/10.25318/1110023901-eng
    Explore at:
    Dataset updated
    May 1, 2025
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Income of individuals by age group, sex and income source, Canada, provinces and selected census metropolitan areas, annual.

  3. High income tax filers in Canada

    • www150.statcan.gc.ca
    • open.canada.ca
    Updated Oct 28, 2024
    Cite
    Government of Canada, Statistics Canada (2024). High income tax filers in Canada [Dataset]. http://doi.org/10.25318/1110005501-eng
    Explore at:
    Dataset updated
    Oct 28, 2024
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    This table presents income shares, thresholds, tax shares, and total counts of individual Canadian tax filers, with a focus on high-income individuals (95% income threshold, 99% threshold, etc.). Income thresholds are based on national threshold values, regardless of the selected geography; for example, the number of Nova Scotians in the top 1% is calculated as the number of tax-filing Nova Scotians whose total income exceeded the 99% national income threshold. Different definitions of income are available in the table, namely market, total, and after-tax income, each with and without capital gains.

  4. Income Distribution by Quintile: Mean Household Income in Eugene, OR // 2025 Edition

    • neilsberg.com
    csv, json
    Updated Mar 3, 2025
    Cite
    Neilsberg Research (2025). Income Distribution by Quintile: Mean Household Income in Eugene, OR // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/eugene-or-median-household-income/
    Explore at:
    json, csv (available download formats)
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Eugene, Oregon
    Variables measured
    Income Level, Mean Household Income
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It delineates income distributions across income quintiles (mentioned above) following an initial analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). For additional information about these estimations, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents the mean household income for each of the five quintiles in Eugene, OR, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.

    Key observations

    • Income disparities: The mean income of the lowest quintile (the 20% of households with the lowest income) is $13,271, while the mean income of the highest quintile (the 20% of households with the highest income) is $246,971. The top earners therefore make about 19 times as much as the lowest earners.
    • Top 5%: The mean household income of the wealthiest households (the top 5%) is $452,934, roughly 1.8 times the highest-quintile mean and 34.1 times the lowest-quintile mean.
    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Income Levels:

    • Lowest Quintile
    • Second Quintile
    • Third Quintile
    • Fourth Quintile
    • Highest Quintile
    • Top 5 Percent

    Variables / Data Columns

    • Income Level: The income bracket, one of the six levels listed above.
    • Mean Household Income: Mean household income for that income level, in 2023 inflation-adjusted dollars.

    Good to know

    Margin of Error

    The figures in this dataset are estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Eugene median household income, which you can refer to here.

  5. Multi-Camera Action Dataset (MCAD)

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, json, txt
    Updated Jan 24, 2020
    Cite
    Wenhui Li; Yongkang Wong; An-An Liu; Yang Li; Yu-Ting Su; Mohan Kankanhalli (2020). Multi-Camera Action Dataset (MCAD) [Dataset]. http://doi.org/10.5281/zenodo.884592
    Explore at:
    application/gzip, json, txt (available download formats)
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Wenhui Li; Yongkang Wong; An-An Liu; Yang Li; Yu-Ting Su; Mohan Kankanhalli
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Action recognition has received increasing attention from the computer vision and machine learning communities over the last decades. The recognition task has evolved from single-view recordings in controlled laboratory environments to unconstrained environments (i.e., surveillance footage or user-generated videos). Recent work has also focused on other aspects of the action recognition problem, such as cross-view classification, cross-domain learning, multi-modality learning, and action localization. Despite this large variety of studies, few works explore the open-set and open-view classification problems, which are inherent properties of action recognition: a well-designed algorithm should robustly identify an unfamiliar action as "unknown" and achieve similar performance across sensors with similar fields of view. The Multi-Camera Action Dataset (MCAD) is designed to evaluate the open-view classification problem in a surveillance environment.

    Unlike common action datasets, MCAD uses a total of five cameras of two types (Static and PTZ) to record actions: three static cameras (Cam04, Cam05, and Cam06) with a fisheye effect, and two Pan-Tilt-Zoom (PTZ) cameras (PTZ04 and PTZ06). The static cameras have a resolution of 1280×960 pixels, while the PTZ cameras have a resolution of 704×576 pixels and a smaller field of view. Moreover, the illumination environment is not controlled: recordings were made under two contrasting conditions (daytime and nighttime), which makes the dataset more challenging than many datasets with strongly controlled illumination. The distribution of the cameras is shown in the accompanying figure.

    We identified 18 single-person daily actions, with and without objects, inherited from the KTH, IXMAS, and TRECVID datasets, among others. The list and definitions of the actions are shown in the accompanying table. The actions can be divided into four types: micro actions without an object (action IDs 01, 02, 05) and with an object (action IDs 10, 11, 12, 13), and intense actions without an object (action IDs 03, 04, 06, 07, 08, 09) and with an object (action IDs 14, 15, 16, 17, 18). We recruited a total of 20 human subjects. Each subject repeated each action 8 times (4 times during the day and 4 times in the evening) under each camera, and all five cameras recorded each action sample separately. During recording, subjects were told only the action name and could perform the action freely in their own manner, provided they stayed within the field of view of the current camera. This brings the dataset much closer to reality; as a result, there is high intra-class variation among action samples, as shown in the picture of action samples.

    URL: http://mmas.comp.nus.edu.sg/MCAD/MCAD.html

    Resources:

    • IDXXXX.mp4.tar.gz contains the video data for each individual (see the unpacking sketch after this list)
    • boundingbox.tar.gz contains person bounding boxes for all videos
    • protocol.json contains the evaluation protocol
    • img_list.txt contains the download URLs for the image version of the video data
    • idt_list.txt contains the download URLs for the improved Dense Trajectory features
    • stip_list.txt contains the download URLs for the STIP features
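
    A minimal sketch of unpacking the per-subject archives and loading the evaluation protocol, assuming the files sit in the current directory (the protocol's internal structure is not documented here, so the sketch only loads it):

        import glob
        import json
        import tarfile

        # Unpack each per-subject video archive (IDXXXX.mp4.tar.gz pattern).
        for archive in sorted(glob.glob("ID*.mp4.tar.gz")):
            with tarfile.open(archive, "r:gz") as tar:
                tar.extractall("videos")

        # Load the evaluation protocol shipped alongside the videos.
        with open("protocol.json") as f:
            protocol = json.load(f)
        print(type(protocol))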

    How to Cite:

    Please cite the following paper if you use the MCAD dataset in your work (papers, articles, reports, books, software, etc):

    • Wenhui Li, Yongkang Wong, An-An Liu, Yang Li, Yu-Ting Su, Mohan Kankanhalli
      Multi-Camera Action Dataset for Cross-Camera Action Recognition Benchmarking
      IEEE Winter Conference on Applications of Computer Vision (WACV), 2017.
      http://doi.org/10.1109/WACV.2017.28
  6. GameOfLife Prediction Dataset

    • data.ncl.ac.uk
    txt
    Updated Sep 10, 2025
    Cite
    David Towers; Linus Ericsson; Amir Atapour-Abarghouei; Elliot J Crowley; Andrew Stephen McGough (2025). GameOfLife Prediction Dataset [Dataset]. http://doi.org/10.25405/data.ncl.30000835.v1
    Explore at:
    txt (available download formats)
    Dataset updated
    Sep 10, 2025
    Dataset provided by
    Newcastle University
    Authors
    David Towers; Linus Ericsson; Amir Atapour-Abarghouei; Elliot J Crowley; Andrew Stephen McGough
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The GameOfLife dataset is an algorithmically generated dataset based on John Horton Conway's Game of Life. The Game of Life follows a strict set of rules at each "generation" (simulation step), with cells alternating between dead and alive states based on the number of surrounding alive cells; the rules can be found on the Game of Life's Wikipedia page. This dataset is one of the three hidden datasets used by the 2025 NAS Unseen-Data Challenge at AutoML.

    The goal of this dataset is to predict the number of cells alive in the next generation. The task is relatively simple for a human, if a bit tedious, and should theoretically be simple for machine learning algorithms. Each cell's state is calculated from the number of alive neighbours in the previous step: for every cell we only need to look at the surrounding eight cells (a 3x3 square minus the centre), so all the information for each cell can be captured by a 3x3 convolution, a very common kernel size. The dataset was used to make sure that participants' approaches could handle simple tasks along with the more complicated ones, and did not overcomplicate their submissions.

    There are 70,000 images in the dataset, each a randomly generated starting configuration of the Game of Life with a random density (number of initially alive cells). The data is stored in a channels-first format with a shape of (n, 1, 10, 10), where n is the number of samples in the corresponding set (50,000 for training, 10,000 for validation, and 10,000 for testing).

    There are 25 classes, where the label (0..24) represents the number of alive cells in the next generation; images are evenly distributed by class across the dataset (2,800 each: 2,000 training, 400 validation, 400 testing). We limit the data to 25 classes despite a theoretical range of 0-100 because the higher counts are increasingly unlikely to occur and would take much longer to balance. Excluding 0, the lower counts also become increasingly unlikely (though more likely than the higher ones); we wanted to prevent gaps and therefore limited the labels to 25 contiguous classes.

    NumPy (.npy) files can be opened with the NumPy Python library by passing the file path to the numpy.load() function. The metadata file contains some basic information about the datasets and can be opened in many text editors, such as vim, nano, Notepad++, or Notepad.
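
    The label for any board can be recomputed directly from the rules above with a single 3x3 convolution. A minimal sketch, assuming NumPy and SciPy, a hypothetical train_x.npy file name (the metadata file documents the real layout), and that cells beyond the board edge are dead (whether the grid wraps around is not stated here):

        import numpy as np
        from scipy.signal import convolve2d

        # Hypothetical file name; shape (50000, 1, 10, 10), channels-first.
        x_train = np.load("train_x.npy")
        board = x_train[0, 0].astype(int)  # one 10x10 starting configuration

        # Count each cell's eight neighbours with one 3x3 convolution.
        kernel = np.array([[1, 1, 1],
                           [1, 0, 1],
                           [1, 1, 1]])
        neighbours = convolve2d(board, kernel, mode="same",
                                boundary="fill", fillvalue=0)  # dead border assumed

        # Conway's rules: live cells with 2-3 neighbours survive;
        # dead cells with exactly 3 neighbours become alive.
        next_gen = ((board == 1) & ((neighbours == 2) | (neighbours == 3))) | \
                   ((board == 0) & (neighbours == 3))

        label = int(next_gen.sum())  # the class (0..24): alive cells next generation
        print(label)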

  7. Tucson Equity Priority Index (TEPI): Citywide Census Tracts

    • teds.tucsonaz.gov
    • hub.arcgis.com
    Updated Jun 27, 2024
    Cite
    City of Tucson (2024). Tucson Equity Priority Index (TEPI): Citywide Census Tracts [Dataset]. https://teds.tucsonaz.gov/maps/cotgis::tucson-equity-priority-index-tepi-citywide-census-tracts
    Explore at:
    Dataset updated
    Jun 27, 2024
    Dataset authored and provided by
    City of Tucson
    Area covered
    Description

    For detailed information, visit the Tucson Equity Priority Index StoryMap. Download the layer's data dictionary.

    What is the Tucson Equity Priority Index (TEPI)?

    The Tucson Equity Priority Index (TEPI) is a tool that describes the distribution of socially vulnerable demographics. It categorizes the dataset into 5 classes representing differing prioritization needs based on the presence of social vulnerability: Low (0-20), Low-Moderate (20-40), Moderate (40-60), Moderate-High (60-80), and High (80-100). Each class represents 20% of the dataset's features in order of their values. The features within the Low (0-20) classification represent the areas that, compared to all other locations in the study area, have the lowest need for prioritization, as they tend to have less socially vulnerable demographics. The features that fall into the High (80-100) classification represent the 20% of locations in the dataset with the greatest need for prioritization, as they tend to have the highest proportions of socially vulnerable demographics.

    How is social vulnerability measured?

    The TEPI examines the proportion of vulnerability per feature using 11 demographic indicators:

    • Income Below Poverty: Households with income at or below the federal poverty level (FPL), which in 2023 was $14,500 for an individual and $30,000 for a family of four
    • Unemployment: Measured as the percentage of unemployed persons in the civilian labor force
    • Housing Cost Burdened: Homeowners who spend more than 30% of their income on housing expenses, including mortgage, maintenance, and taxes
    • Renter Cost Burdened: Renters who spend more than 30% of their income on rent
    • No Health Insurance: Those without private health insurance, Medicare, Medicaid, or any other plan or program
    • No Vehicle Access: Households without automobile, van, or truck access
    • High School Education or Less: Those whose highest level of educational attainment is a high school diploma, equivalency, or less
    • Limited English Ability: Those whose ability to speak English is "Less Than Well"
    • People of Color: Those who identify as anything other than Non-Hispanic White
    • Disability: Households with one or more physical or cognitive disabilities
    • Age: Groups that tend to have higher levels of vulnerability, including children (those below 18) and seniors (those 65 and older)

    An overall percentile value is calculated for each feature based on the total proportion of the above indicators in each area.

    How are the variables combined?

    These indicators are divided into two main categories called Thematic Indices: Economic and Personal Characteristics. The two Thematic Indices are further divided into five sub-indices called Tier-2 Sub-Indices, each containing 2-3 indicators. Indicators are the datasets used to measure vulnerability within each sub-index. The variables for each feature are re-scaled using the percentile normalization method, which converts them to a common scale with values between 0 and 100. The variables are then combined, first into each of the five Tier-2 Sub-Indices, then into the Thematic Indices, then into the overall TEPI, using the mean aggregation method with equal weighting (a sketch of this combination follows the class list below). The resulting dataset is divided into five classes, where:

    • High Vulnerability (80-100%): Representing the top classification, this category includes the highest 20% of regions that are the most socially vulnerable. These areas require the most focused attention.
    • Moderate-High Vulnerability (60-80%): This upper-middle classification includes areas with higher levels of vulnerability than the median. While not the highest, these areas are more vulnerable than a majority of the dataset and should be considered for targeted interventions.
    • Moderate Vulnerability (40-60%): Representing the middle or median quintile, this category includes areas of average vulnerability. These areas may show a balanced mix of high and low vulnerability; detailed examination of specific indicators is recommended to understand their nuanced needs.
    • Low-Moderate Vulnerability (20-40%): Falling into the lower-middle classification, this range includes areas that are less vulnerable than most but may still exhibit certain vulnerable characteristics. These areas typically have a mix of lower and higher indicators, with lower values predominating.
    • Low Vulnerability (0-20%): This category represents the bottom classification, encompassing the lowest 20% of data points. Areas in this range are the least vulnerable, making them the most resilient compared to all other features in the dataset.
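
    A minimal sketch of the combination method described above, with hypothetical column names and values (the layer's data dictionary lists the real fields); for brevity it collapses the sub-index hierarchy into a single averaging step:

        import pandas as pd

        # Hypothetical indicator values for five areas.
        df = pd.DataFrame({
            "pct_below_poverty": [4.1, 18.9, 32.5, 9.7, 25.0],
            "pct_unemployed":    [2.8,  7.4, 11.2, 5.1,  9.0],
            "pct_no_vehicle":    [1.5,  6.0, 14.3, 3.2, 10.1],
        })

        # Percentile normalization: each value becomes its percentile rank (0-100).
        normalized = df.rank(pct=True) * 100

        # Mean aggregation with equal weighting.
        tepi = normalized.mean(axis=1)

        # Divide into five classes, 20% of features per class.
        labels = ["Low", "Low-Moderate", "Moderate", "Moderate-High", "High"]
        df["tepi_class"] = pd.qcut(tepi, q=5, labels=labels)
        print(df)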

  8. National Hydrography Dataset Plus High Resolution

    • oregonwaterdata.org
    • sal-urichmond.hub.arcgis.com
    • +1 more
    Updated Mar 16, 2023
    Cite
    Esri (2023). National Hydrography Dataset Plus High Resolution [Dataset]. https://www.oregonwaterdata.org/maps/f1f45a3ba37a4f03a5f48d7454e4b654
    Explore at:
    Dataset updated
    Mar 16, 2023
    Dataset authored and provided by
    Esri (http://esri.com/)
    Area covered
    Description

    The National Hydrography Dataset Plus High Resolution (NHDPlus High Resolution) maps the lakes, ponds, streams, rivers, and other surface waters of the United States. Created by the US Geological Survey, NHDPlus High Resolution provides mean annual flow and velocity estimates for rivers and streams, and additional attributes provide connections between features, facilitating complicated analyses. For more information, see the User's Guide for the National Hydrography Dataset Plus (NHDPlus) High Resolution.

    Dataset Summary

    • Phenomenon Mapped: Surface waters and related features of the United States and associated territories
    • Geographic Extent: The contiguous United States, Hawaii, portions of Alaska, Puerto Rico, Guam, US Virgin Islands, Northern Mariana Islands, and American Samoa
    • Projection: Web Mercator Auxiliary Sphere
    • Visible Scale: Visible at all scales, but the layer draws best at scales larger than 1:1,000,000
    • Source: USGS
    • Update Frequency: Annual
    • Publication Date: July 2022

    This layer was symbolized in the ArcGIS Map Viewer; while the features will draw in the Classic Map Viewer, the advanced symbology will not. Prior to publication, the network and non-network flowline feature classes were combined into a single flowline layer. Similarly, the Area and Waterbody feature classes were merged under a single schema. Attribute fields were added to the flowline and waterbody layers to simplify symbology and enhance the layer's pop-ups. Fields added include Pop-up Title, Pop-up Subtitle, Esri Symbology (waterbodies only), and Feature Code Description. All other attributes are from the original dataset. No-data values -9999 and -9998 were converted to Null values.

    What can you do with this layer?

    Feature layers work throughout the ArcGIS system. Generally, your workflow with feature layers will begin in ArcGIS Online or ArcGIS Pro. Below are just a few of the things you can do with a feature service in Online and Pro.

    ArcGIS Online

    • Add this layer to a map in the map viewer. The layer, or a map containing it, can be used in an application.
    • Change the layer's transparency and set its visibility range.
    • Open the layer's attribute table and make selections. Selections made in the map or table are reflected in the other. Center on selection lets you zoom to features selected in the map or table, and show selected records lets you view the selected records in the table.
    • Apply filters. For example, you can set a filter to show larger streams and rivers using the mean annual flow attribute or the stream order attribute (see the sketch after this description).
    • Change the layer's style and symbology.
    • Add labels and set their properties.
    • Customize the pop-up.
    • Use as an input to the ArcGIS Online analysis tools. This layer works well as a reference layer with the trace downstream and watershed tools. The buffer tool can be used to draw protective boundaries around streams, and the extract data tool can be used to create copies of portions of the data.

    ArcGIS Pro

    • Add this layer to a 2D or 3D map.
    • Use as an input to geoprocessing. For example, copy features lets you select and then export portions of the data to a new feature class.
    • Change the symbology and the attribute field used to symbolize the data.
    • Open the table and make interactive selections with the map.
    • Modify the pop-ups.
    • Apply definition queries to create subsets of the layer.

    This layer is part of the ArcGIS Living Atlas of the World, which provides an easy way to explore landscape layers and many other beautiful and authoritative maps on hundreds of topics. Questions? Please leave a comment below if you have a question about this layer, and we will get back to you as soon as possible.
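
    A sketch of the filter example above using the ArcGIS API for Python, assuming the item is publicly accessible; the item ID is taken from this entry's URL, and the mean-annual-flow field name ("QAMA") and threshold are assumptions, so check the layer's attribute table for the actual field:

        from arcgis.gis import GIS

        gis = GIS()  # anonymous connection to ArcGIS Online
        item = gis.content.get("f1f45a3ba37a4f03a5f48d7454e4b654")  # ID from the URL above
        flowlines = item.layers[0]

        # Count the larger rivers via the mean annual flow attribute;
        # "QAMA" is an assumed field name, as is the threshold value.
        count = flowlines.query(where="QAMA > 1000", return_count_only=True)
        print(count)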

  9. Tucson Equity Priority Index (TEPI): Pima County Block Groups

    • teds.tucsonaz.gov
    • tucson-equity-data-hub-cotgis.hub.arcgis.com
    Updated Jul 23, 2024
    Cite
    City of Tucson (2024). Tucson Equity Priority Index (TEPI): Pima County Block Groups [Dataset]. https://teds.tucsonaz.gov/maps/cotgis::tucson-equity-priority-index-tepi-pima-county-block-groups
    Explore at:
    Dataset updated
    Jul 23, 2024
    Dataset authored and provided by
    City of Tucson
    Area covered
    Description

    For detailed information, visit the Tucson Equity Priority Index StoryMap. Download the Data Dictionary.

    What is the Tucson Equity Priority Index (TEPI)?

    The Tucson Equity Priority Index (TEPI) is a tool that describes the distribution of socially vulnerable demographics. It categorizes the dataset into 5 classes representing differing prioritization needs based on the presence of social vulnerability: Low (0-20), Low-Moderate (20-40), Moderate (40-60), Moderate-High (60-80), and High (80-100). Each class represents 20% of the dataset's features in order of their values. The features within the Low (0-20) classification represent the areas that, compared to all other locations in the study area, have the lowest need for prioritization, as they tend to have less socially vulnerable demographics. The features that fall into the High (80-100) classification represent the 20% of locations in the dataset with the greatest need for prioritization, as they tend to have the highest proportions of socially vulnerable demographics.

    How is social vulnerability measured?

    The TEPI examines the proportion of vulnerability per feature using 11 demographic indicators:

    • Income Below Poverty: Households with income at or below the federal poverty level (FPL), which in 2023 was $14,500 for an individual and $30,000 for a family of four
    • Unemployment: Measured as the percentage of unemployed persons in the civilian labor force
    • Housing Cost Burdened: Homeowners who spend more than 30% of their income on housing expenses, including mortgage, maintenance, and taxes
    • Renter Cost Burdened: Renters who spend more than 30% of their income on rent
    • No Health Insurance: Those without private health insurance, Medicare, Medicaid, or any other plan or program
    • No Vehicle Access: Households without automobile, van, or truck access
    • High School Education or Less: Those whose highest level of educational attainment is a high school diploma, equivalency, or less
    • Limited English Ability: Those whose ability to speak English is "Less Than Well"
    • People of Color: Those who identify as anything other than Non-Hispanic White
    • Disability: Households with one or more physical or cognitive disabilities
    • Age: Groups that tend to have higher levels of vulnerability, including children (those below 18) and seniors (those 65 and older)

    An overall percentile value is calculated for each feature based on the total proportion of the above indicators in each area.

    How are the variables combined?

    These indicators are divided into two main categories called Thematic Indices: Economic and Personal Characteristics. The two Thematic Indices are further divided into five sub-indices called Tier-2 Sub-Indices, each containing 2-3 indicators. Indicators are the datasets used to measure vulnerability within each sub-index. The variables for each feature are re-scaled using the percentile normalization method, which converts them to a common scale with values between 0 and 100. The variables are then combined, first into each of the five Tier-2 Sub-Indices, then into the Thematic Indices, then into the overall TEPI, using the mean aggregation method with equal weighting. The resulting dataset is divided into five classes, where:

    • High Vulnerability (80-100%): Representing the top classification, this category includes the highest 20% of regions that are the most socially vulnerable. These areas require the most focused attention.
    • Moderate-High Vulnerability (60-80%): This upper-middle classification includes areas with higher levels of vulnerability than the median. While not the highest, these areas are more vulnerable than a majority of the dataset and should be considered for targeted interventions.
    • Moderate Vulnerability (40-60%): Representing the middle or median quintile, this category includes areas of average vulnerability. These areas may show a balanced mix of high and low vulnerability; detailed examination of specific indicators is recommended to understand their nuanced needs.
    • Low-Moderate Vulnerability (20-40%): Falling into the lower-middle classification, this range includes areas that are less vulnerable than most but may still exhibit certain vulnerable characteristics. These areas typically have a mix of lower and higher indicators, with lower values predominating.
    • Low Vulnerability (0-20%): This category represents the bottom classification, encompassing the lowest 20% of data points. Areas in this range are the least vulnerable, making them the most resilient compared to all other features in the dataset.

  10. Chesapeake Land Cover

    • lila.science
    various
    Updated Jun 19, 2019
    Cite
    Chesapeake Conservancy (2019). Chesapeake Land Cover [Dataset]. https://lila.science/datasets/chesapeakelandcover
    Explore at:
    various (available download formats)
    Dataset updated
    Jun 19, 2019
    Dataset authored and provided by
    Chesapeake Conservancy
    License

    Community Data License Agreement - Permissive 1.0: https://cdla.dev/permissive-1-0/

    Area covered
    Chesapeake, United States
    Description

    This dataset contains high-resolution aerial imagery from the USDA NAIP program [1], high-resolution land cover labels from the Chesapeake Conservancy, low-resolution land cover labels from the USGS NLCD 2011 dataset, low-resolution multi-spectral imagery from Landsat 8, and high-resolution building footprint masks from Microsoft Bing, formatted to accelerate machine learning research into land cover mapping. The Chesapeake Conservancy spent over 10 months and $1.3 million creating a consistent six-class land cover dataset covering the Chesapeake Bay watershed. While the purpose of the Conservancy's mapping effort was to create land cover data for use in conservation efforts, the same data can be used to train machine learning models that can be applied over even wider areas. The organization of this dataset allows users to easily test questions of geographic generalization, i.e., how well models trained in one region transfer to another. For example, this dataset can be used to directly estimate how well a model trained on data from Maryland generalizes over the remainder of the Chesapeake Bay watershed.

  11. Single-earner and dual-earner census families by number of children

    • www150.statcan.gc.ca
    • ouvert.canada.ca
    • +2 more
    Updated Jul 18, 2025
    Cite
    Government of Canada, Statistics Canada (2025). Single-earner and dual-earner census families by number of children [Dataset]. http://doi.org/10.25318/1110002801-eng
    Explore at:
    Dataset updated
    Jul 18, 2025
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Families of tax filers; Single-earner and dual-earner census families by number of children (final T1 Family File; T1FF).

  12. A Curated Dataset for Drug Class Prediction and Repositioning

    • data.mendeley.com
    Updated Jan 27, 2025
    Cite
    Tiago Alves de Oliveira (2025). A Curated Dataset for Drug Class Prediction and Repositioning [Dataset]. http://doi.org/10.17632/j83cfgkrb5.2
    Explore at:
    Dataset updated
    Jan 27, 2025
    Authors
    Tiago Alves de Oliveira
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This curated dataset offers a valuable resource for deep learning applications in drug discovery and repositioning domains. It contains 5,350 high-resolution images systematically categorized into pharmacological classes and molecular targets. The pharmacological classes encompass antifungals, antivirals, corticosteroids, diuretics, and non-steroidal anti-inflammatory drugs (NSAIDs), while the molecular targets emphasize Alzheimer's disease-related enzymes, including acetylcholinesterase, butyrylcholinesterase, and beta-secretase 1. The dataset was meticulously compiled using data from well-established databases, including DrugBank, ChEMBL, and DUD-E, ensuring diversity and quality in the compounds selected for training. Active compounds (true positives) were sourced from DrugBank and ChEMBL, while decoy compounds (true negatives) were generated using the DUD-E protocol. The decoy compounds are designed to match the physicochemical properties of active compounds while lacking binding affinity, creating a robust benchmark for machine learning evaluation. The balanced structure of the dataset, with equal representation of true positive and decoy compounds, enhances its suitability for binary and multi-class classification tasks. The collection of compounds is diverse and of high quality, thus supporting a wide range of deep learning tasks, including pharmacological class prediction, virtual screening, and molecular target identification. This ultimately advances computational approaches in drug discovery.

  13. Pseudo-Label Generation for Multi-Label Text Classification

    • catalog.data.gov
    • datasets.ai
    • +1 more
    Updated Apr 11, 2025
    Cite
    Dashlink (2025). Pseudo-Label Generation for Multi-Label Text Classification [Dataset]. https://catalog.data.gov/dataset/pseudo-label-generation-for-multi-label-text-classification
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    With the advent and expansion of social networking, the amount of generated text data has seen a sharp increase. In order to handle such a huge volume of text data, new and improved text mining techniques are a necessity. One of the characteristics of text data that makes text mining difficult is multi-labelity. In order to build a robust and effective text classification method, which is an integral part of text mining research, we must consider this property more closely. This property is not unique to text data, as it can be found in non-text (e.g., numeric) data as well; however, it is most prevalent in text data. It also puts the text classification problem in the domain of multi-label classification (MLC), where each instance is associated with a subset of class labels instead of a single class, as in conventional classification.

    In this paper, we explore how the generation of pseudo labels (i.e., combinations of existing class labels) can help us perform better text classification, and under what circumstances. During classification, the high and sparse dimensionality of text data has also been considered. Although we propose and evaluate a text classification technique here, our main focus is on handling the multi-labelity of text data while utilizing the correlation among the multiple labels in the data set. Our text classification technique, called pseudo-LSC (pseudo-Label Based Subspace Clustering), is a subspace clustering algorithm that considers the high and sparse dimensionality as well as the correlation among different class labels during the classification process, providing better performance than existing approaches. Results on three real-world multi-label data sets provide insight into how multi-labelity is handled in our classification process and show the effectiveness of our approach.
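
    The paper's pseudo-LSC implementation is not included here, but the core notion of pseudo labels (combinations of existing class labels) can be illustrated with a minimal sketch of the label-powerset idea, using made-up documents and labels:

        # Each document carries a set of class labels (hypothetical data).
        docs_labels = [
            {"sports"},
            {"sports", "politics"},
            {"politics"},
            {"sports", "politics"},
        ]

        # Map every distinct label combination to one pseudo-label id.
        pseudo_ids = {}
        pseudo_labels = []
        for label_set in docs_labels:
            key = frozenset(label_set)
            if key not in pseudo_ids:
                pseudo_ids[key] = len(pseudo_ids)
            pseudo_labels.append(pseudo_ids[key])

        print(pseudo_labels)  # [0, 1, 2, 1]: each combination becomes one class

    A conventional single-label classifier can then be trained on these pseudo-labels, which is one way to exploit the correlation among co-occurring labels.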

  14. TETI Ward1BlkGrps 20250205

    • cotgis.hub.arcgis.com
    Updated Feb 10, 2025
    Cite
    City of Tucson (2025). TETI Ward1BlkGrps 20250205 [Dataset]. https://cotgis.hub.arcgis.com/maps/cotgis::teti-ward1blkgrps-20250205
    Explore at:
    Dataset updated
    Feb 10, 2025
    Dataset authored and provided by
    City of Tucson
    Area covered
    Description

    What is the Tucson Equity Tree Index (TETI)?

    The Tucson Equity Tree Index (TETI) is a tool that describes the distribution of tree canopy as it relates to social vulnerability. It categorizes the dataset into 5 classes representing differing prioritization needs for improving tree canopy coverage: Low (0-20), Low-Moderate (20-40), Moderate (40-60), Moderate-High (60-80), and High (80-100). Each class represents 20% of the dataset's features in order of their values. The features within the Low (0-20) classification represent the areas that, compared to all other locations in the study area, have the lowest need for tree planting prioritization, as they tend to have more existing tree canopy and less socially vulnerable demographics. The features that fall into the High (80-100) classification represent the 20% of locations in the dataset with the greatest need for tree planting prioritization, as they tend to have the least existing tree canopy and the highest proportions of socially vulnerable demographics.

    How is tree canopy measured?

    To measure tree canopy, the TETI calculates a Gap Score: the difference between the goal canopy coverage per feature (15%) and the percent of existing tree canopy coverage. The existing tree canopy coverage was calculated from PAG's 2015 tree canopy dataset, which can be viewed here.

    How is social vulnerability measured?

    The TETI incorporates the Tucson Equity Priority Index (TEPI), which examines the proportion of vulnerability per feature using 11 demographic indicators:

    • % of households below poverty
    • % unemployed
    • % with no personal vehicle access
    • % with no health insurance
    • % that experience renter cost-burden
    • % that experience homeowner cost-burden
    • % that are a vulnerable age (children under 18 and seniors 65 and older)
    • % of households with 1 or more disabilities
    • % people of color
    • % whose highest educational attainment is high school or less
    • % with low English proficiency

    An overall percentile value is calculated for each feature based on the total proportion of the above indicators in each area.

    How are the variables combined?

    The two variables (tree canopy gap score and vulnerability score) for each feature are re-scaled using the percentile normalization method, which converts them to the same scale with values between 0 and 100. The variables are then combined using the mean aggregation method, weighting the tree canopy gap score and the vulnerability score equally (a sketch of this combination follows the description below). The resulting dataset is divided into five classes, where:

    • High Vulnerability (80-100%): Representing the top classification, this category includes the highest 20% of regions that are the most socially vulnerable and have the largest tree canopy gap scores. These areas require the most focused attention.
    • Moderate-High Vulnerability (60-80%): This upper-middle classification includes areas with higher levels of vulnerability and tree canopy gap scores than the median. While not the highest, these areas are more vulnerable and in need of tree canopy than a majority of the dataset and should be considered for targeted interventions.
    • Moderate Vulnerability (40-60%): Representing the middle or median quintile, this category includes areas of average vulnerability and canopy gap scores. These areas may show a balanced mix of high and low vulnerability and canopy indicators; detailed examination of specific indicators is recommended to understand their nuanced needs.
    • Low-Moderate Vulnerability (20-40%): Falling into the lower-middle classification, this range includes areas that are less vulnerable and have more tree canopy than most but may still exhibit certain vulnerable characteristics or tree canopy needs. These areas typically have a mix of lower and higher indicators, with lower values predominating.
    • Low Vulnerability (0-20%): This category represents the bottom classification, encompassing the lowest 20% of data points. Areas in this range are the least vulnerable and have the greatest existing tree canopy, making them the most resilient compared to all other features in the dataset.

    How is the TETI different from the Tucson Tree Equity Score index?

    The TETI is designed to be the updated version of the Tucson Tree Equity Score index in that:

    • The demographic variables of vulnerability use the most current data and reflect how the City of Tucson as a whole defines social vulnerability.
    • This version of the TETI uses census block groups instead of Tucson neighborhoods, which results in areas that are more comparable to one another in population size and an overall index that is more comparable to other existing indices (the Tucson Equity Priority Index (TEPI), the Climate and Economic Justice Screening Tool (CEJST), the American Forests Tree Equity Score National Explorer, etc.).
    • The classification method for the TETI results in equal ranges for each classification based on relative need for prioritization, whereas the Tucson Tree Equity Index has a more arbitrary classification scheme and does not have equal classification ranges.
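
    A minimal sketch of the gap-score and combination steps described above, with hypothetical column names and values:

        import pandas as pd

        df = pd.DataFrame({
            "canopy_pct": [3.0, 8.5, 12.0, 16.5, 5.5],     # existing canopy %
            "tepi_score": [88.0, 41.0, 22.0, 10.0, 67.0],  # vulnerability score
        })

        # Gap Score: goal coverage (15%) minus existing coverage.
        df["gap_score"] = 15.0 - df["canopy_pct"]

        # Percentile-normalize both variables to 0-100, then average equally.
        norm = df[["gap_score", "tepi_score"]].rank(pct=True) * 100
        df["teti"] = norm.mean(axis=1)

        # Divide into the five classes, 20% of features per class.
        labels = ["Low", "Low-Moderate", "Moderate", "Moderate-High", "High"]
        df["teti_class"] = pd.qcut(df["teti"], q=5, labels=labels)
        print(df)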

  15. Doodleverse/Segmentation Zoo/Seg2Map Res-UNet models for DeepGlobe/7-class segmentation of RGB 512x512 high-res. images

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 12, 2024
    Cite
    Buscombe, Daniel (2024). Doodleverse/Segmentation Zoo/Seg2Map Res-UNet models for DeepGlobe/7-class segmentation of RGB 512x512 high-res. images [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7576897
    Explore at:
    Dataset updated
    Jul 12, 2024
    Dataset authored and provided by
    Buscombe, Daniel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Doodleverse/Segmentation Zoo/Seg2Map Res-UNet models for DeepGlobe/7-class segmentation of RGB 512x512 high-res. images

    These Residual-UNet model data are based on the DeepGlobe dataset

    Models have been created using Segmentation Gym* using the following dataset**: https://www.kaggle.com/datasets/balraj98/deepglobe-land-cover-classification-dataset

    Image size used by model: 512 x 512 x 3 pixels

    classes: 1. urban 2. agricultural 3. rangeland 4. forest 5. water 6. bare 7. unknown

    File descriptions

    For each model, there are 5 files with the same root name:

    1. '.json' config file: this is the file that was used by Segmentation Gym* to create the weights file. It contains instructions for how to make the model and the data it used, as well as instructions for how to use the model for prediction. It is a handy wee thing and mastering it means mastering the entire Doodleverse.

    2. '.h5' weights file: this is the file that was created by the Segmentation Gym* function train_model.py. It contains the trained model's parameter weights and can be called by the Segmentation Gym* function seg_images_in_folder.py. Models may be ensembled.

    3. '_modelcard.json' model card file: this is a json file containing fields that collectively describe the model's origins, training choices, and the dataset the model is based upon. There is some redundancy between this file and the config file (described above), which contains the instructions for model training and implementation. The model card file is not used by the program, but it is important metadata, so it should be kept with the other files that collectively make up the model; as such, it is considered part of the model.

    4. '_model_history.npz' model training history file: this numpy archive file contains numpy arrays describing the training and validation losses and metrics. It is created by the Segmentation Gym function train_model.py

    5. '.png' model training loss and mean IoU plot: this png file contains plots of training and validation losses and mean IoU scores during model training. A subset of data inside the .npz file. It is created by the Segmentation Gym function train_model.py

    Additionally, BEST_MODEL.txt contains the name of the model with the best validation loss and mean IoU.
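
    A minimal sketch of inspecting one of these training-history archives with NumPy; the root name is a placeholder, and the array names vary by model, so the sketch simply enumerates them:

        import numpy as np

        # Replace the root name with that of the model you downloaded.
        history = np.load("my_model_model_history.npz")

        print(history.files)  # names of the stored arrays
        for name in history.files:
            print(name, history[name].shape)  # e.g. per-epoch loss and metric curves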

    References *Segmentation Gym: Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332 See: https://github.com/Doodleverse/segmentation_gym

    **Demir, I., Koperski, K., Lindenbaum, D., Pang, G., Huang, J., Basu, S., Hughes, F., Tuia, D. and Raskar, R., 2018. Deepglobe 2018: A challenge to parse the earth through satellite images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 172-181).

  16. Goldenhar-CFID: A Novel Dataset for Craniofacial Anomaly Detection in Goldenhar Syndrome

    • data.mendeley.com
    Updated Feb 28, 2025
    Cite
    Israt Jahan (2025). Goldenhar-CFID: A Novel Dataset for Craniofacial Anomaly Detection in Goldenhar Syndrome [Dataset]. http://doi.org/10.17632/ffsthxyp4d.3
    Explore at:
    Dataset updated
    Feb 28, 2025
    Authors
    Israt Jahan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Goldenhar Syndrome Craniofacial Image Dataset (Goldenhar-CFID) is a high-resolution dataset designed for the automated detection and classification of craniofacial abnormalities associated with Goldenhar Syndrome (GS). It comprises 4,483 images categorized into seven distinct classes of craniofacial deformities, and it serves as a valuable resource for researchers in medical image analysis, deep learning, and clinical decision-making.

    Dataset Characteristics

    • Total Images: 4,483
    • Number of Classes: 7
    • Image Format: JPG
    • Image Resolution: 640 x 640 pixels
    • Annotation: Each image is manually labeled and verified by medical experts
    • Data Preprocessing: Auto-orientation and histogram equalization applied for enhanced feature detection
    • Augmentation Techniques: Rotation, scaling, brightness adjustments, flipping, and contrast modifications

    Categories and Annotations

    The dataset includes images categorized into seven craniofacial deformities:

    • Cleft Lip and Palate: Congenital anomaly where the upper lip and/or palate fails to develop properly.
    • Epibulbar Dermoid Tumor: Benign growth on the eye's surface, typically at the cornea-sclera junction.
    • Eyelid Coloboma: Defect characterized by a partial or complete absence of eyelid tissue.
    • Facial Asymmetry: Uneven development of facial structures.
    • Malocclusion: Misalignment of the teeth and jaws.
    • Microtia: Underdeveloped or absent outer ear.
    • Vertebral Abnormality: Irregular development of spinal vertebrae.

    Dataset Structure and Splitting

    The dataset consists of four main subdirectories:

    • Original: Contains 547 raw images.
    • Unaugmented Balanced: Contains 210 images per class.
    • Augmented Unbalanced: Includes 4,483 images with augmentation.
    • Augmented Balanced: Contains 756 images per class.

    The dataset is split into a training set (80%), a validation set (10%), and a test set (10%); a split sketch follows below.
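
    A minimal sketch reproducing the 80/10/10 split on one subdirectory; the root path and image extension are assumptions:

        from pathlib import Path
        from sklearn.model_selection import train_test_split

        # Assumed extraction path; the class is the parent folder name.
        paths = sorted(Path("Goldenhar-CFID/Augmented Balanced").rglob("*.jpg"))
        labels = [p.parent.name for p in paths]

        # Carve off 20%, then split it half-and-half into validation/test.
        train, rest, y_train, y_rest = train_test_split(
            paths, labels, test_size=0.20, stratify=labels, random_state=0)
        val, test, y_val, y_test = train_test_split(
            rest, y_rest, test_size=0.50, stratify=y_rest, random_state=0)

        print(len(train), len(val), len(test))  # roughly 80% / 10% / 10%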

  17. NHD Plus - High Resolution

    • pend-oreille-county-open-data-pendoreilleco.hub.arcgis.com
    Updated Jun 7, 2024
    Cite
    Pend Oreille County (2024). NHD Plus - High Resolution [Dataset]. https://pend-oreille-county-open-data-pendoreilleco.hub.arcgis.com/maps/d2660f0b23184f5087c0df2f6d6b50b8
    Explore at:
    Dataset updated
    Jun 7, 2024
    Dataset authored and provided by
    Pend Oreille County
    Area covered
    Description

    This dataset is authored by Esri and is being shared as a direct link to the feature service by Pend Oreille County. NHD is a primary hydrologic reference used by our organization.

    The National Hydrography Dataset Plus High Resolution (NHDPlus High Resolution) maps the lakes, ponds, streams, rivers, and other surface waters of the United States. Created by the US Geological Survey, NHDPlus High Resolution provides mean annual flow and velocity estimates for rivers and streams, and additional attributes provide connections between features, facilitating complicated analyses. For more information, see the User's Guide for the National Hydrography Dataset Plus (NHDPlus) High Resolution.

    Dataset Summary

    • Phenomenon Mapped: Surface waters and related features of the United States and associated territories
    • Coordinate System: Web Mercator Auxiliary Sphere
    • Extent: The contiguous United States, Hawaii, portions of Alaska, Puerto Rico, Guam, US Virgin Islands, Northern Mariana Islands, and American Samoa
    • Visible Scale: Visible at all scales, but the layer draws best at scales larger than 1:1,000,000
    • Source: USGS
    • Publication Date: July 2022

    This layer was symbolized in the ArcGIS Map Viewer; while the features will draw in the Classic Map Viewer, the advanced symbology will not. Prior to publication, the network and non-network flowline feature classes were combined into a single flowline layer. Similarly, the Area and Waterbody feature classes were merged under a single schema. Attribute fields were added to the flowline and waterbody layers to simplify symbology and enhance the layer's pop-ups. Fields added include Pop-up Title, Pop-up Subtitle, Esri Symbology (waterbodies only), and Feature Code Description. All other attributes are from the original dataset. No-data values -9999 and -9998 were converted to Null values.

    What can you do with this Feature Layer?

    Feature layers work throughout the ArcGIS system. Generally, your workflow with feature layers will begin in ArcGIS Online or ArcGIS Pro. Below are just a few of the things you can do with a feature service in Online and Pro.

    ArcGIS Online

    • Add this layer to a map in the map viewer. The layer, or a map containing it, can be used in an application.
    • Change the layer's transparency and set its visibility range.
    • Open the layer's attribute table and make selections. Selections made in the map or table are reflected in the other. Center on selection lets you zoom to features selected in the map or table, and show selected records lets you view the selected records in the table.
    • Apply filters. For example, you can set a filter to show larger streams and rivers using the mean annual flow attribute or the stream order attribute.
    • Change the layer's style and symbology.
    • Add labels and set their properties.
    • Customize the pop-up.
    • Use as an input to the ArcGIS Online analysis tools. This layer works well as a reference layer with the trace downstream and watershed tools. The buffer tool can be used to draw protective boundaries around streams, and the extract data tool can be used to create copies of portions of the data.

    ArcGIS Pro

    • Add this layer to a 2D or 3D map.
    • Use as an input to geoprocessing. For example, copy features lets you select and then export portions of the data to a new feature class.
    • Change the symbology and the attribute field used to symbolize the data.
    • Open the table and make interactive selections with the map.
    • Modify the pop-ups.
    • Apply definition queries to create subsets of the layer.

    This layer is part of the ArcGIS Living Atlas of the World, which provides an easy way to explore landscape layers and many other beautiful and authoritative maps on hundreds of topics.

  18. Hyper Kvasir Dataset

    • universe.roboflow.com
    zip
    Updated Jul 24, 2024
    Cite
    Simula (2024). Hyper Kvasir Dataset [Dataset]. https://universe.roboflow.com/simula/hyper-kvasir/model/1
    Explore at:
    zip (available download formats)
    Dataset updated
    Jul 24, 2024
    Dataset authored and provided by
    Simula
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    GI Tract
    Description

    Overview

    This is the largest gastrointestinal dataset available today, generously provided by Simula Research Laboratory in Norway.

    You can read their research paper in Nature Scientific Data (full citation below).

    In total, the dataset contains 10,662 labeled images stored in JPEG format. The images can be found in the images folder. The class each image belongs to corresponds to the folder it is stored in (e.g., the ’polyp’ folder contains all polyp images, the ’barretts’ folder contains all images of Barrett’s esophagus, etc.). Each class folder is located in a subfolder describing the type of finding, which in turn is located in a folder indicating whether it is a lower GI or upper GI finding. The number of images per class is not balanced, which is a general challenge in the medical field because some findings occur more often than others. This adds an additional challenge for researchers, since methods applied to the data should also be able to learn from a small amount of training data. The labeled images represent 23 different classes of findings.
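    The folder layout described above maps directly onto a small script. Below is a minimal sketch, assuming a local unpacked copy of the labeled images (the root path and the .jpg extension are assumptions), that tallies images per class and so makes the class imbalance visible:

        from collections import Counter
        from pathlib import Path

        # Count labeled images per class in the Hyper-Kvasir folder layout:
        # <root>/<upper-or-lower-gi>/<finding-type>/<class>/<image>.jpg
        # The root path used below is a placeholder for wherever you unpacked the data.
        def count_images_per_class(root: str) -> Counter:
            counts = Counter()
            for image in Path(root).rglob("*.jpg"):
                # The immediate parent folder names the class (e.g. "polyp").
                counts[image.parent.name] += 1
            return counts

        if __name__ == "__main__":
            counts = count_images_per_class("hyper-kvasir/labeled-images")
            for cls, n in counts.most_common():
                print(f"{cls:>30} {n:6d}")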

    The data was collected during real gastroscopy and colonoscopy examinations at a hospital in Norway and partly labeled by experienced gastrointestinal endoscopists.

    Use Cases

    "Artificial intelligence is currently a hot topic in medicine. The fact that medical data is often sparse and hard to obtain due to legal restrictions and lack of medical personnel to perform the cumbersome and tedious labeling of the data, leads to technical limitations. In this respect, we share the Hyper-Kvasir dataset, which is the largest image and video dataset from the gastrointestinal tract available today."

    "We have used the labeled data to research the classification and segmentation of GI findings using both computer vision and ML approaches to potentially be used in live and post-analysis of patient examinations. Areas of potential utilization are analysis, classification, segmentation, and retrieval of images and videos with particular findings or particular properties from the computer science area. The labeled data can also be used for teaching and training in medical education. Having expert gastroenterologists providing the ground truths over various findings, HyperKvasir provides a unique and diverse learning set for future clinicians. Moreover, the unlabeled data is well suited for semi-supervised and unsupervised methods, and, if even more ground truth data is needed, the users of the data can use their own local medical experts to provide the needed labels. Finally, the videos can in addition be used to simulate live endoscopies feeding the video into the system like it is captured directly from the endoscopes enable developers to do image classification."

    Borgli, H., Thambawita, V., Smedsrud, P.H. et al. HyperKvasir, a comprehensive multi-class image and video dataset for gastrointestinal endoscopy. Sci Data 7, 283 (2020). https://doi.org/10.1038/s41597-020-00622-y

    Using this Dataset

    Hyper-Kvasir is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source. This means that in all documents and papers that use or refer to the Hyper-Kvasir dataset or report experimental results based on the dataset, a reference to the related article needs to be added: PREPRINT: https://osf.io/mkzcq/. Additionally, one should provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

    About Roboflow

    Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

    Using Roboflow's workflow, developers cut their boilerplate code by 50%, automate annotation quality assurance, save training time, and increase model reproducibility.

  19. D

    Data from: Dataset corresponding to 'The model for the accompaniment of...

    • ssh.datastations.nl
    • b2find.eudat.eu
    • + 1 more
    pdf, zip
    Updated Jul 11, 2017
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    B.J. Geyser; C.A.M. Hermans (2017). Dataset corresponding to 'The model for the accompaniment of seekers into silence in their quest for wholeness' [Dataset]. http://doi.org/10.17026/DANS-XMG-KNM8
    Explore at:
    pdf(67935), zip(22005), pdf(67856), pdf(76140), pdf(60021), pdf(57263), pdf(52444), pdf(67827), pdf(60949), pdf(55298) (available download formats)
    Dataset updated
    Jul 11, 2017
    Dataset provided by
    DANS Data Station Social Sciences and Humanities
    Authors
    B.J. Geyser; C.A.M. Hermans
    License

    https://doi.org/10.17026/fp39-0x58

    Description

    This dataset belongs to the following dissertation: Barend Johannes Geyser (2017). The model for the accompaniment of seekers with a Christian background into silence in their quest for wholeness. Radboud University.

    Data gathering took place by means of phenomenological interviews, observations and field notes made during the interviews, as well as video-stimulated recall. The interview transcripts are written in a South African language. This dataset contains the interview transcripts.

    The researcher selected participants who were starting the second half of life, that is, from 40 to 55 years of age. The participants are all from a Christian background and all lived in the northern suburbs of Johannesburg, which places them in the socio-economic middle class and upper middle class. Three women and five men were interviewed. The interviews involve the conscious selection of certain participants: in this instance, seekers who ask for accompaniment into silence. They are all Christian seekers on a quest for wholeness, investigating the practice of silence as a possible aid in that quest. They all attempted to practice silence in some way or other for at least three years.

    In addition to the eight interview transcripts, a readme text is included to explain the context of the dataset.

  20. d

    Data from: Terrestrial Ecosystems - Land Surface Forms of the Conterminous...

    • catalog.data.gov
    • data.usgs.gov
    • + 3 more
    Updated Sep 17, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Terrestrial Ecosystems - Land Surface Forms of the Conterminous United States [Dataset]. https://catalog.data.gov/dataset/terrestrial-ecosystems-land-surface-forms-of-the-conterminous-united-states
    Explore at:
    Dataset updated
    Sep 17, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Contiguous United States, United States
    Description

    The U.S. Geological Survey (USGS) has generated land surface form classes for the contiguous United States. These land surface form classes were created as part of an effort to map standardized, terrestrial ecosystems for the nation using a classification developed by NatureServe (Comer and others, 2003). Ecosystem distributions were modeled using a biophysical stratification approach developed for South America (Sayre and others, 2008) and now being implemented globally (Sayre and others, 2007). Land surface forms strongly influence the differentiation and distribution of terrestrial ecosystems, and are one of the key input layers in the ecosystem delineation process.

    The methodology used to produce these land surface form classes was developed by the Missouri Resource Assessment Partnership (MoRAP). MoRAP made modifications to Hammond's (1964a, 1964b) land surface form classification, which allowed the use of 30-meter source data and a 1 km² window for neighborhood analysis (True 2002, True and others, 2000). While Hammond's methodology was based on three variables (slope, local relief, and profile type), MoRAP's methodology uses only slope and local relief (True 2002). Slope is classified as gently sloping or not gently sloping using a slope threshold of 8%; local relief is classified into five classes (0-15 m, 15-30 m, 30-90 m, 90-150 m, and >150 m); and eight landform classes (flat plains, smooth plains, irregular plains, escarpments, low hills, hills, breaks, and low mountains) were derived by combining slope class and local relief.

    The USGS implementation of the MoRAP methodology was executed using the USGS 30-meter National Elevation Dataset (NED) and an existing USGS slope dataset. In this implementation, a new land surface form class, high mountains/deep canyons, was identified by using an additional local relief class (>400 m). The drainage channels class was derived independently from the other land surface form classes, using two of Andrew Weiss's slope position classes, "valley" and "lower slope" (Weiss 2001, Jenness 2006). The USGS implemented Weiss's algorithm using the 30-meter NED and a 1 km² neighborhood analysis window. The resultant drainage channels class was combined into the final land surface forms dataset.
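    Because the rules above are stated almost completely (slope threshold, relief breaks, class names), they can be sketched as a simple per-cell raster classifier. The NumPy sketch below is a plausible reading, not the USGS lookup table: the exact pairing of slope class and relief class into the eight named landforms is not spelled out in the description, so that mapping is an assumption, as is the handling of the >400 m override.

        import numpy as np

        # Hedged sketch of a MoRAP/Hammond-style landform classifier from
        # slope (percent) and local relief (meters) rasters. The 8% slope
        # threshold, the relief breaks, and the class names come from the
        # description above; how they pair up is an assumption.
        def classify_landform(slope_pct: np.ndarray, relief_m: np.ndarray) -> np.ndarray:
            gentle = slope_pct <= 8.0  # "gently sloping" vs. not
            out = np.full(slope_pct.shape, "unclassified", dtype=object)

            out[gentle & (relief_m <= 15)] = "flat plains"
            out[gentle & (relief_m > 15) & (relief_m <= 30)] = "smooth plains"
            out[gentle & (relief_m > 30) & (relief_m <= 90)] = "irregular plains"
            out[gentle & (relief_m > 90) & (relief_m <= 150)] = "escarpments"  # assumed pairing
            out[gentle & (relief_m > 150)] = "breaks"                          # assumed pairing
            out[~gentle & (relief_m <= 90)] = "low hills"                      # assumed pairing
            out[~gentle & (relief_m > 90) & (relief_m <= 150)] = "hills"       # assumed pairing
            out[~gentle & (relief_m > 150)] = "low mountains"                  # assumed pairing
            # USGS addition: an extra local relief class overrides the above.
            out[relief_m > 400] = "high mountains/deep canyons"
            return out

        # Tiny worked example: a gentle low-relief cell and a steep high-relief cell.
        slope = np.array([2.0, 25.0])
        relief = np.array([10.0, 500.0])
        print(classify_landform(slope, relief))
        # -> ['flat plains' 'high mountains/deep canyons']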
