Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in Lancaster County, PA, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Income Levels:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Lancaster County median household income, which you can refer to here.
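As a hedged illustration of how quintile mean incomes like these can be used, the sketch below computes a simple top-to-bottom dispersion ratio. The dollar figures are made-up placeholders, not the actual ACS estimates for any county.

```python
# Hypothetical quintile mean household incomes (placeholder values only;
# the real figures come from the ACS table described above).
quintile_means = {
    "Lowest Quintile": 20_000,
    "Second Quintile": 48_000,
    "Third Quintile": 78_000,
    "Fourth Quintile": 115_000,
    "Highest Quintile": 250_000,
}

# One simple inequality measure: the ratio of the top quintile's mean
# household income to the bottom quintile's.
ratio = quintile_means["Highest Quintile"] / quintile_means["Lowest Quintile"]
print(f"Top-to-bottom quintile ratio: {ratio:.1f}")  # -> 12.5
```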
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in Deptford Township, New Jersey, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Income Levels:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Deptford township median household income, which you can refer to here.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset highlights the demographics of upper-middle-class people living in Gachibowli, Hyderabad, India, and attempts, through various methods of statistical analysis, to establish relationships among several of these demographic details.
This table presents income shares, thresholds, tax shares, and total counts of individual Canadian tax filers, with a focus on high-income individuals (95% income threshold, 99% threshold, etc.). Income thresholds are based on national threshold values, regardless of the selected geography; for example, the number of Nova Scotians in the top 1% will be calculated as the number of tax-filing Nova Scotians whose total income exceeded the 99% national income threshold. Different definitions of income are available in the table, namely market, total, and after-tax income, both with and without capital gains.
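The national-threshold logic described above can be sketched in a few lines; the threshold and incomes below are invented for illustration, not Statistics Canada values.

```python
# Count of a province's tax filers in the national top 1%: filers whose
# total income exceeds the *national* 99% threshold (values are invented).
national_p99_threshold = 250_000
provincial_incomes = [42_000, 310_000, 95_000, 270_000, 61_000]

top1_count = sum(income > national_p99_threshold for income in provincial_incomes)
print(top1_count)  # -> 2
```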
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in Middle Inlet, Wisconsin, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Income Levels:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Middle Inlet town median household income, which you can refer to here.
The National Hydrography Dataset Plus High Resolution (NHDPlus High Resolution) maps the lakes, ponds, streams, rivers, and other surface waters of the United States. Created by the US Geological Survey, NHDPlus High Resolution provides mean annual flow and velocity estimates for rivers and streams. Additional attributes provide connections between features, facilitating complicated analyses. For more information on the NHDPlus High Resolution dataset, see the User's Guide for the National Hydrography Dataset Plus (NHDPlus) High Resolution.

Dataset Summary
Phenomenon Mapped: Surface waters and related features of the United States and associated territories
Geographic Extent: The contiguous United States, Hawaii, portions of Alaska, Puerto Rico, Guam, US Virgin Islands, Northern Mariana Islands, and American Samoa
Projection: Web Mercator Auxiliary Sphere
Visible Scale: Visible at all scales, but the layer draws best at scales larger than 1:1,000,000
Source: USGS
Update Frequency: Annual
Publication Date: July 2022

This layer was symbolized in the ArcGIS Map Viewer; while the features will draw in the Classic Map Viewer, the advanced symbology will not. Prior to publication, the network and non-network flowline feature classes were combined into a single flowline layer. Similarly, the Area and Waterbody feature classes were merged under a single schema. Attribute fields were added to the flowline and waterbody layers to simplify symbology and enhance the layer's pop-ups. Fields added include Pop-up Title, Pop-up Subtitle, Esri Symbology (waterbodies only), and Feature Code Description. All other attributes are from the original dataset. No-data values -9999 and -9998 were converted to Null values.

What can you do with this layer?
Feature layers work throughout the ArcGIS system. Generally, your workflow with feature layers will begin in ArcGIS Online or ArcGIS Pro. Below are just a few of the things you can do with a feature service in Online and Pro.

ArcGIS Online
- Add this layer to a map in the map viewer. The layer, or a map containing it, can be used in an application.
- Change the layer's transparency and set its visibility range.
- Open the layer's attribute table and make selections. Selections made in the map or table are reflected in the other. Center on selection allows you to zoom to features selected in the map or table, and show selected records allows you to view the selected records in the table.
- Apply filters. For example, you can set a filter to show larger streams and rivers using the mean annual flow attribute or the stream order attribute.
- Change the layer's style and symbology.
- Add labels and set their properties.
- Customize the pop-up.
- Use as an input to the ArcGIS Online analysis tools. This layer works well as a reference layer with the trace downstream and watershed tools. The buffer tool can be used to draw protective boundaries around streams, and the extract data tool can be used to create copies of portions of the data.

ArcGIS Pro
- Add this layer to a 2D or 3D map.
- Use as an input to geoprocessing. For example, copy features allows you to select and then export portions of the data to a new feature class.
- Change the symbology and the attribute field used to symbolize the data.
- Open the table and make interactive selections with the map.
- Modify the pop-ups.
- Apply definition queries to create subsets of the layer.

This layer is part of the ArcGIS Living Atlas of the World, which provides an easy way to explore the landscape layers and many other beautiful and authoritative maps on hundreds of topics.

Questions?
Please leave a comment below if you have a question about this layer, and we will get back to you as soon as possible.
The data here is from the report entitled Trends in Enrollment, Credit Attainment, and Remediation at Connecticut Public Universities and Community Colleges: Results from P20WIN for the High School Graduating Classes of 2010 through 2016. The report answers three questions:
1. Enrollment: What percentage of the graduating class enrolled in a Connecticut public university or community college (UCONN, the four Connecticut State Universities, and the 12 Connecticut community colleges) within 16 months of graduation?
2. Credit Attainment: What percentage of those who enrolled in a Connecticut public university or community college within 16 months of graduation earned at least one year's worth of credits (24 or more) within two years of enrollment?
3. Remediation: What percentage of those who enrolled in one of the four Connecticut State Universities or one of the 12 community colleges within 16 months of graduation took a remedial course within two years of enrollment?
Notes on the data:
School Credit: % Earning 24 Credits is a subset of the % Enrolled in 16 Months.
School Remediation: % Enrolled in Remediation is a subset of the % Enrolled in 16 Months.
This dataset contains 250k 256×256-pixel high-resolution human, anime, and animal faces encoded by "sd-vae-ema-f8" from huggingface/diffusers (https://github.com/huggingface/diffusers/) and saved as ".pt" files (e.g. "afhq_x.pt" contains a torch.Tensor shaped [15.8k, 32, 32, 4] with dtype float32; "afhq_cls.pt" holds the dataset labels, a LongTensor shaped [15.8k,]). Each original image is 256×256×3 and is encoded to 32×32×4. The original images come from 6 different datasets on Kaggle. Motivated by personal interest, I created this dataset for class-conditional image generation with Latent Diffusion Models (LDMs).
I chose the following HQ datasets on Kaggle based on personal appetite.

| Dataset Label | Dataset Name | Dataset Size | Description | URL |
| --- | --- | --- | --- | --- |
| 0 | AFHQ | 15.8k | Cat, dog and wild animal faces | https://www.kaggle.com/datasets/dimensi0n/afhq-512/data |
| 1 | FFHQ | 70.0k | Human faces | https://www.kaggle.com/datasets/xhlulu/flickrfaceshq-dataset-nvidia-resized-256px |
| 2 | CelebA-HQ | 30.0k | Celebrity faces | https://www.kaggle.com/datasets/denislukovnikov/celebahq256-images-only |
| 3 | FaceAttributes | 24.0k | Human faces | https://www.kaggle.com/datasets/mantasu/face-attributes-grouped |
| 4 | AnimeGAN | 25.7k | Anime faces generated by StyleGAN-2 | https://www.kaggle.com/datasets/prasoonkottarathil/gananime-lite |
| 5 | AnimeFaces | 92.2k | Anime faces | https://www.kaggle.com/datasets/scribbless/another-anime-face-dataset |
I find that my LDM struggles to learn the samples in AFHQ and FaceAttributes, but behaves reasonably well on the other datasets.
The images are first downsampled to 256 pix (the above datasets provide original images at either 256 pix or 512 pix). They're normalized (img = img / 127.5 - 1) before being encoded by the sd-vae-ema-f8 encoder. The output latent code is shaped [batch_size, 32, 32, 4]. The std of the latent code is ~4.5 and the mean is <0.5.

```
import torch
from diffusers.models import AutoencoderKL

model_name = "stabilityai/sd-vae-ft-ema"
vae_model = AutoencoderKL.from_pretrained(model_name)
vae_model.eval().requires_grad_(False)

def encode(normalized_images: torch.Tensor, mode=True):
    # Encode images into the latent distribution, then take its mode
    # (deterministic) or draw a sample.
    dist = vae_model.encode(normalized_images).latent_dist
    if mode:
        return dist.mode()
    else:
        return dist.sample()

def decode(latent_code: torch.Tensor):
    # Decode latent codes back to image space.
    return vae_model.decode(latent_code).sample
```

It took about 45 min on a P100 GPU on Kaggle to encode these 250k images (with a batch size of 32, which didn't fully take advantage of the GPU's 16 GB VRAM!).
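Once encoded, the .pt files can be consumed for LDM training roughly as follows. This is a sketch under assumptions: random tensors stand in for the actual `torch.load("afhq_x.pt")` / `torch.load("afhq_cls.pt")` calls, and dividing by the reported std of ~4.5 to roughly whiten the latents is my choice, not necessarily the author's.

```python
import torch

# Stand-ins for the saved tensors (shapes as described above):
# latents = torch.load("afhq_x.pt")    # [N, 32, 32, 4], float32
# labels  = torch.load("afhq_cls.pt")  # [N], int64
latents = torch.randn(16, 32, 32, 4) * 4.5
labels = torch.randint(0, 6, (16,))

# Most diffusion codebases expect channels-first inputs with roughly unit
# variance, so permute and rescale by the reported std (~4.5). The exact
# scale factor is an assumption, not part of the dataset.
x = latents.permute(0, 3, 1, 2) / 4.5  # -> [N, 4, 32, 32]
print(x.shape)
```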
https://www.etalab.gouv.fr/licence-ouverte-open-licence
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Cellular communications, especially with the advent of 5G mobile networks, demand stringent adherence to high-reliability standards, ultra-low latency, increased capacity, enhanced security, and high-speed user connectivity. To fulfill these requirements, mobile operators require a programmable solution capable of supporting multiple independent tenants on a single physical infrastructure. The advent of 5G networks facilitates end-to-end resource allocation through Network Slicing (NS), which allows for the division of the network into distinct virtual slices.
Network slicing in 5G stands as a pivotal feature for next-generation wireless networks, delivering substantial benefits to both mobile operators and businesses. Developing a Machine Learning (ML) model is crucial for accurately predicting the optimal network slice based on key device parameters. Such a model also plays a vital role in managing network load balancing and addressing network slice failures.
The dataset is structured to support the development of an ML model that can classify the optimal network slice based on device parameters. The target output comprises three distinct classes:
Enhanced Mobile Broadband (eMBB):
Ultra-Reliable Low Latency Communication (URLLC):
Massive Machine Type Communication (mMTC):
deepslice_data.csv
The dataset includes labeled instances categorized into the three target classes: eMBB, URLLC, and mMTC. Each instance corresponds to a specific device configuration and its optimal network slice.
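As a minimal sketch of the classification task described above, the snippet below trains a decision tree on synthetic rows. The three feature columns are hypothetical stand-ins for the device parameters in deepslice_data.csv, not its actual schema, and the values are invented.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical device-parameter rows: [bandwidth demand, latency budget,
# device density]. These columns are invented for illustration and do not
# reflect the real columns of deepslice_data.csv.
X = [
    [100, 10, 1],  # high bandwidth demand   -> eMBB
    [  1,  1, 1],  # ultra-low latency need  -> URLLC
    [  0,  5, 9],  # massive device density  -> mMTC
    [ 90, 12, 1],
    [  2,  1, 1],
    [  0,  6, 8],
]
y = ["eMBB", "URLLC", "mMTC", "eMBB", "URLLC", "mMTC"]

clf = DecisionTreeClassifier(random_state=0).fit(X, y)
pred = clf.predict([[95, 11, 1]])[0]  # a new high-bandwidth device
print(pred)  # -> eMBB
```

In practice, one would load the real CSV with pandas, hold out a test split, and compare several model families before deploying a slice predictor.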
Network slicing in 5G is instrumental in provisioning tailored network services for specific use cases, ensuring optimal performance, resource utilization, and user experiences based on the requirements of eMBB, URLLC, and mMTC applications. This dataset is invaluable for researchers and practitioners aiming to design and implement ML models for network slice prediction, thereby enhancing the operational efficiency and reliability of 5G networks.
This dataset is meticulously curated to facilitate the development of ML models for predicting the optimal 5G network slice. It encompasses a comprehensive set of attributes and target classes, ensuring that it meets the highest standards required for advanced research and practical applications in the field of cellular communications and network management.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area exposed to one or more hazards represented on the hazard map used for the risk analysis of the RPP. The hazard map is the result of the hazard study, the objective of which is to assess the intensity of each hazard at any point in the study area. The evaluation method is specific to each hazard type. It leads to the delimitation of a set of areas on the study perimeter, constituting a zoning graduated according to the hazard level. The assignment of a hazard level at a given point in the territory takes into account the probability of occurrence of the dangerous phenomenon and its degree of intensity. For multi-hazard PPRNs, each zone is usually identified on the hazard map by a code for each hazard to which it is exposed. All hazard areas shown on the hazard map are included. Areas protected by protective structures must be represented (possibly in a specific way), as they are always considered to be subject to hazard (in case of breakage or inadequacy of the structure).

The hazard zones may be classified as compiled data insofar as they result from a synthesis using several sources of calculated, modelled, or observed hazard data. These source data are not covered by this object class but by another standard dealing with hazard knowledge. Some areas within the study area are considered "no or insignificant hazard zones": these are areas where the hazard has been studied and found to be nil. They are not included in the object class and do not have to be represented as hazard zones. However, in the case of natural RPPs, the regulatory zoning may classify certain areas not exposed to hazard as prescription areas (see the definition of the PPR class).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in St. Tammany Parish, LA, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Income Levels:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are therefore subject to sampling variability and a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for St. Tammany Parish median household income, which you can refer to here.
For detailed information, visit the Tucson Equity Priority Index StoryMap. Download the Data Dictionary.

What is the Tucson Equity Priority Index (TEPI)?
The Tucson Equity Priority Index (TEPI) is a tool that describes the distribution of socially vulnerable demographics. It categorizes the dataset into 5 classes that represent differing prioritization needs based on the presence of social vulnerability: Low (0-20), Low-Moderate (20-40), Moderate (40-60), Moderate-High (60-80), and High (80-100). Each class represents 20% of the dataset's features in order of their values. The features within the Low (0-20) classification represent the areas that, when compared to all other locations in the study area, have the lowest need for prioritization, as they tend to have less socially vulnerable demographics. The features that fall into the High (80-100) classification represent the 20% of locations in the dataset that have the greatest need for prioritization, as they tend to have the highest proportions of socially vulnerable demographics.

How is social vulnerability measured?
The Tucson Equity Priority Index (TEPI) examines the proportion of vulnerability per feature using 11 demographic indicators:
- Income Below Poverty: Households with income at or below the federal poverty level (FPL), which in 2023 was $14,500 for an individual and $30,000 for a family of four
- Unemployment: Measured as the percentage of unemployed persons in the civilian labor force
- Housing Cost Burdened: Homeowners who spend more than 30% of their income on housing expenses, including mortgage, maintenance, and taxes
- Renter Cost Burdened: Renters who spend more than 30% of their income on rent
- No Health Insurance: Those without private health insurance, Medicare, Medicaid, or any other plan or program
- No Vehicle Access: Households without automobile, van, or truck access
- High School Education or Less: Those whose highest level of educational attainment is a high school diploma, equivalency, or less
- Limited English Ability: Those whose ability to speak English is "Less Than Well"
- People of Color: Those who identify as anything other than Non-Hispanic White
- Disability: Households with one or more physical or cognitive disabilities
- Age: Groups that tend to have higher levels of vulnerability, including children (those below 18) and seniors (those 65 and older)
An overall percentile value is calculated for each feature based on the total proportion of the above indicators in each area.

How are the variables combined?
These indicators are divided into two main categories that we call Thematic Indices: Economic and Personal Characteristics. The two thematic indices are further divided into five sub-indices called Tier-2 Sub-Indices. Each Tier-2 Sub-Index contains 2-3 indicators. Indicators are the datasets used to measure vulnerability within each sub-index. The variables for each feature are re-scaled using the percentile normalization method, which converts them to the same scale, using values between 0 and 100.

The variables are then combined, first into each of the five Tier-2 Sub-Indices, then into the Thematic Indices, then into the overall TEPI, using the mean aggregation method and equal weighting. The resulting dataset is then divided into the five classes, where:
- High Vulnerability (80-100%): Representing the top classification, this category includes the highest 20% of regions that are the most socially vulnerable. These areas require the most focused attention.
- Moderate-High Vulnerability (60-80%): This upper-middle classification includes areas with higher levels of vulnerability compared to the median. While not the highest, these areas are more vulnerable than a majority of the dataset and should be considered for targeted interventions.
- Moderate Vulnerability (40-60%): Representing the middle or median quintile, this category includes areas of average vulnerability. These areas may show a balanced mix of high and low vulnerability. Detailed examination of specific indicators is recommended to understand the nuanced needs of these areas.
- Low-Moderate Vulnerability (20-40%): Falling into the lower-middle classification, this range includes areas that are less vulnerable than most but may still exhibit certain vulnerable characteristics. These areas typically have a mix of lower and higher indicators, with the lower values predominating.
- Low Vulnerability (0-20%): This category represents the bottom classification, encompassing the lowest 20% of data points. Areas in this range are the least vulnerable, making them the most resilient compared to all other features in the dataset.
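The percentile normalization and equal-weight mean aggregation steps described above can be sketched as follows. The two indicators and their values are made-up (the real TEPI uses 11 indicators grouped into sub-indices), and the final step here bins by score, whereas the actual TEPI assigns classes by feature quantiles.

```python
def percentile_normalize(values):
    # Rank-based rescale to 0-100: lowest value -> 0, highest -> 100
    # (assumes distinct values; ties would need an averaged-rank rule).
    order = sorted(values)
    n = len(values)
    return [100.0 * order.index(v) / (n - 1) for v in values]

# Hypothetical indicator values for five features (e.g. census tracts).
poverty = [5.0, 12.0, 30.0, 8.0, 21.0]
no_vehicle = [2.0, 9.0, 15.0, 4.0, 11.0]

p_norm = percentile_normalize(poverty)
v_norm = percentile_normalize(no_vehicle)

# Equal-weight mean aggregation into one combined score per feature.
tepi = [(p + v) / 2 for p, v in zip(p_norm, v_norm)]

# Map scores onto the five 20-point classes named above.
def classify(score):
    labels = ["Low", "Low-Moderate", "Moderate", "Moderate-High", "High"]
    return labels[min(int(score // 20), 4)]

print([classify(s) for s in tepi])
```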
The National Longitudinal Study of the High School Class of 1972 (NLS-72) is part of the Secondary Longitudinal Studies (SLS) program; program data are available since 1972 at https://nces.ed.gov/pubsearch/getpubcats.asp?sid=021. The National Longitudinal Study of the High School Class of 1972 (NLS-72) (https://nces.ed.gov/surveys/nls72/index.asp) is a longitudinal survey that follows high school seniors through 5 follow-ups in 1973, 1974, 1976, 1979, and 1986. The study was conducted using a nationally representative sample of 1972 high school seniors. Key statistics produced from the NLS-72 are students' educational aspirations and attainment, family formation, and occupations.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Welcome to the UltraCortex repository, which hosts a unique collection of ultra-high field (9.4 Tesla) MRI data of the human brain. This dataset includes detailed structural images and high-quality manual segmentations, making it an invaluable resource for researchers in neuroimaging and computational neuroscience.
The UltraCortex dataset aims to:
The dataset is publicly available on the OpenNeuro repository:
If you use this dataset, please cite:
Mahler, L., Steiglechner, J. et al. (2024). UltraCortex: Submillimeter Ultra-High Field 9.4 T1 Brain MR Image Collection and Manual Cortical Segmentations
For any questions or further information, please contact the corresponding authors:
Thank you for using the UltraCortex dataset. We hope it facilitates your research and contributes to advancements in neuroimaging.
With the advent and expansion of social networking, the amount of generated text data has seen a sharp increase. In order to handle such a huge volume of text data, new and improved text mining techniques are a necessity. One of the characteristics of text data that makes text mining difficult is multi-labelity. In order to build a robust and effective text classification method, which is an integral part of text mining research, we must consider this property more closely. This property is not unique to text data, as it can be found in non-text (e.g., numeric) data as well; however, it is most prevalent in text data. It also places the text classification problem in the domain of multi-label classification (MLC), where each instance is associated with a subset of class labels instead of a single class, as in conventional classification. In this paper, we explore how the generation of pseudo labels (i.e., combinations of existing class labels) can help us perform better text classification, and under what circumstances. During classification, the high and sparse dimensionality of text data has also been considered. Although we propose and evaluate a text classification technique here, our main focus is on handling the multi-labelity of text data while utilizing the correlation among the multiple labels existing in the data set. Our text classification technique is called pseudo-LSC (pseudo-Label Based Subspace Clustering). It is a subspace clustering algorithm that considers the high and sparse dimensionality as well as the correlation among different class labels during the classification process to provide better performance than existing approaches. Results on three real-world multi-label data sets provide insight into how multi-labelity is handled in our classification process and show the effectiveness of our approach.
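A hedged sketch of the pseudo-label idea from the abstract: treat each distinct combination of class labels observed in the data as one pseudo label, turning multi-label instances into single-label ones. (The subspace clustering part of pseudo-LSC is not shown; the label sets below are invented.)

```python
# Each document's set of class labels (invented example data).
docs_labels = [
    {"sports"},
    {"sports", "politics"},
    {"politics"},
    {"sports", "politics"},
]

# Map each distinct label combination to a pseudo-label id.
pseudo_ids = {}
assignments = []
for labels in docs_labels:
    key = frozenset(labels)
    pseudo_ids.setdefault(key, len(pseudo_ids))
    assignments.append(pseudo_ids[key])

print(assignments)  # -> [0, 1, 2, 1]
```

A conventional single-label classifier can then be trained on these pseudo labels, with predictions mapped back to the original label sets.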
Land cover describes the surface of the earth. Land cover maps are useful in urban planning, resource management, change detection, agriculture, and a variety of other applications that require information about the earth's surface. Land cover classification is a complex exercise and is hard to capture using traditional means. Deep learning models are highly capable of learning these complex semantics and can produce superior results.

Using the model
Follow the guide to use the model. Before using this model, ensure that the supported deep learning libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.

Fine-tuning the model
This model can be fine-tuned using the Train Deep Learning Model tool. Follow the guide to fine-tune this model.

Input
8-bit, 3-band high-resolution (80-100 cm) imagery.

Output
Classified raster with the same classes as in the Chesapeake Bay Landcover dataset (2013/2014). By default, the output raster contains 9 classes. A simpler classification with 6 classes can be performed by setting the 'detailed_classes' model argument to false.
Note: The output classified raster will not contain the 'Aberdeen Proving Ground' class. Find class descriptions here.

Applicable geographies
This model is applicable in the United States and is expected to produce best results in the Chesapeake Bay region.

Model architecture
This model uses the UNet model architecture implemented in ArcGIS API for Python.

Accuracy metrics
This model has an overall accuracy of 86.5% for classification into 9 land cover classes and 87.86% for 6 classes.
The table below summarizes the precision, recall, and F1-score of the model on the validation dataset, for classification into 9 land cover classes:

Class                  Precision   Recall    F1 Score
Water                  0.93614     0.93046   0.93329
Wetlands               0.81659     0.75905   0.78677
Tree Canopy            0.90477     0.93143   0.91791
Shrubland              0.51625     0.18643   0.27394
Low Vegetation         0.85977     0.86676   0.86325
Barren                 0.67165     0.50922   0.57927
Structures             0.80510     0.84887   0.82641
Impervious Surfaces    0.73532     0.68556   0.70957
Impervious Roads       0.76281     0.81238   0.78682

The table below summarizes the precision, recall, and F1-score of the model on the validation dataset, for classification into 6 land cover classes:

Class                    Precision   Recall    F1 Score
Water                    0.95        0.94      0.95
Tree Canopy and Shrubs   0.91        0.92      0.92
Low Vegetation           0.85        0.85      0.85
Barren                   0.79        0.69      0.74
Impervious Surfaces      0.84        0.84      0.84
Impervious Roads         0.82        0.83      0.82

Training data
This model has been trained on the Chesapeake Bay high-resolution 2013/2014 NAIP Landcover dataset (produced by the Chesapeake Conservancy with its partners, the University of Vermont Spatial Analysis Lab (UVM SAL) and Worldview Solutions, Inc. (WSI)) and other high-resolution imagery. Find more information about the dataset here.

Sample results
Here are a few results from the model.
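For reference, the per-class precision, recall, and F1-score values in the tables above are derived from true-positive, false-positive, and false-negative pixel counts on the validation set. A minimal sketch (the counts shown are hypothetical, not the model's actual confusion matrix):

```python
def precision_recall_f1(tp, fp, fn):
    """Per-class metrics from confusion counts on a validation set."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical counts for one class
p, r, f = precision_recall_f1(tp=90, fp=10, fn=10)

# Sanity check against the 9-class Water row: F1 = 2PR / (P + R)
water_f1 = 2 * 0.93614 * 0.93046 / (0.93614 + 0.93046)  # ≈ 0.93329
```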
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in Sands Point, NY, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Income Levels:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
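For context, the margins of error the Census Bureau publishes with ACS estimates correspond to a 90% confidence level, so the confidence interval and standard error can be recovered directly from an estimate and its MOE. A small sketch; the estimate and MOE below are illustrative values, not figures from this dataset:

```python
Z_90 = 1.645  # critical value the Census Bureau uses for ACS MOEs

def acs_interval(estimate, moe):
    """Return (lower, upper, standard_error) for an ACS estimate
    published with a 90%-confidence margin of error."""
    return estimate - moe, estimate + moe, moe / Z_90

# Hypothetical quintile mean income and its margin of error
lo, hi, se = acs_interval(85000, 4100)
```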
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com about the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Sands Point median household income. You can refer to it here