Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This national, tract-level experienced racial segregation dataset uses data from over 66 million anonymized, opted-in devices in Cuebiq’s Spectus Clean Room data to estimate 15-minute time overlaps of device stays in 38.2 m x 19.1 m grid cells across the United States in 2022. We infer a probability distribution of racial backgrounds for each device given its home Census block group at the time of data collection, and calculate the probability of a diverse social contact during that space and time. These measures are then aggregated to the Census tract and across the whole time period in order to preserve privacy and develop a generalizable measure of the diversity of a place. We propose that this dataset better measures segregation and diversity as they are experienced, which we show diverges from standard measurements of segregation. Researchers can use the data to better understand the determinants of experienced segregation; beyond research, we suggest that policy makers can use it to understand the impacts of policies designed to encourage social mixing and access to opportunities, such as affordable housing and mixed-income housing.
For enhanced privacy, home Census block groups were pre-calculated by the data provider, and all calculations are done at the Census tract level, using only tracts that have more than 20 unique devices over the period of analysis.
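The description above does not give the exact formula for the probability of a diverse social contact. As a minimal sketch, assuming each device carries an independent probability distribution over racial groups, the chance that two co-located devices belong to different groups can be taken as one minus the probability that they share a group. The function names and the pairwise averaging are illustrative assumptions, not the dataset authors' method.

```python
from itertools import combinations


def prob_diverse_contact(p, q):
    """Probability that two devices belong to different groups.

    p and q map group label -> probability for each device.
    Assumes the two devices' groups are independent (an assumption,
    not something stated in the dataset description).
    """
    same_group = sum(p[g] * q.get(g, 0.0) for g in p)
    return 1.0 - same_group


def mean_diverse_contact(devices):
    """Average pairwise diversity probability for devices sharing one
    grid cell and time window. Returns None with fewer than two devices."""
    pairs = list(combinations(devices, 2))
    if not pairs:
        return None
    return sum(prob_diverse_contact(a, b) for a, b in pairs) / len(pairs)
```

Averaging such values over all cells and time windows in a tract would give one tract-level figure; the actual aggregation used for the dataset may differ.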
This dataset contains Hospital Supplier Diversity Plans.
As outlined in Health and Safety Code Section 1339.85-1339.87, licensed hospitals with operating expenses of fifty million dollars ($50,000,000) or more, and each licensed hospital with operating expenses of twenty-five million dollars ($25,000,000) or more that is part of a hospital system, shall submit an annual report to the department on its minority, women, LGBT, and disabled veteran business enterprise procurement efforts during the previous year.
Details on reporting requirements can be found in Section 1339.87.
Data notes: The information contained in a hospital’s plan on minority, women, LGBT, and disabled veteran business enterprises is provided for informational purposes only.
Suppliers are not required to disclose the above information to hospitals, and therefore not all diverse spending will be accurately identified.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the median household income across different racial categories in the United States. It portrays the median household income of the head of household across racial categories (excluding ethnicity) as identified by the Census Bureau. The dataset can be used to gain insights into economic disparities and trends and to explore the variations in median household income across racial categories.
Key observations
Based on our analysis of the distribution of the United States population by race & ethnicity, the population is predominantly White. This racial category constitutes the majority, accounting for 68.17% of the total residents in the United States. Notably, the median household income for White households is $79,933. Although the White population is the most populous, Asian households report the highest median household income, at $106,954. This reveals that, while Whites may be the most numerous in the United States, Asian households experience greater economic prosperity in terms of median household income.
[Figure: United States median household income diversity across racial categories]
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2022 1-Year Estimates.
Racial categories include:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for United States median household income by race.
Report on Demographic Data in New York City Public Schools, 2020-21
Enrollment counts are based on the November 13 Audited Register for 2020. Categories with total enrollment values of zero were omitted. Pre-K data includes students in 3-K. Data on students with disabilities, English language learners, and student poverty status are as of March 19, 2021. Due to missing demographic information in rare cases and suppression rules, demographic categories do not always add up to total enrollment and/or citywide totals. NYC DOE “Eligible for free or reduced-price lunch” counts are based on the number of students with families who have qualified for free or reduced-price lunch or are eligible for Human Resources Administration (HRA) benefits. English Language Arts and Math state assessment results for students in grade 9 are not available for inclusion in this report, as the spring 2020 exams did not take place. Spring 2021 ELA and Math test results are not included in this report for K-8 students in 2020-21. Due to the COVID-19 pandemic’s complete transformation of New York City’s school system during the 2020-21 school year, and in accordance with New York State guidance, the 2021 ELA and Math assessments were optional for students to take. As a result, 21.6% of students in grades 3-8 took the English assessment in 2021 and 20.5% of students in grades 3-8 took the Math assessment. These participation rates are not representative of New York City students and schools and are not comparable to prior years, so results are not included in this report. Dual Language enrollment includes English Language Learners and non-English Language Learners. Dual Language data are based on data from STARS; as a result, school participation and student enrollment in Dual Language programs may differ from the data in this report.
STARS course scheduling and grade management software applications provide a dynamic internal data system for school use; while standard course codes exist, data are not always consistent from school to school. This report does not include enrollment at District 75 & 79 programs. Students enrolled at Young Adult Borough Centers are represented in the 9-12 District data but not the 9-12 School data. “Prior Year” data included in Comparison tabs refers to data from 2019-20. “Year-to-Year Change” data included in Comparison tabs indicates whether the demographics of a school or special program have grown more or less similar to its district or attendance zone (or school, for special programs) since 2019-20. Year-to-year changes must have been at least 1 percentage point to qualify as “More Similar” or “Less Similar”; changes of less than 1 percentage point are categorized as “No Change”. The admissions method tab contains information on the admissions methods used for elementary, middle, and high school programs during the Fall 2020 admissions process. Fall 2020 selection criteria are included for all programs with academic screens, including middle and high school programs. Selection criteria data is based on school-reported information. Fall 2020 Diversity in Admissions priorities are included for applicable middle and high school programs. Note that the data on each school’s demographics and performance includes all students of the given subgroup who were enrolled in the school on November 13, 2020. Some of these students may not have been admitted under the admissions method(s) shown, as some students may have enrolled in the school outside the centralized admissions process (via waitlist, over-the-counter, or transfer), and schools may have changed admissions methods over the past few years. Admissions methods are only reported for grades K-12. 3K and Pre-Kindergarten data are reported at the site level.
See below for definitions of site types included in this report. Additionally, please note that this report excludes all students at District 75 sites, reflecting slightly lower enrollment than our total of 60,265 students.
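The “Year-to-Year Change” rule described for the Comparison tabs can be sketched as follows. The report does not say how similarity is measured, so this sketch assumes it is the absolute gap, in percentage points, between a school's share of a demographic group and its district's share; the function name and that definition are illustrative assumptions. Only the 1 percentage point threshold and the three labels come from the report notes.

```python
def classify_year_to_year_change(school_prior, district_prior,
                                 school_now, district_now,
                                 threshold=1.0):
    """Label how a school's demographic share moved relative to its district.

    All inputs are percentages (0-100). The gap definition is an assumption;
    the 1 percentage point threshold is from the report notes.
    """
    prior_gap = abs(school_prior - district_prior)
    current_gap = abs(school_now - district_now)
    narrowing = prior_gap - current_gap  # positive means the gap shrank
    if narrowing >= threshold:
        return "More Similar"
    if narrowing <= -threshold:
        return "Less Similar"
    return "No Change"
```

For example, a school that moves from 40% to 45% while its district stays at 50% narrows the gap by 5 points and would be labeled “More Similar” under these assumptions.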
This set of quarterly cubes provides employee population data for the new Ethnicity and Race Indicator (ERI). The numbers reflect the actual number of employees as of a specific point in time. The following workforce characteristics are available for analysis: Agency, State/Country, Age (5 year interval), Education Level, Ethnicity and Race Indicator (ERI), Length of Service (5 year interval), GS & Equivalent Grade, Occupation, Occupation Category, Pay Plan & Grade, Salary Level ($10,000 interval), STEM Occupations, Supervisory Status, Type of Appointment, Work Schedule, Work Status, Employment, Average Salary, Average Length of Service. Diversity cubes will be available for the most recent 8 quarters and the 5 previous end of fiscal year (September) files.
https://en.wikipedia.org/wiki/Public_domain
This dataset contains information about the demographics of all US cities and census-designated places with a population greater than or equal to 65,000. This data comes from the US Census Bureau's 2015 American Community Survey. This product uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau.
Ethnic diversity is generally associated with less social capital and lower levels of trust. However, most empirical evidence for this relationship is focused on generalized trust, rather than more theoretically appropriate measures of group-based trust. This paper evaluates the relationship between ethnic diversity – at national, regional, and local levels – and the degree to which coethnics are trusted more than non-coethnics, a value I call the “coethnic trust premium.” Using public opinion data from sixteen African countries, I find that citizens of ethnically diverse states express, on average, more ethnocentric trust. However, within countries, regional ethnic diversity is actually associated with less ethnocentric trust. This same negative pattern between diversity and ethnocentric trust appears across districts and enumeration areas within Malawi. I then show, consistent with these patterns, that diversity is only detrimental to intergroup trust at the national level in the presence of ethnic group segregation. These results highlight the importance of the spatial distribution of ethnic groups on intergroup relations, and question the utility of micro-level studies of interethnic interactions for understanding macro-level group dynamics.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the median household income across different racial categories in State College. It portrays the median household income of the head of household across racial categories (excluding ethnicity) as identified by the Census Bureau. The dataset can be used to gain insights into economic disparities and trends and to explore the variations in median household income across racial categories.
Key observations
Based on our analysis of the distribution of the State College population by race & ethnicity, the population is predominantly White. This racial category constitutes the majority, accounting for 80.12% of the total residents in State College. Notably, the median household income for White households is $50,296. Although the White population is the most populous, Some Other Race households report the highest median household income, at $60,333. This reveals that, while Whites may be the most numerous in State College, Some Other Race households experience greater economic prosperity in terms of median household income.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Racial categories include:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for State College median household income by race.
The Regional Ecological Assessment Protocol (REAP) is a screening-level tool created to identify priority ecological resources within the five EPA Region 6 states (Arkansas, Louisiana, New Mexico, Oklahoma, and Texas). The REAP divides eighteen individual measures into three main sub-layers: diversity, rarity, and sustainability. This geodatabase contains the diversity layers. There are 2 diversity grids within this geodatabase (diversity & diversityrank). The diversity layer shows land cover continuity and diversity. Three measures make up the diversity layer: appropriateness of land cover, contiguous size of undeveloped area, and the Shannon land cover diversity index. Each cell in the final diversity grid has a score between 1 and 100 based on the average of the three measures. Cells with higher scores represent areas that are more diverse. Cells with lower scores represent areas that are less diverse. In the diversityrank grid, the cells are placed into the following 5 groups based on the score: 1 (top 1% of scores), 10 (top 10% of scores), 25 (top 25% of scores), 50 (top 50% of scores), and 100 (all the rest of the scores). See each individual feature class for more detailed metadata.
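The scoring described for the diversity layer can be illustrated with a short sketch. It shows the standard Shannon diversity index over land cover classes, a plain average of three measures, and a mapping from a score to the five diversityrank groups. How REAP rescales each measure to the 1-100 range, and how it handles ties at the percentile cut-offs, is not stated in the description, so those details (and all function names) are assumptions.

```python
import math
from collections import Counter


def shannon_index(land_cover_cells):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over land cover classes."""
    counts = Counter(land_cover_cells)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def diversity_score(appropriateness, contiguous_size, shannon_scaled):
    """Average of the three measures, each assumed already scaled to 1-100."""
    return (appropriateness + contiguous_size + shannon_scaled) / 3


def rank_group(score, all_scores):
    """Map a score to 1, 10, 25, 50, or 100 by its percentile position.

    Uses the fraction of cells scoring strictly higher; this tie handling
    is an assumption, not documented REAP behavior.
    """
    higher = sum(1 for s in all_scores if s > score)
    fraction_above = higher / len(all_scores)
    for cutoff, group in ((0.01, 1), (0.10, 10), (0.25, 25), (0.50, 50)):
        if fraction_above < cutoff:
            return group
    return 100
```

A window containing a single land cover class gives a Shannon index of zero, and the index grows as classes become more numerous and more evenly mixed.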
ShellBase is a normal-form database including historical bacteriological monitoring data collected in shellfish waters of North Carolina (NC), South Carolina (SC), Georgia (GA), and Florida (FL), USA, from 1979 to 2020. ShellBase includes fecal coliform measurements, ancillary environmental data when available (e.g., tidal stage, water quality parameters), laboratory analysis method, sampling strategy, and geospatial information including sampling station, growing area, and growing area classification. ShellBase was created to allow monitoring data from the diverse state programs to be integrated into a single database. However, although the data from NC, SC, GA, and FL are combined in this integrated database, data cannot necessarily be compared between states given the diversity in the regulatory and monitoring strategies of each state shellfish sanitation program (see the Purpose section for more detail). ShellBase metadata includes descriptions of each state program’s classification, monitoring, and analysis schemes at the time of publishing the database. However, users who are interested in using these data for analysis and modeling are encouraged to contact the respective state shellfish sanitation programs to ensure the data are used responsibly. These data include tables and related lookup tables of information regarding bacteria density, salinity, water temperature, and pH measurements of offshore locations from 1979 to 2020.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the median household income across different racial categories in the United States. It portrays the median household income of the head of household across racial categories (excluding ethnicity) as identified by the Census Bureau. The dataset can be used to gain insights into economic disparities and trends and to explore the variations in median household income across racial categories.
Key observations
Based on our analysis of the distribution of the United States population by race & ethnicity, the population is predominantly White. This racial category constitutes the majority, accounting for 63.44% of the total residents in the United States. Notably, the median household income for White households is $83,784. Although the White population is the most populous, Asian households report the highest median household income, at $113,106. This reveals that, while Whites may be the most numerous in the United States, Asian households experience greater economic prosperity in terms of median household income.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Racial categories include:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for United States median household income by race.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Emergency medical services (EMS) workforce demographics in the United States do not reflect the diversity of the population served. Despite some efforts by professional organizations to create a more representative workforce, little has changed in the last decade. This scoping review aims to summarize existing literature on the demographic composition, recruitment, retention, and workplace experience of underrepresented groups within EMS. Peer-reviewed studies were obtained from a search of PubMed, CINAHL, Web of Science, ProQuest Thesis and Dissertations, and non-peer-reviewed (“gray”) literature from 1960 to present. Abstracts and included full-text articles were screened by two independent reviewers trained on inclusion/exclusion criteria. Studies were included if they pertained to the demographics, training, hiring, retention, promotion, compensation, or workplace experience of underrepresented groups in United States EMS by race, ethnicity, sexual orientation, or gender. Studies of non-EMS fire department activities were excluded. Disputes were resolved by two authors. A single reviewer screened the gray literature. Data extraction was performed using a standardized electronic form. Results were summarized qualitatively. We identified 87 relevant full-text articles from the peer-reviewed literature and 250 items of gray literature. Primary themes emerging from peer-reviewed literature included workplace experience (n = 48), demographics (n = 12), workforce entry and exit (n = 8), education and testing (n = 7), compensation and benefits (n = 5), and leadership, mentorship, and promotion (n = 4). Most articles focused on sex/gender comparisons (65/87, 75%), followed by race/ethnicity comparisons (42/87, 48%). Few articles examined sexual orientation (3/87, 3%). One study focused on telecommunicators and three included EMS physicians. Most studies (n = 60, 69%) were published in the last decade. 
In the gray literature, media articles (216/250, 86%) demonstrated significant industry discourse surrounding these primary themes. Existing EMS workforce research demonstrates continued underrepresentation of women and nonwhite personnel. Additionally, these studies raise concerns for pervasive negative workplace experiences including sexual harassment and factors that negatively affect recruitment and retention, including bias in candidate testing, a gender pay gap, and unequal promotion opportunities. Additional research is needed to elucidate recruitment and retention program efficacy, the demographic composition of EMS leadership, and the prevalence of racial harassment and discrimination in this workforce.
CDFW BIOS GIS Dataset, Contact: Melanie Gogol-Prokurat, Description: Rare species richness is a measure of the diversity of rare species in the landscape, and is one measurement used to describe the distribution of overall species biodiversity in California for the California Department of Fish and Wildlife's (CDFW) Areas of Conservation Emphasis Project (ACE). The rare species richness summary depicts relative rare species diversity within each ecoregion across the state, so that areas of highest diversity within each ecoregion are highlighted.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Species Biodiversity, Areas of Conservation Emphasis (ACE), version 3.0.
The California Department of Fish and Wildlife's (CDFW) Areas of Conservation Emphasis (ACE) Species Biodiversity dataset is a summary of the best available information on species biodiversity in California, and is based on species occurrence and distribution information for amphibians, aquatic macroinvertebrates, birds, fish, mammals, plants, and reptiles. It synthesizes information from the ACE Terrestrial Biodiversity Summary, which is compiled by hexagon, and the Aquatic Biodiversity Summary, which is compiled by watershed. The biodiversity summary combines three measures of biodiversity: 1) native species richness, which represents overall native diversity of all species in the state, both common and rare; 2) rare species richness, which represents diversity of rare species; and, 3) irreplaceability, which is a weighted measure of endemism. The data can be used to view patterns of overall species diversity, and identify areas of highest biodiversity, taking into account common, rare, and rare endemic species.
This dataset displays relative biodiversity values for each ecoregion of the state, so that the areas of highest diversity within each ecoregion are highlighted. The data is normalized so that areas of highest diversity for each taxonomic group contribute equally to the final map (see Data Sources and Models Used section). The attribute table for this dataset includes the final ranks for all ACE datasets, providing an overview of all ACE scores for an area.
https://creativecommons.org/publicdomain/zero/1.0/
Cultural diversity in the U.S. has led to great variations in names and naming traditions, and names have been used to express creativity, personality, cultural identity, and values. Source: https://en.wikipedia.org/wiki/Naming_in_the_United_States
This public dataset was created by the Social Security Administration and contains all names from Social Security card applications for births that occurred in the United States after 1879. Note that many people born before 1937 never applied for a Social Security card, so their names are not included in this data. For others who did apply, records may not show the place of birth, and again their names are not included in the data.
All data are from a 100% sample of records on Social Security card applications as of the end of February 2015. To safeguard privacy, the Social Security Administration restricts names to those with at least 5 occurrences.
https://bigquery.cloud.google.com/dataset/bigquery-public-data:usa_names
https://cloud.google.com/bigquery/public-data/usa-names
Dataset Source: Data.gov. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source — http://www.data.gov/privacy-policy#data_policy — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
What are the most common names?
What are the most common female names?
Are there more female or male names?
Female names by a wide margin?
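The questions above can be explored with simple aggregation. The sketch below works on rows of (name, gender, count); that row layout is an assumption for illustration, and the sample rows are made-up toy values, not real Social Security Administration counts. Against the real table the same logic would be a grouped sum over the name and gender fields.

```python
from collections import Counter, defaultdict

# Toy rows for illustration only: (name, gender, count). Not real SSA data.
TOY_ROWS = [
    ("Alex", "F", 5),
    ("Alex", "M", 7),
    ("Blake", "F", 6),
    ("Casey", "F", 9),
    ("Casey", "F", 5),  # same name appearing in another year
]


def most_common_names(rows, gender=None, n=3):
    """Return the n names with the largest total count, optionally by gender."""
    totals = Counter()
    for name, row_gender, count in rows:
        if gender is None or row_gender == gender:
            totals[name] += count
    return totals.most_common(n)


def distinct_names_by_gender(rows):
    """Count how many distinct names appear for each gender."""
    names = defaultdict(set)
    for name, row_gender, _count in rows:
        names[row_gender].add(name)
    return {g: len(found) for g, found in names.items()}
```

Calling `most_common_names(TOY_ROWS, gender="F")` addresses the second question, and `distinct_names_by_gender(TOY_ROWS)` the third.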
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The RRING Work Package 3 (WP3) objective was to clarify how Research Funding Organisations (RFOs) and Research Performing Organisations (RPOs) operated within region-specific research and innovation environments. It explored how they navigated the governance and regulatory frameworks for Responsible Research and Innovation (RRI), as well as offering their perspectives on the entities responsible for RRI-related policy and action in their locales.
This data set covers the global survey research part, which was designed to contextualise how RPOs and RFOs interacted within the research environment and with non-academic stakeholders. Countries were grouped according to the UNESCO regions of the world and key results per region are listed below. For a detailed analysis and further findings of the work completed under WP3 of the RRING project, please refer to the full deliverable document "State of the Art of RRI in the Five UNESCO World Regions" [link to be inserted].
European and North American States
Latin American and Caribbean States
Asian and Pacific States
Arab States
African States
Note: Please refer to the "RRING WP3 - Survey Data Documentation" document for detailed instructions on how to use this dataset.
Species Biodiversity Summaries combine the three measures of biodiversity developed for Areas of Conservation Emphasis into a single measure. These three measures include: 1) native species richness, which represents overall native diversity of all species in the state, both common and rare, as well as climate vulnerable species and important game and sport fish species; 2) rare species richness, which represents diversity of rare species; and, 3) irreplaceability, which is a weighted measure of endemism that highlights areas that support unique species of limited range.
Species Biodiversity, Areas of Conservation Emphasis (ACE), version 3.0.
The California Department of Fish and Wildlife’s (CDFW) Areas of Conservation Emphasis (ACE) Species Biodiversity dataset is a summary of the best available information on species biodiversity in California, and is based on species occurrence and distribution information for amphibians, aquatic macroinvertebrates, birds, fish, mammals, plants, and reptiles. It synthesizes information from the ACE Terrestrial Biodiversity Summary, which is compiled by hexagon, and the Aquatic Biodiversity Summary, which is compiled by watershed. The biodiversity summary combines three measures of biodiversity: 1) native species richness, which represents overall native diversity of all species in the state, both common and rare; 2) rare species richness, which represents diversity of rare species; and, 3) irreplaceability, which is a weighted measure of endemism. The data can be used to view patterns of overall species diversity, and identify areas of highest biodiversity, taking into account common, rare, and rare endemic species. This dataset displays relative biodiversity values for each ecoregion of the state, so that the areas of highest diversity within each ecoregion are highlighted. The data is normalized so that areas of highest diversity for each taxonomic group contribute equally to the final map (see Data Sources and Models Used section). The attribute table for this dataset includes the final ranks for all ACE datasets, providing an overview of all ACE scores for an area.
For more information, see the Terrestrial Biodiversity Summary Factsheet at https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=152834
The user can view a list of species potentially present in each hexagon in the ACE online map viewer https://map.dfg.ca.gov/ace/. Note that the names of some rare or endemic species, such as those at risk of over-collection, have been suppressed from the list of species names per hexagon, but are still included in the species counts.
The California Department of Fish and Wildlife’s (CDFW) Areas of Conservation Emphasis (ACE) is a compilation and analysis of the best-available statewide spatial information in California on biodiversity, rarity and endemism, harvested species, significant habitats, connectivity and wildlife movement, climate vulnerability, climate refugia, and other relevant data (e.g., other conservation priorities such as those identified in the State Wildlife Action Plan (SWAP), stressors, land ownership). ACE addresses both terrestrial and aquatic data. The ACE model combines and analyzes terrestrial information in a 2.5 square mile hexagon grid and aquatic information at the HUC12 watershed level across the state to produce a series of maps for use in non-regulatory evaluation of conservation priorities in California. The model addresses as many of CDFW’s statewide conservation and recreational mandates as feasible using high quality data sources. High value areas statewide and in each USDA Ecoregion were identified. The ACE maps and data can be viewed in the ACE online map viewer, or downloaded for use in ArcGIS. For more detailed information see https://www.wildlife.ca.gov/Data/Analysis/ACE and https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=24326.
Preprocessing methods: Kept all species biodiversity ranks for display on web map.
See full Resource Data Guide here.Abstract: The Natural Diversity Database Areas is a 1:24,000-scale, polygon feature-based layer that represents general locations of endangered, threatened and special concern species. The layer is based on information collected by DEEP biologists, cooperating scientists, conservation groups and landowners. In some cases an occurrence represents a location derived from literature, museum records and specimens. These data are compiled and maintained by the DEEP Bureau of Natural Resources, Natural Diversity Database Program. The layer is updated every six months and reflects information that has been submitted and accepted up to that point. The layer includes state and federally listed species. It does not include Critical Habitats, Natural Area Preserves, designated wetland areas or wildlife concentration areas. These general locations were created by randomly shifting the true locations of terrestrial species and then adding a 0.25 mile buffer distance to each point, and by mapping linear segments with a 300 foot buffer associated with aquatic, riparian and coastal species. The exact location of the species observation falls somewhere within the polygon area and not necessarily in the center. Attribute information includes the date when these data were last updated. Species names are withheld to protect sensitive species from collection and disturbance. Data is compiled at 1:24,000 scale. These data are updated every six months, approximately in June and December. It is important to use the most current data available.Purpose: This dataset was developed to help state agencies and landowners comply with the State Endangered Species Act. Under the Act, state agencies are required to ensure that any activity authorized, funded or performed by the state does not threatened the continued existence of endangered or threatened species or their essential habitat. 
Applicants for certain state and local permits may be required to consult with the Department of Energy and Environmental Protection's Natural Diversity Data Base (NDDB) as part of the permit process. Follow instructions provided in the appropriate permit guidance. If you require a federal endangered species review, work with your federal regulatory agency and review the US Fish & Wildlife IPaC tool. Natural Diversity Data Base Areas are intended to be used as a pre-screening tool to identify potential impacts to known locations of state listed species. To use this data for site-based endangered species review, locate the project boundaries and any additionally affected areas on the map. If any part of the project is within an NDDB Area then the project may have a conflict with listed species. In the case of a potential conflict, an Environmental Review Request (https://portal.ct.gov/deep-nddbrequest) should be made to the Natural Diversity Data Base for further review. The DEEP will provide recommendations for avoiding impacts to state listed species. Additional onsite surveys may be requested of the applicant depending on the nature and scope of a project. For this reason, applicants should apply early in the planning stages of a project. Not all land use choices will impact the particular species that is present. Often minor modifications to the proposed plan can alleviate conflicts with state listed species. Other uses of the data include targeting areas for conservation or site management to enhance and protect rare species habitats.
Supplemental information: For additional information, refer to the Department of Energy and Environmental Protection En
NOTE: A more current version of the Protected Areas Database of the United States (PAD-US) is available: PAD-US 3.0 https://doi.org/10.5066/P9Q9LQ4B. The USGS Protected Areas Database of the United States (PAD-US) is the nation's inventory of protected areas, including public land and voluntarily provided private protected areas, identified as an A-16 National Geospatial Data Asset in the Cadastre Theme (https://communities.geoplatform.gov/ngda-cadastre/). The PAD-US is an ongoing project with several published versions of a spatial database including areas dedicated to the preservation of biological diversity, and other natural (including extraction), recreational, or cultural uses, managed for these purposes through legal or other effective means. The database was originally designed to support biodiversity assessments; however, its scope expanded in recent years to include all public and nonprofit lands and waters. Most are public lands owned in fee (the owner of the property has full and irrevocable ownership of the land); however, long-term easements, leases, agreements, Congressional (e.g. 'Wilderness Area'), Executive (e.g. 'National Monument'), and administrative designations (e.g. 'Area of Critical Environmental Concern') documented in agency management plans are also included. The PAD-US strives to be a complete inventory of public land and other protected areas, compiling “best available” data provided by managing agencies and organizations. The PAD-US geodatabase maps and describes areas using over twenty-five attributes and five feature classes representing the U.S. protected areas network in separate feature classes: Fee (ownership parcels), Designation, Easement, Marine, Proclamation and Other Planning Boundaries. Five additional feature classes include various combinations of the primary layers (for example, Combined_Fee_Easement) to support data management, queries, web mapping services, and analyses. 
This PAD-US Version 2.1 dataset includes a variety of updates and new data from the previous Version 2.0 dataset (USGS, 2018 https://doi.org/10.5066/P955KPLE ), achieving the primary goal to "Complete the PAD-US Inventory by 2020" (https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-vision) by addressing known data gaps with newly available data. The following list summarizes the integration of "best available" spatial data to ensure public lands and other protected areas from all jurisdictions are represented in PAD-US, along with continued improvements and regular maintenance of the federal theme. Completing the PAD-US Inventory: 1) Integration of over 75,000 city parks in all 50 States (and the District of Columbia) from The Trust for Public Land's (TPL) ParkServe data development initiative (https://parkserve.tpl.org/) added nearly 2.7 million acres of protected area and significantly reduced the primary known data gap in previous PAD-US versions (local government lands). 2) First-time integration of the Census American Indian/Alaskan Native Areas (AIA) dataset (https://www2.census.gov/geo/tiger/TIGER2019/AIANNH) representing the boundaries for federally recognized American Indian reservations and off-reservation trust lands across the nation (as of January 1, 2020, as reported by the federally recognized tribal governments through the Census Bureau's Boundary and Annexation Survey) addressed another major PAD-US data gap. 3) Aggregation of nearly 5,000 protected areas owned by local land trusts in 13 states, aggregated by Ducks Unlimited through data calls for easements to update the National Conservation Easement Database (https://www.conservationeasement.us/), increased PAD-US protected areas by over 350,000 acres. 
Maintaining regular Federal updates: 1) Major update of the Federal estate (fee ownership parcels, easement interest, and management designations), including authoritative data from 8 agencies: Bureau of Land Management (BLM), U.S. Census Bureau (Census), Department of Defense (DOD), U.S. Fish and Wildlife Service (FWS), National Park Service (NPS), Natural Resources Conservation Service (NRCS), U.S. Forest Service (USFS), National Oceanic and Atmospheric Administration (NOAA). The federal theme in PAD-US is developed in close collaboration with the Federal Geographic Data Committee (FGDC) Federal Lands Working Group (FLWG, https://communities.geoplatform.gov/ngda-govunits/federal-lands-workgroup/); 2) Complete National Marine Protected Areas (MPA) update from the National Oceanic and Atmospheric Administration (NOAA) MPA Inventory, including conservation measure ('GAP Status Code', 'IUCN Category') review by NOAA. Other changes: 1) PAD-US field name change - The "Public Access" field name changed from 'Access' to 'Pub_Access' to avoid unintended scripting errors associated with the script command 'access'. 2) Additional field - The "Feature Class" (FeatClass) field was added to all layers within PAD-US 2.1 (only included in the "Combined" layers of PAD-US 2.0 to describe which feature class data originated from). 3) Categorical GAP Status Code default changes - National Monuments are categorically assigned GAP Status Code = 2 (previously GAP 3), in the absence of other information, to better represent biodiversity protection restrictions associated with the designation. The Bureau of Land Management Areas of Critical Environmental Concern (ACECs) are categorically assigned GAP Status Code = 3 (previously GAP 2) as the areas are administratively protected, not permanent. More information is available upon request. 4) Agency Name (FWS) geodatabase domain description changed to U.S. Fish and Wildlife Service (previously U.S. Fish & Wildlife Service). 
5) Select areas in the provisional PAD-US 2.1 Proclamation feature class were removed following a consultation with the data steward (Census Bureau). Tribal designated statistical areas are purely geographic areas for providing Census statistics, with no land base. Most affected areas are relatively small; however, 4,341,120 acres and 37 records were removed in total. Contact Mason Croft (masoncroft@boisestate) for more information about how to identify these records. For more information regarding the PAD-US dataset, please visit https://usgs.gov/gapanalysis/PAD-US/. For more information about data aggregation, please review the Online PAD-US Data Manual available at https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/pad-us-data-manual .
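Scripts written against PAD-US 2.0 attribute tables may need small adjustments for the 2.1 changes listed above. The following is a minimal sketch, assuming attribute records are held as plain Python dictionaries; only the field names 'Access', 'Pub_Access' and 'FeatClass' and the GAP defaults come from the change notes. The designation labels used as dictionary keys, the helper names and the sample values are illustrative, not official PAD-US domain codes.

```python
# Field rename introduced in PAD-US 2.1 (from the change notes above).
FIELD_RENAMES = {"Access": "Pub_Access"}

# Categorical GAP Status Code defaults, used in the absence of other information.
# The designation labels are illustrative keys, not official PAD-US domain codes.
DEFAULT_GAP = {
    "2.0": {"National Monument": 3, "Area of Critical Environmental Concern": 2},
    "2.1": {"National Monument": 2, "Area of Critical Environmental Concern": 3},
}


def upgrade_record(record, feature_class):
    """Return a copy of a 2.0-style attribute record using 2.1 conventions."""
    upgraded = {FIELD_RENAMES.get(key, key): value for key, value in record.items()}
    # FeatClass exists on every layer in 2.1; keep an existing value if present.
    upgraded.setdefault("FeatClass", feature_class)
    return upgraded


def default_gap_status(designation, version="2.1"):
    """Look up the categorical default, or None when none is listed here."""
    return DEFAULT_GAP[version].get(designation)


# Hypothetical sample record; the values are made up for illustration.
sample = upgrade_record({"Access": "OA", "Unit_Nm": "Example Park"}, "Fee")
```

Returning a new dictionary rather than modifying the input keeps the original 2.0 record available for comparison.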
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract
In recent years, neural networks have evolved from laboratory environments to the state of the art for many real-world problems. Our hypothesis is that neural network models (i.e., their weights and biases) evolve on unique, smooth trajectories in weight space during training. It follows that a population of such neural network models (referred to as a “model zoo”) would form topological structures in weight space. We think that the geometry, curvature and smoothness of these structures contain information about the state of training and can reveal latent properties of individual models. With such zoos, one could investigate novel approaches to (i) analyse models, (ii) discover unknown learning dynamics, (iii) learn rich representations of such populations, or (iv) exploit the model zoos for generative modelling of neural network weights and biases. Unfortunately, the lack of standardized model zoos and available benchmarks significantly increases the friction for further research on populations of neural networks. With this work, we publish a novel dataset of model zoos containing systematically generated and diverse populations of neural network models for further research. In total, the proposed model zoo dataset is based on six image datasets, consists of 24 model zoos generated with varying hyperparameter combinations, and includes 47’360 unique neural network models, resulting in over 2’415’360 collected model states. In addition to the model zoo data, we provide an in-depth analysis of the zoos and benchmarks for the downstream tasks mentioned above.
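As a quick sanity check on the totals quoted in the abstract, the number of collected states can be divided by the number of models. The sketch below uses only those two figures. Note that the abstract says "over" 2’415’360 states, so the ratio is indicative rather than exact.

```python
MODELS = 47_360      # unique neural network models quoted in the abstract
STATES = 2_415_360   # collected model states quoted in the abstract

# Integer division shows how many saved states each model contributes on average.
states_per_model, remainder = divmod(STATES, MODELS)
print(states_per_model, remainder)  # 51 0
```

The totals divide evenly into 51 states per model. Reading this as one initial state plus 50 training checkpoints is an assumption; the abstract does not state the checkpoint schedule.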
Dataset
This dataset is part of a larger collection of model zoos and contains the zoos trained on the labelled samples from STL10. All zoos with extensive information and code can be found at www.modelzoos.cc.
This repository contains the raw model zoos as collections of models (file names beginning with "cifar_"). Zoos are trained with small and large CNN models in three configurations: varying the seed only (seed), varying hyperparameters with fixed seeds (hyp_fix), or varying hyperparameters with random seeds (hyp_rand). Due to the large file size, the preprocessed datasets are hosted in a separate repository. The index_dict.json files contain information on how to read the vectorized models.
For more information on the zoos and code to access and use the zoos, please see www.modelzoos.cc.