38 datasets found

Median Household Income by Racial Categories in United States (2022)
neilsberg.com
csv, json
Updated Jan 3, 2024
Cite
Neilsberg Research (2024). Median Household Income by Racial Categories in United States (2022) [Dataset]. https://www.neilsberg.com/research/datasets/3693eb82-8904-11ee-9302-3860777c1fe6/
Explore at:
Available download formats: csv, json
Dataset updated
Jan 3, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Variables measured
Median Household Income for Asian Population, Median Household Income for Black Population, Median Household Income for White Population, Median Household Income for Some other race Population, Median Household Income for Two or more races Population, Median Household Income for American Indian and Alaska Native Population, Median Household Income for Native Hawaiian and Other Pacific Islander Population
Measurement technique
The data presented in this dataset are derived from the latest U.S. Census Bureau American Community Survey (ACS) 2022 1-Year Estimates. To portray the median household income within each racial category identified by the U.S. Census Bureau, we conducted an initial analysis and categorization of the data. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). It is important to note that the median household income estimates exclusively represent the identified racial categories and do not incorporate any ethnicity classifications. Households are categorized, and median incomes are reported, based on the self-identified race of the head of the household. For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the median household income across different racial categories in the United States. It portrays the median household income by race of the head of household (excluding ethnicity), using the racial categories identified by the Census Bureau. The dataset can be used to gain insights into economic disparities and trends and to explore the variations in median household income for diverse racial categories.

Key observations

Based on our analysis of the distribution of the United States population by race and ethnicity, the population is predominantly White. This racial category constitutes the majority, accounting for 68.17% of the total residents in the United States. The median household income for White households is $79,933. Although the White population is the most populous, Asian households report the highest median household income, at $106,954. Thus, while White residents are the most numerous in the United States, Asian households have the highest median household income.

[Figure: United States median household income diversity across racial categories. https://i.neilsberg.com/ch/united-states-median-household-income-by-race.jpeg]

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2022 1-Year Estimates.

Racial categories include:

White

Black or African American

American Indian and Alaska Native

Asian

Native Hawaiian and Other Pacific Islander

Some other race

Two or more races (multiracial)

Variables / Data Columns

Race of the head of household: This column presents the self-identified race of the household head, encompassing all relevant racial categories (excluding ethnicity) applicable in the United States.

Median household income: Median household income, adjusted for inflation and presented in 2022 inflation-adjusted dollars.
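The inflation adjustment mentioned above can be illustrated with a minimal sketch. It assumes the usual ratio-of-indices approach (amount multiplied by target-year index over source-year index); the function name and the index values are hypothetical placeholders, not actual R-CPI-U-RS figures, and this is not necessarily the exact procedure Neilsberg Research uses.

```python
def adjust_for_inflation(nominal: float, index_from: float, index_to: float) -> float:
    """Express a dollar amount from one year in another year's dollars.

    index_from: price index for the year the amount was measured in.
    index_to:   price index for the year the amount should be expressed in.
    """
    if index_from <= 0 or index_to <= 0:
        raise ValueError("price index values must be positive")
    return nominal * (index_to / index_from)


# Hypothetical index values chosen for easy arithmetic (NOT real R-CPI-U-RS data).
adjusted = adjust_for_inflation(50_000.0, index_from=100.0, index_to=125.0)
print(f"${adjusted:,.2f}")
```

With real data, the two index values would come from the published R-CPI-U-RS series for the source year and the target year.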

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for United States median household income by race.
US Cities: Demographics
public.opendatasoft.com
data.smartidf.services
csv, excel, json
Updated Jul 27, 2017
Cite
(2017). US Cities: Demographics [Dataset]. https://public.opendatasoft.com/explore/dataset/us-cities-demographics/
Explore at:
Available download formats: excel, csv, json
Dataset updated
Jul 27, 2017
License
Public domain: https://en.wikipedia.org/wiki/Public_domain
Area covered
United States
Description
This dataset contains information about the demographics of all US cities and census-designated places with a population greater than or equal to 65,000. The data come from the US Census Bureau's 2015 American Community Survey. This product uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau.
2020 - 2021 Diversity Report
catalog.data.gov
data.cityofnewyork.us
Updated Nov 29, 2024
Cite
data.cityofnewyork.us (2024). 2020 - 2021 Diversity Report [Dataset]. https://catalog.data.gov/dataset/2020-2021-diversity-report
Dataset updated
Nov 29, 2024
Dataset provided by
data.cityofnewyork.us
Description
Report on Demographic Data in New York City Public Schools, 2020-21

Enrollment counts are based on the November 13 Audited Register for 2020. Categories with total enrollment values of zero were omitted. Pre-K data includes students in 3-K. Data on students with disabilities, English language learners, and student poverty status are as of March 19, 2021. Due to missing demographic information in rare cases and suppression rules, demographic categories do not always add up to total enrollment and/or citywide totals. NYC DOE “Eligible for free or reduced-price lunch” counts are based on the number of students with families who have qualified for free or reduced-price lunch or are eligible for Human Resources Administration (HRA) benefits.

English Language Arts and Math state assessment results for students in grade 9 are not available for inclusion in this report, as the spring 2020 exams did not take place. Spring 2021 ELA and Math test results are not included in this report for K-8 students in 2020-21. Due to the COVID-19 pandemic’s complete transformation of New York City’s school system during the 2020-21 school year, and in accordance with New York State guidance, the 2021 ELA and Math assessments were optional for students to take. As a result, 21.6% of students in grades 3-8 took the English assessment in 2021 and 20.5% of students in grades 3-8 took the Math assessment. These participation rates are not representative of New York City students and schools and are not comparable to prior years, so results are not included in this report.

Dual Language enrollment includes English Language Learners and non-English Language Learners. Dual Language data are based on data from STARS; as a result, school participation and student enrollment in Dual Language programs may differ from the data in this report. STARS course scheduling and grade management software applications provide a dynamic internal data system for school use; while standard course codes exist, data are not always consistent from school to school. This report does not include enrollment at District 75 & 79 programs. Students enrolled at Young Adult Borough Centers are represented in the 9-12 District data but not the 9-12 School data.

“Prior Year” data included in Comparison tabs refers to data from 2019-20. “Year-to-Year Change” data included in Comparison tabs indicates whether the demographics of a school or special program have grown more or less similar to its district or attendance zone (or school, for special programs) since 2019-20. Year-to-year changes must have been at least 1 percentage point to qualify as “More Similar” or “Less Similar”; changes less than 1 percentage point are categorized as “No Change”.

The admissions method tab contains information on the admissions methods used for elementary, middle, and high school programs during the Fall 2020 admissions process. Fall 2020 selection criteria are included for all programs with academic screens, including middle and high school programs. Selection criteria data is based on school-reported information. Fall 2020 Diversity in Admissions priorities is included for applicable middle and high school programs. Note that the data on each school’s demographics and performance includes all students of the given subgroup who were enrolled in the school on November 13, 2020. Some of these students may not have been admitted under the admissions method(s) shown, as some students may have enrolled in the school outside the centralized admissions process (via waitlist, over-the-counter, or transfer), and schools may have changed admissions methods over the past few years. Admissions methods are only reported for grades K-12.

3K and Pre-Kindergarten data are reported at the site level. See below for definitions of site types included in this report. Additionally, please note that this report excludes all students at District 75 sites, reflecting slightly lower enrollment than our total of 60,265 students
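The “Year-to-Year Change” rule in the description above (a 1 percentage point threshold) can be sketched as follows. The description does not say how similarity itself is measured, so this sketch assumes a hypothetical "gap" value: the absolute difference, in percentage points, between a school's share of some subgroup and its district's share. The function name and inputs are illustrative only.

```python
def classify_change(prior_gap: float, current_gap: float, threshold: float = 1.0) -> str:
    """Label a year-to-year change in demographic similarity.

    prior_gap / current_gap: hypothetical school-vs-district differences,
    in percentage points, for the prior year and the current year.
    A shrinking gap means the school has become more similar to its district.
    """
    shift = prior_gap - current_gap
    if shift >= threshold:
        return "More Similar"
    if shift <= -threshold:
        return "Less Similar"
    return "No Change"


print(classify_change(6.0, 4.5))  # gap shrank by 1.5 points
print(classify_change(4.0, 4.5))  # gap grew by only 0.5 points
print(classify_change(3.0, 5.0))  # gap grew by 2.0 points
```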
State Center, IA median household income breakdown by race betwen 2013 and...
neilsberg.com
csv, json
Updated Mar 1, 2025
Cite
Neilsberg Research (2025). State Center, IA median household income breakdown by race betwen 2013 and 2023 [Dataset]. https://www.neilsberg.com/insights/state-center-ia-median-household-income-by-race/
Explore at:
Available download formats: csv, json
Dataset updated
Mar 1, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
State Center, Iowa
Variables measured
Median Household Income Trends for Asian Population, Median Household Income Trends for Black Population, Median Household Income Trends for White Population, Median Household Income Trends for Some other race Population, Median Household Income Trends for Two or more races Population, Median Household Income Trends for American Indian and Alaska Native Population, Median Household Income Trends for Native Hawaiian and Other Pacific Islander Population
Measurement technique
The data presented in this dataset are derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To portray the median household income within each racial category identified by the U.S. Census Bureau, we conducted an initial analysis and categorization of the data from 2013 to 2023. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). It is important to note that the median household income estimates exclusively represent the identified racial categories and do not incorporate any ethnicity classifications. Households are categorized, and median incomes are reported, based on the self-identified race of the head of the household. For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the median household incomes over the past decade across various racial categories identified by the U.S. Census Bureau in State Center. It portrays the median household income by race of the head of household (excluding ethnicity), using the racial categories identified by the Census Bureau. It also shows the annual income trends between 2013 and 2023, providing insights into the economic shifts within diverse racial communities. The dataset can be used to gain insights into income disparities and variations across racial categories, aiding in data analysis and decision-making.

Key observations

White: In State Center, the median household income for households where the householder is White decreased by $4,103 (5.36%) between 2013 and 2023. The median household income, in 2023 inflation-adjusted dollars, was $76,603 in 2013 and $72,500 in 2023.

Black or African American: As per the U.S. Census Bureau population data, in State Center, there are no households where the householder is Black or African American; hence, the median household income for the Black or African American population is not applicable.

Refer to the research insights for more key observations on American Indian and Alaska Native, Asian, Native Hawaiian and Other Pacific Islander, Some other race and Two or more races (multiracial) households
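The change figures quoted in the key observations can be reproduced with a short calculation. The sketch below uses the two White-householder medians reported above; the function name is an illustrative choice.

```python
def income_change(start: float, end: float) -> tuple[float, float]:
    """Return (absolute change, percent change relative to the start value)."""
    change = end - start
    percent = change / start * 100
    return change, percent


# Medians reported for White householders in State Center, in 2023 dollars.
change, percent = income_change(76_603, 72_500)
print(f"{change:+,.0f} ({percent:+.2f}%)")  # -4,103 (-5.36%)
```

The percentage is taken relative to the 2013 value, which matches the 5.36% decrease stated in the observation.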

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Racial categories include:

White

Black or African American

American Indian and Alaska Native

Asian

Native Hawaiian and Other Pacific Islander

Some other race

Two or more races (multiracial)

Variables / Data Columns

Race of the head of household: This column presents the self-identified race of the household head, encompassing all relevant racial categories (excluding ethnicity) applicable in State Center.

2010: 2010 median household income

2011: 2011 median household income

2012: 2012 median household income

2013: 2013 median household income

2014: 2014 median household income

2015: 2015 median household income

2016: 2016 median household income

2017: 2017 median household income

2018: 2018 median household income

2019: 2019 median household income

2020: 2020 median household income

2021: 2021 median household income

2022: 2022 median household income

2023: 2023 median household income

Please note: All incomes have been adjusted for inflation and are presented in 2023-inflation-adjusted dollars.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for State Center median household income by race.
Protected Areas Database of the United States (PAD-US) 2.0
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). Protected Areas Database of the United States (PAD-US) 2.0 [Dataset]. https://catalog.data.gov/dataset/protected-areas-database-of-the-united-states-pad-us-2-0
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey: http://www.usgs.gov/
Area covered
United States
Description
NOTE: A more current version of the Protected Areas Database of the United States (PAD-US) is available: PAD-US 2.1 https://doi.org/10.5066/P92QM3NT.

The USGS Protected Areas Database of the United States (PAD-US) is the nation's inventory of protected areas, including public land and voluntarily provided private protected areas, identified as an A-16 National Geospatial Data Asset in the Cadastre Theme (https://communities.geoplatform.gov/ngda-cadastre/). The PAD-US is an ongoing project with several published versions of a spatial database including areas dedicated to the preservation of biological diversity, and other natural (including extraction), recreational, or cultural uses, managed for these purposes through legal or other effective means. The database was originally designed to support biodiversity assessments; however, its scope expanded in recent years to include all public and nonprofit lands and waters. Most are public lands owned in fee; however, long-term easements, leases, agreements, Congressional (e.g. 'Wilderness Area'), Executive (e.g. 'National Monument'), and administrative designations (e.g. 'Area of Critical Environmental Concern') documented in agency management plans are also included. The PAD-US strives to be a complete inventory of public land and other protected areas, compiling “best available” data provided by managing agencies and organizations. The PAD-US geodatabase maps and describes areas with over twenty-five attributes in nine feature classes to support data management, queries, web mapping services, and analyses.

This PAD-US Version 2.0 dataset includes a variety of updates and changes from the previous Version 1.4 dataset. The following list summarizes major updates and changes:

1) Expanded database structure with new layers: the geodatabase feature class structure now includes nine feature classes separating fee owned lands, conservation (and other) easements, management designations overlapping fee lands, marine areas, proclamation boundaries and various 'Combined' feature classes (e.g. 'Fee' + 'Easement' + 'Designation' feature classes);
2) Major update of the Federal estate including data from 8 agencies, developed in collaboration with the Federal Geographic Data Committee (FGDC) Federal Lands Working Group (FLWG, https://communities.geoplatform.gov/ngda-govunits/federal-lands-workgroup/);
3) Major updates to 30 States and limited additions to 16 other States;
4) Integration of The Nature Conservancy's (TNC) Secured Lands geodatabase;
5) Integration of Ducks Unlimited's (DU) Conservation and Recreation Lands (CARL) database;
6) Integration of The Trust for Public Land's (TPL) Conservation Almanac database;
7) The Nature Conservancy (TNC) Lands database update: the national source of lands owned in fee or managed by TNC;
8) National Conservation Easement Database (NCED) update: complete update of non-sensitive (suitable for publication in the public domain) easements;
9) Complete National Marine Protected Areas (MPA) update: from the NOAA MPA Inventory, including conservation measure ('GAP Status Code', 'IUCN Category') review by NOAA;
10) First integration of Bureau of Ocean Energy Management (BOEM) managed marine lands: BOEM submitted Outer Continental Shelf Area lands managed for natural resources (minerals, oil and gas), a significant and new addition to PAD-US;
11) Fee boundary overlap assessment: topology overlaps in the PAD-US 2.0 'Fee' feature class have been identified and are available for user and data-steward reference (See Logical_Consistency_Report Section).

For more information regarding the PAD-US dataset, please visit https://usgs.gov/gapanalysis/PAD-US/. For more information about data aggregation, please review the “Data Manual for PAD-US” available at https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/pad-us-data-manual.
RRING Global Survey Research Dataset (WP3)
zenodo.org
explore.openaire.eu
Updated Jun 25, 2021
Cite
Lars Lorenz; Eric Jensen (2021). RRING Global Survey Research Dataset (WP3) [Dataset]. http://doi.org/10.5281/zenodo.4719938
Unique identifier
https://doi.org/10.5281/zenodo.4719938
Dataset updated
Jun 25, 2021
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Lars Lorenz; Eric Jensen
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The RRING Work Package 3 (WP3) objective was to clarify how Research Funding Organisations (RFOs) and Research Performing Organisations (RPOs) operated within region-specific research and innovation environments. It explored how they navigated the governance and regulatory frameworks for Responsible Research and Innovation (RRI), as well as offering their perspectives on the entities responsible for RRI-related policy and action in their locales.

This data set covers the global survey research part, which was designed to contextualise how RPOs and RFOs interacted within the research environment and with non-academic stakeholders. Countries were grouped according to the UNESCO regions of the world and key results per region are listed below. For a detailed analysis and further findings of the work completed under WP3 of the RRING project, please refer to the full deliverable document "State of the Art of RRI in the Five UNESCO World Regions" [link to be inserted].

European and North American States

‘Diverse and inclusive’: Respondents were most attitudinally supportive of the importance of ensuring ethical principles were applied in R&I (92%), followed by diverse perspectives (88%), and gender equality (79%). Including ethnic minorities was the area which garnered the least attitudinal support (71%). Respondents took the most practical steps towards engaging with diverse perspectives (63%), and the least towards inclusion of ethnic minorities (24%).

‘Anticipative and reflective’: Respondents widely agreed (82%) with the importance of ensuring R&I work does not cause concerns for society, but only 37% confirmed they had taken practical steps to ensure this.

‘Open and transparent’: Vast majorities of respondents agreed on the importance of keeping R&I methods open and transparent (94%), with 65% also confirming they take practical steps to do this. An equally high number agreed on the importance of making the results of R&I work accessible to as wide a public as possible (94%), and 68% confirmed this through their reported actions. This indicated the smallest value-action gap of all RRI measures for respondents from European and North American countries. Attitudinal agreement on the importance of making data freely available to the public was lower (83%), as was the practical action aspect for this measure (45%).

‘Responsive and adaptive to change’: Most respondents agreed (89%) that it was important to ensure their work addresses societal needs, and 62% confirmed that they take practical steps towards this aim.

Latin American and Caribbean States

‘Diverse and inclusive’: Respondents were most attitudinally supportive of the importance of gender equality in R&I (86%), followed by ensuring ethical principles are applied (85%), and diverse perspectives incorporated (83%). Including ethnic minorities was the area which garnered the least attitudinal support (77%). Respondents took the most practical steps towards ensuring ethical principles guide their work (50%), and the least towards including ethnic minorities (25%), but the smallest value-action gap was found for gender equality.

‘Anticipative and reflective’: Respondents agreed (79%) that it is important to ensure R&I work does not cause concerns for society, but only 29% confirmed they had taken practical steps to ensure this.

‘Open and transparent’: The majority of respondents agreed on the importance of keeping R&I methods open and transparent (89%), with 45% indicating they had taken practical action. A majority also agreed on the importance of making the results of R&I work accessible to as wide a public as possible (88%), and 44% backed this up with practical action. Attitudinal agreement on the importance of making data freely available to the public was slightly lower (81%), as was the practical action aspect for this measure (35%).

‘Responsive and adaptive to change’: Most respondents agreed (84%) that it was important to ensure their work addresses societal needs, and 49% confirmed that they take practical steps towards this aim.

Asian and Pacific States

‘Diverse and inclusive’: Respondents were most attitudinally supportive of the importance of ensuring ethical principles were applied in R&I (90%), followed by diverse perspectives (89%), and gender equality (86%). Including ethnic minorities was the area which garnered the least attitudinal support (76%). Respondents took the most practical steps towards engaging with diverse perspectives (65%), and the least towards including ethnic minorities (30%).

‘Anticipative and reflective’: Respondents widely agreed (78%) with the importance of ensuring R&I work does not cause concerns for society, and 42% confirmed they had taken practical steps to ensure this.

‘Open and transparent’: The majority of respondents agreed on the importance of keeping R&I methods open and transparent (91%), with 58% indicating they take practical steps to do this. A majority also agreed on the importance of making the results of R&I work accessible to as wide a public as possible (89%), and 64% backed this up with practical action. Attitudinal agreement on the importance of making data freely available to the public was lower (79%), as was the practical action aspect for this measure (40%).

‘Responsive and adaptive to change’: Most respondents agreed (92%) that it was important to ensure their work addresses societal needs, and 69% confirmed that they take practical steps towards this aim. This was the RRI measure with the smallest value-action gap for respondents from the Asian and Pacific region.
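The value-action gap comparison for the Asian and Pacific region can be sketched as follows. The dataset description does not define the gap formally, so this sketch assumes it is the attitudinal agreement percentage minus the practical action percentage, in percentage points. The percentages are the rounded figures quoted above, limited to measures for which both values are given; the dictionary keys are shorthand labels, not names from the dataset.

```python
# (attitudinal agreement %, practical action %) as quoted for Asian and Pacific States.
ASIA_PACIFIC = {
    "diverse perspectives": (89, 65),
    "ethnic minorities": (76, 30),
    "no societal concerns": (78, 42),
    "open methods": (91, 58),
    "accessible results": (89, 64),
    "freely available data": (79, 40),
    "addresses societal needs": (92, 69),
}


def value_action_gaps(measures: dict[str, tuple[int, int]]) -> dict[str, int]:
    """Assumed definition: gap = attitude % minus action %, in percentage points."""
    return {name: attitude - action for name, (attitude, action) in measures.items()}


gaps = value_action_gaps(ASIA_PACIFIC)
smallest = min(gaps, key=gaps.get)
print(smallest, gaps[smallest])  # addresses societal needs 23
```

Because the quoted percentages are rounded, rankings recomputed this way may not always match the source's statements for other regions; the underlying survey data should be used for precise comparisons.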

Arab States

‘Diverse and inclusive’: Respondents were most attitudinally supportive of the importance of ensuring ethical principles were applied in R&I (93%), followed by diverse perspectives (81%), and gender equality (85%). Including ethnic minorities was the area which garnered the least attitudinal support (74%). Respondents took the most practical steps towards engaging with diverse perspectives (66%), which equated to one of two equally small value-action gaps for respondents from Arab states, and the least practical steps towards inclusion of ethnic minorities (22%).

‘Anticipative and reflective’: A high proportion of respondents (85%) agreed that it is important to ensure R&I work does not cause concerns for society. However, only 38% confirmed they had taken practical steps to ensure this.

‘Open and transparent’: The majority of respondents agreed on the importance of keeping R&I methods open and transparent (89%), with 59% also confirming they take practical steps to do this. A majority also agreed on the importance of making the results of R&I work accessible to as wide a public as possible (90%), and 66% backed this up with practical action. Ensuring public accessibility of research results was the second of two measures with equally small value-action gaps. Attitudinal agreement on the importance of making data freely available to the public was much lower (78%), which also reflected the practical action aspect for this measure (49%).

‘Responsive and adaptive to change’: Most respondents agreed (96%) that it was important to ensure their work addresses societal needs, and 68% confirmed that they take practical steps to achieve this.

African States

‘Diverse and inclusive’: Respondents were most attitudinally supportive of the importance of ensuring engagement with diverse perspectives and expertise in R&I (91%), followed by ensuring ethical principles are applied (90%), and gender equality (89%). Including ethnic minorities was the area which garnered the least attitudinal support (74%). Respondents took the most practical steps towards ensuring ethical principles guide their work (57%), and the least towards including ethnic minorities (32%).

‘Anticipative and reflective’: The majority of respondents (85%) agreed that it is important to ensure R&I work does not cause concerns for society, with 59% confirming that they take practical steps to ensure this.

‘Open and transparent’: A high proportion of respondents agreed on the importance of keeping R&I methods open and transparent (90%), with 54% also confirming they take practical steps to do this. A majority also agreed on the importance of making the results of R&I work accessible to as wide a public as possible (86%), and 56% backed this up with practical action. Attitudinal agreement on the importance of making data freely available to the public was significantly lower (73%), as was the practical action aspect for this measure (38%).

‘Responsive and adaptive to change’: Respondents mostly agreed (92%) that it was important to ensure their work addresses societal needs, and 64% confirmed that they take practical steps towards this aim. This was the RRI measure with the smallest value-action gap for respondents from African states.

Note: Please refer to the "RRING WP3 - Survey Data Documentation" document for detailed instructions on how to use this dataset.
State College, PA median household income breakdown by race betwen 2013 and...
neilsberg.com
csv, json
Updated Mar 1, 2025
Cite
Neilsberg Research (2025). State College, PA median household income breakdown by race betwen 2013 and 2023 [Dataset]. https://www.neilsberg.com/research/datasets/ed388060-f665-11ef-a994-3860777c1fe6/
Explore at:
Available download formats: json, csv
Dataset updated
Mar 1, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
State College, Pennsylvania
Variables measured
Median Household Income Trends for Asian Population, Median Household Income Trends for Black Population, Median Household Income Trends for White Population, Median Household Income Trends for Some other race Population, Median Household Income Trends for Two or more races Population, Median Household Income Trends for American Indian and Alaska Native Population, Median Household Income Trends for Native Hawaiian and Other Pacific Islander Population
Measurement technique
The data presented in this dataset are derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To portray the median household income within each racial category identified by the U.S. Census Bureau, we conducted an initial analysis and categorization of the data from 2013 to 2023. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). It is important to note that the median household income estimates exclusively represent the identified racial categories and do not incorporate any ethnicity classifications. Households are categorized, and median incomes are reported, based on the self-identified race of the head of the household. For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the median household incomes over the past decade across various racial categories identified by the U.S. Census Bureau in State College. It portrays the median household income of the head of household across racial categories (excluding ethnicity) as identified by the Census Bureau. It also showcases the annual income trends between 2013 and 2023, providing insights into the economic shifts within diverse racial communities. The dataset can be utilized to gain insights into income disparities and variations across racial categories, aiding in data analysis and decision-making.

Key observations

White: In State College, the median household income for households where the householder is White increased by $13,298 (35.94%) between 2013 and 2023. The median household income, in 2023 inflation-adjusted dollars, was $36,998 in 2013 and $50,296 in 2023.

Black or African American: In State College, the median household income for households where the householder is Black or African American increased by $18,638 (124.99%) between 2013 and 2023. The median household income, in 2023 inflation-adjusted dollars, was $14,912 in 2013 and $33,550 in 2023.

Refer to the research insights for more key observations on American Indian and Alaska Native, Asian, Native Hawaiian and Other Pacific Islander, Some other race, and Two or more races (multiracial) households.
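
The dollar and percentage changes quoted in the key observations can be reproduced from the 2013 and 2023 medians. The following is a minimal sketch; the function name `income_change` is illustrative and not part of the dataset.

```python
def income_change(start, end):
    """Return (absolute change, percent change rounded to 2 places)."""
    change = end - start
    return change, round(100 * change / start, 2)

# Medians quoted above, in 2023 inflation-adjusted dollars.
white = income_change(36_998, 50_296)
black = income_change(14_912, 33_550)

print(white)  # (13298, 35.94)
print(black)  # (18638, 124.99)
```

Because the percentage is taken relative to the 2013 value, a low starting median (as for Black or African American households) yields a large percentage change even when the dollar change is comparable.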

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Racial categories include:

White

Black or African American

American Indian and Alaska Native

Asian

Native Hawaiian and Other Pacific Islander

Some other race

Two or more races (multiracial)

Variables / Data Columns

Race of the head of household: This column presents the self-identified race of the household head, encompassing all relevant racial categories (excluding ethnicity) applicable in State College.

2010: 2010 median household income

2011: 2011 median household income

2012: 2012 median household income

2013: 2013 median household income

2014: 2014 median household income

2015: 2015 median household income

2016: 2016 median household income

2017: 2017 median household income

2018: 2018 median household income

2019: 2019 median household income

2020: 2020 median household income

2021: 2021 median household income

2022: 2022 median household income

2023: 2023 median household income

Please note: All incomes have been adjusted for inflation and are presented in 2023-inflation-adjusted dollars.
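
The column layout described above is "wide": one row per race and one column per year. For trend analysis it is often easier to reshape it into one row per (race, year) pair. The sketch below assumes the CSV header matches the column names listed here, which may not hold for the actual download; the sample rows contain only the 2013 and 2023 values quoted in the key observations.

```python
import csv
import io

# Assumed excerpt of the CSV; the real file's header names may differ.
SAMPLE = """Race of the head of household,2013,2023
White,36998,50296
Black or African American,14912,33550
"""

def to_long(text, race_col="Race of the head of household"):
    """Reshape wide year columns into (race, year, income) tuples."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        race = rec[race_col]
        for col, val in rec.items():
            # Skip the race column and any empty cells (missing estimates).
            if col != race_col and val != "":
                rows.append((race, int(col), float(val)))
    return rows

long_rows = to_long(SAMPLE)
for row in long_rows:
    print(row)
```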

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for State College median household income by race. You can refer to the same here
Protected Areas Database of the United States (PAD-US) 1.4
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). Protected Areas Database of the United States (PAD-US) 1.4 [Dataset]. https://catalog.data.gov/dataset/protected-areas-database-of-the-united-states-pad-us-1-4
Explore at:
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey http://www.usgs.gov/
Area covered
United States
Description
NOTE: A more current version of the Protected Areas Database of the United States (PAD-US) is available: PAD-US 2.0 https://doi.org/10.5066/P955KPLE. The USGS Protected Areas Database of the United States (PAD-US) is the nation's inventory of protected areas, including public open space and voluntarily provided, private protected areas, identified as an A-16 National Geospatial Data Asset in the Cadastral Theme (http://www.fgdc.gov/ngda-reports/NGDA_Datasets.html). PAD-US is an ongoing project with several published versions of a spatial database of areas dedicated to the preservation of biological diversity, and other natural, recreational or cultural uses, managed for these purposes through legal or other effective means. The geodatabase maps and describes public open space and other protected areas. Most areas are public lands owned in fee; however, long-term easements, leases, and agreements or administrative designations documented in agency management plans may be included. The PAD-US database strives to be a complete “best available” inventory of protected areas (lands and waters) including data provided by managing agencies and organizations. The dataset is built in collaboration with several partners and data providers (http://gapanalysis.usgs.gov/padus/stewards/). See Supplemental Information Section of this metadata record for more information on partnerships and links to major partner organizations. As this dataset is a compilation of many data sets, data completeness, accuracy, and scale may vary. Federal and state data are generally complete, while local government and private protected area coverage is about 50% complete, and depends on data management capacity in the state. For completeness estimates by state: http://www.protectedlands.net/partners. As the federal and state data are reasonably complete, focus is shifting to completing the inventory of local government and voluntarily provided, private protected areas.
The PAD-US geodatabase contains over twenty-five attributes and four feature classes to support data management, queries, web mapping services and analyses: Marine Protected Areas (MPA), Fee, Easements and Combined. The data contained in the MPA Feature class are provided directly by the National Oceanic and Atmospheric Administration (NOAA) Marine Protected Areas Center (MPA, http://marineprotectedareas.noaa.gov ) tracking the National Marine Protected Areas System. The Easements feature class contains data provided directly from the National Conservation Easement Database (NCED, http://conservationeasement.us ) The MPA and Easement feature classes contain some attributes unique to the sole source databases tracking them (e.g. Easement Holder Name from NCED, Protection Level from NOAA MPA Inventory). The "Combined" feature class integrates all fee, easement and MPA features as the best available national inventory of protected areas in the standard PAD-US framework. In addition to geographic boundaries, PAD-US describes the protection mechanism category (e.g. fee, easement, designation, other), owner and managing agency, designation type, unit name, area, public access and state name in a suite of standardized fields. An informative set of references (i.e. Aggregator Source, GIS Source, GIS Source Date) and "local" or source data fields provide a transparent link between standardized PAD-US fields and information from authoritative data sources. The areas in PAD-US are also assigned conservation measures that assess management intent to permanently protect biological diversity: the nationally relevant "GAP Status Code" and global "IUCN Category" standard. A wealth of attributes facilitates a wide variety of data analyses and creates a context for data to be used at local, regional, state, national and international scales. 
More information about specific updates and changes to this PAD-US version can be found in the Data Quality Information section of this metadata record as well as on the PAD-US website, http://gapanalysis.usgs.gov/padus/data/history/. Due to the completeness and complexity of these data, it is highly recommended to review the Supplemental Information Section of the metadata record as well as the Data Use Constraints, to better understand data partnerships as well as see tips and ideas of appropriate uses of the data and how to parse out the data that you are looking for. For more information regarding the PAD-US dataset please visit, http://gapanalysis.usgs.gov/padus/. To find more data resources as well as view example analyses performed using PAD-US data visit, http://gapanalysis.usgs.gov/padus/resources/. The PAD-US dataset and data standard are compiled and maintained by the USGS Gap Analysis Program, http://gapanalysis.usgs.gov/ . For more information about data standards and how the data are aggregated please review the “Standards and Methods Manual for PAD-US,” http://gapanalysis.usgs.gov/padus/data/standards/ .
Data from: UAIC Ichthyological Collection
gbif.org
Updated Oct 25, 2021
Cite
Worth Pugh (2021). UAIC Ichthyological Collection [Dataset]. http://doi.org/10.15468/a2laag
Explore at:
Unique identifier
https://doi.org/10.15468/a2laag
Dataset updated
Oct 25, 2021
Dataset provided by
Global Biodiversity Information Facility https://www.gbif.org/
University of Alabama Biodiversity and Systematics
Authors
Worth Pugh
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered

Description
The State of Alabama contains the most diverse fish fauna in North America. The University of Alabama Ichthyological Collection (UAIC) documents this diversity and is one of the largest educational and research collections of fishes in the southeastern United States. This nationally and internationally recognized biological resource includes over one million preserved, skeletal, and frozen specimens, some dating back to the mid-1900s, and is the best single resource documenting past and present distributions and abundances of fishes in the State.
Audio Visual Speech Dataset: American English
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Audio Visual Speech Dataset: American English [Dataset]. https://www.futurebeeai.com/dataset/multi-modal-dataset/american-english-visual-speech-dataset
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
United States
Dataset funded by
FutureBeeAI
Description
Introduction
Welcome to the US English Language Visual Speech Dataset! This dataset is a collection of diverse, single-person unscripted spoken videos supporting research in visual speech recognition, emotion detection, and multimodal communication.
Dataset Content
This visual speech dataset contains 1000 videos in US English, each paired with a corresponding high-fidelity audio track. In each video, a participant answers a specific question in an unscripted, spontaneous manner.
•Participant Diversity:
  •Speakers: The dataset includes visual speech data from more than 200 participants from different states of the United States of America.
  •Regions: Ensures a balanced representation of accents, dialects, and demographics.
  •Participant Profile: Participants range from 18 to 70 years old, representing both males and females in a 60:40 ratio, respectively.

Video Data
While recording each video, extensive guidelines are followed to maintain quality and diversity.
•Recording Details:
  •File Duration: Average duration of 30 seconds to 3 minutes per video.
  •Formats: Videos are available in MP4 or MOV format.
  •Resolution: Videos are recorded in ultra-high-definition resolution at 30 fps or above.
  •Device: Both the latest Android and iOS devices are used in this collection.
  •Recording Conditions: Videos were recorded under various conditions to ensure diversity and reduce bias:
  •Indoor and Outdoor Settings: Includes both indoor and outdoor recordings.
  •Lighting Variations: Captures videos in daytime, nighttime, and varying lighting conditions.
  •Camera Positions: Includes handheld and fixed camera positions, as well as portrait and landscape orientations.
  •Face Orientation: Contains straight face and tilted face angles.
  •Participant Positions: Records participants in both standing and seated positions.
  •Motion Variations: Features both stationary and moving videos, where participants pass through different lighting conditions.
  •Occlusions: Includes videos where the participant's face is partially occluded by hand movements, microphones, hair, glasses, and facial hair.
  •Focus: In each video, the participant's face remains in focus throughout the video duration, ensuring the face stays within the video frame.
  •Video Content: In each video, the participant answers a specific question in an unscripted manner. These questions are designed to capture various emotions of participants. The dataset contains videos expressing the following human emotions:
    •Happy
    •Sad
    •Excited
    •Angry
    •Annoyed
    •Normal
  •Question Diversity: For each emotion, the participant answered a specific question expressing that particular emotion.

Metadata
The dataset provides comprehensive metadata for each video recording and participant:
•
News Events Data in North America ( Techsalerator)
datarade.ai
Updated Jun 25, 2024
Cite
Techsalerator (2024). News Events Data in North America ( Techsalerator) [Dataset]. https://datarade.ai/data-products/news-events-data-in-north-america-techsalerator-techsalerator
Explore at:
Available download formats: .json, .csv, .xls, .txt
Dataset updated
Jun 25, 2024
Dataset provided by
Techsalerator LLC
Authors
Techsalerator
Area covered
United States
Description
Techsalerator’s News Event Data in North America offers a comprehensive and detailed dataset designed to provide businesses, analysts, journalists, and researchers with a thorough view of significant news events across North America. This dataset captures and categorizes major events reported from a diverse range of news sources, including press releases, industry news sites, blogs, and PR platforms, providing valuable insights into regional developments, economic shifts, political changes, and cultural events.

Key Features of the Dataset:

Extensive Coverage: The dataset aggregates news events from a wide array of sources, including company press releases, industry-specific news outlets, blogs, PR sites, and traditional media. This broad coverage ensures a diverse range of information from multiple reporting channels.

Categorization of Events: News events are categorized into various types such as business and economic updates, political developments, technological advancements, legal and regulatory changes, and cultural events. This categorization helps users quickly find and analyze information relevant to their interests or sectors.

Real-Time Updates: The dataset is updated regularly to include the most current events, ensuring that users have access to up-to-date news and can stay informed about recent developments as they happen.

Geographic Segmentation: Events are tagged with their respective countries and territories within North America. This geographic segmentation allows users to filter and analyze news events based on specific locations, facilitating targeted research and analysis.

Event Details: Each event entry includes comprehensive details such as the date of occurrence, source of the news, a description of the event, and relevant keywords. This thorough detailing helps users understand the context and significance of each event.

Historical Data: The dataset includes historical news event data, enabling users to track trends and conduct comparative analysis over time. This feature supports longitudinal studies and provides insights into how news events evolve.

Advanced Search and Filter Options: Users can search and filter news events based on criteria such as date range, event type, location, and keywords. This functionality allows for precise and efficient retrieval of relevant information.

North American Countries and Territories Covered:

Countries:

Canada

Mexico

United States

Territories:

American Samoa (U.S. territory)

French Polynesia (French overseas collectivity; included for regional relevance)

Guam (U.S. territory)

New Caledonia (French special collectivity; included for regional relevance)

Northern Mariana Islands (U.S. territory)

Puerto Rico (U.S. territory)

Saint Pierre and Miquelon (French overseas territory; geographically close to North America and included for regional comprehensiveness)

Wallis and Futuna (French overseas collectivity; included for regional relevance)

Benefits of the Dataset:

Strategic Insights: Businesses and analysts can use the dataset to gain insights into significant regional developments, economic conditions, and political changes, aiding in strategic decision-making and market analysis.

Market and Industry Trends: The dataset provides valuable information on industry-specific trends and events, helping users understand market dynamics and identify emerging opportunities.

Media and PR Monitoring: Journalists and PR professionals can track relevant news across North America, enabling them to monitor media coverage, identify emerging stories, and manage public relations efforts effectively.

Academic and Research Use: Researchers can utilize the dataset for longitudinal studies, trend analysis, and academic research on various topics related to North American news and events.

Techsalerator’s News Event Data in North America is a crucial resource for accessing and analyzing significant news events across the continent. By providing detailed, categorized, and up-to-date information, it supports effective decision-making, research, and media monitoring across diverse sectors.
Protected Areas Database of the United States (PAD-US) 4.0
catalog.data.gov
data.usgs.gov
Updated Jul 20, 2024
Cite
U.S. Geological Survey (2024). Protected Areas Database of the United States (PAD-US) 4.0 [Dataset]. https://catalog.data.gov/dataset/protected-areas-database-of-the-united-states-pad-us-4-0
Explore at:
Dataset updated
Jul 20, 2024
Dataset provided by
United States Geological Survey http://www.usgs.gov/
Area covered
United States
Description
The USGS Protected Areas Database of the United States (PAD-US) is the nation's inventory of protected areas, including public land and voluntarily provided private protected areas, identified as an A-16 National Geospatial Data Asset in the Cadastre Theme ( https://ngda-cadastre-geoplatform.hub.arcgis.com/ ). The PAD-US is an ongoing project with several published versions of a spatial database including areas dedicated to the preservation of biological diversity, and other natural (including extraction), recreational, or cultural uses, managed for these purposes through legal or other effective means. The database was originally designed to support biodiversity assessments; however, its scope expanded in recent years to include all open space public and nonprofit lands and waters. Most are public lands owned in fee (the owner of the property has full and irrevocable ownership of the land); however, permanent and long-term easements, leases, agreements, Congressional (e.g. 'Wilderness Area'), Executive (e.g. 'National Monument'), and administrative designations (e.g., 'Area of Critical Environmental Concern') documented in agency management plans are also included. The PAD-US strives to be a complete inventory of U.S. public land and other protected areas, compiling “best available” data provided by managing agencies and organizations. PAD-US provides a full inventory geodatabase, spatial analysis, statistics, data downloads, web services, poster maps, and data submissions included in efforts to track global progress toward biodiversity protection. PAD-US integrates spatial data to ensure public lands and other protected areas from all jurisdictions are represented. PAD-US version 4.0 includes new and updated data from the following data providers. All other data were transferred from previous versions of PAD-US. 
Federal updates - The USGS remains committed to updating federal fee owned lands data and major designation changes in regular PAD-US updates, where authoritative data provided directly by managing agencies are available or alternative data sources are recommended. Revisions associated with the federal estate in this version include updates to the Federal estate (fee ownership parcels, easement interest, management designations, and proclamation boundaries), with authoritative data from 7 agencies: Bureau of Land Management (BLM), U.S. Census Bureau (Census Bureau), Department of Defense (DOD), U.S. Fish and Wildlife Service (FWS), National Park Service (NPS), Natural Resources Conservation Service (NRCS), and the U.S. Forest Service (USFS). The federal theme in PAD-US is developed in close collaboration with the Federal Geographic Data Committee (FGDC) Federal Lands Working Group (FLWG, https://ngda-gov-units-geoplatform.hub.arcgis.com/pages/federal-lands-workgroup/ ). This includes improved representation of boundaries and attributes for the National Park Service, U.S. Forest Service, Bureau of Land Management, and U.S. Fish and Wildlife Service lands, in collaboration with agency data-stewards, in response to feedback from the PAD-US Team and stakeholders. Additionally, National Cemetery boundaries were added using geospatial boundary data provided by the U.S. Department of Veterans Affairs and NASA boundaries were added using data contained in the USGS National Boundary Dataset (NBD).
State updates - USGS is committed to building capacity in the state data steward network and the PAD-US Team to increase the frequency of state land and NGO partner updates, as resources allow. The State Lands Workgroup ( https://ngda-gov-units-geoplatform.hub.arcgis.com/pages/state-lands-workgroup ) is focused on improving protected land inventories in PAD-US, increasing update efficiency, and facilitating local review.
PAD-US 4.0 included updates and additions from the following seventeen states and territories: California (state, local, and nonprofit fee); Colorado (state, local, and nonprofit fee and easement); Georgia (state and local fee); Kentucky (state, local, and nonprofit fee and easement); Maine (state, local, and nonprofit fee and easement); Montana (state, local, and nonprofit fee); Nebraska (state fee); New Jersey (state, local, and nonprofit fee and easement); New York (state, local, and nonprofit fee and easement); North Carolina (state, local, and nonprofit fee); Pennsylvania (state, local, and nonprofit fee and easement); Puerto Rico (territory fee); Tennessee (land trust fee); Texas (state, local, and nonprofit fee); Virginia (state, local, and nonprofit fee); West Virginia (state, local, and nonprofit fee); and Wisconsin (state fee data). Additionally, the following datasets were incorporated from NGO data partners: Trust for Public Land (TPL) Parkserve (new fee and easement data); The Nature Conservancy (TNC) Lands (fee owned by TNC); TNC Northeast Secured Areas; Ducks Unlimited (land trust fee); and the National Conservation Easement Database (NCED). All state and NGO easement submissions are provided to NCED. For more information regarding the PAD-US dataset please visit, https://www.usgs.gov/programs/gap-analysis-project/science/protected-areas . For more information about data aggregation please review the PAD-US Data Manual available at https://www.usgs.gov/programs/gap-analysis-project/pad-us-data-manual . A version history of PAD-US updates is summarized below (See https://www.usgs.gov/programs/gap-analysis-project/pad-us-data-history/ for more information):
1) First posted - April 2009 (Version 1.0 - available from the PAD-US Team: pad-us@usgs.gov).
2) Revised - May 2010 (Version 1.1 - available from the PAD-US Team: pad-us@usgs.gov).
3) Revised - April 2011 (Version 1.2 - available from the PAD-US Team: pad-us@usgs.gov).
4) Revised - November 2012 (Version 1.3) https://doi.org/10.5066/F79Z92XD
5) Revised - May 2016 (Version 1.4) https://doi.org/10.5066/F7G73BSZ
6) Revised - September 2018 (Version 2.0) https://doi.org/10.5066/P955KPLE
7) Revised - September 2020 (Version 2.1) https://doi.org/10.5066/P92QM3NT
8) Revised - January 2022 (Version 3.0) https://doi.org/10.5066/P9Q9LQ4B
9) Revised - April 2024 (Version 4.0) https://doi.org/10.5066/P96WBCHS
Comparing protected area trends between PAD-US versions is not recommended without consultation with USGS as many changes reflect improvements to agency and organization GIS systems, or conservation and recreation measure classification, rather than actual changes in protected area acquisition on the ground.
Data from: SNAPSHOT USA 2019-2023: The first five years of data from a...
data.niaid.nih.gov
dataone.org
+2 more
zip
Updated Apr 10, 2025
Cite
Brigit Rooney; William McShea; Roland Kays; Michael Cove (2025). SNAPSHOT USA 2019-2023: The first five years of data from a coordinated camera trap survey of the United States [Dataset]. http://doi.org/10.5061/dryad.k0p2ngfhn
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.k0p2ngfhn
Dataset updated
Apr 10, 2025
Dataset provided by
North Carolina State University
Smithsonian Conservation Biology Institute
North Carolina Museum of Natural Sciences
Authors
Brigit Rooney; William McShea; Roland Kays; Michael Cove
License
https://spdx.org/licenses/CC0-1.0.html
Area covered
United States
Description
SNAPSHOT USA is an annual, multi-contributor camera trap survey of mammals across the United States. The growing SNAPSHOT USA dataset is intended for tracking the spatial and temporal responses of mammal populations to changes in land use, land cover, and climate. These data will be useful for exploring the drivers of spatial and temporal changes in relative abundance and distribution, as well as the impacts of species interactions on daily activity patterns. SNAPSHOT USA 2019–2023 contains 987,979 records of camera trap image sequence data and 9,694 records of camera trap deployment metadata. Data were collected across the United States of America in all 50 states, 12 ecoregions, and many ecosystems. Data were collected between August 1st and December 29th each year from 2019 to 2023. The dataset includes a wide range of taxa but is primarily focused on medium to large mammals. SNAPSHOT USA 2019–2023 comprises two .csv files. The original data can be found within the SNAPSHOT USA Initiative in the Wildlife Insights platform.
Methods
The first three annual SNAPSHOT USA surveys were coordinated by Roland Kays, Michael Cove, and William McShea. The 2019, 2020, and 2021 datasets are accessible for public use through the Supporting Information of their respective publications. Although the 2019 and 2020 surveys were originally processed and stored in eMammal (https://www.emammal.si.edu), all data are now housed in Wildlife Insights (WI) within the SNAPSHOT USA Initiative. The two most recent surveys, 2022 and 2023, were coordinated by the SNAPSHOT USA Survey Coordinator Brigit Rooney. This dataset represents the first publication of 2022 and 2023 SNAPSHOT USA data. The SNAPSHOT USA project developed a standard protocol in 2019 to survey mammals >100 g and large identifiable birds. Cameras are unbaited and set at approximately 50 cm height across an array of at least 7 cameras with a minimum distance of 200 m and a maximum of 5 km between them.
The collection period for SNAPSHOT USA data is between September and October and the target minimum of camera trap-nights per array is 400. Some contributors to SNAPSHOT USA 2019–2023 started collecting data earlier or deployed cameras later based on locations or logistics, and we chose to include data from August 1st through December 29th each year in this dataset. The first two years of SNAPSHOT USA data incorporated an Expert Review Tool to verify the accuracy of every identification, as that was built in to the eMammal repository. This tool required SNAPSHOT USA project managers (Cove and Kays in 2019, with more taxon-specific reviewers in 2020) to review and confirm all species identifications, in an effort to minimize identification errors. As eMammal automatically grouped all uploaded images into “sequences” of images taken within 60 seconds of each other, by using the image timestamps, species identifications were made for individual sequences rather than images. These data have since been transferred to WI, where they underwent opportunistic review and correction by the SNAPSHOT USA Survey Coordinator. In contrast, SNAPSHOT USA 2021, 2022, and 2023 were managed and identified entirely in WI. All SNAPSHOT USA projects in this repository were created as “Sequence” projects, to enable the identification of sequences in the same manner as eMammal. Each 60-second sequence of images was classified to the narrowest taxonomic level possible by three iterations of validation. First, WI’s Artificial Intelligence algorithm suggested a taxonomic identification. This algorithm consists of a multiclass classification deep convolutional neural network model that uses pre-trained image embedding from Inception, a model used to identify objects. Second, each array’s Principal Investigator was responsible for validating the data, fixing Artificial Intelligence identification mistakes, and approving the data they contributed to the survey. 
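
The grouping of images into sequences by timestamp, as described above, can be sketched as follows. This is an illustrative approximation, not the eMammal or Wildlife Insights implementation: it assumes an image joins the current sequence when it was taken no more than 60 seconds after the previous image, and the function name and sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta

def group_sequences(timestamps, max_gap=timedelta(seconds=60)):
    """Group image timestamps into sequences of images taken close together.

    Assumption: a new sequence starts whenever the gap to the previous
    image exceeds max_gap.
    """
    sequences = []
    for ts in sorted(timestamps):
        if sequences and ts - sequences[-1][-1] <= max_gap:
            sequences[-1].append(ts)
        else:
            sequences.append([ts])
    return sequences

# Hypothetical image timestamps from one camera.
shots = [
    datetime(2023, 9, 1, 6, 0, 0),
    datetime(2023, 9, 1, 6, 0, 40),
    datetime(2023, 9, 1, 6, 1, 30),
    datetime(2023, 9, 1, 6, 10, 0),
]
sequences = group_sequences(shots)
print([len(seq) for seq in sequences])  # [3, 1]
```

Species identifications would then be attached to each sequence rather than to each image.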
Lastly, the SNAPSHOT USA Survey Coordinator quality-checked the deployment data and as many identified sequences as possible. This was a multistep process that began with checking the sequence metadata for obvious timestamp errors by organizing them chronologically in Microsoft Excel, and the deployment metadata for location errors by mapping their coordinates and looking for outliers. Next, the coordinator checked the sequence metadata for unlikely identifications, including species detections in places outside their known range, and verified their accuracy by viewing the images in WI. Finally, identifications for the most common species were verified by using the “Species” filter on WI to look for mistakes, one species at a time. When combining the five years of SNAPSHOT USA data to create SNAPSHOT USA 2019–2023, several aspects of the data were standardized to ensure consistency across all years. These were camera array names, camera location names, and taxonomy classifications. To match protocol requirements, all camera locations less than 5 km apart were classified as one array. This resulted in combining several arrays that were originally recorded under different names and ensuring that arrays in the same place maintained the same name each year. The camera location names were standardized by ensuring that all locations with geographic coordinates that were the same to four decimal places, in Decimal Degrees notation, had the same name. However, the original coordinates were retained in the dataset. Finally, all species taxonomy classifications for the 2019 and 2020 datasets (identified in eMammal) were standardized to match those used by WI. As part of this process, all subspecies of mammals in the dataset were changed to species level (e.g., Florida black bear (Ursus americanus floridanus) became American black bear (Ursus americanus)). 
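
The location-name standardization described above can be sketched like this. It is a hedged illustration only: the field names (`name`, `lat`, `lon`) and sample coordinates are hypothetical, and it assumes "the same to four decimal places" means equal after rounding (truncation would be another reading).

```python
def standardize_location_names(records):
    """Give one name to locations whose coordinates match at 4 decimal places.

    The first name seen for a rounded coordinate pair is reused for later
    matches; the original coordinates are left untouched.
    """
    canonical = {}
    result = []
    for rec in records:
        key = (round(rec["lat"], 4), round(rec["lon"], 4))
        name = canonical.setdefault(key, rec["name"])
        result.append({**rec, "name": name})
    return result

# Hypothetical deployment records.
deployments = [
    {"name": "site_A", "lat": 35.12341, "lon": -78.65432},
    {"name": "site_a_2020", "lat": 35.12343, "lon": -78.65434},
    {"name": "site_C", "lat": 35.2, "lon": -78.7},
]
cleaned = standardize_location_names(deployments)
print([rec["name"] for rec in cleaned])  # ['site_A', 'site_A', 'site_C']
```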
For mammal taxonomy classifications, WI uses a combination of the International Union for Conservation of Nature (IUCN) Red List of Threatened Species (2023; https://iucnredlist.org) and the American Society of Mammalogists Mammal Diversity Database (2024; https://www.mammaldiversity.org). For bird species, WI uses Birdlife International’s taxonomy classifications (2024; https://datazone.birdlife.org/species/search). The WI taxonomy is continually updated in response to public user suggestions and the taxonomy used in the SNAPSHOT USA 2019–2023 dataset reflects the WI taxonomy used in June 2024.
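The sequence grouping and location-name standardization described above can be sketched as follows. This is a minimal illustration, not the eMammal or WI implementation: the function names are hypothetical, a new sequence is assumed to start whenever the gap to the previous image exceeds 60 seconds, and "the same to four decimal places" is assumed to mean equal after rounding.

```python
from datetime import datetime, timedelta


def group_into_sequences(timestamps, max_gap_seconds=60):
    """Group image timestamps into sequences (assumption: a new sequence
    starts when the gap to the previous image exceeds max_gap_seconds)."""
    sequences = []
    for ts in sorted(timestamps):
        if sequences and ts - sequences[-1][-1] <= timedelta(seconds=max_gap_seconds):
            sequences[-1].append(ts)
        else:
            sequences.append([ts])
    return sequences


def location_key(lat, lon, decimals=4):
    """Key that is identical for coordinates equal after rounding to four decimals."""
    return (round(lat, decimals), round(lon, decimals))


def standardize_location_names(records):
    """records: list of (name, lat, lon). The first name seen for a key becomes
    the shared name; the original coordinates are kept unchanged."""
    canonical = {}
    standardized = []
    for name, lat, lon in records:
        key = location_key(lat, lon)
        canonical.setdefault(key, name)
        standardized.append((canonical[key], lat, lon))
    return standardized
```

Choosing the first name seen as the shared name is only one possible rule; the source does not say which name was kept.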
INTRODUCTION OF COVID-NEWS-US-NNK AND COVID-NEWS-BD-NNK DATASET
data.niaid.nih.gov
Updated Jul 19, 2024
Cite
Nafiz Sadman (2024). INTRODUCTION OF COVID-NEWS-US-NNK AND COVID-NEWS-BD-NNK DATASET [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4047647
Explore at:
Dataset updated
Jul 19, 2024
Dataset provided by
Nishat Anjum
Nafiz Sadman
Kishor Datta Gupta
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Bangladesh, United States
Description
Introduction

There are several works based on Natural Language Processing on newspaper reports. Rameshbhai et al. [ 1 ] mined opinions from headlines using Stanford NLP and SVM, comparing several algorithms on a small and a large dataset. Rubin et al., in their paper [ 2 ], created a mechanism to differentiate fake news from real news by building a set of characteristics of news according to their types. The purpose was to contribute to the low-resource data available for training machine learning algorithms. Doumit et al. [ 3 ] implemented LDA, a topic modeling approach, to study bias present in online news media.

However, not much NLP research has been invested in studying COVID-19. Most applications include classification of chest X-rays and CT scans to detect the presence of pneumonia in the lungs [ 4 ], a consequence of the virus. Other research areas include studying the genome sequence of the virus [ 5 ][ 6 ][ 7 ] and replicating its structure to fight the virus and find a vaccine. This research is crucial in battling the pandemic. The few NLP-based research publications include sentiment classification of online tweets by Samuel et al. [ 8 ] to understand the fear persisting in people due to the virus. Similar work has been done using an LSTM network to classify sentiments from online discussion forums by Jelodar et al. [ 9 ]. To the best of our knowledge, the NNK dataset is the first study on a comparatively larger dataset of newspaper reports on COVID-19, contributing to awareness of the virus.

2 Data-set Introduction

2.1 Data Collection

We accumulated 1000 online newspaper reports from the United States of America (USA) on COVID-19. The newspapers include The Washington Post (USA) and StarTribune (USA). We have named this dataset “Covid-News-USA-NNK”. We also accumulated 50 online newspaper reports from Bangladesh on the issue and named the dataset “Covid-News-BD-NNK”. The newspapers include The Daily Star (BD) and Prothom Alo (BD). All these newspapers are among the top providers and most read in their respective countries. The collection was done manually by 10 human data-collectors of age group 23- with university degrees. This approach was preferred over automation to ensure the news reports were highly relevant to the subject. The newspapers' online sites had dynamic content with advertisements in no particular order, so online scrapers were highly likely to collect inaccurate news reports. One of the challenges while collecting the data was the subscription requirement. Each newspaper required $1 per subscription. Some criteria provided as guidelines to the human data-collectors for collecting the news reports were as follows:

The headline must have one or more words directly or indirectly related to COVID-19.

The content of each news must have 5 or more keywords directly or indirectly related to COVID-19.

The genre of the news can be anything as long as it is relevant to the topic. Political, social, and economic genres are to be prioritized.

Avoid taking duplicate reports.

Maintain a time frame for the above mentioned newspapers.
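The first, second, and fourth guidelines above lend themselves to an automated cross-check. The sketch below is illustrative only: the collection itself was manual, the keyword list is hypothetical (the source does not publish one), and "5 or more keywords" is assumed to count every occurrence rather than distinct keywords.

```python
import re

# Hypothetical keyword list; the source does not state which keywords were used.
COVID_KEYWORDS = {"covid", "coronavirus", "pandemic", "quarantine",
                  "lockdown", "virus", "outbreak"}


def count_keyword_hits(text, keywords=COVID_KEYWORDS):
    """Count tokens in text that match any keyword, ignoring case."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return sum(1 for token in tokens if token in keywords)


def meets_criteria(headline, body, seen_headlines):
    """Check the headline rule (>= 1 keyword), the body rule (>= 5 keywords),
    and the no-duplicates rule. Accepted headlines are added to seen_headlines."""
    normalized = headline.strip().lower()
    if normalized in seen_headlines:
        return False  # duplicate report
    if count_keyword_hits(headline) < 1:
        return False
    if count_keyword_hits(body) < 5:
        return False
    seen_headlines.add(normalized)
    return True
```

Detecting duplicates by normalized headline is a simplification; two distinct reports can share a headline, so a real check might compare body text as well.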

To collect these data we used a Google Form for USA and BD. Two human editors went through each entry to check for any spam or troll entries.

2.2 Data Pre-processing and Statistics

Some pre-processing steps performed on the newspaper report dataset are as follows:

Remove hyperlinks.

Remove non-English alphanumeric characters.

Remove stop words.

Lemmatize text.
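The four steps above can be sketched as a small pipeline. This is not the authors' script: the stop-word list is a tiny illustrative one, "non-English alphanumeric characters" is assumed to mean anything other than ASCII letters, digits, and whitespace, and lemmatization is left as a pluggable function that defaults to doing nothing.

```python
import re

# Illustrative stop-word list (an assumption; the source does not name one).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "were", "of", "to",
              "in", "on", "and", "or", "for", "at"}

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
NON_ASCII_ALNUM = re.compile(r"[^A-Za-z0-9\s]")


def preprocess(text, lemmatize=lambda word: word):
    """Apply the listed steps in order and return the remaining tokens."""
    text = URL_PATTERN.sub(" ", text)        # 1. remove hyperlinks
    text = NON_ASCII_ALNUM.sub(" ", text)    # 2. keep ASCII letters, digits, spaces
    tokens = [t for t in text.split() if t.lower() not in STOP_WORDS]  # 3. stop words
    return [lemmatize(t) for t in tokens]    # 4. lemmatize (identity by default)
```

A real lemmatizer can be passed in through the `lemmatize` argument, for example the `lemmatize` method of NLTK's `WordNetLemmatizer`, if that library is available.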

While more pre-processing could have been applied, we tried to keep the data as unchanged as possible, since changing sentence structures could result in the loss of valuable information. While this was done with the help of a script, we also assigned the same human collectors to cross-check for any presence of the above-mentioned criteria.

The primary data statistics of the two datasets are shown in Tables 1 and 2.

Table 1: Covid-News-USA-NNK data statistics

No of words per headline: 7 to 20

No of words per body content: 150 to 2100

Table 2: Covid-News-BD-NNK data statistics

No of words per headline: 10 to 20

No of words per body content: 100 to 1500

2.3 Dataset Repository

We used GitHub as our primary data repository under the account name NKK^1. Here, we created two repositories, USA-NKK^2 and BD-NNK^3. The dataset is available in both CSV and JSON formats. We regularly update the CSV files and regenerate the JSON using a Python script. We also provide a Python script file for essential operations. We welcome all outside collaboration to enrich the dataset.
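Regenerating JSON from an updated CSV file can be as short as the sketch below. It is not the authors' script; it assumes only that the CSV has a header row, and the column names used in the example are hypothetical.

```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, ensure_ascii=False, indent=2)


# Hypothetical columns, for illustration only.
sample = "headline,body\nVirus spreads,Cases rise\n"
print(csv_to_json(sample))
```

Every value comes out as a string, because CSV carries no type information; numeric columns would need an explicit conversion step.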

3 Literature Review

Natural Language Processing (NLP) deals with text (also known as categorical) data in computer science, utilizing numerous diverse methods like one-hot encoding, word embedding, etc., that transform text to machine language, which can be fed to multiple machine learning and deep learning algorithms.

Some well-known applications of NLP include fraud detection on online media sites[ 10 ], authorship attribution in fallback authentication systems[ 11 ], intelligent conversational agents or chatbots[ 12 ], and the machine translation used by Google Translate[ 13 ]. While these are all downstream tasks, several exciting developments have been made in algorithms designed solely for Natural Language Processing tasks. The two most prominent are BERT[ 14 ], which uses a bidirectional Transformer encoder and performs well on classification and masked-word prediction tasks, and the GPT-3 models released by OpenAI[ 15 ], which can generate almost human-like text. However, these are generally used as pre-trained models, since training them carries a huge computation cost. Information Extraction is a generalized concept of retrieving information from a dataset. Information extraction from an image could be retrieving vital feature spaces or targeted portions of an image; information extraction from speech could be retrieving information about names, places, etc[ 16 ]. Information extraction in texts could be identifying named entities and locations or essential data. Topic modeling is a sub-task of NLP and also a process of information extraction. It clusters words and phrases of the same context together into groups. Topic modeling is an unsupervised learning method that gives us a brief idea about a set of text. One commonly used topic modeling method is Latent Dirichlet Allocation or LDA[17].

Keyword extraction is a process of information extraction and a sub-task of NLP that extracts essential words and phrases from a text. TextRank [ 18 ] is an efficient keyword extraction technique that uses graphs to calculate the weight of each word and picks the words with the highest weight.
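The graph-based weighting can be illustrated with a simplified TextRank-style sketch: words that co-occur within a small window are linked, and PageRank is run on the resulting undirected graph. Unlike the published method, this version applies no part-of-speech filtering or stop-word removal, and the window size, damping factor, and iteration count are assumed defaults.

```python
import re
from collections import defaultdict


def textrank_keywords(text, top_k=5, window=2, damping=0.85, iterations=50):
    """Rank words by PageRank over an undirected co-occurrence graph."""
    words = re.findall(r"[a-z]+", text.lower())
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        # Link each word to the next `window` words that follow it.
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    if not neighbors:
        return []
    node_count = len(neighbors)
    scores = {word: 1.0 / node_count for word in neighbors}
    for _ in range(iterations):
        new_scores = {}
        for word in neighbors:
            # Each neighbor shares its score equally among its own neighbors.
            incoming = sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            new_scores[word] = (1 - damping) / node_count + damping * incoming
        scores = new_scores
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]
```

Words that appear next to many different words end up with the highest scores, which is why a term such as a recurring topic word tends to rise to the top.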

Word clouds are a great visualization technique to understand the overall ’talk of the topic’. The clustered words give us a quick understanding of the content.

4 Our experiments and Result analysis

We used the wordcloud library^4 to create the word clouds. Figures 1 and 3 present the word clouds of the Covid-News-USA-NNK dataset by month from February to May. From Figures 1, 2, and 3, we can point out a few observations:

In February, both newspapers talked about China and the source of the outbreak.

StarTribune emphasized Minnesota as the most concerned state. In April, it seemed to be even more concerned.

Both newspapers talked about the virus impacting the economy, i.e., banks, elections, administrations, and markets.

Washington Post discussed global issues more than StarTribune.

StarTribune in February mentioned the first precautionary measure, wearing masks, and the uncontrollable spread of the virus throughout the nation.

While both newspapers mentioned the outbreak in China in February, the spread in the United States is more heavily highlighted from March through May, displaying the critical impact caused by the virus.
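A word cloud is driven by word frequencies, so the by-month comparison above comes down to counting words per month. The sketch below shows only that counting step with the standard library; the `(month, text)` input shape is an assumption, and the rendering itself is left to the wordcloud library.

```python
import re
from collections import Counter, defaultdict


def monthly_word_frequencies(reports):
    """reports: iterable of (month, text) pairs. Returns {month: Counter of words}."""
    frequencies = defaultdict(Counter)
    for month, text in reports:
        frequencies[month].update(re.findall(r"[a-z]+", text.lower()))
    return dict(frequencies)
```

Each resulting counter maps words to counts, which is the kind of input that the wordcloud library's `WordCloud.generate_from_frequencies` method accepts.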

We used a script to extract all numbers related to certain keywords like ’Deaths’, ’Infected’, ’Died’, ’Infections’, ’Quarantined’, ’Lock-down’, ’Diagnosed’, etc. from the news reports and created a series of case counts for both newspapers. Figure 4 shows the statistics of this series. From this extraction technique, we can observe that April was the peak month for COVID cases, which rose gradually from February. Both newspapers clearly show that the rise in COVID cases from February to March was slower than the rise from March to April. This is an important indicator of possible recklessness in preparations to battle the virus. However, the steep fall from April to May also shows the positive response against the attack.

We used Vader Sentiment Analysis to extract the sentiment of the headlines and the body. On average, the sentiments were from -0.5 to -0.9. The Vader Sentiment scale ranges from -1 (highly negative) to 1 (highly positive). There were some cases where the sentiment scores of the headline and body contradicted each other, i.e., the sentiment of the headline was negative but the sentiment of the body was slightly positive. Overall, sentiment analysis can help us sort the most concerning (most negative) news from the positive ones, from which we can learn more about the indicators related to COVID-19 and the serious impact caused by it. Moreover, sentiment analysis can also provide us information about how a state or country is reacting to the pandemic.

We used the PageRank algorithm to extract keywords from headlines as well as the body content. PageRank efficiently highlights important relevant keywords in the text. Some frequently occurring important keywords extracted from both datasets are: ’China’, ’Government’, ’Masks’, ’Economy’, ’Crisis’, ’Theft’, ’Stock market’, ’Jobs’, ’Election’, ’Missteps’, ’Health’, ’Response’. Keyword extraction acts as a filter allowing quick searches for indicators in case of locating situations of the economy,
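The keyword-based number extraction mentioned in this section could look like the sketch below. It is not the authors' script: the keywords come from the text, but the rule that a number must precede a keyword within a few words is an assumption made for illustration.

```python
import re

# Keywords listed in the text; the matching rule below is an assumption.
KEYWORDS = ["deaths", "infected", "died", "infections",
            "quarantined", "lock-down", "diagnosed"]

# A number such as 45 or 1,200 that is not the tail of a longer number.
NUMBER = r"(?<![\d,])(\d[\d,]*)"


def extract_keyword_numbers(text, keywords=KEYWORDS, max_gap_words=3):
    """Return (number, keyword) pairs where a number precedes a keyword
    with at most max_gap_words words in between."""
    results = []
    for keyword in keywords:
        pattern = re.compile(
            NUMBER + r"(?:\s+\S+){0," + str(max_gap_words) + r"}?\s+"
            + re.escape(keyword) + r"\b",
            re.IGNORECASE,
        )
        for match in pattern.finditer(text):
            results.append((int(match.group(1).replace(",", "")), keyword))
    return results
```

A rule like this misses phrasings where the keyword comes first ("deaths rose to 45"), so the counts it yields are only a rough proxy for the figures reported in the articles.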
State Line City, IN median household income breakdown by race betwen 2013...
neilsberg.com
csv, json
Updated Mar 1, 2025
Cite
Neilsberg Research (2025). State Line City, IN median household income breakdown by race betwen 2013 and 2023 [Dataset]. https://www.neilsberg.com/research/datasets/ed3880db-f665-11ef-a994-3860777c1fe6/
Explore at:
Available download formats: csv, json
Dataset updated
Mar 1, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
State Line City
Variables measured
Median Household Income Trends for Asian Population, Median Household Income Trends for Black Population, Median Household Income Trends for White Population, Median Household Income Trends for Some other race Population, Median Household Income Trends for Two or more races Population, Median Household Income Trends for American Indian and Alaska Native Population, Median Household Income Trends for Native Hawaiian and Other Pacific Islander Population
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To portray the median household income within each racial category identified by the US Census Bureau, we conducted an initial analysis and categorization of the data from 2013 to 2023. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). It is important to note that the median household income estimates exclusively represent the identified racial categories and do not incorporate any ethnicity classifications. Households are categorized, and median incomes are reported based on the self-identified race of the head of the household. For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents the median household incomes over the past decade across various racial categories identified by the U.S. Census Bureau in State Line City. It portrays the median household income of the head of household across racial categories (excluding ethnicity) as identified by the Census Bureau. It also showcases the annual income trends, between 2013 and 2023, providing insights into the economic shifts within diverse racial communities. The dataset can be utilized to gain insights into income disparities and variations across racial categories, aiding in data analysis and decision-making.

Key observations

White: In State Line City, the median household income for the households where the householder is White decreased by $699 (1.08%) between 2013 and 2023. The median household income, in 2023 inflation-adjusted dollars, was $64,866 in 2013 and $64,167 in 2023.

Black or African American: Even though there is a population where the householder is Black or African American, there was no median household income reported by the U.S. Census Bureau for both 2013 and 2023.

Refer to the research insights for more key observations on American Indian and Alaska Native, Asian, Native Hawaiian and Other Pacific Islander, Some other race, and Two or more races (multiracial) households.
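The change reported for White households can be checked directly from the two inflation-adjusted figures quoted above. The helper name below is hypothetical; the arithmetic is a plain difference and a percentage relative to the 2013 value.

```python
def income_change(start, end):
    """Return (absolute change, percent change relative to start)."""
    change = end - start
    return change, change / start * 100


# Figures quoted in the key observations (2023 inflation-adjusted dollars).
change, percent = income_change(64_866, 64_167)
print(f"{change:+,} ({percent:+.2f}%)")  # prints "-699 (-1.08%)"
```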

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

Racial categories include:

White

Black or African American

American Indian and Alaska Native

Asian

Native Hawaiian and Other Pacific Islander

Some other race

Two or more races (multiracial)

Variables / Data Columns

Race of the head of household: This column presents the self-identified race of the household head, encompassing all relevant racial categories (excluding ethnicity) applicable in State Line City.

2010: 2010 median household income

2011: 2011 median household income

2012: 2012 median household income

2013: 2013 median household income

2014: 2014 median household income

2015: 2015 median household income

2016: 2016 median household income

2017: 2017 median household income

2018: 2018 median household income

2019: 2019 median household income

2020: 2020 median household income

2021: 2021 median household income

2022: 2022 median household income

2023: 2023 median household income

Please note: All incomes have been adjusted for inflation and are presented in 2023-inflation-adjusted dollars.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for State Line City median household income by race. You can refer the same here
Protected Areas Database of the United States (PAD-US) 3.0 (ver. 2.0, March...
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). Protected Areas Database of the United States (PAD-US) 3.0 (ver. 2.0, March 2023) [Dataset]. https://catalog.data.gov/dataset/protected-areas-database-of-the-united-states-pad-us-3-0-ver-2-0-march-2023
Explore at:
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey http://www.usgs.gov/
Area covered
United States
Description
The USGS Protected Areas Database of the United States (PAD-US) is the nation's inventory of protected areas, including public land and voluntarily provided private protected areas, identified as an A-16 National Geospatial Data Asset in the Cadastre Theme ( https://communities.geoplatform.gov/ngda-cadastre/ ). The PAD-US is an ongoing project with several published versions of a spatial database including areas dedicated to the preservation of biological diversity, and other natural (including extraction), recreational, or cultural uses, managed for these purposes through legal or other effective means. The database was originally designed to support biodiversity assessments; however, its scope expanded in recent years to include all open space public and nonprofit lands and waters. Most are public lands owned in fee (the owner of the property has full and irrevocable ownership of the land); however, permanent and long-term easements, leases, agreements, Congressional (e.g. 'Wilderness Area'), Executive (e.g. 'National Monument'), and administrative designations (e.g. 'Area of Critical Environmental Concern') documented in agency management plans are also included. The PAD-US strives to be a complete inventory of U.S. public land and other protected areas, compiling “best available” data provided by managing agencies and organizations. The PAD-US geodatabase maps and describes areas using thirty-six attributes and five separate feature classes representing the U.S. protected areas network: Fee (ownership parcels), Designation, Easement, Marine, Proclamation and Other Planning Boundaries. An additional Combined feature class includes the full PAD-US inventory to support data management, queries, web mapping services, and analyses. The Feature Class (FeatClass) field in the Combined layer allows users to extract data types as needed. 
A Federal Data Reference file geodatabase lookup table (PADUS3_0Combined_Federal_Data_References) facilitates the extraction of authoritative federal data provided or recommended by managing agencies from the Combined PAD-US inventory. This PAD-US Version 3.0 dataset includes a variety of updates from the previous Version 2.1 dataset (USGS, 2020, https://doi.org/10.5066/P92QM3NT ), achieving goals to: 1) Annually update and improve spatial data representing the federal estate for PAD-US applications; 2) Update state and local lands data as state data-steward and PAD-US Team resources allow; and 3) Automate data translation efforts to increase PAD-US update efficiency. The following list summarizes the integration of "best available" spatial data to ensure public lands and other protected areas from all jurisdictions are represented in the PAD-US (other data were transferred from PAD-US 2.1). Federal updates - The USGS remains committed to updating federal fee owned lands data and major designation changes in annual PAD-US updates, where authoritative data provided directly by managing agencies are available or alternative data sources are recommended. The following is a list of updates or revisions associated with the federal estate: 1) Major update of the Federal estate (fee ownership parcels, easement interest, and management designations where available), including authoritative data from 8 agencies: Bureau of Land Management (BLM), U.S. Census Bureau (Census Bureau), Department of Defense (DOD), U.S. Fish and Wildlife Service (FWS), National Park Service (NPS), Natural Resources Conservation Service (NRCS), U.S. Forest Service (USFS), and National Oceanic and Atmospheric Administration (NOAA). The federal theme in PAD-US is developed in close collaboration with the Federal Geographic Data Committee (FGDC) Federal Lands Working Group (FLWG, https://communities.geoplatform.gov/ngda-govunits/federal-lands-workgroup/ ). 
2) Improved the representation (boundaries and attributes) of the National Park Service, U.S. Forest Service, Bureau of Land Management, and U.S. Fish and Wildlife Service lands, in collaboration with agency data-stewards, in response to feedback from the PAD-US Team and stakeholders. 3) Added a Federal Data Reference file geodatabase lookup table (PADUS3_0Combined_Federal_Data_References) to the PAD-US 3.0 geodatabase to facilitate the extraction (by Data Provider, Dataset Name, and/or Aggregator Source) of authoritative data provided directly (or recommended) by federal managing agencies from the full PAD-US inventory. A summary of the number of records (Frequency) and calculated GIS Acres (vs Documented Acres) associated with features provided by each Aggregator Source is included; however, the number of records may vary from source data as the "State Name" standard is applied to national files. The Feature Class (FeatClass) field in the table and geodatabase describe the data type to highlight overlapping features in the full inventory (e.g. Designation features often overlap Fee features) and to assist users in building queries for applications as needed. 4) Scripted the translation of the Department of Defense, Census Bureau, and Natural Resource Conservation Service source data into the PAD-US format to increase update efficiency. 5) Revised conservation measures (GAP Status Code, IUCN Category) to more accurately represent protected and conserved areas. For example, Fish and Wildlife Service (FWS) Waterfowl Production Area Wetland Easements changed from GAP Status Code 2 to 4 as spatial data currently represents the complete parcel (about 10.54 million acres primarily in North Dakota and South Dakota). Only aliquot parts of these parcels are documented under wetland easement (1.64 million acres). These acreages are provided by the U.S. Fish and Wildlife Service and are referenced in the PAD-US geodatabase Easement feature class 'Comments' field. 
State updates - The USGS is committed to building capacity in the state data-steward network and the PAD-US Team to increase the frequency of state land updates, as resources allow. The USGS supported efforts to significantly increase state inventory completeness with the integration of local parks data in the PAD-US 2.1, and developed a state-to-PAD-US data translation script during PAD-US 3.0 development to pilot in future updates. Additional efforts are in progress to support the technical and organizational strategies needed to increase the frequency of state updates. The PAD-US 3.0 included major updates to the following three states: 1) California - added or updated state, regional, local, and nonprofit lands data from the California Protected Areas Database (CPAD), managed by GreenInfo Network, and integrated conservation and recreation measure changes following review coordinated by the data-steward with state managing agencies. Developed a data translation Python script (see Process Step 2 Source Data Documentation) in collaboration with the data-steward to increase the accuracy and efficiency of future PAD-US updates from CPAD. 2) Virginia - added or updated state, local, and nonprofit protected areas data (and removed legacy data) from the Virginia Conservation Lands Database, provided by the Virginia Department of Conservation and Recreation's Natural Heritage Program, and integrated conservation and recreation measure changes following review by the data-steward. 3) West Virginia - added or updated state, local, and nonprofit protected areas data provided by the West Virginia University, GIS Technical Center. For more information regarding the PAD-US dataset please visit, https://www.usgs.gov/gapanalysis/PAD-US/. For more information about data aggregation please review the PAD-US Data Manual available at https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/pad-us-data-manual . 
A version history of PAD-US updates is summarized below (See https://www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/pad-us-data-history for more information):
1) First posted - April 2009 (Version 1.0 - available from the PAD-US: Team pad-us@usgs.gov).
2) Revised - May 2010 (Version 1.1 - available from the PAD-US: Team pad-us@usgs.gov).
3) Revised - April 2011 (Version 1.2 - available from the PAD-US: Team pad-us@usgs.gov).
4) Revised - November 2012 (Version 1.3) https://doi.org/10.5066/F79Z92XD
5) Revised - May 2016 (Version 1.4) https://doi.org/10.5066/F7G73BSZ
6) Revised - September 2018 (Version 2.0) https://doi.org/10.5066/P955KPLE
7) Revised - September 2020 (Version 2.1) https://doi.org/10.5066/P92QM3NT
8) Revised - January 2022 (Version 3.0) https://doi.org/10.5066/P9Q9LQ4B
Comparing protected area trends between PAD-US versions is not recommended without consultation with USGS as many changes reflect improvements to agency and organization GIS systems, or conservation and recreation measure classification, rather than actual changes in protected area acquisition on the ground.
State of Utah Acquired LiDAR Data - Wasatch Front - Dataset - CKAN
nationaldataplatform.org
Updated Feb 28, 2024
Cite
(2024). State of Utah Acquired LiDAR Data - Wasatch Front - Dataset - CKAN [Dataset]. https://nationaldataplatform.org/catalog/dataset/state-of-utah-acquired-lidar-data-wasatch-front
Explore at:
Dataset updated
Feb 28, 2024
Area covered
Wasatch Range, Utah, Wasatch Front
Description
The State of Utah, including the Utah Automated Geographic Reference Center, Utah Geological Survey, and the Utah Division of Emergency Management, along with local and federal partners, including Salt Lake County and local cities, the Federal Emergency Management Agency, the U.S. Geological Survey, and the U.S. Environmental Protection Agency, have funded and collected over 8380 km² (3236 mi²) of high-resolution (0.5 or 1 meter) Lidar data across the state since 2011, in support of a diverse set of flood mapping, geologic, transportation, infrastructure, solar energy, and vegetation projects. The datasets include point cloud, first return digital surface model (DSM), and bare-earth digital terrain/elevation model (DEM) data, along with appropriate metadata (XML, project tile indexes, and area completion reports). This 0.5-meter 2013-2014 Wasatch Front dataset includes most of the Salt Lake and Utah Valleys (Utah), and the Wasatch (Utah and Idaho), and West Valley fault zones (Utah). Other recently acquired State of Utah data include the 2011 Utah Geological Survey Lidar dataset covering Cedar and Parowan Valleys, the east shore/wetlands of Great Salt Lake, the Hurricane fault zone, the west half of Ogden Valley, North Ogden, and part of the Wasatch Plateau in Utah.
Decennial Census Data, 2020
catalog.dvrpc.org
csv
Updated Mar 17, 2025
Cite
DVRPC (2025). Decennial Census Data, 2020 [Dataset]. https://catalog.dvrpc.org/dataset/decennial-census-data-2020
Explore at:
Available download formats: csv(45639), csv(12201), csv(1628), csv(3138210), csv(48864), csv(278080), csv(51283), csv(194128), csv(20901), csv(530289), csv, csv(292974), csv(1102597), csv(9443624)
Dataset updated
Mar 17, 2025
Dataset authored and provided by
DVRPC
License
https://catalog.dvrpc.org/dvrpc_data_license.html
Description
This dataset contains data from the P.L. 94-171 2020 Census Redistricting Program. The 2020 Census Redistricting Data Program provides states the opportunity to delineate voting districts and to suggest census block boundaries for use in the 2020 Census redistricting data tabulations (Public Law 94-171 Redistricting Data File). In addition, the Redistricting Data Program will periodically collect state legislative and congressional district boundaries if they are changed by the states. The program is also responsible for the effective delivery of the 2020 Census P.L. 94-171 Redistricting Data statutorily required by one year from Census Day. The program ensures continued dialogue with the states in regard to 2020 Census planning, thereby allowing states ample time for their planning, response, and participation. The U.S. Census Bureau will deliver the Public Law 94-171 redistricting data to all states by Sept. 30, 2021. COVID-19-related delays and prioritizing the delivery of the apportionment results delayed the Census Bureau’s original plan to deliver the redistricting data to the states by April 1, 2021.

Data in this dataset contain information on population, diversity, race, ethnicity, housing, household, and vacancy rate for 2020 for various geographies (county, MCD, Philadelphia Planning Districts [referred to as county planning areas (CPAs) internally], Census designated places, tracts, block groups, and blocks).

For more information on the 2020 Census, visit https://www.census.gov/programs-surveys/decennial-census/about/rdo/summary-files.html

PLEASE NOTE: 2020 Decennial Census data has had noise injected into it because of the Census's new Disclosure Avoidance System (DAS). This can mean that population counts and characteristics, especially when they are particularly small, may not exactly correspond to the data as collected. As such, caution should be exercised when examining areas with small counts. Ron Jarmin, acting director of the Census Bureau, posted a discussion of the redistricting data, which outlines what to expect with the new DAS. For more details on accuracy, you can read it here: https://www.census.gov/newsroom/blogs/director/2021/07/redistricting-data.html
American English General Conversation Speech Dataset for ASR
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). American English General Conversation Speech Dataset for ASR [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/general-conversation-english-usa
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
United States
Dataset funded by
FutureBeeAI
Description
Introduction
Welcome to the US English General Conversation Speech Dataset — a rich, linguistically diverse corpus purpose-built to accelerate the development of English speech technologies. This dataset is designed to train and fine-tune ASR systems, spoken language understanding models, and generative voice AI tailored to real-world US English communication.
Curated by FutureBeeAI, this 30-hour dataset offers unscripted, spontaneous two-speaker conversations across a wide array of real-life topics. It enables researchers, AI developers, and voice-first product teams to build robust, production-grade English speech models that understand and respond to authentic American accents and dialects.
Speech Data
The dataset comprises 30 hours of high-quality audio, featuring natural, free-flowing dialogue between native speakers of US English. These sessions range from informal daily talks to deeper, topic-specific discussions, ensuring variability and context richness for diverse use cases.
•Participant Diversity:
•Speakers: 60 verified native US English speakers from FutureBeeAI’s contributor community.
•Regions: Representing various states of the United States of America to ensure dialectal diversity and demographic balance.
•Demographics: A gender ratio of 60% male to 40% female, with participant ages ranging from 18 to 70 years.
•Recording Details:
•Conversation Style: Unscripted, spontaneous peer-to-peer dialogues.
•Duration: Each conversation ranges from 15 to 60 minutes.
•Audio Format: Stereo WAV files, 16-bit depth, recorded at a 16 kHz sample rate.
•Environment: Quiet, echo-free settings with no background noise.
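
As a quick sanity check on the audio format listed above, the following minimal sketch reads a WAV header with Python's standard wave module and compares it with the stated stereo, 16-bit, 16 kHz specification. The function name check_wav_format and the example file name are illustrative assumptions, not part of the dataset's own tooling.

```python
import wave


def check_wav_format(path, channels=2, sample_width_bytes=2, sample_rate=16000):
    """Compare a WAV file's header with an expected format.

    Defaults mirror the format stated in the listing:
    stereo (2 channels), 16-bit (2 bytes per sample), 16 kHz.
    Returns (ok, details), where details holds the values read from the file.
    """
    with wave.open(path, "rb") as wf:
        details = {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate": wf.getframerate(),
            "duration_seconds": wf.getnframes() / wf.getframerate(),
        }
    ok = (
        details["channels"] == channels
        and details["sample_width_bytes"] == sample_width_bytes
        and details["sample_rate"] == sample_rate
    )
    return ok, details


# Hypothetical usage; "conversation_001.wav" is a placeholder file name:
# ok, details = check_wav_format("conversation_001.wav")
# print(ok, details)
```

The duration_seconds value can also be checked against the stated 15 to 60 minute range per conversation.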

Topic Diversity
The dataset spans a wide variety of everyday and domain-relevant themes. This topic diversity ensures the resulting models are adaptable to broad speech contexts.
•Sample Topics Include:
•Family & Relationships
•Food & Recipes
•Education & Career
•Healthcare Discussions
•Social Issues
•Technology & Gadgets
•Travel & Local Culture
•Shopping & Marketplace Experiences, and many more.
Transcription
Each audio file is paired with a human-verified, verbatim transcription available in JSON format.
•Transcription Highlights:
•Speaker-segmented dialogues
•Time-coded utterances
•Non-speech elements (pauses, laughter, etc.)
•High transcription accuracy, achieved through a double QA pass (average WER < 5%)
These transcriptions are production-ready, enabling seamless integration into ASR model pipelines or conversational AI workflows.
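
The listing does not say how its WER figure was computed. For readers unfamiliar with the metric, the sketch below uses the standard definition: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The function name word_error_rate and the lowercase, whitespace-based tokenization are assumptions made for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / reference length.

    Tokenization is a simple lowercase whitespace split (an assumption;
    real evaluations usually also normalize punctuation and numerals).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of six reference words gives a WER of 1/6.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A WER below 5% therefore means fewer than one word-level error per 20 reference words on average.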
Metadata
The dataset comes with granular metadata for both speakers and recordings:
•Speaker Metadata: Age, gender, accent, dialect, state/province, and participant ID.
•Recording Metadata: Topic, duration, audio format, device type, and sample rate.
Such metadata helps developers fine-tune model training and supports use-case-specific filtering or demographic analysis.
Usage and Applications
This dataset is a versatile resource for multiple English speech and language AI applications:
•ASR Development: Train accurate speech-to-text systems for US English.
•Voice Assistants: Build smart assistants capable of understanding natural American conversations.

Protected Areas Database of the United States (PAD-US) 3.0 - World Database...
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). Protected Areas Database of the United States (PAD-US) 3.0 - World Database on Protected Areas (WDPA) Submission [Dataset]. https://catalog.data.gov/dataset/protected-areas-database-of-the-united-states-pad-us-3-0-world-database-on-protected-areas
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
United States
Description
The United States Geological Survey (USGS) - Science Analytics and Synthesis (SAS) - Gap Analysis Project (GAP) manages the Protected Areas Database of the United States (PAD-US), an Arc10x geodatabase, that includes a full inventory of areas dedicated to the preservation of biological diversity and to other natural, recreation, historic, and cultural uses, managed for these purposes through legal or other effective means (www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/protected-areas). The PAD-US is developed in partnership with many organizations, including coordination groups at the [U.S.] Federal level, lead organizations for each State, and a number of national and other non-governmental organizations whose work is closely related to the PAD-US. Learn more about the USGS PAD-US partners program here: www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-data-stewards. The United Nations Environmental Program - World Conservation Monitoring Centre (UNEP-WCMC) tracks global progress toward biodiversity protection targets enacted by the Convention on Biological Diversity (CBD) through the World Database on Protected Areas (WDPA) and World Database on Other Effective Area-based Conservation Measures (WD-OECM) available at: www.protectedplanet.net. See the Aichi Target 11 dashboard (www.protectedplanet.net/en/thematic-areas/global-partnership-on-aichi-target-11) for official protection statistics recognized globally and developed for the CBD, or here for more information and statistics on the United States of America's protected areas: www.protectedplanet.net/country/USA. 
It is important to note that statistics published by the National Oceanic and Atmospheric Administration (NOAA) Marine Protected Areas (MPA) Center (www.marineprotectedareas.noaa.gov/dataanalysis/mpainventory/) and the USGS-GAP (www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-statistics-and-reports) differ from statistics published by the UNEP-WCMC, because methods to remove overlapping designations differ slightly and U.S. Territories are reported separately by the UNEP-WCMC (e.g., the largest MPA, "Pacific Remote Islands Marine Monument", is attributed to the United States Minor Outlying Islands statistics). At the time of PAD-US 2.1 publication (USGS-GAP, 2020), NOAA reported 26% of U.S. marine waters (including the Great Lakes) as protected in an MPA that meets the International Union for Conservation of Nature (IUCN) definition of biodiversity protection (www.iucn.org/theme/protected-areas/about). USGS-GAP released PAD-US 3.0 Statistics and Reports in the summer of 2022. The relationship between the USGS, the NOAA, and the UNEP-WCMC is as follows:
•USGS manages and publishes the full inventory of U.S. marine and terrestrial protected areas data in the PAD-US representing many values, developed in collaboration with a partnership network in the U.S.
•USGS is the primary source of U.S. marine and terrestrial protected areas data for the WDPA, developed from a subset of the PAD-US in collaboration with the NOAA, other agencies and non-governmental organizations in the U.S., and the UNEP-WCMC.
•UNEP-WCMC is the authoritative source of global protected area statistics from the WDPA and WD-OECM.
•NOAA is the authoritative source of MPA data in the PAD-US and MPA statistics in the U.S.
•USGS is the authoritative source of PAD-US statistics (including areas primarily managed for biodiversity, multiple uses including natural resource extraction, and public access).
The PAD-US 3.0 Combined Marine, Fee, Designation, Easement feature class (GAP Status Code 1 and 2 only) is the source of protected areas data in this WDPA update. Tribal areas and military lands represented in the PAD-US Proclamation feature class as GAP Status Code 4 (no known mandate for biodiversity protection) are not included, as spatial data to represent internal protected areas are not available at this time. The USGS submitted more than 51,000 protected areas from PAD-US 3.0, including all 50 U.S. States and 6 U.S. Territories, to the UNEP-WCMC for inclusion in the WDPA, available at www.protectedplanet.net. The NOAA is the sole source of MPAs in PAD-US, and the National Conservation Easement Database (NCED, www.conservationeasement.us/) is the source of conservation easements. The USGS aggregates authoritative federal lands data directly from managing agencies for PAD-US (https://ngda-gov-units-geoplatform.hub.arcgis.com/pages/federal-lands-workgroup), while a network of State data-stewards provides state, local government lands, and some land trust preserves. National nongovernmental organizations contribute spatial data directly (www.usgs.gov/core-science-systems/science-analytics-and-synthesis/gap/science/pad-us-data-stewards). The USGS translates the biodiversity-focused subset of PAD-US into the WDPA schema (UNEP-WCMC, 2019) for efficient aggregation by the UNEP-WCMC. The USGS maintains WDPA Site Identifiers (WDPAID, WDPA_PID), a persistent identifier for each protected area, provided by UNEP-WCMC. Agency partners are encouraged to track WDPA Site Identifier values in source datasets to improve the efficiency and accuracy of PAD-US and WDPA updates.
The IUCN protected areas in the U.S. are managed by thousands of agencies and organizations across the country and include over 51,000 designated sites such as National Parks, National Wildlife Refuges, National Monuments, Wilderness Areas, some State Parks, State Wildlife Management Areas, Local Nature Preserves, City Natural Areas, The Nature Conservancy and other Land Trust Preserves, and Conservation Easements. The boundaries of these protected places (some overlap) are represented as polygons in the PAD-US, along with informative descriptions such as Unit Name, Manager Name, and Designation Type. As the WDPA is a global dataset, its data standards (UNEP-WCMC 2019) require simplification to reduce the number of records included, focusing on the protected area site name and management authority as described in the Supplemental Information section in this metadata record. Given the numerous organizations involved, sites may be added or removed from the WDPA between PAD-US updates. These differences may reflect actual change in protected area status; however, they also reflect the dynamic nature of spatial data or Geographic Information Systems (GIS). Many agencies and non-governmental organizations are working to improve the accuracy of protected area boundaries, the consistency of attributes, and inventory completeness between PAD-US updates. In addition, USGS continually seeks partners to review and refine the assignment of conservation measures in the PAD-US.


Cite

Neilsberg Research (2024). Median Household Income by Racial Categories in United States (2022) [Dataset]. https://www.neilsberg.com/research/datasets/3693eb82-8904-11ee-9302-3860777c1fe6/

Median Household Income by Racial Categories in United States (2022)

Available download formats

csv, json

Dataset updated

Jan 3, 2024

Dataset authored and provided by

Neilsberg Research

License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered

United States

Variables measured

Median Household Income for Asian Population, Median Household Income for Black Population, Median Household Income for White Population, Median Household Income for Some other race Population, Median Household Income for Two or more races Population, Median Household Income for American Indian and Alaska Native Population, Median Household Income for Native Hawaiian and Other Pacific Islander Population

Measurement technique

The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2022 1-Year Estimates. To portray the median household income within each racial category identified by the US Census Bureau, we conducted an initial analysis and categorization of the data. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). It is important to note that the median household income estimates exclusively represent the identified racial categories and do not incorporate any ethnicity classifications. Households are categorized, and median incomes are reported, based on the self-identified race of the head of the household. For additional information about these estimates, please contact us via email at research@neilsberg.com

Dataset funded by

Neilsberg Research

Description

About this dataset

Context

The dataset presents the median household income across different racial categories in the United States. It reports median household income by the race of the head of household (excluding ethnicity), as identified by the Census Bureau. The dataset can be used to gain insights into economic disparities and trends and to explore variations in median household income across racial categories.

Key observations

Based on our analysis of the distribution of the United States population by race and ethnicity, the population is predominantly White: this racial category accounts for 68.17% of total residents. The median household income for White households is $79,933. Although the White population is the largest, Asian households report the highest median household income, at $106,954.

[Figure: United States median household income diversity across racial categories. https://i.neilsberg.com/ch/united-states-median-household-income-by-race.jpeg]

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2022 1-Year Estimates.

Racial categories include:

White
Black or African American
American Indian and Alaska Native
Asian
Native Hawaiian and Other Pacific Islander
Some other race
Two or more races (multiracial)

Variables / Data Columns

Race of the head of household: This column presents the self-identified race of the household head, encompassing all relevant racial categories (excluding ethnicity) applicable in the United States.
Median household income: Median household income, presented in 2022 inflation-adjusted dollars.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability, and thus to a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for United States median household income by race. Refer to the main dataset for the complete data.
