25 datasets found

A dataset of 5 million city trees from 63 US cities: species, location,...
data.niaid.nih.gov
datadryad.org
zip
Updated Aug 31, 2022
Cite
Dakota McCoy; Benjamin Goulet-Scott; Weilin Meng; Bulent Atahan; Hana Kiros; Misako Nishino; John Kartesz (2022). A dataset of 5 million city trees from 63 US cities: species, location, nativity status, health, and more. [Dataset]. http://doi.org/10.5061/dryad.2jm63xsrf
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.2jm63xsrf
Dataset updated
Aug 31, 2022
Dataset provided by
Stanford University
Cornell University
Harvard University
Worcester Polytechnic Institute
The Biota of North America Program (BONAP)
Authors
Dakota McCoy; Benjamin Goulet-Scott; Weilin Meng; Bulent Atahan; Hana Kiros; Misako Nishino; John Kartesz
License
https://spdx.org/licenses/CC0-1.0.html
Area covered
United States
Description
Sustainable cities depend on urban forests. City trees, a pillar of urban forests, improve our health, clean the air, store CO2, and cool local temperatures. Comparatively less is known about urban forests as ecosystems, particularly their spatial composition, nativity statuses, biodiversity, and tree health. Here, we assembled and standardized a new dataset of N=5,660,237 trees from 63 of the largest US cities. The data come from tree inventories conducted at the level of cities and/or neighborhoods. Each data sheet includes detailed information on tree location, species, nativity status (whether a tree species is naturally occurring or introduced), health, size, whether it is in a park or urban area, and more (comprising 28 standardized columns per datasheet). This dataset could be analyzed in combination with citizen-science datasets on bird, insect, or plant biodiversity; social and demographic data; or data on the physical environment. Urban forests offer a rare opportunity to intentionally design biodiverse, heterogeneous, rich ecosystems.

Methods

See eLife manuscript for full details. Below, we provide a summary of how the dataset was collected and processed.

Data Acquisition

We limited our search to the 150 largest cities in the USA (by census population). To acquire raw data on street tree communities, we used a search protocol on both Google and Google Datasets Search (https://datasetsearch.research.google.com/). We first searched the city name plus each of the following: street trees, city trees, tree inventory, urban forest, and urban canopy (all combinations totaled 20 searches per city, 10 each in Google and Google Datasets Search). We then read the first page of Google results and the top 20 results from Google Datasets Search. If the same named city in the wrong state appeared in the results, we redid the 20 searches adding the state name. If no data were found, we contacted a relevant state official via email or phone with an inquiry about their street tree inventory. Datasheets were received and transformed to .csv format (if they were not already in that format). We received data on street trees from 64 cities. One city, El Paso, had data only in summary format and was therefore excluded from analyses.

Data Cleaning

All code used is in the zipped folder Data S5 in the eLife publication. Before cleaning the data, we ensured that all reported trees for each city were located within the greater metropolitan area of the city (for certain inventories, many suburbs were reported - some within the greater metropolitan area, others not).

First, we renamed all columns in the received .csv sheets, referring to the metadata and according to our standardized definitions (Table S4). To harmonize tree health and condition data across different cities, we inspected metadata from the tree inventories and converted all numeric scores to a descriptive scale including “excellent”, “good”, “fair”, “poor”, “dead”, and “dead/dying”. Some cities included only three points on this scale (e.g., “good”, “poor”, “dead/dying”) while others included five (e.g., “excellent”, “good”, “fair”, “poor”, “dead”).

Second, we used pandas in Python (W. McKinney & Others, 2011) to correct typos, non-ASCII characters, variable spellings, date format, units used (we converted all units to metric), address issues, and common name format. In some cases, units were not specified for tree diameter at breast height (DBH) and tree height; we determined the units based on typical sizes for trees of a particular species. Wherever diameter was reported, we assumed it was DBH. We standardized health and condition data across cities, preserving the highest granularity available for each city. For our analysis, we converted this variable to a binary (see section Condition and Health). We created a column called “location_type” to label whether a given tree was growing in the built environment or in green space. All of the changes we made, and decision points, are preserved in Data S9.

Third, we checked the scientific names reported using gnr_resolve in the R library taxize (Chamberlain & Szöcs, 2013), with the option Best_match_only set to TRUE (Data S9). Through an iterative process, we manually checked the results and corrected typos in the scientific names until all names were either a perfect match (n=1771 species) or partial match with threshold greater than 0.75 (n=453 species). BGS manually reviewed all partial matches to ensure that they were the correct species name, and then we programmatically corrected these partial matches (for example, Magnolia grandifolia, which is not a species name of a known tree, was corrected to Magnolia grandiflora, and Pheonix canariensus was corrected to its proper spelling of Phoenix canariensis). Because many of these tree inventories were crowd-sourced or generated in part through citizen science, such typos and misspellings are to be expected.

Some tree inventories reported species by common names only. Therefore, our fourth step in data cleaning was to convert common names to scientific names. We generated a lookup table by summarizing all pairings of common and scientific names in the inventories for which both were reported. We manually reviewed the common to scientific name pairings, confirming that all were correct. Then we programmatically assigned scientific names to all common names (Data S9).

Fifth, we assigned native status to each tree through reference to the Biota of North America Program (Kartesz, 2018), which has collected data on all native and non-native species occurrences throughout the US states. Specifically, we classified each tree species in a given city as native to that state, not native to that state, or undetermined (for cases where only the genus was known and we did not have enough information to determine nativity).

Sixth, some cities reported only the street address but not latitude and longitude. For these cities, we used the OpenCageGeocoder (https://opencagedata.com/) to convert addresses to latitude and longitude coordinates (Data S9). OpenCageGeocoder leverages open data and is used by many academic institutions (see https://opencagedata.com/solutions/academia).

Seventh, we trimmed each city dataset to include only the standardized columns we identified in Table S4. After each stage of data cleaning, we performed manual spot checking to identify any issues.
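As a rough illustration of two of the cleaning steps described above (harmonizing health scores and filling in scientific names from a lookup table), here is a minimal Python sketch. It is not the authors' code, which is in Data S5 and uses pandas; this version uses only the standard library. The numeric-to-label mapping, the column names `common_name` and `scientific_name`, and all function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical numeric-to-label mapping. Each real inventory defines its own
# scale in its metadata, so these pairings are an assumption for illustration.
HEALTH_LABELS = {5: "excellent", 4: "good", 3: "fair", 2: "poor", 1: "dead"}


def harmonize_health(score):
    """Map a numeric health score to a descriptive label (None if unknown)."""
    try:
        return HEALTH_LABELS.get(int(score))
    except (TypeError, ValueError):
        return None


def build_name_lookup(rows):
    """Collect common-to-scientific pairings from rows that report both.

    Returns (lookup, ambiguous): `lookup` holds common names that map to
    exactly one scientific name; `ambiguous` holds the rest for manual review.
    """
    pairs = defaultdict(set)
    for row in rows:
        common = (row.get("common_name") or "").strip().lower()
        scientific = (row.get("scientific_name") or "").strip()
        if common and scientific:
            pairs[common].add(scientific)
    lookup = {c: next(iter(s)) for c, s in pairs.items() if len(s) == 1}
    ambiguous = {c: sorted(s) for c, s in pairs.items() if len(s) > 1}
    return lookup, ambiguous


def fill_scientific_names(rows, lookup):
    """Return copies of rows with missing scientific names filled from lookup."""
    filled = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        if not (row.get("scientific_name") or "").strip():
            common = (row.get("common_name") or "").strip().lower()
            row["scientific_name"] = lookup.get(common)
        filled.append(row)
    return filled


# Tiny made-up inventory: the second row reports a common name only.
inventory = [
    {"common_name": "Southern magnolia", "scientific_name": "Magnolia grandiflora"},
    {"common_name": "southern magnolia", "scientific_name": ""},
    {"common_name": "Canary Island date palm", "scientific_name": "Phoenix canariensis"},
]
lookup, ambiguous = build_name_lookup(inventory)
cleaned = fill_scientific_names(inventory, lookup)
```

Keeping ambiguous pairings separate, rather than picking one silently, mirrors the manual review the description mentions before names are assigned programmatically.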
Illinois Cities by Population
illinois-demographics.com
Updated Jun 20, 2024
Cite
Kristen Carney (2024). Illinois Cities by Population [Dataset]. https://www.illinois-demographics.com/cities_by_population
Dataset updated
Jun 20, 2024
Dataset provided by
Cubit Planning, Inc.
Authors
Kristen Carney
License
https://www.illinois-demographics.com/terms_and_conditions
Area covered
Illinois
Description
A dataset listing Illinois cities by population for 2024.
Florida Cities by Population
florida-demographics.com
Updated Jun 20, 2024
Cite
Kristen Carney (2024). Florida Cities by Population [Dataset]. https://www.florida-demographics.com/cities_by_population
Dataset updated
Jun 20, 2024
Dataset provided by
Cubit Planning, Inc.
Authors
Kristen Carney
License
https://www.florida-demographics.com/terms_and_conditions
Area covered
Florida, Florida City
Description
A dataset listing Florida cities by population for 2024.
Georgia Cities by Population
georgia-demographics.com
Updated Jun 20, 2024
Cite
Kristen Carney (2024). Georgia Cities by Population [Dataset]. https://www.georgia-demographics.com/cities_by_population
Dataset updated
Jun 20, 2024
Dataset provided by
Cubit Planning, Inc.
Authors
Kristen Carney
License
https://www.georgia-demographics.com/terms_and_conditions
Area covered
Georgia
Description
A dataset listing Georgia cities by population for 2024.
California Annual Population and Growth Analysis Dataset: A Comprehensive...
neilsberg.com
csv, json
Updated Feb 24, 2025
Cite
Neilsberg Research (2025). California Annual Population and Growth Analysis Dataset: A Comprehensive Overview of Population Changes and Yearly Growth Rates in California from 2000 to 2024 // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/california-population-by-year/
Available download formats: csv, json
Dataset updated
Feb 24, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
California
Variables measured
Annual Population Growth Rate, Population Between 2000 and 2024, Annual Population Growth Rate Percent
Measurement technique
The data presented in this dataset are derived from more than 20 years of U.S. Census Bureau Population Estimates Program (PEP) data, 2000 - 2024. To measure the variables, namely (a) population and (b) population change (absolute and as a percentage), we first analyzed and tabulated the data for each of the years between 2000 and 2024. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the California population over the last 20-plus years. It lists the population for each year, along with the year-on-year change in population, as well as the change in percentage terms for each year. The dataset can be used to understand the population change of California across the last two decades. For example, using this dataset, we can identify whether the population is declining or increasing and, if there is a change, when the population peaked or whether it is still growing and has not reached its peak. We can also compare the trend with the overall trend of the United States population over the same period of time.

Key observations

In 2024, the population of California was 39.43 million, a 0.59% year-over-year increase from 2023. Previously, in 2023, California's population was 39.2 million, an increase of 0.14% compared to a population of 39.14 million in 2022. Over the last 20-plus years, between 2000 and 2024, the population of California increased by 5.44 million. In this period, the peak population was 39.52 million in the year 2020. The numbers suggest that the population has already reached its peak and is showing a trend of decline. Source: U.S. Census Bureau Population Estimates Program (PEP).

Content

When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).

Data Coverage:

From 2000 to 2024

Variables / Data Columns

Year: This column displays the data year (Measured annually and for years 2000 to 2024)

Population: The population of California for the specific year is shown in this column.

Year on Year Change: This column displays the change in California population for each year compared to the previous year.

Change in Percent: This column displays the year on year change as a percentage. Please note that the sum of all percentages may not equal one due to rounding of values.
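The two derived columns above can be reproduced from the Year and Population columns alone. The following Python sketch shows one way to do so; the function name and the tuple layout are assumptions for illustration, not Neilsberg's method. The sample values are the rounded figures (in millions) quoted under "Key observations", so results computed from them may differ slightly from the published percentages.

```python
def add_yearly_change(population_by_year):
    """Return (year, population, change, change_percent) tuples sorted by year.

    The first year has no predecessor, so its change fields are None.
    """
    rows = []
    previous = None
    for year in sorted(population_by_year):
        population = population_by_year[year]
        if previous is None:
            change, percent = None, None
        else:
            change = population - previous
            # Percent change relative to the previous year's population.
            percent = round(100 * change / previous, 2)
        rows.append((year, population, change, percent))
        previous = population
    return rows


# Rounded populations in millions, taken from the "Key observations" text.
example = {2022: 39.14, 2023: 39.20, 2024: 39.43}
rows = add_yearly_change(example)
```

With these inputs, the 2024 row reports a 0.59 percent change, matching the figure quoted in the listing.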

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is part of the main dataset for California Population by Year.
New York Cities by Population
newyork-demographics.com
Updated Jun 20, 2024
Cite
Kristen Carney (2024). New York Cities by Population [Dataset]. https://www.newyork-demographics.com/cities_by_population
Dataset updated
Jun 20, 2024
Dataset provided by
Cubit Planning, Inc.
Authors
Kristen Carney
License
https://www.newyork-demographics.com/terms_and_conditions
Area covered
New York
Description
A dataset listing New York cities by population for 2024.
Data from: Urban-rural continuum
figshare.com
datasetcatalog.nlm.nih.gov
tiff
Updated May 30, 2023
Cite
Andrea Cattaneo; Andy Nelson; Theresa McMenomy (2023). Urban-rural continuum [Dataset]. http://doi.org/10.6084/m9.figshare.12579572.v4
Available download formats: tiff
Unique identifier
https://doi.org/10.6084/m9.figshare.12579572.v4
Dataset updated
May 30, 2023
Dataset provided by
Figshare: http://figshare.com/
Authors
Andrea Cattaneo; Andy Nelson; Theresa McMenomy
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The urban–rural continuum classifies the global population, allocating rural populations around differently-sized cities. The classification is based on four dimensions: population distribution, population density, urban center location, and travel time to urban centers, all of which can be mapped globally and consistently and then aggregated as administrative unit statistics.

Using spatial data, we matched all rural locations to their urban center of reference based on the time needed to reach these urban centers. A hierarchy of urban centers by population size (largest to smallest) is used to determine which center is the point of “reference” for a given rural location: proximity to a larger center “dominates” over a smaller one in the same travel time category. This was done for 7 urban categories and then aggregated, for presentation purposes, into “large cities” (over 1 million people), “intermediate cities” (250,000–1 million), and “small cities and towns” (20,000–250,000).

Finally, to reflect the diversity of population density across the urban–rural continuum, we distinguished between high-density rural areas with over 1,500 inhabitants per km² and lower density areas. Unlike traditional functional area approaches, our approach does not define urban catchment areas by using thresholds, such as proportion of people commuting; instead, these emerge endogenously from our urban hierarchy and by calculating the shortest travel time.

Urban-Rural Catchment Areas (URCA).tif is a raster dataset of the 30 urban–rural continuum categories showing the catchment areas around cities and towns of different sizes. Each rural pixel is assigned to one defined travel time category: less than one hour, one to two hours, and two to three hours travel time to one of seven urban agglomeration sizes. The agglomerations range from large cities with i) populations greater than 5 million and ii) between 1 to 5 million; intermediate cities with iii) 500,000 to 1 million and iv) 250,000 to 500,000 inhabitants; small cities with populations v) between 100,000 and 250,000 and vi) between 50,000 and 100,000; and vii) towns of between 20,000 and 50,000 people. The remaining pixels that are more than 3 hours away from any urban agglomeration of at least 20,000 people are considered either hinterland or dispersed towns, since they do not gravitate around any urban agglomeration. The raster also allows for visualizing a simplified continuum created by grouping the seven urban agglomerations into 4 categories.

Urban-Rural Catchment Areas (URCA).tif is in GeoTIFF format, band interleaved with LZW compression, suitable for use in Geographic Information Systems and statistical packages. The data type is byte, with pixel values ranging from 1 to 30. The no data value is 128. It has a spatial resolution of 30 arc seconds, which is approximately 1 km at the equator. The spatial reference system (projection) is EPSG:4326 - WGS84 - Geographic Coordinate System (lat/long). The geographic extent is 83.6N - 60S / 180E - 180W. The same tif file is also available as an ESRI ArcMap MapPackage, Urban-Rural Catchment Areas.mpk.

Further details are in the ReadMe_data_description.docx.
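The "reference center" rule described above can be sketched in a few lines of Python. This is an illustrative reading of the description, not the dataset authors' implementation. The function names, the size ranks (1 for the largest agglomeration class through 7 for towns), the handling of times that fall exactly on a band boundary, and the choice to prefer a nearer travel-time band over a farther one are all assumptions. The returned pair is not the raster's pixel code, whose mapping is documented in the dataset's ReadMe.

```python
def travel_band(hours):
    """Return 0 (<1 h), 1 (1-2 h), 2 (2-3 h), or None beyond 3 hours.

    Treating exactly 1 h and 2 h as the start of the next band is an
    assumption; the description does not say how ties are handled.
    """
    if hours < 1:
        return 0
    if hours < 2:
        return 1
    if hours <= 3:
        return 2
    return None


def reference_center(travel_hours_by_size):
    """Pick the reference center for one rural location.

    `travel_hours_by_size` maps a size rank (1 = largest class, 7 = towns)
    to the shortest travel time in hours to a center of that class.
    Returns (band, size_rank), or None when every center is more than
    3 hours away (hinterland or dispersed towns in the description).
    """
    best = None
    for size_rank, hours in travel_hours_by_size.items():
        band = travel_band(hours)
        if band is None:
            continue
        # Nearest band first (assumed); within a band, the larger center
        # (lower rank) dominates, as the description states.
        candidate = (band, size_rank)
        if best is None or candidate < best:
            best = candidate
    return best
```

For example, a location 1.5 hours from a rank-1 city and 1.2 hours from a rank-4 city falls in the one-to-two-hour band for both, so the larger rank-1 city becomes its reference center.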
Table_1_Does the population size of a city matter to its older adults’...
frontiersin.figshare.com
docx
Updated Feb 1, 2024
Cite
Zehan Pan; Weizhen Dong; Zuyu Huang (2024). Table_1_Does the population size of a city matter to its older adults’ self-rated health? Results of China data analysis.docx [Dataset]. http://doi.org/10.3389/fpubh.2024.1333961.s001
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fpubh.2024.1333961.s001
Dataset updated
Feb 1, 2024
Dataset provided by
Frontiers
Authors
Zehan Pan; Weizhen Dong; Zuyu Huang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Clarifying the association between city population size and older adults’ health is vital in understanding the health disparity across different cities in China. Using a nationally representative dataset, this study employed Multilevel Mixed-effects Probit regression models and Sorting Analysis to elucidate this association, taking into account the sorting decisions made by older adults. The main results of the study include: (1) The association between city population size and the self-rated health of older adults shifts from a positive linear to an inverted U-shaped relationship once individual socioeconomic status is controlled for; the socioeconomic development of cities, intertwined with the growth of their populations, plays a pivotal role in yielding health benefits. (2) There is a sorting effect in older adults’ residential decisions; compared to cities with over 5 million residents, unobserved factors result in smaller cities hosting more less-healthy older adults, which may cause overestimation of health benefits in cities with greater population size. (3) The evolving socioeconomic and human-made environment resulting from urban population growth introduces health risks for migratory older adults but yields benefits for those with local resident status who are male, aged over 70, and have lower living standards and socioeconomic status. And (4) The sorting effects are more pronounced among older adults with greater resources supporting their mobility or those without permanent local resident status. Thus, policymakers should adapt planning and development strategies to consider the intricate relationship between city population size and the health of older adults.
Focus on London - Population and Migration
data.europa.eu
data.wu.ac.at
unknown
Updated Oct 18, 2021
Cite
GLA Intelligence Unit (2021). Focus on London - Population and Migration [Dataset]. https://data.europa.eu/88u/dataset/focus-on-london-population-and-migration-1
Available download formats: unknown
Dataset updated
Oct 18, 2021
Dataset authored and provided by
GLA Intelligence Unit
Area covered
London
Description
This report was released in September 2010. However, recent demographic data are available on the Datastore. Other datasets on the Datastore may be useful, such as: GLA Population Projections, National Insurance Number Registrations of Overseas Nationals, Births by Birthplace of Mother, Births and Fertility Rates, and Office for National Statistics (ONS) Population Estimates.

FOCUS ON LONDON 2010: POPULATION AND MIGRATION

London is the United Kingdom’s only city region. Its population of 7.75 million is 12.5 per cent of the UK population living on just 0.6 per cent of the land area. London’s average population density is over 4,900 persons per square kilometre, ten times that of the second most densely populated region.

Between 2001 and 2009 London’s population grew by over 430 thousand, more than any other region, accounting for over 16 per cent of the UK increase.

This report discusses in detail the population of London including Population Age Structure, Fertility and Mortality, Internal Migration, International Migration, Population Turnover and Churn, and Demographic Projections.

Population and Migration report is the first release of the Focus on London 2010-12 series. Reports on themes such as Income, Poverty, Labour Market, Skills, Health, and Housing are also available.

PRESENTATION:

To access an interactive presentation about population changes in London click the link to see it on Prezi.com

FACTS:

Top five boroughs for babies born per 10,000 population in 2008-09:

1. Newham – 244.4

2. Barking and Dagenham – 209.3

3. Hackney – 205.7

4. Waltham Forest – 202.7

5. Greenwich – 196.2

...

32. Havering – 116.8

33. City of London – 47.0

In 2009, Barnet overtook Croydon as the most populous London borough. Prior to this Croydon had been the largest since 1966

Population per hectare of land used for Domestic building and gardens is highest in Tower Hamlets

In 2008-09, natural change (births minus deaths) led to 78,000 more Londoners compared with only 8,000 due to migration.
World Cities Culture Report, 2022
icpsr.umich.edu
Updated May 16, 2025
Cite
World Cities Culture Forum (2025). World Cities Culture Report, 2022 [Dataset]. https://www.icpsr.umich.edu/web/NADAC/studies/39411
Dataset updated
May 16, 2025
Dataset provided by
Inter-university Consortium for Political and Social Research: https://www.icpsr.umich.edu/web/pages/
Authors
World Cities Culture Forum
License
https://www.icpsr.umich.edu/web/ICPSR/studies/39411/terms
Description
The World Cities Culture Forum, established in 2012, is a leading global network of civic leaders from over 40 creative cities across six continents, representing a combined population of over 245 million. The forum fosters collaborations to place culture at the core of urban development, addressing 21st-century challenges such as climate change, affordable workspaces, cultural tourism, and diversity in public spaces. Through its Global Summit, partnerships, and programs like the Leadership Exchange Programme and Digital Dialogue Masterclasses, the forum promotes cultural integration in city planning. The World Cities Culture Report 2022 provides comprehensive open-source data on culture, including over 60 datasets from 40 cities. Contextual Data: Includes demographics such as characteristics of the overall and working-age populations (including percent who were foreign born) and of the geographical area, such as the percentage of national population living in the city and the percentage of the area devoted to parks and other public green spaces. Cultural Infrastructure: Provides counts (and rates) of various facilities and venues, including art galleries, artists' studios, rehearsal spaces, bars, bookshops, cinemas, community centers, concert halls, museums, nightclubs, libraries, video game arcades, and theatres. Participation and Tourism: Focuses on cultural participation metrics, such as cinema and theatre admissions, festival attendance, museum visits, average daily attendance at the top five art exhibits, and international tourist numbers. Creative Economy: Encompasses data on book publishing, creative industries employment, film festivals, restaurant ratings, and performances. Education: Includes statistics on public library book loans, higher education levels, international student enrollment, and specialist institutes in art and design education. The source for each number is identified within the dataset. 
Data users can freely download selected datasets as .csv files.
Population estimates, quarterly
www150.statcan.gc.ca
open.canada.ca
Updated Jun 18, 2025
Cite
Government of Canada, Statistics Canada (2025). Population estimates, quarterly [Dataset]. http://doi.org/10.25318/1710000901-eng
Unique identifier
https://doi.org/10.25318/1710000901-eng
Dataset updated
Jun 18, 2025
Dataset provided by
Government of Canada: http://www.gg.ca/
Statistics Canada: https://statcan.gc.ca/en
Area covered
Canada
Description
Estimated number of persons by quarter of a year and by year, Canada, provinces and territories.
LANGUAGE SPOKEN AT HOME FOR THE POPULATION 5 YEARS AND OVER IN LIMITED...
hub.arcgis.com
data.seattle.gov
Updated Sep 3, 2023
Cite
City of Seattle ArcGIS Online (2023). LANGUAGE SPOKEN AT HOME FOR THE POPULATION 5 YEARS AND OVER IN LIMITED ENGLISH SPEAKING HOUSEHOLDS (B16003) [Dataset]. https://hub.arcgis.com/datasets/SeattleCityGIS::language-spoken-at-home-for-the-population-5-years-and-over-in-limited-english-speaking-households-b16003/about
Dataset updated
Sep 3, 2023
Dataset provided by
https://arcgis.com/
Authors
City of Seattle ArcGIS Online
Description
Table from the American Community Survey (ACS) B16003 of age by language spoken at home for the population 5 years and over in limited English-speaking households. These are multiple, nonoverlapping vintages of the 5-year ACS estimates of population and housing attributes starting in 2010 shown by the corresponding census tract vintage. Also includes the most recent release annually.

King County, Washington census tracts with nonoverlapping vintages of the 5-year American Community Survey (ACS) estimates starting in 2010. Vintage identified in the "ACS Vintage" field.

The census tract boundaries match the vintage of the ACS data (currently 2010 and 2020) so please note the geographic changes between the decades. Tracts have been coded as being within the City of Seattle as well as assigned to neighborhood groups called "Community Reporting Areas". These areas were created after the 2000 census to provide geographically consistent neighborhoods through time for reporting U.S. Census Bureau data. This is not an attempt to identify neighborhood boundaries as defined by neighborhoods themselves.

Vintages: 2010, 2015, 2020, 2021, 2022, 2023

ACS Table(s): B16003

Data downloaded from: Census Bureau's Explore Census Data

The United States Census Bureau's American Community Survey (ACS): About the Survey, Geography & ACS, Technical Documentation, News & Updates

This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data.

Data Note from the Census:

Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.

Data Processing Notes:

Boundaries come from the US Census TIGER geodatabases, specifically, the National Sub-State Geography Database (named tlgdb_(year)_a_us_substategeo.gdb). Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines erased for cartographic and mapping purposes. For census tracts, the water cutouts are derived from a subset of the 2020 Areal Hydrography boundaries offered by TIGER. Water bodies and rivers which are 50 million square meters or larger (mid to large sized water bodies) are erased from the tract level boundaries, as well as additional important features. For state and county boundaries, the water and coastlines are derived from the coastlines of the 2020 500k TIGER Cartographic Boundary Shapefiles. These are erased to more accurately portray the coastlines and Great Lakes. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters).

The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico

Census tracts with no population that occur in areas of water, such as oceans, are removed from this data service (Census Tracts beginning with 99).

Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey.

Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page.

Negative values (e.g., -4444...) have been set to null, with the exception of -5555... which has been set to zero. These negative values exist in the raw API data to indicate the following situations:

The margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.

Either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.

The median falls in the lowest interval of an open-ended distribution, or in the upper interval of an open-ended distribution. A statistical test is not appropriate.

The estimate is controlled. A statistical test for sampling variability is not appropriate.

The data for this geographic area cannot be displayed because the number of sample cases is too small.
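The margin-of-error interpretation in the Census data note can be made concrete with a short Python sketch. The function names and sample numbers are hypothetical. Dividing a 90 percent margin of error by 1.645 to recover a standard error follows the Census Bureau's published ACS guidance. Treating a null margin of error as "no bounds available" is an assumption about how a user might handle the nulls described in the processing notes.

```python
Z_90 = 1.645  # z-score associated with ACS 90 percent margins of error


def confidence_bounds(estimate, moe):
    """Return (lower, upper) 90 percent bounds, or None if the MOE is null."""
    if estimate is None or moe is None:
        return None
    return estimate - moe, estimate + moe


def standard_error(moe):
    """Convert a 90 percent margin of error to a standard error."""
    if moe is None:
        return None
    return moe / Z_90


# Hypothetical tract-level estimate of 1,200 people with a margin of error of 150.
bounds = confidence_bounds(1200, 150)
se = standard_error(150)
```

Here `bounds` is the interval from 1,050 to 1,350, which is the "estimate minus the margin of error" and "estimate plus the margin of error" pair the note describes.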
Population and Languages of the Limited English Proficient (LEP) Speakers by...
data.cityofnewyork.us
catalog.data.gov
application/rdfxml +5
Updated Apr 25, 2022
Civic Engagement Commission (CEC) (2022). Population and Languages of the Limited English Proficient (LEP) Speakers by Community District [Dataset]. https://data.cityofnewyork.us/City-Government/Population-and-Languages-of-the-Limited-English-Pr/ajin-gkbp
Available download formats: application/rssxml, xml, csv, tsv, application/rdfxml, json
Dataset updated
Apr 25, 2022
Dataset authored and provided by
Civic Engagement Commission (CEC)
Description
Many residents of New York City speak more than one language; a number of them speak and understand non-English languages more fluently than English. This dataset, derived from the Census Bureau's American Community Survey (ACS), includes information on over 1.7 million limited English proficient (LEP) residents and a subset of that population called limited English proficient citizens of voting age (CVALEP) at the Community District level. There are 59 community districts throughout NYC, with each district being represented by a Community Board.
Global Urban Rural Catchment Areas (URCA) Grid - 2021
data.amerigeoss.org
http, png, tif, wms
Updated Mar 5, 2022
Food and Agriculture Organization (2022). Global Urban Rural Catchment Areas (URCA) Grid - 2021 [Dataset]. https://data.amerigeoss.org/dataset/9dc31512-a438-4b59-acfd-72830fbd6943
Available download formats: wms, png, http, tif
Dataset updated
Mar 5, 2022
Dataset provided by
Food and Agriculture Organization (http://fao.org/)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Global Urban-Rural Catchment Areas (URCA) grid is a raster dataset of the 30 urban-rural continuum categories of catchment areas for cities and towns. Each rural pixel is assigned to one defined travel time category: less than one hour, one to two hours, or two to three hours of travel time to one of seven urban agglomeration sizes. The agglomerations range from large cities with i) populations greater than 5 million and ii) between 1 and 5 million; intermediate cities with iii) 500,000 to 1 million and iv) 250,000 to 500,000 inhabitants; small cities with populations v) between 100,000 and 250,000 and vi) between 50,000 and 100,000; and vii) towns of between 20,000 and 50,000 people. The remaining pixels, which are more than 3 hours away from any urban agglomeration of at least 20,000 people, are considered either hinterland or dispersed towns, because they do not gravitate around any urban agglomeration.
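The size classes and travel time bands described above can be sketched as two small lookup functions. This is a minimal illustration, not the FAO implementation: the function names are hypothetical, and the treatment of values that fall exactly on a boundary (for example a population of exactly 1 million, or a travel time of exactly one hour) is an assumption, since the description does not specify it.

```python
from typing import Optional

# Lower bounds for classes ii) to vii), largest first.
# Assumption: each lower bound is inclusive.
_LOWER_BOUNDS = [
    (1_000_000, "ii"),
    (500_000, "iii"),
    (250_000, "iv"),
    (100_000, "v"),
    (50_000, "vi"),
    (20_000, "vii"),
]


def agglomeration_class(population: int) -> Optional[str]:
    """Return the size class (i to vii), or None below 20,000 people."""
    if population > 5_000_000:
        return "i"  # "greater than 5 million" is strict in the description
    for lower_bound, label in _LOWER_BOUNDS:
        if population >= lower_bound:
            return label
    return None


def travel_time_category(hours: float) -> Optional[str]:
    """Return the travel time band, or None when more than 3 hours away."""
    if hours < 1:
        return "less than one hour"
    if hours < 2:
        return "one to two hours"
    if hours <= 3:
        return "two to three hours"
    return None  # hinterland or dispersed towns in the description


if __name__ == "__main__":
    print(agglomeration_class(750_000), travel_time_category(1.5))
```

Seven size classes combined with three travel time bands give 21 possible rural combinations, which is why a pixel needs both values to be placed on the continuum.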

Data publication: 2021-01-01

Contact points:

Metadata contact: Theresa McMenomy FAO-UN

Contact: Andrea Cattaneo FAO-UN

Contact: Theresa McMenomy FAO-UN

Data lineage:

The dataset is from https://doi.org/10.1073/pnas.2011990118 and http://dx.doi.org/10.6084/m9.figshare.12579572

Resource constraints:

CC BY 4.0

Online resources:

Urban-rural continuum dataset download

urban_rural_catchment_areas.tif
2020 Census Block Groups Top 50 American Community Survey Data with Seattle...
hub.arcgis.com
data.seattle.gov
Updated Feb 6, 2024
City of Seattle ArcGIS Online (2024). 2020 Census Block Groups Top 50 American Community Survey Data with Seattle Neighborhoods [Dataset]. https://hub.arcgis.com/datasets/ff59dc88bfab4eb3bc4cd11eaf67ec2a
Dataset updated
Feb 6, 2024
Dataset authored and provided by
City of Seattle ArcGIS Online
Description
U.S. Census Bureau 2020 block groups within the City of Seattle with American Community Survey (ACS) 5-year series data of frequently requested topics. Data is pulled from block group tables for the most recent ACS vintage. Seattle neighborhood geographies (Council Districts and Comprehensive Plan Growth Areas) are also included based on block group assignment.

The census block groups have been assigned to a neighborhood based on the distribution of the total population from the 2020 decennial census for the component census blocks. If the majority of the population in the block group was inside the boundaries of the neighborhood, the block group was assigned wholly to that neighborhood.

Feature layer created for and used in the Neighborhood Profiles application.

The attribute data associated with this map is updated annually to contain the most currently released American Community Survey (ACS) 5-year data and contains estimates and margins of error. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.

Vintages: 2023
ACS Table(s): Select fields from the tables listed here.
Data downloaded from: Census Bureau's Explore Census Data

The United States Census Bureau's American Community Survey (ACS): About the Survey, Geography & ACS, Technical Documentation, News & Updates

This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data.

Data Note from the Census:

Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error.
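The block group assignment rule in the Seattle description (a block group goes wholly to the neighborhood that holds the majority of its 2020 block-level population) can be sketched as follows. The function name and the input shape are hypothetical, and returning None when no neighborhood holds a strict majority is an assumption, because the description does not say how such cases were handled.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def assign_block_group(blocks: List[Tuple[str, int]]) -> Optional[str]:
    """Assign one block group to a neighborhood by population majority.

    blocks: one (neighborhood, population) pair per component census block.
    Returns the neighborhood holding more than half of the population,
    or None if there is no strict majority (assumed behaviour).
    """
    totals: Dict[str, int] = defaultdict(int)
    for neighborhood, population in blocks:
        totals[neighborhood] += population

    total_population = sum(totals.values())
    if total_population == 0:
        return None  # nothing to base the assignment on

    best, best_population = max(totals.items(), key=lambda item: item[1])
    if best_population * 2 > total_population:
        return best
    return None


if __name__ == "__main__":
    # Placeholder neighborhood names and populations, for illustration only.
    sample = [("Neighborhood A", 300), ("Neighborhood B", 100), ("Neighborhood A", 50)]
    print(assign_block_group(sample))
```

Summing populations per neighborhood before comparing means that several small blocks in the same neighborhood count together toward the majority.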
Household Types and Populations - Seattle Neighborhoods
arc-gis-hub-home-arcgishub.hub.arcgis.com
data.seattle.gov
Updated Feb 16, 2024
City of Seattle ArcGIS Online (2024). Household Types and Populations - Seattle Neighborhoods [Dataset]. https://arc-gis-hub-home-arcgishub.hub.arcgis.com/datasets/SeattleCityGIS::household-types-and-populations-seattle-neighborhoods/explore
Dataset updated
Feb 16, 2024
Dataset authored and provided by
City of Seattle ArcGIS Online
Area covered
Seattle
Description
Table from the American Community Survey (ACS) 5-year series on household types and population related topics for City of Seattle Council Districts, Comprehensive Plan Growth Areas and Community Reporting Areas. Table includes B11003 Family Type by Presence and Age of Own Children under 18 Years, B11005 Households by Presence of People Under 18 Years by Household Type, B11007 Households by Presence of People 65 Years and Over by Household Type, B11001 Household Type (Including Living Alone), B11002 Household Type by Relatives and Nonrelatives for Population in Households, B25003 Tenure, B25008 Total Population in Occupied Housing Units by Tenure, B09019 Household Type (Including Living Alone) by Relationship. Data is pulled from block group tables for the most recent ACS vintage and summarized to the neighborhoods based on block group assignment.

Table created for and used in the Neighborhood Profiles application.

Vintages: 2023
ACS Table(s): B11003, B11005, B11007, B11001, B11002, B25003, B25008, B09019
Data downloaded from: Census Bureau's Explore Census Data

The United States Census Bureau's American Community Survey (ACS): About the Survey, Geography & ACS, Technical Documentation, News & Updates

This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data.

Data Note from the Census:

Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value.
Population estimates on July 1, by age and gender
www150.statcan.gc.ca
open.canada.ca
Updated Sep 25, 2024
Government of Canada, Statistics Canada (2024). Population estimates on July 1, by age and gender [Dataset]. http://doi.org/10.25318/1710000501-eng
Unique identifier
https://doi.org/10.25318/1710000501-eng
Dataset updated
Sep 25, 2024
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
Area covered
Canada
Description
Estimated number of persons on July 1, by 5-year age groups and gender, and median age, for Canada, provinces and territories.
Consuming urban poverty survey 2016-2017 - Dataset - B2FIND
b2find.eudat.eu
Updated Jan 16, 2015
(2015). Consuming urban poverty survey 2016-2017 - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/fcf62459-6d80-53f5-97f0-66fabc9198a8
Dataset updated
Jan 16, 2015
Description
The Consuming Urban Poverty (CUP) project - based at the University of Cape Town’s African Centre for Cities - sought to generate an understanding of the connections between poverty, governance, urban space, and food. CUP research focused on secondary cities in three countries: Kisumu, Kenya; Kitwe, Zambia; and Epworth, Zimbabwe.

The research included three quantitative surveys: a retail mapping exercise, a food vendor and retailer survey, and a household survey. Over 2,200 households and 1,200 food retailers were interviewed (between April 2016 and February 2017) in the three secondary cities. In addition, nearly 4,500 traders were mapped as part of a retailer census in these cities. The surveys examined the nature of the urban food system and the experience of food poverty. Qualitative in-depth interviews were also carried out in households across the three cities. A qualitative reverse value chain assessment was also undertaken, which traced five key food items (aligned to the food groups of protein, staple, vegetable, traditional food item and snack food) from the point of consumption to origin (or a point where no further information was available) in each city.

Urban areas in sub-Saharan Africa are growing rapidly. While there has been considerable attention paid to the challenges of African mega-cities, the experiences of smaller urban areas have been relatively neglected. Secondary cities, with populations of less than half a million, are absorbing two-thirds of all urban population growth in Africa. This project focuses on three such cities to build a clearer picture of the dynamics of poverty in these kinds of urban spaces and to provide information and insights which can address poverty reduction. Poverty cannot be understood or addressed by focusing on poor individuals or households alone. Rather it needs to be understood as having many intersecting drivers operating at a range of scales, from the individual, to the neighbourhood, to the city and beyond.
Nor can it be understood or addressed by focusing on governance, infrastructure or economic growth, alone. The challenge of this project is to understand the dynamic connections between poverty, governance and urban spaces. We argue that the study of food is a powerful lens to understand these connections. As Carolyn Steel writes, "In order to understand cities properly, we need to look at them through food". The project therefore asks the central question: What does the urban food system in three secondary cities in Africa reveal about the dynamics of urban poverty and its governance, and what are the lessons for generic poverty reduction? There are significant gaps in knowledge about African urban growth and urban poverty. This project therefore consolidates existing survey and census data to understand patterns and trends of urbanization and poverty in the three case study countries and cities. Because there are data gaps, we will also use remote sensing to generate new data on the spread of urban areas. This information provides the basis for general statements to be made about urban poverty, and for poverty reduction strategies generated in the project to be assessed against a broader representation of poverty. The project turns its focus to food as a way to understand the connections between poverty, governance and urban space. It will conduct a survey in each of three cities to assess how many households, and what kinds of households and individuals, are unable to get enough safe and nutritious food. Poor nutrition is an important indicator and driver of poverty. Most work on food poverty has focused on the household scale alone. This project argues that if food poverty, and poverty more generally, is to be addressed, it will be necessary to take a broader view and look at the food system. The food system in these cities is shifting rapidly as the supermarket sector increases and the flows of food become more global. 
This project assesses these changes by mapping the food retail environment, interviewing key people involved in the food system, and analysing policy in order to test the impact of a changing food system on food poverty, and what appropriate governance responses might be. The project therefore scans the globe for useful precedents in addressing urban poverty through strategic planning of, and interventions in, the urban food system. Throughout the project the focus will be on working with local governments, NGOs and civil society organisations to generate local solutions that are adaptable to multiple contexts. The outputs from this project are designed to have both practical and academic impacts. Policy impact will be generated by policy briefs and city reports that support the workshops to be held with municipal officials and policy makers. These will be translated into popular media resources to raise public awareness. Reports addressing urbanization, poverty and governance at a wider scale will be produced. These will be disseminated at major urban events and included in university curricula. Peer-reviewed academic publications will be produced in order to influence academic debates. Two questionnaires were used in the survey, one for retailers and one for households. A retailer mapping questionnaire was used in the mapping of a census of retailers in the survey cities.
20 Richest Counties in Illinois
illinois-demographics.com
Updated Jun 20, 2024
Kristen Carney (2024). 20 Richest Counties in Illinois [Dataset]. https://www.illinois-demographics.com/counties_by_population
Dataset updated
Jun 20, 2024
Dataset provided by
Cubit Planning, Inc.
Authors
Kristen Carney
License
https://www.illinois-demographics.com/terms_and_conditions
Area covered
Illinois
Description
A dataset listing Illinois counties by population for 2024.
R2 & NE: County Level 2006-2010 ACS Population Summary
data.wu.ac.at
tgrshp (compressed)
Updated Jan 13, 2018
U.S. Environmental Protection Agency (2018). R2 & NE: County Level 2006-2010 ACS Population Summary [Dataset]. https://data.wu.ac.at/schema/data_gov/NWExNzExZGQtYmJkMi00YTEzLWExYWQtODI0NTRkYmRkNmIx
Available download formats: tgrshp (compressed)
Dataset updated
Jan 13, 2018
Dataset provided by
U.S. Environmental Protection Agency
License
U.S. Government Works (https://www.usa.gov/government-works)
License information was derived automatically
Description
The TIGER/Line Files are shapefiles and related database files (.dbf) that are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line File is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The primary legal divisions of most States are termed counties. In Louisiana, these divisions are known as parishes. In Alaska, which has no counties, the equivalent entities are the organized boroughs, city and boroughs, and municipalities, and for the unorganized area, census areas. The latter are delineated cooperatively for statistical purposes by the State of Alaska and the Census Bureau. In four States (Maryland, Missouri, Nevada, and Virginia), there are one or more incorporated places that are independent of any county organization and thus constitute primary divisions of their States. These incorporated places are known as independent cities and are treated as equivalent entities for purposes of data presentation. The District of Columbia and Guam have no primary divisions, and each area is considered an equivalent entity for purposes of data presentation. The Census Bureau treats the following entities as equivalents of counties for purposes of data presentation: Municipios in Puerto Rico, Districts and Islands in American Samoa, Municipalities in the Commonwealth of the Northern Mariana Islands, and Islands in the U.S. Virgin Islands. The entire area of the United States, Puerto Rico, and the Island Areas is covered by counties or equivalent entities. The 2010 Census boundaries for counties and equivalent entities are as of January 1, 2010, primarily as reported through the Census Bureau's Boundary and Annexation Survey (BAS).

This table contains data on race, age, sex, and marital status from the American Community Survey 2006-2010 database for counties. The American Community Survey (ACS) is a household survey conducted by the U.S. Census Bureau that currently has an annual sample size of about 3.5 million addresses. ACS estimates provide communities with the current information they need to plan investments and services. Information from the survey generates estimates that help determine how more than $400 billion in federal and state funds are distributed annually. Each year the survey produces data that cover the periods of 1-year, 3-year, and 5-year estimates for geographic areas in the United States and Puerto Rico, ranging from neighborhoods to Congressional districts to the entire nation. This table also has a companion table (same table name with MOE suffix) with the margin of error (MOE) values for each estimated element. MOE is expressed as a measure value for each estimated element, so a value of 25 and an MOE of 5 means 25 +/- 5 (an interval from 20 to 30). There are also special cases of MOE. An MOE of -1 means the associated estimates do not have a measured error. An MOE of 0 means that error calculation is not appropriate for the associated value. An MOE of 109 is set whenever an estimate value is 0. The MOEs of aggregated elements and percentages must be calculated. This process means using standard error calculations as described in "American Community Survey Multiyear Accuracy of the Data (3-year 2008-2010 and 5-year 2006-2010)". Also, following Census guidelines, aggregated MOEs do not use more than one zero-estimate MOE (109), to prevent overestimation of the error. Due to the complexity of the calculations, some percentage MOEs cannot be calculated (these are set to null in the summary-level MOE tables).
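The aggregation rule described above can be sketched in Python. The sketch uses the root-sum-of-squares approximation that the Census Bureau documents for the margin of error of a sum, and counts the zero-estimate MOE (109) at most once. The function name is hypothetical, and rejecting negative MOE values (such as the -1 special case) with an error is an assumption about how to treat inputs the text marks as having no measured error.

```python
import math
from typing import Sequence

ZERO_ESTIMATE_MOE = 109  # MOE assigned when an estimate is 0, per the text


def aggregate_moe(estimates: Sequence[float], moes: Sequence[float]) -> float:
    """Approximate the MOE of a sum of estimates (root sum of squares).

    Only the first zero-valued estimate contributes its MOE, so repeated
    zero-estimate MOEs do not inflate the aggregated error.
    """
    if len(estimates) != len(moes):
        raise ValueError("estimates and moes must have the same length")

    sum_of_squares = 0.0
    zero_moe_used = False
    for estimate, moe in zip(estimates, moes):
        if moe < 0:
            # Assumption: special-case MOEs such as -1 are not aggregated here.
            raise ValueError("negative MOE values are special cases")
        if estimate == 0:
            if zero_moe_used:
                continue  # skip further zero-estimate MOEs
            zero_moe_used = True
        sum_of_squares += moe ** 2
    return math.sqrt(sum_of_squares)


if __name__ == "__main__":
    # Illustrative values only: two zero estimates, each carrying MOE 109.
    print(aggregate_moe([25, 0, 0], [5, ZERO_ESTIMATE_MOE, ZERO_ESTIMATE_MOE]))
```

Percentage MOEs follow a different formula and are not covered by this sketch.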

The name for table 'ACS10POPCNTYMOE' was added as a prefix to all field names imported from that table. Be sure to turn off 'Show Field Aliases' to see complete field names in the Attribute Table of this feature layer. This can be done in the 'Table Options' drop-down menu in the Attribute Table or with key sequence '[CTRL]+[SHIFT]+N'. Due to database restrictions, the prefix may have been abbreviated if the field name exceeded the maximum allowed characters.

Dakota McCoy; Benjamin Goulet-Scott; Weilin Meng; Bulent Atahan; Hana Kiros; Misako Nishino; John Kartesz (2022). A dataset of 5 million city trees from 63 US cities: species, location, nativity status, health, and more. [Dataset]. http://doi.org/10.5061/dryad.2jm63xsrf

A dataset of 5 million city trees from 63 US cities: species, location, nativity status, health, and more.

Available download formats: zip

Unique identifier

https://doi.org/10.5061/dryad.2jm63xsrf

Dataset updated

Aug 31, 2022

Dataset provided by

Stanford University
Cornell University
Harvard University
Worcester Polytechnic Institute
The Biota of North America Program (BONAP)

Authors

Dakota McCoy; Benjamin Goulet-Scott; Weilin Meng; Bulent Atahan; Hana Kiros; Misako Nishino; John Kartesz

License

https://spdx.org/licenses/CC0-1.0.html

Area covered

United States

Description

Sustainable cities depend on urban forests. City trees -- a pillar of urban forests -- improve our health, clean the air, store CO2, and cool local temperatures. Comparatively less is known about urban forests as ecosystems, particularly their spatial composition, nativity statuses, biodiversity, and tree health. Here, we assembled and standardized a new dataset of N=5,660,237 trees from 63 of the largest US cities. The data comes from tree inventories conducted at the level of cities and/or neighborhoods. Each data sheet includes detailed information on tree location, species, nativity status (whether a tree species is naturally occurring or introduced), health, size, whether it is in a park or urban area, and more (comprising 28 standardized columns per datasheet). This dataset could be analyzed in combination with citizen-science datasets on bird, insect, or plant biodiversity; social and demographic data; or data on the physical environment. Urban forests offer a rare opportunity to intentionally design biodiverse, heterogenous, rich ecosystems. Methods See eLife manuscript for full details. Below, we provide a summary of how the dataset was collected and processed.

Data Acquisition

We limited our search to the 150 largest cities in the USA (by census population). To acquire raw data on street tree communities, we used a search protocol on both Google and Google Datasets Search (https://datasetsearch.research.google.com/). We first searched the city name plus each of the following: street trees, city trees, tree inventory, urban forest, and urban canopy (all combinations totaled 20 searches per city, 10 each in Google and Google Datasets Search). We then read the first page of Google results and the top 20 results from Google Datasets Search. If the same named city in the wrong state appeared in the results, we redid the 20 searches adding the state name. If no data were found, we contacted a relevant state official via email or phone with an inquiry about their street tree inventory. Datasheets were received and transformed to .csv format (if they were not already in that format). We received data on street trees from 64 cities. One city, El Paso, had data only in summary format and was therefore excluded from analyses.

Data Cleaning

All code used is in the zipped folder Data S5 in the eLife publication. Before cleaning the data, we ensured that all reported trees for each city were located within the greater metropolitan area of the city (for certain inventories, many suburbs were reported - some within the greater metropolitan area, others not).

First, we renamed all columns in the received .csv sheets, referring to the metadata and according to our standardized definitions (Table S4). To harmonize tree health and condition data across different cities, we inspected metadata from the tree inventories and converted all numeric scores to a descriptive scale including “excellent”, “good”, “fair”, “poor”, “dead”, and “dead/dying”. Some cities included only three points on this scale (e.g., “good”, “poor”, “dead/dying”) while others included five (e.g., “excellent”, “good”, “fair”, “poor”, “dead”).

Second, we used pandas in Python (W. McKinney & Others, 2011) to correct typos, non-ASCII characters, variable spellings, date format, units used (we converted all units to metric), address issues, and common name format. In some cases, units were not specified for tree diameter at breast height (DBH) and tree height; we determined the units based on typical sizes for trees of a particular species. Wherever diameter was reported, we assumed it was DBH. We standardized health and condition data across cities, preserving the highest granularity available for each city. For our analysis, we converted this variable to a binary (see section Condition and Health). We created a column called “location_type” to label whether a given tree was growing in the built environment or in green space. All of the changes we made, and decision points, are preserved in Data S9.

Third, we checked the scientific names reported using gnr_resolve in the R library taxize (Chamberlain & Szöcs, 2013), with the option Best_match_only set to TRUE (Data S9). Through an iterative process, we manually checked the results and corrected typos in the scientific names until all names were either a perfect match (n=1771 species) or partial match with threshold greater than 0.75 (n=453 species). BGS manually reviewed all partial matches to ensure that they were the correct species name, and then we programmatically corrected these partial matches (for example, Magnolia grandifolia -- which is not a species name of a known tree -- was corrected to Magnolia grandiflora, and Pheonix canariensus was corrected to its proper spelling of Phoenix canariensis). Because many of these tree inventories were crowd-sourced or generated in part through citizen science, such typos and misspellings are to be expected.

Some tree inventories reported species by common names only. Therefore, our fourth step in data cleaning was to convert common names to scientific names. We generated a lookup table by summarizing all pairings of common and scientific names in the inventories for which both were reported. We manually reviewed the common to scientific name pairings, confirming that all were correct. Then we programmatically assigned scientific names to all common names (Data S9).

Fifth, we assigned native status to each tree through reference to the Biota of North America Project (Kartesz, 2018), which has collected data on all native and non-native species occurrences throughout the US states. Specifically, we determined whether each tree species in a given city was native to that state, not native to that state, or that we did not have enough information to determine nativity (for cases where only the genus was known).

Sixth, some cities reported only the street address but not latitude and longitude. For these cities, we used the OpenCageGeocoder (https://opencagedata.com/) to convert addresses to latitude and longitude coordinates (Data S9). OpenCageGeocoder leverages open data and is used by many academic institutions (see https://opencagedata.com/solutions/academia).

Seventh, we trimmed each city dataset to include only the standardized columns we identified in Table S4. After each stage of data cleaning, we performed manual spot checking to identify any issues.
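The third and fourth cleaning steps (correcting misspelled scientific names above a 0.75 match threshold, then filling in scientific names from a common-name lookup table) can be sketched in Python. This is not the authors' code: they used gnr_resolve from the R library taxize, and the sketch swaps in Python's standard difflib string similarity purely as a stand-in, so its scores are not comparable to taxize scores. All function names and the short list of accepted names are hypothetical.

```python
import difflib
from typing import Dict, Iterable, List, Optional, Tuple

MATCH_THRESHOLD = 0.75  # threshold quoted in the text; applied here to difflib ratios


def correct_scientific_name(name: str, accepted: List[str]) -> Optional[str]:
    """Return the exact or closest accepted name, or None below the threshold."""
    if name in accepted:
        return name  # perfect match
    matches = difflib.get_close_matches(name, accepted, n=1, cutoff=MATCH_THRESHOLD)
    return matches[0] if matches else None


def build_common_name_lookup(
    pairs: Iterable[Tuple[Optional[str], Optional[str]]]
) -> Dict[str, str]:
    """Build common -> scientific lookup from records that report both names."""
    lookup: Dict[str, str] = {}
    for common, scientific in pairs:
        if common and scientific:
            # Keep the first pairing seen; the authors reviewed pairings manually.
            lookup.setdefault(common.strip().lower(), scientific)
    return lookup


def fill_scientific_name(
    common: Optional[str], scientific: Optional[str], lookup: Dict[str, str]
) -> Optional[str]:
    """Keep a reported scientific name, else look it up by common name."""
    if scientific:
        return scientific
    if common:
        return lookup.get(common.strip().lower())
    return None


if __name__ == "__main__":
    accepted_names = ["Magnolia grandiflora", "Phoenix canariensis", "Quercus alba"]
    print(correct_scientific_name("Magnolia grandifolia", accepted_names))
    lookup_table = build_common_name_lookup([("Southern magnolia", "Magnolia grandiflora")])
    print(fill_scientific_name("southern magnolia", None, lookup_table))
```

In the workflow described above, partial matches were reviewed by hand before being applied, so an automated match like this would only propose corrections rather than finalize them.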

A dataset of 5 million city trees from 63 US cities: species, location,...

Illinois Cities by Population

Florida Cities by Population

Georgia Cities by Population

California Annual Population and Growth Analysis Dataset: A Comprehensive...

About this dataset

Content

Inspiration

Recommended for further research

New York Cities by Population

Data from: Urban-rural continuum

Table_1_Does the population size of a city matter to its older adults’...

Focus on London - Population and Migration

World Cities Culture Report, 2022

Population estimates, quarterly

LANGUAGE SPOKEN AT HOME FOR THE POPULATION 5 YEARS AND OVER IN LIMITED...

Population and Languages of the Limited English Proficient (LEP) Speakers by...

Global Urban Rural Catchment Areas (URCA) Grid - 2021

2020 Census Block Groups Top 50 American Community Survey Data with Seattle...

Household Types and Populations - Seattle Neighborhoods

Population estimates on July 1, by age and gender

Consuming urban poverty survey 2016-2017 - Dataset - B2FIND

20 Richest Counties in Illinois

R2 & NE: County Level 2006-2010 ACS Population Summary
