This statistic shows the top 25 cities in the United States with the highest resident population as of July 1, 2022. There were about 8.34 million people living in New York City as of July 2022.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
All cities with a population > 1000 or seats of adm div (ca 80.000)
Sources and Contributions
Sources: GeoNames aggregates over a hundred different data sources.
Ambassadors: GeoNames Ambassadors help in many countries.
Wiki: A wiki allows users to view the data, quickly fix errors, and add missing places.
Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.
Enrichment: add country name
https://spdx.org/licenses/CC0-1.0.html
Sustainable cities depend on urban forests. City trees -- a pillar of urban forests -- improve our health, clean the air, store CO2, and cool local temperatures. Comparatively less is known about urban forests as ecosystems, particularly their spatial composition, nativity statuses, biodiversity, and tree health. Here, we assembled and standardized a new dataset of N=5,660,237 trees from 63 of the largest US cities. The data comes from tree inventories conducted at the level of cities and/or neighborhoods. Each data sheet includes detailed information on tree location, species, nativity status (whether a tree species is naturally occurring or introduced), health, size, whether it is in a park or urban area, and more (comprising 28 standardized columns per datasheet). This dataset could be analyzed in combination with citizen-science datasets on bird, insect, or plant biodiversity; social and demographic data; or data on the physical environment. Urban forests offer a rare opportunity to intentionally design biodiverse, heterogeneous, rich ecosystems.
Methods
See the eLife manuscript for full details. Below, we provide a summary of how the dataset was collected and processed.
Data Acquisition
We limited our search to the 150 largest cities in the USA (by census population). To acquire raw data on street tree communities, we used a search protocol on both Google and Google Datasets Search (https://datasetsearch.research.google.com/). We first searched the city name plus each of the following: street trees, city trees, tree inventory, urban forest, and urban canopy (all combinations totaled 20 searches per city, 10 each in Google and Google Datasets Search). We then read the first page of Google results and the top 20 results from Google Datasets Search. If the same named city in the wrong state appeared in the results, we redid the 20 searches adding the state name. If no data were found, we contacted a relevant state official via email or phone with an inquiry about their street tree inventory. Datasheets were received and transformed to .csv format (if they were not already in that format). We received data on street trees from 64 cities. One city, El Paso, had data only in summary format and was therefore excluded from analyses.
Data Cleaning
All code used is in the zipped folder Data S5 in the eLife publication. Before cleaning the data, we ensured that all reported trees for each city were located within the greater metropolitan area of the city (for certain inventories, many suburbs were reported - some within the greater metropolitan area, others not). First, we renamed all columns in the received .csv sheets, referring to the metadata and according to our standardized definitions (Table S4). To harmonize tree health and condition data across different cities, we inspected metadata from the tree inventories and converted all numeric scores to a descriptive scale including “excellent”, “good”, “fair”, “poor”, “dead”, and “dead/dying”. Some cities included only three points on this scale (e.g., “good”, “poor”, “dead/dying”) while others included five (e.g., “excellent”, “good”, “fair”, “poor”, “dead”). Second, we used pandas in Python (W. McKinney & Others, 2011) to correct typos, non-ASCII characters, variable spellings, date format, units used (we converted all units to metric), address issues, and common name format. In some cases, units were not specified for tree diameter at breast height (DBH) and tree height; we determined the units based on typical sizes for trees of a particular species. Wherever diameter was reported, we assumed it was DBH. We standardized health and condition data across cities, preserving the highest granularity available for each city. For our analysis, we converted this variable to a binary (see section Condition and Health). We created a column called “location_type” to label whether a given tree was growing in the built environment or in green space. All of the changes we made, and decision points, are preserved in Data S9. Third, we checked the scientific names reported using gnr_resolve in the R library taxize (Chamberlain & Szöcs, 2013), with the option Best_match_only set to TRUE (Data S9).
Through an iterative process, we manually checked the results and corrected typos in the scientific names until all names were either a perfect match (n=1771 species) or partial match with threshold greater than 0.75 (n=453 species). BGS manually reviewed all partial matches to ensure that they were the correct species name, and then we programmatically corrected these partial matches (for example, Magnolia grandifolia-- which is not a species name of a known tree-- was corrected to Magnolia grandiflora, and Pheonix canariensus was corrected to its proper spelling of Phoenix canariensis). Because many of these tree inventories were crowd-sourced or generated in part through citizen science, such typos and misspellings are to be expected. Some tree inventories reported species by common names only. Therefore, our fourth step in data cleaning was to convert common names to scientific names. We generated a lookup table by summarizing all pairings of common and scientific names in the inventories for which both were reported. We manually reviewed the common to scientific name pairings, confirming that all were correct. Then we programmatically assigned scientific names to all common names (Data S9). Fifth, we assigned native status to each tree through reference to the Biota of North America Project (Kartesz, 2018), which has collected data on all native and non-native species occurrences throughout the US states. Specifically, we determined whether each tree species in a given city was native to that state, not native to that state, or that we did not have enough information to determine nativity (for cases where only the genus was known). Sixth, some cities reported only the street address but not latitude and longitude. For these cities, we used the OpenCageGeocoder (https://opencagedata.com/) to convert addresses to latitude and longitude coordinates (Data S9). 
OpenCageGeocoder leverages open data and is used by many academic institutions (see https://opencagedata.com/solutions/academia). Seventh, we trimmed each city dataset to include only the standardized columns we identified in Table S4. After each stage of data cleaning, we performed manual spot checking to identify any issues.
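As a rough illustration of two of the cleaning steps described above (harmonizing numeric condition scores and filling in scientific names from a common-name lookup table), the following is a minimal sketch in plain Python. It is not the authors' code, which is in Data S5. The column names `common_name` and `scientific_name`, the helper names, and the particular five-point score mapping are assumptions made for this example; in the study, each city's mapping was taken from that inventory's metadata.

```python
# Hypothetical mapping from one city's numeric condition scores to the
# descriptive scale; real mappings depend on each inventory's metadata.
HEALTH_SCALE = {1: "dead", 2: "poor", 3: "fair", 4: "good", 5: "excellent"}


def harmonize_health(score):
    """Return the descriptive label for a numeric score, or None if unknown."""
    return HEALTH_SCALE.get(score)


def _clean(value):
    """Normalize a possibly missing text field to a stripped string."""
    return (value or "").strip()


def build_name_lookup(records):
    """Collect common -> scientific name pairs from rows that report both.

    The first pairing seen for a common name is kept; conflicting pairings
    would still need the manual review described in the text.
    """
    lookup = {}
    for row in records:
        common = _clean(row.get("common_name")).lower()
        scientific = _clean(row.get("scientific_name"))
        if common and scientific:
            lookup.setdefault(common, scientific)
    return lookup


def fill_scientific_names(records, lookup):
    """Return copies of the rows with missing scientific names filled in."""
    filled_rows = []
    for row in records:
        filled = dict(row)
        if not _clean(filled.get("scientific_name")):
            key = _clean(filled.get("common_name")).lower()
            filled["scientific_name"] = lookup.get(key)  # None if no match
        filled_rows.append(filled)
    return filled_rows
```

Rows whose common name has no entry in the lookup table are left with a scientific name of `None`, so they can be flagged for manual checking instead of being guessed.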
The purpose of this data package is to offer demographic data for U.S. cities. The data come from multiple sources, the most important being the U.S. Census Bureau's American Community Survey. In this case, the data was organized by the Big Cities Health Coalition (BCHC). Other sources are the New York City Department of City Planning and the Department of Parks and Recreation, whose data are available through NYC Open Data.
In the United States, city governments provide many services: they run public school districts, administer certain welfare and health programs, build roads and manage airports, provide police and fire protection, inspect buildings, and often run water and utility systems. Cities also get revenues through certain local taxes, various fees and permit costs, sale of property, and through the fees they charge for the utilities they run.
It would be interesting to compare all these expenses and revenues across cities and over time, but also quite difficult. Cities share many of these service responsibilities with other government agencies: in one particular city, some roads may be maintained by the state government, some law enforcement provided by the county sheriff, some schools run by independent school districts with their own tax revenue, and some utilities run by special independent utility districts. These governmental structures vary greatly by state and by individual city. It would be hard to make a fair comparison without taking into account all these differences.
This dataset takes into account all those differences. The Lincoln Institute of Land Policy produces what they call “Fiscally Standardized Cities” (FiSCs), aggregating all services provided to city residents regardless of how they may be divided up by different government agencies and jurisdictions. Using this, we can study city expenses and revenues, and how the proportions of different costs vary over time.
The dataset tracks over 200 American cities between 1977 and 2020. Each row represents one city for one year. Revenue and expenditures are broken down into more than 120 categories.
Values are available for each FiSC and also for the entities that make it up: the city, the county, independent school districts, and any special districts, such as utility districts. There are hence five versions of each variable, with suffixes indicating the entity. For example, taxes gives the FiSC’s tax revenue, while taxes_city, taxes_cnty, taxes_schl, and taxes_spec break it down for the city, county, school districts, and special districts.
The values are organized hierarchically. For example, taxes is the sum of tax_property (property taxes), tax_sales_general (sales taxes), tax_income (income tax), and tax_other (other taxes). And tax_income is itself the sum of tax_income_indiv (individual income tax) and tax_income_corp (corporate income tax) subcategories.
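The naming scheme and hierarchy described above can be sketched in a few lines of Python. This is only an illustration: the variable names and suffixes are the ones quoted in the text, while the helper functions and the sample row (with made-up values) are hypothetical.

```python
import math

# Suffixes quoted in the description: FiSC total, city, county,
# school districts, special districts.
ENTITY_SUFFIXES = ["", "_city", "_cnty", "_schl", "_spec"]

# Components that the description says add up to `taxes`.
TAX_COMPONENTS = ["tax_property", "tax_sales_general", "tax_income", "tax_other"]


def entity_columns(variable):
    """List the five column names that exist for one variable."""
    return [variable + suffix for suffix in ENTITY_SUFFIXES]


def taxes_add_up(row, rel_tol=1e-6):
    """Check that `taxes` equals the sum of its listed components."""
    total = sum(row[name] for name in TAX_COMPONENTS)
    return math.isclose(row["taxes"], total, rel_tol=rel_tol)


# Made-up per-capita values for a single city-year, for illustration only.
sample_row = {
    "taxes": 2000.0,
    "tax_property": 1000.0,
    "tax_sales_general": 500.0,
    "tax_income": 300.0,
    "tax_other": 200.0,
}

print(entity_columns("taxes"))
print(taxes_add_up(sample_row))
```

A tolerance-based comparison is used because published values are typically rounded, so an exact equality test could fail on real rows.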
The revenue and expenses variables are described in this detailed table. Further documentation is available on the FiSC Database website, linked in References below.
All monetary data is already adjusted for inflation, and is given in terms of 2020 US dollars per capita. The Consumer Price Index is provided for each year if you prefer to use numbers not adjusted for inflation, scaled so that 2020 is 1; simply divide each value by the CPI to get the value in that year’s nominal dollars. The total population is also provided if you want total values instead of per-capita values.
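The two conversions mentioned above can be written out as a short sketch. It follows the rule exactly as stated in the description (divide by the CPI to get nominal dollars, multiply by population to get totals). The function names and the sample numbers are assumptions for illustration, and it is worth checking the direction of the CPI scaling against the FiSC documentation before relying on it.

```python
def to_nominal(value_2020_dollars, cpi):
    """Convert an inflation-adjusted value to that year's nominal dollars,
    following the description's rule: divide by the year's CPI factor."""
    return value_2020_dollars / cpi


def to_total(per_capita_value, population):
    """Convert a per-capita value to a total for the whole city."""
    return per_capita_value * population


# Made-up numbers for one city-year, for illustration only.
per_capita_2020 = 1000.0   # 2020 dollars per capita
cpi_factor = 4.0           # hypothetical CPI factor for that year
population = 100_000       # hypothetical total population

nominal_per_capita = to_nominal(per_capita_2020, cpi_factor)
nominal_total = to_total(nominal_per_capita, population)
print(nominal_per_capita, nominal_total)
```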
This data set includes cities in the United States, Puerto Rico and the U.S. Virgin Islands. These cities were collected from the 1970 National Atlas of the United States. Where applicable, U.S. Census Bureau codes for named populated places were associated with each name to allow additional information to be attached. The Geographic Names Information System (GNIS) was also used as a source for additional information. This is a revised version of the December, 2003, data set.
This layer is sourced from maps.bts.dot.gov.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dynamic social media content, such as Twitter messages, can be used to examine individuals’ beliefs and perceptions. By analyzing Twitter messages, this study examines how Twitter users exchanged and recognized toponyms (city names) for different cities in the United States. The frequency and variety of city names found in their online conversations were used to identify the unique spatiotemporal patterns of “geographical awareness” for Twitter users. A new analytic method, Knowledge Discovery in Cyberspace for Geographical Awareness (KDCGA), is introduced to help identify the dynamic spatiotemporal patterns of geographic awareness among social media conversations. Twitter data were collected across 50 U.S. cities. Thousands of city names around the world were extracted from a large volume of Twitter messages (over 5 million tweets) by using the Twitter Application Programming Interface (APIs) and Python language computer programs. The percentages of distant city names (cities located in distant states or other countries far away from the locations of Twitter users) were used to estimate the level of global geographical awareness for Twitter users in each U.S. city. A Global awareness index (GAI) was developed to quantify the level of geographical awareness of Twitter users from within the same city. Our findings are that: (1) the level of geographical awareness varies depending on when and where Twitter messages are posted, yet Twitter users from big cities are more aware of the names of international cities or distant US cities than users from mid-size cities; (2) Twitter users have an increased awareness of other city names far away from their home city during holiday seasons; and (3) Twitter users are more aware of nearby city names than distant city names, and more aware of big city names rather than small city names.
This city boundary shapefile was extracted from Esri Data and Maps for ArcGIS 2014 - U.S. Populated Place Areas. This shapefile can be joined to 500 Cities city-level Data (GIS Friendly Format) in a geographic information system (GIS) to make city-level maps.
VITAL SIGNS INDICATOR Population (LU1)
FULL MEASURE NAME Population estimates
LAST UPDATED October 2019
DESCRIPTION Population is a measurement of the number of residents that live in a given geographical area, be it a neighborhood, city, county or region.
DATA SOURCES U.S. Census Bureau: Decennial Census No link available (1960-1990) http://factfinder.census.gov (2000-2010)
California Department of Finance: Population and Housing Estimates Table E-6: County Population Estimates (1961-1969) Table E-4: Population Estimates for Counties and State (1971-1989) Table E-8: Historical Population and Housing Estimates (2001-2018) Table E-5: Population and Housing Estimates (2011-2019) http://www.dof.ca.gov/Forecasting/Demographics/Estimates/
U.S. Census Bureau: Decennial Census - via Longitudinal Tract Database Spatial Structures in the Social Sciences, Brown University Population Estimates (1970 - 2010) http://www.s4.brown.edu/us2010/index.htm
U.S. Census Bureau: American Community Survey 5-Year Population Estimates (2011-2017) http://factfinder.census.gov
U.S. Census Bureau: Intercensal Estimates Estimates of the Intercensal Population of Counties (1970-1979) Intercensal Estimates of the Resident Population (1980-1989) Population Estimates (1990-1999) Annual Estimates of the Population (2000-2009) Annual Estimates of the Population (2010-2017) No link available (1970-1989) http://www.census.gov/popest/data/metro/totals/1990s/tables/MA-99-03b.txt http://www.census.gov/popest/data/historical/2000s/vintage_2009/metro.html https://www.census.gov/data/datasets/time-series/demo/popest/2010s-total-metro-and-micro-statistical-areas.html
CONTACT INFORMATION vitalsigns.info@bayareametro.gov
METHODOLOGY NOTES (across all datasets for this indicator) All legal boundaries and names for Census geography (metropolitan statistical area, county, city, and tract) are as of January 1, 2010, released beginning November 30, 2010, by the U.S. Census Bureau. A Priority Development Area (PDA) is a locally-designated area with frequent transit service, where a jurisdiction has decided to concentrate most of its housing and jobs growth for development in the foreseeable future. PDA boundaries are current as of August 2019. For more information on PDA designation see http://gis.abag.ca.gov/website/PDAShowcase/.
Population estimates for Bay Area counties and cities are from the California Department of Finance and are as of January 1st of each year. Population estimates for non-Bay Area regions are from the U.S. Census Bureau. Decennial Census years reflect population as of April 1st of each year, whereas intercensal population estimates are as of July 1st of each year. Population estimates for Bay Area tracts are from the decennial Census (1970-2010) and the American Community Survey (2008-2012 5-year rolling average; 2010-2014 5-year rolling average; 2013-2017 5-year rolling average). Estimates of population density for tracts use gross acres as the denominator.
Population estimates for Bay Area PDAs are from the decennial Census (1970-2010) and the American Community Survey (2006-2010 5-year rolling average; 2010-2014 5-year rolling average; 2013-2017 5-year rolling average). Population estimates for PDAs are derived from Census population counts at the tract level for 1970-1990 and at the block group level for 2000-2017. Population from either tracts or block groups is allocated to a PDA using an area ratio. For example, if a quarter of a Census block group lies within a PDA, a quarter of its population will be allocated to that PDA. Tract-to-PDA and block group-to-PDA area ratios are calculated using gross acres. Estimates of population density for PDAs use gross acres as the denominator.
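The area-ratio allocation described above amounts to a weighted sum. The following is a minimal sketch under the assumption that, for each tract or block group, the population, its gross acres, and the gross acres falling inside the PDA are already known; the function names and input layout are hypothetical.

```python
def allocate_population(units):
    """Estimate a PDA's population from overlapping census units.

    `units` is an iterable of (population, unit_acres, acres_inside_pda)
    tuples, one per tract or block group. Each unit contributes its
    population times the share of its gross acres inside the PDA.
    """
    total = 0.0
    for population, unit_acres, acres_inside in units:
        if unit_acres <= 0:
            continue  # skip degenerate geometries
        ratio = min(acres_inside / unit_acres, 1.0)  # cap at full overlap
        total += population * ratio
    return total


def density_per_acre(population, gross_acres):
    """Population density using gross acres as the denominator."""
    return population / gross_acres


# A block group with 2,000 residents, a quarter of which lies in the PDA,
# plus a block group with 1,000 residents lying entirely inside it.
example_units = [(2000, 100.0, 25.0), (1000, 50.0, 50.0)]
pda_population = allocate_population(example_units)
print(pda_population)                           # 1500.0
print(density_per_acre(pda_population, 75.0))   # 20.0
```

The method assumes residents are spread evenly across each census unit, which is the usual simplification behind an area-ratio allocation.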
Annual population estimates for metropolitan areas outside the Bay Area are from the Census and are benchmarked to each decennial Census. The annual estimates in the 1990s were not updated to match the 2000 benchmark.
The following is a list of cities and towns by geographical area:
Big Three: San Jose, San Francisco, Oakland
Bayside: Alameda, Albany, Atherton, Belmont, Belvedere, Berkeley, Brisbane, Burlingame, Campbell, Colma, Corte Madera, Cupertino, Daly City, East Palo Alto, El Cerrito, Emeryville, Fairfax, Foster City, Fremont, Hayward, Hercules, Hillsborough, Larkspur, Los Altos, Los Altos Hills, Los Gatos, Menlo Park, Mill Valley, Millbrae, Milpitas, Monte Sereno, Mountain View, Newark, Pacifica, Palo Alto, Piedmont, Pinole, Portola Valley, Redwood City, Richmond, Ross, San Anselmo, San Bruno, San Carlos, San Leandro, San Mateo, San Pablo, San Rafael, Santa Clara, Saratoga, Sausalito, South San Francisco, Sunnyvale, Tiburon, Union City, Vallejo, Woodside
Inland, Delta and Coastal: American Canyon, Antioch, Benicia, Brentwood, Calistoga, Clayton, Cloverdale, Concord, Cotati, Danville, Dixon, Dublin, Fairfield, Gilroy, Half Moon Bay, Healdsburg, Lafayette, Livermore, Martinez, Moraga, Morgan Hill, Napa, Novato, Oakley, Orinda, Petaluma, Pittsburg, Pleasant Hill, Pleasanton, Rio Vista, Rohnert Park, San Ramon, Santa Rosa, Sebastopol, Sonoma, St. Helena, Suisun City, Vacaville, Walnut Creek, Windsor, Yountville
Unincorporated: all unincorporated towns
In 2025, approximately 23 million people lived in the São Paulo metropolitan area, making it the biggest in Latin America and the Caribbean and the sixth most populated in the world. The homonymous state of São Paulo was also the most populous federal entity in the country. Second place in the region went to Mexico City, with 22.75 million inhabitants.
Brazil's cities
Brazil is home to two large metropolises: counting only the population within the city limits, São Paulo had approximately 11.45 million inhabitants and Rio de Janeiro around 6.21 million. It also contains a number of smaller but well-known cities, such as Brasília, Salvador, Belo Horizonte, and many others, which report between 2 and 3 million inhabitants each. As a result, the country's population is primarily urban, with nearly 88 percent of inhabitants living in cities.
Mexico City
Mexico City's metropolitan area ranks seventh among the most populated cities in the world. Founded over the Aztec city of Tenochtitlan in 1521 after the Spanish conquest as the capital of the Viceroyalty of New Spain, the city still stands as one of the most important in Latin America. Nevertheless, the preeminent economic, political, and cultural position of Mexico City has not prevented the metropolis from suffering the problems affecting the rest of the country, namely, inequality and violence. In 2023 alone, the city registered a crime incidence of 52,723 reported cases for every 100,000 inhabitants, and around 24 percent of the population lived under the poverty line.
WARNING: This is a pre-release dataset and its field names and data structures are subject to change. It should be considered pre-release until the end of 2024. Expected changes:
Purpose
County and incorporated place (city) boundaries along with third party identifiers used to join in external data. Boundaries are from the authoritative source, the California Department of Tax and Fee Administration (CDTFA), altered to show the counties as one polygon. This layer displays the city polygons on top of the County polygons so the area isn't interrupted. The GEOID attribute information is added from the US Census. GEOID is based on merged State and County FIPS codes for the Counties. Abbreviations for Counties and Cities were added from Caltrans Division of Local Assistance (DLA) data. Place Type was populated with information extracted from the Census. Names and IDs from the US Board on Geographic Names (BGN), the authoritative source of place names as published in the Geographic Name Information System (GNIS), are attached as well. Finally, coastal buffers are removed, leaving the land-based portions of jurisdictions. This feature layer is for public use.
Related Layers
This dataset is part of a grouping of many datasets:
Point of Contact
California Department of Technology, Office of Digital Services, odsdataservices@state.ca.gov
Field and Abbreviation Definitions
Accuracy
CDTFA's source data notes the following about accuracy:
City boundary changes and county boundary line adjustments filed with the Board of Equalization per Government Code 54900. This GIS layer contains the boundaries of the unincorporated county and incorporated cities within the state of California. The initial dataset was created in March of 2015 and was based on the State Board of Equalization tax rate area boundaries. As of April 1, 2024, the maintenance of this dataset is provided by the California Department of Tax and Fee Administration for the purpose of determining sales and use tax rates. The boundaries are continuously being revised to align with aerial imagery when areas of conflict are discovered between the original boundary provided by the California State Board of Equalization and the boundary made publicly available by local, state, and federal government. Some differences may occur between actual recorded boundaries and the boundaries used for sales and use tax purposes. The boundaries in this map are representations of taxing jurisdictions for the purpose of determining sales and use tax rates and should not be used to determine precise city or county boundary line locations. COUNTY = county name; CITY = city name or unincorporated territory; COPRI =
In 2023, the metropolitan area of New York-Newark-Jersey City had the biggest population in the United States. Based on annual estimates from the census, the metropolitan area had around 19.5 million inhabitants, which was a slight decrease from the previous year. The Los Angeles and Chicago metro areas rounded out the top three.
What is a metropolitan statistical area?
In general, a metropolitan statistical area (MSA) is a core urbanized area with a population of at least 50,000 inhabitants – the smallest MSA is Carson City, with an estimated population of nearly 56,000. The urban area is made bigger by adjacent communities that are socially and economically linked to the center. MSAs are particularly helpful in tracking demographic change over time in large communities and allow officials to see where the largest pockets of inhabitants are in the country.
How many MSAs are in the United States?
There were 421 metropolitan statistical areas across the U.S. as of July 2021. The largest city in each MSA is designated the principal city and will be the first name in the title. An additional two cities can be added to the title, and these will be listed in population order based on the most recent census. So, in the example of New York-Newark-Jersey City, New York has the highest population, while Jersey City has the lowest. The U.S. Census Bureau conducts an official population count every ten years, and the new count is expected to be announced by the end of 2030.
https://www.maine-demographics.com/terms_and_conditions
A dataset listing Maine cities by population for 2024.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Update Notes
Mar 16 2024, remove spaces in the file and folder names.
Mar 31 2024, delete the underscore in the city names with a space (such as San Francisco) in the '02_TransCAD_results' folder to ensure correct data loading by TransCAD (software version: 9.0).
Aug 31 2024, add the 'cityname_link_LinkFlows.csv' file in the '02_TransCAD_results' folder to match the link from input data and the link from TransCAD results (LinkFlows) with the same Link_ID.
Introduction
This is a unified and validated traffic dataset for 20 US cities. There are 3 folders for each city.
01 Input data
the initial network data obtained from OpenStreetMap (OSM)
the visualization of the OSM data
processed node / link / od data
02 TransCAD results (software version: 9.0)
cityname.dbd : geographical network database of the city supported by TransCAD (version 9.0)
cityname_link.shp / cityname_node.shp : network data supported by GIS software, which can be imported into TransCAD manually. The corresponding '.dbd' file can then be generated for TransCAD with a version lower than 9.0
od.mtx : OD matrix supported by TransCAD
LinkFlows.bin / LinkFlows.csv : traffic assignment results by TransCAD
cityname_link_LinkFlows.csv : the input link attributes with the traffic assignment results by TransCAD
ShortestPath.mtx / ue_travel_time.csv : the travel time (min) between OD pairs by TransCAD
03 AequilibraE results (software version: 0.9.3)
cityname.shp : shapefile network data of the city supported by QGIS or other GIS software
od_demand.aem : OD matrix supported by AequilibraE
network.csv : the network file used for traffic assignment in AequilibraE
assignment_result.csv : traffic assignment results by AequilibraE
Publication
Xu, X., Zheng, Z., Hu, Z. et al. (2024). A unified dataset for the city-scale traffic assignment model in 20 U.S. cities. Sci Data 11, 325. https://doi.org/10.1038/s41597-024-03149-8
Usage Notes
If you use this dataset in your research or any other work, please cite both the dataset and paper above.
A brief introduction to using this dataset can be found on GitHub. A more detailed illustration of compiling the traffic dataset with AequilibraE is available in the GitHub code or Colab code.
Contact
If you have any inquiries, please contact Xiaotong Xu (email: kid-a.xu@connect.polyu.hk).
The Street Name Dictionary (SND) contains street names and street codes for New York City. Street names (which include names of other geographic features as well) are associated to street codes. Alias street names and variant spellings are related through a street code hierarchy. All previously released versions of this data are available at BYTES of the BIG APPLE - Archive.
https://www.georgia-demographics.com/terms_and_conditions
A dataset listing Georgia cities by population for 2024.
https://www.washington-demographics.com/terms_and_conditions
A dataset listing Washington cities by population for 2024.
https://www.newyork-demographics.com/terms_and_conditions
A dataset listing New York cities by population for 2024.
This dataset contains a listing of incorporated places (cities and towns) and counties within the United States including the GNIS code, FIPS code, name, entity type and primary point (location) for the entity. The types of entities listed in this dataset are based on codes provided by the U.S. Census Bureau, and include the following:
C1 - An active incorporated place that does not serve as a county subdivision equivalent;
C2 - An active incorporated place legally coextensive with a county subdivision but treated as independent of any county subdivision;
C3 - A consolidated city;
C4 - An active incorporated place with an alternate official common name;
C5 - An active incorporated place that is independent of any county subdivision and serves as a county subdivision equivalent;
C6 - An active incorporated place that partially is independent of any county subdivision and serves as a county subdivision equivalent or partially coextensive with a county subdivision but treated as independent of any county subdivision;
C7 - An incorporated place that is independent of any county;
C8 - The balance of a consolidated city excluding the separately incorporated place(s) within that consolidated government;
C9 - An inactive or nonfunctioning incorporated place;
H1 - An active county or statistically equivalent entity;
H4 - A legally defined inactive or nonfunctioning county or statistically equivalent entity;
H5 - A census area in Alaska, a statistical county equivalent entity; and
H6 - A county or statistically equivalent entity that is areally coextensive or governmentally consolidated with an incorporated place, part of an incorporated place, or a consolidated city.
https://www.colorado-demographics.com/terms_and_conditions
A dataset listing Colorado cities by population for 2024.