MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
This is a very small but useful dataset if you are ever looking to get jobs for a certain US city in LinkedIn. It contains a list of US cities and states and it's corresponding LinkedIn ID (which is usually externally hidden).
The cities list was retreived from here: https://github.com/kelvins/US-Cities-Database and the names of the ciiadjusted to match the name used in LinkedIn (which could differ in subtle ways).
Some cities do not have an ID, this is because the city is either too small or because there was a difference in the name on LinkedIn which I did not detect (human error). If you ever run in to one of these feel free to enhance this dataset.
https://en.wikipedia.org/wiki/Public_domainhttps://en.wikipedia.org/wiki/Public_domain
This dataset is part of the Geographical repository maintained by Opendatasoft. This dataset contains data for places and equivalent entities in United States of America.This layer both incorporated places (legal entities) and census designated places or CDPs (statistical entities). An incorporated place is established to provide governmental functions for a concentration of people as opposed to a minor civil division (MCD), which generally is created to provide services or administer an area without regard, necessarily, to population. Places always nest within a state, but may extend across county and county subdivision boundaries. An incorporated place usually is a city, town, village, or borough, but can have other legal descriptions. CDPs are delineated for the decennial census as the statistical counterparts of incorporated places. CDPs are delineated to provide data for settled concentrations of population that are identifiable by name, but are not legally incorporated under the laws of the state in which they are located. The boundaries for CDPs often are defined in partnership with state, local, and/or tribal officials and usually coincide with visible features or the boundary of an adjacent incorporated place or another legal entity. CDP boundaries often change from one decennial census to the next with changes in the settlement pattern and development; a CDP with the same name as in an earlier census does not necessarily have the same boundary. The only population/housing size requirement for CDPs is that they must contain some housing and population. Processors and tools are using this data. Enhancements Add ISO 3166-3 codes. Simplify geometries to provide better performance across the services. Add administrative hierarchy.
Geospatial data about US Cities and Towns (Local). Export to CAD, GIS, PDF, CSV and access via API.
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
This is a cobbled together dataset of official U.S. city names and place names recognized by the U.S. Census Bureau. I originally used this in a spell checker for user input.
The names were collected in multiple stages using the US Census API and later combined into one dataset. There are about more than 48,000 city and place names in this dataset.
During the collection process, I learned that finding city names is not as straight forward as I thought. For example, some cities are "incorporated" and other areas that we think are cities, are actually considered "populated places".
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
All cities with a population > 1000 or seats of adm div (ca 80.000)Sources and ContributionsSources : GeoNames is aggregating over hundred different data sources. Ambassadors : GeoNames Ambassadors help in many countries. Wiki : A wiki allows to view the data and quickly fix error and add missing places. Donations and Sponsoring : Costs for running GeoNames are covered by donations and sponsoring.Enrichment:add country name
This data set includes cities in the United States, Puerto Rico and the U.S. Virgin Islands. These cities were collected from the 1970 National Atlas of the United States. Where applicable, U.S. Census Bureau codes for named populated places were associated with each name to allow additional information to be attached. The Geographic Names Information System (GNIS) was also used as a source for additional information. This is a revised version of the December, 2003, data set.
This layer is sourced from maps.bts.dot.gov.
How many incorporated places are registered in the U.S.?
There were 19,502 incorporated places registered in the United States as of July 31, 2019. 16,410 had a population under 10,000 while, in contrast, only 10 cities had a population of one million or more.
Small-town America
Suffice it to say, almost nothing is more idealized in the American imagination than small-town America. When asked where they would prefer to live, 30 percent of Americans reported that they would prefer to live in a small town. Americans tend to prefer small-town living due to a perceived slower pace of life, close-knit communities, and a more affordable cost of living when compared to large cities.
An increasing population
Despite a preference for small-town life, metropolitan areas in the U.S. still see high population figures, with the New York, Los Angeles, and Chicago metro areas being the most populous in the country. Metro and state populations are projected to increase by 2040, so while some may move to small towns to escape city living, those small towns may become more crowded in the upcoming decades.
This dataset contains a listing of incorporated places (cities and towns) and counties within the United States including the GNIS code, FIPS code, name, entity type and primary point (location) for the entity. The types of entities listed in this dataset are based on codes provided by the U.S. Census Bureau, and include the following: C1 - An active incorporated place that does not serve as a county subdivision equivalent; C2 - An active incorporated place legally coextensive with a county subdivision but treated as independent of any county subdivision; C3 - A consolidated city; C4 - An active incorporated place with an alternate official common name; C5 - An active incorporated place that is independent of any county subdivision and serves as a county subdivision equivalent; C6 - An active incorporated place that partially is independent of any county subdivision and serves as a county subdivision equivalent or partially coextensive with a county subdivision but treated as independent of any county subdivision; C7 - An incorporated place that is independent of any county; C8 - The balance of a consolidated city excluding the separately incorporated place(s) within that consolidated government; C9 - An inactive or nonfunctioning incorporated place; H1 - An active county or statistically equivalent entity; H4 - A legally defined inactive or nonfunctioning county or statistically equivalent entity; H5 - A census areas in Alaska, a statistical county equivalent entity; and H6 - A county or statistically equivalent entity that is areally coextensive or governmentally consolidated with an incorporated place, part of an incorporated place, or a consolidated city.
https://spdx.org/licenses/CC0-1.0.htmlhttps://spdx.org/licenses/CC0-1.0.html
Sustainable cities depend on urban forests. City trees -- a pillar of urban forests -- improve our health, clean the air, store CO2, and cool local temperatures. Comparatively less is known about urban forests as ecosystems, particularly their spatial composition, nativity statuses, biodiversity, and tree health. Here, we assembled and standardized a new dataset of N=5,660,237 trees from 63 of the largest US cities. The data comes from tree inventories conducted at the level of cities and/or neighborhoods. Each data sheet includes detailed information on tree location, species, nativity status (whether a tree species is naturally occurring or introduced), health, size, whether it is in a park or urban area, and more (comprising 28 standardized columns per datasheet). This dataset could be analyzed in combination with citizen-science datasets on bird, insect, or plant biodiversity; social and demographic data; or data on the physical environment. Urban forests offer a rare opportunity to intentionally design biodiverse, heterogenous, rich ecosystems. Methods See eLife manuscript for full details. Below, we provide a summary of how the dataset was collected and processed.
Data Acquisition We limited our search to the 150 largest cities in the USA (by census population). To acquire raw data on street tree communities, we used a search protocol on both Google and Google Datasets Search (https://datasetsearch.research.google.com/). We first searched the city name plus each of the following: street trees, city trees, tree inventory, urban forest, and urban canopy (all combinations totaled 20 searches per city, 10 each in Google and Google Datasets Search). We then read the first page of google results and the top 20 results from Google Datasets Search. If the same named city in the wrong state appeared in the results, we redid the 20 searches adding the state name. If no data were found, we contacted a relevant state official via email or phone with an inquiry about their street tree inventory. Datasheets were received and transformed to .csv format (if they were not already in that format). We received data on street trees from 64 cities. One city, El Paso, had data only in summary format and was therefore excluded from analyses.
Data Cleaning All code used is in the zipped folder Data S5 in the eLife publication. Before cleaning the data, we ensured that all reported trees for each city were located within the greater metropolitan area of the city (for certain inventories, many suburbs were reported - some within the greater metropolitan area, others not). First, we renamed all columns in the received .csv sheets, referring to the metadata and according to our standardized definitions (Table S4). To harmonize tree health and condition data across different cities, we inspected metadata from the tree inventories and converted all numeric scores to a descriptive scale including “excellent,” “good”, “fair”, “poor”, “dead”, and “dead/dying”. Some cities included only three points on this scale (e.g., “good”, “poor”, “dead/dying”) while others included five (e.g., “excellent,” “good”, “fair”, “poor”, “dead”). Second, we used pandas in Python (W. McKinney & Others, 2011) to correct typos, non-ASCII characters, variable spellings, date format, units used (we converted all units to metric), address issues, and common name format. In some cases, units were not specified for tree diameter at breast height (DBH) and tree height; we determined the units based on typical sizes for trees of a particular species. Wherever diameter was reported, we assumed it was DBH. We standardized health and condition data across cities, preserving the highest granularity available for each city. For our analysis, we converted this variable to a binary (see section Condition and Health). We created a column called “location_type” to label whether a given tree was growing in the built environment or in green space. All of the changes we made, and decision points, are preserved in Data S9. Third, we checked the scientific names reported using gnr_resolve in the R library taxize (Chamberlain & Szöcs, 2013), with the option Best_match_only set to TRUE (Data S9). Through an iterative process, we manually checked the results and corrected typos in the scientific names until all names were either a perfect match (n=1771 species) or partial match with threshold greater than 0.75 (n=453 species). BGS manually reviewed all partial matches to ensure that they were the correct species name, and then we programmatically corrected these partial matches (for example, Magnolia grandifolia-- which is not a species name of a known tree-- was corrected to Magnolia grandiflora, and Pheonix canariensus was corrected to its proper spelling of Phoenix canariensis). Because many of these tree inventories were crowd-sourced or generated in part through citizen science, such typos and misspellings are to be expected. Some tree inventories reported species by common names only. Therefore, our fourth step in data cleaning was to convert common names to scientific names. We generated a lookup table by summarizing all pairings of common and scientific names in the inventories for which both were reported. We manually reviewed the common to scientific name pairings, confirming that all were correct. Then we programmatically assigned scientific names to all common names (Data S9). Fifth, we assigned native status to each tree through reference to the Biota of North America Project (Kartesz, 2018), which has collected data on all native and non-native species occurrences throughout the US states. Specifically, we determined whether each tree species in a given city was native to that state, not native to that state, or that we did not have enough information to determine nativity (for cases where only the genus was known). Sixth, some cities reported only the street address but not latitude and longitude. For these cities, we used the OpenCageGeocoder (https://opencagedata.com/) to convert addresses to latitude and longitude coordinates (Data S9). OpenCageGeocoder leverages open data and is used by many academic institutions (see https://opencagedata.com/solutions/academia). Seventh, we trimmed each city dataset to include only the standardized columns we identified in Table S4. After each stage of data cleaning, we performed manual spot checking to identify any issues.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
This dataset includes basic data about all US cities with a population over 100.000 (333 cities)
Source: https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population
Coordinates of cities have been geocoded using https://rapidapi.com/GeocodeSupport/api/forward-reverse-geocoding/
Rows description:
City: Name of city State: Name of state Latitude, Longitude, Population_estimate_2022: Estimated population in 2022 Population_2020: Population figure from 2020 census Change_population: % change in population between 2022 and 2020 Land_area: City land area in sq. mi. Population_density_2020: density of population per sq. mi. in 2020
This city boundary shapefile was extracted from Esri Data and Maps for ArcGIS 2014 - U.S. Populated Place Areas. This shapefile can be joined to 500 Cities city-level Data (GIS Friendly Format) in a geographic information system (GIS) to make city-level maps.
Paid dataset with city names, coordinates, regions, and administrative divisions of United States Minor Outlying Islands. Available in Excel (.xlsx), CSV, JSON, XML, and SQL formats after purchase.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These data were downloaded from Google Earth Engine. Each csv includes Year Month Day CM Scenario and other columns. CM includes three scenario (historical, ssp245 and ssp585). The scenario involves the parameter (tasmax or tasmin). Other columns means the city ID. The city ID and corresponding city name can be seen in the csv file named UID Abbreviation.csv above and the shapefile of cities were used from the open repository (Chen et al. 2022). This global inventory includes 247 cities within the contiguous US. This can be found at: Chen, B., Wu, S., Song, Y. et al. Contrasting inequality in human exposure to greenspace between cities of Global North and Global South. Nat Commun 13, 4636 (2022). https://doi.org/10.1038/s41467-022-32258-4.
And to select the 50 populous cities, 1 km X 1 km resolution imagery of population by Wang et. al (2022) was used.[Wang, X., Meng, X. & Long, Y. Projecting 1 km-grid population distributions from 2020 to 2100 globally under shared socioeconomic pathways. Sci Data 9, 563 (2022). https://doi.org/10.1038/s41597-022-01675-x]. Based on this, we ranked 50 cities from those 247 US cities found in Chen et al. 2022.
Those data and real names of the cities can be studied from these papers.
https://en.wikipedia.org/wiki/Public_domainhttps://en.wikipedia.org/wiki/Public_domain
This dataset contains information about the demographics of all US cities and census-designated places with a population greater or equal to 65,000. This data comes from the US Census Bureau's 2015 American Community Survey. This product uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Update NotesMar 16 2024, remove spaces in the file and folder names.Mar 31 2024, delete the underscore in the city names with a space (such as San Francisco) in the '02_TransCAD_results' folder to ensure correct data loading by TransCAD (software version: 9.0).Aug 31 2024, add the 'cityname_link_LinkFlows.csv' file in the '02_TransCAD_results' folder to match the link from input data and the link from TransCAD results (LinkFlows) with the same Link_ID.IntroductionThis is a unified and validated traffic dataset for 20 US cities. There are 3 folders for each city.01 Input datathe initial network data obtained from OpenStreetMap (OSM)the visualization of the OSM dataprocessed node / link / od data02 TransCAD results (software version: 9.0)cityname.dbd : geographical network database of the city supported by TransCAD (version 9.0)cityname_link.shp / cityname_node.shp : network data supported by GIS software, which can be imported into TransCAD manually. Then the corresponding '.dbd' file can be generated for TransCAD with a version lower than 9.0od.mtx : OD matrix supported by TransCADLinkFlows.bin / LinkFlows.csv : traffic assignment results by TransCADcityname_link_LinkFlows.csv: the input link attributes with the traffic assignment results by TransCADShortestPath.mtx / ue_travel_time.csv : the traval time (min) between OD pairs by TransCAD03 AequilibraE results (software version: 0.9.3)cityname.shp : shapefile network data of the city support by QGIS or other GIS softwareod_demand.aem : OD matrix supported by AequilibraEnetwork.csv : the network file used for traffic assignment in AequilibraEassignment_result.csv : traffic assignment results by AequilibraEPublicationXu, X., Zheng, Z., Hu, Z. et al. (2024). A unified dataset for the city-scale traffic assignment model in 20 U.S. cities. Sci Data 11, 325. https://doi.org/10.1038/s41597-024-03149-8Usage NotesIf you use this dataset in your research or any other work, please cite both the dataset and paper above.A brief introduction about how to use this dataset can be found in GitHub. More detailed illustration for compiling the traffic dataset on AequilibraE can be referred to GitHub code or Colab code.ContactIf you have any inquiries, please contact Xiaotong Xu (email: kid-a.xu@connect.polyu.hk).
https://www.icpsr.umich.edu/web/ICPSR/studies/9515/termshttps://www.icpsr.umich.edu/web/ICPSR/studies/9515/terms
The Geographic Names Information System (GNIS) is an automated data system developed by the United States Geological Survey (USGS) to standardize and disseminate information on geographic names. GNIS provides primary information for all known places, features, and areas in the United States identified by proper name. The data file described here is a standard report file written from the National Geographic Names Data Base that lists all populated place records in GNIS for the United States. The entries are sorted by state and then listed alphabetically by feature name. Information provided includes the official placename, the feature type, the Federal Information Processing Standards (FIPS) code referencing the state, the principal county in which the place is located, the geographic coordinates (in degrees, minutes, and seconds) that locate the approximate original center of the place, the year of any pertinent United States Board on Geographic Names activity regarding the placename or its application, and a reference to the 1:24,000-scale USGS topographic map on which the feature is portrayed. The elevation in feet is given if available, as is the 1980 Census population figure.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dynamic social media content, such as Twitter messages, can be used to examine individuals’ beliefs and perceptions. By analyzing Twitter messages, this study examines how Twitter users exchanged and recognized toponyms (city names) for different cities in the United States. The frequency and variety of city names found in their online conversations were used to identify the unique spatiotemporal patterns of “geographical awareness” for Twitter users. A new analytic method, Knowledge Discovery in Cyberspace for Geographical Awareness (KDCGA), is introduced to help identify the dynamic spatiotemporal patterns of geographic awareness among social media conversations. Twitter data were collected across 50 U.S. cities. Thousands of city names around the world were extracted from a large volume of Twitter messages (over 5 million tweets) by using the Twitter Application Programming Interface (APIs) and Python language computer programs. The percentages of distant city names (cities located in distant states or other countries far away from the locations of Twitter users) were used to estimate the level of global geographical awareness for Twitter users in each U.S. city. A Global awareness index (GAI) was developed to quantify the level of geographical awareness of Twitter users from within the same city. Our findings are that: (1) the level of geographical awareness varies depending on when and where Twitter messages are posted, yet Twitter users from big cities are more aware of the names of international cities or distant US cities than users from mid-size cities; (2) Twitter users have an increased awareness of other city names far away from their home city during holiday seasons; and (3) Twitter users are more aware of nearby city names than distant city names, and more aware of big city names rather than small city names.
The 2023 cartographic boundary shapefiles are simplified representations of selected geographic areas from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). These boundary files are specifically designed for small-scale thematic mapping. When possible, generalization is performed with the intent to maintain the hierarchical relationships among geographies and to maintain the alignment of geographies within a file set for a given year. Geographic areas may not align with the same areas from another year. Some geographies are available as nation-based files while others are available only as state-based files. The cartographic boundary files include both incorporated places (legal entities) and census designated places or CDPs (statistical entities). An incorporated place is established to provide governmental functions for a concentration of people as opposed to a minor civil division (MCD), which generally is created to provide services or administer an area without regard, necessarily, to population. Places always nest within a state, but may extend across county and county subdivision boundaries. An incorporated place usually is a city, town, village, or borough, but can have other legal descriptions. CDPs are delineated for the decennial census as the statistical counterparts of incorporated places. CDPs are delineated to provide data for settled concentrations of population that are identifiable by name, but are not legally incorporated under the laws of the state in which they are located. The boundaries for CDPs often are defined in partnership with state, local, and/or tribal officials and usually coincide with visible features or the boundary of an adjacent incorporated place or another legal entity. CDP boundaries often change from one decennial census to the next with changes in the settlement pattern and development; a CDP with the same name as in an earlier census does not necessarily have the same boundary. The only population/housing size requirement for CDPs is that they must contain some housing and population. The generalized boundaries of most incorporated places in this file are based on those as of January 1, 2023, as reported through the Census Bureau's Boundary and Annexation Survey (BAS). The generalized boundaries of all CDPs are based on those delineated or updated as part of the the 2023 BAS or the Census Bureau's Participant Statistical Areas Program (PSAP) for the 2020 Census.
WARNING: This is a pre-release dataset and its fields names and data structures are subject to change. It should be considered pre-release until the end of 2024. Expected changes:
Purpose
County and incorporated place (city) boundaries along with third party identifiers used to join in external data. Boundaries are from the authoritative source the California Department of Tax and Fee Administration (CDTFA), altered to show the counties as one polygon. This layer displays the city polygons on top of the County polygons so the area isn"t interrupted. The GEOID attribute information is added from the US Census. GEOID is based on merged State and County FIPS codes for the Counties. Abbreviations for Counties and Cities were added from Caltrans Division of Local Assistance (DLA) data. Place Type was populated with information extracted from the Census. Names and IDs from the US Board on Geographic Names (BGN), the authoritative source of place names as published in the Geographic Name Information System (GNIS), are attached as well. Finally, coastal buffers are removed, leaving the land-based portions of jurisdictions. This feature layer is for public use.
Related Layers
This dataset is part of a grouping of many datasets:
Point of Contact
California Department of Technology, Office of Digital Services, odsdataservices@state.ca.gov
Field and Abbreviation Definitions
Accuracy
CDTFA"s source data notes the following about accuracy:
City boundary changes and county boundary line adjustments filed with the Board of Equalization per Government Code 54900. This GIS layer contains the boundaries of the unincorporated county and incorporated cities within the state of California. The initial dataset was created in March of 2015 and was based on the State Board of Equalization tax rate area boundaries. As of April 1, 2024, the maintenance of this dataset is provided by the California Department of Tax and Fee Administration for the purpose of determining sales and use tax rates. The boundaries are continuously being revised to align with aerial imagery when areas of conflict are discovered between the original boundary provided by the California State Board of Equalization and the boundary made publicly available by local, state, and federal government. Some differences may occur between actual recorded boundaries and the boundaries used for sales and use tax purposes. The boundaries in this map are representations of taxing jurisdictions for the purpose of determining sales and use tax rates and should not be used to determine precise city or county boundary line locations. COUNTY = county name; CITY = city name or unincorporated territory; COPRI =
https://www.newyork-demographics.com/terms_and_conditionshttps://www.newyork-demographics.com/terms_and_conditions
A dataset listing New York cities by population for 2024.
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
This is a very small but useful dataset if you are ever looking to get jobs for a certain US city in LinkedIn. It contains a list of US cities and states and it's corresponding LinkedIn ID (which is usually externally hidden).
The cities list was retreived from here: https://github.com/kelvins/US-Cities-Database and the names of the ciiadjusted to match the name used in LinkedIn (which could differ in subtle ways).
Some cities do not have an ID, this is because the city is either too small or because there was a difference in the name on LinkedIn which I did not detect (human error). If you ever run in to one of these feel free to enhance this dataset.