Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
All cities with a population > 1000 or seats of administrative divisions (ca. 80,000).
Sources and Contributions
Sources: GeoNames aggregates over a hundred different data sources.
Ambassadors: GeoNames Ambassadors help in many countries.
Wiki: A wiki allows users to view the data, quickly fix errors, and add missing places.
Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.
Enrichment: add country name
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data is from:
https://simplemaps.com/data/world-cities
We're proud to offer a simple, accurate and up-to-date database of the world's cities and towns. We've built it from the ground up using authoritative sources such as the NGIA, US Geological Survey, US Census Bureau, and NASA.
Our database is:
https://en.wikipedia.org/wiki/Public_domain
This dataset contains information about the demographics of all US cities and census-designated places with a population greater than or equal to 65,000. This data comes from the US Census Bureau's 2015 American Community Survey. This product uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau.
https://www.illinois-demographics.com/terms_and_conditions
A dataset listing Illinois cities by population for 2024.
https://spdx.org/licenses/CC0-1.0.html
Sustainable cities depend on urban forests. City trees, a pillar of urban forests, improve our health, clean the air, store CO2, and cool local temperatures. Comparatively less is known about urban forests as ecosystems, particularly their spatial composition, nativity statuses, biodiversity, and tree health. Here, we assembled and standardized a new dataset of N=5,660,237 trees from 63 of the largest US cities. The data comes from tree inventories conducted at the level of cities and/or neighborhoods. Each data sheet includes detailed information on tree location, species, nativity status (whether a tree species is naturally occurring or introduced), health, size, whether it is in a park or urban area, and more (comprising 28 standardized columns per datasheet). This dataset could be analyzed in combination with citizen-science datasets on bird, insect, or plant biodiversity; social and demographic data; or data on the physical environment. Urban forests offer a rare opportunity to intentionally design biodiverse, heterogeneous, rich ecosystems.
Methods
See the eLife manuscript for full details. Below, we provide a summary of how the dataset was collected and processed.
Data Acquisition
We limited our search to the 150 largest cities in the USA (by census population). To acquire raw data on street tree communities, we used a search protocol on both Google and Google Datasets Search (https://datasetsearch.research.google.com/). We first searched the city name plus each of the following: street trees, city trees, tree inventory, urban forest, and urban canopy (all combinations totaled 20 searches per city, 10 each in Google and Google Datasets Search). We then read the first page of Google results and the top 20 results from Google Datasets Search. If the same named city in the wrong state appeared in the results, we redid the 20 searches adding the state name. If no data were found, we contacted a relevant state official via email or phone with an inquiry about their street tree inventory. Datasheets were received and transformed to .csv format (if they were not already in that format). We received data on street trees from 64 cities. One city, El Paso, had data only in summary format and was therefore excluded from analyses.
Data Cleaning
All code used is in the zipped folder Data S5 in the eLife publication. Before cleaning the data, we ensured that all reported trees for each city were located within the greater metropolitan area of the city (for certain inventories, many suburbs were reported, some within the greater metropolitan area and others not).
First, we renamed all columns in the received .csv sheets, referring to the metadata and according to our standardized definitions (Table S4). To harmonize tree health and condition data across different cities, we inspected metadata from the tree inventories and converted all numeric scores to a descriptive scale including “excellent”, “good”, “fair”, “poor”, “dead”, and “dead/dying”. Some cities included only three points on this scale (e.g., “good”, “poor”, “dead/dying”) while others included five (e.g., “excellent”, “good”, “fair”, “poor”, “dead”).
Second, we used pandas in Python (W. McKinney & Others, 2011) to correct typos, non-ASCII characters, variable spellings, date format, units used (we converted all units to metric), address issues, and common name format. In some cases, units were not specified for tree diameter at breast height (DBH) and tree height; we determined the units based on typical sizes for trees of a particular species. Wherever diameter was reported, we assumed it was DBH. We standardized health and condition data across cities, preserving the highest granularity available for each city. For our analysis, we converted this variable to a binary (see section Condition and Health). We created a column called “location_type” to label whether a given tree was growing in the built environment or in green space. All of the changes we made, and decision points, are preserved in Data S9.
Third, we checked the scientific names reported using gnr_resolve in the R library taxize (Chamberlain & Szöcs, 2013), with the option Best_match_only set to TRUE (Data S9). Through an iterative process, we manually checked the results and corrected typos in the scientific names until all names were either a perfect match (n=1771 species) or a partial match with threshold greater than 0.75 (n=453 species). BGS manually reviewed all partial matches to ensure that they were the correct species name, and then we programmatically corrected these partial matches (for example, Magnolia grandifolia, which is not a species name of a known tree, was corrected to Magnolia grandiflora, and Pheonix canariensus was corrected to its proper spelling of Phoenix canariensis). Because many of these tree inventories were crowd-sourced or generated in part through citizen science, such typos and misspellings are to be expected.
Some tree inventories reported species by common names only. Therefore, our fourth step in data cleaning was to convert common names to scientific names. We generated a lookup table by summarizing all pairings of common and scientific names in the inventories for which both were reported. We manually reviewed the common to scientific name pairings, confirming that all were correct. Then we programmatically assigned scientific names to all common names (Data S9).
Fifth, we assigned native status to each tree through reference to the Biota of North America Project (Kartesz, 2018), which has collected data on all native and non-native species occurrences throughout the US states. Specifically, we determined whether each tree species in a given city was native to that state, not native to that state, or of undetermined nativity because we did not have enough information (for cases where only the genus was known).
Sixth, some cities reported only the street address but not latitude and longitude. For these cities, we used the OpenCageGeocoder (https://opencagedata.com/) to convert addresses to latitude and longitude coordinates (Data S9). OpenCageGeocoder leverages open data and is used by many academic institutions (see https://opencagedata.com/solutions/academia).
Seventh, we trimmed each city dataset to include only the standardized columns we identified in Table S4. After each stage of data cleaning, we performed manual spot checking to identify any issues.
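As a rough illustration of two of the cleaning steps described above (harmonizing numeric health scores and filling in scientific names from a common-name lookup table), here is a minimal sketch. It uses plain Python dictionaries rather than pandas to stay dependency-free, and it is not the authors' code (that is in Data S5 and Data S9). The numeric scale, the field names `common_name`, `scientific_name`, and `health`, and the sample records are all hypothetical.

```python
# Hypothetical numeric-to-descriptive mapping for one city's inventory.
# In the actual study, mappings were derived from each inventory's metadata.
HEALTH_SCALE_5 = {5: "excellent", 4: "good", 3: "fair", 2: "poor", 1: "dead"}


def harmonize_health(score, scale=HEALTH_SCALE_5):
    """Return the descriptive label for a numeric score, or None if unmapped."""
    return scale.get(score)


def build_name_lookup(records):
    """Build a common-name -> scientific-name table from records reporting both."""
    lookup = {}
    for rec in records:
        common = (rec.get("common_name") or "").strip().lower()
        scientific = (rec.get("scientific_name") or "").strip()
        if common and scientific:
            # Keep the first pairing seen; the study reviewed pairings manually.
            lookup.setdefault(common, scientific)
    return lookup


def fill_scientific_names(records, lookup):
    """Return copies of records with missing scientific names filled from lookup."""
    filled = []
    for rec in records:
        rec = dict(rec)  # copy so the input records are left untouched
        if not rec.get("scientific_name"):
            key = (rec.get("common_name") or "").strip().lower()
            rec["scientific_name"] = lookup.get(key)
        filled.append(rec)
    return filled


# Made-up sample records for illustration only.
trees = [
    {"common_name": "Southern magnolia", "scientific_name": "Magnolia grandiflora", "health": 4},
    {"common_name": "southern magnolia", "scientific_name": None, "health": 2},
    {"common_name": "Canary Island date palm", "scientific_name": "Phoenix canariensis", "health": 5},
]

lookup = build_name_lookup(trees)
cleaned = fill_scientific_names(trees, lookup)
labels = [harmonize_health(t["health"]) for t in cleaned]
```

Names that are absent from the lookup table stay as `None` here, which mirrors the need for manual review rather than guessing a species.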
https://www.georgia-demographics.com/terms_and_conditions
A dataset listing Georgia cities by population for 2024.
https://www.florida-demographics.com/terms_and_conditions
A dataset listing Florida cities by population for 2024.
https://www.newyork-demographics.com/terms_and_conditions
A dataset listing New York cities by population for 2024.
https://www.washington-demographics.com/terms_and_conditions
A dataset listing Washington cities by population for 2024.
The "Major Cities" layer is derived from the "World Cities" dataset provided by the ArcGIS Data and Maps group as part of the global data layers made available for public use. The "Major Cities" layer specifically contains national and provincial capitals that have the highest population within their respective country. Cities were filtered based on the STATUS field (“National capital”, “National and provincial capital”, “Provincial capital”, “National capital and provincial capital enclave”, and “Other”). The majority of these cities within larger countries were filtered at the highest levels of POP_CLASS (“5,000,000 and greater” and “1,000,000 to 4,999,999”). China, for example, was instead filtered to cities of over 11 million people because it has many highly populated cities. Population approximations are sourced from US Census and UN Data.
Credits: ESRI, CIA World Factbook, GMI, NIMA, UN Data, UN Habitat, US Census Bureau
Disclaimer: The designations employed and the presentation of material at this site do not imply the expression of any opinion whatsoever on the part of the Secretariat of the United Nations concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries.
Many residents of New York City speak more than one language; a number of them speak and understand non-English languages more fluently than English. This dataset, derived from the Census Bureau's American Community Survey (ACS), includes information on over 1.7 million limited English proficient (LEP) residents and a subset of that population called limited English proficient citizens of voting age (CVALEP) at the Community District level. There are 59 community districts throughout NYC, with each district being represented by a Community Board.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the California population over the last 20-plus years. It lists the population for each year, along with the year-over-year change in population, in both absolute and percentage terms. The dataset can be used to understand how the population of California has changed across the last two decades. For example, we can identify whether the population is declining or increasing, when it peaked, or whether it is still growing and has not yet reached its peak. We can also compare the trend with the overall trend of the United States population over the same period.
Key observations
In 2024, the population of California was 39.43 million, a 0.59% year-over-year increase from 2023. Previously, in 2023, California's population was 39.2 million, an increase of 0.14% compared to a population of 39.14 million in 2022. Over the last 20-plus years, between 2000 and 2024, the population of California increased by 5.44 million. In this period, the peak population was 39.52 million in the year 2020. The 2024 figure therefore remains below the 2020 peak, although the population has grown in each of the last two years.
Source: U.S. Census Bureau Population Estimates Program (PEP).
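The year-over-year figures quoted above can be reproduced approximately with a short calculation. The sketch below uses only the rounded values stated in the text (in millions), so its percentages can differ slightly from the published ones, which are presumably computed from unrounded estimates; for instance, the rounded 2022 and 2023 values give roughly 0.15% rather than the quoted 0.14%. The function and variable names are illustrative.

```python
def yoy_change(previous, current):
    """Return (absolute change, percent change) from previous to current."""
    absolute = current - previous
    return absolute, absolute / previous * 100


# Rounded populations in millions, as quoted in the text above.
population = {2020: 39.52, 2022: 39.14, 2023: 39.20, 2024: 39.43}

abs_2024, pct_2024 = yoy_change(population[2023], population[2024])
abs_2023, pct_2023 = yoy_change(population[2022], population[2023])

# Year with the largest population among the years listed here.
peak_year = max(population, key=population.get)
below_peak = population[2024] < population[peak_year]

print(f"2023 -> 2024: {abs_2024:+.2f} million ({pct_2024:+.2f}%)")
print(f"2022 -> 2023: {abs_2023:+.2f} million ({pct_2023:+.2f}%)")
print(f"Peak year in this sample: {peak_year}; 2024 below peak: {below_peak}")
```

The same function applies unchanged to any other state's series in this family of datasets.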
When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).
Data Coverage:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for California Population by Year.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the New York population over the last 20-plus years. It lists the population for each year, along with the year-over-year change in population, in both absolute and percentage terms. The dataset can be used to understand how the population of New York has changed across the last two decades. For example, we can identify whether the population is declining or increasing, when it peaked, or whether it is still growing and has not yet reached its peak. We can also compare the trend with the overall trend of the United States population over the same period.
Key observations
In 2024, the population of New York was 19.87 million, a 0.66% year-over-year increase from 2023. Previously, in 2023, New York's population was 19.74 million, an increase of 0.17% compared to a population of 19.7 million in 2022. Over the last 20-plus years, between 2000 and 2024, the population of New York increased by 870,289. In this period, the peak population was 20.11 million in the year 2020. The 2024 figure therefore remains below the 2020 peak, although the population has grown in each of the last two years.
Source: U.S. Census Bureau Population Estimates Program (PEP).
When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).
Data Coverage:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for New York Population by Year.
https://www.newmexico-demographics.com/terms_and_conditions
A dataset listing New Mexico cities by population for 2024.
With a population just short of 3 million people, the city of Toronto is the largest in Canada, and one of the largest in North America (behind only Mexico City, New York and Los Angeles). Toronto is also one of the most multicultural cities in the world. More than 140 languages and dialects are spoken in the city, and almost half of Toronto's population was born outside Canada. It is a place where people can try the best of each culture, whether they work there or are just passing through. Toronto is also well known for its food.
This dataset was created by web scraping the Toronto Wikipedia page. It contains the latitude and longitude of all the neighborhoods and boroughs of the city of Toronto, Canada, along with their postal codes.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset consists of detailed information about the weather conditions in different cities, taken from one of the official weather websites. It includes variables such as temperature, humidity, pressure, wind speed and direction, precipitation levels, and cloud cover, which can be used to analyze the correlation between economic activity in these cities and their weather conditions. For example, this data can help us understand how certain types of business, such as tourism, retail, or leisure activities, are affected by changes in temperature and humidity. It can also help identify which kinds of weather have the greatest economic impact in a given region, which could in turn inform forecasts of commercial performance. Overall, the dataset is a useful source of information for people interested in the relationship between climate dynamics and economic outcomes.
- City Name: This column provides the name of the cities covered in this dataset.
- Weather Condition: This column lists the weather conditions associated with each city, such as sunny, cloudy, windy, etc.
- Temperature (C): This column provides the temperature (in Celsius) of each city as provided by official weather sources.
- Population: This column lists the population size (in millions) of each city covered in this dataset.
- GDP Per Capita: This column presents GDP per capita (measured in US dollars) for each city included in the dataset.
- Economic Activity Index: This index measures economic activity levels for a particular state or region and can be used to analyze how different weather conditions affect economic activities such as tourism, retail, and leisure activities.
How to use this dataset?
This dataset can be used to explore relationships between factors that might influence economic activity at a regional level, namely population size, wealth, and weather conditions. It can also be used to identify regional differences in economic activity associated with varying climates or meteorological events across seasons or months. Some specific analyses that could be done include:
Use the City Name and Weather Condition columns together to compare the types of weather conditions seen across different locales. Temperature can also be included for a more comprehensive exploration of climate dynamics, for example when researching how “cold” versus “warm” periods affect local economies.
Analyze Population and Economic Activity Index together to see whether any correlation exists between population size within a given region and its economic performance. Related variables such as GDP Per Capita could also provide insight into how economic activity varies with population.
Using all 6 columns together would enable more comprehensive analysis, e.g., comparing temperatures and storm information with expected tourist visit data, or analyzing correlations between strong winds or droughts and changes in agricultural output. With a careful combination of all 6 columns, you could build models of the broader implications of climate dynamics for economic activity, or explore individual cities.
- Use this dataset to analyze the correlation between weather conditions and consumer sentiment by comparing customer purchasing decisions in different cities under different weather conditions.
- Use this dataset to identify the optimal temperature for selling certain products, so that retailers can optimize their prices accordingly.
- Use this dataset to study how changes in weather influence the types of transportation used by the population of a certain city, and to help suggest improvements to public systems for a better customer experience in changing climate situations.
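The correlation analyses suggested above could start from something like the following sketch, which computes a Pearson correlation between two of the listed columns. The rows are made up purely for illustration (they are not taken from the dataset), the column names are copied from the column list above, and the hand-written `pearson` helper is just one simple way to do this without extra libraries.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (spread_x * spread_y)


# Hypothetical rows; real values would come from the dataset file.
rows = [
    {"City Name": "A", "Temperature (C)": 10.0, "Economic Activity Index": 50.0},
    {"City Name": "B", "Temperature (C)": 20.0, "Economic Activity Index": 60.0},
    {"City Name": "C", "Temperature (C)": 30.0, "Economic Activity Index": 70.0},
]

temperatures = [row["Temperature (C)"] for row in rows]
activity = [row["Economic Activity Index"] for row in rows]
r = pearson(temperatures, activity)
print(f"Pearson r between temperature and activity index: {r:.3f}")
```

A correlation on its own says nothing about causation, so a result like this is better treated as a starting point for further analysis than as a finding.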
If you use this dataset in your research, please credit the original authors.
https://www.montana-demographics.com/terms_and_conditions
A dataset listing Montana cities by population for 2024.
The 2019 cartographic boundary KMLs are simplified representations of selected geographic areas from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). These boundary files are specifically designed for small-scale thematic mapping. When possible, generalization is performed with the intent to maintain the hierarchical relationships among geographies and to maintain the alignment of geographies within a file set for a given year. Geographic areas may not align with the same areas from another year. Some geographies are available as nation-based files while others are available only as state-based files.
In New England (Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, and Vermont), the Office of Management and Budget (OMB) has defined an alternative county subdivision (generally cities and towns) based definition of Core Based Statistical Areas (CBSAs) known as New England City and Town Areas (NECTAs). NECTAs are defined using the same criteria as Metropolitan Statistical Areas and Micropolitan Statistical Areas and are identified as either metropolitan or micropolitan, based, respectively, on the presence of either an urban area of 50,000 or more population or an urban cluster of at least 10,000 and less than 50,000 population. A NECTA containing a single core urban area with a population of at least 2.5 million may be subdivided to form smaller groupings of cities and towns referred to as NECTA Divisions.
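The population thresholds in the paragraph above can be written down as a small rule. This is a simplified sketch only: it takes a single number, the population of the largest core urban area or cluster, as input, whereas the official OMB delineation involves further criteria that are not covered here. The function names are illustrative.

```python
def classify_necta(core_population):
    """Classify a NECTA by the population of its largest urban core.

    Simplified from the thresholds stated above: 50,000 or more is
    metropolitan; at least 10,000 and less than 50,000 is micropolitan.
    Returns None when the core is too small to qualify.
    """
    if core_population >= 50_000:
        return "metropolitan"
    if core_population >= 10_000:
        return "micropolitan"
    return None


def may_have_divisions(core_population):
    """A NECTA with a single core of at least 2.5 million may be subdivided."""
    return core_population >= 2_500_000


for population in (8_000, 25_000, 120_000, 3_000_000):
    print(population, classify_necta(population), may_have_divisions(population))
```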
The generalized boundaries in this file are based on those defined by OMB based on the 2010 Census, published in 2013, and updated in 2015, 2017, and 2018.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The urban–rural continuum classifies the global population, allocating rural populations around differently-sized cities. The classification is based on four dimensions: population distribution, population density, urban center location, and travel time to urban centers, all of which can be mapped globally and consistently and then aggregated as administrative unit statistics.
Using spatial data, we matched all rural locations to their urban center of reference based on the time needed to reach these urban centers. A hierarchy of urban centers by population size (largest to smallest) is used to determine which center is the point of “reference” for a given rural location: proximity to a larger center “dominates” over a smaller one in the same travel time category. This was done for 7 urban categories and then aggregated, for presentation purposes, into “large cities” (over 1 million people), “intermediate cities” (250,000–1 million), and “small cities and towns” (20,000–250,000).
Finally, to reflect the diversity of population density across the urban–rural continuum, we distinguished between high-density rural areas with over 1,500 inhabitants per km² and lower density areas. Unlike traditional functional area approaches, our approach does not define urban catchment areas by using thresholds, such as proportion of people commuting; instead, these emerge endogenously from our urban hierarchy and by calculating the shortest travel time.
Urban-Rural Catchment Areas (URCA).tif is a raster dataset of the 30 urban–rural continuum categories, showing the catchment areas around cities and towns of different sizes. Each rural pixel is assigned to one defined travel time category: less than one hour, one to two hours, or two to three hours travel time to one of seven urban agglomeration sizes. The agglomerations range from large cities with i) populations greater than 5 million and ii) between 1 and 5 million; intermediate cities with iii) 500,000 to 1 million and iv) 250,000 to 500,000 inhabitants; small cities with populations v) between 100,000 and 250,000 and vi) between 50,000 and 100,000; and vii) towns of between 20,000 and 50,000 people. The remaining pixels that are more than 3 hours away from any urban agglomeration of at least 20,000 people are considered either hinterland or dispersed towns, since they do not gravitate around any urban agglomeration. The raster also allows for visualizing a simplified continuum created by grouping the seven urban agglomerations into 4 categories.
Urban-Rural Catchment Areas (URCA).tif is in GeoTIFF format, band interleaved with LZW compression, suitable for use in Geographic Information Systems and statistical packages. The data type is byte, with pixel values ranging from 1 to 30. The no data value is 128. It has a spatial resolution of 30 arc seconds, which is approximately 1 km at the equator. The spatial reference system (projection) is EPSG:4326 - WGS84 - Geographic Coordinate System (lat/long). The geographic extent is 83.6N - 60S / 180E - 180W. The same tif file is also available as an ESRI ArcMap MapPackage, Urban-Rural Catchment Areas.mpk.
Further details are in the ReadMe_data_description.docx.
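To make the seven agglomeration sizes and the travel time bands described above concrete, here is a minimal sketch that maps a population and a travel time to the corresponding descriptive classes. It does not reproduce the raster's 1 to 30 pixel coding, which is documented in the ReadMe rather than here. The handling of values that fall exactly on a boundary (inclusive lower bounds) is an assumption, since the ranges in the text touch at their edges, and all names are illustrative.

```python
# (inclusive lower bound, label), ordered from largest to smallest.
# Boundary handling is an assumption; the source ranges touch at the edges.
SIZE_CLASSES = [
    (5_000_000, "large city (over 5 million)"),
    (1_000_000, "large city (1 to 5 million)"),
    (500_000, "intermediate city (500,000 to 1 million)"),
    (250_000, "intermediate city (250,000 to 500,000)"),
    (100_000, "small city (100,000 to 250,000)"),
    (50_000, "small city (50,000 to 100,000)"),
    (20_000, "town (20,000 to 50,000)"),
]


def size_class(population):
    """Return the agglomeration size class, or None below 20,000 people."""
    for lower_bound, label in SIZE_CLASSES:
        if population >= lower_bound:
            return label
    return None


def travel_time_class(hours):
    """Return the travel time band for a rural location."""
    if hours < 1:
        return "less than 1 hour"
    if hours < 2:
        return "1 to 2 hours"
    if hours <= 3:
        return "2 to 3 hours"
    return "more than 3 hours (hinterland or dispersed towns)"


print(size_class(750_000), "|", travel_time_class(1.5))
```

Checking the largest class first mirrors the hierarchy in the text, where a larger center takes precedence over a smaller one within the same travel time category.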
https://www.mississippi-demographics.com/terms_and_conditions
A dataset listing Mississippi cities by population for 2024.