Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about cities in the United States. It has 4,171 rows. It features 7 columns including country, population, latitude, and longitude.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data is from:
https://simplemaps.com/data/world-cities
We're proud to offer a simple, accurate and up-to-date database of the world's cities and towns. We've built it from the ground up using authoritative sources such as the NGIA, US Geological Survey, US Census Bureau, and NASA.
Our database is:
A crosswalk table from US postal ZIP codes to geo-points (latitude, longitude)
Data source: public.opendatasoft.
The ZIP code database contained in 'zipcode.csv' contains 43204 ZIP codes for the continental United States, Alaska, Hawaii, Puerto Rico, and American Samoa. The database is in comma separated value format, with columns for ZIP code, city, state, latitude, longitude, timezone (offset from GMT), and daylight savings time flag (1 if DST is observed in this ZIP code and 0 if not).
This database was composed using ZIP code gazetteers from the US Census Bureau from 1999 and 2000, augmented with additional ZIP code information. The database is believed to contain over 98% of the ZIP codes in current use in the United States. The remaining ZIP codes absent from this database are entirely PO Box or Firm ZIP codes added in the last five years, which are no longer published by the Census Bureau but in any event serve a very small minority of the population (probably on the order of 0.1% or less). Although every attempt has been made to filter them out, this data set may contain up to 0.5% false positives, that is, ZIP codes that do not exist or are no longer in use but are included due to erroneous data sources. The latitude and longitude given for each ZIP code is typically (though not always) the geographic centroid of the ZIP code; in any event, the location given can generally be expected to lie somewhere within the ZIP code's "boundaries".
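A minimal pandas sketch of using the file as a ZIP-to-geo-point crosswalk; the column names are assumptions based on the description above and should be checked against the actual header.

```python
import pandas as pd

# Column names are assumptions based on the description above; verify against the file.
zips = pd.read_csv("zipcode.csv", dtype={"zip": str})  # keep leading zeros in ZIP codes

def zip_to_point(zip_code: str):
    """Return (latitude, longitude) for a ZIP code, or None if it is absent."""
    row = zips.loc[zips["zip"] == zip_code]
    if row.empty:
        return None
    return float(row.iloc[0]["latitude"]), float(row.iloc[0]["longitude"])

# Example lookup; the point typically falls near the ZIP code's geographic centroid.
print(zip_to_point("10001"))
```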
The database and this README are copyright 2004 CivicSpace Labs, Inc., and are published under a Creative Commons Attribution-ShareAlike license, which requires that all updates must be released under the same license. See http://creativecommons.org/licenses/by-sa/2.0/ for more details. Please contact schuyler@geocoder.us if you are interested in receiving updates to this database as they become available.
https://spdx.org/licenses/CC0-1.0.html
Sustainable cities depend on urban forests. City trees, a pillar of urban forests, improve our health, clean the air, store CO2, and cool local temperatures. Comparatively less is known about urban forests as ecosystems, particularly their spatial composition, nativity statuses, biodiversity, and tree health. Here, we assembled and standardized a new dataset of N=5,660,237 trees from 63 of the largest US cities. The data comes from tree inventories conducted at the level of cities and/or neighborhoods. Each data sheet includes detailed information on tree location, species, nativity status (whether a tree species is naturally occurring or introduced), health, size, whether it is in a park or urban area, and more (comprising 28 standardized columns per datasheet). This dataset could be analyzed in combination with citizen-science datasets on bird, insect, or plant biodiversity; social and demographic data; or data on the physical environment. Urban forests offer a rare opportunity to intentionally design biodiverse, heterogeneous, rich ecosystems.
Methods
See the eLife manuscript for full details. Below, we provide a summary of how the dataset was collected and processed.
Data Acquisition
We limited our search to the 150 largest cities in the USA (by census population). To acquire raw data on street tree communities, we used a search protocol on both Google and Google Datasets Search (https://datasetsearch.research.google.com/). We first searched the city name plus each of the following: street trees, city trees, tree inventory, urban forest, and urban canopy (all combinations totaled 20 searches per city, 10 each in Google and Google Datasets Search). We then read the first page of Google results and the top 20 results from Google Datasets Search. If the same named city in the wrong state appeared in the results, we redid the 20 searches adding the state name. If no data were found, we contacted a relevant state official via email or phone with an inquiry about their street tree inventory. Datasheets were received and transformed to .csv format (if they were not already in that format). We received data on street trees from 64 cities. One city, El Paso, had data only in summary format and was therefore excluded from analyses.
Data Cleaning
All code used is in the zipped folder Data S5 in the eLife publication. Before cleaning the data, we ensured that all reported trees for each city were located within the greater metropolitan area of the city (for certain inventories, many suburbs were reported, some within the greater metropolitan area and others not).
First, we renamed all columns in the received .csv sheets, referring to the metadata and according to our standardized definitions (Table S4). To harmonize tree health and condition data across different cities, we inspected metadata from the tree inventories and converted all numeric scores to a descriptive scale including “excellent”, “good”, “fair”, “poor”, “dead”, and “dead/dying”. Some cities included only three points on this scale (e.g., “good”, “poor”, “dead/dying”) while others included five (e.g., “excellent”, “good”, “fair”, “poor”, “dead”).
Second, we used pandas in Python (W. McKinney & Others, 2011) to correct typos, non-ASCII characters, variable spellings, date formats, units (we converted all units to metric), address issues, and common name formats. In some cases, units were not specified for tree diameter at breast height (DBH) and tree height; we determined the units based on typical sizes for trees of a particular species. Wherever diameter was reported, we assumed it was DBH. We standardized health and condition data across cities, preserving the highest granularity available for each city. For our analysis, we converted this variable to a binary (see section Condition and Health). We created a column called “location_type” to label whether a given tree was growing in the built environment or in green space. All of the changes we made, and decision points, are preserved in Data S9.
Third, we checked the scientific names reported using gnr_resolve in the R library taxize (Chamberlain & Szöcs, 2013), with the option Best_match_only set to TRUE (Data S9). Through an iterative process, we manually checked the results and corrected typos in the scientific names until all names were either a perfect match (n=1771 species) or a partial match with a threshold greater than 0.75 (n=453 species). BGS manually reviewed all partial matches to ensure that they were the correct species name, and we then programmatically corrected these partial matches (for example, Magnolia grandifolia, which is not a known tree species name, was corrected to Magnolia grandiflora, and Pheonix canariensus was corrected to its proper spelling, Phoenix canariensis). Because many of these tree inventories were crowd-sourced or generated in part through citizen science, such typos and misspellings are to be expected.
Some tree inventories reported species by common names only. Therefore, our fourth step in data cleaning was to convert common names to scientific names. We generated a lookup table by summarizing all pairings of common and scientific names in the inventories for which both were reported. We manually reviewed the common-to-scientific name pairings, confirming that all were correct. Then we programmatically assigned scientific names to all common names (Data S9).
Fifth, we assigned native status to each tree through reference to the Biota of North America Project (Kartesz, 2018), which has collected data on all native and non-native species occurrences throughout the US states.
Specifically, we determined whether each tree species in a given city was native to that state or not native to that state, or we noted that we did not have enough information to determine nativity (for cases where only the genus was known).
Sixth, some cities reported only the street address but not latitude and longitude. For these cities, we used the OpenCageGeocoder (https://opencagedata.com/) to convert addresses to latitude and longitude coordinates (Data S9). OpenCageGeocoder leverages open data and is used by many academic institutions (see https://opencagedata.com/solutions/academia).
Seventh, we trimmed each city dataset to include only the standardized columns we identified in Table S4. After each stage of data cleaning, we performed manual spot checking to identify any issues.
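As an illustration of the fourth data-cleaning step (assigning scientific names from common names via a lookup table), here is a minimal pandas sketch. The file and column names are hypothetical, and the real workflow also included manual review of every pairing.

```python
import pandas as pd

# Hypothetical file and column names; the real sheets follow the Table S4 schema.
both = pd.read_csv("inventories_with_both_names.csv")   # common_name, scientific_name
common_only = pd.read_csv("inventory_common_only.csv")  # common_name only

# Build a common -> scientific lookup from inventories that report both names.
lookup = (
    both.dropna(subset=["common_name", "scientific_name"])
        .groupby("common_name")["scientific_name"]
        .agg(lambda s: s.mode().iloc[0])  # most frequent pairing; manual review still needed
)

# Assign scientific names to inventories that reported common names only.
common_only["scientific_name"] = common_only["common_name"].map(lookup)
```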
This data collection contains information about the population of each county, town, and city of the United States in 1850 and 1860. Specific variables include tabulations of white, black, and slave males and females, and aggregate population for each town. Foreign-born population, total population of each county, and centroid latitudes and longitudes of each county and state were also compiled. (Source: downloaded from ICPSR 7/13/10)
Please Note: This dataset is part of the historical CISER Data Archive Collection and is also available at ICPSR -- https://doi.org/10.3886/ICPSR09424.v2. We highly recommend using the ICPSR version as they made this dataset available in multiple data formats.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Complete dataset of the road networks of all 29,850 USA cities, as graphs in the shp format. The extracts follow the 2016 official USA city boundaries. Graphs are identified by their [city_code].shp; city codes are provided by the Tiger Census dataset. Graphs have been created by extracting all openstreetmap.org (OSM) maps for each USA city, extracting the graph from the OSM extract using the policosm Python GitHub library, and simplifying the graph by removing all degree-two nodes to retain only a workable transportation network. Original road length is retained as an attribute. Nodes include latitude and longitude attributes in the WGS84 projection. Edges include length in meters (precision < 1 m) and the tag:highway value from OSM. See policosm on GitHub for more information on the extraction algorithms.
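As an illustration of the degree-two simplification step described above, here is a minimal networkx sketch. It is not the policosm implementation; it simply contracts chains through degree-two nodes while summing an assumed 'length' edge attribute.

```python
import networkx as nx

def simplify_degree_two(G: nx.Graph) -> nx.Graph:
    """Contract chains through degree-2 nodes, summing the 'length' attribute."""
    H = G.copy()
    changed = True
    while changed:
        changed = False
        for node in list(H.nodes):
            if H.degree(node) != 2:
                continue
            nbrs = list(H.neighbors(node))
            if len(nbrs) != 2:
                continue  # self-loop; leave untouched in this sketch
            u, v = nbrs
            if H.has_edge(u, v):
                continue  # avoid collapsing into a parallel edge
            length = H[u][node].get("length", 0) + H[node][v].get("length", 0)
            H.remove_node(node)
            H.add_edge(u, v, length=length)
            changed = True
    return H
```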
The datasets are split by census block, cities, counties, districts, provinces, and states. The typical dataset includes the below fields.
Column number, Data attribute, Description
1, device_id, hashed anonymized unique id per moving device
2, origin_geoid, geohash id of the origin grid cell
3, destination_geoid, geohash id of the destination grid cell
4, origin_lat, origin latitude with 4-to-5 decimal precision
5, origin_long, origin longitude with 4-to-5 decimal precision
6, destination_lat, destination latitude with 5-to-6 decimal precision
7, destination_lon, destination longitude with 5-to-6 decimal precision
8, start_timestamp, start timestamp / local time
9, end_timestamp, end timestamp / local time
10, origin_shape_zone, customer provided origin shape id, zone or census block id
11, destination_shape_zone, customer provided destination shape id, zone or census block id
12, trip_distance, inferred distance traveled in meters, as the crow flies
13, trip_duration, inferred duration of the trip in seconds
14, trip_speed, inferred speed of the trip in meters per second
15, hour_of_day, hour of day of trip start (0-23)
16, time_period, time period of trip start (morning, afternoon, evening, night)
17, day_of_week, day of week of trip start (mon, tue, wed, thu, fri, sat, sun)
18, year, year of trip start
19, iso_week, iso week of the trip
20, iso_week_start_date, start date of the iso week
21, iso_week_end_date, end date of the iso week
22, travel_mode, mode of travel (walking, driving, bicycling, etc)
23, trip_event, trip or segment events (start, route, end, start-end)
24, trip_id, trip identifier (unique for each batch of results)
25, origin_city_block_id, census block id for the trip origin point
26, destination_city_block_id, census block id for the trip destination point
27, origin_city_block_name, census block name for the trip origin point
28, destination_city_block_name, census block name for the trip destination point
29, trip_scaled_ratio, ratio used to scale up each trip; for example, a trip_scaled_ratio value of 10 means that 1 original trip was scaled up to 10 trips
30, route_geojson, geojson line representing trip route trajectory or geometry
The datasets can be processed and enhanced to also include places, POI visitation patterns, hour-of-day patterns, weekday patterns, weekend patterns, dwell time inferences, and macro movement trends.
The dataset is delivered as gzipped CSV archive files that are uploaded to your AWS S3 bucket upon request.
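A minimal sketch of working with a delivered file under the schema above; the file name is hypothetical, and the code simply weights each row by trip_scaled_ratio.

```python
import pandas as pd

# Hypothetical file name; delivered archives are gzipped CSVs with the schema above.
trips = pd.read_csv("trips_week01.csv.gz", compression="gzip")

# Scaled trip volume per hour of day (each row represents trip_scaled_ratio trips).
hourly = trips.groupby("hour_of_day")["trip_scaled_ratio"].sum().rename("scaled_trips")
print(hourly)

# Mean as-the-crow-flies speed (m/s) by travel mode.
print(trips.groupby("travel_mode")["trip_speed"].mean())
```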
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Latitude and longitude coordinates, population size, and mean baseline pneumonia and influenza death rates for 66 large US reporting cities (1910–1920) with 100,000 or more inhabitants [10].
This dataset includes all valid felony, misdemeanor, and violation crimes reported to the New York City Police Department (NYPD) for all complete quarters so far this year (2017). For additional details, please see the attached data dictionary in the ‘About’ section.
Location of wifi hotspots in the city with basic descriptive information.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Extract from geonames files. The data represents populated places with a population > 5000 inhabitants.
Table information:
geonameid : integer id of record in geonames database
name : name of geographical point (utf8), varchar(200)
asciiname : name of geographical point in plain ascii characters, varchar(200)
alternatenames : alternate names, comma separated, ascii names automatically transliterated, convenience attribute from alternatename table, varchar(10000)
latitude : latitude in decimal degrees (wgs84)
longitude : longitude in decimal degrees (wgs84)
feature class : see http://www.geonames.org/export/codes.html, char(1)
feature code : see http://www.geonames.org/export/codes.html, varchar(10)
country code : ISO-3166 2-letter country code, 2 characters
cc2 : alternate country codes, comma separated, ISO-3166 2-letter country code, 200 characters
admin1 code : fipscode (subject to change to iso code), see exceptions below, see file admin1Codes.txt for display names of this code; varchar(20)
admin2 code : code for the second administrative division, a county in the US, see file admin2Codes.txt; varchar(80)
admin3 code : code for third level administrative division, varchar(20)
admin4 code : code for fourth level administrative division, varchar(20)
population : bigint (8 byte int)
elevation : in meters, integer
dem : digital elevation model, srtm3 or gtopo30, average elevation of 3''x3'' (ca 90m x 90m) or 30''x30'' (ca 900m x 900m) area in meters, integer; srtm processed by cgiar/ciat
timezone : the iana timezone id (see file timeZone.txt), varchar(40)
modification date : date of last modification in yyyy-MM-dd format
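A hedged example of loading such an extract with pandas, assuming it keeps the standard headerless, tab-delimited geonames layout; the file name cities5000.txt is an assumption and should be replaced with the actual extract file.

```python
import pandas as pd

# Column order as documented above; assumes a headerless, tab-delimited geonames layout.
columns = [
    "geonameid", "name", "asciiname", "alternatenames", "latitude", "longitude",
    "feature_class", "feature_code", "country_code", "cc2", "admin1_code",
    "admin2_code", "admin3_code", "admin4_code", "population", "elevation",
    "dem", "timezone", "modification_date",
]
places = pd.read_csv("cities5000.txt", sep="\t", names=columns, low_memory=False)

# Example: the ten most populous US places in the extract.
print(places[places["country_code"] == "US"].nlargest(10, "population")[["name", "population"]])
```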
http://opendatacommons.org/licenses/dbcl/1.0/
Who amongst us doesn't small talk about the weather every once in a while?
The goal of this dataset is to elevate this small talk to medium talk.
Just kidding; I actually originally decided to collect this dataset in order to demonstrate basic signal processing concepts, such as filtering, the Fourier transform, auto-correlation, cross-correlation, etc. (for a data analysis course I'm currently preparing).
I wanted to demonstrate these concepts on signals that we all have intimate familiarity with and hope that this way these concepts will be better understood than with just made up signals.
The weather is excellent for demonstrating these kinds of concepts as it contains periodic temporal structure with two very different periods (daily and yearly).
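To make this concrete, here is a small numpy sketch (using a synthetic hourly series rather than the dataset itself) showing how the daily and yearly periods appear as peaks in the amplitude spectrum.

```python
import numpy as np

# Synthetic stand-in for an hourly temperature series (5 years) with daily and
# yearly components plus noise; it only illustrates how the two periods show up.
hours = np.arange(5 * 365 * 24)
temp = (10 * np.sin(2 * np.pi * hours / (365 * 24))   # yearly cycle
        + 5 * np.sin(2 * np.pi * hours / 24)          # daily cycle
        + np.random.normal(0, 2, hours.size))

spectrum = np.abs(np.fft.rfft(temp - temp.mean()))
freqs = np.fft.rfftfreq(hours.size, d=1.0)            # cycles per hour

# The two largest peaks (excluding the zero-frequency bin) should sit near
# 1/(365*24) and 1/24 cycles per hour, i.e. periods of ~8760 h and ~24 h.
top = np.argsort(spectrum[1:])[-2:] + 1
print(1 / freqs[top])
```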
The dataset contains ~5 years of high temporal resolution (hourly measurements) data of various weather attributes, such as temperature, humidity, air pressure, etc.
This data is available for 30 US and Canadian Cities, as well as 6 Israeli cities.
I've organized the data according to a common time axis for easy use.
Each attribute has its own file and is organized such that the rows are the time axis (it's the same time axis for all files), and the columns are the different cities (it's the same city ordering for all files as well).
Additionally, for each city we also have the country, latitude and longitude information in a separate file.
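A minimal pandas sketch of reading one attribute file together with the city attributes file; the file names, column names, and city labels are assumptions based on the description above.

```python
import pandas as pd

# File and column names are assumptions; adjust to the actual files in the dataset.
temperature = pd.read_csv("temperature.csv", parse_dates=["datetime"], index_col="datetime")
cities = pd.read_csv("city_attributes.csv")  # country, latitude, longitude per city

# Rows share one common time axis; columns are cities, in the same order in every file.
ny_daily_mean = temperature["New York"].resample("D").mean()
print(ny_daily_mean.head())
print(cities.head())
```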
The dataset was acquired using the Weather API on the OpenWeatherMap website, and is available under the ODbL License.
Weather data is both intrinsically interesting, and also potentially useful when correlated with other types of data.
For example, wildfire spread is potentially related to weather conditions, demand for cabs is famously known to be correlated with weather conditions (here, here and here you can find NYC cab ride data), and use of city bikes is probably also correlated with weather in interesting ways (check out this Austin dataset, this SF dataset, this Montreal dataset, and this NYC dataset).
Traffic is also probably related to weather.
Another potentially interesting source of correlation is between weather and crime. Here are a few crime datasets on kaggle of cities present in this weather dataset: Chicago, Philadelphia, Los Angeles, Vancouver, Austin, NYC
There are many other potentially interesting connections between everyday life and the weather that we can explore together with the help of this dataset. Have fun!
The purpose of this tool is to estimate daily precipitation patterns for a yearly cycle at any location on the globe. The user input is simply the latitude and longitude of the selected location. There is an embedded ZIP code search routine to find the latitude and longitude for US cities. GlobalRainSIM forecasts the daily rainfall based upon two databases. The first was the average number of days in a month with precipitation (wet days) that were compiled and interpolated by Legates and Willmott (1990a and 1990b), with further improvements by Willmott and Matsuura (1995). The second database was the global average monthly precipitation data collected 1961-1990 and cross-validated by New et al. (1999). These two datasets were then used to establish the monthly precipitation totals and the frequency of precipitation in a month. The average precipitation event was calculated as the monthly mean divided by the number of wet days. This mean value was then randomly assigned to a day of the month, looping through the number of wet days. In other words, if the average monthly rainfall was 10 mm/month with 5 average wet days, each rain event was 2 mm. This amount (2 mm) was then randomly assigned to 5 days of that month. The advantage of this tool is that a typical pattern of precipitation can be simulated for any global location, arriving at an "average year" as a baseline case for comparison. This tool also outputs the daily rainfall as a file or can be easily embedded within another program. Resources in this dataset: Resource Title: Global RainSIM Version 1.0. File Name: Web Page, url: https://www.ars.usda.gov/research/software/download/?softwareid=227&modecode=50-60-05-00 (download page)
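A small Python sketch of the per-month logic described above (not the GlobalRainSIM code itself): the monthly mean is divided by the number of wet days, and each event is assigned to a randomly chosen day of the month.

```python
import random

def simulate_month(monthly_total_mm: float, wet_days: int, days_in_month: int = 30):
    """Spread the monthly mean evenly over randomly chosen wet days (sketch)."""
    daily = [0.0] * days_in_month
    if wet_days <= 0:
        return daily
    event = monthly_total_mm / wet_days  # e.g. 10 mm / 5 wet days = 2 mm per event
    for day in random.sample(range(days_in_month), min(wet_days, days_in_month)):
        daily[day] = event
    return daily

print(simulate_month(10.0, 5))  # five 2 mm events on random days, as in the example above
```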
http://opendatacommons.org/licenses/dbcl/1.0/
This database is a collection of data on over 23,000 US earthquakes. It contains data from the year 1638 to 1985. The digital database also includes information regarding epicentral coordinates, magnitudes, focal depths, names and coordinates of reporting cities (or localities), reported intensities, and the distance from the city (or locality) to the epicenter. The majority of felt reports are from the US, but there is also information about some other countries such as Antigua and Barbuda, Canada, Mexico, Panama, and the Philippines.
Year Mo Da Hr Mn Sec The Date and Time are listed in Universal Coordinated Time and are Year, Month (Mo), Day (Da), Hour (Hr), Minute (Mn), Second (Sec)
UTC Conv Number of hours to subtract from the Date and Time given in Universal Coordinated Time to get local standard time for the epicenter. In general:
4 = 60 degree meridian (Atlantic Standard Time)
5 = 75 degree meridian (Eastern Standard Time)
6 = 90 degree meridian (Central Standard Time)
7 = 105 degree meridian (Mountain Standard Time)
8 = 120 degree meridian (Pacific Standard Time)
9 = 135 degree meridian (Alaska Standard Time)
10 = 150 degree meridian (Hawaii-Aleutian Standard Time)
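A tiny example of applying the UTC Conv value; the date and time below are arbitrary example values, shown for the 90 degree meridian (Central Standard Time).

```python
from datetime import datetime, timedelta

# Arbitrary example: an event recorded at 14:30 UTC with UTC Conv = 6.
utc_time = datetime(1950, 7, 1, 14, 30)
utc_conv = 6
local_standard_time = utc_time - timedelta(hours=utc_conv)
print(local_standard_time)  # 1950-07-01 08:30, local standard time at the epicenter
```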
U/G Unpublished or grouped intensity U = Intensity (MMI) assigned that was not listed in the source document. G = Intensity grouped I-III in the source document was reassigned intensity III.
EQ Lat / EQ Long This is the geographic latitude and longitude of the epicenter expressed as decimal numbers. The units are degrees. The latitude range is +4.0 to +69.0, where "+" designates North latitude (there are no South latitudes in the database). The longitude range is -179.0 to +180.0, where "-" designates West longitude and "+" designates East longitude. Most of the epicenters are West longitude (from -56 to -179), but a few epicenters in the Philippines and Aleutian Islands are East longitude (from +120 to +180).
Mag These are magnitudes as listed in United States Earthquakes, Earthquake History of the United States (either mb, MS, or ML), or the equivalent derived from intensities for pre-instrumental events. The magnitude is a measure of seismic energy. The magnitude scale is logarithmic. An increase of one in magnitude represents a tenfold increase in the recorded wave amplitude. However, the energy release associated with an increase of one in magnitude is not tenfold, but thirtyfold. For example, approximately 900 times more energy is released in an earthquake of magnitude 7 than in an earthquake of magnitude 5. Each increase in magnitude of one unit is equivalent to an increase of seismic energy of about 1,600,000,000,000 ergs.
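As a quick check of the figures quoted above (an approximate thirtyfold energy increase per magnitude unit, hence roughly 900 times more energy between magnitude 5 and magnitude 7):

```latex
% Thirtyfold energy increase per magnitude unit, applied over two units:
\[
  \frac{E(M+1)}{E(M)} \approx 30
  \quad\Longrightarrow\quad
  \frac{E(7)}{E(5)} \approx 30 \times 30 = 900
\]
```

For comparison, a commonly used magnitude-energy relation gives a factor of about 10^1.5 ≈ 31.6 per unit, or roughly 1000 over two units, consistent with the rounded figure of 900 quoted above.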
Depth (km) Hypocentral Depth (positive downward) in kilometers from the surface.
Epi Dis Epicentral Distance in km that the reporting city (or locality) is located from the epicenter of the earthquake.
City Lat / City Long This is the geographic latitude and longitude of the city (or locality) where the Modified Mercalli Intensity was observed, expressed as decimal numbers. The units are degrees. The latitude range is +6.0 to +72.0, where "+" designates North latitude (there are no South latitudes in the database). The longitude range is -177.0 to +180.0, where "-" designates West longitude and "+" designates East longitude. Most of the reporting cities (or localities) are West longitude (from -29 to -177), but a few reporting cities (or localities) in the Philippines and Aleutian Islands are East longitude (from +119 to +180).
MMI Modified Mercalli Scale Intensity (MMI) is given in Roman Numerals. Values range from I to XII. (Roman Numerals were converted to numbers in the digital database. Values range from 1 to 12.) Macroseismic information is compiled from various sources including newspaper articles, foreign broadcasts, U.S. Geological Survey Earthquake reports and seismological station reports.
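A small helper illustrating the Roman-numeral-to-integer conversion used in the digital database (a sketch, not the original conversion code).

```python
# Map the Modified Mercalli Intensity Roman numerals (I-XII) to the integers
# 1-12 used in the digital database.
MMI_VALUES = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X", "XI", "XII"]
MMI_TO_INT = {numeral: value for value, numeral in enumerate(MMI_VALUES, start=1)}

print(MMI_TO_INT["VII"])  # 7
```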
State Code Numerical identifier for the state, province, or country in which the earthquake was reported (felt) by residents: 01 Alabama 02 Alaska 03 Arizona 04 Arkansas 05 California 07 Colorado 08 Connecticut 09 Delaware 10 District of Columbia 11 Florida 12 Georgia 14 Hawaii 15 Idaho 16 Illinois 17 Indiana 18 Iowa 19 Kansas 20 Kentucky 21 Louisiana 22 Maine 23 Maryland 24 Massachusetts 25 Michigan 26 Minnesota 27 Mississippi 28 Missouri 29 Montana 30 Nebraska 31 Nevada 32 New Hampshire 33 New Jersey 34 New Mexico 35 New York 36 North Carolina 37 North Dakota 38 Ohio 39 Oklahoma 40 Oregon 41 Pennsylvania 42 Puerto Rico 43 Rhode Island 45 South Carolina 46 South Dakota 47 Tennessee 48 Texas 49 Utah 50 Vermont 51 Virginia 52 Virgin Islands 54 Washington 55 West Virginia 56 Wisconsin 57 Wyoming 58 West Indies 74 Panama 75 Philippine Is. 80 Mexico 81 Baja California 90 Canada 91 Alberta 92 Manitoba 93 Saskatchewan 94 British Columbia 95 Ontario 96 New Brunswick 97 Quebec 98 Nova Scotia 99 Yukon Territory
City Name City (or locality) in which the earthquake was reported (felt) by residents.
Data Source
This is a code referring to the source of one or more of the reported parameters (e.g., epicenter, city, and intensity).
A = Source unknown; 1925 earthquake in Boston area (reports not listed in source H).
B = Report by Bollinger and Stover, 1976.
C = Quarterly Seismological Reports, 1925-27.
D = Source unknown; 1937-1977 earthquakes in Hawaii, California, and the eastern U.S.
H = Earthquake History of the United States (Coffman and others, 1982).
K = Report by Carnegie Institution, 1908, 1910.
M = Source unknown; 1899-1912 earthquakes in Alaska.
N = Report by Nuttli, 1973.
Q = Abstracts of Earthquake Reports for the United States, 1933-70.
S = Unpublished report by Nina Scott, 1965.
T = Source unknown; 1872-1904 earthquakes along U.S. west coast.
U = United States Earthquakes, 1928-85.
W = Monthly Weather Service Seismological Reports, 1914-24.
https://www.icpsr.umich.edu/web/ICPSR/studies/9516/terms
Public Law 94-171, enacted in 1975, requires the Census Bureau to provide redistricting data in a format requested by state governments. Within one year following the Decennial Census (by April 1, 1991), the Census Bureau must provide the governor and legislature of each state with the population data needed to redraw legislative districts. To meet this requirement, the Census Bureau established a voluntary program to allow states to receive data for voting districts (e.g., election precincts, city wards) in addition to standard census geographic areas such as counties, cities, census tracts, and blocks. These files contain data for voting districts for those counties for which a state outlined voting district boundaries around a set of census blocks on census maps, in accordance with the guidelines of the program. Each state file provides data for the state and its subareas in the following order: state, county, voting district, county subdivision, place, census tract, block group, and block. Additionally, complete summaries are provided for the following geographic areas: county subdivision, place, consolidated city, state portion of American Indian and Alaska Native area, and county portion of American Indian and Alaska Native area. Area characteristics such as land area, water area, latitude, and longitude are provided. Summary statistics are provided for all persons and housing units in the geographic areas. Counts by race and by Hispanic and non-Hispanic origin are also given.
https://borealisdata.ca/api/datasets/:persistentId/versions/1.2/customlicense?persistentId=doi:10.5683/SP3/KKP7TP
The database includes ZIP code, city name, alias city name, state code, phone area code, city type, county name, country FIPS, time zone, daylight saving flag, latitude, longitude, county elevation, Metropolitan Statistical Area (MSA), Primary Metropolitan Statistical Area (PMSA), Core Based Statistical Area (CBSA), and census 2000 data on population by race, average household income, and average house value.
This dataset displays all the hazardous waste sites in the United States and its Territories as of 5.08. The data comes from the Agency for Toxic Substances and Disease Registry (ATSDR). The dataset contains information about each site: Site ID Site Name CERCLIS # Address City State County Latitude Longitude Population Region # Congressional Districts Federal Facility National Priorities List Status Ownership Status Classification. For more information, go to the Agency for Toxic Substances and Disease Registry (ATSDR) website at http://www.atsdr.cdc.gov
This Wyoming Cities coverage contains data for 109 Wyoming towns, cities, and Census Designated Places (CDPs). The coverage was created from U.S. Census Bureau TIGER data. It contains many of the same attributes as the Census Bureau .dbf files, but there are a few modifications. A code has been added to distinguish towns, cities, and CDPs. A Countyseat item has also been added to provide a way to display the county seats only. Latitude/Longitude coordinates were also converted from the Census Bureau format to be used in GENERATE.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset includes all tables which can be imported into the SQL database under the GeoAPEXOL interface. The dataset includes location (zipcode, city, state, latitude, longitude), soil (all variables required to set up the APEX model, with values extracted from the SSURGO database for the US), climate (datasets for over 2700 stations across the US from 1974 to 2013, including daily observed values for precipitation and temperature as well as monthly statistics for running CLIGEN), and management (template management files for normal agricultural operations and some non-structural BMP parameters). Currently, the datasets are prepared for the contiguous US.
https://www.gnu.org/licenses/gpl-3.0.html
This dataset is the output of an advanced vehicular simulation performed using the VEINS OMNeT++ framework. The simulation was designed to model realistic vehicle dynamics, sensor readings, and communication parameters over a network of urban roadways connecting major US cities. The dataset provides a comprehensive view of vehicular performance and environmental conditions over a 30-day period for 500 vehicles.
Authors: Muhammad Ali & Tariq Qayyum
Below is a detailed description of each feature included in the dataset:
Temporal and Geospatial Data
VehicleID: A unique identifier for each vehicle in the simulation.
Timestamp: The exact date and time for each record, allowing temporal analysis of vehicle behavior.
Day, Hour, Minute, Second: These fields break down the timestamp into finer granularity for detailed time-series analysis.
Latitude and Longitude: The real-time geographic coordinates of the vehicle as it progresses along its route.
StartingPointLatitude and StartingPointLongitude: The coordinates of the vehicle's origin (selected from major US cities).
DestinationLatitude and DestinationLongitude: The target coordinates where the vehicle is headed.
Mobility and Performance Metrics
Speed: The current speed of the vehicle (in km/h), adjusted based on simulated traffic, weather conditions, and road quality.
Direction: The vehicle's heading in degrees, indicating its travel direction.
Odometer: The cumulative distance traveled by the vehicle during the simulation.
VehicleType: Categorical variable indicating the type of vehicle (Car, Truck, Bus, Motorcycle).
VehicleAge: The age of the vehicle, measured in years.
EngineTemperature: The simulated engine temperature (in °C), which responds dynamically to vehicle speed.
FuelLevel: The remaining fuel level, tracking consumption over distance traveled.
BatteryLevel: The current battery level, which decreases based on speed and usage.
TirePressure, BrakeFluidLevel, CoolantLevel, OilLevel, WiperFluidLevel: These parameters simulate vehicle maintenance metrics, reflecting system status during operation.
Environmental and Traffic Conditions
WeatherCondition: The prevailing weather conditions (Clear, Rain, Fog, or Snow) set for each simulation day.
RoadCondition: A categorical assessment of road quality (Good, Moderate, or Poor).
TrafficDensity: The simulated traffic level at each time step (Low, Medium, or High), affecting vehicle speed and behavior.
Communication and Processing Attributes
CPU_Available and Memory_Available: Simulated computational resource levels that might impact on-board processing and task execution.
NetworkLatency: The simulated network latency in milliseconds, indicating communication delays.
SignalStrength: The wireless signal strength measured in dBm.
GPSStatus, WiFiStatus, BluetoothStatus, CellularStatus: Boolean flags indicating the operational status of various communication systems on the vehicle.
RadarStatus, LidarStatus, CameraStatus, IMUStatus: Status indicators for the vehicle’s sensor suite, providing situational awareness.
TaskType: The category of a computational or communication task being executed (Navigation, Entertainment, DataAnalysis, or Safety).
TaskSize: A numeric value representing the computational or data size requirement of the task.
TaskPriority: The priority level of the task (High, Medium, or Low), potentially affecting its execution order.
TaskOffloaded: A boolean flag indicating whether a task was offloaded, influenced by the vehicle’s battery level and simulated decision-making.
Additional Vehicle Features
HeadlightStatus: A boolean indicating if the headlights are on, based on the time of day and weather conditions.
BrakeLightStatus: Indicates whether brake lights are activated when the vehicle is decelerating.
TurnSignalStatus: A boolean showing if the vehicle’s turn signal is active.
HazardLightStatus: Indicates the status of the hazard lights.
ABSStatus: Reflects the functioning status of the Anti-lock Braking System.
AirbagStatus: Indicates whether the airbag system is active (typically remains enabled during the simulation).
Simulation Environment
The dataset encapsulates a realistic simulation of vehicular movement and behavior in urban settings. Key aspects include:
Dynamic Speed Adjustment: Vehicles adjust speeds based on time-of-day (e.g., rush hour), weather impacts, and traffic density.
Resource Management: The simulation models the gradual depletion of battery, fuel, and other vehicular maintenance parameters.
Sensor and Communication Systems: Real-time statuses of sensor suites and communication modules are recorded, emulating modern connected vehicles.
Route Progression: Each vehicle's journey is tracked from a starting city to a destination, with geospatial progression computed using a distance estimation formula.
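The route progression above relies on a distance estimation formula; the exact formula used by the simulation is not specified here, but a common choice for straight-line distance between latitude/longitude points is the haversine great-circle formula, sketched below as an assumption.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

# Example: approximate distance between two major US cities (New York -> Chicago).
print(round(haversine_km(40.7128, -74.0060, 41.8781, -87.6298)))  # ~1144 km
```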
Potential Applications
Researchers and practitioners can utilize thi...