The datasets are split by census block, city, county, district, province, and state. A typical dataset includes the fields below.
Column number, Data attribute, Description
1, device_id, hashed anonymized unique id per moving device
2, origin_geoid, geohash id of the origin grid cell
3, destination_geoid, geohash id of the destination grid cell
4, origin_lat, origin latitude with 4-to-5 decimal precision
5, origin_long, origin longitude with 4-to-5 decimal precision
6, destination_lat, destination latitude with 5-to-6 decimal precision
7, destination_lon, destination longitude with 5-to-6 decimal precision
8, start_timestamp, start timestamp / local time
9, end_timestamp, end timestamp / local time
10, origin_shape_zone, customer provided origin shape id, zone or census block id
11, destination_shape_zone, customer provided destination shape id, zone or census block id
12, trip_distance, inferred distance traveled in meters, as the crow flies
13, trip_duration, inferred duration of the trip in seconds
14, trip_speed, inferred speed of the trip in meters per second
15, hour_of_day, hour of day of trip start (0-23)
16, time_period, time period of trip start (morning, afternoon, evening, night)
17, day_of_week, day of week of trip start (mon, tue, wed, thu, fri, sat, sun)
18, year, year of trip start
19, iso_week, iso week of the trip
20, iso_week_start_date, start date of the iso week
21, iso_week_end_date, end date of the iso week
22, travel_mode, mode of travel (walking, driving, bicycling, etc.)
23, trip_event, trip or segment events (start, route, end, start-end)
24, trip_id, trip identifier (unique for each batch of results)
25, origin_city_block_id, census block id for the trip origin point
26, destination_city_block_id, census block id for the trip destination point
27, origin_city_block_name, census block name for the trip origin point
28, destination_city_block_name, census block name for the trip destination point
29, trip_scaled_ratio, ratio used to scale up each trip, for example, a trip_scaled_ratio value of 10 means that 1 original trip was scaled up to 10 trips
30, route_geojson, geojson line representing trip route trajectory or geometry
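As a rough illustration of how trip_scaled_ratio might be used, the sketch below sums the ratio per origin/destination pair to estimate scaled trip counts, and recomputes speed from trip_distance and trip_duration. It assumes one row per trip and a header row using the attribute names listed above; the sample rows and the function names are made up for the example.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows; only a few of the listed columns are included.
SAMPLE = """device_id,origin_geoid,destination_geoid,trip_distance,trip_duration,trip_scaled_ratio
a1,9q8yy,9q8yz,1200,600,10
b2,9q8yy,9q8yz,800,400,5
c3,9q8yz,9q8yy,3000,0,2
"""


def scaled_trip_counts(rows):
    """Sum trip_scaled_ratio per (origin_geoid, destination_geoid) pair.

    Assumes each row is one observed trip, so the sum estimates the
    scaled-up number of trips between the two grid cells.
    """
    totals = defaultdict(float)
    for row in rows:
        key = (row["origin_geoid"], row["destination_geoid"])
        totals[key] += float(row["trip_scaled_ratio"])
    return dict(totals)


def speed_mps(row):
    """Return trip_distance / trip_duration in m/s, or None if duration is not positive."""
    duration = float(row["trip_duration"])
    if duration <= 0:
        return None
    return float(row["trip_distance"]) / duration


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
counts = scaled_trip_counts(rows)
```

Because trip_distance is described as a crow-flies distance, a speed derived this way is a lower bound on the speed along the actual route.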
The datasets can be processed and enhanced to also include places, POI visitation patterns, hour-of-day patterns, weekday patterns, weekend patterns, dwell time inferences, and macro movement trends.
The dataset is delivered as gzipped CSV archive files that are uploaded to your AWS S3 bucket upon request.
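Once a delivered file has been copied from the bucket to local storage, it can be streamed without unpacking it first. The following is a minimal sketch using only the Python standard library; it assumes UTF-8 encoding and a header row, neither of which is stated above, and the function name is hypothetical.

```python
import csv
import gzip
import io


def read_gzipped_csv(path_or_file):
    """Yield one dict per CSV row from a gzip-compressed file.

    Accepts a filesystem path or a binary file object. Rows are
    decompressed and parsed lazily, so large files are never held
    in memory as a whole.
    """
    with gzip.open(path_or_file, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle)


# Usage with an in-memory archive standing in for a downloaded file.
archive = io.BytesIO(gzip.compress(b"trip_id,travel_mode\nt1,walking\n"))
rows = list(read_gzipped_csv(archive))
```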
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
All cities with a population > 1000 or seats of administrative divisions (ca. 80,000).
Sources and contributions:
Sources: GeoNames aggregates over a hundred different data sources.
Ambassadors: GeoNames Ambassadors help in many countries.
Wiki: a wiki allows users to view the data, quickly fix errors, and add missing places.
Donations and sponsoring: costs for running GeoNames are covered by donations and sponsoring.
Enrichment: add country name.
https://en.wikipedia.org/wiki/Public_domain
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The primary legal divisions of most states are termed counties. In Louisiana, these divisions are known as parishes. In Alaska, which has no counties, the equivalent entities are the organized boroughs, city and boroughs, municipalities, and for the unorganized area, census areas. The latter are delineated cooperatively for statistical purposes by the State of Alaska and the Census Bureau. In four states (Maryland, Missouri, Nevada, and Virginia), there are one or more incorporated places that are independent of any county organization and thus constitute primary divisions of their states. These incorporated places are known as independent cities and are treated as equivalent entities for purposes of data presentation. The District of Columbia and Guam have no primary divisions, and each area is considered an equivalent entity for purposes of data presentation. The Census Bureau treats the following entities as equivalents of counties for purposes of data presentation: Municipios in Puerto Rico, Districts and Islands in American Samoa, Municipalities in the Commonwealth of the Northern Mariana Islands, and Islands in the U.S. Virgin Islands. The entire area of the United States, Puerto Rico, and the Island Areas is covered by counties or equivalent entities. The boundaries for counties and equivalent entities are as of January 1, 2017, primarily as reported through the Census Bureau's Boundary and Annexation Survey (BAS).
This dataset reflects reported incidents of crime (with the exception of murders where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. Data is extracted from the Chicago Police Department's CLEAR (Citizen Law Enforcement Analysis and Reporting) system. In order to protect the privacy of crime victims, addresses are shown at the block level only and specific locations are not identified. Should you have questions about this dataset, you may contact the Research & Development Division of the Chicago Police Department at 312.745.6071 or RandD@chicagopolice.org. Disclaimer: These crimes may be based upon preliminary information supplied to the Police Department by the reporting parties that have not been verified. The preliminary crime classifications may be changed at a later date based upon additional investigation and there is always the possibility of mechanical or human error. Therefore, the Chicago Police Department does not guarantee (either expressed or implied) the accuracy, completeness, timeliness, or correct sequencing of the information and the information should not be used for comparison purposes over time. The Chicago Police Department will not be responsible for any error or omission, or for the use of, or the results obtained from the use of this information. All data visualizations on maps should be considered approximate and attempts to derive specific addresses are strictly prohibited. The Chicago Police Department is not responsible for the content of any off-site pages that are referenced by or that reference this web page other than an official City of Chicago or Chicago Police Department web page. The user specifically acknowledges that the Chicago Police Department is not responsible for any defamatory, offensive, misleading, or illegal conduct of other users, links, or third parties and that the risk of injury from the foregoing rests entirely with the user. 
The unauthorized use of the words "Chicago Police Department," "Chicago Police," or any colorable imitation of these words or the unauthorized use of the Chicago Police Department logo is unlawful. This web page does not, in any way, authorize such use. Data is updated daily Tuesday through Sunday. The dataset contains more than 65,000 records/rows of data and cannot be viewed in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as Wordpad, to view and search. To access a list of Chicago Police Department - Illinois Uniform Crime Reporting (IUCR) codes, go to http://data.cityofchicago.org/Public-Safety/Chicago-Police-Department-Illinois-Uniform-Crime-R/c7ck-438e
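For a file too large for a spreadsheet, the exported CSV can also be streamed row by row. This is only a sketch: the column names "Primary Type" and "Block" are assumptions about the export's header, the sample rows are invented, and the function names are hypothetical.

```python
import csv
import io
from collections import Counter

# Invented rows standing in for the exported file.
SAMPLE_CSV = """ID,Block,Primary Type
1,001XX N STATE ST,THEFT
2,002XX S CLARK ST,BATTERY
3,001XX N STATE ST,THEFT
"""


def count_by_column(handle, column):
    """Count how often each value of `column` occurs, reading one row at a time."""
    return Counter(row[column] for row in csv.DictReader(handle))


def rows_matching(handle, column, needle):
    """Return rows whose `column` contains `needle` (case-insensitive)."""
    needle = needle.lower()
    return [row for row in csv.DictReader(handle) if needle in row[column].lower()]


type_counts = count_by_column(io.StringIO(SAMPLE_CSV), "Primary Type")
state_st = rows_matching(io.StringIO(SAMPLE_CSV), "Block", "state st")
```

With a real export, `open("crimes.csv", newline="", encoding="utf-8")` (file name hypothetical) would replace the `io.StringIO` objects.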
https://creativecommons.org/publicdomain/zero/1.0/
The AQS Data Mart is a database containing all of the information from AQS. It has every measured value the EPA has collected via the national ambient air monitoring program. It also includes the associated aggregate values calculated by EPA (8-hour, daily, annual, etc.). The AQS Data Mart is a copy of AQS made once per week and made accessible to the public through web-based applications. The intended users of the Data Mart are air quality data analysts in the regulatory, academic, and health research communities. It is intended for those who need to download large volumes of detailed technical data stored at EPA and does not provide any interactive analytical tools. It serves as the back-end database for several Agency interactive tools that could not fully function without it: AirData, AirCompare, The Remote Sensing Information Gateway, the Map Monitoring Sites KML page, etc.
AQS must maintain constant readiness to accept data and meet high data integrity requirements, thus is limited in the number of users and queries to which it can respond. The Data Mart, as a read only copy, can allow wider access.
The most commonly requested aggregation levels of data (and key metrics in each) are:
Sample Values (2.4 billion values back as far as 1957, national consistency begins in 1980, data for 500 substances routinely collected)
- The sample value converted to standard units of measure (generally 1-hour averages as reported to EPA, sometimes 24-hour averages)
- Local Standard Time (LST) and GMT timestamps
- Measurement method
- Measurement uncertainty, where known
- Any exceptional events affecting the data
NAAQS Averages
- NAAQS average values (8-hour averages for ozone and CO, 24-hour averages for PM2.5)
Daily Summary Values (each monitor has the following calculated each day)
- Observation count
- Observation per cent (of expected observations)
- Arithmetic mean of observations
- Max observation and time of max
- AQI (air quality index) where applicable
- Number of observations > Standard where applicable
Annual Summary Values (each monitor has the following calculated each year)
- Observation count and per cent
- Valid days
- Required observation count
- Null observation count
- Exceptional values count
- Arithmetic Mean and Standard Deviation
- 1st - 4th maximum (highest) observations
- Percentiles (99, 98, 95, 90, 75, 50)
- Number of observations > Standard
Site and Monitor Information
- FIPS State Code (the first 5 items on this list make up the AQS Monitor Identifier)
- FIPS County Code
- Site Number (unique within the county)
- Parameter Code (what is measured)
- POC (Parameter Occurrence Code) to distinguish from different samplers at the same site
- Latitude
- Longitude
- Measurement method information
- Owner / operator / data-submitter information
- Monitoring Network to which the monitor belongs
- Exemptions from regulatory requirements
- Operational dates
- City and CBSA where the monitor is located
Quality Assurance Information
- Various data fields related to the 19 different QA assessments possible
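To make the daily summary values listed above concrete, here is a minimal sketch of how a few of them could be derived from one day of hourly sample values. It is not EPA's actual procedure: the expected count of 24, the treatment of missing hours, and the result key names are all assumptions made for the example.

```python
from statistics import fmean


def daily_summary(hourly_values, expected_count=24):
    """Summarize one monitor-day of samples.

    hourly_values maps hour of day (0-23) to a measured value, with
    None (or a missing key) for hours that have no valid observation.
    """
    observed = {hour: v for hour, v in hourly_values.items() if v is not None}
    count = len(observed)
    if count == 0:
        return {
            "observation_count": 0,
            "observation_percent": 0.0,
            "arithmetic_mean": None,
            "max_value": None,
            "max_hour": None,
        }
    # On ties, the first hour in the input's iteration order is reported.
    max_hour = max(observed, key=observed.get)
    return {
        "observation_count": count,
        "observation_percent": 100.0 * count / expected_count,
        "arithmetic_mean": fmean(observed.values()),
        "max_value": observed[max_hour],
        "max_hour": max_hour,
    }


# Four expected observations, one of them missing.
summary = daily_summary({0: 10.0, 1: 20.0, 2: None, 3: 30.0}, expected_count=4)
```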
You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.epa_historical_air_quality.[TABLENAME]. Fork this kernel to get started.
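A query against one of these tables might look like the sketch below. The table name passed in at the end and the column names state_name, arithmetic_mean, and date_local are assumptions about the schema and should be checked against the table you choose. Running the query needs the google-cloud-bigquery package and valid credentials, so that import is kept inside `run_query`; both function names are hypothetical.

```python
import re

DATASET = "bigquery-public-data.epa_historical_air_quality"


def build_daily_mean_query(table_name, year, limit=10):
    """Build SQL that averages arithmetic_mean per state for one year.

    Column names are assumed, not taken from the dataset description.
    """
    if not re.fullmatch(r"[A-Za-z0-9_]+", table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    return f"""
        SELECT state_name, AVG(arithmetic_mean) AS mean_value
        FROM `{DATASET}.{table_name}`
        WHERE EXTRACT(YEAR FROM date_local) = {int(year)}
        GROUP BY state_name
        ORDER BY mean_value DESC
        LIMIT {int(limit)}
    """


def run_query(sql):
    """Execute the SQL with the BigQuery client and return rows as dicts."""
    from google.cloud import bigquery  # requires google-cloud-bigquery

    client = bigquery.Client()
    return [dict(row) for row in client.query(sql).result()]


# Table name is an assumed example; substitute the table you need.
sql = build_daily_mean_query("pm25_frm_daily_summary", 2015)
```

Calling `run_query(sql)` in an authenticated environment would then return a list of dictionaries, one per state.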
Data provided by the US Environmental Protection Agency Air Quality System Data Mart.
https://creativecommons.org/publicdomain/zero/1.0/
All BPD data on Open Baltimore is preliminary data and subject to change. The information presented through Open Baltimore represents Part I victim based crime data. The data do not represent statistics submitted to the FBI's Uniform Crime Report (UCR); therefore any comparisons are strictly prohibited. For further clarification of UCR data, please visit http://www.fbi.gov/about-us/cjis/ucr/ucr. Please note that this data is preliminary and subject to change. Prior month data is likely to show changes when it is refreshed on a monthly basis. All data is geocoded to the approximate latitude/longitude location of the incident and excludes those records for which an address could not be geocoded. Any attempt to match the approximate location of the incident to an exact address is strictly prohibited.
This dataset was kindly made available by the City of Baltimore. You can find the original dataset, which is updated regularly, here.
This dataset contains shapefile boundaries for CA State, counties and places from the US Census Bureau's 2023 MAF/TIGER database. Current geography in the 2023 TIGER/Line Shapefiles generally reflects the boundaries of governmental units in effect as of January 1, 2023.
Timeseries data from 'South Bay Ocean Outfall Real Time Mooring' (south-bay-ocean-outfall)
cdm_altitude_proxy=z
cdm_data_type=TimeSeriesProfile
cdm_profile_variables=time
cdm_timeseries_variables=station,longitude,latitude
contributor_email=feedback@axiomdatascience.com
contributor_name=Axiom Data Science
contributor_role=processor
contributor_role_vocabulary=NERC
contributor_url=https://www.axiomdatascience.com
Conventions=IOOS-1.2, CF-1.6, ACDD-1.3, NCCSV-1.2
defaultDataQuery=sea_water_ph_reported_on_total_scale_internal_qc_agg,cdom_qc_agg,sea_water_turbidity,sea_water_ph_reported_on_total_scale_external,sea_water_temperature,sea_water_ph_reported_on_total_scale_external_qc_agg,sea_water_practical_salinity_qc_agg,mole_concentration_of_nitrate_in_sea_water,sea_water_ph_reported_on_total_scale_internal,cdom,mass_concentration_of_chlorophyll_in_sea_water_qc_agg,mole_fraction_of_carbon_dioxide_in_sea_water_in_wet_gas_qc_agg,northward_sea_water_velocity,mole_concentration_of_nitrate_in_sea_water_qc_agg,sea_water_turbidity_qc_agg,mole_fraction_of_carbon_dioxide_in_sea_water_in_wet_gas,sea_water_practical_salinity,biochemicaloxygendemand_dep,eastward_sea_water_velocity,mass_concentration_of_oxygen_in_sea_water,mass_concentration_of_chlorophyll_in_sea_water,sea_water_temperature_qc_agg,biochemicaloxygendemand_dep_qc_agg,z,eastward_sea_water_velocity_qc_agg,time,northward_sea_water_velocity_qc_agg,mass_concentration_of_oxygen_in_sea_water_qc_agg&time>=max(time)-3days
Easternmost_Easting=-117.18631
featureType=TimeSeriesProfile
geospatial_lat_max=32.53171
geospatial_lat_min=32.53171
geospatial_lat_units=degrees_north
geospatial_lon_max=-117.18631
geospatial_lon_min=-117.18631
geospatial_lon_units=degrees_east
geospatial_vertical_max=0.0
geospatial_vertical_min=-37.4
geospatial_vertical_positive=up
geospatial_vertical_units=m
history=Downloaded from City of San Diego Public Utilities Department at http://mooring.ucsd.edu/sboo/sboo_01/csv
id=111243
infoUrl=https://sensors.ioos.us/#metadata/111243/station
institution=City of San Diego Public Utilities Department
naming_authority=com.axiomdatascience
Northernmost_Northing=32.53171
platform=buoy
platform_name=South Bay Ocean Outfall Real Time Mooring
platform_vocabulary=http://mmisw.org/ont/ioos/platform
processing_level=Level 2
references=http://mooring.ucsd.edu/sboo/sboo_01/,http://mooring.ucsd.edu/sboo/sboo_01/csv,
sourceUrl=http://mooring.ucsd.edu/sboo/sboo_01/csv
Southernmost_Northing=32.53171
standard_name_vocabulary=CF Standard Name Table v72
station_id=111243
time_coverage_end=2025-03-26T20:04:36Z
time_coverage_start=2021-11-03T22:00:00Z
Westernmost_Easting=-117.18631