Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This table contains data on the annual miles traveled by place of occurrence and by mode of transportation (vehicle, pedestrian, bicycle), for California, its regions, counties, and cities/towns. The ratio uses data from the California Department of Transportation, the U.S. Department of Transportation, and the U.S. Census Bureau. The table is part of a series of indicators in the Healthy Communities Data and Indicators Project of the Office of Health Equity. Miles traveled by individuals and their choice of mode – car, truck, public transit, walking or bicycling – have a major impact on mobility and population health. Miles traveled by automobile offers extraordinary personal mobility and independence, but it is also associated with air pollution, greenhouse gas emissions linked to global warming, road traffic injuries, and sedentary lifestyles. Active modes of transport – bicycling and walking alone and in combination with public transit – offer opportunities for physical activity, which has many documented health benefits. More information about the data table and a data dictionary can be found in the About/Attachments section.
https://creativecommons.org/publicdomain/zero/1.0/
Have you taken a flight in the U.S. in the past 15 years? If so, then you are a part of monthly data that the U.S. Department of Transportation's TranStats service makes available on various metrics for 15 U.S. airlines and 30 major U.S. airports. Their website unfortunately does not include a method for easily downloading and sharing files. Furthermore, the source is built in ASP.NET, so extracting the data is rather cumbersome. To allow easier community access to this rich source of information, I scraped the metrics for every airline / airport combination and stored them in separate CSV files.
Occasionally, an airline doesn't serve a certain airport, or it didn't serve it for the entire duration that the data collection period covers*. In those cases, the data either doesn't exist or is typically too sparse to be of much use. As such, I've only uploaded complete files for airports that an airline served for the entire uninterrupted duration of the collection period. For these files, there should be 174 time series points for one or more of the nine columns below. I recommend any of the files for American, Delta, or United Airlines for outstanding examples of complete and robust airline data.
* No data for Atlas Air exists, and Virgin America commenced service in 2007, so no folders for either airline are included.
There are 13 airlines that have at least one complete dataset. Each airline's folder includes CSV file(s) for each airport that are complete as defined by the above criteria. I've double-checked the files, but if you find one that violates the criteria, please point it out. The file names have the format "AIRLINE-AIRPORT.csv", where both AIRLINE and AIRPORT are IATA codes. For a full listing of the airlines and airports that the codes correspond to, check out the airline_codes.csv or airport_codes.csv files that are included, or perform a lookup here. Note that the data in each airport file represents metrics for flights that originated at the airport.
Among the 13 airlines in data.zip, there are a total of 161 individual datasets. There are also two special folders included - airlines_all_airports.csv and airports_all_airlines.csv. The first contains datasets for each airline aggregated over all airports, while the second contains datasets for each airport aggregated over all airlines. To preview a sample dataset, check out all_airlines_all_airports.csv, which contains industry-wide data.
Each file includes the following metrics for each month from October 2002 to March 2017:
* Frequently contains missing values
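As a quick sanity check on the completeness criterion, the 174 points quoted above can be reproduced from the stated collection period (October 2002 to March 2017). The sketch below is only an illustration: the helper names `months_inclusive` and `is_complete` are hypothetical, and it assumes missing values have already been read in as `None`.

```python
from datetime import date


def months_inclusive(start: date, end: date) -> int:
    """Count calendar months from start's month through end's month, inclusive."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


# Collection period stated in the description: October 2002 to March 2017.
EXPECTED_POINTS = months_inclusive(date(2002, 10, 1), date(2017, 3, 1))


def is_complete(series) -> bool:
    """Assumed check: a column is complete if it has one non-missing value per month."""
    values = [v for v in series if v is not None]
    return len(values) == EXPECTED_POINTS
```

A column that fails this check would fall under the "too sparse" case described earlier.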
Thanks to the U.S. Department of Transportation for collecting this data every month and making it publicly available to us all.
Source: https://www.transtats.bts.gov/Data_Elements.aspx
The airline / airport datasets are perfect for practicing and/or testing time series forecasting with classic statistical models such as autoregressive integrated moving average (ARIMA), or modern deep learning techniques such as long short-term memory (LSTM) networks. The datasets typically show evidence of trends, seasonality, and noise, so modeling and accurate forecasting can be challenging, but still more tractable than time series problems possessing more stochastic elements, e.g. stocks, currencies, commodities, etc. The source releases new data each month, so feel free to check your models' performances against new data as it comes out. I will update the files here every 3 to 6 months depending on how things go.
A future plan is to build a SQLite database so a vast array of queries can be run against the data. The data in its current time series format is not conducive to this, so coming up with a workable structure for the tables is the first step towards this goal. If you have any suggestions for how I can improve the data presentation, or anything that you would like me to add, please let me know. Looking forward to seeing the questions that we can answer together!
Southern California residents were rudely awakened Sunday morning June 28, 1992 at 04:57 am (June 28 at 11:57 GMT), by an earthquake of magnitude 7.6 (Ms) followed by a smaller 6.7 (Ms) magnitude earthquake about three hours later (June 28 at 15:05 GMT). The largest shock occurred approximately 6 miles southwest of Landers, California and 110 miles east of Los Angeles. The second earthquake was centered approximately 8 miles southeast of Big Bear City in the San Bernardino Mountains near Barton Flats. A distance of 17 miles and 7,000 feet in elevation separate the two earthquake locations.
The Harvard Forest is a collection of five properties, totaling about 1500 hectares, in Petersham, Massachusetts. Petersham is a rural town in Worcester County, Massachusetts, about 60 miles west of Boston. It is largely in the Swift River Watershed, and lies near the center of a twenty-mile wide band of hilly uplands that form the eastern edge of the Connecticut Valley. The north part of the town is rolling and the south more distinctly hilly; the lowest basins are about 200 m above sea level, the flats around 400 m. The climate is cool temperate. Petersham, like many of the adjacent towns, was settled in the early 18th century, extensively cleared and farmed in the next hundred years, and then progressively abandoned after about 1830. Reforestation proceeded quickly, and by the time of the first Harvard Forest maps in 1909 HF was almost entirely wooded. The common forest types are dominated, variously, by red oak, red maple, white pine, or hemlock. Most are of low or average fertility and under 100 years old. Hemlock is now locally dominant in many stands that have been continuously forested; oaks, red maples and pines are the common dominants in stands that developed in old fields.
Our model is a full-annual-cycle population model \citep{hostetler2015full} that tracks groups of bats surviving through four seasons: breeding season/summer, fall migration, non-breeding/winter, and spring migration. Our state variables are groups of bats that use a specific maternity colony/breeding site and hibernaculum/non-breeding site. Bats are also accounted for by life stage (juveniles/first-year breeders versus adults) and seasonal habitat (breeding versus non-breeding) during each year. This leads to four state variables (here depicted in vector notation): the population of juveniles during the non-breeding season, the population of adults during the non-breeding season, the population of juveniles during the breeding season, and the population of adults during the breeding season. Each vector's elements depict a specific migratory pathway; that is, each element corresponds to one combination of a non-breeding site and a breeding site. The variables may be summed by either breeding site or non-breeding site to calculate the total population using a specific geographic location. Within our code, we account for this using an index column for breeding sites and an index column for non-breeding sites within the data table. Our choice of state variables caused the time step (i.e., $t$) to be 1 year. However, we recorded the population of each group during the breeding and non-breeding season as an artifact of our state-variable choice. We chose these state variables partially for their biological information and partially to simplify programming. We ran our simulation for 30 years because the USFWS currently issues Indiana Bat take permits for 30 years. Our model covers the range of the Indiana Bat, which is approximately the eastern half of the contiguous United States (Figure \ref{fig:BatInput}). The boundaries of our range were based upon the United States boundary, the NatureServe Range map, and observations of the species.
The maximum migration distance was 500 km, which was based upon field observations reported in the literature \citep{gardner2002seasonal, winhold2006aspects}. The landscape was covered with approximately 33,000 grid cells of 6475 ha each, and the grid size was based upon management considerations. The U.S.~Fish and Wildlife Service considers a 2.5-mile radius around a known maternity colony to be its summer habitat range and all of the hibernacula within a 2.5-mile radius to be a single management unit. Hence the choice of 5-by-5 mile square grid cells (25 miles$^2$ or 6475 ha). Each group of bats within the model has a summer and winter grid cell as well as a pathway connecting the cells. It is possible for a group to be in the same cell for both seasons, but improbable for females (which we modeled). The straight line between summer and winter cells was buffered with different distances (1 km, 2 km, 10 km, 20 km, 100 km, and 200 km) as part of the turbine sensitivity and uncertainty analysis. We dropped the largest two buffer sizes during the model development process because they were biologically unrealistic and including them caused all populations to go extinct all of the time. Note that a 1-km buffer would be a 2-km wide path. An example of two pathways is included in Figure \ref{fig:BatPath}. The buffers account for bats not migrating in a straight line. If we had precise locations for all summer maternity colonies, other approaches such as Circuitscape \citep{hanks2013circuit} could have been used to model migration routes and this would have reduced migration uncertainty.
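The grid-cell arithmetic above can be checked with a short sketch. It only reproduces the unit conversion (a 5-by-5 mile cell expressed in hectares) and the relationship between buffer distance and path width; the function names are hypothetical and are not taken from the authors' code.

```python
MILE_IN_M = 1609.344   # international mile, in meters
HA_IN_M2 = 10_000.0    # one hectare, in square meters


def square_cell_area_ha(side_miles: float) -> float:
    """Area in hectares of a square grid cell with the given side length in miles."""
    side_m = side_miles * MILE_IN_M
    return side_m ** 2 / HA_IN_M2


def corridor_width_km(buffer_km: float) -> float:
    """A buffer of b km on each side of a straight line gives a path 2*b km wide."""
    return 2.0 * buffer_km


# A 2.5-mile radius spans 5 miles, hence the 5-by-5 mile cell.
cell_ha = square_cell_area_ha(5.0)
```

Rounding `cell_ha` to the nearest hectare gives the 6475 ha quoted in the text.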
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Feature layer containing authoritative greenway mile marker points for Sioux Falls, South Dakota.
NOAA is responsible for depicting on its nautical charts the limits of the 12 nautical mile Territorial Sea, 24 nautical mile Contiguous Zone, and 200 nautical mile Exclusive Economic Zone (EEZ). The outer limit of each of these zones is measured from the U.S. normal baseline, which coincides with the low water line depicted on NOAA charts and includes closing lines across the entrances of lega...
We present synthetic spectral energy distributions (SEDs) for single-age, single-metallicity stellar populations (SSPs) covering the full optical spectral range at moderately high resolution [full width at half-maximum (FWHM) = 2.3 Å]. These SEDs constitute our base models, as they combine scaled-solar isochrones with an empirical stellar spectral library [Medium resolution INT Library of Empirical Spectra (MILES)], which follows the chemical evolution pattern of the solar neighbourhood. The models rely as much as possible on empirical ingredients, not just on the stellar spectra, but also on extensive photometric libraries, which are used to determine the transformations from the theoretical parameters of the isochrones to observational quantities. The unprecedented stellar parameter coverage of the MILES stellar library allowed us to safely extend our optical SSP SED predictions from intermediate- to very-old-age regimes and the metallicity coverage of the SSPs from super-solar to [M/H] = -2.3. SSPs with such low metallicities are particularly useful for globular cluster studies. We have computed SSP SEDs for a suite of initial mass function shapes and slopes. We provide a quantitative analysis of the dependence of the synthesized SSP SEDs on the (in)complete coverage of the stellar parameter space in the input library that not only shows that our models are of higher quality than those of other works, but also in which range of SSP parameters our models are reliable.
Original Dataset Product: Processed, classified lidar point cloud data tiles in LAZ 1.4 format. Original Dataset Geographic Extent: HI_NOAAMauiOahu_3: The work unit covers approximately 306 square miles on the eastern side of the big island of Hawaii. Original Dataset Description: HI_NOAAMauiOahu_3 (Big Island) The HI_NOAAMauiOahu_3_B20 lidar project called for the planning, acquisition, processing, and production of derivative products of QL1 lidar data to be collected at an aggregate nominal pulse spacing (ANPS) of 0.35 meters and 8 points per square meter (ppsm). Project specifications were based on the National Geospatial Program Lidar Base Specification Version 2.1, and the American Society of Photogrammetry and Remote Sensing (ASPRS) Positional Accuracy Standards for Digital Geospatial Data (Edition 1, Version 1.0). The data was developed based on a horizontal reference system of NAD83 (PA11), UTM 5 (EPSG 6635), Meter, and a vertical reference system of NAVD88 (GEOID12B), Meter. Lidar data was delivered as processed LAZ 1.4 files formatted to 3,450 individual 500-meter x 500-meter tiles. Note: Between 2020 and 2023 multiple mobilizations were made to collect the data in the project area due to the extreme terrain and persistent low clouds. On March 31, 2023, it was decided between Woolpert and USGS to end the acquisition phase of the project and move on to processing with the data collected. The DPA and work unit have been clipped to the extent of the data collected. Areas of low point density and/or small data voids within the work unit have been identified with low confidence polygons. Original Dataset Ground Conditions: HI_NOAAMauiOahu_3 (Big Island) Lidar was collected from February 14, 2023, through March 15, 2023 while no snow was on the ground and rivers were at or below normal levels.
In order to post-process the lidar data to meet task order specifications and meet ASPRS vertical accuracy guidelines, Woolpert established ground control points that were used to calibrate the lidar to known ground locations established throughout the entire project area. Additional independent accuracy checkpoints were collected throughout the entire project area and used to assess the vertical accuracy of the data. These checkpoints were not used to calibrate or post-process the data.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Summary:
Estimated stand-off distance between ADS-B equipped aircraft and obstacles. Obstacle information was sourced from the FAA Digital Obstacle File and the FHWA National Bridge Inventory. Aircraft tracks were sourced from processed data curated from the OpenSky Network. Results are presented as histograms organized by aircraft type and distance away from runways.
Description:
For many aviation safety studies, aircraft behavior is represented using encounter models, which are statistical models of how aircraft behave during close encounters. They are used to provide a realistic representation of the range of encounter flight dynamics where an aircraft collision avoidance system would be likely to alert. These models currently are, and historically have been, limited to interactions between aircraft; they have not represented the specific interactions between obstacles and aircraft equipped with transponders. In response, we calculated the standoff distance between obstacles and ADS-B equipped manned aircraft.
For robustness, MIT LL calculated the standoff distance using two different datasets of manned aircraft tracks and two datasets of obstacles. This approach aligned with the foundational research used to support the ASTM F3442/F3442M-20 well clear criteria of 2000 feet laterally and 250 feet AGL vertically.
The two datasets of processed tracks of ADS-B equipped aircraft were curated from the OpenSky Network. It is likely that rotorcraft were underrepresented in these datasets. There were also no considerations for aircraft equipped only with Mode C or not equipped with any transponders. The first dataset was used to train the v1.3 uncorrelated encounter models and is referred to as the “Monday” dataset. The second dataset is referred to as the “aerodrome” dataset and was used to train the v2.0 and v3.x terminal encounter models. The Monday dataset consisted of 104 Mondays across North America. The other dataset was based on observations at least 8 nautical miles within Class B, C, D aerodromes in the United States for the first 14 days of each month from January 2019 through February 2020. Prior to any processing, the datasets required 714 and 847 Gigabytes of storage. For more details on these datasets, please refer to "Correlated Bayesian Model of Aircraft Encounters in the Terminal Area Given a Straight Takeoff or Landing" and “Benchmarking the Processing of Aircraft Tracks with Triples Mode and Self-Scheduling.”
Two different datasets of obstacles were also considered. The first was point obstacles defined by the FAA digital obstacle file (DOF) and consisted of point obstacle structures of antenna, lighthouse, meteorological tower (met), monument, sign, silo, spire (steeple), stack (chimney; industrial smokestack), transmission line tower (t-l tower), tank (water; fuel), tramway, utility pole (telephone pole, or pole of similar height, supporting wires), windmill (wind turbine), and windsock. Each obstacle was represented by a cylinder with the height reported by the DOF and a radius based on the reported horizontal accuracy. We did not consider the actual width and height of the structure itself. Additionally, we only considered obstacles at least 50 feet tall and marked as verified in the DOF.
The other obstacle dataset, termed “bridges,” was based on the identified bridges in the FAA DOF and additional information provided by the National Bridge Inventory (NBI). Due to the potential size and extent of bridges, it would not be appropriate to model them as point obstacles; however, the FAA DOF only provides a point location and no information about the size of the bridge. In response, we correlated the FAA DOF with the NBI, which provides information about the length of many bridges. Instead of sizing the simulated bridge based on horizontal accuracy, as with the point obstacles, the bridges were represented as circles with a radius equal to the length of the longest, nearest bridge from the NBI. A circle representation was required because neither the FAA DOF nor the NBI provided sufficient information about orientation to represent bridges as a rectangular cuboid. Similar to the point obstacles, the height of the obstacle was based on the height reported by the FAA DOF. Accordingly, the analysis using the bridge dataset should be viewed as risk averse and conservative. It is possible that a manned aircraft was hundreds of feet away from an obstacle in actuality but the estimated standoff distance could be significantly less. Additionally, because all obstacles are represented with a fixed height, the potentially flat and low-level entrances of the bridge are assumed to have the same height as the tall bridge towers. The attached figure illustrates an example simulated bridge.
It would have been extremely computationally inefficient to calculate the standoff distance for all possible track points. Instead, we defined an encounter between an aircraft and obstacle as occurring when an aircraft flying 3069 feet AGL or less comes within 3000 feet laterally of any obstacle in a 60 second time interval. If the criteria were satisfied, then for that 60 second track segment we calculated the standoff distance to all nearby obstacles. Vertical separation was based on the MSL altitude of the track and the maximum MSL height of an obstacle.
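The encounter filter described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the MIT LL implementation: the `TrackPoint` structure, the precomputed lateral distance field, and the treatment of the cylinder radius in `lateral_standoff_ft` are all hypothetical, and the thresholds are treated as inclusive.

```python
from dataclasses import dataclass
from typing import Sequence

MAX_ALTITUDE_AGL_FT = 3069.0  # altitude threshold quoted in the description
MAX_LATERAL_FT = 3000.0       # lateral threshold quoted in the description


@dataclass
class TrackPoint:
    altitude_agl_ft: float
    # Hypothetical field: lateral distance to the nearest obstacle, precomputed.
    lateral_distance_ft: float


def is_encounter(segment: Sequence[TrackPoint]) -> bool:
    """True if any point in a 60 second segment is low enough and close enough."""
    return any(
        p.altitude_agl_ft <= MAX_ALTITUDE_AGL_FT
        and p.lateral_distance_ft <= MAX_LATERAL_FT
        for p in segment
    )


def vertical_separation_ft(track_alt_msl_ft: float, obstacle_top_msl_ft: float) -> float:
    """Track MSL altitude minus the maximum MSL height of the obstacle."""
    return track_alt_msl_ft - obstacle_top_msl_ft


def lateral_standoff_ft(center_distance_ft: float, radius_ft: float) -> float:
    """Assumed cylinder model: distance to the cylinder surface, floored at zero."""
    return max(0.0, center_distance_ft - radius_ft)
```

Only segments for which `is_encounter` returns true would need the more expensive standoff calculation against all nearby obstacles.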
For each combination of aircraft track and obstacle datasets, the results were organized seven different ways. Filtering criteria were based on aircraft type and distance away from runways. Runway data was sourced from the FAA runways of the United States, Puerto Rico, and Virgin Islands open dataset. Aircraft type was identified as part of the em-processing-opensky workflow.
License
This dataset is licensed under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0).
This license requires that reusers give credit to the creator. It allows reusers to copy and distribute the material in any medium or format in unadapted form and for noncommercial purposes only. Only noncommercial use of your work is permitted. Noncommercial means not primarily intended for or directed towards commercial advantage or monetary compensation. Exceptions are given for the not for profit standards organizations of ASTM International and RTCA.
MIT is releasing this dataset in good faith to promote open and transparent research of the low altitude airspace. Given the limitations of the dataset and a need for more research, a more restrictive license was warranted. Namely, it is based only on observations of ADS-B equipped aircraft, which not all aircraft in the airspace are required to employ; and observations were sourced from a crowdsourced network whose surveillance coverage has not been robustly characterized.
As more research is conducted and the low altitude airspace is further characterized or regulated, it is expected that a future version of this dataset may have a more permissive license.
Distribution Statement
DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited.
© 2021 Massachusetts Institute of Technology.
Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.
This material is based upon work supported by the Federal Aviation Administration under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Federal Aviation Administration.
This document is derived from work done for the FAA (and possibly others); it is not the direct product of work done for the FAA. The information provided herein may include content supplied by third parties. Although the data and information contained herein has been produced or processed from sources believed to be reliable, the Federal Aviation Administration makes no warranty, expressed or implied, regarding the accuracy, adequacy, completeness, legality, reliability or usefulness of any information, conclusions or recommendations provided herein. Distribution of the information contained herein does not constitute an endorsement or warranty of the data or information provided herein by the Federal Aviation Administration or the U.S. Department of Transportation. Neither the Federal Aviation Administration nor the U.S. Department of
The Department of Water Resources’ (DWR’s) Statewide Airborne Electromagnetic (AEM) Surveys Project is funded through California’s Proposition 68 and the General Fund. The goal of the project is to improve the understanding of groundwater aquifer structure to support the state and local goal of sustainable groundwater management and the implementation of the Sustainable Groundwater Management Act (SGMA).
During an AEM survey, a helicopter tows electronic equipment that sends signals into the ground which bounce back. The data collected are used to create continuous images showing the distribution of electrical resistivity values of the subsurface materials that can be interpreted for lithologic properties. The resulting information will provide a standardized, statewide dataset that improves the understanding of large-scale aquifer structures and supports the development or refinement of hydrogeologic conceptual models and can help identify areas for recharging groundwater.
DWR collected AEM data in all of California’s high- and medium-priority groundwater basins, where data collection is feasible. Data were collected in a coarsely spaced grid, with a line spacing of approximately 2-miles by 8-miles. AEM data collection started in 2021 and was completed in 2023. Additional information about the project can be found on the Statewide AEM Survey website. See the publication below for an overview of the project and a preliminary analysis of the AEM data.
AEM data are being collected in groups of groundwater basins, defined as a Survey Area. See Survey Area Map for groundwater subbasins within a Survey Area:
Data reports detail the AEM data collection, processing, inversion, interpretation, and uncertainty analyses methods and procedures. Data reports also describe additional datasets used to support the AEM surveys, including digitized lithology and geophysical logs. Multiple data reports may be provided for a single Survey Area, depending on the Survey Area coverage.
All data collected as a part of the Statewide AEM Surveys will be made publicly available, by survey area, approximately six to twelve months after individual surveys are complete (depending on survey area size). Datasets that will be publicly available include:
DWR has developed AEM Data Viewers to provide a quick and easy way to visualize the AEM electrical resistivity data and the AEM data interpretations (as texture) in a three-dimensional space. The most recent data available are shown, which may be the provisional data for some areas that are not yet finalized. The Data Viewers can be accessed by direct link, below, or from the Data Viewer Landing Page.
As a part of DWR’s upcoming Basin Characterization Program, DWR will be publishing a series of maps and tools to support advanced data analyses. The first of these maps have now been published and provide analyses of the Statewide AEM Survey data to support the identification of potential recharge areas. The maps are located on the SGMA Data Viewer (under the Hydrogeologic Conceptual Model tab) and show the AEM electrical resistivity and AEM-derived texture data as the following:
Shallow Subsurface Average: Maps showing the average electrical resistivity and AEM-derived texture in the shallow subsurface (the top approximately 50 feet below ground surface). These maps support identification of potential recharge areas, where the top 50 feet is dominated by high resistivity or coarse-grained materials.
Depth Slices: Depth slice automations showing changes in electrical resistivity and AEM-derived texture with depth. These maps aid in delineating the geometry of large-scale features (for example, incised valley fills).
Shapefiles for the formatted AEM electrical resistivity data and AEM derived texture data as depth slices and the shallow subsurface average can be downloaded here:
Electrical Resistivity Depth Slices and Shallow Subsurface Average Maps
Texture Interpretation (Coarse Fraction) Depth Slices and Shallow Subsurface Average Maps
Technical memos are developed by DWR's consultant team (Ramboll Consulting) to describe research related to AEM survey planning or data collection. Research described in the technical memos may also be formally published in a journal publication.
Three pilot studies were conducted in California from 2018-2020 to support the development of the Statewide AEM Survey Project. The AEM Pilot Studies were conducted in the Sacramento Valley in Colusa and Butte county groundwater basins, the Salinas Valley in Paso Robles groundwater basin, and in the Indian Wells Valley groundwater basin.
Data Reports and datasets labeled as provisional may be incomplete and are subject to revision until they have been thoroughly reviewed and received final approval. Provisional data and reports may be inaccurate and subsequent review may result in revisions to the data and reports. Data users are cautioned to consider carefully the provisional nature of the information before using it for decisions that concern personal or public safety or the conduct of business that involves substantial monetary or operational consequences.
The San Juan basin is a significant physical and structural element in the southeastern part of the Colorado Plateau physiographic province. The San Juan basin is in New Mexico, Colorado, Arizona, and Utah and has an area of about 21,600 square miles. The basin is about 140 miles wide and about 200 miles long. In the 1980s and 1990s, the U.S. Geological Survey's Evolution of Sedimentary Basins—San Juan basin study produced several reports on aspects of the stratigraphy and sedimentology of the basin. A report on the stratigraphy, structure, and paleogeography of Pennsylvanian and Permian rocks included 18 plates of contoured elevation and thickness data for various units (Huffman and Condon, 1993). This digital dataset contains spatial datasets corresponding to selected contour maps from the Evolution of Sedimentary Basins—San Juan basin study. The data help define the elevation, thickness, and extent of principal stratigraphic units of the basin. The digital data describe the following stratigraphic units: the Molas Formation, the Rico Formation, the elevation of the top of Permian strata, and the estimated thickness of Permian and Pennsylvanian rocks. Digital data for each unit are contained in individual feature classes within a geodatabase (also saved as individual shapefiles). Feature classes have a single attribute, either elevation or thickness, that represents the contoured value. Contoured values are given in feet, to maintain consistency with the original publication, and in meters.
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
This dataset provides high-resolution spatio-temporal data on shared mobility vehicles in Munich, Germany, collected between June 1, 2023, and May 31, 2025. It includes:
The dataset covers five providers across three shared mobility modes:
Content: Idling periods derived from position data
Criteria: Stationary within 100 m radius
Columns:
- id: Vehicle ID
- lat: latitude (EPSG:4326) of the vehicle’s idling location
- lon: longitude (EPSG:4326) of the vehicle’s idling location
- starttime: unix timestamp of the vehicle’s idling start time
- endtime: unix timestamp of the vehicle’s idling end time
Content: Derived trips between idling periods
Criteria: Distance >= 100 m, duration <= 6 hours
Columns:
- id: Vehicle ID
- startlat: latitude (EPSG:4326) of the trip's departure position
- startlon: longitude (EPSG:4326) of the trip's departure position
- starttime: unix timestamp of departure
- endlat: latitude (EPSG:4326) of the trip's arrival position
- endlon: longitude (EPSG:4326) of the trip's arrival position
- endtime: unix timestamp of arrival
Content: Vehicle-specific information
Columns:
- id: vehicle ID
- vehicle_type: vehicle’s model specification
- fuel_type: vehicle's primary energy source
- color: vehicle color
- time_first_seen: unix timestamp of vehicle’s first appearance in the data
- time_last_seen: unix timestamp of vehicle’s last appearance in the data
Content: Service Areas
Columns:
- provider
- geom_service_area: multipolygon of provider’s service area (EPSG:4326)
Content: List of all queried URLs
Provider | Mode | Unique IDs | Entries (Idling Records) |
Miles | Car-Sharing | 5,019 | 2,873,693 |
MVG Rad | Bike-Sharing | 3,796 | 1,582,172 |
ShareNow | Car-Sharing | 1,727 | 1,348,692 |
TIER | E-Scooter | 6,705 | 3,011,856 |
VOI | E-Scooter | 9,242 | 5,454,555 |
- Source: move.mvg.de (now offline)
- Method: Python-based web scraping in 3-minute intervals
- Coverage: Munich area divided into overlapping grid cells
- Storage: PostgreSQL database
- Idling detection: Based on spatial clustering within 100 m
- Trip detection: Transitions between idling periods, filtered by distance and duration
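The trip detection step can be illustrated with a short sketch. This is not the authors' implementation: it assumes that a trip runs from the end of one idling period to the start of the next one for the same vehicle, that distance is measured as great-circle (haversine) distance, and that timestamps are in seconds. The thresholds are the documented ones (distance >= 100 m, duration <= 6 hours); the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, an assumption of this sketch


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two EPSG:4326 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def derive_trips(idling, min_distance_m=100, max_duration_s=6 * 3600):
    """Derive trips from the idling records of ONE vehicle.

    Each record is a dict with the keys id, lat, lon, starttime, endtime.
    """
    ordered = sorted(idling, key=lambda r: r["starttime"])
    trips = []
    for prev, nxt in zip(ordered, ordered[1:]):
        distance = haversine_m(prev["lat"], prev["lon"], nxt["lat"], nxt["lon"])
        duration = nxt["starttime"] - prev["endtime"]
        # Keep only transitions that satisfy both documented criteria.
        if distance >= min_distance_m and duration <= max_duration_s:
            trips.append({
                "id": prev["id"],
                "startlat": prev["lat"], "startlon": prev["lon"],
                "starttime": prev["endtime"],
                "endlat": nxt["lat"], "endlon": nxt["lon"],
                "endtime": nxt["starttime"],
            })
    return trips
```

Under these assumptions a round trip that ends where it started yields a distance near zero and is dropped, which is consistent with the round-trip gaps noted for this dataset.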
Trip data for MVG Rad was partially validated against official open data.
In June 2023, 88.2% of official trips were matched with derived trips.
- Gaps may occur due to scraping interruptions, reservations, or round-trips.
- Some ShareNow vehicle IDs show unusually short trip durations.
- No raw position data, route, pricing, or user data is included.
Global Surface Summary of the Day is derived from the Integrated Surface Hourly (ISH) dataset. The ISH dataset includes global data obtained from the USAF Climatology Center, located in the Federal Climate Complex with NCDC. The latest daily summary data are normally available 1-2 days after the date-time of the observations used in the daily summaries. The online data files begin with 1929 and are, at the time of this writing, at the Version 8 software level. Data from over 9000 stations are typically available. The daily elements included in the dataset (as available from each station) are:
- Mean temperature (.1 Fahrenheit)
- Mean dew point (.1 Fahrenheit)
- Mean sea level pressure (.1 mb)
- Mean station pressure (.1 mb)
- Mean visibility (.1 miles)
- Mean wind speed (.1 knots)
- Maximum sustained wind speed (.1 knots)
- Maximum wind gust (.1 knots)
- Maximum temperature (.1 Fahrenheit)
- Minimum temperature (.1 Fahrenheit)
- Precipitation amount (.01 inches)
- Snow depth (.1 inches)
- Indicator for occurrence of: Fog, Rain or Drizzle, Snow or Ice Pellets, Hail, Thunder, Tornado/Funnel Cloud
Global summary of day data for 18 surface meteorological elements are derived from the synoptic/hourly observations contained in USAF DATSAV3 Surface data and Federal Climate Complex Integrated Surface Hourly (ISH). Historical data are generally available for 1929 to the present, with data from 1973 to the present being the most complete. For some periods, one or more countries' data may not be available due to data restrictions or communications problems. In deriving the summary of day data, a minimum of 4 observations for the day must be present (this allows for stations which report 4 synoptic observations/day). Since the data are converted to constant units (e.g., knots), slight rounding error from the originally reported values may occur (e.g., 9.9 instead of 10.0). The mean daily values described below are based on the hours of operation for the station.
For some stations/countries, the visibility will sometimes 'cluster' around a value (such as 10 miles) due to the practice of not reporting visibilities greater than certain distances. The daily extremes and totals--maximum wind gust, precipitation amount, and snow depth--will only appear if the station reports the data sufficiently to provide a valid value. Therefore, these three elements will appear less frequently than other values. Also, these elements are derived from the stations' reports during the day, and may comprise a 24-hour period which includes a portion of the previous day. The data are reported and summarized based on Greenwich Mean Time (GMT, 0000Z - 2359Z) since the original synoptic/hourly data are reported and based on GMT.
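The two summarization rules described above (days bounded in GMT, and at least 4 observations per day) can be sketched as follows. This is an illustration only, not the procedure actually used to build the product: the input format (pairs of timezone-aware datetime and value) and the function name are assumptions, and rounding to one decimal simply mirrors the stated .1 precision.

```python
from collections import defaultdict
from datetime import datetime, timezone

MIN_OBS_PER_DAY = 4  # minimum stated in the dataset description


def daily_means(observations):
    """Mean value per GMT day, skipping days with fewer than 4 observations.

    `observations` is an iterable of (timezone-aware datetime, value) pairs.
    """
    by_day = defaultdict(list)
    for when, value in observations:
        # Assign each observation to its GMT (UTC) calendar day, 0000Z - 2359Z.
        by_day[when.astimezone(timezone.utc).date()].append(value)
    return {
        day: round(sum(values) / len(values), 1)
        for day, values in by_day.items()
        if len(values) >= MIN_OBS_PER_DAY
    }


# Hypothetical temperatures in Fahrenheit: four on one day, three on the next.
obs = [(datetime(2020, 1, 1, h, tzinfo=timezone.utc), t)
       for h, t in [(0, 50.0), (6, 52.0), (12, 54.0), (18, 56.0)]]
obs += [(datetime(2020, 1, 2, h, tzinfo=timezone.utc), t)
        for h, t in [(0, 40.0), (6, 41.0), (12, 42.0)]]
print(daily_means(obs))
```

In this example only the first day produces a mean, because the second day has three observations.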
This dataset covers MLB team histories.
The dataset is 2594 × 23 in dimensions. It mainly tracks the 30 existing teams with respect to winning records, managers, and players, chronologically from the 1870s to 2016.
We thank Professor Miles Chen at UCLA for showing us how to obtain this dataset by extracting nodes from the MLB website "baseball-reference.com". This dataset is for solving Homework 3.
This dataset is intended for an analysis of the coaching records of managers and for working out why managers switch jobs. We also aim to build a big picture of MLB over the past 140 years. Feedback on the details of manager ratings and player ratings is welcome.
The National Petroleum Reserve-Alaska (NPRA) Legacy Data Archive contains geological and geophysical data collected during two extensive exploration programs, operated first by the U.S. Navy (1944-1953) and later by the U.S. Geological Survey (USGS) (1974-1982). This dataset includes records from 36 test wells, 45 core tests, and over 12,000 line miles of seismic data, along with a wide array of analyses and documents generated during the exploration of the NPRA. The archive serves as a vital resource for understanding the geophysical characteristics of the NPRA region and supports ongoing and future research. Future updates will replace low-resolution images with higher-quality versions, provide Section 508-compliant documents, cross-reference data with the USGS Core Research Center, and add further documentation to this metadata.
These data were collected by Dewberry using a CZMIL Super Nova system. The data were acquired from 20221018 through 20221203. The data include topobathy data in LAS 1.4 format classified as unclassified (1); ground (2); low noise (7); high noise (18); bathymetric bottom (40); water surface (41); and derived water surface (42) in accordance with project specifications. The project consists of approximately 1,373 square miles of data along the shores of Big Bend and contains 17,639 500 m x 500 m lidar tiles. This South Block dataset contains 9,585 500 m x 500 m tiles.
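The classification codes listed above can be kept in a small lookup table when summarizing point classes. The sketch below assumes the codes have already been read from the LAS files into a plain sequence of integers (reading the files themselves is out of scope here), and the function name is hypothetical.

```python
from collections import Counter

# Classification codes as listed in the dataset description.
CLASS_NAMES = {
    1: "unclassified",
    2: "ground",
    7: "low noise",
    18: "high noise",
    40: "bathymetric bottom",
    41: "water surface",
    42: "derived water surface",
}


def summarize_classes(codes):
    """Count points per class name; codes outside the table are reported as unknown."""
    counts = Counter(codes)
    return {
        CLASS_NAMES.get(code, f"unknown ({code})"): n
        for code, n in sorted(counts.items())
    }


print(summarize_classes([2, 2, 40, 41, 99]))
```

Reporting unknown codes rather than dropping them makes it easier to spot tiles that do not follow the project specification.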
Reference Post locations on the Nebraska Highways.
Maximum speed limit values in miles per hour. This data is an extract from the Geospatial Roadway Inventory Database (GRID), which is TxDOT's system for managing roadway assets in Texas.
Note: Extracts from GRID are made on a regular basis and reflect the state of the data at that moment. Assets on routes that are in the process of being edited may be affected.
Update Frequency: 1 Month
Source: Geospatial Roadway Inventory Database (GRID)
Security Level: Public
Owned by TxDOT: True
Related Links: Data Dictionary PDF [Generated 2025/04/24]