100+ datasets found
  1. Distance Calculation Dataset

    • universe.roboflow.com
    zip
    Updated Mar 2, 2023
    + more versions
    Cite
    jatin-rane (2023). Distance Calculation Dataset [Dataset]. https://universe.roboflow.com/jatin-rane/distance-calculation/model/3
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 2, 2023
    Dataset authored and provided by
    jatin-rane
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Vehicles Bounding Boxes
    Description

    Distance Calculation

    ## Overview
    
    Distance Calculation is a dataset for object detection tasks - it contains Vehicles annotations for 4,056 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
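
    As an illustration, here is a minimal download sketch using the `roboflow` Python package; the workspace, project, and version are taken from the citation URL above, while the API key and export format are placeholders:

    ```python
    # Minimal sketch: fetch this dataset with the Roboflow Python package
    # (pip install roboflow). The API key is a placeholder.
    from roboflow import Roboflow

    rf = Roboflow(api_key="YOUR_API_KEY")
    project = rf.workspace("jatin-rane").project("distance-calculation")
    dataset = project.version(3).download("coco")  # one of Roboflow's export formats
    print(dataset.location)  # local folder with images and annotations
    ```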
    
      ## License
    
      This dataset is available under the [CC0 1.0 Public Domain license](https://creativecommons.org/publicdomain/zero/1.0/).
    
  2. 🛒 Supermarket Data

    • kaggle.com
    zip
    Updated Jul 19, 2024
    Cite
    mexwell (2024). 🛒 Supermarket Data [Dataset]. https://www.kaggle.com/datasets/mexwell/supermarket-data/versions/1
    Explore at:
    Available download formats: zip (78427538 bytes)
    Dataset updated
    Jul 19, 2024
    Authors
    mexwell
    Description

    This is the dataset released as a companion for the paper “Explaining the Product Range Effect in Purchase Data”, presented at the BigData 2013 conference.

    • supermarket_distances: three columns. The first column is the customer id, the second is the shop id and the third is the distance between the customer’s house and the shop location. The distance is calculated in meters as a straight line, so it does not take the road graph into account.
    • supermarket_prices: two columns. The first column is the product id and the second column is its unit price. The price is in Euro and it is calculated as the average unit price for the time span of the dataset.
    • supermarket_purchases: four columns. The first column is the customer id, the second is the product id, the third is the shop id and the fourth is the total number of items of that product that the customer bought in that particular shop. The data is recorded from January 2007 to December 2011.
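
    A minimal loading sketch for the three files, assuming CSV-like text files without headers (exact file names, delimiters, and header rows may differ in the archive):

    ```python
    # Sketch: load the three files described above; names/delimiters are assumptions.
    import pandas as pd

    distances = pd.read_csv("supermarket_distances.csv", header=None,
                            names=["customer_id", "shop_id", "distance_m"])
    prices = pd.read_csv("supermarket_prices.csv", header=None,
                         names=["product_id", "unit_price_eur"])
    purchases = pd.read_csv("supermarket_purchases.csv", header=None,
                            names=["customer_id", "product_id", "shop_id", "quantity"])

    # Example: item-weighted average straight-line distance travelled per purchase.
    merged = purchases.merge(distances, on=["customer_id", "shop_id"])
    avg_m = (merged["distance_m"] * merged["quantity"]).sum() / merged["quantity"].sum()
    print(f"average distance per item bought: {avg_m:.0f} m")
    ```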

    Citation

    Pennacchioli, D., Coscia, M., Rinzivillo, S., Pedreschi, D. and Giannotti, F., Explaining the Product Range Effect in Purchase Data. In BigData, 2013.

    Acknowledgement

    Photo by Eduardo Soares on Unsplash

  3. Braking Distance data

    • kaggle.com
    Updated Jun 11, 2024
    Cite
    Ritu Parna Ghosh (2024). Braking Distance data [Dataset]. https://www.kaggle.com/datasets/rituparnaghosh18/braking-distance-data
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets; learn more at mlcommons.org/croissant.
    Dataset updated
    Jun 11, 2024
    Dataset provided by
    Kaggle
    Authors
    Ritu Parna Ghosh
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This mini dataset has 2 columns for calculating the braking distance of a car, i.e., the distance a vehicle travels from the point when its brakes are fully applied to when it comes to a complete stop.
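
    Under constant deceleration, physics gives braking distance d ≈ v²/(2µg), so a quadratic fit of distance on speed is a natural baseline. A sketch, assuming (since the columns are not named here) that the first column is speed and the second is braking distance:

    ```python
    # Sketch: quadratic baseline for braking distance vs. speed.
    # The file name and column layout are assumptions about this dataset.
    import numpy as np
    import pandas as pd

    df = pd.read_csv("braking_distance.csv")      # hypothetical file name
    v = df.iloc[:, 0].to_numpy(dtype=float)       # assumed: speed
    d = df.iloc[:, 1].to_numpy(dtype=float)       # assumed: braking distance

    coeffs = np.polyfit(v, d, deg=2)              # fit d = a*v^2 + b*v + c
    print(np.poly1d(coeffs))
    ```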

  4. Golf Ball Distance Calculation Dataset

    • universe.roboflow.com
    zip
    Updated Jun 14, 2025
    Cite
    Awais Ahmad (2025). Golf Ball Distance Calculation Dataset [Dataset]. https://universe.roboflow.com/awais-ahmad-dtpcl/golf-ball-distance-calculation/dataset/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 14, 2025
    Dataset authored and provided by
    Awais Ahmad
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Golf Balls Bounding Boxes
    Description

    Golf Ball Distance Calculation

    ## Overview
    
    Golf Ball Distance Calculation is a dataset for object detection tasks - it contains Golf Balls annotations for 318 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  5. Electric Two-Wheeler Synthetic Dataset

    • kaggle.com
    zip
    Updated Jul 14, 2025
    Cite
    Prajwal M Poojary (2025). Electric Two-Wheeler Synthetic Dataset [Dataset]. https://www.kaggle.com/datasets/prajwalmpoojary/electric-two-wheeler-synthetic-dataset
    Explore at:
    Available download formats: zip (199123 bytes)
    Dataset updated
    Jul 14, 2025
    Authors
    Prajwal M Poojary
    License

    https://cdla.io/sharing-1-0/

    Description

    This dataset is synthetically generated to simulate realistic operational data from electric two-wheelers. It includes common parameters such as vehicle speed, battery voltage, current, state of charge, motor and ambient temperatures, regenerative braking status, and trip duration.

    The feature distributions are designed to mimic real-world driving behavior and environmental conditions typically observed in urban commuting scenarios for electric scooters and bikes.

    The dataset is primarily intended for machine learning tasks such as EV range prediction, regression modeling, exploratory data analysis (EDA), and educational purposes.

    This can serve as a baseline dataset for practicing predictive modeling workflows, building dashboards, or testing data visualization techniques.

    Dataset Columns Explained:

    • vehicle_speed_kmph: Vehicle speed in km/h.
    • battery_voltage_V: Battery voltage in volts.
    • battery_current_A: Battery current draw in amperes.
    • state_of_charge_%: Battery charge percentage.
    • motor_temperature_C: Motor temperature in Celsius.
    • ambient_temperature_C: Outside temperature during trip.
    • regen_braking_active: Whether regenerative braking was active (1 = yes, 0 = no).
    • trip_duration_min: Duration of the trip in minutes.
    • estimated_range_km: Calculated estimated range of vehicle in km.
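
    As a starting point for the range-prediction task described above, a minimal regression sketch (the file name is hypothetical; the column names follow the list above):

    ```python
    # Sketch: predict estimated_range_km from the other columns with scikit-learn.
    import pandas as pd
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("electric_two_wheeler.csv")  # hypothetical file name
    features = ["vehicle_speed_kmph", "battery_voltage_V", "battery_current_A",
                "state_of_charge_%", "motor_temperature_C", "ambient_temperature_C",
                "regen_braking_active", "trip_duration_min"]
    X_train, X_test, y_train, y_test = train_test_split(
        df[features], df["estimated_range_km"], test_size=0.2, random_state=0)

    model = LinearRegression().fit(X_train, y_train)
    print("held-out R^2:", model.score(X_test, y_test))
    ```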
  6. Distance to the nearest stream by stream order for eighteen selected...

    • catalog.data.gov
    • data.usgs.gov
    Updated Nov 27, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Distance to the nearest stream by stream order for eighteen selected watersheds in the United States, Comma-separated value formatted [Dataset]. https://catalog.data.gov/dataset/distance-to-the-nearest-stream-by-stream-order-for-eighteen-selected-watersheds-in-the-uni
    Explore at:
    Dataset updated
    Nov 27, 2025
    Dataset provided by
    United States Geological Survey: http://www.usgs.gov/
    Area covered
    United States
    Description

    Accurate representation of stream networks at various scales in a hydrogeologic system is integral to modeling groundwater-stream interactions at the continental scale. To assess the accurate representation of stream networks, the distance of a point on the land surface to the nearest stream (DS) has been calculated. DS was calculated from the 30-meter Multi Order Hydrologic Position (MOHP) raster datasets for 18 watersheds in the United States that have been prioritized for intensive monitoring and assessment by the U.S. Geological Survey. DS was calculated by multiplying the 30-meter MOHP Lateral Position (LP) datasets by the 30-meter MOHP Distance from Stream Divide (DSD) datasets for stream orders one through five. DS was calculated for the purposes of considering the spatial scale needed for accurate representation of groundwater-stream interaction at the continental scale for a grid with 1-kilometer cell spacing. The data are available as Comma-Separated Value formatted files.
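
    The stated calculation reduces to a per-pixel product, DS = LP × DSD, for each stream order. A sketch, assuming the 30-meter MOHP layers are available as GeoTIFFs (the file names are hypothetical):

    ```python
    # Sketch: DS = LP * DSD for stream orders one through five, using rasterio.
    import rasterio

    for order in range(1, 6):
        with rasterio.open(f"LP_order{order}.tif") as lp_src, \
             rasterio.open(f"DSD_order{order}.tif") as dsd_src:
            ds = lp_src.read(1).astype("float64") * dsd_src.read(1).astype("float64")
            profile = lp_src.profile
        profile.update(dtype="float64")
        with rasterio.open(f"DS_order{order}.tif", "w", **profile) as dst:
            dst.write(ds, 1)  # distance to the nearest stream of this order
    ```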

  7. Estimated stand-off distance between ADS-B equipped aircraft and obstacles

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    jpeg, zip
    Updated Jul 12, 2024
    Cite
    Andrew Weinert; Andrew Weinert (2024). Estimated stand-off distance between ADS-B equipped aircraft and obstacles [Dataset]. http://doi.org/10.5281/zenodo.7741273
    Explore at:
    Available download formats: zip, jpeg
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Andrew Weinert; Andrew Weinert
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Summary:

    Estimated stand-off distance between ADS-B equipped aircraft and obstacles. Obstacle information was sourced from the FAA Digital Obstacle File and the FHWA National Bridge Inventory. Aircraft tracks were sourced from processed data curated from the OpenSky Network. Results are presented as histograms organized by aircraft type and distance away from runways.

    Description:

    For many aviation safety studies, aircraft behavior is represented using encounter models, which are statistical models of how aircraft behave during close encounters. They are used to provide a realistic representation of the range of encounter flight dynamics where an aircraft collision avoidance system would be likely to alert. These models currently are, and historically have been, limited to interactions between aircraft; they have not represented the specific interactions between obstacles and transponder-equipped aircraft. In response, we calculated the standoff distance between obstacles and ADS-B equipped manned aircraft.

    For robustness, MIT LL calculated the standoff distance using two different datasets of manned aircraft tracks and two different datasets of obstacles. This approach aligned with the foundational research used to support the ASTM F3442/F3442M-20 well clear criteria of 2000 feet laterally and 250 feet AGL vertically.

    The two datasets of processed tracks of ADS-B equipped aircraft were curated from the OpenSky Network. It is likely that rotorcraft were underrepresented in these datasets. There were also no considerations for aircraft equipped only with Mode C or not equipped with any transponders. The first dataset was used to train the v1.3 uncorrelated encounter models and is referred to as the “Monday” dataset. The second dataset is referred to as the “aerodrome” dataset and was used to train the v2.0 and v3.x terminal encounter models. The Monday dataset consisted of 104 Mondays across North America. The other dataset was based on observations at least 8 nautical miles within Class B, C, D aerodromes in the United States for the first 14 days of each month from January 2019 through February 2020. Prior to any processing, the datasets required 714 and 847 gigabytes of storage. For more details on these datasets, please refer to “Correlated Bayesian Model of Aircraft Encounters in the Terminal Area Given a Straight Takeoff or Landing” and “Benchmarking the Processing of Aircraft Tracks with Triples Mode and Self-Scheduling.”

    Two different datasets of obstacles were also considered. The first was point obstacles defined by the FAA digital obstacle file (DOF), consisting of point obstacle structures of antenna, lighthouse, meteorological tower (met), monument, sign, silo, spire (steeple), stack (chimney; industrial smokestack), transmission line tower (t-l tower), tank (water; fuel), tramway, utility pole (telephone pole, or pole of similar height, supporting wires), windmill (wind turbine), and windsock. Each obstacle was represented by a cylinder with the height reported by the DOF and a radius based on the reported horizontal accuracy. We did not consider the actual width and height of the structure itself. Additionally, we only considered obstacles at least 50 feet tall and marked as verified in the DOF.

    The other obstacle dataset, termed “bridges,” was based on the bridges identified in the FAA DOF and additional information provided by the National Bridge Inventory. Due to the potential size and extent of bridges, it would not be appropriate to model them as point obstacles; however, the FAA DOF only provides a point location and no information about the size of the bridge. In response, we correlated the FAA DOF with the National Bridge Inventory, which provides information about the length of many bridges. Instead of sizing the simulated bridge based on horizontal accuracy, as with the point obstacles, the bridges were represented as circles with a radius of the longest, nearest bridge from the NBI. A circle representation was required because neither the FAA DOF nor the NBI provided sufficient information about orientation to represent bridges as rectangular cuboids. Similar to the point obstacles, the height of the obstacle was based on the height reported by the FAA DOF. Accordingly, the analysis using the bridge dataset should be viewed as risk averse and conservative. It is possible that a manned aircraft was hundreds of feet away from an obstacle in actuality but the estimated standoff distance could be significantly less. Additionally, all obstacles are represented with a fixed height; the potentially flat and low-level entrances of a bridge are assumed to have the same height as the tall bridge towers. The attached figure illustrates an example simulated bridge.

    It would have been extremely computationally inefficient to calculate the standoff distance for all possible track points. Instead, we define an encounter between an aircraft and an obstacle as when an aircraft flying 3069 feet AGL or less comes within 3000 feet laterally of any obstacle in a 60 second time interval. If the criteria were satisfied, then for that 60 second track segment we calculated the standoff distance to all nearby obstacles. Vertical separation was based on the MSL altitude of the track and the maximum MSL height of an obstacle.
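
    A simplified sketch of that screening criterion (not the authors' code; the track and obstacle structures are assumptions):

    ```python
    # Sketch: flag a 60 s track segment as an encounter if any point at or below
    # 3069 ft AGL passes within 3000 ft laterally of any obstacle.
    import numpy as np

    def is_encounter(track_xy_ft, track_agl_ft, obstacle_xy_ft,
                     lateral_ft=3000.0, agl_ft=3069.0):
        """track_xy_ft: (N, 2) positions; track_agl_ft: (N,); obstacle_xy_ft: (M, 2)."""
        low = track_agl_ft <= agl_ft                      # altitude criterion per point
        lateral = np.linalg.norm(                         # (N, M) pairwise distances
            track_xy_ft[:, None, :] - obstacle_xy_ft[None, :, :], axis=2)
        return bool(np.any(low[:, None] & (lateral <= lateral_ft)))
    ```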

    For each combination of aircraft track and obstacle datasets, the results were organized seven different ways. Filtering criteria were based on aircraft type and distance away from runways. Runway data was sourced from the FAA runways of the United States, Puerto Rico, and Virgin Islands open dataset. Aircraft type was identified as part of the em-processing-opensky workflow.

    • All: No filter, all observations that satisfied encounter conditions
    • nearRunway: Aircraft within or at 2 nautical miles of a runway
    • awayRunway: Observations more than 2 nautical miles from a runway
    • glider: Observations when aircraft type is a glider
    • fwme: Observations when aircraft type is a fixed-wing multi-engine
    • fwse: Observations when aircraft type is a fixed-wing single engine
    • rotorcraft: Observations when aircraft type is a rotorcraft

    License

    This dataset is licensed under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0).

    This license requires that reusers give credit to the creator. It allows reusers to copy and distribute the material in any medium or format in unadapted form and for noncommercial purposes only. Only noncommercial use of your work is permitted. Noncommercial means not primarily intended for or directed towards commercial advantage or monetary compensation. Exceptions are given for the not for profit standards organizations of ASTM International and RTCA.

    MIT is releasing this dataset in good faith to promote open and transparent research of the low altitude airspace. Given the limitations of the dataset and a need for more research, a more restrictive license was warranted. Namely, it is based only on observations of ADS-B equipped aircraft, which not all aircraft in the airspace are required to employ, and observations were sourced from a crowdsourced network whose surveillance coverage has not been robustly characterized.

    As more research is conducted and the low altitude airspace is further characterized or regulated, it is expected that a future version of this dataset may have a more permissive license.

    Distribution Statement

    DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited.

    © 2021 Massachusetts Institute of Technology.

    Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.

    This material is based upon work supported by the Federal Aviation Administration under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Federal Aviation Administration.

    This document is derived from work done for the FAA (and possibly others); it is not the direct product of work done for the FAA. The information provided herein may include content supplied by third parties. Although the data and information contained herein has been produced or processed from sources believed to be reliable, the Federal Aviation Administration makes no warranty, expressed or implied, regarding the accuracy, adequacy, completeness, legality, reliability or usefulness of any information, conclusions or recommendations provided herein. Distribution of the information contained herein does not constitute an endorsement or warranty of the data or information provided herein by the Federal Aviation Administration or the U.S. Department of Transportation. Neither the Federal Aviation Administration nor the U.S. Department of

  8. Traveling Salesman Computer Vision

    • kaggle.com
    zip
    Updated Apr 20, 2022
    Cite
    Jeff Heaton (2022). Traveling Salesman Computer Vision [Dataset]. https://www.kaggle.com/datasets/jeffheaton/traveling-salesman-computer-vision
    Explore at:
    Available download formats: zip (2977884049 bytes)
    Dataset updated
    Apr 20, 2022
    Authors
    Jeff Heaton
    License

    http://www.gnu.org/licenses/lgpl-3.0.html

    Description

    The Traveling Salesperson Problem (TSP) is a classic problem in computer science that seeks to find the shortest route between a group of cities. It is an NP-hard problem in combinatorial optimization, important in theoretical computer science and operations research.

    World Map: https://data.heatonresearch.com/images/wustl/kaggle/tsp/world-tsp.png

    In this Kaggle competition, your goal is not to find the shortest route among cities. Rather, you must attempt to determine the route labeled on a map.

    Calculating Line Distances

    The data for this competition is not made up of real-world maps, but rather randomly generated maps of varying attributes of size, city count, and optimality of the routes. The following image demonstrates a relatively small map, with few cities, and an optimal route.

    Small Map: https://data.heatonresearch.com/images/wustl/kaggle/tsp/1.jpg

    Not all maps are this small, or contain this optimal a route. Consider the following map, which is much larger.

    Larger Map: https://data.heatonresearch.com/images/wustl/kaggle/tsp/6.jpg

    The following attributes were randomly selected to generate each image.

    • Height
    • Width
    • City count
    • Cycles of Simulated Annealing optimization of initial random path

    The path distance is based on the sum of the Euclidean distance of all segments in the path. The distance units are in pixels.

    Dataset Challenges

    This is a regression problem: you are to estimate the total path length. There are several challenges to consider.

    • If you indiscriminately scale the maps, you will lose size information.
    • Paths might overlap, causing the ratio of total pixels to total length to become misleading.
    • As paths overlap both other path segments and cities, the resulting color becomes brighter.

    The following picture shows a section from one map zoomed to the pixel-level:

    TSP Zoom: https://data.heatonresearch.com/images/wustl/kaggle/tsp/tsp_zoom.jpg

    CSV Files

    The following CSV files are provided, in addition to the images.

    • train.csv - Training data, with distance labels.
    • test.csv - Test data without distance labels.
    • tsp-all.csv - Training and test data combined with complete labels and additional information about each generated map.

    CSV File Format

    The tsp-all.csv file contains the following data.

    id,filename,distance,key
    0,0.jpg,83110,503x673-270-83110.jpg
    1,1.jpg,1035,906x222-10-1035.jpg
    2,2.jpg,20756,810x999-299-20756.jpg
    3,3.jpg,13286,781x717-272-13286.jpg
    4,4.jpg,13924,609x884-312-13924.jpg
    

    The columns:

    • id - A unique ID that allows linking across all three CSV files.
    • filename - The name of each map's image file.
    • distance - The total distance through the cities; this is the y/label.
    • key - The generator filename; provides the dimensions, city count, and distance.
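
    A small sketch parsing tsp-all.csv, whose key column encodes the map dimensions, city count, and distance described above:

    ```python
    # Sketch: split the `key` column (e.g. "503x673-270-83110.jpg") into parts.
    import pandas as pd

    df = pd.read_csv("tsp-all.csv")
    parts = df["key"].str.replace(".jpg", "", regex=False).str.split("-", expand=True)
    dims = parts[0].str.split("x", expand=True).astype(int)
    df["dim_a"], df["dim_b"] = dims[0], dims[1]  # width/height order is an assumption
    df["city_count"] = parts[1].astype(int)
    df["key_distance"] = parts[2].astype(int)    # should agree with `distance`
    print(df[["id", "filename", "distance", "city_count"]].head())
    ```
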
  9. Helsinki Region Travel Time Matrix

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jan 24, 2020
    Cite
    Henrikki Tenkanen; Henrikki Tenkanen; Tuuli Toivonen; Tuuli Toivonen (2020). Helsinki Region Travel Time Matrix [Dataset]. http://doi.org/10.5281/zenodo.3247564
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Henrikki Tenkanen; Henrikki Tenkanen; Tuuli Toivonen; Tuuli Toivonen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Helsinki metropolitan area, Helsinki
    Description

    Helsinki Region Travel Time Matrix contains travel time and distance information for routes between all 250 m x 250 m grid cell centroids (n = 13231) in the Helsinki Region, Finland by walking, cycling, public transportation and car. The grid cells are compatible with the statistical grid cells used by Statistics Finland and the YKR (yhdyskuntarakenteen seurantajärjestelmä) data set. The Helsinki Region Travel Time Matrix is available for three different years:

    • 2018
    • 2015
    • 2013

    The data consists of travel time and distance information of the routes that have been calculated between all statistical grid cell centroids (n = 13231) by walking, cycling, public transportation and car.

    The data have been calculated for two different times of the day: 1) midday and 2) rush hour.

    The data may be used freely (under Creative Commons 4.0 licence). We do not take any responsibility for any mistakes, errors or other deficiencies in the data.

    Organization of data

    The data have been divided into 13231 text files according to the destinations of the routes. The data files have been organized into sub-folders that contain multiple (approx. 4-150) Travel Time Matrix result files. Individual folders consist of all the Travel Time Matrices that have the same first four digits in their filename (e.g. 5785xxx).

    In order to visualize the data on a map, the result tables can be joined with the MetropAccess YKR-grid shapefile (attached here). The data can be joined by using the field ‘from_id’ in the text files and the field ‘YKR_ID’ in MetropAccess-YKR-grid shapefile as a common key.

    Data structure

    The data have been divided into 13231 text files according to destinations of the routes. One file includes the routes from all statistical grid cells to a particular destination grid cell. All files have been named according to the destination grid cell code and each file includes 13231 rows.

    NODATA values have been stored as value -1.

    Each file consists of 18 attribute fields: 1) from_id, 2) to_id, 3) walk_t, 4) walk_d, 5) bike_f_t, 6) bike_s_t, 7) bike_d, 8) pt_r_tt, 9) pt_r_t, 10) pt_r_d, 11) pt_m_tt, 12) pt_m_t, 13) pt_m_d, 14) car_r_t, 15) car_r_d, 16) car_m_t, 17) car_m_d, 18) car_sl_t

    The fields are separated by semicolon in the text files.
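
    A minimal reading sketch, using the semicolon separator and -1 NODATA convention above; the matrix file name and shapefile path are hypothetical, and the grid join follows the 'from_id'/'YKR_ID' description:

    ```python
    # Sketch: read one travel time matrix file and join it to the YKR grid.
    import pandas as pd
    import geopandas as gpd

    tt = pd.read_csv("5785640.txt", sep=";", na_values=-1)   # hypothetical file name

    grid = gpd.read_file("MetropAccess_YKR_grid.shp")        # hypothetical path
    joined = grid.merge(tt, left_on="YKR_ID", right_on="from_id")
    # e.g. map rush-hour public transport travel times to this destination:
    print(joined[["YKR_ID", "pt_r_t"]].head())
    ```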

    Attributes

    • from_id: ID number of the origin grid cell
    • to_id: ID number of the destination grid cell
    • walk_t: Travel time in minutes from origin to destination by walking
    • walk_d: Distance in meters of the walking route
    • bike_f_t: Total travel time in minutes from origin to destination by fast cycling; Includes extra time (1 min) that it takes to take/return bike
    • bike_s_t: Total travel time in minutes from origin to destination by slow cycling; Includes extra time (1 min) that it takes to take/return bike
    • bike_d: Distance in meters of the cycling route
    • pt_r_tt: Travel time in minutes from origin to destination by public transportation in rush hour traffic; whole travel chain has been taken into account including the waiting time at home
    • pt_r_t: Travel time in minutes from origin to destination by public transportation in rush hour traffic; whole travel chain has been taken into account excluding the waiting time at home
    • pt_r_d: Distance in meters of the public transportation route in rush hour traffic
    • pt_m_tt: Travel time in minutes from origin to destination by public transportation in midday traffic; whole travel chain has been taken into account including the waiting time at home
    • pt_m_t: Travel time in minutes from origin to destination by public transportation in midday traffic; whole travel chain has been taken into account excluding the waiting time at home
    • pt_m_d: Distance in meters of the public transportation route in midday traffic
    • car_r_t: Travel time in minutes from origin to destination by private car in rush hour traffic; the whole travel chain has been taken into account
    • car_r_d: Distance in meters of the private car route in rush hour traffic
    • car_m_t: Travel time in minutes from origin to destination by private car in midday traffic; the whole travel chain has been taken into account
    • car_m_d: Distance in meters of the private car route in midday traffic
    • car_sl_t: Travel time from origin to destination by private car following speed limits without any additional impedances; the whole travel chain has been taken into account

    METHODS

    For detailed documentation and how to reproduce the data, see HelsinkiRegionTravelTimeMatrix2018 GitHub repository.

    THE ROUTES BY CAR have been calculated with a dedicated open source tool called DORA (DOor-to-door Routing Analyst) developed for this project. DORA uses a PostgreSQL database with the PostGIS extension and is based on the pgRouting toolkit. MetropAccess-Digiroad (modified from the original Digiroad data provided by the Finnish Transport Agency) has been used as the street network, in which the travel times of the road segments are made more realistic by adding crossroad impedances for different road classes.

    The calculations have been repeated for two times of the day using 1) the “midday impedance” (i.e. travel times outside rush hour) and 2) the “rush hour impedance” as the impedance in the calculations. Moreover, there is 3) the “speed limit impedance” calculated in the matrix (i.e. using speed limits without any additional impedances).

    The whole travel chain (“door-to-door approach”) is taken into account in the calculations:
    1) walking time from the real origin to the nearest network location (based on Euclidean distance),
    2) average walking time from the origin to the parking lot,
    3) travel time from parking lot to destination,
    4) average time for searching a parking lot,
    5) walking time from parking lot to nearest network location of the destination and
    6) walking time from network location to the real destination (based on Euclidean distance).

    THE ROUTES BY PUBLIC TRANSPORTATION have been calculated by using the MetropAccess-Reititin tool which also takes into account the whole travel chains from the origin to the destination:
    1) possible waiting at home before leaving,
    2) walking from home to the transit stop,
    3) waiting at the transit stop,
    4) travel time to next transit stop,
    5) transport mode change,
    6) travel time to next transit stop and
    7) walking to the destination.

    Travel times by public transportation have been optimized using 10 different departure times within the calculation hour, spaced using a so-called Golomb ruler. The fastest route from these calculations is selected for the final travel time matrix.

    THE ROUTES BY CYCLING are also calculated using the DORA tool. The network dataset underneath is MetropAccess-CyclingNetwork, which is a modified version from the original Digiroad data provided by Finnish Transport Agency. In the dataset the travel times for the road segments have been modified to be more realistic based on Strava sports application data from the Helsinki region from 2016 and the bike sharing system data from Helsinki from 2017.

    For each road segment a separate speed value was calculated for slow and fast cycling. The value for fast cycling is based on the percentage difference between the segment-specific Strava speed value and the average speed value for the whole Strava data. This same percentage difference has been applied to calculate the slower speed value for each road segment: the speed value is the average speed of bike sharing system users multiplied by the percentage difference value.

    The reference value for faster cycling has been 19 km/h, which is based on the average speed of Strava sports application users in the Helsinki region. The reference value for slower cycling has been 12 km/h, which has been the average travel speed of bike sharing system users in Helsinki. An additional 1 minute has been added to the travel time to account for taking (30 s) and returning (30 s) the bike at the origin/destination.

    More information of the Strava dataset that was used can be found from the Cycling routes and fluency report, which was published by us and the city of Helsinki.

    THE ROUTES BY WALKING were also calculated using the MetropAccess-Reititin, by disabling all motorized transport modes in the calculation. Thus, all routes are based on the OpenStreetMap geometry.

    The walking speed has been adjusted to 70 meters per minute, which is the default speed in the HSL Journey Planner (also in the calculations by public transportation).

    All calculations were done using the computing resources of CSC-IT Center for Science (https://www.csc.fi/home).

  10. housing

    • kaggle.com
    zip
    Updated Sep 22, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    HappyRautela (2023). housing [Dataset]. https://www.kaggle.com/datasets/happyrautela/housing
    Explore at:
    Available download formats: zip (809785 bytes)
    Dataset updated
    Sep 22, 2023
    Authors
    HappyRautela
    Description

    The exercise that follows contains questions based on the housing dataset.

    1. How many houses have a waterfront? a. 21000 b. 21450 c. 163 d. 173

    2. How many houses have 2 floors? a. 2692 b. 8241 c. 10680 d. 161

    3. How many houses built before 1960 have a waterfront? a. 80 b. 7309 c. 90 d. 92

    4. What is the price of the most expensive house having more than 4 bathrooms? a. 7700000 b. 187000 c. 290000 d. 399000

    5. For instance, if the ‘price’ column consists of outliers, how can you make the data clean and remove the redundancies? a. Calculate the IQR range and drop the values outside the range. b. Calculate the p-value and remove the values less than 0.05. c. Calculate the correlation coefficient of the price column and remove the values less than the correlation coefficient. d. Calculate the Z-score of the price column and remove the values less than the z-score.

    6. What are the various parameters that can be used to determine the dependent variables in the housing data to determine the price of the house? a. Correlation coefficients b. Z-score c. IQR Range d. Range of the Features

    7. If we get the r2 score as 0.38, what inferences can we make about the model and its efficiency? a. The model is 38% accurate, and shows poor efficiency. b. The model is showing 0.38% discrepancies in the outcomes. c. Low difference between observed and fitted values. d. High difference between observed and fitted values.

    8. If the metrics show that the p-value for the grade column is 0.092, what all inferences can we make about the grade column? a. Significant in presence of other variables. b. Highly significant in presence of other variables c. insignificance in presence of other variables d. None of the above

    9. If the Variance Inflation Factor value for a feature is considerably higher than the other features, what can we say about that column/feature? a. High multicollinearity b. Low multicollinearity c. Both A and B d. None of the above

  11. GLAS/ICESat L1B Global Waveform-based Range Corrections Data (HDF5) V034 -...

    • data.nasa.gov
    Updated Mar 31, 2025
    Cite
    nasa.gov (2025). GLAS/ICESat L1B Global Waveform-based Range Corrections Data (HDF5) V034 - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/glas-icesat-l1b-global-waveform-based-range-corrections-data-hdf5-v034
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA: http://nasa.gov/
    Description

    GLAH05 Level-1B waveform parameterization data include output parameters from the waveform characterization procedure and other parameters required to calculate surface slope and relief characteristics. GLAH05 contains parameterizations of both the transmitted and received pulses and other characteristics from which elevation and footprint-scale roughness and slope are calculated. The received pulse characterization uses two implementations of the retracking algorithms: one tuned for ice sheets, called the standard parameterization, used to calculate surface elevation for ice sheets, oceans, and sea ice; and another for land (the alternative parameterization). Each data granule has an associated browse product.

  12. Russian Cities Distance Dataset

    • kaggle.com
    zip
    Updated Nov 9, 2023
    Cite
    Marlin (2023). Russian Cities Distance Dataset [Dataset]. https://www.kaggle.com/datasets/lightningforpython/russian-cities-distance-dataset
    Explore at:
    Available download formats: zip (501 bytes)
    Dataset updated
    Nov 9, 2023
    Authors
    Marlin
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Area covered
    Russia
    Description


    The "Russian Cities Distance Dataset" is a comprehensive collection of distance data between major cities in Stavropol region in Russia, designed to facilitate pathfinding and optimization tasks. This connected dataset includes information about the distances (in kilometres) between pairs of cities, allowing users to calculate the shortest paths and optimize routes for various purposes.

    Key features of this dataset

    City Connections: The dataset provides connectivity information between all cities of the Stavropol region, making it an invaluable resource for route planning, navigation, and logistics optimization.

    Distance Data: Each entry in the dataset includes the distance in kilometres between two cities. The distances have been curated to reflect the actual road or travel distances between these locations.

    A* Search Algorithm: This dataset is ideal for use with the A* (A-star) search algorithm, a widely used optimization and pathfinding algorithm. The A* algorithm can help find the shortest and most efficient routes between cities, making it suitable for applications in transportation, tourism, and urban planning.

    City Pairings: The dataset covers pairs of cities, enabling users to calculate the shortest paths and travel distances between any two cities included in the dataset.
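
    A sketch of shortest-path search over such city-pair distances; with only road distances and no coordinates for a heuristic, A* with h = 0 reduces to Dijkstra's algorithm (the edge list below is a placeholder, not data from the dataset):

    ```python
    # Sketch: A* over an undirected weighted graph built from (city, city, km) rows.
    import heapq
    from collections import defaultdict

    edges = [("CityA", "CityB", 18.0), ("CityB", "CityC", 42.0)]  # placeholder rows
    graph = defaultdict(list)
    for a, b, km in edges:
        graph[a].append((b, km))
        graph[b].append((a, km))

    def shortest_path(start, goal, h=lambda city: 0.0):
        """A* search; with h = 0 this is Dijkstra. h must not overestimate."""
        frontier = [(h(start), 0.0, start, [start])]
        best = {start: 0.0}
        while frontier:
            _, g, city, path = heapq.heappop(frontier)
            if city == goal:
                return g, path
            for nxt, km in graph[city]:
                if g + km < best.get(nxt, float("inf")):
                    best[nxt] = g + km
                    heapq.heappush(frontier, (g + km + h(nxt), g + km, nxt, path + [nxt]))
        return float("inf"), []

    print(shortest_path("CityA", "CityC"))
    ```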

  13. Data from: U.S. Geological Survey calculated half interpercentile range...

    • catalog.data.gov
    • search.dataone.org
    • +1more
    Updated Nov 26, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). U.S. Geological Survey calculated half interpercentile range (half of the difference between the 16th and 84th percentiles) of wave-current bottom shear stress in the South Atlantic Bight from May 2010 to May 2011 (SAB_hIPR.shp, polygon shapefile, Geographic, WGS84) [Dataset]. https://catalog.data.gov/dataset/u-s-geological-survey-calculated-half-interpercentile-range-half-of-the-difference-between
    Explore at:
    Dataset updated
    Nov 26, 2025
    Dataset provided by
    United States Geological Survey: http://www.usgs.gov/
    Description

    The U.S. Geological Survey has been characterizing the regional variation in shear stress on the sea floor and sediment mobility through statistical descriptors. The purpose of this project is to identify patterns in stress in order to inform habitat delineation or decisions for anthropogenic use of the continental shelf. The statistical characterization spans the continental shelf from the coast to approximately 120 m water depth, at approximately 5 km resolution. Time-series of wave and circulation are created using numerical models, and near-bottom output of steady and oscillatory velocities and an estimate of bottom roughness are used to calculate a time-series of bottom shear stress at 1-hour intervals. Statistical descriptions such as the median and 95th percentile, which are the output included with this database, are then calculated to create a two-dimensional picture of the regional patterns in shear stress. In addition, time-series of stress are compared to critical stress values at select points calculated from observed surface sediment texture data to determine estimates of sea floor mobility.
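
    The statistic in the title reduces to a two-line calculation; a NumPy sketch over an hourly shear-stress time series:

    ```python
    # Sketch: half interpercentile range = (84th percentile - 16th percentile) / 2.
    import numpy as np

    def half_interpercentile_range(stress):
        p16, p84 = np.percentile(stress, [16, 84])
        return (p84 - p16) / 2.0
    ```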

  14. Data from: Industrial Process Heat Demand Characterization

    • data.amerigeoss.org
    • data.openei.org
    • +4more
    csv
    Updated Jul 25, 2018
    Cite
    United States (2018). Industrial Process Heat Demand Characterization [Dataset]. https://data.amerigeoss.org/tr/dataset/4bcb6731-a1d7-4126-9802-aa7165eb8281
    Explore at:
    Available download formats: csv
    Dataset updated
    Jul 25, 2018
    Dataset provided by
    United States
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set is an expansion of the facility-level process heat survey conducted by McMillan et al. 2015 (https://doi.org/10.2172/1334495). Using the same methodology, annual facility combustion energy use was calculated from emissions data reported to the U.S. Environmental Protection Agency Greenhouse Gas Reporting Program (https://www.epa.gov/ghgreporting) for 2010 to 2015. The resulting energy values (in TJ, total and by fuel type) were further characterized by end use, using data from the U.S. Energy Information Administration 2010 Manufacturing Energy Consumption Survey (https://www.eia.gov/consumption/manufacturing/data/2010/), and by temperature range. The data set includes calculated GHG emissions in MMTCO2e by fuel type, end use, and temperature range.

  15. Annual Average Temperature Change - Projections (12km)

    • hub.arcgis.com
    • climatedataportal.metoffice.gov.uk
    • +1more
    Updated Jun 1, 2023
    + more versions
    Cite
    Met Office (2023). Annual Average Temperature Change - Projections (12km) [Dataset]. https://hub.arcgis.com/datasets/cf8f426fffde4956af27a38857cd55b9
    Explore at:
    Dataset updated
    Jun 1, 2023
    Dataset authored and provided by
    Met Office
    Area covered
    Description

    [Updated 28/01/25 to fix an issue in the ‘Lower’ values, which were not fully representing the range of uncertainty. ‘Median’ and ‘Higher’ values remain unchanged. The size of the change varies by grid cell and fixed period/global warming levels but the average difference between the 'lower' values before and after this update is 0.13°C.]

    What does the data show?

    This dataset shows the change in annual temperature for a range of global warming levels, including the recent past (2001-2020), compared to the 1981-2000 baseline period. Note, as the values in this dataset are averaged over a year, they do not represent possible extreme conditions.

    The dataset uses projections of daily average air temperature from UKCP18 which are averaged to give values for the 1981-2000 baseline, the recent past (2001-2020) and global warming levels. The warming levels available are 1.5°C, 2.0°C, 2.5°C, 3.0°C and 4.0°C above the pre-industrial (1850-1900) period. The recent past value and global warming level values are stated as a change (in °C) relative to the 1981-2000 value. This enables users to compare annual average temperature trends for the different periods. In addition to the change values, values for the 1981-2000 baseline (corresponding to 0.51°C warming) and recent past (2001-2020, corresponding to 0.87°C warming) are also provided. This is summarised in the table below.

    | Period | Description |
    | --- | --- |
    | 1981-2000 baseline | Average temperature (°C) for the period |
    | 2001-2020 (recent past) | Average temperature (°C) for the period |
    | 2001-2020 (recent past) change | Temperature change (°C) relative to 1981-2000 |
    | 1.5°C global warming level change | Temperature change (°C) relative to 1981-2000 |
    | 2°C global warming level change | Temperature change (°C) relative to 1981-2000 |
    | 2.5°C global warming level change | Temperature change (°C) relative to 1981-2000 |
    | 3°C global warming level change | Temperature change (°C) relative to 1981-2000 |
    | 4°C global warming level change | Temperature change (°C) relative to 1981-2000 |

    What is a global warming level?

    The Annual Average Temperature Change is calculated from the UKCP18 regional climate projections using the high emissions scenario (RCP 8.5) where greenhouse gas emissions continue to grow. Instead of considering future climate change during specific time periods (e.g. decades) for this scenario, the dataset is calculated at various levels of global warming relative to the pre-industrial (1850-1900) period. The world has already warmed by around 1.1°C (between 1850-1900 and 2011-2020), whilst this dataset allows for the exploration of greater levels of warming. The global warming levels available in this dataset are 1.5°C, 2°C, 2.5°C, 3°C and 4°C. The data at each warming level was calculated using a 21 year period. These 21 year periods are calculated by taking 10 years either side of the first year at which the global warming level is reached. This time will be different for different model ensemble members. To calculate the value for the Annual Average Temperature Change, an average is taken across the 21 year period.

    We cannot provide a precise likelihood for particular emission scenarios being followed in the real world future. However, we do note that RCP8.5 corresponds to emissions considerably above those expected with current international policy agreements. The results are also expressed for several global warming levels because we do not yet know which level will be reached in the real climate as it will depend on future greenhouse emission choices and the sensitivity of the climate system, which is uncertain. Estimates based on the assumption of current international agreements on greenhouse gas emissions suggest a median warming level in the region of 2.4-2.8°C, but it could either be higher or lower than this level.

    What are the naming conventions and how do I explore the data?

    This data contains a field for the 1981-2000 baseline, 2001-2020 period and each warming level. They are named 'tas annual change' (change in air 'temperature at surface'), the warming level or historic time period, and 'upper', 'median' or 'lower' as per the description below, e.g. 'tas annual change 2.0 median' is the median value for the 2.0°C warming level. Decimal points are included in field aliases but not in field names, e.g. 'tas annual change 2.0 median' is named 'tas_annual_change_20_median'. To understand how to explore the data, refer to the New Users ESRI Storymap. Please note, if viewing in ArcGIS Map Viewer, the map will default to ‘tas annual change 2.0°C median’ values.

    What do the 'median', 'upper', and 'lower' values mean?

    Climate models are numerical representations of the climate system. To capture uncertainty in projections for the future, an ensemble, or group, of climate models are run. Each ensemble member has slightly different starting conditions or model set-ups. Considering all of the model outcomes gives users a range of plausible conditions which could occur in the future.

    For this dataset, the model projections consist of 12 separate ensemble members. To select which ensemble members to use, the Annual Average Temperature Change was calculated for each ensemble member and they were then ranked in order from lowest to highest for each location. The ‘lower’ fields are the second lowest ranked ensemble member. The ‘higher’ fields are the second highest ranked ensemble member. The ‘median’ field is the central value of the ensemble. This gives a median value, and a spread of the ensemble members indicating the range of possible outcomes in the projections. This spread of outputs can be used to infer the uncertainty in the projections. The larger the difference between the lower and higher fields, the greater the uncertainty. ‘Lower’, ‘median’ and ‘upper’ are also given for the baseline period as these values also come from the model that was used to produce the projections. This allows a fair comparison between the model projections and recent past.

    Useful links

    For further information on the UK Climate Projections (UKCP). Further information on understanding climate data within the Met Office Climate Data Portal.

  16. Summary and methods used to calculate the physical characteristics used to...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Mar 31, 2017
    Cite
    Nathan, Senthilvel K. S. S.; Saldivar, Diana A. Ramirez; Vaughan, Ian P.; Goossens, Benoit; Stark, Danica J. (2017). Summary and methods used to calculate the physical characteristics used to compare the home range estimators. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001743878
    Explore at:
    Dataset updated
    Mar 31, 2017
    Authors
    Nathan, Senthilvel K. S. S.; Saldivar, Diana A. Ramirez; Vaughan, Ian P.; Goossens, Benoit; Stark, Danica J.
    Description

    Summary and methods used to calculate the physical characteristics used to compare the home range estimators.

  17. Dataset for The effects of a number line intervention on calculation skills

    • researchdata.edu.au
    • figshare.mq.edu.au
    Updated May 18, 2023
    Cite
    Saskia Kohnen; Rebecca Bull; Carola Ruiz Hornblas (2023). Dataset for The effects of a number line intervention on calculation skills [Dataset]. http://doi.org/10.25949/22799717.V1
    Explore at:
    Dataset updated
    May 18, 2023
    Dataset provided by
    Macquarie University
    Authors
    Saskia Kohnen; Rebecca Bull; Carola Ruiz Hornblas
    Description

    Study information

    The sample included in this dataset represents five children who participated in a number line intervention study. Originally six children were included in the study, but one of them fulfilled the criterion for exclusion after missing several consecutive sessions. Thus, their data is not included in the dataset.

    All participants were attending Year 1 of primary school at an independent school in New South Wales, Australia. To be eligible to participate, children had to present with low mathematics achievement, performing at or below the 25th percentile in the Maths Problem Solving and/or Numerical Operations subtests from the Wechsler Individual Achievement Test III (WIAT III A & NZ, Wechsler, 2016). Children were excluded from participating if, as reported by their parents, they had any other diagnosed disorders such as attention deficit hyperactivity disorder, autism spectrum disorder, intellectual disability, developmental language disorder, cerebral palsy or uncorrected sensory disorders.

    The study followed a multiple baseline case series design, with a baseline phase, a treatment phase, and a post-treatment phase. The baseline phase varied between two and three measurement points, the treatment phase varied between four and seven measurement points, and all participants had 1 post-treatment measurement point.

    The number of measurement points were distributed across participants as follows:

    Participant 1 – 3 baseline, 6 treatment, 1 post-treatment

    Participant 3 – 2 baseline, 7 treatment, 1 post-treatment

    Participant 5 – 2 baseline, 5 treatment, 1 post-treatment

    Participant 6 – 3 baseline, 4 treatment, 1 post-treatment

    Participant 7 – 2 baseline, 5 treatment, 1 post-treatment

    In each session across all three phases children were assessed in their performance on a number line estimation task, a single-digit computation task, a multi-digit computation task, a dot comparison task and a number comparison task. Furthermore, during the treatment phase, all children completed the intervention task after these assessments. The order of the assessment tasks varied randomly between sessions.


    Measures

    Number Line Estimation. Children completed a computerised bounded number line task (0-100). The number line is presented in the middle of the screen, and the target number is presented above the start point of the number line to avoid signalling the midpoint (Dackermann et al., 2018). Target numbers included two non-overlapping sets (trained and untrained) of 30 items each. Untrained items were assessed on all phases of the study. Trained items were assessed independent of the intervention during baseline and post-treatment phases, and performance on the intervention is used to index performance on the trained set during the treatment phase. Within each set, numbers were equally distributed throughout the number range, with three items within each ten (0-10, 11-20, 21-30, etc.). Target numbers were presented in random order. Participants did not receive performance-based feedback. Accuracy is indexed by percent absolute error (PAE) [(number estimated - target number)/ scale of number line] x100.
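
    The PAE measure above, written out as a small helper for the 0-100 bounded line:

    ```python
    # PAE = |estimate - target| / scale * 100, per the definition above.
    def percent_absolute_error(estimate, target, scale=100):
        return abs(estimate - target) / scale * 100
    ```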


    Single-Digit Computation. The task included ten additions with single-digit addends (1-9) and single-digit results (2-9). The order was counterbalanced so that half of the additions present the lowest addend first (e.g., 3 + 5) and half of the additions present the highest addend first (e.g., 6 + 3). This task also included ten subtractions with single-digit minuends (3-9), subtrahends (1-6) and differences (1-6). The items were presented horizontally on the screen accompanied by a sound and participants were required to give a verbal response. Participants did not receive performance-based feedback. Performance on this task was indexed by item-based accuracy.


    Multi-digit computational estimation. The task included eight additions and eight subtractions presented with double-digit numbers and three response options. None of the response options represent the correct result. Participants were asked to select the option that was closest to the correct result. In half of the items the calculation involved two double-digit numbers, and in the other half one double and one single digit number. The distance between the correct response option and the exact result of the calculation was two for half of the trials and three for the other half. The calculation was presented vertically on the screen with the three options shown below. The calculations remained on the screen until participants responded by clicking on one of the options on the screen. Participants did not receive performance-based feedback. Performance on this task is measured by item-based accuracy.


    Dot Comparison and Number Comparison. Both tasks included the same 20 items, which were presented twice, counterbalancing left and right presentation. Magnitudes to be compared were between 5 and 99, with four items for each of the following ratios: .91, .83, .77, .71, .67. Both quantities were presented horizontally side by side, and participants were instructed to press one of two keys (F or J), as quickly as possible, to indicate the largest one. Items were presented in random order and participants did not receive performance-based feedback. In the non-symbolic comparison task (dot comparison) the two sets of dots remained on the screen for a maximum of two seconds (to prevent counting). Overall area and convex hull for both sets of dots is kept constant following Guillaume et al. (2020). In the symbolic comparison task (Arabic numbers), the numbers remained on the screen until a response was given. Performance on both tasks was indexed by accuracy.


    The Number Line Intervention

    During the intervention sessions, participants estimated the position of 30 Arabic numbers in a 0-100 bounded number line. As a form of feedback, within each item, the participant’s estimate remained visible, and the correct position of the target number appeared on the number line. When the estimate’s PAE was lower than 2.5, a message appeared on the screen that read “Excellent job”; when PAE was between 2.5 and 5 the message read “Well done, so close!”; and when PAE was higher than 5 the message read “Good try!”. Numbers were presented in random order.
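
    The feedback rule maps PAE bands to messages; a small sketch complementing the PAE helper above (not the study's software):

    ```python
    # Feedback bands described above: <2.5, 2.5-5, >5 percent absolute error.
    def feedback_message(pae):
        if pae < 2.5:
            return "Excellent job"
        elif pae <= 5:
            return "Well done, so close!"
        return "Good try!"
    ```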


    Variables in the dataset

    Age = age in ‘years, months’ at the start of the study

    Sex = female/male/non-binary or third gender/prefer not to say (as reported by parents)

    Math_Problem_Solving_raw = Raw score on the Math Problem Solving subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016).

    Math_Problem_Solving_Percentile = Percentile equivalent on the Math Problem Solving subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016).

    Num_Ops_Raw = Raw score on the Numerical Operations subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016).

    Num_Ops_Percentile = Percentile equivalent on the Numerical Operations subtest from the WIAT III (WIAT III A & NZ, Wechsler, 2016).


    The remaining variables refer to participants’ performance on the study tasks. Each variable name is composed of three sections. The first refers to the phase and session: for example, Base1 refers to the first measurement point of the baseline phase, Treat1 to the first measurement point of the treatment phase, and post1 to the first measurement point of the post-treatment phase.


    The second part of the variable name refers to the task, as follows:

    DC = dot comparison

    SDC = single-digit computation

    NLE_UT = number line estimation (untrained set)

    NLE_T= number line estimation (trained set)

    CE = multidigit computational estimation

    NC = number comparison

    The final part of the variable name refers to the type of measure being used (i.e., acc = total correct responses and pae = percent absolute error).


    Thus, variable Base2_NC_acc corresponds to accuracy on the number comparison task during the second measurement point of the baseline phase and Treat3_NLE_UT_pae refers to the percent absolute error on the untrained set of the number line task during the third session of the Treatment phase.
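    To make the convention concrete, here is a small illustrative helper (hypothetical, not shipped with the dataset) that unpacks a variable name into its three sections:

    ```python
    import re

    # Hypothetical parser for the naming convention described above:
    # phase + session number, task code, then measure type.
    def parse_variable(name: str) -> dict:
        m = re.match(r"(Base|Treat|post)(\d+)_(.+)_(acc|pae)$", name)
        if m is None:
            raise ValueError(f"unrecognised variable name: {name}")
        phase, session, task, measure = m.groups()
        return {"phase": phase, "session": int(session),
                "task": task, "measure": measure}

    print(parse_variable("Treat3_NLE_UT_pae"))
    # {'phase': 'Treat', 'session': 3, 'task': 'NLE_UT', 'measure': 'pae'}
    ```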





  18. Data from: Variable Terrestrial GPS Telemetry Detection Rates: Parts 1 - 7

    • s.cnmilf.com
    • data.usgs.gov
    • +2more
    Updated Oct 2, 2025
    Cite
    U.S. Geological Survey (2025). Variable Terrestrial GPS Telemetry Detection Rates: Parts 1 - 7—Data [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/variable-terrestrial-gps-telemetry-detection-rates-parts-1-7data
    Explore at:
    Dataset updated
    Oct 2, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    Studies utilizing Global Positioning System (GPS) telemetry rarely result in 100% fix success rates (FSR). Many assessments of wildlife resource use do not account for missing data, either assuming data loss is random or because of a lack of practical treatment for systematic data loss. Several studies have explored how the environment, technological features, and animal behavior influence rates of missing data in GPS telemetry, but previous spatially explicit models developed to correct for sampling bias have been specified to small study areas, to a small range of data loss, or to be species-specific, limiting their general utility. Here we explore environmental effects on GPS fix acquisition rates across a wide range of environmental conditions and detection rates for bias correction of terrestrial GPS-derived, large mammal habitat use. We also evaluate patterns in missing data that relate to potential animal activities that change the orientation of the antennae, and characterize home-range probability of GPS detection for four focal species: cougars (Puma concolor), desert bighorn sheep (Ovis canadensis nelsoni), Rocky Mountain elk (Cervus elaphus ssp. nelsoni) and mule deer (Odocoileus hemionus).

    Part 1, Positive Openness Raster (raster dataset): Openness is an angular measure of the relationship between surface relief and horizontal distance. For angles less than 90 degrees it is equivalent to the internal angle of a cone with its apex at a DEM location, constrained by neighboring elevations within a specified radial distance; a 480-meter search radius was used for this calculation of positive openness. Openness incorporates the terrain line-of-sight, or viewshed, concept and is calculated from multiple zenith and nadir angles, here along eight azimuths. Positive openness measures openness above the surface, with high values for convex forms and low values for concave forms (Yokoyama et al. 2002). We calculated positive openness using a custom Python script, following the methods of Yokoyama et al. (2002), with a USGS National Elevation Dataset as input.

    Part 2, Northern Arizona GPS Test Collar (csv): Bias correction in GPS telemetry datasets requires a strong understanding of the mechanisms that result in missing data. We tested wildlife GPS collars in a variety of environmental conditions to derive a predictive model of fix acquisition. We found terrain exposure and tall overstory vegetation are the primary environmental features that affect GPS performance. Model evaluation showed a strong correlation (0.924) between observed and predicted fix success rates (FSR) and little bias in predictions. The model's predictive ability was evaluated using two independent datasets from stationary test collars of different make/model and fix-interval programming, placed at different study sites. No statistically significant differences (95% CI) between predicted and observed FSRs suggest that changes in technological factors have minor influence on the model's ability to predict FSR in new study areas in the southwestern US. The model training data are provided here as fix attempts by hour. This table can be linked with the site location shapefile (Part 4) using the site field.

    Part 3, Probability Raster (raster dataset): A raster of predicted GPS fix-acquisition probability from the model described in Part 2. We evaluated GPS telemetry datasets by comparing the mean probability of a successful GPS fix across study animals' home ranges to the observed FSR of GPS-downloaded collars deployed on the four focal species. Comparing the mean probability of acquisition within study animals' home ranges with the observed FSRs of GPS-downloaded collars resulted in an approximately 1:1 linear relationship with an r-squared of 0.68.

    Part 4, GPS Test Collar Sites (shapefile): The locations of the stationary test collars used to derive and evaluate the predictive model of fix acquisition described in Part 2.

    Part 5, Cougar Home Ranges (shapefile): Cougar home ranges were calculated to compare the mean probability of a GPS fix acquisition across the home range with the observed fix success rate (FSR) of the collar, as a means of evaluating whether characteristics of an animal's home range affect observed FSR. We estimated home ranges using the Local Convex Hull (LoCoH) method at the 90th isopleth. Only data obtained by direct GPS download of retrieved units were used; satellite-delivered data were omitted for animals whose collars were lost or damaged, because satellite delivery tends to lose approximately an additional 10% of data. Comparisons with home-range mean probability of fix were also used as a reference for assessing whether the frequency with which animals use areas of low GPS acquisition rates plays a role in observed FSRs.

    Part 6, Cougar Fix Success Rate by Hour (csv): Cougar GPS collar fix success varied by hour of day, suggesting circadian rhythms, with bouts of rest during daylight hours that may change the orientation of the GPS receiver and affect the ability to acquire fixes. Raw data on overall fix success rates (FSR) and FSR by hour were used to predict relative reductions in FSR. The data include only direct GPS download datasets; satellite-delivered data were omitted for animals whose collars were lost or damaged, because satellite delivery tends to lose approximately an additional 10% of data.

    Part 7, Openness Python Script version 2.0: This Python script was used to calculate positive openness from a 30-meter digital elevation model for a large geographic area in Arizona, California, Nevada and Utah. The research project used the script to explore environmental effects on GPS fix acquisition rates across a wide range of environmental conditions and detection rates, for bias correction of terrestrial GPS-derived, large mammal habitat use.
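    The Part 7 script itself is distributed with the dataset; purely as an illustration of the positive-openness computation described in Part 1 (a rough, unoptimized sketch, not the authors' code), the idea looks like this:

    ```python
    import numpy as np

    def positive_openness(dem, cell_size=30.0, radius=480.0):
        """Approximate positive openness (Yokoyama et al. 2002).

        For each cell, along each of eight azimuths, find the maximum
        elevation angle to any neighbour within `radius`, convert it to a
        zenith angle (90 deg minus that elevation angle), and average the
        eight zenith angles. High values indicate convex forms, low
        values concave forms.
        """
        n_steps = int(radius // cell_size)
        rows, cols = dem.shape
        # Unit offsets for the eight azimuths (N, NE, E, SE, S, SW, W, NW).
        offsets = [(-1, 0), (-1, 1), (0, 1), (1, 1),
                   (1, 0), (1, -1), (0, -1), (-1, -1)]
        openness = np.full(dem.shape, np.nan, dtype=float)
        for r in range(rows):
            for c in range(cols):
                zeniths = []
                for dr, dc in offsets:
                    max_elev = -np.inf
                    for k in range(1, n_steps + 1):
                        rr, cc = r + dr * k, c + dc * k
                        if not (0 <= rr < rows and 0 <= cc < cols):
                            break
                        dist = cell_size * k * np.hypot(dr, dc)
                        angle = np.degrees(np.arctan2(dem[rr, cc] - dem[r, c], dist))
                        max_elev = max(max_elev, angle)
                    if np.isfinite(max_elev):
                        zeniths.append(90.0 - max_elev)
                if zeniths:
                    openness[r, c] = np.mean(zeniths)
        return openness
    ```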

  19. Annual Count of Summer Days - Projections (12km)

    • climatedataportal.metoffice.gov.uk
    Updated Feb 7, 2023
    Cite
    Met Office (2023). Annual Count of Summer Days - Projections (12km) [Dataset]. https://climatedataportal.metoffice.gov.uk/datasets/TheMetOffice::annual-count-of-summer-days-projections-12km/explore?showTable=true
    Explore at:
    Dataset updated
    Feb 7, 2023
    Dataset authored and provided by
    Met Office (http://www.metoffice.gov.uk/)
    Description

    [Updated 28/01/25 to fix an issue in the ‘Lower’ values, which were not fully representing the range of uncertainty. ‘Median’ and ‘Higher’ values remain unchanged. The size of the change varies by grid cell and fixed period/global warming level, but the average difference between the ‘Lower’ values before and after this update is 0.6.]

    What does the data show?
    The Annual Count of Summer Days is the number of days per year where the maximum daily temperature (the hottest point in the day) is above 25°C. It measures how many times the threshold is exceeded (not by how much) in a year. Note that the term ‘summer days’ refers to the threshold: temperatures above 25°C outside the summer months also contribute to the annual count. The results should be interpreted as an approximation of the projected number of days when the threshold is exceeded, as there will be many factors, such as natural variability and local-scale processes, that the climate model is unable to represent. The Annual Count of Summer Days is calculated for two baseline (historical) periods, 1981-2000 (corresponding to 0.51°C warming) and 2001-2020 (corresponding to 0.87°C warming), and for global warming levels of 1.5°C, 2.0°C, 2.5°C, 3.0°C and 4.0°C above the pre-industrial (1850-1900) period. This enables users to compare the future number of summer days to previous values.

    What are the possible societal impacts?
    An increase in the Annual Count of Summer Days indicates increased health risks from high temperatures. Impacts include:
    • Increased heat-related illnesses, hospital admissions or deaths for vulnerable people.
    • Transport disruption due to overheating of railway infrastructure.
    • Periods of increased water demand.
    Other metrics such as the Annual Count of Hot Summer Days (days above 30°C), the Annual Count of Extreme Summer Days (days above 35°C) and the Annual Count of Tropical Nights (where the minimum temperature does not fall below 20°C) also indicate impacts from high temperatures; however, they use different temperature thresholds.

    What is a global warming level?
    The Annual Count of Summer Days is calculated from the UKCP18 regional climate projections using the high emissions scenario (RCP 8.5), in which greenhouse gas emissions continue to grow. Instead of considering future climate change during specific time periods (e.g. decades) for this scenario, the dataset is calculated at various levels of global warming relative to the pre-industrial (1850-1900) period. The world has already warmed by around 1.1°C (between 1850-1900 and 2011-2020), while this dataset allows for the exploration of greater levels of warming. The global warming levels available in this dataset are 1.5°C, 2°C, 2.5°C, 3°C and 4°C. The data at each warming level were calculated using a 21-year period, obtained by taking 10 years either side of the first year at which the global warming level is reached; this year differs between model ensemble members. To calculate the value of the Annual Count of Summer Days, an average is taken across the 21-year period; the result therefore shows the number of summer days that could occur each year at each given level of warming. We cannot provide a precise likelihood for particular emission scenarios being followed in the real-world future. However, we note that RCP8.5 corresponds to emissions considerably above those expected under current international policy agreements. The results are expressed for several global warming levels because we do not yet know which level will be reached in the real climate: that will depend on future greenhouse gas emission choices and on the sensitivity of the climate system, which is uncertain. Estimates based on current international agreements on greenhouse gas emissions suggest a median warming level in the region of 2.4-2.8°C, but it could be either higher or lower.

    What are the naming conventions and how do I explore the data?
    The data contain a field for each global warming level and the two baselines. Fields are named ‘Summer Days’, the warming level or baseline, and ‘upper’, ‘median’ or ‘lower’ as described below. For example, ‘Summer Days 2.5 median’ is the median value for the 2.5°C warming level. Decimal points are included in field aliases but not in field names, e.g. ‘Summer Days 2.5 median’ is ‘SummerDays_25_median’. To understand how to explore the data, see this page: https://storymaps.arcgis.com/stories/457e7a2bc73e40b089fac0e47c63a578. Please note that if viewing in ArcGIS Map Viewer, the map defaults to ‘Summer Days 2.0°C median’ values.

    What do the ‘median’, ‘upper’ and ‘lower’ values mean?
    Climate models are numerical representations of the climate system. To capture uncertainty in projections for the future, an ensemble, or group, of climate models is run, with each ensemble member given slightly different starting conditions or model set-ups. Considering all of the model outcomes gives users a range of plausible conditions that could occur in the future. For this dataset, the projections consist of 12 separate ensemble members. To select which ensemble members to use, the Annual Count of Summer Days was calculated for each member, and the members were ranked from lowest to highest for each location. The ‘lower’ fields are the second-lowest ranked ensemble member, the ‘upper’ fields are the second-highest ranked ensemble member, and the ‘median’ field is the central value of the ensemble. This gives a median value and a spread of the ensemble members indicating the range of possible outcomes in the projections; the spread can be used to infer the uncertainty in the projections, with a larger difference between the lower and upper fields indicating greater uncertainty. ‘Lower’, ‘median’ and ‘upper’ are also given for the baseline periods, as these values also come from the model used to produce the projections; this allows a fair comparison between the model projections and the recent past.

    Useful links
    • This dataset was calculated following the methodology in the ‘Future Changes to high impact weather in the UK’ report and uses the same temperature thresholds as the ‘State of the UK Climate’ report.
    • Further information on the UK Climate Projections (UKCP).
    • Further information on understanding climate data within the Met Office Climate Data Portal.
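    As a worked illustration of how the metric is defined (a sketch with synthetic data, not Met Office code): count, for each year, the days whose daily maximum temperature exceeds 25°C, then average across the 21-year window:

    ```python
    import numpy as np

    # Hypothetical daily maximum temperatures (deg C) for one grid cell,
    # shaped (21 years, 365 days); in practice these come from the UKCP18
    # regional projections.
    rng = np.random.default_rng(0)
    tasmax = 12 + 10 * np.sin(np.linspace(0, 2 * np.pi, 365)) + rng.normal(0, 3, (21, 365))

    # A "summer day" is any day with a maximum above 25 degC, in any season.
    summer_days_per_year = (tasmax > 25.0).sum(axis=1)

    # The published value for a warming level is the mean over the
    # 21-year window centred on the year the level is first reached.
    print(f"Annual Count of Summer Days: {summer_days_per_year.mean():.1f}")
    ```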

  20. ERA5 post-processed daily statistics on single levels from 1940 to present

    • cds.climate.copernicus.eu
    grib
    Updated Dec 3, 2025
    Cite
    ECMWF (2025). ERA5 post-processed daily statistics on single levels from 1940 to present [Dataset]. http://doi.org/10.24381/cds.4991cf48
    Explore at:
    gribAvailable download formats
    Dataset updated
    Dec 3, 2025
    Dataset provided by
    European Centre for Medium-Range Weather Forecasts (http://ecmwf.int/)
    Authors
    ECMWF
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ERA5 is the fifth generation ECMWF reanalysis for the global climate and weather for the past 8 decades. Data is available from 1940 onwards. ERA5 replaces the ERA-Interim reanalysis.

    Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics. This principle, called data assimilation, is based on the method used by numerical weather prediction centres, where every so many hours (12 hours at ECMWF) a previous forecast is combined with newly available observations in an optimal way to produce a new best estimate of the state of the atmosphere, called analysis, from which an updated, improved forecast is issued. Reanalysis works in the same way, but at reduced resolution to allow for the provision of a dataset spanning back several decades. Reanalysis does not have the constraint of issuing timely forecasts, so there is more time to collect observations and, when going further back in time, to allow for the ingestion of improved versions of the original observations, which all benefit the quality of the reanalysis product.

    This catalogue entry provides post-processed ERA5 hourly single-level data aggregated to daily time steps. In addition to the data selection options found on the hourly page, the following options can be selected for the daily statistic calculation:

    • The daily aggregation statistic (daily mean, daily max, daily min, daily sum*)
    • The sub-daily frequency sampling of the original data (1 hour, 3 hours, 6 hours)
    • The option to shift to any local time zone in UTC (no shift means the statistic is computed from UTC+00:00)

    *The daily sum is only available for the accumulated variables (see the ERA5 documentation for more details). Users should be aware that the daily aggregation is calculated during the retrieval process and is not part of a permanently archived dataset. For more details on how the daily statistics are calculated, including demonstrative code, please see the documentation. For more details on the hourly data used to calculate the daily statistics, please refer to the ERA5 hourly single-level data catalogue entry and the documentation found therein.
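    A hedged sketch of requesting these daily statistics through the CDS API follows. The cdsapi client and its retrieve() call are real; the dataset ID and request keys below are assumptions and should be checked against the CDS documentation:

    ```python
    import cdsapi

    client = cdsapi.Client()  # reads credentials from ~/.cdsapirc

    client.retrieve(
        "derived-era5-single-levels-daily-statistics",  # assumed dataset ID
        {
            "product_type": "reanalysis",
            "variable": ["2m_temperature"],
            "year": "2020",
            "month": "07",
            "day": [f"{d:02d}" for d in range(1, 32)],
            "daily_statistic": "daily_mean",  # assumed key; or min/max/sum
            "time_zone": "utc+00:00",         # assumed key; no shift from UTC
            "frequency": "1_hourly",          # assumed key; sub-daily sampling
        },
        "era5_daily_mean_t2m_july2020.grib",
    )
    ```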
