100+ datasets found
  1. Indian Cities Distance Dataset

    • kaggle.com
    zip
    Updated Mar 1, 2024
    Cite
    K.B. Dharun Krishna (2024). Indian Cities Distance Dataset [Dataset]. https://www.kaggle.com/datasets/kbdharun/a-star-algorithm-route-planning-dataset/code
    Explore at:
    zip (804 bytes)
    Dataset updated
    Mar 1, 2024
    Authors
    K.B. Dharun Krishna
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Area covered
    India
    Description

    The "Indian Cities Distance Dataset" is a comprehensive collection of distance data between major cities in India, designed to facilitate pathfinding and optimization tasks.

    This connected dataset includes information about the distances (in kilometres) between pairs of cities, allowing users to calculate the shortest paths and optimize routes for various purposes.

    Key features of this dataset

    City Pairings: The dataset provides connectivity information between pairs of prominent Indian cities, enabling users to calculate the shortest paths and travel distances between any two cities included in the dataset. It is an excellent resource for exploring route planning, navigation, and logistics optimization programs.

    Distance Data: Each entry in the dataset includes the distance in kilometres between two cities. The distances have been curated to reflect the actual road distances between these locations.

    A* Search Algorithm: This dataset is ideal for use with the A* (A-star) search algorithm, a widely used optimization and pathfinding algorithm. The A* algorithm can help find the shortest and most efficient routes between cities, making it suitable for transportation, tourism, and urban planning applications.

    Beginner friendly: This dataset contains a minimal number of features for easier processing and analysis, making it suitable for beginners.
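
    As a quick illustration of the intended use, here is a minimal A* sketch over such an edge list, assuming the file is a CSV with columns named source, destination, and distance (hypothetical names; check the actual header). With no city coordinates available, the heuristic defaults to zero, which reduces A* to Dijkstra's algorithm; a straight-line-distance heuristic can be plugged in where coordinates exist.

     import csv, heapq
     from collections import defaultdict

     def load_graph(path):
         graph = defaultdict(list)
         with open(path, newline="") as f:
             for row in csv.DictReader(f):
                 a, b = row["source"], row["destination"]  # assumed column names
                 d = float(row["distance"])
                 graph[a].append((b, d))
                 graph[b].append((a, d))  # road distances are symmetric
         return graph

     def a_star(graph, start, goal, h=lambda city: 0.0):
         # With h == 0 this is Dijkstra; supply a real heuristic if known.
         frontier = [(h(start), 0.0, start, [start])]
         visited = set()
         while frontier:
             f, g, city, path = heapq.heappop(frontier)
             if city == goal:
                 return g, path
             if city in visited:
                 continue
             visited.add(city)
             for nxt, d in graph[city]:
                 if nxt not in visited:
                     heapq.heappush(frontier, (g + d + h(nxt), g + d, nxt, path + [nxt]))
         return None  # no route found

     graph = load_graph("indian_cities.csv")  # hypothetical filename
     print(a_star(graph, "Chennai", "Delhi"))  # example city pair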

  2. Mathematics Dataset

    • github.com
    • opendatalab.com
    • +1more
    Updated Apr 3, 2019
    Cite
    DeepMind (2019). Mathematics Dataset [Dataset]. https://github.com/Wikidepia/mathematics_dataset_id
    Explore at:
    Dataset updated
    Apr 3, 2019
    Dataset provided by
    DeepMind (http://deepmind.com/)
    Description

    This dataset consists of mathematical question and answer pairs, from a range of question types at roughly school-level difficulty. This is designed to test the mathematical learning and algebraic reasoning skills of learning models.

    ## Example questions

     Question: Solve -42*r + 27*c = -1167 and 130*r + 4*c = 372 for r.
     Answer: 4
     
     Question: Calculate -841880142.544 + 411127.
     Answer: -841469015.544
     
     Question: Let x(g) = 9*g + 1. Let q(c) = 2*c + 1. Let f(i) = 3*i - 39. Let w(j) = q(x(j)). Calculate f(w(a)).
     Answer: 54*a - 30
    

    It contains 2 million (question, answer) pairs per module, with questions limited to 160 characters in length, and answers to 30 characters in length. Note the training data for each question type is split into "train-easy", "train-medium", and "train-hard". This allows training models via a curriculum. The data can also be mixed together uniformly from these training datasets to obtain the results reported in the paper. Categories:

    • algebra (linear equations, polynomial roots, sequences)
    • arithmetic (pairwise operations and mixed expressions, surds)
    • calculus (differentiation)
    • comparison (closest numbers, pairwise comparisons, sorting)
    • measurement (conversion, working with time)
    • numbers (base conversion, remainders, common divisors and multiples, primality, place value, rounding numbers)
    • polynomials (addition, simplification, composition, evaluating, expansion)
    • probability (sampling without replacement)
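
    The files themselves are plain text. A minimal reading sketch, assuming the alternating question/answer line layout shown in the examples above (the module path is hypothetical):

     def read_pairs(path):
         with open(path) as f:
             lines = [ln.rstrip("\n") for ln in f]
         # questions and answers alternate line by line
         return list(zip(lines[0::2], lines[1::2]))

     pairs = read_pairs("train-easy/algebra__linear_1d.txt")  # hypothetical path
     print(pairs[0])  # one (question, answer) tuple
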
  3. GLAS/ICESat L1B Global Waveform-based Range Corrections Data (HDF5) V034 -...

    • data.nasa.gov
    Updated Mar 31, 2025
    Cite
    nasa.gov (2025). GLAS/ICESat L1B Global Waveform-based Range Corrections Data (HDF5) V034 - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/glas-icesat-l1b-global-waveform-based-range-corrections-data-hdf5-v034
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    GLAH05 Level-1B waveform parameterization data include output parameters from the waveform characterization procedure and other parameters required to calculate surface slope and relief characteristics. GLAH05 contains parameterizations of both the transmitted and received pulses and other characteristics from which elevation and footprint-scale roughness and slope are calculated. The received pulse characterization uses two implementations of the retracking algorithms: one tuned for ice sheets, called the standard parameterization, used to calculate surface elevation for ice sheets, oceans, and sea ice; and another for land (the alternative parameterization). Each data granule has an associated browse product.

  4. Russian Cities Distance Dataset

    • kaggle.com
    zip
    Updated Nov 9, 2023
    Cite
    Marlin (2023). Russian Cities Distance Dataset [Dataset]. https://www.kaggle.com/datasets/lightningforpython/russian-cities-distance-dataset
    Explore at:
    zip (501 bytes)
    Dataset updated
    Nov 9, 2023
    Authors
    Marlin
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Area covered
    Russia
    Description


    The "Russian Cities Distance Dataset" is a comprehensive collection of distance data between major cities in the Stavropol region of Russia, designed to facilitate pathfinding and optimization tasks. This connected dataset includes information about the distances (in kilometres) between pairs of cities, allowing users to calculate the shortest paths and optimize routes for various purposes.

    Key features of this dataset

    City Connections: The dataset provides connectivity information between all cities of the Stavropol region, making it an invaluable resource for route planning, navigation, and logistics optimization.

    Distance Data: Each entry in the dataset includes the distance in kilometres between two cities. The distances have been curated to reflect the actual road or travel distances between these locations.

    A* Search Algorithm: This dataset is ideal for use with the A* (A-star) search algorithm, a widely used optimization and pathfinding algorithm. The A* algorithm can help find the shortest and most efficient routes between cities, making it suitable for applications in transportation, tourism, and urban planning.

    City Pairings: The dataset covers pairs of cities, enabling users to calculate the shortest paths and travel distances between any two cities included in the dataset.

  5. Data from: A wide-range multiphase equation of state for lead

    • scidb.cn
    Updated Jun 23, 2025
    Cite
    Fang Jun; zhao yan hong; Gao Xingyu; Zhang Qili; Wang Yuechao; Sun Bo; Liu Haifeng; Song Haifeng (2025). A wide-range multiphase equation of state for lead [Dataset]. http://doi.org/10.57760/sciencedb.j00213.00166
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 23, 2025
    Dataset provided by
    Science Data Bank
    Authors
    Fang Jun; zhao yan hong; Gao Xingyu; Zhang Qili; Wang Yuechao; Sun Bo; Liu Haifeng; Song Haifeng
    Description

    This dataset provides the equation of state data for lead in the temperature and pressure range from room temperature to 10 MK, and from atmospheric pressure to 10^7 GPa. The thermodynamic properties of the shock Hugoniot line, the 300 K isotherm, the melting line, and the temperature dense transition zone were calculated.

  6. 13.3 Distance Analysis Using ArcGIS

    • hub.arcgis.com
    Updated Mar 4, 2017
    Cite
    Iowa Department of Transportation (2017). 13.3 Distance Analysis Using ArcGIS [Dataset]. https://hub.arcgis.com/datasets/IowaDOT::13-3-distance-analysis-using-arcgis
    Explore at:
    Dataset updated
    Mar 4, 2017
    Dataset authored and provided by
    Iowa Department of Transportation (https://iowadot.gov/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    One important reason for performing GIS analysis is to determine proximity. Often, this type of analysis is done using vector data and possibly the Buffer or Near tools. In this course, you will learn how to calculate distance using raster datasets as inputs in order to assign cells a value based on distance to the nearest source (e.g., city, campground). You will also learn how to allocate cells to a particular source and to determine the compass direction from a cell in a raster to a source.

    What if you don't want to just measure the straight line from one place to another? What if you need to determine the best route to a destination, taking speed limits, slope, terrain, and road conditions into consideration? In cases like this, you could use the cost distance tools in order to assign a cost (such as time) to each raster cell based on factors like slope and speed limit. From these calculations, you could create a least-cost path from one place to another. Because these tools account for variables that could affect travel, they can help you determine that the shortest path may not always be the best path.

    After completing this course, you will be able to:

    • Create straight-line distance, direction, and allocation surfaces.
    • Determine when to use Euclidean and weighted distance tools.
    • Perform a least-cost path analysis.
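
    As a rough illustration of the straight-line distance and allocation idea (a NumPy toy, not the ArcGIS tools themselves; grid size, cell size, and source locations are made up):

     import numpy as np

     rows, cols = 5, 6
     cell = 30.0  # cell size in meters (assumed)
     sources = [(0, 0), (4, 5)]  # row/col of source cells (e.g., city, campground)

     rr, cc = np.mgrid[0:rows, 0:cols]
     # distance from every cell to every source, in map units
     d = np.stack([np.hypot(rr - r, cc - c) * cell for r, c in sources])
     distance = d.min(axis=0)       # straight-line (Euclidean) distance surface
     allocation = d.argmin(axis=0)  # index of the nearest source per cell
     print(distance.round(1))
     print(allocation)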

  7. Data from: Current and projected research data storage needs of Agricultural...

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    • +2more
    Updated Apr 21, 2025
    + more versions
    Cite
    Agricultural Research Service (2025). Current and projected research data storage needs of Agricultural Research Service researchers in 2016 [Dataset]. https://catalog.data.gov/dataset/current-and-projected-research-data-storage-needs-of-agricultural-research-service-researc-f33da
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    The USDA Agricultural Research Service (ARS) recently established SCINet, which consists of a shared high performance computing resource, Ceres, and the dedicated high-speed Internet2 network used to access Ceres. Current and potential SCINet users are using and generating very large datasets, so SCINet needs to be provisioned with adequate data storage for their active computing. It is not designed to hold data beyond active research phases. At the same time, the National Agricultural Library has been developing the Ag Data Commons, a research data catalog and repository designed for public data release and professional data curation. Ag Data Commons needs to anticipate the size and nature of data it will be tasked with handling. The ARS Web-enabled Databases Working Group, organized under the SCINet initiative, conducted a study to establish baseline data storage needs and practices, and to make projections that could inform future infrastructure design, purchases, and policies. The SCINet Web-enabled Databases Working Group helped develop the survey which is the basis for an internal report. While the report was for internal use, the survey and resulting data may be generally useful and are being released publicly.

    From October 24 to November 8, 2016 we administered a 17-question survey (Appendix A) by emailing a Survey Monkey link to all ARS Research Leaders, intending to cover the data storage needs of all 1,675 SY (Category 1 and Category 4) scientists. We designed the survey to accommodate either individual researcher responses or group responses. Research Leaders could decide, based on their unit's practices or their management preferences, whether to delegate the response to a data management expert in their unit, to all members of their unit, or to collate responses from their unit themselves before reporting in the survey. Larger storage ranges cover vastly different amounts of data, so the implications here could be significant depending on whether the true amount is at the lower or higher end of the range. Therefore, we requested more detail from "Big Data users," those 47 respondents who indicated they had more than 10 to 100 TB or over 100 TB total current data (Q5). All other respondents are called "Small Data users." Because not all of these follow-up requests were successful, we used actual follow-up responses to estimate likely responses for those who did not respond. We defined active data as data that would be used within the next six months. All other data would be considered inactive, or archival. To calculate per-person storage needs we used the high end of the reported range divided by 1 for an individual response, or by G, the number of individuals in a group response. For Big Data users we used the actual reported values or estimated likely values.

    Resources in this dataset:

    • Resource Title: Appendix A: ARS data storage survey questions. File Name: Appendix A.pdf. Resource Description: The full list of questions asked with the possible responses. The survey was not administered using this PDF, but the PDF was generated directly from the administered survey using the Print option under Design Survey. Asterisked questions were required. A list of Research Units and their associated codes was provided in a drop-down not shown here. Resource Software Recommended: Adobe Acrobat, url: https://get.adobe.com/reader/

    • Resource Title: CSV of Responses from ARS Researcher Data Storage Survey. File Name: Machine-readable survey response data.csv. Resource Description: CSV file that includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. This is the same data as in the Excel spreadsheet (also provided).

    • Resource Title: Responses from ARS Researcher Data Storage Survey. File Name: Data Storage Survey Data for public release.xlsx. Resource Description: MS Excel worksheet that includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. Resource Software Recommended: Microsoft Excel, url: https://products.office.com/en-us/excel
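
    A minimal sketch of the per-person calculation described above, assuming hypothetical column names for the high end of the reported storage range and the group size G in the released CSV:

     import pandas as pd

     df = pd.read_csv("Machine-readable survey response data.csv")
     # High end of the reported range (TB) divided by 1 for an individual
     # response, or by G for a group response; column names are assumptions.
     df["per_person_tb"] = df["range_high_tb"] / df["group_size"].clip(lower=1)
     print(df["per_person_tb"].describe())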

  8. NIST Stopping-Power & Range Tables for Electrons, Protons, and Helium Ions -...

    • catalog.data.gov
    • data.amerigeoss.org
    • +1more
    Updated Sep 30, 2025
    Cite
    National Institute of Standards and Technology (2025). NIST Stopping-Power & Range Tables for Electrons, Protons, and Helium Ions - SRD 124 [Dataset]. https://catalog.data.gov/dataset/nist-stopping-power-range-tables-for-electrons-protons-and-helium-ions-srd-124
    Explore at:
    Dataset updated
    Sep 30, 2025
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    The databases ESTAR, PSTAR, and ASTAR calculate stopping-power and range tables for electrons, protons, or helium ions. Stopping-power and range tables can be calculated for electrons in any user-specified material and for protons and helium ions in 74 materials.

  9. Data from: Exact finite range DWBA calculations for heavy-ion induced...

    • elsevier.digitalcommonsdata.com
    Updated Jan 1, 1974
    + more versions
    Cite
    T. Tamura (1974). Exact finite range DWBA calculations for heavy-ion induced nuclear reactions [Dataset]. http://doi.org/10.17632/xthy9b534c.1
    Explore at:
    Dataset updated
    Jan 1, 1974
    Authors
    T. Tamura
    License

    https://www.elsevier.com/about/policies/open-access-licenses/elsevier-user-license/cpc-license/

    Description

    Title of program: MARS-1-FOR-EFR-DWBA
    Catalogue Id: ABPB_v1_0

    Nature of problem: The package SATURN-MARS-1 consists of two programs, SATURN and MARS, for calculating cross sections of reactions transferring nucleon(s) primarily between two heavy ions. The calculations are made within the framework of the finite-range distorted wave Born approximation (DWBA). The first part, SATURN, prepares the form factor(s) either for the exact finite range (EFR) or for the no-recoil (NR) approach. The prepared form factor is then used by the second part, MARS, to calculate either EFR-DWBA or NR-DWBA cross-s ...

    Versions of this program held in the CPC repository in Mendeley Data: abpb_v1_0; MARS-1-FOR-EFR-DWBA; 10.1016/0010-4655(74)90012-5

    This program has been imported from the CPC Program Library held at Queen's University Belfast (1969-2019)

  10. Estimated stand-off distance between ADS-B equipped aircraft and obstacles

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    jpeg, zip
    Updated Jul 12, 2024
    Cite
    Andrew Weinert; Andrew Weinert (2024). Estimated stand-off distance between ADS-B equipped aircraft and obstacles [Dataset]. http://doi.org/10.5281/zenodo.7741273
    Explore at:
    zip, jpeg
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andrew Weinert; Andrew Weinert
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Summary:

    Estimated stand-off distance between ADS-B equipped aircraft and obstacles. Obstacle information was sourced from the FAA Digital Obstacle File and the FHWA National Bridge Inventory. Aircraft tracks were sourced from processed data curated from the OpenSky Network. Results are presented as histograms organized by aircraft type and distance away from runways.

    Description:

    For many aviation safety studies, aircraft behavior is represented using encounter models, which are statistical models of how aircraft behave during close encounters. They are used to provide a realistic representation of the range of encounter flight dynamics where an aircraft collision avoidance system would be likely to alert. These models currently, and historically, have been limited to interactions between aircraft; they have not represented the specific interactions between obstacles and transponder-equipped aircraft. In response, we calculated the standoff distance between obstacles and ADS-B equipped manned aircraft.

    For robustness, MIT LL calculated the standoff distance using two different datasets of manned aircraft tracks and two datasets of obstacles. This approach aligned with the foundational research used to support the ASTM F3442/F3442M-20 well clear criteria of 2000 feet laterally and 250 feet AGL vertically.

    Both datasets of processed tracks of ADS-B equipped aircraft were curated from the OpenSky Network. It is likely that rotorcraft were underrepresented in these datasets. There were also no considerations for aircraft equipped only with Mode C or not equipped with any transponder. The first dataset was used to train the v1.3 uncorrelated encounter models and is referred to as the “Monday” dataset. The second dataset is referred to as the “aerodrome” dataset and was used to train the v2.0 and v3.x terminal encounter models. The Monday dataset consisted of 104 Mondays across North America. The aerodrome dataset was based on observations within 8 nautical miles of Class B, C, and D aerodromes in the United States for the first 14 days of each month from January 2019 through February 2020. Prior to any processing, the datasets required 714 and 847 gigabytes of storage. For more details on these datasets, please refer to "Correlated Bayesian Model of Aircraft Encounters in the Terminal Area Given a Straight Takeoff or Landing" and “Benchmarking the Processing of Aircraft Tracks with Triples Mode and Self-Scheduling.”

    Two different datasets of obstacles were also considered. The first was point obstacles defined by the FAA digital obstacle file (DOF), consisting of point obstacle structures of antenna, lighthouse, meteorological tower (met), monument, sign, silo, spire (steeple), stack (chimney; industrial smokestack), transmission line tower (t-l tower), tank (water; fuel), tramway, utility pole (telephone pole, or pole of similar height, supporting wires), windmill (wind turbine), and windsock. Each obstacle was represented by a cylinder with the height reported by the DOF and a radius based on the reported horizontal accuracy. We did not consider the actual width and height of the structure itself. Additionally, we only considered obstacles at least 50 feet tall and marked as verified in the DOF.

    The other obstacle dataset, termed “bridges,” was based on the bridges identified in the FAA DOF and additional information provided by the National Bridge Inventory (NBI). Due to the potential size and extent of bridges, it would not be appropriate to model them as point obstacles; however, the FAA DOF only provides a point location and no information about the size of a bridge. In response, we correlated the FAA DOF with the National Bridge Inventory, which provides information about the length of many bridges. Instead of sizing the simulated bridge based on horizontal accuracy, as with the point obstacles, each bridge was represented as a circle with a radius based on the length of the longest nearby bridge in the NBI. A circle representation was required because neither the FAA DOF nor the NBI provided sufficient information about orientation to represent bridges as rectangular cuboids. As with the point obstacles, the height of the obstacle was based on the height reported by the FAA DOF. Accordingly, the analysis using the bridge dataset should be viewed as risk-averse and conservative: it is possible that a manned aircraft was hundreds of feet away from an obstacle in actuality while the estimated standoff distance was significantly less. Additionally, because all obstacles are represented with a fixed height, the potentially flat and low-level entrances of a bridge are assumed to have the same height as the tall bridge towers. The attached figure illustrates an example simulated bridge.

    It would have been extremely computationally inefficient to calculate the standoff distance for all possible track points. Instead, we defined an encounter between an aircraft and an obstacle as an aircraft flying at 3069 feet AGL or less coming within 3000 feet laterally of any obstacle in a 60 second time interval. If the criteria were satisfied, then for that 60 second track segment we calculated the standoff distance to all nearby obstacles. Vertical separation was based on the MSL altitude of the track and the maximum MSL height of an obstacle.
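
    A minimal sketch of that encounter filter, using the thresholds quoted above; the track and obstacle structures are hypothetical stand-ins for the actual processed data:

     import math

     def is_encounter(track_points, obstacle_xy, lateral_ft=3000.0, agl_ft=3069.0):
         # track_points: (x_ft, y_ft, alt_agl_ft) samples over a 60 s window
         ox, oy = obstacle_xy
         return any(
             alt <= agl_ft and math.hypot(x - ox, y - oy) <= lateral_ft
             for x, y, alt in track_points
         )

     print(is_encounter([(0, 0, 500), (2500, 1000, 400)], (3000, 1500)))  # True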

    For each combination of aircraft track and obstacle datasets, the results were organized seven different ways. Filtering criteria were based on aircraft type and distance away from runways. Runway data was sourced from the FAA runways of the United States, Puerto Rico, and Virgin Islands open dataset. Aircraft type was identified as part of the em-processing-opensky workflow.

    • All: No filter, all observations that satisfied encounter conditions
    • nearRunway: Aircraft within or at 2 nautical miles of a runway
    • awayRunway: Observations more than 2 nautical miles from a runway
    • glider: Observations when aircraft type is a glider
    • fwme: Observations when aircraft type is a fixed-wing multi-engine
    • fwse: Observations when aircraft type is a fixed-wing single engine
    • rotorcraft: Observations when aircraft type is a rotorcraft

    License

    This dataset is licensed under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0).

    This license requires that reusers give credit to the creator. It allows reusers to copy and distribute the material in any medium or format in unadapted form and for noncommercial purposes only. Only noncommercial use of the work is permitted. Noncommercial means not primarily intended for or directed towards commercial advantage or monetary compensation. Exceptions are given for the not-for-profit standards organizations ASTM International and RTCA.

    MIT is releasing this dataset in good faith to promote open and transparent research of the low altitude airspace. Given the limitations of the dataset and a need for more research, a more restrictive license was warranted. Namely, it is based only on observations of ADS-B equipped aircraft, which not all aircraft in the airspace are required to employ, and the observations were sourced from a crowdsourced network whose surveillance coverage has not been robustly characterized.

    As more research is conducted and the low altitude airspace is further characterized or regulated, it is expected that a future version of this dataset may have a more permissive license.

    Distribution Statement

    DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited.

    © 2021 Massachusetts Institute of Technology.

    Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.

    This material is based upon work supported by the Federal Aviation Administration under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Federal Aviation Administration.

    This document is derived from work done for the FAA (and possibly others); it is not the direct product of work done for the FAA. The information provided herein may include content supplied by third parties. Although the data and information contained herein has been produced or processed from sources believed to be reliable, the Federal Aviation Administration makes no warranty, expressed or implied, regarding the accuracy, adequacy, completeness, legality, reliability or usefulness of any information, conclusions or recommendations provided herein. Distribution of the information contained herein does not constitute an endorsement or warranty of the data or information provided herein by the Federal Aviation Administration or the U.S. Department of Transportation. Neither the Federal Aviation Administration nor the U.S. Department of

  11. Helsinki Region Travel Time Matrix

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jan 24, 2020
    Cite
    Henrikki Tenkanen; Henrikki Tenkanen; Tuuli Toivonen; Tuuli Toivonen (2020). Helsinki Region Travel Time Matrix [Dataset]. http://doi.org/10.5281/zenodo.3247564
    Explore at:
    zip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Henrikki Tenkanen; Henrikki Tenkanen; Tuuli Toivonen; Tuuli Toivonen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Helsinki metropolitan area, Helsinki
    Description

    Helsinki Region Travel Time Matrix contains travel time and distance information for routes between all 250 m x 250 m grid cell centroids (n = 13231) in the Helsinki Region, Finland by walking, cycling, public transportation and car. The grid cells are compatible with the statistical grid cells used by Statistics Finland and the YKR (yhdyskuntarakenteen seurantajärjestelmä) data set. The Helsinki Region Travel Time Matrix is available for three different years:

    • 2018
    • 2015
    • 2013

    The data consists of travel time and distance information of the routes that have been calculated between all statistical grid cell centroids (n = 13231) by walking, cycling, public transportation and car.

    The data have been calculated for two different times of the day: 1) midday and 2) rush hour.

    The data may be used freely (under Creative Commons 4.0 licence). We do not take any responsibility for any mistakes, errors or other deficiencies in the data.

    Organization of data

    The data have been divided into 13231 text files according to the destinations of the routes. The data files have been organized into sub-folders that contain multiple (approx. 4-150) Travel Time Matrix result files. Individual folders consist of all the Travel Time Matrices that have the same first four digits in their filename (e.g. 5785xxx).

    In order to visualize the data on a map, the result tables can be joined with the MetropAccess YKR-grid shapefile (attached here). The data can be joined by using the field ‘from_id’ in the text files and the field ‘YKR_ID’ in MetropAccess-YKR-grid shapefile as a common key.

    Data structure

    The data have been divided into 13231 text files according to destinations of the routes. One file includes the routes from all statistical grid cells to a particular destination grid cell. All files have been named according to the destination grid cell code and each file includes 13231 rows.

    NODATA values have been stored as value -1.

    Each file consists of 18 attribute fields: 1) from_id, 2) to_id, 3) walk_t, 4) walk_d, 5) bike_f_t, 6) bike_s_t, 7) bike_d, 8) pt_r_tt, 9) pt_r_t, 10) pt_r_d, 11) pt_m_tt, 12) pt_m_t, 13) pt_m_d, 14) car_r_t, 15) car_r_d, 16) car_m_t, 17) car_m_d, 18) car_sl_t

    The fields are separated by semicolons in the text files.

    Attributes

    • from_id: ID number of the origin grid cell
    • to_id: ID number of the destination grid cell
    • walk_t: Travel time in minutes from origin to destination by walking
    • walk_d: Distance in meters of the walking route
    • bike_f_t: Total travel time in minutes from origin to destination by fast cycling; Includes extra time (1 min) that it takes to take/return bike
    • bike_s_t: Total travel time in minutes from origin to destination by slow cycling; Includes extra time (1 min) that it takes to take/return bike
    • bike_d: Distance in meters of the cycling route
    • pt_r_tt: Travel time in minutes from origin to destination by public transportation in rush hour traffic; whole travel chain has been taken into account including the waiting time at home
    • pt_r_t: Travel time in minutes from origin to destination by public transportation in rush hour traffic; whole travel chain has been taken into account excluding the waiting time at home
    • pt_r_d: Distance in meters of the public transportation route in rush hour traffic
    • pt_m_tt: Travel time in minutes from origin to destination by public transportation in midday traffic; whole travel chain has been taken into account including the waiting time at home
    • pt_m_t: Travel time in minutes from origin to destination by public transportation in midday traffic; whole travel chain has been taken into account excluding the waiting time at home
    • pt_m_d: Distance in meters of the public transportation route in midday traffic
    • car_r_t: Travel time in minutes from origin to destination by private car in rush hour traffic; the whole travel chain has been taken into account
    • car_r_d: Distance in meters of the private car route in rush hour traffic
    • car_m_t: Travel time in minutes from origin to destination by private car in midday traffic; the whole travel chain has been taken into account
    • car_m_d: Distance in meters of the private car route in midday traffic
    • car_sl_t: Travel time from origin to destination by private car following speed limits without any additional impedances; the whole travel chain has been taken into account
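
    A minimal sketch of working with one matrix file, reading it with pandas and joining it onto the YKR grid with geopandas as described above; the matrix filename is hypothetical, and NODATA (-1) is mapped to NaN:

     import pandas as pd
     import geopandas as gpd

     tt = pd.read_csv("travel_times_to_5785640.txt", sep=";", na_values=[-1])
     grid = gpd.read_file("MetropAccess_YKR_grid.shp")  # attached shapefile
     # join on the common key: YKR_ID in the grid, from_id in the text files
     joined = grid.merge(tt, left_on="YKR_ID", right_on="from_id")
     print(joined[["YKR_ID", "walk_t", "pt_r_t", "car_r_t"]].head())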

    METHODS

    For detailed documentation and how to reproduce the data, see HelsinkiRegionTravelTimeMatrix2018 GitHub repository.

    THE ROUTE BY CAR have been calculated with a dedicated open source tool called DORA (DOor-to-door Routing Analyst) developed for this project. DORA uses PostgreSQL database with PostGIS extension and is based on the pgRouting toolkit. MetropAccess-Digiroad (modified from the original Digiroad data provided by Finnish Transport Agency) has been used as a street network in which the travel times of the road segments are made more realistic by adding crossroad impedances for different road classes.

    The calculations have been repeated for two times of the day using 1) the “midday impedance” (i.e. travel times outside rush hour) and 2) the “rush hour impedance” as impedance in the calculations. Moreover, there is 3) the “speed limit impedance” calculated in the matrix (i.e. using speed limits without any additional impedances).

    The whole travel chain (“door-to-door approach”) is taken into account in the calculations:
    1) walking time from the real origin to the nearest network location (based on Euclidean distance),
    2) average walking time from the origin to the parking lot,
    3) travel time from parking lot to destination,
    4) average time for searching a parking lot,
    5) walking time from parking lot to nearest network location of the destination and
    6) walking time from network location to the real destination (based on Euclidean distance).

    THE ROUTES BY PUBLIC TRANSPORTATION have been calculated by using the MetropAccess-Reititin tool which also takes into account the whole travel chains from the origin to the destination:
    1) possible waiting at home before leaving,
    2) walking from home to the transit stop,
    3) waiting at the transit stop,
    4) travel time to next transit stop,
    5) transport mode change,
    6) travel time to next transit stop and
    7) walking to the destination.

    Travel times by public transportation have been optimized using 10 different departure times within the calculation hour, chosen using a so-called Golomb ruler. The fastest route from these calculations is selected for the final travel time matrix.

    THE ROUTES BY CYCLING are also calculated using the DORA tool. The network dataset underneath is MetropAccess-CyclingNetwork, which is a modified version from the original Digiroad data provided by Finnish Transport Agency. In the dataset the travel times for the road segments have been modified to be more realistic based on Strava sports application data from the Helsinki region from 2016 and the bike sharing system data from Helsinki from 2017.

    For each road segment a separate speed value was calculated for slow and fast cycling. The value for fast cycling is based on the percentage difference between the segment-specific Strava speed value and the average speed value for the whole Strava dataset. The same percentage difference has been applied to calculate the slower speed value for each road segment: the speed value is the average speed of bike sharing system users multiplied by the percentage difference value.

    The reference value for faster cycling has been 19 km/h, which is based on the average speed of Strava sports application users in the Helsinki region. The reference value for slower cycling has been 12 km/h, which has been the average travel speed of bike sharing system users in Helsinki. An additional 1 minute has been added to the travel time to account for taking (30 s) and returning (30 s) the bike at the origin/destination.

    More information about the Strava dataset that was used can be found in the Cycling routes and fluency report, which was published by us and the city of Helsinki.

    THE ROUTES BY WALKING were also calculated using the MetropAccess-Reititin by disabling all motorized transport modes in the calculation. Thus, all routes are based on the OpenStreetMap geometry.

    The walking speed has been adjusted to 70 meters per minute, which is the default speed in the HSL Journey Planner (also in the calculations by public transportation).

    All calculations were done using the computing resources of CSC-IT Center for Science (https://www.csc.fi/home).

  12. housing

    • kaggle.com
    zip
    Updated Sep 22, 2023
    Cite
    HappyRautela (2023). housing [Dataset]. https://www.kaggle.com/datasets/happyrautela/housing
    Explore at:
    zip (809785 bytes)
    Dataset updated
    Sep 22, 2023
    Authors
    HappyRautela
    Description

    The exercise after this contains questions that are based on the housing dataset.

    1. How many houses have a waterfront? a. 21000 b. 21450 c. 163 d. 173

    2. How many houses have 2 floors? a. 2692 b. 8241 c. 10680 d. 161

    3. How many houses built before 1960 have a waterfront? a. 80 b. 7309 c. 90 d. 92

    4. What is the price of the most expensive house having more than 4 bathrooms? a. 7700000 b. 187000 c. 290000 d. 399000

    5. For instance, if the ‘price’ column consists of outliers, how can you make the data clean and remove the redundancies? a. Calculate the IQR range and drop the values outside the range. b. Calculate the p-value and remove the values less than 0.05. c. Calculate the correlation coefficient of the price column and remove the values less than the correlation coefficient. d. Calculate the Z-score of the price column and remove the values less than the z-score.

    6. What are the various parameters that can be used to determine the dependent variables in the housing data to determine the price of the house? a. Correlation coefficients b. Z-score c. IQR Range d. Range of the Features

    7. If we get the r2 score as 0.38, what inferences can we make about the model and its efficiency? a. The model is 38% accurate, and shows poor efficiency. b. The model is showing 0.38% discrepancies in the outcomes. c. Low difference between observed and fitted values. d. High difference between observed and fitted values.

    8. If the metrics show that the p-value for the grade column is 0.092, what all inferences can we make about the grade column? a. Significant in presence of other variables. b. Highly significant in presence of other variables c. insignificance in presence of other variables d. None of the above

    9. If the Variance Inflation Factor value for a feature is considerably higher than the other features, what can we say about that column/feature? a. High multicollinearity b. Low multicollinearity c. Both A and B d. None of the above
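
    For reference, a minimal sketch of the IQR-based cleaning that question 5 alludes to, assuming the file exposes a 'price' column (filename hypothetical):

     import pandas as pd

     df = pd.read_csv("housing.csv")  # hypothetical filename
     q1, q3 = df["price"].quantile([0.25, 0.75])
     iqr = q3 - q1
     # keep rows inside the conventional 1.5 * IQR fences
     mask = df["price"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
     df_clean = df[mask]
     print(len(df), "->", len(df_clean))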

  13. Data from: U.S. Geological Survey calculated half interpercentile range...

    • s.cnmilf.com
    • search.dataone.org
    • +1more
    Updated Oct 1, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). U.S. Geological Survey calculated half interpercentile range (half of the difference between the 16th and 84th percentiles) of wave-current bottom shear stress in the South Atlantic Bight from May 2010 to May 2011 (SAB_hIPR.shp, polygon shapefile, Geographic, WGS84) [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/u-s-geological-survey-calculated-half-interpercentile-range-half-of-the-difference-between
    Explore at:
    Dataset updated
    Oct 1, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    The U.S. Geological Survey has been characterizing the regional variation in shear stress on the sea floor and sediment mobility through statistical descriptors. The purpose of this project is to identify patterns in stress in order to inform habitat delineation or decisions for anthropogenic use of the continental shelf. The statistical characterization spans the continental shelf from the coast to approximately 120 m water depth, at approximately 5 km resolution. Time-series of wave and circulation are created using numerical models, and near-bottom output of steady and oscillatory velocities and an estimate of bottom roughness are used to calculate a time-series of bottom shear stress at 1-hour intervals. Statistical descriptions such as the median and 95th percentile, which are the output included with this database, are then calculated to create a two-dimensional picture of the regional patterns in shear stress. In addition, time-series of stress are compared to critical stress values at select points calculated from observed surface sediment texture data to determine estimates of sea floor mobility.

  14. Traveling Salesman Computer Vision

    • kaggle.com
    zip
    Updated Apr 20, 2022
    Cite
    Jeff Heaton (2022). Traveling Salesman Computer Vision [Dataset]. https://www.kaggle.com/datasets/jeffheaton/traveling-salesman-computer-vision
    Explore at:
    zip (2977884049 bytes)
    Dataset updated
    Apr 20, 2022
    Authors
    Jeff Heaton
    License

    http://www.gnu.org/licenses/lgpl-3.0.html

    Description

    The Traveling Salesperson Problem (TSP) is a classic problem in computer science that seeks the shortest route through a group of cities. It is an NP-hard problem in combinatorial optimization, important in theoretical computer science and operations research.

    [Image: World Map (https://data.heatonresearch.com/images/wustl/kaggle/tsp/world-tsp.png)]

    In this Kaggle competition, your goal is not to find the shortest route among cities. Rather, you must attempt to determine the route labeled on a map.

    Calculating Line Distances

    The data for this competition is not made up of real-world maps, but rather randomly generated maps of varying attributes of size, city count, and optimality of the routes. The following image demonstrates a relatively small map, with few cities, and an optimal route.

    [Image: Small Map (https://data.heatonresearch.com/images/wustl/kaggle/tsp/1.jpg)]

    Not all maps are this small, nor do all contain so optimal a route. Consider the following map, which is much larger.

    [Image: Larger Map (https://data.heatonresearch.com/images/wustl/kaggle/tsp/6.jpg)]

    The following attributes were randomly selected to generate each image.

    • Height
    • Width
    • City count
    • Cycles of Simulated Annealing optimization of initial random path

    The path distance is based on the sum of the Euclidean distance of all segments in the path. The distance units are in pixels.
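
    That definition is straightforward to reproduce; a minimal sketch with made-up points:

     import math

     def path_length(points):
         # sum of Euclidean segment lengths, in pixels
         return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

     print(path_length([(0, 0), (3, 4), (3, 10)]))  # 5.0 + 6.0 = 11.0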

    Dataset Challenges

    This is a regression problem: you are to estimate the total path length. There are several challenges to consider.

    • If you indiscriminately scale the maps, you will lose size information.
    • Paths might overlap, causing the ratio of total pixels to total length to become misleading.
    • As paths overlap both other path segments and cities, the resulting color becomes brighter.

    The following picture shows a section from one map zoomed to the pixel-level:

    [Image: TSP Zoom (https://data.heatonresearch.com/images/wustl/kaggle/tsp/tsp_zoom.jpg)]

    CSV Files

    The following CSV files are provided, in addition to the images.

    • train.csv - Training data, with distance labels.
    • test.csv - Test data without distance labels.
    • tsp-all.csv - Training and test data combined with complete labels and additional information about each generated map.

    CSV File Format

    The tsp-all.csv file contains the following data.

    id,filename,distance,key
    0,0.jpg,83110,503x673-270-83110.jpg
    1,1.jpg,1035,906x222-10-1035.jpg
    2,2.jpg,20756,810x999-299-20756.jpg
    3,3.jpg,13286,781x717-272-13286.jpg
    4,4.jpg,13924,609x884-312-13924.jpg
    

    The columns:

    • id - A unique ID that allows linking across all three CSV files.
    • filename - The name of each map's image file.
    • distance - The total distance through the cities; this is the y/label.
    • key - The generator filename, provides the dimensions, city count, & distance.
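
    A minimal sketch of unpacking that key field; treating the first two numbers as width x height is an assumption:

     def parse_key(key):
         dims, cities, dist = key.split("-")
         w, h = dims.split("x")
         return int(w), int(h), int(cities), int(dist.removesuffix(".jpg"))

     print(parse_key("503x673-270-83110.jpg"))  # (503, 673, 270, 83110)
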
  15. Math Formula Retrieval

    • kaggle.com
    • huggingface.co
    zip
    Updated Dec 2, 2023
    Cite
    The Devastator (2023). Math Formula Retrieval [Dataset]. https://www.kaggle.com/datasets/thedevastator/math-formula-pair-classification-dataset/data
    Explore at:
    zip (2021716728 bytes)
    Dataset updated
    Dec 2, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Math Formula Retrieval

    Math Formula Pair Classification Dataset

    By ddrg (From Huggingface) [source]

    About this dataset

    The dataset provides six columns in total: formula1, formula2, and label (binary format) in each of its two splits, giving all the necessary information for conducting comprehensive analysis and evaluation.

    The train.csv file contains a subset of the dataset specifically curated for training purposes. It includes an extensive range of math formula pairs along with their corresponding labels and unique ID names. This allows researchers and data scientists to construct models that can predict whether two given formulas fall within the same category or not.

    On the other hand, test.csv serves as an evaluation set. It consists of additional pairs of math formulas accompanied by their respective labels and unique IDs. By evaluating model performance on this test set after training it on train.csv data, researchers can assess how well their models generalize to unseen instances.

    By leveraging this informative dataset, researchers can unlock new possibilities in mathematics-related fields, such as developing pattern recognition algorithms or enhancing educational tools that involve automatic identification and categorization based on mathematical formulas.

    How to use the dataset

    Introduction

    Dataset Description

    train.csv

    The train.csv file contains a set of labeled math formula pairs along with their corresponding labels and formula name IDs. It consists of the following columns:

    • formula1: The first mathematical formula in the pair (text).
    • formula2: The second mathematical formula in the pair (text).
    • label: The classification label indicating whether the pair of formulas belong to the same category or not (binary). A label value of 1 indicates that both formulas belong to the same category, while a label value of 0 indicates different categories.

    test.csv

    The purpose of the test.csv file is to provide a set of formula pairs along with their labels and formula name IDs for testing and evaluation purposes. It has an identical structure to train.csv, containing columns like formula1, formula2, label, etc.

    Task

    The main task using this dataset is binary classification, where your objective is to predict whether two mathematical formulas belong to the same category or not based on their textual representation. You can use various machine learning algorithms such as logistic regression, decision trees, random forests, or neural networks for training models on this dataset.
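
    As one possible baseline (not the dataset authors' method), a minimal sketch that trains a logistic regression on character n-gram TF-IDF features of the two documented formula columns:

     import pandas as pd
     from scipy.sparse import hstack
     from sklearn.feature_extraction.text import TfidfVectorizer
     from sklearn.linear_model import LogisticRegression

     train = pd.read_csv("train.csv")
     vec = TfidfVectorizer(analyzer="char", ngram_range=(2, 4), max_features=20000)
     vec.fit(pd.concat([train["formula1"], train["formula2"]]))
     X = hstack([vec.transform(train["formula1"]), vec.transform(train["formula2"])])
     clf = LogisticRegression(max_iter=1000).fit(X, train["label"])
     print(clf.score(X, train["label"]))  # training accuracy, sanity check only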

    Exploring & Analyzing Data

    Before building your model, it's crucial to explore and analyze your data. Here are some steps you can take:

    • Load both CSV files (train.csv and test.csv) into your preferred data analysis framework or programming language (e.g., Python with libraries like pandas).
    • Examine the dataset's structure, including the number of rows, columns, and data types.
    • Check for missing values in the dataset and handle them accordingly.
    • Visualize the distribution of labels to understand whether it is balanced or imbalanced.

    Model Building

    Once you have analyzed and preprocessed your dataset, you can start building your classification model using various machine learning algorithms:

    • Split your train.csv data into training and validation sets for model evaluation during training.
    • Choose a suitable

    Research Ideas

    • Math Formula Similarity: This dataset can be used to develop a model that classifies whether two mathematical formulas are similar or not. This can be useful in various applications such as plagiarism detection, identifying duplicate formulas in databases, or suggesting similar formulas based on user input.
    • Formula Categorization: The dataset can be used to train a model that categorizes mathematical formulas into different classes or categories. For example, the model can classify formulas into algebraic expressions, trigonometric equations, calculus problems, or geometric theorems. This categorization can help organize and search through large collections of mathematical formulas.
    • Formula Recommendation: Using this dataset, one could build a recommendation system that suggests related math formulas based on user input. By analyzing the similarities between different formula pairs and their corresponding labels, the system could provide recommendations for relevant mathematical concepts that users may need while solving problems or studying specific topics in mathematics

    Acknowle...

  16. Data from: Analysis of the Scalar and Vector Random Coupling Models For a...

    • data.europa.eu
    • demo.researchdata.se
    • +2more
    unknown
    Updated Sep 11, 2023
    Cite
    Chalmers tekniska högskola (2023). Analysis of the Scalar and Vector Random Coupling Models For a Four Coupled-Core Fiber [Dataset]. https://data.europa.eu/data/datasets/https-doi-org-10-5281-zenodo-7895952~~1?locale=mt
    Explore at:
    unknown
    Dataset updated
    Sep 11, 2023
    Dataset authored and provided by
    Chalmers tekniska högskola
    Description

    These files contain simulation results for the ECOC 2023 submission "Analysis of the Scalar and Vector Random Coupling Models For a Four Coupled-Core Fiber". The "4CCF_eigenvectorsPol" file is the Mathematica code which enables calculating the supermodes (eigenvectors of M(w)) and their propagation constants of a 4-coupled-core fiber (4CCF). These results are uploaded to the python notebook "4CCF_modelingECOC" in order to plot them and obtain Fig. 2 in the paper. "TransferMatrix" is the python file with functions used for modeling, simulation and plotting. It is also uploaded in the python notebook "4CCF_modelingECOC", where all the calculations for the figures in the paper are presented.

    ! UPD 25.09.2023: There is an error in the formula for the birefringence calculation. It is in the function "CouplingCoefficients" in the "TransferMatrix" file. There, the variable "birefringence" has to be calculated according to formula (19) of [A. Ankiewicz, A. Snyder, and X.-H. Zheng, "Coupling between parallel optical fiber cores–critical examination", Journal of Lightwave Technology, vol. 4, no. 9, pp. 1317–1323, 1986]: (4*U**2*W*spec.k0(W)*spec.kn(2, W_)/(spec.k1(W)*V**4))*((spec.iv(1, W)/spec.k1(W))-(spec.iv(2, W)/spec.k0(W))). The correct formula gives almost the same result (the difference is 10^-5), but one has to use the correct formula anyway.

    ! UPD 9.12.2023: I have noticed that in the published version of the code I forgot to change the wavelength range for the impulse response calculation. So instead of seeing the nice shape as in the paper, you will see a resolution-limited shape. To solve that, just change the range of wavelengths; you can add "wl = [1545e-9, 1548e-9]" in the first cell after "Total power impulse response".

    P.S. In case of any questions or suggestions, you are welcome to write me an email: ekader@chalmers.se

  17. Data from: Numerical Simulation Strategy and Applications for Falling Film...

    • acs.figshare.com
    xlsx
    Updated Jan 3, 2025
    Cite
    Wenxu Yuan; Wenxing Chen; Shichang Chen (2025). Numerical Simulation Strategy and Applications for Falling Film Flow with Variable Viscosity Fluids [Dataset]. http://doi.org/10.1021/acs.iecr.4c03582.s001
    Explore at:
    xlsx
    Dataset updated
    Jan 3, 2025
    Dataset provided by
    ACS Publications
    Authors
    Wenxu Yuan; Wenxing Chen; Shichang Chen
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The flow behaviors of falling film with wide-range variable viscosity were demonstrated by a new simulation strategy, which incorporated the age transport equation based on mean age theory and a designable age-viscosity formula into the Navier–Stokes equations. Surprisingly, a turning region was revealed, in which the variation of thickness with flow rate and initial viscosity for a variable-viscosity falling film was reversed. The larger the flow rate or the higher the initial viscosity, the longer the turning region, and the farther it was from the inlet along the flow direction. A flow cross-sectional viscosity was proposed to explain this anomaly. Then, a simulation scheme for calculating the initial viscosity based on outlet viscosity and an empirical equation for designing the length of the falling film pipe could be achieved according to flow cross-sectional viscosity analysis. It provides a practical reference for falling film reactor design, scale-up, and process optimization.

  18. Data from: Variable Terrestrial GPS Telemetry Detection Rates: Parts 1 -...

    • s.cnmilf.com
    • data.usgs.gov
    • +2more
    Updated Oct 2, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Variable Terrestrial GPS Telemetry Detection Rates: Parts 1 - 7—Data [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/variable-terrestrial-gps-telemetry-detection-rates-parts-1-7data
    Explore at:
    Dataset updated
    Oct 2, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    Studies utilizing Global Positioning System (GPS) telemetry rarely result in 100% fix success rates (FSR). Many assessments of wildlife resource use do not account for missing data, either assuming data loss is random or because a lack of practical treatment for systematic data loss. Several studies have explored how the environment, technological features, and animal behavior influence rates of missing data in GPS telemetry, but previous spatially explicit models developed to correct for sampling bias have been specified to small study areas, on a small range of data loss, or to be species-specific, limiting their general utility. Here we explore environmental effects on GPS fix acquisition rates across a wide range of environmental conditions and detection rates for bias correction of terrestrial GPS-derived, large mammal habitat use. We also evaluate patterns in missing data that relate to potential animal activities that change the orientation of the antennae and characterize home-range probability of GPS detection for 4 focal species; cougars (Puma concolor), desert bighorn sheep (Ovis canadensis nelsoni), Rocky Mountain elk (Cervus elaphus ssp. nelsoni) and mule deer (Odocoileus hemionus). Part 1, Positive Openness Raster (raster dataset): Openness is an angular measure of the relationship between surface relief and horizontal distance. For angles less than 90 degrees it is equivalent to the internal angle of a cone with its apex at a DEM _location, and is constrained by neighboring elevations within a specified radial distance. 480 meter search radius was used for this calculation of positive openness. Openness incorporates the terrain line-of-sight or viewshed concept and is calculated from multiple zenith and nadir angles-here along eight azimuths. Positive openness measures openness above the surface, with high values for convex forms and low values for concave forms (Yokoyama et al. 2002). We calculated positive openness using a custom python script, following the methods of Yokoyama et. al (2002) using a USGS National Elevation Dataset as input. Part 2, Northern Arizona GPS Test Collar (csv): Bias correction in GPS telemetry data-sets requires a strong understanding of the mechanisms that result in missing data. We tested wildlife GPS collars in a variety of environmental conditions to derive a predictive model of fix acquisition. We found terrain exposure and tall over-story vegetation are the primary environmental features that affect GPS performance. Model evaluation showed a strong correlation (0.924) between observed and predicted fix success rates (FSR) and showed little bias in predictions. The model's predictive ability was evaluated using two independent data-sets from stationary test collars of different make/model, fix interval programming, and placed at different study sites. No statistically significant differences (95% CI) between predicted and observed FSRs, suggest changes in technological factors have minor influence on the models ability to predict FSR in new study areas in the southwestern US. The model training data are provided here for fix attempts by hour. This table can be linked with the site _location shapefile using the site field. Part 3, Probability Raster (raster dataset): Bias correction in GPS telemetry datasets requires a strong understanding of the mechanisms that result in missing data. We tested wildlife GPS collars in a variety of environmental conditions to derive a predictive model of fix aquistion. 
    Part 3, Probability Raster (raster dataset): Using the predictive model of fix acquisition described in Part 2, we evaluated GPS telemetry datasets by comparing the mean probability of a successful GPS fix across study animals' home ranges to the observed FSR of collars retrieved from cougars (Puma concolor), desert bighorn sheep (Ovis canadensis nelsoni), Rocky Mountain elk (Cervus elaphus ssp. nelsoni), and mule deer (Odocoileus hemionus). Comparing the mean probability of acquisition within study animals' home ranges with the observed FSRs of downloaded collars yielded an approximately 1:1 linear relationship with r² = 0.68.

    Part 4, GPS Test Collar Sites (shapefile): Locations of the stationary test collar sites used to derive the predictive model of fix acquisition described in Part 2.
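    The Part 3 evaluation amounts to a calibration check: regress each animal's observed FSR on the mean fix probability within its home range and inspect the slope (close to 1 indicates a 1:1 relationship) and r². A minimal sketch; the paired values are illustrative placeholders, not the study data.

        # Hedged sketch: home-range mean fix probability vs. observed FSR.
        # The numbers below are made-up placeholders, not the study data.
        import numpy as np
        from scipy import stats

        mean_fix_probability = np.array([0.88, 0.92, 0.81, 0.95, 0.86])  # per animal
        observed_fsr = np.array([0.85, 0.93, 0.78, 0.96, 0.84])

        fit = stats.linregress(mean_fix_probability, observed_fsr)
        print(f"slope = {fit.slope:.2f}  (close to 1 indicates a 1:1 relationship)")
        print(f"r^2   = {fit.rvalue ** 2:.2f}")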
    Part 5, Cougar Home Ranges (shapefile): Cougar home ranges were calculated to compare the mean probability of GPS fix acquisition across the home range with the collar's observed fix success rate (FSR), as a means of evaluating whether characteristics of an animal's home range affect observed FSR. We estimated home ranges using the Local Convex Hull (LoCoH) method at the 90th isopleth. Only data obtained by direct GPS download of retrieved units were used; satellite-delivered data were omitted for animals whose collars were lost or damaged, because satellite delivery tends to lose an additional ~10% of data. Comparisons with the home-range mean probability of fix were also used as a reference for assessing whether the frequency with which animals use areas of low GPS acquisition plays a role in observed FSRs.

    Part 6, Cougar Fix Success Rate by Hour (csv): Cougar GPS collar fix success varied by hour of day, suggesting that circadian rhythms, with bouts of rest during daylight hours, may change the orientation of the GPS receiver and affect its ability to acquire fixes. Raw data on overall FSR and FSR by hour were used to predict relative reductions in FSR. The data include only direct GPS download datasets; satellite-delivered data were omitted for animals whose collars were lost or damaged, because satellite delivery tends to lose approximately an additional 10% of data.

    Part 7, Openness Python Script version 2.0: This Python script was used to calculate positive openness from a 30 m digital elevation model covering a large geographic area in Arizona, California, Nevada, and Utah. The research project used the script to explore environmental effects on GPS fix acquisition rates across a wide range of environmental conditions and detection rates, for bias correction of terrestrial GPS-derived large-mammal habitat use.
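    For orientation, positive openness in the sense of Yokoyama et al. (2002) can be computed per DEM cell by scanning outward along eight azimuths up to the search radius, taking the maximum elevation angle seen along each azimuth, and averaging the corresponding zenith angles (90° minus the elevation angle). The sketch below is a simplified, unoptimized illustration under those assumptions, not the authors' version 2.0 script.

        # Hedged sketch of positive openness (after Yokoyama et al. 2002):
        # mean over eight azimuths of (90 deg - max elevation angle to any
        # cell within the search radius). Simplified; not the authors' script.
        import numpy as np

        def positive_openness(dem, cell_size=30.0, radius=480.0):
            """dem: 2-D elevation array (m); cell_size and radius in meters."""
            rows, cols = dem.shape
            steps = int(radius // cell_size)
            # Eight azimuths as (row, col) steps: N, NE, E, SE, S, SW, W, NW.
            directions = [(-1, 0), (-1, 1), (0, 1), (1, 1),
                          (1, 0), (1, -1), (0, -1), (-1, -1)]
            openness = np.empty(dem.shape)
            for r in range(rows):
                for c in range(cols):
                    zeniths = []
                    for dr, dc in directions:
                        angles = []
                        for k in range(1, steps + 1):
                            rr, cc = r + k * dr, c + k * dc
                            dist = k * cell_size * np.hypot(dr, dc)
                            if not (0 <= rr < rows and 0 <= cc < cols) or dist > radius:
                                break
                            angles.append(np.arctan2(dem[rr, cc] - dem[r, c], dist))
                        if angles:  # skip azimuths that leave the grid immediately
                            zeniths.append(np.pi / 2 - max(angles))
                    openness[r, c] = np.degrees(np.mean(zeniths)) if zeniths else np.nan
            return openness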

  19. Data from: Haploids adapt faster than diploids across a range of environments

    • datadryad.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Dec 7, 2010
    + more versions
    Cite
    Aleeza C Gerstein; Lesley A Cleathero; Mohammad A Mandegar; Sarah P. Otto (2010). Haploids adapt faster than diploids across a range of environments [Dataset]. http://doi.org/10.5061/dryad.8048
    Explore at:
    zipAvailable download formats
    Dataset updated
    Dec 7, 2010
    Dataset provided by
    Dryad
    Authors
    Aleeza C Gerstein; Lesley A Cleathero; Mohammad A Mandegar; Sarah P. Otto
    Time period covered
    Dec 7, 2010
    Description

    The dataset comprises the following files:

    • Raw data to calculate rate of adaptation: raw dataset for the rate-of-adaptation calculations (Figure 1) and related statistics (dataall.csv).
    • R code to analyze the raw rate-of-adaptation data (Competition Analysis.R).
    • Raw data to calculate effective population sizes (datacount.csv).
    • R code to analyze effective population sizes; produces Figure 2 (Cell Count Ne.R).
    • R code to determine our best estimate of the dominance coefficient in each environment; produces Figures 3, S4, and S5. Note: the competition and effective population size R code must be run first in the same session (what is h.R).
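    As a rough illustration of the kind of calculation the competition data support (the authors' actual analysis is in Competition Analysis.R above): a per-generation selection coefficient can be estimated as the slope of the log ratio of evolved to reference competitor counts over generations of competition. The sketch below uses hypothetical column names and made-up counts, not the Dryad data.

        # Hedged sketch: selection coefficient from a competition assay,
        # i.e., the slope of ln(evolved / reference) against generations.
        # Column names and counts are illustrative, not the Dryad data.
        import numpy as np
        import pandas as pd

        def selection_coefficient(counts: pd.DataFrame) -> float:
            """counts: columns 'generation', 'evolved', 'reference'."""
            log_ratio = np.log(counts["evolved"] / counts["reference"])
            slope, _ = np.polyfit(counts["generation"], log_ratio, 1)
            return slope  # fitness advantage per generation

        assay = pd.DataFrame({
            "generation": [0, 10, 20, 30],
            "evolved": [500, 640, 810, 1050],
            "reference": [500, 490, 480, 465],
        })
        print(f"s = {selection_coefficient(assay):.3f} per generation")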

  20. Summary and methods used to calculate the physical characteristics used to compare the home range estimators

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Mar 31, 2017
    Cite
    Nathan, Senthilvel K. S. S.; Saldivar, Diana A. Ramirez; Vaughan, Ian P.; Goossens, Benoit; Stark, Danica J. (2017). Summary and methods used to calculate the physical characteristics used to compare the home range estimators. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001743878
    Explore at:
    Dataset updated
    Mar 31, 2017
    Authors
    Nathan, Senthilvel K. S. S.; Saldivar, Diana A. Ramirez; Vaughan, Ian P.; Goossens, Benoit; Stark, Danica J.
    Description

    Summary and methods used to calculate the physical characteristics used to compare the home range estimators.
