100+ datasets found
  1. Data from: LMDiskANN.jl: An Implementation of the Low Memory Disk...

    • figshare.com
    zip
    Updated Jun 10, 2025
    Cite
    Alexander V. Mantzaris (2025). LMDiskANN.jl: An Implementation of the Low Memory Disk Approximate Nearest Neighbors Search Algorithm [Dataset]. http://doi.org/10.6084/m9.figshare.29286668.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 10, 2025
    Dataset provided by
    figshare
    Authors
    Alexander V. Mantzaris
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    LMDiskANN.jl (v1.2.0) is a Julia package that implements the Low-Memory Disk Approximate Nearest-Neighbor (LM-DiskANN) algorithm, extending DiskANN-style graph search to handle billion-scale vector datasets while keeping RAM usage to a minimum. It stores adjacency lists on disk via memory-mapped files, performs tunable best-first graph traversals for fast and accurate queries, and supports dynamic insertions and deletions with automatic pruning to maintain a compact index. The library exposes knobs to balance recall against latency, and it optionally pairs a LevelDB key–value store with the node IDs for flexible external key lookup. These capabilities make LMDiskANN.jl well-suited for embedding retrieval, recommendation systems, and other large-scale similarity-search workloads that need high throughput on commodity hardware.
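
    The best-first graph traversal mentioned above can be sketched roughly as follows. This is a minimal illustration in Python, not LMDiskANN.jl's Julia API: the function name, the in-memory `graph` dictionary (standing in for memory-mapped adjacency lists), and the `beam_width` parameter are all assumptions made for the example.

```python
import heapq
import math

def best_first_search(graph, vectors, query, entry, k=2, beam_width=4):
    """Illustrative best-first search over a neighbor graph (hypothetical helper)."""
    def dist(i):
        return math.dist(vectors[i], query)

    visited = {entry}
    # Min-heap of nodes still to expand, ordered by distance to the query.
    frontier = [(dist(entry), entry)]
    # Max-heap (negated distances) of the best `beam_width` nodes seen so far.
    best = [(-dist(entry), entry)]

    while frontier:
        d, node = heapq.heappop(frontier)
        # Stop when the closest unexpanded node is worse than the worst kept node.
        if len(best) >= beam_width and d > -best[0][0]:
            break
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d_nb = dist(nb)
            if len(best) < beam_width or d_nb < -best[0][0]:
                heapq.heappush(frontier, (d_nb, nb))
                heapq.heappush(best, (-d_nb, nb))
                if len(best) > beam_width:
                    heapq.heappop(best)  # drop the current worst candidate

    ranked = sorted((-neg_d, n) for neg_d, n in best)
    return [n for _, n in ranked][:k]
```

    A larger `beam_width` explores more of the graph and usually finds better neighbors at the cost of more distance computations, which is the kind of recall-versus-latency knob the description refers to.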

  2. nearest-neighbors-datasets

    • huggingface.co
    Updated Mar 22, 2025
    + more versions
    Cite
    Hassan Abedi (2025). nearest-neighbors-datasets [Dataset]. https://huggingface.co/datasets/habedi/nearest-neighbors-datasets
    Explore at:
    Dataset updated
    Mar 22, 2025
    Authors
    Hassan Abedi
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    Nearest Neighbors Search Datasets

    The datasets listed below are used in the Hann library.

    Index  Dataset       Dimensions  Train Size  Test Size  Neighbors  Distance  Original Source
    1      GloVe (25d)   25          1,183,514   10,000     100        Cosine    HDF5 (121MB)
    2      GloVe (50d)   50          1,183,514   10,000     100        Cosine    HDF5 (235MB)
    3      GloVe (100d)  100         1,183,514   10,000     100        Cosine    HDF5 (463MB)
    4      GloVe (200d)  200         1,183,514   10,000     100        Cosine    HDF5 (918MB)
    5      Last.fm       65          292,385     50,000     100        Cosine    HDF5 (135MB)
    6      MNIST         784…

    See the full description on the dataset page: https://huggingface.co/datasets/habedi/nearest-neighbors-datasets.
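
    Benchmark sets like these ship a fixed number of true neighbors per test query (100 in the table), which is typically used to score an approximate index by recall. A minimal sketch of that metric, assuming neighbor IDs are available as plain Python lists (the function name and data layout are illustrative, not part of the Hann library):

```python
def recall_at_k(approx_ids, true_ids, k):
    """Fraction of the true top-k neighbors that the approximate search returned."""
    hits = 0
    for approx, truth in zip(approx_ids, true_ids):
        # Order within the top-k does not matter for recall, so compare as sets.
        hits += len(set(approx[:k]) & set(truth[:k]))
    return hits / (k * len(true_ids))
```

    For example, two queries with k=3 where the index recovers 2 and 3 of the true neighbors give a recall of 5/6.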

  3. snn_exp

    • figshare.com
    bin
    Updated Dec 9, 2023
    Cite
    Stefan Güttel; Xinye Chen (2023). snn_exp [Dataset]. http://doi.org/10.6084/m9.figshare.24781473.v1
    Explore at:
    Available download formats: bin
    Dataset updated
    Dec 9, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Stefan Güttel; Xinye Chen
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This repository includes all the experimental code and required data for the paper "X. Chen and S. Güttel. Fast and exact fixed-radius neighbor search based on sorting, 2023."
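
    The listing only names the paper, so the following is a generic sketch of how sorting can speed up exact fixed-radius search, not the authors' implementation. It assumes points are tuples and, as a simplification, sorts by the first coordinate; the class and method names are hypothetical.

```python
import bisect
import math

class SortedRadiusIndex:
    """Exact fixed-radius search that prunes candidates with a sorted 1-D key."""

    def __init__(self, points):
        # Sort once by the first coordinate (an assumed, simple projection).
        self.points = sorted(points, key=lambda p: p[0])
        self.keys = [p[0] for p in self.points]

    def query(self, q, radius):
        # A point within `radius` of q must have its first coordinate within
        # `radius` of q[0], so binary search bounds the candidate window.
        lo = bisect.bisect_left(self.keys, q[0] - radius)
        hi = bisect.bisect_right(self.keys, q[0] + radius)
        # The exact distance check runs only on the surviving candidates.
        return [p for p in self.points[lo:hi] if math.dist(p, q) <= radius]
```

    The result is identical to a brute-force scan, because the pruning step can only discard points that are provably farther than the radius.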

  4. Search Nearby API | DATA.GOV.HK

    • data.gov.hk
    Updated Jul 25, 2024
    Cite
    data.gov.hk (2024). Search Nearby API | DATA.GOV.HK [Dataset]. https://data.gov.hk/en-data/dataset/hk-landsd-openmap-development-search-nearby-api
    Explore at:
    Dataset updated
    Jul 25, 2024
    Dataset provided by
    data.gov.hk
    Description

    Search Nearby API provides an HTTP-based API for application developers to find the facilities located within 1 km of the search location.
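
    The API's endpoints and internal distance calculation are not described here, so the sketch below only illustrates the underlying idea of a 1 km radius filter using the haversine great-circle distance. The function names, facility names, and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def within_radius(origin, facilities, radius_km=1.0):
    """Return names of (name, lat, lon) facilities within radius_km of origin."""
    lat0, lon0 = origin
    return [name for name, lat, lon in facilities
            if haversine_km(lat0, lon0, lat, lon) <= radius_km]

# Hypothetical data: A is about 0.56 km north of the origin, B about 2.2 km.
facilities = [("A", 22.305, 114.17), ("B", 22.32, 114.17)]
nearby = within_radius((22.3, 114.17), facilities)
```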

  5. ann-t2i-1m

    • hf-proxy-cf.effarig.site
    • huggingface.co
    Updated Jul 27, 2024
    Cite
    Unum (2024). ann-t2i-1m [Dataset]. https://hf-proxy-cf.effarig.site/datasets/unum-cloud/ann-t2i-1m
    Explore at:
    Dataset updated
    Jul 27, 2024
    Dataset authored and provided by
    Unum
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset Summary

    This dataset contains 200-dimensional vectors for 1M images indexed by Yandex and produced by the Se-ResNext-101 model.

      Usage
    

    git lfs install
    git clone https://huggingface.co/datasets/unum-cloud/ann-t2i-1m

      Dataset Structure
    

    The dataset contains three matrices:

    base: base.1M.fbin with 1M vectors to construct the index.
    query: query.public.100K.fbin with 100K vectors to lookup in the index.
    truth: groundtruth.public.100K.ibin with… See the full description on the dataset page: https://huggingface.co/datasets/unum-cloud/ann-t2i-1m.
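
    The description does not spell out the .fbin layout. The sketch below assumes the convention commonly used for such files: a header of two little-endian 32-bit unsigned integers (vector count, then dimension) followed by that many float32 values in row-major order. Treat the layout and the `read_fbin` helper as assumptions and verify them against the dataset page before relying on them.

```python
import struct
import sys
from array import array

def read_fbin(path):
    """Read vectors from an .fbin file (assumed layout: uint32 n, uint32 d, n*d float32)."""
    with open(path, "rb") as f:
        n, d = struct.unpack("<II", f.read(8))
        data = array("f")  # 4-byte floats in native byte order
        data.fromfile(f, n * d)
    if sys.byteorder == "big":
        data.byteswap()  # file is assumed little-endian
    # Slice the flat buffer into n rows of d values each.
    return [list(data[i * d:(i + 1) * d]) for i in range(n)]
```

    For a file as large as base.1M.fbin, a memory-mapped array would be preferable to Python lists; the list form is used here only to keep the example dependency-free.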

  6. Nearby

    • city-of-lawrenceville-arcgis-hub-lville.hub.arcgis.com
    Updated Jun 30, 2020
    Cite
    esri_en (2020). Nearby [Dataset]. https://city-of-lawrenceville-arcgis-hub-lville.hub.arcgis.com/items/9d3f21cfd9b14589968f7e5be91b52c8
    Explore at:
    Dataset updated
    Jun 30, 2020
    Dataset provided by
    Esri (http://esri.com/)
    Authors
    esri_en
    Description

    Nearby guides your app viewers to places of interest in your map based on an address location they search for or their current location. Search for places of interest using a search radius or the map extent. When using the search radius, set a range for the distance slider that app viewers will use to define their search buffer, or pan the map to see results when showing results on the map. Include directions to help viewers navigate to locations. Enable the export tool to allow viewers to capture images of the map along with results from the search.

    Examples:
    • Create a store locator app where a customer inputs a location and can find the closest or nearby stores and navigate to them
    • Build an app where the users can find healthcare facilities within a specified distance of a searched address
    • Provide viewers with directions and information for election polling locations
    • Build an app where users can find nearby trails and view an elevation profile of each result

    Data Requirements
    This application requires a feature layer to take full advantage of its capabilities. For more information, see the Layers help topic.

    Key App Capabilities
    • Distance slider - Set a minimum and maximum search radius in which results will be captured
    • Map extent result - Show all the results in the map view
    • Search results - Provide location information with feature attributes from a configured pop-up
    • Include related records - Include related records in the returned results
    • Results focused layout - Keep the map out of the app to maintain focus on the search and results
    • Filter options - Configure predefined options that allow viewers to filter data in the map
    • Export - Capture an image of the map to export and choose to include search results
    • Directions - Include the option to provide directions from a searched location to a result
    • Elevation profile - Include an option to view the elevation profile of lines
    • Export - Print the results and map to a PDF or export results to csv
    • Language switcher - Publish a multilingual app that combines your translated custom text and the UI translations for supported languages
    • Home, Zoom Controls, Legend, Layer List, Search

    Supportability
    This web app is designed responsively to be used in browsers on desktops, mobile phones, and tablets. We are committed to ongoing efforts towards making our apps as accessible as possible. Please feel free to leave a comment on how we can improve the accessibility of our apps for those who use assistive technologies.

  7. Nearby

    • anla-esp-esri-co.hub.arcgis.com
    • noveladata.com
    • +1more
    Updated Jul 1, 2020
    Cite
    esri_en (2020). Nearby [Dataset]. https://anla-esp-esri-co.hub.arcgis.com/items/9d3f21cfd9b14589968f7e5be91b52c8
    Explore at:
    Dataset updated
    Jul 1, 2020
    Dataset provided by
    Esri (http://esri.com/)
    Authors
    esri_en
    Description

    Use the Nearby template to guide your app users to places of interest close to an address. This template helps users find focused types of locations (such as schools) within a search distance of an address, their current location, or other place they specify. They can adjust distance values to change the search radius and get directions to locations they select. For users who are searching, you can set a range for the distance slider so users can define their search buffer or pan the map to see results from the map view. Include directions to help users navigate to locations within a defined search radius. Include the export tool to allow users to capture images of the map along with results from the search.

    Examples:
    • Create a store locator app that allows customers to input a location, find a nearby store, and navigate to it.
    • Create an app for finding health care facilities within a specified distance of a searched address.
    • Provide users with directions and information for election polling locations.
    • Build an app where users can find nearby trails and view an elevation profile of each result.

    Data requirements
    The Nearby template requires a feature layer to take full advantage of its capabilities.

    Key app capabilities
    • Distance slider - Set a minimum and maximum search radius for finding results.
    • Map extent result - Show all the results in the map view.
    • Panel options - Customize result panel location information with feature attributes from a configured pop-up.
    • Results-focused layout - Keep the map out of the app to maintain focus on the search and results.
    • Attribute filter - Configure map filter options that are available to app users.
    • Export - Print or export the search results or selected features as a .pdf, .jpg, or .png file that includes the pop-up content of returned features and an option to include the map. Alternatively, download the search results as a .csv file.
    • Directions - Provide directions from a searched location to a result location.
    • Elevation profile - Generate an elevation profile graph across an input line feature that can be selected in the scene or from drawing a single or multisegment line using the tool.
    • Language switcher - Provide translations for custom text and create a multilingual app.
    • Home, Zoom controls, Legend, Layer List, Search

    Supportability
    This web app is designed responsively to be used in browsers on desktops, mobile phones, and tablets. We are committed to ongoing efforts towards making our apps as accessible as possible. Please feel free to leave a comment on how we can improve the accessibility of our apps for those who use assistive technologies.

  8. Results of the proposed method against existing techniques on Diab dataset.

    • plos.figshare.com
    • figshare.com
    xls
    Updated May 30, 2025
    Cite
    Sunil Kumar; Sudeep Varshney; Usha Jain; Prashant Johri; Abdulaziz S. Almazyad; Ali Wagdy Mohamed; Mehdi Hosseinzadeh; Mohammad Shokouhifar (2025). Results of the proposed method against existing techniques on Diab dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0322738.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    May 30, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Sunil Kumar; Sudeep Varshney; Usha Jain; Prashant Johri; Abdulaziz S. Almazyad; Ali Wagdy Mohamed; Mehdi Hosseinzadeh; Mohammad Shokouhifar
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Results of the proposed method against existing techniques on Diab dataset.

  9. deep1b

    • tensorflow.org
    Updated Sep 3, 2024
    Cite
    (2024). deep1b [Dataset]. https://www.tensorflow.org/datasets/catalog/deep1b
    Explore at:
    Dataset updated
    Sep 3, 2024
    Description

    Pre-trained embeddings for approximate nearest neighbor search using the cosine distance. This dataset consists of two splits:

    1. 'database': consists of 9,990,000 data points, each has features: 'embedding' (96 floats), 'index' (int64), 'neighbors' (empty list).
    2. 'test': consists of 10,000 data points, each has features: 'embedding' (96 floats), 'index' (int64), 'neighbors' (list of 'index' and 'distance' of the nearest neighbors in the database.)

    To use this dataset:

    import tensorflow_datasets as tfds

    # The splits described above are 'database' and 'test'; there is no 'train' split.
    ds = tfds.load('deep1b', split='database')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

  10. Latitude and longitude search for nearby unexpired events

    • data.gov.tw
    json
    Updated Feb 15, 2024
    Cite
    Ministry of Culture (2024). Latitude and longitude search for nearby unexpired events [Dataset]. https://data.gov.tw/en/datasets/10044
    Explore at:
    Available download formats: json
    Dataset updated
    Feb 15, 2024
    Dataset authored and provided by
    Ministry of Culture
    License

    https://data.gov.tw/license

    Description

    This dataset supports latitude and longitude queries for nearby events that have not yet expired. It integrates events from the Ministry of Culture and its subordinate institutions with events from other public and private organizations.

  11. Find a Health Center

    • gimi9.com
    • catalog.data.gov
    Updated Dec 22, 2010
    Cite
    (2010). Find a Health Center [Dataset]. https://gimi9.com/dataset/data-gov_find-a-health-center-c0304
    Explore at:
    Dataset updated
    Dec 22, 2010
    Description

    The Find Health Center tool is a locator tool designed to make data and information concerning Federally-Funded Health Centers more readily available to our users. It is intended to help people in greatest need for health care locate where they could obtain care in their particular location. The user is able to search for health centers nearest to a specific complete address, city and state, state and county, or ZIP code. The search results (health centers) are returned in groups of ten (numbered from one to ten) and are sorted by increasing distance away from the center of the search area (address or county). For each health center entry in the list the user is provided the health center name, address, approximate distance from the center point of the search, telephone number, website address (where available), and a link for driving directions. The user has the option of viewing the search results either on a map or as text (default) and both views provide links to get more detailed information for each returned opportunity.
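
    The ordering and grouping behavior described above (results sorted by increasing distance and returned ten at a time) can be sketched as follows. The record fields and function name are hypothetical; the tool's actual implementation is not documented here.

```python
def paginate_by_distance(centers, page_size=10):
    """Sort records by distance and split them into pages of page_size."""
    ordered = sorted(centers, key=lambda c: c["distance_miles"])
    return [ordered[i:i + page_size] for i in range(0, len(ordered), page_size)]

# Hypothetical search results: 12 health centers at assorted distances.
centers = [{"name": f"Center {i}", "distance_miles": (i * 7) % 12 + 0.5} for i in range(12)]
pages = paginate_by_distance(centers)
```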

  12. my-vicinity-repo

    • huggingface.co
    Updated Feb 28, 2025
    Cite
    Minish (2025). my-vicinity-repo [Dataset]. https://huggingface.co/datasets/minishlab/my-vicinity-repo
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 28, 2025
    Dataset authored and provided by
    Minish
    Description

    Dataset Card for minishlab/my-vicinity-repo

    This dataset was created using the vicinity library, a lightweight nearest neighbors library with flexible backends. It contains a vector space with 5 items.

      Usage
    

    You can load this dataset using the following code:

    from vicinity import Vicinity
    vicinity = Vicinity.load_from_hub("minishlab/my-vicinity-repo")

    After loading the dataset, you can use the vicinity.query method to find the nearest neighbors to a vector.… See the full description on the dataset page: https://huggingface.co/datasets/minishlab/my-vicinity-repo.

  13. Survey Control Points

    • geohub.roundrocktexas.gov
    Updated May 19, 2021
    + more versions
    Cite
    City of Round Rock (2021). Survey Control Points [Dataset]. https://geohub.roundrocktexas.gov/items/d555a98f6a0e407593c23c31b9e75829
    Explore at:
    Dataset updated
    May 19, 2021
    Dataset authored and provided by
    City of Round Rock
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Area covered
    Description

    This layer contains the data for the survey control points for the City of Round Rock, located in Williamson County, Texas. This layer is part of an original dataset provided and maintained by the City of Round Rock GIS/IT Department. The data in this layer are represented as points.

    This layer can be used to locate the nearest monument(s) to your site’s location. Find the control point nearest your area to determine the corresponding data sheet, and find the download link below. You can also download the monument coordinates and report synopsis.

    GPS Point Data Sheets:
    01-001 01-002 01-003 01-004
    01-005 01-006 01-007 01-008
    01-009 01-010 01-011 01-012
    01-013 01-014 01-015 01-016
    01-017 01-018 01-019 01-020
    01-021 01-022 01-023 01-024
    01-025 01-026 01-027 01-028
    01-029 01-030 01-031 01-032
    01-033 01-034 01-035 01-036
    01-037 01-038 01-039 01-040
    01-041

  14. Results of the proposed method against existing techniques on Hepatitis...

    • figshare.com
    xls
    Updated May 30, 2025
    Cite
    Sunil Kumar; Sudeep Varshney; Usha Jain; Prashant Johri; Abdulaziz S. Almazyad; Ali Wagdy Mohamed; Mehdi Hosseinzadeh; Mohammad Shokouhifar (2025). Results of the proposed method against existing techniques on Hepatitis dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0322738.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    May 30, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Sunil Kumar; Sudeep Varshney; Usha Jain; Prashant Johri; Abdulaziz S. Almazyad; Ali Wagdy Mohamed; Mehdi Hosseinzadeh; Mohammad Shokouhifar
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Results of the proposed method against existing techniques on Hepatitis dataset.

  15. MetroCard Vendor Location Finder

    • gis.westchestergov.com
    Updated Jun 9, 2017
    Cite
    Westchester County GIS (2017). MetroCard Vendor Location Finder [Dataset]. https://gis.westchestergov.com/app/metrocard-vendor-location-finder
    Explore at:
    Dataset updated
    Jun 9, 2017
    Dataset authored and provided by
    Westchester County GIS
    Description

    MetroCard Vendor Location Finder

    MetroCards can be purchased at many locations throughout Westchester including the County Center, Metro-North train stations, and over 100 neighborhood stores. This map is designed to help you find a vendor near you.

    • To find the nearest MetroCard vendor, type in an address in the "Search an address" box
    • To clear the search, click on the X
    • Use the Zoom-in tool to see additional features and Zoom-out to see fewer features
    • To zoom to full extent, click on the Home button

    Please note: Not all types of MetroCard are available at every sales location. See below for additional ways to purchase a MetroCard and how to become a vendor.

    Retail Merchants
    Merchants can sell both pre-valued MetroCard (ranging in price from $5.50 to $61.90 with bonus) and Unlimited Ride MetroCard (7-Day or 30-Day). This map is designed to help you find retail merchants within Westchester County. For a complete list of merchants within New York City, Long Island, and New Jersey visit the MTA’s website or call 718-330-1234.

    MetroCard Van
    There is a full-service MetroCard van that visits Westchester County every month. For more details including dates and locations of the van please click here. Riders are able to buy a regular MetroCard, refill their existing MetroCards, and apply for a Reduced-Fare MetroCard if they are 65 and older or have qualifying disabilities.

    Metro-North Railroad Stations
    You can buy a joint rail/MetroCard or a separate $25 MetroCard from any Metro-North ticket machine or ticket office. Machines accept cash, credit cards and ATM/debit cards - a $1 fee is assessed on these purchases. Other joint rail/MetroCard options are also available through Mail and Ride, Metro-North's monthly ticket-by-mail program.

    Subway Stations
    MetroCard can be purchased from vending machines or staffed sales booths in New York City subway stations. Machines accept cash, credit cards and ATM/debit cards. Station booth agents accept cash only.

    EasyPay
    EasyPay is for both full-fare and reduced-fare customers who want to enjoy the benefits of a MetroCard that never runs out of rides. The EasyPay MetroCard is linked to your credit or debit card, and refills automatically as you use it.

    Become a Vendor
    Selling MetroCard brings in customers and commissions. Merchants can earn up to 3% on every card sold. Click here to learn more and complete the vendor application process. Free advertising materials are provided to merchants.

  16. Data from: Fast open modification spectral library searching through...

    • ebi.ac.uk
    • omicsdi.org
    Updated May 25, 2021
    Cite
    Wout Bittremieux (2021). Fast open modification spectral library searching through approximate nearest neighbor indexing [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD009861
    Explore at:
    Dataset updated
    May 25, 2021
    Authors
    Wout Bittremieux
    Variables measured
    Proteomics
    Description

    Open modification searching (OMS) is a powerful search strategy that identifies peptides carrying any type of modification by allowing a modified spectrum to match against its unmodified variant by using a very wide precursor mass window. A drawback of this strategy, however, is that it leads to a large increase in search time. Although performing an open search can be done using existing spectral library search engines by simply setting a wide precursor mass window, none of these tools have been optimized for OMS, leading to excessive runtimes and suboptimal identification results. This data set contains the evaluation results of the ANN-SoLo tool for fast and accurate open spectral library searching. ANN-SoLo uses approximate nearest neighbor indexing to speed up OMS by selecting only a limited number of the most relevant library spectra to compare to an unknown query spectrum. This approach is combined with a cascade search strategy to maximize the number of identified unmodified and modified spectra while strictly controlling the false discovery rate, as well as a shifted dot product score to sensitively match modified spectra to their unmodified counterparts. ANN-SoLo achieves state-of-the-art performance in terms of speed and the number of identifications. On a previously published human cell line data set, ANN-SoLo confidently identifies more spectra than SpectraST or MSFragger and achieves a speedup of an order of magnitude compared to SpectraST.

  17. Columbus, GA Real Estate Investment Insights

    • propertygenie.us
    Updated Jul 12, 2025
    Cite
    PropertyGenie (2025). Columbus, GA Real Estate Investment Insights [Dataset]. https://www.propertygenie.us/market-insight/columbus-ga
    Explore at:
    Dataset updated
    Jul 12, 2025
    Dataset authored and provided by
    PropertyGenie
    License

    https://www.propertygenie.us/terms-conditions

    Time period covered
    May 31, 2025
    Area covered
    Variables measured
    Population, Rental Count, Job Growth (%), LTR Genie Score, STR Genie Score, Income Growth (%), Rental Demand Score, LTR Monthly Cash Flow, Population Growth (%), STR Monthly Cash Flow, and 6 more
    Description

    The LTR Genie Score of Columbus, GA is 66, indicating a moderate level of rentability for long-term rental properties in the area. The STR Genie Score is 86, showing a high level of rentability for short-term rental or Airbnb properties. The higher STR Genie Score can be attributed to the strong net ROI of 63.04% and high occupancy rate of 68.97, which are both significantly higher than the metrics for long-term rentals. Additionally, the 1-Year Price Appreciation Forecast of 0.13% suggests a stable market with potential for growth.

    Columbus, GA is a city located in western Georgia, known for its diverse economy and strong military presence due to the nearby Fort Benning. The city offers a mix of urban amenities and outdoor recreational opportunities, making it an attractive location for both residents and visitors.

    Based on the metrics provided, Columbus, GA appears to be more attractive for short-term rental investments due to the higher STR Genie Score and stronger net ROI. Investors looking for higher returns and a potentially more stable market may find success in the short-term rental market in this area. However, long-term rental investments may still be viable for those seeking a more traditional real estate investment approach. It is recommended for real estate investors to carefully evaluate their investment goals and risk tolerance before deciding on the best strategy for Columbus, GA.

  18. Alternative Fuel Stations in New York

    • s.cnmilf.com
    • data.ny.gov
    • +1more
    Updated Jun 28, 2025
    + more versions
    Cite
    data.ny.gov (2025). Alternative Fuel Stations in New York [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/alternative-fuel-stations-in-new-york
    Explore at:
    Dataset updated
    Jun 28, 2025
    Dataset provided by
    data.ny.gov
    Area covered
    New York
    Description

    Go to https://afdc.energy.gov/stations/#/find/nearest to access the full database of alternative fuel station locations nationwide, collected and maintained by the U.S. Department of Energy National Renewable Energy Laboratory. A station appears as one point in the data and on the map, regardless of the number of fuel dispensers or charging outlets at that location. For EV charging stations, for example, the data includes the number of charging ports available at the specific station. How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.

  19. Weather forecasting at Ria Arousa (Spain) using AI

    • kaggle.com
    zip
    Updated Apr 5, 2021
    + more versions
    Cite
    Jorge Robinat (2021). Weather forecasting at Ria Arousa (Spain) using AI [Dataset]. https://www.kaggle.com/jorgerobinat/weather-forecasting-at-ria-arousa-spain-using-ai
    Explore at:
    Available download formats: zip (3513141742 bytes)
    Dataset updated
    Apr 5, 2021
    Authors
    Jorge Robinat
    Area covered
    Spain, Ría de Arousa
    Description

    Context

    Our aim is to improve the accuracy of the meteorological model with Machine Learning. To do so we need a database that contains input variables (meteorological model results) and output data (actual data from a meteorological station). Independent variables are outputs of the meteorological model. Dependent variables are measured by the meteorological station. The trained Machine Learning algorithm will take the variables from the meteorological model and forecast a meteorological variable. The meteorological model is a WRF model maintained by Meteogalicia, a public meteorological service from Galicia (Spain). The model has a resolution of 4 km. We take the model outputs at the grid points nearest to each station. This dataset is focused on meteorological stations at Ria Arousa (Spain). The meteorological stations are Coron, at latitude 42.5801 N and longitude 8.80471 W, and Cortegada, at latitude 42.626 N and longitude 8.784 W.

    Content

    The dataset contains:

    1. Files (.csv) with the meteorological model output. Format: LatXX.XX-lonXX.XXp4R4KmD0.csv, where lat. and lon. represent the latitude and longitude of the meteorological station. p is the number of nearest points from the station (4 points in this case). R is the spatial resolution of the model (4 km in this case). D means the forecast day. D0 represents hours H+1 to H+24 from the analysis time (we use the 00Z analysis of the WRF Meteogalicia model). D1 represents hours H+25 to H+48, and so on. Each meteorological variable ends with a numerical suffix representing the point. The nearest point is "p0" and the farthest point would be "p3". Columns are the forecasted meteorological variables and a time column (every hour):

    lhflx: Surface downward latent heat flux. Units: watts per square meter.

    dir: Predicted wind direction at 10 meters. From North direction clockwise. Units are degrees. Unlike dir_o no variable wind is forecasted (no -1 values)

    mod: Wind intensity forecasted at 10 meters. Units are meters per second.

    prec: Total accumulated rainfall between each model output. In our case, every hour. Units kilograms per meter squared.

    rh: Relative Humidity. Units fraction

    visibility: Visibility in air. Units meters. Minimum visibility 26.028316 meters. Maximum visibility 24235.000000

    wind_gust: Wind gust at 10 meters. Units are meters per second. Unlike wind gust_o always forecasted (no -1 value)

    mslp: Sea Level Pressure in pascals

    temp: Air Temperature in Kelvin at 2 meters

    cape: Convective available potential energy. Units: joules per kilogram.

    cin: Convective inhibition. Units: joules per kilogram.

    cfl: Cloud area fraction at low atmosphere layer. I found 1251 samples with values higher than 1 !! Perhaps, we wouldn’t trust this feature so much.

    cfm: Cloud area fraction at mid atmosphere layer. Also, I found 37 samples with values higher than 1.

    conv_prec: Total accumulated convective rainfall between each model output. Every hour in our case.

    HGT500: Geopotential height at 500mb. Units m

    HGT850: Geopotential height at 850mb. Units m

    T500: Temperature at 500mb. Units Kelvin

    T850: Temperature at 850mb. Units Kelvin

    cfh: Cloud cover at high levels. Units fraction

    cft: Cloud cover at low and mid-levels. Units fraction

    lwflx: Surface downward latent heat flux. Units: W m-2
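
The naming convention for the model files in item 1 can be sketched in code. The following is a minimal illustration, not part of the dataset: the function names, the `ModelFile` type, and the example file name are hypothetical, and the regular expression assumes that the hyphen before "lon" is a separator and that a negative longitude carries its own minus sign.

```python
import re
from typing import NamedTuple, Tuple

# Assumed pattern for names such as "lat42.58-lon-8.80p4R4KmD0.csv"
# (a made-up example; real file names may differ in detail).
_MODEL_FILE = re.compile(
    r"^lat(?P<lat>-?\d+(?:\.\d+)?)-lon(?P<lon>-?\d+(?:\.\d+)?)"
    r"p(?P<points>\d+)R(?P<res_km>\d+)KmD(?P<day>\d+)\.csv$",
    re.IGNORECASE,
)


class ModelFile(NamedTuple):
    lat: float    # station latitude
    lon: float    # station longitude
    points: int   # number of nearest model points (p)
    res_km: int   # model resolution in km (R)
    day: int      # forecast day (D)


def parse_model_filename(name: str) -> ModelFile:
    """Split a model file name into its documented components."""
    m = _MODEL_FILE.match(name)
    if m is None:
        raise ValueError(f"unrecognized model file name: {name!r}")
    return ModelFile(
        float(m["lat"]), float(m["lon"]),
        int(m["points"]), int(m["res_km"]), int(m["day"]),
    )


def lead_hours(day: int) -> Tuple[int, int]:
    """Return the first and last forecast hour (H+n) covered by day D."""
    return 24 * day + 1, 24 * (day + 1)
```

For example, `lead_hours(1)` gives the H+25 to H+48 window described for D1.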

    2._Files with format stationname.csv: Contain the actual meteorological variables measured every 10 or 60 minutes. Variables are:

    dir_o: wind direction (degrees)

    gust_direction_o: gust direction (degrees)

    gust_speed_o: gust speed (m/s)

    spd_o: speed (m/s)

    std_dir_o: standard deviation direction (degrees)

    std_spd_o: standard deviation speed (m/s)

    gust_spd_max_hour_before_o: max gust speed an hour before (m/s)

    prec_o: precipitation every 10 minutes (mm)

    prec_accumulated_1_hour_before: precipitation accumulated one hour before (mm)
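
The notes on dir and wind_gust imply that the observed files use -1 as a marker (variable wind for direction, no gust reported). A minimal sketch of masking that marker before comparing observations with forecasts might look like the following. The function name, the default column choice, and the interpretation of -1 as a sentinel are assumptions, not something the dataset documents explicitly.

```python
from typing import Any, Dict, Iterable, List, Sequence


def mask_sentinel(
    rows: Iterable[Dict[str, Any]],
    columns: Sequence[str] = ("dir_o", "gust_speed_o"),
    sentinel: float = -1.0,  # assumed "no value" marker
) -> List[Dict[str, Any]]:
    """Return copies of the rows with sentinel values replaced by None."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for col in columns:
            value = new_row.get(col)
            if value is not None and float(value) == sentinel:
                new_row[col] = None
        cleaned.append(new_row)
    return cleaned
```

Replacing the marker with None keeps it from being averaged or scored as a real direction or speed.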

    3._Files with format metvar_stationname_pxRxKDX.al: Contain the algorithm (independent variables, scaler, PCA, and quality statistics about the algorithm itself and the meteorological model). metvar is the forecast variable, pX is the number of nearest points (out of 4), RXKm is the model resolution (4 km in our case), and D is the forecast day. These files are required by the notebook (operational_arousa) to produce the daily results.


  20. d

    Alternative Fueling Stations

    • catalog.data.gov
    • gimi9.com
    • +6more
    Updated May 2, 2025
    + more versions
    Cite
    National Renewable Energy Laboratory (NREL) (Point of Contact) (2025). Alternative Fueling Stations [Dataset]. https://catalog.data.gov/dataset/alternative-fueling-stations1
    Explore at:
    Dataset updated
    May 2, 2025
    Dataset provided by
    National Renewable Energy Laboratory (NREL) (Point of Contact)
    Description

    The Alternative Fueling Stations dataset is updated daily from the National Renewable Energy Laboratory (NREL) and is part of the U.S. Department of Transportation (USDOT)/Bureau of Transportation Statistics (BTS) National Transportation Atlas Database (NTAD). For more information about the update cycle and data collection methods, please refer to https://afdc.energy.gov/stations/#/find/nearest?show_about=true. This dataset shows all station access types (public and private) and statuses (available, planned, and temporarily unavailable) by default. To view only publicly available stations, use the access and status filters. The U.S. Department of Energy collects these data in partnership with Clean Cities coalitions and their stakeholders to help fleets and consumers find alternative fueling stations. Clean Cities coalitions foster the nation's economic, environmental, and energy security by working locally to advance affordable, efficient, and clean transportation fuels and technologies. This data can be found on the Alternative Fuels Data Center: https://doi.org/10.21949/1519144. For more information about the data schema and data dictionary, please see https://developer.nrel.gov/docs/transportation/alt-fuel-stations-v1/all/#response-fields. A data dictionary, or other source of attribute information, is accessible at https://doi.org/10.21949/1529008

