100+ datasets found
  1. finesse-benchmark-database

    • huggingface.co
    Updated Oct 25, 2025
    Cite
    winter.sci.dev (2025). finesse-benchmark-database [Dataset]. https://huggingface.co/datasets/enzoescipy/finesse-benchmark-database
    Explore at:
    Dataset updated
    Oct 25, 2025
    Authors
    winter.sci.dev
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Finesse Benchmark Database

      Overview
    

    finesse-benchmark-database is a data generation factory for atomic probes in the Finesse benchmark. It generates probes_atomic.jsonl files from Wikimedia Wikipedia datasets, leveraging Hugging Face's datasets library, tokenizers from transformers, and optional PyTorch support. This tool is designed to create high-quality, language-specific probe datasets for benchmarking fine-grained understanding in NLP tasks.… See the full description on the dataset page: https://huggingface.co/datasets/enzoescipy/finesse-benchmark-database.

  2. NIST Computational Chemistry Comparison and Benchmark Database - SRD 101

    • catalog.data.gov
    • s.cnmilf.com
    • +2 more
    Updated Sep 30, 2025
    Cite
    National Institute of Standards and Technology (2025). NIST Computational Chemistry Comparison and Benchmark Database - SRD 101 [Dataset]. https://catalog.data.gov/dataset/nist-computational-chemistry-comparison-and-benchmark-database-srd-101
    Explore at:
    Dataset updated
    Sep 30, 2025
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    The NIST Computational Chemistry Comparison and Benchmark Database is a collection of experimental and ab initio thermochemical properties for a selected set of gas-phase molecules. The goals are to provide a benchmark set of experimental data for the evaluation of ab initio computational methods and allow the comparison between different ab initio computational methods for the prediction of gas-phase thermochemical properties. The data files linked to this record are a subset of the experimental data present in the CCCBDB.

  3. Elevation Benchmarks

    • catalog.data.gov
    • data.cityofchicago.org
    • +3 more
    Updated Dec 2, 2023
    + more versions
    Cite
    data.cityofchicago.org (2023). Elevation Benchmarks [Dataset]. https://catalog.data.gov/dataset/elevation-benchmarks
    Explore at:
    Dataset updated
    Dec 2, 2023
    Dataset provided by
    data.cityofchicago.org
    Description

    The following dataset includes "Active Benchmarks," which are provided to facilitate the identification of City-managed standard benchmarks. Standard benchmarks are for public and private use in establishing a point in space. Note: The benchmarks are referenced to the Chicago City Datum (CCD) = 0.00, where CCD = 579.88 feet above mean tide New York. The City of Chicago Department of Water Management’s (DWM) Topographic Benchmark is the source of the benchmark information contained in this online database. The information contained in the index card system was compiled by scanning the original cards, then transcribing some of this information to prepare a table and map. Over time, the DWM will contract services to field verify the data and update the index card system and this online database. This dataset was last updated September 2011. Coordinates are estimated. To view map, go to https://data.cityofchicago.org/Buildings/Elevation-Benchmarks-Map/kmt9-pg57 or for PDF map, go to http://cityofchicago.org/content/dam/city/depts/water/supp_info/Benchmarks/BMMap.pdf. Please read the Terms of Use: http://www.cityofchicago.org/city/en/narr/foia/data_disclaimer.html.
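
    The datum note in this description amounts to a fixed offset, which can be sketched in a few lines of Python. The function names are hypothetical and not part of the dataset; the only value taken from the description is the 579.88-foot offset between the Chicago City Datum and mean tide New York.

```python
# Offset stated in the dataset description: CCD zero is 579.88 ft above mean tide New York.
CCD_OFFSET_FT = 579.88


def ccd_to_mean_tide_ny(elevation_ccd_ft: float) -> float:
    """Convert feet above Chicago City Datum to feet above mean tide New York."""
    return elevation_ccd_ft + CCD_OFFSET_FT


def mean_tide_ny_to_ccd(elevation_ft: float) -> float:
    """Inverse conversion: feet above mean tide New York to feet above CCD."""
    return elevation_ft - CCD_OFFSET_FT


# Illustrative value only (not a real benchmark): 12.50 ft CCD.
example_ft = ccd_to_mean_tide_ny(12.50)
```

    This assumes the published elevations are in feet, as the description implies; it does not attempt any conversion to modern datums such as NAVD 88.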

  4. Global Data Literacy Benchmark

    • datatothepeople.org
    Updated Aug 14, 2020
    Cite
    Data To The People (2020). Global Data Literacy Benchmark [Dataset]. https://www.datatothepeople.org/gdlb
    Explore at:
    Dataset updated
    Aug 14, 2020
    Dataset authored and provided by
    Data To The People
    Description

    Dataset enabling organizations to benchmark their data literacy capability globally.

  5. data-product-benchmark

    • huggingface.co
    Updated Oct 4, 2025
    Cite
    IBM Research (2025). data-product-benchmark [Dataset]. https://huggingface.co/datasets/ibm-research/data-product-benchmark
    Explore at:
    Dataset updated
    Oct 4, 2025
    Dataset provided by
    IBM (http://ibm.com/)
    IBM Research
    Authors
    IBM Research
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Description

    This dataset provides a benchmark for automatic data product creation. The task is framed as follows: given a natural language data product request and a corpus of text and tables, the objective is to identify the relevant tables and text documents that should be included in the resulting data product and that would be useful for the given data product request. The benchmark brings together three variants: HybridQA, TAT-QA, and ConvFinQA, each consisting of:

    A corpus… See the full description on the dataset page: https://huggingface.co/datasets/ibm-research/data-product-benchmark.

  6. Survey Benchmark

    • data.sanjoseca.gov
    • gisdata-csj.opendata.arcgis.com
    Updated Apr 28, 2025
    Cite
    Enterprise GIS (2025). Survey Benchmark [Dataset]. https://data.sanjoseca.gov/dataset/survey-benchmark
    Explore at:
    Available download formats: html, zip, geojson, kml, csv, arcgis geoservices rest api
    Dataset updated
    Apr 28, 2025
    Dataset provided by
    City of San José
    Authors
    Enterprise GIS
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    These are locations that are to be used as an elevation reference and contain the official elevation and last known latitude and longitude.

    App: The data can be viewed in web map format at: Survey Benchmarks

    Data is published on Mondays on a weekly basis.


  7. Benchmark Database for Phonetic Alignments

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    • +1 more
    Updated Feb 21, 2022
    Cite
    List, Johann-Mattis; Prokić, Jelena (2022). Benchmark Database for Phonetic Alignments [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11880
    Explore at:
    Dataset updated
    Feb 21, 2022
    Dataset provided by
    Philipps-Universität Marburg
    Authors
    List, Johann-Mattis; Prokić, Jelena
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    In the last two decades, alignment analyses have become an important technique in quantitative historical linguistics and dialectology. Phonetic alignment plays a crucial role in the identification of regular sound correspondences and deeper genealogical relations between and within languages and language families. Surprisingly, to date there are no easily accessible benchmark data sets for phonetic alignment analyses. Here we present a publicly available database of manually edited phonetic alignments which can serve as a platform for testing and improving the performance of automatic alignment algorithms. The database consists of a great variety of alignments drawn from a large number of different sources. The data is arranged in such a way that typical problems encountered in phonetic alignment analyses (metathesis, diversity of phonetic sequences) are represented and can be directly tested.

  8. Benchmark

    • gis-cupertino.opendata.arcgis.com
    • hub.arcgis.com
    Updated Jan 21, 2016
    Cite
    City of Cupertino (2016). Benchmark [Dataset]. https://gis-cupertino.opendata.arcgis.com/datasets/benchmark/data
    Explore at:
    Dataset updated
    Jan 21, 2016
    Dataset authored and provided by
    City of Cupertino
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Area covered
    Description

    Benchmark is a Point FeatureClass representing land-surveyed benchmarks in Cupertino. Benchmarks are stable sites used to provide elevation data. It is primarily used as a reference layer. The layer is updated as needed by the GIS department. Benchmark has the following fields:

    OBJECTID: Unique identifier automatically generated by Esri. Type: OID, length: 4, domain: none

    ID: Unique identifier assigned to the Benchmark. Type: Integer, length: 4, domain: none

    REF_MARK: The reference mark associated with the Benchmark. Type: String, length: 10, domain: none

    ELEV: The elevation of the Benchmark. Type: Double, length: 8, domain: none

    Shape: Field that stores geographic coordinates associated with the feature. Type: Geometry, length: 4, domain: none

    Description: A more detailed description of the Benchmark. Type: String, length: 200, domain: none

    Owner: The owner of the Benchmark. Type: String, length: 10, domain: none

    GlobalID: Unique identifier automatically generated for features in enterprise database. Type: GlobalID, length: 38, domain: none

    Operator: The user responsible for updating this database. Type: String, length: 255, domain: OPERATOR

    last_edited_date: The date the database row was last updated. Type: Date, length: 8, domain: none

    created_date: The date the database row was initially created. Type: Date, length: 8, domain: none

    VerticalDatum: The vertical datum associated with the Benchmark. Type: String, length: 100, domain: none
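
    As a rough illustration of the field list above, the attribute columns could be modeled as a small Python record type. This is a sketch only: the class name, the helper function, and the mapping from Esri field types to Python types are assumptions, the sample values are made up, and the Shape, GlobalID, Operator, and date fields are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkRecord:
    """Hypothetical in-memory form of one row of the Benchmark layer."""
    objectid: int        # OBJECTID (OID)
    id: int              # ID (Integer)
    ref_mark: str        # REF_MARK (String, length 10)
    elev: float          # ELEV (Double)
    description: str     # Description (String, length 200)
    owner: str           # Owner (String, length 10)
    vertical_datum: str  # VerticalDatum (String, length 100)


def record_from_row(row: dict) -> BenchmarkRecord:
    """Build a record from a dict keyed by the field names listed above."""
    return BenchmarkRecord(
        objectid=int(row["OBJECTID"]),
        id=int(row["ID"]),
        ref_mark=row["REF_MARK"],
        elev=float(row["ELEV"]),
        description=row["Description"],
        owner=row["Owner"],
        vertical_datum=row["VerticalDatum"],
    )


# Made-up sample row for illustration; not real Cupertino data.
sample_row = {
    "OBJECTID": "1", "ID": "101", "REF_MARK": "RM-1", "ELEV": "236.5",
    "Description": "Example mark", "Owner": "City", "VerticalDatum": "NAVD 88",
}
record = record_from_row(sample_row)
```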

  9. Benchmark Points

    • data-auroraco.opendata.arcgis.com
    • hub.arcgis.com
    Updated Mar 2, 2015
    Cite
    City of Aurora, Colorado Maps (2015). Benchmark Points [Dataset]. https://data-auroraco.opendata.arcgis.com/datasets/AuroraCo::benchmark-points/about
    Explore at:
    Dataset updated
    Mar 2, 2015
    Dataset authored and provided by
    City of Aurora, Colorado Maps
    Area covered
    Description

    This feature class is maintained to keep construction, mapping, and all city services in and adjacent to the city on the same vertical datum, thus minimizing mistakes or conflicts in construction or maintenance of the city's infrastructure. This feature class represents the city's vertical control network, which was started in approximately 1971. The original benchmark used to start the network was an NGS (National Geodetic Survey) monument located in the northwest part of the city (Stapleton Airport/Peoria and Smith). Level loops and runs were added and extended as the city grew, incorporating more NGS monuments. The datum used when the program started was NGVD (National Geodetic Vertical Datum) of 1929. In June 2006 the city converted from NGVD 1929 to NAVD (North American Vertical Datum) of 1988 using Corpscon 6.0.1 and the Geoid 2003. This network has been successful and should be maintained until an improved alternative can be utilized.

  10. Survey Benchmarks

    • data.cityofsacramento.org
    • data.sacog.org
    • +3 more
    Updated Apr 3, 2017
    Cite
    City of Sacramento (2017). Survey Benchmarks [Dataset]. https://data.cityofsacramento.org/datasets/survey-benchmarks
    Explore at:
    Dataset updated
    Apr 3, 2017
    Dataset authored and provided by
    City of Sacramento
    Area covered
    Description

    The Datum of the majority of the City of Sacramento benchmarks is the National Geodetic Vertical Datum of 1929 (NGVD 29) and was based upon a plane datum with reference to four U.S. Government monuments. Some North American Vertical Datum 1988 (NAVD 88) differential baselines have been conducted and the elevations of those benchmarks have been included and identified in this database.

    VERTICAL CONTROL DISCLAIMER AND GENERAL USE DISCLAIMER City of Sacramento Vertical Control Network, also known as “City of Sacramento Datum” information, is furnished by the Department of Public Works - Engineering Services. It was developed and collected for the purpose of establishing reference points for all survey activities within Sacramento City limits. City of Sacramento makes no warranties, expressed or implied, concerning the accuracy, completeness, reliability, or suitability of this data for any other particular use. Furthermore, the City of Sacramento assumes no liability for any errors, omissions, or inaccuracies associated with the use or misuse of such data. The City of Sacramento tries to keep this information current and accurate. If you find a missing or destroyed City of Sacramento benchmark please contact the City Land Surveyor by calling (916) 808-8777.

    ACCEPTABLE USE The City of Sacramento Datum (unadjusted NGVD29) local benchmarks provided here are acceptable as City of Sacramento benchmarks for the purpose of establishing and extending vertical control to design surveys. Reference Sacramento City Code, §1.12.010.

    All information compiled on this website is provided as a public service and for general informational purposes only. In preparation of these pages, every effort has been made by the Department of Public Works - Engineering Services to offer the most current, correct and concise information possible. The City of Sacramento and its authorized agents and contractors disclaim any responsibility for typographical errors and accuracy of the information that may be contained on the City of Sacramento website, www.cityofsacramento.org.

    By accessing the information, data, materials and links contained in the City of Sacramento World Wide Web pages, you hereby agree to and accept the following terms and conditions: The Department of Public Works - Engineering Services shall not be liable for the improper or incorrect use of data, information, materials, links or related graphics described and/or contained herein. The Department of Public Works - Engineering Services shall not be liable for any demand or claim, regardless of form or action, arising out of or incident to the posting of information or data on this website; the accessing or use of any information or data on this website; and/or the acts or omissions of any person or entity accessing or using any information from this website.

    The user hereby recognizes that the information, data, materials and related graphics are dynamic and may change over time without notice. The Department of Public Works - Engineering Services is not responsible for the use or reliance upon this information. There are links and pointers to third party internet websites contained in the City of Sacramento website. These sites linked from the City of Sacramento website are not under the City’s control. The City of Sacramento and its authorized agents and contractors do not assume any responsibility or liability for any information, communications or materials available at such linked sites, or at any link contained in a linked site. Each individual site has its own set of policies about what information is appropriate for public access. User assumes sole responsibility for use of third party links and pointers.

  11. benchmark-dummy-data

    • huggingface.co
    Updated Mar 2, 2023
    Cite
    Evaluation Bot (2023). benchmark-dummy-data [Dataset]. https://huggingface.co/datasets/autoevaluator/benchmark-dummy-data
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 2, 2023
    Authors
    Evaluation Bot
    Description

    Dummy Dataset for AutoTrain Benchmark

    This dataset contains dummy data that's needed to create AutoTrain projects for benchmarks like RAFT. See here for more details.

  12. Chicago Energy Benchmarking

    • data.cityofchicago.org
    • gimi9.com
    • +3 more
    csv, xlsx, xml
    Updated Feb 5, 2025
    + more versions
    Cite
    City of Chicago (2025). Chicago Energy Benchmarking [Dataset]. https://data.cityofchicago.org/Environment-Sustainable-Development/Chicago-Energy-Benchmarking/xq83-jr8c
    Explore at:
    Available download formats: csv, xml, xlsx
    Dataset updated
    Feb 5, 2025
    Dataset authored and provided by
    City of Chicago
    Area covered
    Chicago
    Description

    The Chicago Building Energy Use Benchmarking Ordinance calls on existing municipal, commercial, and residential buildings larger than 50,000 square feet to track whole-building energy use, report to the City annually, and verify data accuracy every three years. The law, which was phased in from 2014-2017, covers less than 1% of Chicago’s buildings, which account for approximately 20% of total energy used by all buildings. For more details, including ordinance text, rules and regulations, and timing, please visit www.CityofChicago.org/EnergyBenchmarking

    The ordinance authorizes the City to share property-specific information with the public, beginning with the second year in which a building is required to comply.

    The dataset represents self-reported and publicly-available property information by calendar year. Please note that the "Data Year" column refers to the year to which the data apply, not the year in which they were reported. That column and filtered views under "Related Content" can be used to isolate specific years.
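
    Isolating one year with the "Data Year" column can also be done after download. The sketch below is a minimal, hedged example using only the Python standard library: the function name is hypothetical, the "Property Name" column and the sample rows are made up for illustration, and only the "Data Year" column name comes from the description.

```python
import csv
import io
from typing import Dict, List


def rows_for_year(csv_text: str, year: int) -> List[Dict[str, str]]:
    """Return the rows whose "Data Year" column equals the requested year."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["Data Year"].strip() == str(year)]


# Made-up sample in the shape of a CSV export; not real benchmarking data.
sample_csv = (
    "Data Year,Property Name\n"
    "2016,Example Tower\n"
    "2017,Example Tower\n"
    "2017,Sample Plaza\n"
)
rows_2017 = rows_for_year(sample_csv, 2017)
print(len(rows_2017))  # 2
```

    Because "Data Year" is the year the data apply to rather than the reporting year, filtering on it groups rows by the period they describe.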

  13. Benchmark

    • catalog.data.gov
    • data.brla.gov
    • +1 more
    Updated Feb 2, 2024
    Cite
    data.brla.gov (2024). Benchmark [Dataset]. https://catalog.data.gov/dataset/benchmark-3b4b6
    Explore at:
    Dataset updated
    Feb 2, 2024
    Dataset provided by
    data.brla.gov
    Description

    Point geometry with attributes displaying geodetic control stations (benchmarks) in East Baton Rouge Parish, Louisiana.

  14. Benchmark Energy & Geometry Database

    • bioregistry.io
    Updated Nov 23, 2021
    Cite
    (2021). Benchmark Energy & Geometry Database [Dataset]. http://identifiers.org/re3data:r3d100011166
    Explore at:
    Dataset updated
    Nov 23, 2021
    Description

    The Benchmark Energy & Geometry Database (BEGDB) collects results of highly accurate quantum mechanics (QM) calculations of molecular structures, energies and properties. These data can serve as benchmarks for testing and parameterization of other computational methods.

  15. ColdFusion 2025 Performance Metrics Database

    • cfguide.io
    Updated Oct 5, 2025
    Cite
    Convective (2025). ColdFusion 2025 Performance Metrics Database [Dataset]. https://www.cfguide.io/performance-metrics-database
    Explore at:
    Dataset updated
    Oct 5, 2025
    Dataset authored and provided by
    Convective
    Measurement technique
    Real-world production workload testing with Apache JMeter and performance monitoring tools
    Description

    Comprehensive performance benchmark database for ColdFusion 2025 with real-world metrics across different workload types, hardware configurations, and JVM tuning scenarios

  16. Benchmark data sets

    • data.mendeley.com
    • narcis.nl
    Updated Dec 27, 2017
    + more versions
    Cite
    Haonan Tong (2017). Benchmark data sets [Dataset]. http://doi.org/10.17632/923xvkk5mm.1
    Explore at:
    Dataset updated
    Dec 27, 2017
    Authors
    Haonan Tong
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    A total of 12 software defect data sets from NASA were used in this study. Five data sets (part I), namely CM1, JM1, KC1, KC2, and PC1, were obtained from the PROMISE software engineering repository (http://promise.site.uottawa.ca/SERepository/); the other seven data sets (part II) were obtained from the tera-PROMISE Repository (http://openscience.us/repo/defect/mccabehalsted/).

  17. USGS Benchmark Glacier Project Comprehensive Data Collection

    • data.usgs.gov
    • s.cnmilf.com
    • +1 more
    Updated Apr 10, 2024
    + more versions
    Cite
    U.S. Geological Survey Benchmark Glacier Program (2024). USGS Benchmark Glacier Project Comprehensive Data Collection [Dataset]. http://doi.org/10.5066/P9AGXQSR
    Explore at:
    Dataset updated
    Apr 10, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Authors
    U.S. Geological Survey Benchmark Glacier Program
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Time period covered
    2020
    Description

    Mountain glaciers are closely coupled to climate processes, ecosystems, and regional water resources. To enhance physical understanding of these connections, the USGS maintains a collection of glacier mass balance and climate data across the western United States and Alaska. In some cases, records of glacier mass balance extend back to the mid-1940s. These data have been incorporated from various sources, primarily original USGS studies, but also including work from the University of Alaska, and the Juneau Icefield Research Program (JIRP). The core of this collection is composed of mass balance data from the USGS Benchmark Glaciers. These five glaciers are Lemon Creek Glacier, AK (1953 - Present), South Cascade Glacier, WA (1958 - Present), Gulkana and Wolverine glaciers, AK (1966 - Present), and Sperry Glacier, MT (2005 - Present). Datasets from each benchmark glacier are composed of, at a minimum, point mass balances, glacier hypsometry, daily temperature and precipitation, geode ...

  18. LA County Benchmarks

    • data.lacounty.gov
    • geohub.lacity.org
    • +2 more
    Updated May 19, 2021
    Cite
    County of Los Angeles (2021). LA County Benchmarks [Dataset]. https://data.lacounty.gov/datasets/la-county-benchmarks
    Explore at:
    Dataset updated
    May 19, 2021
    Dataset authored and provided by
    County of Los Angeles
    Area covered
    Description

    Los Angeles County Department of Public Works’ Vertical Control Network is composed of more than 1,700 miles (2,720 kilometers) of level runs and comprises nearly 9,000 benchmarks. The basic accuracy of the net is reflected by an indicated field probable error of ± 0.017 feet per mile (4 mm per kilometer) of leveling as determined from conditions of closure. However, because of varying degrees of subsidence and heaving, the true datum is recovered only by obtaining substantial agreement of a number of benchmarks.

    For each active benchmark, a point representation was created in GIS by locating them based on their description. Parcel data, mile markers, the County Address Management System (CAMS), LARIAC aerials, oblique photos, 2-foot contour lines and/or Google Street View were used in assisting with the location.

    The creation of the benchmarks in GIS greatly enhances the Vertical Control Network by adding visual context with respect to their representative geospatial locations. With a glance, geospatial patterns can be observed and out-of-place benchmarks can be quickly identified and remapped to the correct location after verification.

    To facilitate the adjustment, indexing and distribution of adjusted values in the network, the county territory was divided into 33 quads or areas. For identification purposes, each quad was given a name (for example, “Rosemead”, “La Mirada”, and “Santa Fe”). Index maps, county maps, and other information can be accessed and downloaded on the basis of each of the quads by going to Survey Division’s Benchmark Retrieval System (https://pw.lacounty.gov/sur/benchmark). General adjustments are carried out every 5 to 10 years and the provided elevation data is expected to remain sound during this period. When a quad is adjusted, new elevations will be published and the date of the readjustment will be noted. No historical data is provided, but it can be acquired from Survey Division’s Public Records Counter or via the fee-based Optional Technical Research (OTR) program.

    For general questions, contact: Hector Chang, 626-458-7038, hchang@dpw.lacounty.gov

    For survey-related questions, contact: Charles Springstun, 626-320-9896, cspring@dpw.lacounty.gov

    The following resources can be used to obtain historical benchmark data:

    PUBLIC RECORDS COUNTER: 900 S. Fremont Ave, 4th Floor, Alhambra, CA 91803. 7:00 AM to 5:00 PM Mon – Thurs. Phone: (626) 458-5137

    OPTIONAL TECHNICAL RESEARCH (OTR): 7:00 AM to 5:00 PM Mon – Thurs. Phone: (626) 458-5131

  19. Benchmark Table

    • data-roseville.opendata.arcgis.com
    Updated May 11, 2017
    Cite
    CityofRoseville (2017). Benchmark Table [Dataset]. https://data-roseville.opendata.arcgis.com/documents/12e87c5371a642fba1d05a780353eec5
    Explore at:
    Dataset updated
    May 11, 2017
    Dataset authored and provided by
    CityofRoseville
    Description

    Benchmark locations as identified by the Department of Public Works.

  20. Performance comparison on the benchmark noisy database.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    • +1 more
    Updated Oct 29, 2019
    Cite
    Carrault, Guy; Doyen, Matthieu; Hernández, Alfredo I.; Beuchée, Alain; Ge, Di (2019). Performance comparison on the benchmark noisy database. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000159454
    Explore at:
    Dataset updated
    Oct 29, 2019
    Authors
    Carrault, Guy; Doyen, Matthieu; Hernández, Alfredo I.; Beuchée, Alain; Ge, Di
    Description

    Performance comparison on the benchmark noisy database.
