100+ datasets found
  1. finesse-benchmark-database

    • huggingface.co
    Updated Oct 25, 2025
    Cite
    winter.sci.dev (2025). finesse-benchmark-database [Dataset]. https://huggingface.co/datasets/enzoescipy/finesse-benchmark-database
    Dataset updated
    Oct 25, 2025
    Authors
    winter.sci.dev
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Finesse Benchmark Database

      Overview
    

    finesse-benchmark-database is a data generation factory for atomic probes in the Finesse benchmark. It generates probes_atomic.jsonl files from Wikimedia Wikipedia datasets, leveraging Hugging Face's datasets library, tokenizers from transformers, and optional PyTorch support. This tool is designed to create high-quality, language-specific probe datasets for benchmarking fine-grained understanding in NLP tasks.… See the full description on the dataset page: https://huggingface.co/datasets/enzoescipy/finesse-benchmark-database.
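
    A minimal sketch of inspecting a probes_atomic.jsonl file, assuming only that it follows the usual JSON Lines convention (one JSON object per line). The record schema is not given in the description above, so the helper names parse_jsonl and key_counts are hypothetical and the code merely reports which top-level keys occur.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def parse_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per non-empty line (JSON Lines convention)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def key_counts(records: Iterable[dict]) -> Counter:
    """Count how often each top-level key appears across the records."""
    counts = Counter()
    for record in records:
        counts.update(record.keys())
    return counts
```

    Passing an open file handle (for example open("probes_atomic.jsonl", encoding="utf-8")) to parse_jsonl streams the file line by line, so large probe files do not need to fit in memory.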

  2. Global Data Literacy Benchmark

    • datatothepeople.org
    Updated Aug 14, 2020
    Cite
    Data To The People (2020). Global Data Literacy Benchmark [Dataset]. https://www.datatothepeople.org/gdlb
    Dataset updated
    Aug 14, 2020
    Dataset authored and provided by
    Data To The People
    Description

    Dataset enabling organizations to benchmark their data literacy capability globally.

  3. NIST Computational Chemistry Comparison and Benchmark Database - SRD 101

    • catalog.data.gov
    • s.cnmilf.com
    • +2 more
    Updated Sep 30, 2025
    Cite
    National Institute of Standards and Technology (2025). NIST Computational Chemistry Comparison and Benchmark Database - SRD 101 [Dataset]. https://catalog.data.gov/dataset/nist-computational-chemistry-comparison-and-benchmark-database-srd-101
    Dataset updated
    Sep 30, 2025
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    The NIST Computational Chemistry Comparison and Benchmark Database is a collection of experimental and ab initio thermochemical properties for a selected set of gas-phase molecules. The goals are to provide a benchmark set of experimental data for the evaluation of ab initio computational methods and allow the comparison between different ab initio computational methods for the prediction of gas-phase thermochemical properties. The data files linked to this record are a subset of the experimental data present in the CCCBDB.

  4. data-product-benchmark

    • huggingface.co
    Updated Oct 4, 2025
    Cite
    IBM Research (2025). data-product-benchmark [Dataset]. https://huggingface.co/datasets/ibm-research/data-product-benchmark
    Dataset updated
    Oct 4, 2025
    Dataset provided by
    IBM (http://ibm.com/)
    IBM Research
    Authors
    IBM Research
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Description

    This dataset provides a benchmark for automatic data product creation. The task is framed as follows: given a natural language data product request and a corpus of text and tables, the objective is to identify the relevant tables and text documents that should be included in the resulting data product, i.e. those that would be useful for the given data product request. The benchmark brings together three variants: HybridQA, TAT-QA, and ConvFinQA, each consisting of:

    A corpus… See the full description on the dataset page: https://huggingface.co/datasets/ibm-research/data-product-benchmark.

  5. LLMSQL Benchmark

    • kaggle.com
    • huggingface.co
    zip
    Updated Oct 12, 2025
    Cite
    Dzmitry Pihulski (2025). LLMSQL Benchmark [Dataset]. https://www.kaggle.com/datasets/dzmitrypihulski/llmsql-benchmark
    Available download formats: zip (46426396 bytes)
    Dataset updated
    Oct 12, 2025
    Authors
    Dzmitry Pihulski
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    LLMSQL Benchmark

    This benchmark is designed to evaluate text-to-SQL models. For usage of this benchmark see https://github.com/LLMSQL/llmsql-benchmark.

    If you want to use this benchmark from huggingface, please see: https://huggingface.co/llmsql-bench.

    Arxiv Article: https://arxiv.org/abs/2510.02350

    Files

    • tables.jsonl — Database table metadata
    • questions.jsonl — All available questions
    • train_questions.jsonl, val_questions.jsonl, test_questions.jsonl — Data splits for finetuning, see https://github.com/LLMSQL/llmsql-benchmark
    • sqlite_tables.db — SQLite database with the tables from tables.jsonl, created with the help of create_db.sql.
    • create_db.sql — SQL script that creates the database sqlite_tables.db.
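
    A minimal sketch of inspecting sqlite_tables.db with Python's standard sqlite3 module. It assumes only that the file is an ordinary SQLite database, as the file list above states; the helper name list_tables is hypothetical, and no table or column names from the benchmark are assumed.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list:
    """Return the names of all tables in the database, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Assumes sqlite_tables.db has been downloaded to the working directory.
    with sqlite3.connect("sqlite_tables.db") as conn:
        for table in list_tables(conn):
            print(table)
```

    Querying sqlite_master avoids hard-coding table names, which are not listed in the description above.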

    Citation

    If you use this benchmark, please cite:

    @inproceedings{llmsql_bench,
     title={LLMSQL: Upgrading WikiSQL for the LLM Era of Text-to-SQL},
     author={Pihulski, Dzmitry and Charchut, Karol and Novogrodskaia, Viktoria and Koco{\'n}, Jan},
     booktitle={2025 IEEE International Conference on Data Mining Workshops (ICDMW)},
     year={2025},
     organization={IEEE}
    }
    
  6. Data for: A Benchmark Dataset for Machine Learning in Ecotoxicology -...

    • opendata.eawag.ch
    • opendata-stage.eawag.ch
    Cite
    Data for: A Benchmark Dataset for Machine Learning in Ecotoxicology - Package - ERIC [Dataset]. https://opendata.eawag.ch/dataset/adore
    Description

    The publication provides and describes a clean and expert-curated benchmark dataset to be used for machine-learning-based research in ecotoxicology. The package contains several data files associated with the challenges we propose and some supplementary data files to aid in the interpretation of results.

  7. LDBC-SNB SF-0001 and SF-0003 Datasets

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Jan 21, 2020
    Cite
    Arnau Prat-Pérez (2020). LDBC-SNB SF-0001 and SF-0003 Datasets [Dataset]. http://doi.org/10.5281/zenodo.3452106
    Available download formats: application/gzip
    Dataset updated
    Jan 21, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Arnau Prat-Pérez
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    These datasets were generated with the LDBC SNB Data generator:

    https://github.com/ldbc/ldbc_snb_datagen

    They correspond to Scale Factors 1 and 3 and are used in the following paper:

    An early look at the LDBC social network benchmark's business intelligence workload

    10.1145/3210259.3210268

  8. Benchmark Results: DBpedia 50%

    • figshare.com
    zip
    Updated Apr 28, 2017
    Cite
    Felix Conrads; Jens Lehmann; Muhammad Saleem; Mohamed Morsey; Axel-Cyrille Ngonga Ngomo (2017). Benchmark Results: DBpedia 50% [Dataset]. http://doi.org/10.6084/m9.figshare.3205435.v1
    Available download formats: zip
    Dataset updated
    Apr 28, 2017
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Felix Conrads; Jens Lehmann; Muhammad Saleem; Mohamed Morsey; Axel-Cyrille Ngonga Ngomo
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Results of the IGUANA Benchmark in 2015/16 for the truncated DBpedia dataset. The dataset is 50% of the initial 100% dataset.

  9. Why AI Can't Crack Your Database - Data Analysis

    • tomtunguz.com
    Updated Aug 13, 2025
    Cite
    Tomasz Tunguz (2025). Why AI Can't Crack Your Database - Data Analysis [Dataset]. https://tomtunguz.com/spider-2-benchmark-trends/
    Dataset updated
    Aug 13, 2025
    Dataset provided by
    Theory Ventures
    Authors
    Tomasz Tunguz
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Explore why AI excels at complex math but struggles with SQL queries, with benchmark data showing a 60% accuracy ceiling in database operations across leading models.

  10. Benchmark data sets

    • data.mendeley.com
    • narcis.nl
    Updated Dec 27, 2017
    Cite
    Haonan Tong (2017). Benchmark data sets [Dataset]. http://doi.org/10.17632/923xvkk5mm.1
    Dataset updated
    Dec 27, 2017
    Authors
    Haonan Tong
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    A total of 12 software defect data sets from NASA were used in this study. Five data sets (part I), namely CM1, JM1, KC1, KC2, and PC1, were obtained from the PROMISE software engineering repository (http://promise.site.uottawa.ca/SERepository/); the other seven (part II) were obtained from the tera-PROMISE repository (http://openscience.us/repo/defect/mccabehalsted/).

  11. Dump of RDF dataset used by PO for a Graph Database benchmark, 2022

    • data.niaid.nih.gov
    • nde-dev.biothings.io
    Updated Jan 29, 2023
    Cite
    Ghislain ATEMEZING (2023). Dump of RDF dataset used by PO for a Graph Database benchmark, 2022 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1036738
    Dataset updated
    Jan 29, 2023
    Dataset provided by
    Mondeca
    Authors
    Ghislain ATEMEZING
    Description

    This dataset represents a newer version of the NQUADS files in RDF from Publication Offices used for benchmarking graph databases.

  12. Benchmark

    • gis-cupertino.opendata.arcgis.com
    • hub.arcgis.com
    Updated Jan 21, 2016
    Cite
    City of Cupertino (2016). Benchmark [Dataset]. https://gis-cupertino.opendata.arcgis.com/datasets/benchmark/data
    Dataset updated
    Jan 21, 2016
    Dataset authored and provided by
    City of Cupertino
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Benchmark is a Point FeatureClass representing land-surveyed benchmarks in Cupertino. Benchmarks are stable sites used to provide elevation data. It is primarily used as a reference layer. The layer is updated as needed by the GIS department. Benchmark has the following fields:

    OBJECTID: Unique identifier automatically generated by Esri (type: OID, length: 4, domain: none)

    ID: Unique identifier assigned to the Benchmark (type: Integer, length: 4, domain: none)

    REF_MARK: The reference mark associated with the Benchmark (type: String, length: 10, domain: none)

    ELEV: The elevation of the Benchmark (type: Double, length: 8, domain: none)

    Shape: Field that stores geographic coordinates associated with feature (type: Geometry, length: 4, domain: none)

    Description: A more detailed description of the Benchmark (type: String, length: 200, domain: none)

    Owner: The owner of the Benchmark (type: String, length: 10, domain: none)

    GlobalID: Unique identifier automatically generated for features in enterprise database (type: GlobalID, length: 38, domain: none)

    Operator: The user responsible for updating this database (type: String, length: 255, domain: OPERATOR)

    last_edited_date: The date the database row was last updated (type: Date, length: 8, domain: none)

    created_date: The date the database row was initially created (type: Date, length: 8, domain: none)

    VerticalDatum: The vertical datum associated with the Benchmark (type: String, length: 100, domain: none)
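
    The field list above can be mirrored in a small record type. This is only an illustrative sketch: the class name BenchmarkRecord, the MAX_LEN table, and the too_long helper are hypothetical, the Shape and date fields are omitted for brevity, and the Python types are assumptions mapped from the listed Esri types (Integer to int, Double to float, String to str).

```python
from dataclasses import dataclass

# Maximum string lengths copied from the field list above.
MAX_LEN = {
    "REF_MARK": 10,
    "Description": 200,
    "Owner": 10,
    "GlobalID": 38,
    "Operator": 255,
    "VerticalDatum": 100,
}


@dataclass
class BenchmarkRecord:
    OBJECTID: int
    ID: int
    REF_MARK: str
    ELEV: float
    Description: str = ""
    Owner: str = ""
    GlobalID: str = ""
    Operator: str = ""
    VerticalDatum: str = ""

    def too_long(self) -> list:
        """Return the names of string fields that exceed their listed length."""
        return [
            name for name, limit in MAX_LEN.items()
            if len(getattr(self, name)) > limit
        ]
```

    A record with an 11-character Owner value, for instance, would be reported by too_long() because the listed limit for Owner is 10.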

  13. Elevation Benchmarks

    • catalog.data.gov
    • data.cityofchicago.org
    • +3 more
    Updated Dec 2, 2023
    Cite
    data.cityofchicago.org (2023). Elevation Benchmarks [Dataset]. https://catalog.data.gov/dataset/elevation-benchmarks
    Dataset updated
    Dec 2, 2023
    Dataset provided by
    data.cityofchicago.org
    Description

    The following dataset includes "Active Benchmarks," which are provided to facilitate the identification of City-managed standard benchmarks. Standard benchmarks are for public and private use in establishing a point in space. Note: The benchmarks are referenced to the Chicago City Datum = 0.00 (CCD = 579.88 feet above mean tide New York). The City of Chicago Department of Water Management’s (DWM) Topographic Benchmark is the source of the benchmark information contained in this online database. The information contained in the index card system was compiled by scanning the original cards, then transcribing some of this information to prepare a table and map. Over time, the DWM will contract services to field verify the data and update the index card system and this online database. This dataset was last updated September 2011. Coordinates are estimated. To view the map, go to https://data.cityofchicago.org/Buildings/Elevation-Benchmarks-Map/kmt9-pg57 or, for a PDF map, go to http://cityofchicago.org/content/dam/city/depts/water/supp_info/Benchmarks/BMMap.pdf. Please read the Terms of Use: http://www.cityofchicago.org/city/en/narr/foia/data_disclaimer.html.
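
    The datum note above amounts to a constant offset. A minimal sketch, assuming elevations are given in feet relative to the Chicago City Datum; the function and constant names are hypothetical, and only the 579.88 ft figure comes from the description.

```python
# Offset stated in the description: CCD 0.00 = 579.88 ft above mean tide New York.
CCD_OFFSET_FT = 579.88


def ccd_to_ny_mean_tide_ft(elev_ccd_ft: float) -> float:
    """Convert an elevation relative to CCD into feet above mean tide New York."""
    return elev_ccd_ft + CCD_OFFSET_FT


def ny_mean_tide_to_ccd_ft(elev_ny_ft: float) -> float:
    """Inverse conversion: feet above mean tide New York back to CCD."""
    return elev_ny_ft - CCD_OFFSET_FT
```

    Converting to any other vertical datum would need additional offsets that the description does not provide.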

  14. Benchmark Database for Phonetic Alignments

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    • +1 more
    Updated Feb 21, 2022
    Cite
    List, Johann-Mattis; Prokić, Jelena (2022). Benchmark Database for Phonetic Alignments [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11880
    Dataset updated
    Feb 21, 2022
    Dataset provided by
    Philipps-Universität Marburg
    Authors
    List, Johann-Mattis; Prokić, Jelena
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    In the last two decades, alignment analyses have become an important technique in quantitative historical linguistics and dialectology. Phonetic alignment plays a crucial role in the identification of regular sound correspondences and deeper genealogical relations between and within languages and language families. Surprisingly, to date there are no easily accessible benchmark data sets for phonetic alignment analyses. Here we present a publicly available database of manually edited phonetic alignments which can serve as a platform for testing and improving the performance of automatic alignment algorithms. The database consists of a great variety of alignments drawn from a large number of different sources. The data is arranged in such a way that typical problems encountered in phonetic alignment analyses (metathesis, diversity of phonetic sequences) are represented and can be directly tested.

  15. Data from: BuildingsBench: A Large-Scale Dataset of 900K Buildings and...

    • data.openei.org
    • osti.gov
    code, data, website
    Updated Dec 31, 2018
    Cite
    Patrick Emami; Peter Graf (2018). BuildingsBench: A Large-Scale Dataset of 900K Buildings and Benchmark for Short-Term Load Forecasting [Dataset]. http://doi.org/10.25984/1986147
    Available download formats: code, website, data
    Dataset updated
    Dec 31, 2018
    Dataset provided by
    Open Energy Data Initiative (OEDI)
    National Renewable Energy Laboratory
    USDOE Office of Energy Efficiency and Renewable Energy (EERE), Multiple Programs (EE)
    Authors
    Patrick Emami; Peter Graf
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The BuildingsBench datasets consist of:

    • Buildings-900K: A large-scale dataset of 900K buildings for pretraining models on the task of short-term load forecasting (STLF). Buildings-900K is statistically representative of the entire U.S. building stock.
    • 7 real residential and commercial building datasets for benchmarking two downstream tasks evaluating generalization: zero-shot STLF and transfer learning for STLF.

    Buildings-900K can be used for pretraining models on day-ahead STLF for residential and commercial buildings. The specific gap it fills is the lack of large-scale and diverse time series datasets of sufficient size for studying pretraining and finetuning with scalable machine learning models. Buildings-900K consists of synthetically generated energy consumption time series. It is derived from the NREL End-Use Load Profiles (EULP) dataset (see link to this database in the links further below). However, the EULP was not originally developed for the purpose of STLF. Rather, it was developed to "...help electric utilities, grid operators, manufacturers, government entities, and research organizations make critical decisions about prioritizing research and development, utility resource and distribution system planning, and state and local energy planning and regulation." Similar to the EULP, Buildings-900K is a collection of Parquet files and it follows nearly the same Parquet dataset organization as the EULP. As it only contains a single energy consumption time series per building, it is much smaller (~110 GB).

    BuildingsBench also provides an evaluation benchmark that is a collection of various open source residential and commercial real building energy consumption datasets. The evaluation datasets, which are provided alongside Buildings-900K below, are collections of CSV files which contain annual energy consumption. The size of the evaluation datasets altogether is less than 1GB, and they are listed out below:

    1. ElectricityLoadDiagrams20112014
    2. Building Data Genome Project-2
    3. Individual household electric power consumption (Sceaux)
    4. Borealis
    5. SMART
    6. IDEAL
    7. Low Carbon London

    A README file providing details about how the data is stored and describing the organization of the datasets can be found within each data lake version under BuildingsBench.

  16. Benchmark test databases for IQA.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Sep 23, 2014
    Cite
    Lu, Yin; Wu, Xiao-Jun; Sang, Qing-Bing; Li, Chao-Feng (2014). Benchmark test databases for IQA. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001257529
    Dataset updated
    Sep 23, 2014
    Authors
    Lu, Yin; Wu, Xiao-Jun; Sang, Qing-Bing; Li, Chao-Feng
    Description

    Benchmark test databases for IQA.

  17. Oxford Nanopore Technologies Benchmark Datasets

    • registry.opendata.aws
    Updated Sep 29, 2020
    Cite
    Oxford Nanopore Technologies (2020). Oxford Nanopore Technologies Benchmark Datasets [Dataset]. https://registry.opendata.aws/ont-open-data/
    Dataset updated
    Sep 29, 2020
    Dataset provided by
    Oxford Nanopore Technologies (http://nanoporetech.com/)
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    The ont-open-data registry provides reference sequencing data from Oxford Nanopore Technologies to support: 1) exploration of the characteristics of nanopore sequence data; 2) assessment and reproduction of performance benchmarks; 3) development of tools and methods. The data deposited showcases DNA sequences from a representative subset of sequencing chemistries. The datasets correspond to publicly available reference samples (e.g. Genome In A Bottle reference cell lines). Raw data are provided with metadata and scripts to describe sample and data provenance.

  18. 🟥AMD - CPU Benchmarks (UserBenchmark)📊

    • kaggle.com
    zip
    Updated May 3, 2022
    Cite
    💥Alien💥 (2022). 🟥AMD - CPU Benchmarks (UserBenchmark)📊 [Dataset]. https://www.kaggle.com/datasets/alanjo/amd-cpu-benchmarks
    Available download formats: zip (7339 bytes)
    Dataset updated
    May 3, 2022
    Authors
    💥Alien💥
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Benchmarks allow for easy comparison between multiple CPUs by scoring their performance on a standardized series of tests, and they are useful in many situations, for example when buying or building a new PC.

    Content

    Newest data as of May 2nd, 2022. This dataset contains benchmarks of AMD processors.

    Acknowledgements

    Data scraped from UserBenchmark.

    Inspiration

    When Lisa Su became CEO of Advanced Micro Devices in 2014, the company was on the brink of bankruptcy. Since then, AMD's stock has soared from less than US $2 per share to more than $110. The company is now a leader in high-performance computing. She funneled billions of dollars into research and development, while Intel funneled its R&D funds into executive pay. Intel is now losing a large portion of the market it originally dominated.


  19. Benchmark for 47M DBpedia based dataset on 4 triple stores

    • figshare.com
    txt
    Updated Mar 28, 2018
    Cite
    Felix Conrads; Muhammad Saleem; Alexander Bigerl; Mohamed Ahmed Sherif; Axel-Cyrille Ngonga Ngomo (2018). Benchmark for 47M DBpedia based dataset on 4 triple stores [Dataset]. http://doi.org/10.6084/m9.figshare.5808333.v1
    Available download formats: txt
    Dataset updated
    Mar 28, 2018
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Felix Conrads; Muhammad Saleem; Alexander Bigerl; Mohamed Ahmed Sherif; Axel-Cyrille Ngonga Ngomo
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Benchmarking results for 47 million triples based on the DBpedia dataset, using 499 queries on TNT, Fuseki, Virtuoso and N-graphStore, with approx. 300 GB RAM provided for each triple store.

  20. Ground benchmark data sets 2024 district-free city of Cottbus

    • datasets.ai
    • data.europa.eu
    Updated Feb 25, 2024
    Cite
    GDI-DE (2024). Ground benchmark data sets 2024 district-free city of Cottbus [Dataset]. https://datasets.ai/datasets/0d8578cc-9118-409c-8b09-91575203f840
    Dataset updated
    Feb 25, 2024
    Dataset authored and provided by
    GDI-DE
    Area covered
    Cottbus
    Description

    Ground benchmark datasets are published annually in the standard file formats Text (CSV) and XML based on EPSG code 25833. Depending on the file format, ground benchmark data sets are provided completely for the areas of responsibility of the expert committees and for the state of Brandenburg in a zipped file with a statistical indication and a description of the elements. The CSV file is based on VBORIS2. A key bridge to the old format can be taken from the data. On request, ground benchmark data records for municipal areas can be cut out or provided in shape format. Furthermore, the delivery of soil benchmarks in the form of web-based geoservices is possible.
