BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) is a cross-domain dataset that examines the impact of extensive database contents on text-to-SQL parsing. BIRD contains 12,751 unique question-SQL pairs and 95 large databases with a total size of 33.4 GB. It covers more than 37 professional domains, such as blockchain, hockey, healthcare, and education.
Dataset Card for "BIRD-SQL-data"
BIRD-SQL
Data from BIRD-SQL benchmark dev set (last release Jul 3, 2024). Ref: https://bird-bench.github.io
benjamintli/BIRD-SQL-data-train-formatted dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
BIRD-CRITIC-1.0-Flash
BIRD-Critic is the first SQL debugging benchmark designed to answer a critical question: Can large language models (LLMs) fix user issues in real-world database applications? Each task in BIRD-CRITIC has been verified by human experts on the following dimensions:
Reproduction of errors on BIRD env to prevent data leakage. Carefully curate test case functions for each task specifically. Soft EX: This metric can evaluate SELECT-ONLY tasks. Soft EX + Parsing:… See the full description on the dataset page: https://huggingface.co/datasets/birdsql/bird-critic-1.0-flash-exp.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
🧸 Overview
BIRD-INTERACT, an interactive text-to-SQL benchmark, re-imagines Text-to-SQL evaluation through the lens of dynamic interactions. The environment blends a hierarchical knowledge base, database documentation, and a function-driven user simulator to recreate authentic enterprise environments across full CRUD operations. It offers two rigorous test modes: (1) passive Conversational Interaction and (2) active Agentic Interaction, spanning 600 annotated tasks including Business… See the full description on the dataset page: https://huggingface.co/datasets/birdsql/bird-interact-lite.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Update 2025-05-22
The previous issue regarding mismatched MySQL instances has been resolved. The updated version of BIRD-CRITIC-Open is now available. Thank you for your patience and understanding.
Update 2025-04-25
We’ve identified a mismatch issue in some uploaded MySQL instances. Our team is actively working to resolve this, and we’ll release the updated version promptly. Please refrain from using MySQL until the fix is deployed. Apologies for any inconvenience caused.… See the full description on the dataset page: https://huggingface.co/datasets/birdsql/bird-critic-1.0-open.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Update 2025-06-08
We release the full version of BIRD-Critic-PG, a dataset containing 530 high-quality user issues focused on real-world PostgreSQL database applications. The schema file is included in the code repository: https://github.com/bird-bench/BIRD-CRITIC-1/blob/main/baseline/data/post_schema.jsonl
BIRD-CRITIC-1.0-PG
BIRD-Critic is the first SQL debugging benchmark designed to answer a critical question: Can large language models (LLMs) fix user issues in… See the full description on the dataset page: https://huggingface.co/datasets/birdsql/bird-critic-1.0-postgresql.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
BIRD-SQL Mini-Dev
Update 2025-07-04
We are grateful for the valuable feedback from the community over the past year regarding BIRD Mini-Dev. Based on your suggestions, we have made significant updates to the BIRD Mini-Dev dataset.
For New Users
If you are new to BIRD Mini-Dev, you can download the complete databases and datasets using the following link: Download BIRD Mini-Dev Complete Package
For Existing Users
If you have already downloaded the… See the full description on the dataset page: https://huggingface.co/datasets/birdsql/bird_mini_dev.
https://spdx.org/licenses/CC0-1.0.html
Climate change has led to changes in the strength of directional selection on seasonal timing. Understanding the causes and consequences of these changes is crucial to predicting the impact of climate change. But are observed patterns in one population generalisable to others, and can spatial variation in selection be explained by environmental variation among populations? We used long-term data (1955–2022) on blue and great tits co-occurring in four locations across the Netherlands to assess inter-population variation in temporal patterns of selection on laying date. To analyse selection, we combine reproduction and adult survival into a joined fitness measure. We found distinct spatial variation in temporal patterns of selection which overall acted towards earlier laying, and which was due to selection through reproduction rather than through survival. The underlying relationships between temperature, bird and caterpillar phenology were, however, the same across populations, and the spatial variation in selection patterns is thus caused by spatial variation in the temperatures and other habitat characteristics to which birds and caterpillars respond. This underlines that climate change is not necessarily equally affecting populations, but that we can understand this spatial variation, which enables us to predict climate change effects on selection for other populations.
Methods
Long-term data on breeding birds were collected by regular nest checks and by capturing and ringing birds. Data on caterpillar biomass was collected using frass nets. All data was stored in a relational SQL database and analysed using R.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
BIRD-RL
This dataset is a processed dataset of BIRD-SQL for Post-Training.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
🚀 LiveSQLBench-Base-Lite
A dynamic, contamination‑free benchmark for evaluating LLMs on complex, real‑world text‑to‑SQL tasks. Maintained by the 🦜 BIRD Team @ HKU & ☁️ Google Cloud.
📊 LiveSQLBench Overview
LiveSQLBench (BIRD-SQL Pro v0.5) is a contamination-free, continuously evolving benchmark designed to evaluate LLMs on complex, real-world text-to-SQL tasks, featuring diverse real-world user queries, including… See the full description on the dataset page: https://huggingface.co/datasets/birdsql/livesqlbench-base-lite.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
BIRD-SQL - Portuguese Version
This repository contains the Portuguese translation of the training and development splits of the BIRD-SQL benchmark, a benchmark for the Text-to-SQL task.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Dataset
We built this dataset by combining examples from several sources: WikiSQL, BIRD, Spider, and synthetic SQL samples.
This dataset has been cleaned and filtered by:
Removing DDL/DML examples (INSERT, UPDATE, DELETE, etc.); de-duplicating examples based on hashing semantics of SQL and queries; filtering only SELECT-style analytical queries.
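The cleaning steps above could look roughly like the following Python sketch. It is not the dataset authors' pipeline: the field names `question` and `sql`, the helper names, and the normalization used for hashing (lowercasing and whitespace collapsing) are all assumptions made for illustration, since the card does not say how "hashing semantics" is computed.

```python
import hashlib
import re


def normalize(text):
    """Crude normalization (an assumption): collapse whitespace,
    drop a trailing semicolon, and lowercase."""
    return re.sub(r"\s+", " ", text).strip().rstrip(";").strip().lower()


def is_select_query(sql):
    """Keep only statements whose first keyword is SELECT or WITH.
    This rejects INSERT/UPDATE/DELETE and DDL by construction."""
    first = normalize(sql).split(" ", 1)[0]
    return first in {"select", "with"}


def example_key(question, sql):
    """Hash the normalized question and SQL together as a dedup key."""
    payload = normalize(question) + "\x1f" + normalize(sql)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def clean(examples):
    """Filter to SELECT-style queries and drop duplicates, keeping the
    first occurrence of each (question, sql) pair."""
    seen = set()
    kept = []
    for ex in examples:
        if not is_select_query(ex["sql"]):
            continue
        key = example_key(ex["question"], ex["sql"])
        if key in seen:
            continue
        seen.add(key)
        kept.append(ex)
    return kept
```

Lowercasing the SQL also lowercases string literals, so two queries that differ only in literal case would be merged; a stricter key would need a real SQL parser.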
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Text2SQL Workflow Trace
Dataset Description
This dataset contains workflow traces for Text-to-SQL tasks, capturing the intermediate steps of translating natural language queries to executable SQL. It was used as the input trace for the research presented in the paper "HEXGEN-TEXT2SQL: Optimizing LLM Inference Request Scheduling for Agentic Text-to-SQL Workflow" (arXiv:2505.05286). The end-to-end Text-to-SQL queries collected in the dataset are from BIRD bench, and the trace… See the full description on the dataset page: https://huggingface.co/datasets/fredpeng/Text2SQL_Workflow_Trace.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Dataset Card for Spider
Dataset Summary
Spider is a large-scale complex and cross-domain semantic parsing and text-to-SQL dataset annotated by 11 Yale students. The goal of the Spider challenge is to develop natural language interfaces to cross-domain databases.
Supported Tasks and Leaderboards
The leaderboard can be seen at https://yale-lily.github.io/spider
Languages
The text in the dataset is in English.
Dataset Structure
Data… See the full description on the dataset page: https://huggingface.co/datasets/xlangai/spider.