17 datasets found

BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation)...
paperswithcode.com
Updated Jan 5, 2024
Cite
Jinyang Li; Binyuan Hui; Ge Qu; Jiaxi Yang; Binhua Li; Bowen Li; Bailin Wang; Bowen Qin; Rongyu Cao; Ruiying Geng; Nan Huo; Xuanhe Zhou; Chenhao Ma; Guoliang Li; Kevin C. C. Chang; Fei Huang; Reynold Cheng; Yongbin Li (2024). BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) Dataset [Dataset]. https://paperswithcode.com/dataset/bird-sql
Explore at:
Dataset updated
Jan 5, 2024
Authors
Jinyang Li; Binyuan Hui; Ge Qu; Jiaxi Yang; Binhua Li; Bowen Li; Bailin Wang; Bowen Qin; Rongyu Cao; Ruiying Geng; Nan Huo; Xuanhe Zhou; Chenhao Ma; Guoliang Li; Kevin C. C. Chang; Fei Huang; Reynold Cheng; Yongbin Li
Description
BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) is a pioneering, cross-domain dataset that examines the impact of extensive database contents on text-to-SQL parsing. BIRD contains over 12,751 unique question-SQL pairs and 95 large databases with a total size of 33.4 GB. It covers more than 37 professional domains, such as blockchain, hockey, healthcare, and education.
BIRD-SQL-data
huggingface.co
Cite
Wen-Ding Li, BIRD-SQL-data [Dataset]. https://huggingface.co/datasets/xu3kev/BIRD-SQL-data
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Authors
Wen-Ding Li
Description
Dataset Card for "BIRD-SQL-data"

More Information needed
bird
huggingface.co
Updated Jul 3, 2024
Cite
Mic (2024). bird [Dataset]. https://huggingface.co/datasets/micpst/bird
Explore at:
Croissant
Dataset updated
Jul 3, 2024
Authors
Mic
Description
BIRD-SQL

Data from the BIRD-SQL benchmark dev set (last release: Jul 3, 2024). Reference: https://bird-bench.github.io
BIRD-SQL
huggingface.co
Cite
Deema, BIRD-SQL [Dataset]. https://huggingface.co/datasets/Deema/BIRD-SQL
Explore at:
Authors
Deema
Description
Deema/BIRD-SQL dataset hosted on Hugging Face and contributed by the HF Datasets community
BIRD-SQL-data-train-formatted
huggingface.co
Updated Feb 6, 2024
Cite
Benjamin Li (2024). BIRD-SQL-data-train-formatted [Dataset]. https://huggingface.co/datasets/benjamintli/BIRD-SQL-data-train-formatted
Explore at:
Dataset updated
Feb 6, 2024
Authors
Benjamin Li
Description
benjamintli/BIRD-SQL-data-train-formatted dataset hosted on Hugging Face and contributed by the HF Datasets community
bird-critic-1.0-flash-exp
huggingface.co
Cite
The BIRD Team, bird-critic-1.0-flash-exp [Dataset]. https://huggingface.co/datasets/birdsql/bird-critic-1.0-flash-exp
Explore at:
Dataset authored and provided by
The BIRD Team
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
BIRD-CRITIC-1.0-Flash

BIRD-Critic is the first SQL debugging benchmark designed to answer a critical question: Can large language models (LLMs) fix user issues in real-world database applications? Each task in BIRD-CRITIC has been verified by human experts on the following dimensions:

Reproduction of errors on BIRD env to prevent data leakage.
Carefully curate test case functions for each task specifically.
Soft EX: This metric can evaluate SELECT-ONLY tasks.
Soft EX + Parsing:… See the full description on the dataset page: https://huggingface.co/datasets/birdsql/bird-critic-1.0-flash-exp.
bird-interact-lite
huggingface.co
Updated Jun 9, 2025
Cite
The BIRD Team (2025). bird-interact-lite [Dataset]. https://huggingface.co/datasets/birdsql/bird-interact-lite
Explore at:
Dataset updated
Jun 9, 2025
Dataset authored and provided by
The BIRD Team
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
🧸 Overview

BIRD-INTERACT, an interactive text-to-SQL benchmark, re-imagines text-to-SQL evaluation through the lens of dynamic interactions. The environment blends a hierarchical knowledge base, database documentation, and a function-driven user simulator to recreate authentic enterprise environments across full CRUD operations. It offers two rigorous test modes: (1) passive Conversational Interaction and (2) active Agentic Interaction, spanning 600 annotated tasks including Business… See the full description on the dataset page: https://huggingface.co/datasets/birdsql/bird-interact-lite.
bird-critic-1.0-open
huggingface.co
Updated Apr 25, 2025
Cite
The BIRD Team (2025). bird-critic-1.0-open [Dataset]. https://huggingface.co/datasets/birdsql/bird-critic-1.0-open
Explore at:
Dataset updated
Apr 25, 2025
Dataset authored and provided by
The BIRD Team
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Update 2025-05-22

The previous issue regarding mismatched MySQL instances has been resolved. The updated version of BIRD-CRITIC-Open is now available. Thank you for your patience and understanding.

Update 2025-04-25

We’ve identified a mismatch issue in some uploaded MySQL instances. Our team is actively working to resolve this, and we’ll release the updated version promptly. Please refrain from using MySQL until the fix is deployed. Apologies for any inconvenience caused.… See the full description on the dataset page: https://huggingface.co/datasets/birdsql/bird-critic-1.0-open.
bird-critic-1.0-postgresql
huggingface.co
Updated Jun 8, 2025
Cite
The BIRD Team (2025). bird-critic-1.0-postgresql [Dataset]. https://huggingface.co/datasets/birdsql/bird-critic-1.0-postgresql
Explore at:
Dataset updated
Jun 8, 2025
Dataset authored and provided by
The BIRD Team
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Update 2025-06-08

We release the full version of BIRD-Critic-PG, a dataset containing 530 high-quality user issues focused on real-world PostgreSQL database applications. The schema file is included in the code repository: https://github.com/bird-bench/BIRD-CRITIC-1/blob/main/baseline/data/post_schema.jsonl

BIRD-CRITIC-1.0-PG

BIRD-Critic is the first SQL debugging benchmark designed to answer a critical question: Can large language models (LLMs) fix user issues in… See the full description on the dataset page: https://huggingface.co/datasets/birdsql/bird-critic-1.0-postgresql.
bird_mini_dev
huggingface.co
Updated Jul 4, 2025
Cite
The BIRD Team (2025). bird_mini_dev [Dataset]. https://huggingface.co/datasets/birdsql/bird_mini_dev
Explore at:
Dataset updated
Jul 4, 2025
Dataset authored and provided by
The BIRD Team
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
BIRD-SQL Mini-Dev

Update 2025-07-04

We are grateful for the valuable feedback from the community over the past year regarding BIRD Mini-Dev. Based on your suggestions, we have made significant updates to the BIRD Mini-Dev dataset.

For New Users

If you are new to BIRD Mini-Dev, you can download the complete databases and datasets using the following link: Download BIRD Mini-Dev Complete Package

For Existing Users

If you have already downloaded the… See the full description on the dataset page: https://huggingface.co/datasets/birdsql/bird_mini_dev.
Data from: Climate change does not equally affect temporal patterns of...
data.niaid.nih.gov
datadryad.org
zip
Updated Sep 25, 2023
Cite
Marcel E. Visser; Cherine Jantzen (2023). Climate change does not equally affect temporal patterns of natural selection on reproductive timing across populations in two songbird species [Dataset]. http://doi.org/10.5061/dryad.1zcrjdfz0
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.1zcrjdfz0
Dataset updated
Sep 25, 2023
Dataset provided by
Netherlands Institute of Ecology
Authors
Marcel E. Visser; Cherine Jantzen
License
https://spdx.org/licenses/CC0-1.0.html
Description
Climate change has led to changes in the strength of directional selection on seasonal timing. Understanding the causes and consequences of these changes is crucial to predicting the impact of climate change. But are observed patterns in one population generalisable to others, and can spatial variation in selection be explained by environmental variation among populations? We used long-term data (1955–2022) on blue and great tits co-occurring in four locations across the Netherlands to assess inter-population variation in temporal patterns of selection on laying date. To analyse selection, we combined reproduction and adult survival into a joint fitness measure. We found distinct spatial variation in temporal patterns of selection which overall acted towards earlier laying, and which was due to selection through reproduction rather than through survival. The underlying relationships between temperature, bird and caterpillar phenology were however the same across populations, and the spatial variation in selection patterns is thus caused by spatial variation in the temperatures and other habitat characteristics to which birds and caterpillars respond. This underlines that climate change is not necessarily equally affecting populations, but that we can understand this spatial variation, which enables us to predict climate change effects on selection for other populations.
Methods: Long-term data on breeding birds were collected by regular nest checks and by capturing and ringing birds. Data on caterpillar biomass were collected using frass nets. All data were stored in a relational SQL database and analysed using R.
bird-rl
huggingface.co
Cite
Rihong, bird-rl [Dataset]. https://huggingface.co/datasets/Rihong/bird-rl
Explore at:
Authors
Rihong
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
BIRD-RL

This is a processed version of the BIRD-SQL dataset for post-training.
livesqlbench-base-lite
huggingface.co
Cite
The BIRD Team, livesqlbench-base-lite [Dataset]. https://huggingface.co/datasets/birdsql/livesqlbench-base-lite
Explore at:
Dataset authored and provided by
The BIRD Team
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
🚀 LiveSQLBench-Base-Lite

A dynamic, contamination‑free benchmark for evaluating LLMs on complex, real‑world text‑to‑SQL tasks.
🌐 Website • 📄 Paper (coming soon) • 💻 GitHub
Maintained by the 🦜 BIRD Team @ HKU & ☁️ Google Cloud

📊 LiveSQLBench Overview

LiveSQLBench (BIRD-SQL Pro v0.5) is a contamination-free, continuously evolving benchmark designed to evaluate LLMs on complex, real-world text-to-SQL tasks, featuring diverse real-world user queries, including… See the full description on the dataset page: https://huggingface.co/datasets/birdsql/livesqlbench-base-lite.
bird-sql-portuguese
huggingface.co
Cite
Breno, bird-sql-portuguese [Dataset]. https://huggingface.co/datasets/Boakpe/bird-sql-portuguese
Explore at:
Authors
Breno
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
BIRD-SQL - Portuguese Version

This repository contains the Portuguese translation of the training and development splits of the BIRD-SQL benchmark, a benchmark for the Text-to-SQL task.
text2sql-dataset
huggingface.co
Updated Jun 20, 2025
Cite
Fahmi Aziz Fadhil (2025). text2sql-dataset [Dataset]. https://huggingface.co/datasets/fahmiaziz/text2sql-dataset
Explore at:
Dataset updated
Jun 20, 2025
Authors
Fahmi Aziz Fadhil
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset

We built this dataset by combining examples from several sources:

WikiSQL
BIRD
Spider
Synthetic SQL samples

This dataset has been cleaned and filtered by:

Removing DDL/DML examples (INSERT, UPDATE, DELETE, etc.)
De-duplicating examples based on hashing semantics of SQL and queries
Filtering only SELECT-style analytical queries
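The cleaning steps listed in this description can be sketched roughly as follows. This is a minimal illustration only, not the dataset author's actual pipeline: the record fields `question` and `sql`, the helper names, and the choice of whitespace/case normalization followed by a SHA-256 hash as a stand-in for "hashing semantics" are all assumptions made for the example.

```python
import hashlib
import re

# Hypothetical rule: treat a statement as "SELECT-style" if it begins with
# SELECT or WITH. Some dialects allow WITH to prefix DML, so a real pipeline
# would likely need a proper SQL parser here.
_SELECT_START = re.compile(r"^\s*(select|with)\b", re.IGNORECASE)


def normalize(text: str) -> str:
    """Collapse whitespace, drop a trailing semicolon, and lowercase.

    This is a crude approximation of semantic equivalence: it also lowercases
    string literals, which a careful implementation would preserve.
    """
    collapsed = re.sub(r"\s+", " ", text).strip()
    return collapsed.rstrip(";").strip().lower()


def is_select_query(sql: str) -> bool:
    """Return True for SELECT-style queries, False for DDL/DML and empty input."""
    return bool(_SELECT_START.match(sql))


def example_key(question: str, sql: str) -> str:
    """Hash the normalized question and SQL together to get a de-duplication key."""
    digest = hashlib.sha256()
    digest.update(normalize(question).encode("utf-8"))
    digest.update(b"\x00")  # separator so the two fields cannot run together
    digest.update(normalize(sql).encode("utf-8"))
    return digest.hexdigest()


def clean(examples):
    """Drop non-SELECT statements, then keep the first copy of each duplicate."""
    seen = set()
    kept = []
    for ex in examples:
        if not is_select_query(ex["sql"]):
            continue
        key = example_key(ex["question"], ex["sql"])
        if key in seen:
            continue
        seen.add(key)
        kept.append(ex)
    return kept
```

Calling `clean` on a list of `{"question": ..., "sql": ...}` records returns the surviving records in their original order. Pairs that differ only in casing, spacing, or a trailing semicolon collapse to a single entry under this normalization.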
Text2SQL_Workflow_Trace
huggingface.co
Updated May 12, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
You Peng (2025). Text2SQL_Workflow_Trace [Dataset]. https://huggingface.co/datasets/fredpeng/Text2SQL_Workflow_Trace
Explore at:
Dataset updated
May 12, 2025
Authors
You Peng
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Text2SQL Workflow Trace

Dataset Description

This dataset contains workflow traces for Text-to-SQL tasks, capturing the intermediate steps of translating natural language queries to executable SQL. It was used as the input trace for the research presented in the paper "HEXGEN-TEXT2SQL: Optimizing LLM Inference Request Scheduling for Agentic Text-to-SQL Workflow" (arXiv:2505.05286). The end-to-end Text-to-SQL queries collected in the dataset are from BIRD bench, and the trace… See the full description on the dataset page: https://huggingface.co/datasets/fredpeng/Text2SQL_Workflow_Trace.
spider
huggingface.co
opendatalab.com
Updated Dec 9, 2021
Cite
XLang NLP Lab (2021). spider [Dataset]. https://huggingface.co/datasets/xlangai/spider
Explore at:
Croissant
Dataset updated
Dec 9, 2021
Dataset authored and provided by
XLang NLP Lab
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Dataset Card for Spider

Dataset Summary

Spider is a large-scale, complex, cross-domain semantic parsing and text-to-SQL dataset annotated by 11 Yale students. The goal of the Spider challenge is to develop natural language interfaces to cross-domain databases.

Supported Tasks and Leaderboards

The leaderboard can be seen at https://yale-lily.github.io/spider

Languages

The text in the dataset is in English.

Dataset Structure

Data… See the full description on the dataset page: https://huggingface.co/datasets/xlangai/spider.