18 datasets found

nl2sql-dataset
huggingface.co
Updated Aug 19, 2024
Cite
Rajpreet Singh (2024). nl2sql-dataset [Dataset]. https://huggingface.co/datasets/Rajpreet2206/nl2sql-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Aug 19, 2024
Authors
Rajpreet Singh
Description
Rajpreet2206/nl2sql-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
NL2SQL_Query_Dataset
kaggle.com
zip
Updated Dec 22, 2023
Cite
Suresh Muthusamy P (2023). NL2SQL_Query_Dataset [Dataset]. https://www.kaggle.com/datasets/sureshmuthusamy001p/nl2sql-query-dataset
Explore at:
zip (231382 bytes)
Dataset updated
Dec 22, 2023
Authors
Suresh Muthusamy P
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
This dataset is designed for training models to convert natural language prompts into SQL queries, specifically focusing on SELECT statements. The dataset comprises 14,815 examples where each prompt is associated with the corresponding SQL query that would retrieve the desired information from a specific table.

Columns:
Prompt: The natural language text representing a query request.
SQL Query: The corresponding SQL query generated to fulfill the request.
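As a rough illustration of the two-column layout described above, the following Python sketch parses prompt/query pairs and checks that every query is a SELECT statement. It assumes a CSV file whose header names are exactly "Prompt" and "SQL Query"; the sample rows and the helper names `load_pairs` and `is_select` are hypothetical and not taken from the dataset.

```python
import csv
import io

# Hypothetical sample rows; the real file name and contents are not given in the listing.
SAMPLE_CSV = """Prompt,SQL Query
List all employee names,SELECT name FROM employees
Count the orders placed in 2023,SELECT COUNT(*) FROM orders WHERE year = 2023
"""

def load_pairs(text):
    """Parse CSV text into (prompt, sql) tuples using the two described columns."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["Prompt"], row["SQL Query"]) for row in reader]

def is_select(sql):
    """Return True when the query starts with SELECT (case-insensitive)."""
    return sql.lstrip().upper().startswith("SELECT")

pairs = load_pairs(SAMPLE_CSV)
# Collect any pair whose query is not a SELECT statement.
non_select = [pair for pair in pairs if not is_select(pair[1])]
```

With the real file, `load_pairs` would be given the contents of the downloaded CSV instead of `SAMPLE_CSV`.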
nl2sql
huggingface.co
Cite
Shritama Sengupta, nl2sql [Dataset]. https://huggingface.co/datasets/Shritama/nl2sql
Explore at:
Croissant
Authors
Shritama Sengupta
Description
Shritama/nl2sql dataset hosted on Hugging Face and contributed by the HF Datasets community
Pawnshop NL2SQL Dataset Multilingual Queries
kaggle.com
zip
Updated Nov 23, 2025
Cite
KingPawnUSA (2025). Pawnshop NL2SQL Dataset Multilingual Queries [Dataset]. https://www.kaggle.com/datasets/kingpawnusa/pawnshop-nl2sql-dataset-multilingual-queries
Explore at:
zip (3231 bytes)
Dataset updated
Nov 23, 2025
Authors
KingPawnUSA
License
CC0 1.0 (Public Domain): https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset

This dataset was created by KingPawnUSA

Released under CC0: Public Domain

Contents
nl2sql
huggingface.co
Cite
Matthew Kuncheria, nl2sql [Dataset]. https://huggingface.co/datasets/NormalMatt/nl2sql
Explore at:
Croissant
Authors
Matthew Kuncheria
Description
NormalMatt/nl2sql dataset hosted on Hugging Face and contributed by the HF Datasets community
NL2SQL for BI Dataset
figshare.com
zip
Updated Dec 8, 2023
Cite
Bora Caglayan (2023). NL2SQL for BI Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.24771738.v2
Explore at:
zip
Unique identifier
https://doi.org/10.6084/m9.figshare.24771738.v2
Dataset updated
Dec 8, 2023
Dataset provided by
figshare (http://figshare.com/)
Authors
Bora Caglayan
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
NL2SQL for BI dataset
nl2sql
huggingface.co
+ more versions
Cite
Tae-Hyoung Choi, nl2sql [Dataset]. https://huggingface.co/datasets/selmoch/nl2sql
Authors
Tae-Hyoung Choi
Description
selmoch/nl2sql dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset and Sample Evaluation Script
figshare.com
Updated Dec 8, 2023
Cite
Bora Caglayan (2023). Dataset and Sample Evaluation Script [Dataset]. http://doi.org/10.6084/m9.figshare.24771747.v1
Unique identifier
https://doi.org/10.6084/m9.figshare.24771747.v1
Dataset updated
Dec 8, 2023
Dataset provided by
figshare (http://figshare.com/)
Authors
Bora Caglayan
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset accompanies our current submission to the MSR 2024 Data and Tool Showcase Track.
nl2sql
huggingface.co
Updated Sep 10, 2024
+ more versions
Cite
Mahesh (2024). nl2sql [Dataset]. https://huggingface.co/datasets/Mahesh929/nl2sql
Dataset updated
Sep 10, 2024
Authors
Mahesh
Description
Mahesh929/nl2sql dataset hosted on Hugging Face and contributed by the HF Datasets community
1000_samples_nl2sql_dataset
kaggle.com
zip
Updated Feb 21, 2025
Cite
Himanshu Nayal (2025). 1000_samples_nl2sql_dataset [Dataset]. https://www.kaggle.com/datasets/himanshunayal/1000-samples-nl2sql-dataset
Explore at:
zip (112965 bytes)
Dataset updated
Feb 21, 2025
Authors
Himanshu Nayal
Description
Dataset

This dataset was created by Himanshu Nayal

Contents
NL2SQL
huggingface.co
Cite
Manohar Palanisamy, NL2SQL [Dataset]. https://huggingface.co/datasets/ManoharPalanisamy/NL2SQL
Explore at:
Croissant
Authors
Manohar Palanisamy
Description
ManoharPalanisamy/NL2SQL dataset hosted on Hugging Face and contributed by the HF Datasets community
nl2sql-deduplicated
huggingface.co
Updated Dec 2, 2025
Cite
AI (2025). nl2sql-deduplicated [Dataset]. https://huggingface.co/datasets/AsadIsmail/nl2sql-deduplicated
Dataset updated
Dec 2, 2025
Authors
AI
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
NL2SQL Deduplicated Training Dataset

A curated and deduplicated Text-to-SQL training dataset with 683,015 unique examples from 4 high-quality sources.

📊 Dataset Summary

Total Examples: 683,015 unique question-SQL pairs
Sources: Spider, SQaLe, Gretel Synthetic, SQL-Create-Context
Deduplication Strategy: Input-only (question-based) with conflict resolution via quality priority
Conflicts Resolved: 2,238 cases where same question had different SQL
SQL Dialect: Standard SQL… See the full description on the dataset page: https://huggingface.co/datasets/AsadIsmail/nl2sql-deduplicated.
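The input-only deduplication with priority-based conflict resolution mentioned in the summary could look roughly like the Python sketch below. This is not the dataset author's code: the priority ranking in `SOURCE_PRIORITY`, the lowercase-and-collapse-whitespace normalization, the record field names, and the sample rows are all assumptions made for illustration.

```python
# Assumed priority order (lower number wins); the listing does not state the actual ranking.
SOURCE_PRIORITY = {"spider": 0, "sqale": 1, "gretel": 2, "sql_create_context": 3}

def normalize(question):
    """Assumed normalization: lowercase and collapse runs of whitespace."""
    return " ".join(question.lower().split())

def deduplicate(examples):
    """Keep one example per normalized question, preferring higher-priority sources.

    Returns the kept examples and the number of duplicates whose SQL differed
    from the example kept at the time they were seen.
    """
    best = {}
    conflicts = 0
    for ex in examples:
        key = normalize(ex["question"])
        kept = best.get(key)
        if kept is None:
            best[key] = ex
            continue
        if kept["sql"] != ex["sql"]:
            conflicts += 1
        if SOURCE_PRIORITY[ex["source"]] < SOURCE_PRIORITY[kept["source"]]:
            best[key] = ex
    return list(best.values()), conflicts

# Hypothetical rows: the first two share a question but disagree on the SQL.
examples = [
    {"question": "How many singers are there?",
     "sql": "SELECT COUNT(*) FROM singer_table", "source": "gretel"},
    {"question": "how many singers  are there?",
     "sql": "SELECT count(*) FROM singer", "source": "spider"},
    {"question": "List all stadium names",
     "sql": "SELECT name FROM stadium", "source": "sqale"},
]
unique, conflicts = deduplicate(examples)
```

Keying on the question alone means two examples with the same question and different SQL can never both survive, which is why a tie-breaking rule such as source priority is needed.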
nl2sql-reasoning-trace
huggingface.co
Updated Feb 17, 2025
Cite
Simone PAPICCHIO (2025). nl2sql-reasoning-trace [Dataset]. https://huggingface.co/datasets/simone-papicchio/nl2sql-reasoning-trace
Dataset updated
Feb 17, 2025
Authors
Simone PAPICCHIO
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
simone-papicchio/nl2sql-reasoning-trace dataset hosted on Hugging Face and contributed by the HF Datasets community
nl2sql_baseline-python3
kaggle.com
zip
Updated Jun 26, 2019
Cite
Zuo Zhaorui (2019). nl2sql_baseline-python3 [Dataset]. https://www.kaggle.com/zuozhaorui/nl2sql-baselinepython3
Explore at:
zip (121043208 bytes)
Dataset updated
Jun 26, 2019
Authors
Zuo Zhaorui
Description
Dataset

This dataset was created by Zuo Zhaorui

Contents
nl2sql_food_fieldname
huggingface.co
Updated Jul 28, 2023
Cite
sirabhop saengumyoun (2023). nl2sql_food_fieldname [Dataset]. https://huggingface.co/datasets/sirabhop/nl2sql_food_fieldname
Explore at:
Croissant
Dataset updated
Jul 28, 2023
Authors
sirabhop saengumyoun
Description
sirabhop/nl2sql_food_fieldname dataset hosted on Hugging Face and contributed by the HF Datasets community
SynSQL-Complex-5K
huggingface.co
Updated Sep 29, 2025
Cite
Peixian Ma (2025). SynSQL-Complex-5K [Dataset]. https://huggingface.co/datasets/MPX0222forHF/SynSQL-Complex-5K
Dataset updated
Sep 29, 2025
Authors
Peixian Ma
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning

Peixian Ma (1,2), Xialie Zhuang (1,3), Chengjin Xu (1,4), Xuhui Jiang (1,4), Ran Chen (1), Jian Guo (1)
(1) IDEA Research, International Digital Economy Academy
(2) The Hong Kong University of Science and Technology (Guangzhou)
(3) University of Chinese Academy of Science
(4) DataArc Tech Ltd.

📖 Overview

Natural Language to SQL (NL2SQL) enables intuitive interactions… See the full description on the dataset page: https://huggingface.co/datasets/MPX0222forHF/SynSQL-Complex-5K.
spider
huggingface.co
opendatalab.com
Updated Jul 8, 2024
+ more versions
Cite
XLang NLP Lab (2024). spider [Dataset]. https://huggingface.co/datasets/xlangai/spider
Explore at:
Croissant
Dataset updated
Jul 8, 2024
Dataset authored and provided by
XLang NLP Lab
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Dataset Card for Spider

Dataset Summary

Spider is a large-scale complex and cross-domain semantic parsing and text-to-SQL dataset annotated by 11 Yale students. The goal of the Spider challenge is to develop natural language interfaces to cross-domain databases.

Supported Tasks and Leaderboards

The leaderboard can be seen at https://yale-lily.github.io/spider

Languages

The text in the dataset is in English.

Dataset Structure

Data… See the full description on the dataset page: https://huggingface.co/datasets/xlangai/spider.
NL2SQL_zh
huggingface.co
Updated Jan 13, 2024
Cite
Nuo (2024). NL2SQL_zh [Dataset]. https://huggingface.co/datasets/lorinma/NL2SQL_zh
Explore at:
Croissant
Dataset updated
Jan 13, 2024
Authors
Nuo
Description
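Combines three Chinese datasets: Zhuiyi Technology's NL2SQL, Westlake University's CSpider (a Chinese translation of Spider), and Baidu's DuSQL. The data was roughly cleaned and converted to the alpaca format with the following prompt: "Assume you are a database SQL expert. Below I will give you information about a MySQL database; please generate the corresponding SQL statement for the question. The current year is 2023. The format is as follows: {'sql': SQL statement} The MySQL database structure is as follows: {table name (field names...)} where: {primary/foreign key relationships between tables} For the query: "{question}", give the corresponding SQL statement, returned in the required format, without any explanation."

DuSQL yields 25004 examples. NL2SQL yields 45919 examples; note that its table names are garbled. CSpider yields 7786 examples; note that its databases are in English while the questions are in Chinese. The final file contains 78706 examples in total. File sample: { "instruction": "假设你是一个数据库SQL专家，下面我会给出一个MySQL数据库的信息，请根据问题，帮我生成相应的SQL语句。当前时间为2023年。", "input":… See the full description on the dataset page: https://huggingface.co/datasets/lorinma/NL2SQL_zh.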
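The prompt layout described above can be sketched in Python as follows. This is only an illustration under several assumptions: the dataset's actual strings are in Chinese and are rendered in English here, the exact line breaks inside "input" are guessed, the "output" field follows the usual Alpaca convention (the sample in the listing is truncated before it), and the function name `build_record` and the example schema are hypothetical.

```python
import json

# English rendering of the instruction text; the dataset itself uses Chinese.
INSTRUCTION = (
    "Assume you are a database SQL expert. Below I will give you information "
    "about a MySQL database; please generate the corresponding SQL statement "
    "for the question. The current year is 2023."
)

def build_record(tables, foreign_keys, question, sql):
    """Assemble one alpaca-style record from a schema, a question, and its SQL."""
    # One "table(col1, col2)" line per table, following the described layout.
    schema = "\n".join(f"{name}({', '.join(cols)})" for name, cols in tables.items())
    input_text = (
        "The format is as follows: {'sql': SQL statement}\n"
        f"The MySQL database structure is as follows:\n{schema}\n"
        f"where: {'; '.join(foreign_keys)}\n"
        f'For the query: "{question}", give the corresponding SQL statement, '
        "returned in the required format, without any explanation."
    )
    # Assumption: the answer is stored as a JSON object with a single "sql" key.
    return {
        "instruction": INSTRUCTION,
        "input": input_text,
        "output": json.dumps({"sql": sql}),
    }

record = build_record(
    tables={"singer": ["id", "name"], "concert": ["id", "singer_id"]},
    foreign_keys=["concert.singer_id = singer.id"],
    question="How many singers are there?",
    sql="SELECT COUNT(*) FROM singer",
)
```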