15 datasets found

text2cypher-gpt4o-clean
huggingface.co
Updated May 23, 2024
Cite
Tomaž Bratanič (2024). text2cypher-gpt4o-clean [Dataset]. https://huggingface.co/datasets/tomasonjo/text2cypher-gpt4o-clean
Explore at:
Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
May 23, 2024
Authors
Tomaž Bratanič
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Synthetic dataset created with GPT-4o

Synthetic dataset of text2cypher over 16 different graph schemas. Questions were generated using GPT-4-turbo, and the corresponding Cypher statements using GPT-4o with Chain of Thought. Only questions whose Cypher statements return results when run against the database are included. More information is available at https://github.com/neo4j-labs/text2cypher/tree/main/datasets/synthetic_gpt4o_demodbs and the dataset is available as train.csv. Columns are the following:… See the full description on the dataset page: https://huggingface.co/datasets/tomasonjo/text2cypher-gpt4o-clean.

synthetic-text2cypher-gpt4turbo
huggingface.co
Updated May 2, 2024
Cite
Tomaž Bratanič (2024). synthetic-text2cypher-gpt4turbo [Dataset]. https://huggingface.co/datasets/tomasonjo/synthetic-text2cypher-gpt4turbo
Explore at:
Croissant
Dataset updated
May 2, 2024
Authors
Tomaž Bratanič
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Synthetic dataset created with GPT-4-Turbo

Synthetic dataset of text2cypher over 16 different graph schemas. Both questions and Cypher queries were generated using GPT-4-turbo. The demo database is available at:

URI: neo4j+s://demo.neo4jlabs.com
username: name of the database, for example 'movies'
password: name of the database, for example 'movies'
database: name of the database, for example 'movies'

Notebooks:

generate_text2cypher_questions.ipynb: Generate questions and prepare… See the full description on the dataset page: https://huggingface.co/datasets/tomasonjo/synthetic-text2cypher-gpt4turbo.
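
The connection convention described for the demo database (username, password and database all equal to the database name) can be captured in a small helper. This is a minimal sketch, not code from the dataset card: `DemoConnection` and `demo_connection` are hypothetical names, and only the URI and the 'movies' example come from the description above.

```python
from dataclasses import dataclass

# URI quoted in the dataset description.
DEMO_URI = "neo4j+s://demo.neo4jlabs.com"


@dataclass(frozen=True)
class DemoConnection:
    """Hypothetical container for the demo database connection settings."""
    uri: str
    username: str
    password: str
    database: str


def demo_connection(database: str) -> DemoConnection:
    """Build settings for one demo database.

    Per the description, username, password and database are all the
    name of the database (for example 'movies').
    """
    if not database:
        raise ValueError("database name must be non-empty")
    return DemoConnection(
        uri=DEMO_URI,
        username=database,
        password=database,
        database=database,
    )


conn = demo_connection("movies")
print(conn)

# Assumption: with the official `neo4j` Python driver installed, these
# settings could be used roughly as follows (not executed here):
#   from neo4j import GraphDatabase
#   driver = GraphDatabase.driver(conn.uri, auth=(conn.username, conn.password))
#   with driver.session(database=conn.database) as session:
#       ...
```

Keeping the settings in one immutable object makes it easy to switch between the demo databases by changing a single name.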

text2cypher-recommendations-ha-sample
huggingface.co
Updated May 22, 2024
Cite
Persistent (2024). text2cypher-recommendations-ha-sample [Dataset]. https://huggingface.co/datasets/persistent/text2cypher-recommendations-ha-sample
Dataset updated
May 22, 2024
Dataset authored and provided by
Persistent
Description
Dataset Card for "text2cypher-recommendations-ha-sample"

This is human-annotated sample data for Text2Cypher generation on the recommendations (Movie) database.

text2cypher
huggingface.co
+ more versions
Cite
Eva Papaspyrou, text2cypher [Dataset]. https://huggingface.co/datasets/evagelnjy/text2cypher
Explore at:
Croissant
Authors
Eva Papaspyrou
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
evagelnjy/text2cypher dataset hosted on Hugging Face and contributed by the HF Datasets community

text2cypher-recommendations-gpt4o-sft-0.5k
huggingface.co
Updated Jul 29, 2024
Cite
Persistent (2024). text2cypher-recommendations-gpt4o-sft-0.5k [Dataset]. https://huggingface.co/datasets/persistent/text2cypher-recommendations-gpt4o-sft-0.5k
Dataset updated
Jul 29, 2024
Dataset authored and provided by
Persistent
Description
Dataset Card for "text2cypher-recommendations-gpt4o-sft-0.5k"

More Information needed

text2cypher-recommendations-test-sample
huggingface.co
Updated Aug 8, 2024
Cite
Persistent (2024). text2cypher-recommendations-test-sample [Dataset]. https://huggingface.co/datasets/persistent/text2cypher-recommendations-test-sample
Explore at:
Croissant
Dataset updated
Aug 8, 2024
Dataset authored and provided by
Persistent
Description
persistent/text2cypher-recommendations-test-sample dataset hosted on Hugging Face and contributed by the HF Datasets community

text2cypher-2025v1
huggingface.co
Updated Apr 2, 2025
Cite
Neo4j (2025). text2cypher-2025v1 [Dataset]. https://huggingface.co/datasets/neo4j/text2cypher-2025v1
Dataset updated
Apr 2, 2025
Dataset authored and provided by
Neo4j
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
neo4j/text2cypher-2025v1 dataset hosted on Hugging Face and contributed by the HF Datasets community

text2cypher-sft-0.1k
huggingface.co
Updated May 15, 2024
+ more versions
Cite
Persistent (2024). text2cypher-sft-0.1k [Dataset]. https://huggingface.co/datasets/persistent/text2cypher-sft-0.1k
Dataset updated
May 15, 2024
Dataset authored and provided by
Persistent
Description
persistent/text2cypher-sft-0.1k dataset hosted on Hugging Face and contributed by the HF Datasets community

text2cypher-recommendations-sft-0.375k
huggingface.co
Updated Aug 9, 2024
+ more versions
Cite
Persistent (2024). text2cypher-recommendations-sft-0.375k [Dataset]. https://huggingface.co/datasets/persistent/text2cypher-recommendations-sft-0.375k
Dataset updated
Aug 9, 2024
Dataset authored and provided by
Persistent
Description
persistent/text2cypher-recommendations-sft-0.375k dataset hosted on Hugging Face and contributed by the HF Datasets community

neo4j-text2cypher-2024v1
huggingface.co
Updated Nov 11, 2024
Cite
Max (2024). neo4j-text2cypher-2024v1 [Dataset]. https://huggingface.co/datasets/maxromanovsky/neo4j-text2cypher-2024v1
Dataset updated
Nov 11, 2024
Authors
Max
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
maxromanovsky/neo4j-text2cypher-2024v1 dataset hosted on Hugging Face and contributed by the HF Datasets community

text2cypher-small
huggingface.co
Cite
Gurveer Singh Virk, text2cypher-small [Dataset]. https://huggingface.co/datasets/Gurveer05/text2cypher-small
Authors
Gurveer Singh Virk
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Gurveer05/text2cypher-small dataset hosted on Hugging Face and contributed by the HF Datasets community

translated_text2cypher24_trainset_sampled
huggingface.co
Updated Aug 27, 2025
Cite
MGO (2025). translated_text2cypher24_trainset_sampled [Dataset]. https://huggingface.co/datasets/mgoNeo4j/translated_text2cypher24_trainset_sampled
Dataset updated
Aug 27, 2025
Authors
MGO
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Translated Text2Cypher'24 Training Set - Sampled & Multilingual

This dataset provides a sampled and translated training set based on the Neo4j Text2Cypher '24 dataset. It is designed to support research on multilingual natural language to Cypher query generation. We offer two versions of the training set:

1. Multilingual Version (multilang)

Total examples: ~36,000
Languages: English (en), Spanish (es), Turkish (tr)
Samples per language: ~12,000
Translation… See the full description on the dataset page: https://huggingface.co/datasets/mgoNeo4j/translated_text2cypher24_trainset_sampled.

text2cypher-2024v1-copy
huggingface.co
Updated Nov 11, 2024
+ more versions
Cite
Roger yau (2024). text2cypher-2024v1-copy [Dataset]. https://huggingface.co/datasets/chnug/text2cypher-2024v1-copy
Dataset updated
Nov 11, 2024
Authors
Roger yau
Description
chnug/text2cypher-2024v1-copy dataset hosted on Hugging Face and contributed by the HF Datasets community

translated_text2cypher24_testset
huggingface.co
Updated Jun 1, 2025
Cite
MGO (2025). translated_text2cypher24_testset [Dataset]. https://huggingface.co/datasets/mgoNeo4j/translated_text2cypher24_testset
Dataset updated
Jun 1, 2025
Authors
MGO
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Translated Text2Cypher'24 Test Set

This dataset provides Spanish (es) and Turkish (tr) translations of the test split of the Neo4j Text2Cypher'24 dataset.

Overview

Only the question field (the user's natural language input) is translated. The original questions were in English (en), and translated versions are available in Spanish (es) and Turkish (tr). All questions across languages are paired with the same Cypher query for consistent evaluation.

Usage Example

from… See the full description on the dataset page: https://huggingface.co/datasets/mgoNeo4j/translated_text2cypher24_testset.
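
The pairing described in the overview (questions in en, es and tr sharing one Cypher query) can be checked with a short script. This is a sketch under stated assumptions and is not the truncated usage example from the dataset card: the field names `language`, `question` and `cypher` are hypothetical, and the toy rows are invented purely for illustration.

```python
from collections import defaultdict

# Language codes named in the dataset description.
EXPECTED_LANGUAGES = {"en", "es", "tr"}


def group_by_cypher(rows):
    """Map each Cypher query to a {language: question} dictionary."""
    grouped = defaultdict(dict)
    for row in rows:
        grouped[row["cypher"]][row["language"]] = row["question"]
    return dict(grouped)


def missing_languages(rows, expected=EXPECTED_LANGUAGES):
    """Return {cypher: sorted missing language codes} for incomplete groups."""
    gaps = {}
    for cypher, by_language in group_by_cypher(rows).items():
        missing = expected - set(by_language)
        if missing:
            gaps[cypher] = sorted(missing)
    return gaps


# Invented toy rows (NOT taken from the dataset); field names are assumptions.
toy_rows = [
    {"language": "en", "question": "How many movies are there?",
     "cypher": "MATCH (m:Movie) RETURN count(m)"},
    {"language": "es", "question": "¿Cuántas películas hay?",
     "cypher": "MATCH (m:Movie) RETURN count(m)"},
    {"language": "tr", "question": "Kaç film var?",
     "cypher": "MATCH (m:Movie) RETURN count(m)"},
    {"language": "en", "question": "List all people.",
     "cypher": "MATCH (p:Person) RETURN p.name"},
]

print(missing_languages(toy_rows))
```

Grouping by the Cypher string relies on the description's statement that translations share the same query; if the real data also carries a stable identifier, grouping on that would be more robust.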
SynthCypher
huggingface.co
Updated Jun 20, 2025
Cite
ServiceNow-AI (2025). SynthCypher [Dataset]. https://huggingface.co/datasets/ServiceNow-AI/SynthCypher
Dataset updated
Jun 20, 2025
Dataset provided by
ServiceNow (http://servicenow.com/)
Authors
ServiceNow-AI
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
Description
SynthCypher Dataset Repository

Overview

This repository hosts SynthCypher, a novel synthetic dataset designed to bridge the gap in Text-to-Cypher (Text2Cypher) tasks. SynthCypher leverages state-of-the-art large language models (LLMs) to automatically generate and validate high-quality data for training and evaluating models that convert natural language questions into Cypher queries for graph databases like Neo4j. Our dataset and pipeline contribute significantly to… See the full description on the dataset page: https://huggingface.co/datasets/ServiceNow-AI/SynthCypher.

text2cypher-gpt4o-clean (tomasonjo/text2cypher-gpt4o-clean) is cited by 2 scholarly articles (per Google Scholar).