15 datasets found
  1. text2cypher-gpt4o-clean

    • huggingface.co
    Updated May 23, 2024
    Cite
    Tomaž Bratanič (2024). text2cypher-gpt4o-clean [Dataset]. https://huggingface.co/datasets/tomasonjo/text2cypher-gpt4o-clean
    Explore at: Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 23, 2024
    Authors
    Tomaž Bratanič
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Synthetic dataset created with GPT-4o

    Synthetic text2cypher dataset covering 16 different graph schemas. Questions were generated using GPT-4-turbo, and the corresponding Cypher statements were generated with gpt-4o using Chain of Thought. Only questions that return results when queried against the database are included. For more information, visit: https://github.com/neo4j-labs/text2cypher/tree/main/datasets/synthetic_gpt4o_demodbs The dataset is available as train.csv. Columns are the following:… See the full description on the dataset page: https://huggingface.co/datasets/tomasonjo/text2cypher-gpt4o-clean.
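
    Because the column list is truncated in this listing, one cautious way to inspect the file is to read the header from a locally downloaded copy of train.csv instead of assuming column names. The sketch below uses only the Python standard library; the helper name peek_csv and the local path are hypothetical.

```python
import csv
from itertools import islice


def peek_csv(path, n=3):
    """Return (column_names, first_n_rows) for a CSV file with a header row."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # Read at most n data rows; each row is a dict keyed by column name.
        rows = list(islice(reader, n))
        # fieldnames is taken from the header line of the file.
        return reader.fieldnames, rows
```

    For example, columns, rows = peek_csv("train.csv") would list whatever columns the downloaded file actually contains, assuming it has been saved under that name in the working directory.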

  2. synthetic-text2cypher-gpt4turbo

    • huggingface.co
    Updated May 2, 2024
    Cite
    Tomaž Bratanič (2024). synthetic-text2cypher-gpt4turbo [Dataset]. https://huggingface.co/datasets/tomasonjo/synthetic-text2cypher-gpt4turbo
    Explore at: Croissant
    Dataset updated
    May 2, 2024
    Authors
    Tomaž Bratanič
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Synthetic dataset created with GPT-4-Turbo

    Synthetic dataset of text2cypher over 16 different graph schemas. Both questions and Cypher queries were generated using GPT-4-turbo. The demo database is available at:

    URI: neo4j+s://demo.neo4jlabs.com
    username: name of the database, for example 'movies'
    password: name of the database, for example 'movies'
    database: name of the database, for example 'movies'

    Notebooks:

    generate_text2cypher_questions.ipynb: Generate questions and prepare… See the full description on the dataset page: https://huggingface.co/datasets/tomasonjo/synthetic-text2cypher-gpt4turbo.
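
    The connection details quoted above (username, password, and database all equal to the database name) can be captured in a small helper. This is a minimal sketch, not code from the dataset authors: the function names demo_db_config and run_query are hypothetical, and run_query assumes the official neo4j Python driver (version 5.x or later, installed with pip install neo4j) plus network access to the demo server.

```python
def demo_db_config(database: str) -> dict:
    """Build connection settings for one demo database, e.g. 'movies'.

    Per the dataset description, username, password, and database name
    are all the name of the database itself.
    """
    return {
        "uri": "neo4j+s://demo.neo4jlabs.com",
        "username": database,
        "password": database,
        "database": database,
    }


def run_query(database: str, cypher: str) -> list:
    """Run one Cypher query against the demo server and return rows as dicts."""
    # Imported lazily so demo_db_config works without the driver installed.
    from neo4j import GraphDatabase

    cfg = demo_db_config(database)
    auth = (cfg["username"], cfg["password"])
    with GraphDatabase.driver(cfg["uri"], auth=auth) as driver:
        records, _summary, _keys = driver.execute_query(
            cypher, database_=cfg["database"]
        )
        return [record.data() for record in records]
```

    A call such as run_query("movies", "MATCH (n) RETURN count(n) AS nodes") would then execute against the 'movies' demo database, provided the server is reachable.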

  3. text2cypher-recommendations-ha-sample

    • huggingface.co
    Updated May 22, 2024
    Cite
    Persistent (2024). text2cypher-recommendations-ha-sample [Dataset]. https://huggingface.co/datasets/persistent/text2cypher-recommendations-ha-sample
    Dataset updated
    May 22, 2024
    Dataset authored and provided by
    Persistent
    Description

    Dataset Card for "text2cypher-recommendations-ha-sample"

    This is human-annotated sample data for Text2Cypher generation on the recommendations (Movie) database.

  4. text2cypher

    • huggingface.co
    + more versions
    Cite
    Eva Papaspyrou, text2cypher [Dataset]. https://huggingface.co/datasets/evagelnjy/text2cypher
    Explore at: Croissant
    Authors
    Eva Papaspyrou
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    evagelnjy/text2cypher dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. text2cypher-recommendations-gpt4o-sft-0.5k

    • huggingface.co
    Updated Jul 29, 2024
    Cite
    Persistent (2024). text2cypher-recommendations-gpt4o-sft-0.5k [Dataset]. https://huggingface.co/datasets/persistent/text2cypher-recommendations-gpt4o-sft-0.5k
    Dataset updated
    Jul 29, 2024
    Dataset authored and provided by
    Persistent
    Description

    Dataset Card for "text2cypher-recommendations-gpt4o-sft-0.5k"

    More Information needed

  6. text2cypher-recommendations-test-sample

    • huggingface.co
    Updated Aug 8, 2024
    Cite
    Persistent (2024). text2cypher-recommendations-test-sample [Dataset]. https://huggingface.co/datasets/persistent/text2cypher-recommendations-test-sample
    Explore at: Croissant
    Dataset updated
    Aug 8, 2024
    Dataset authored and provided by
    Persistent
    Description

    persistent/text2cypher-recommendations-test-sample dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. text2cypher-2025v1

    • huggingface.co
    Updated Apr 2, 2025
    Cite
    Neo4j (2025). text2cypher-2025v1 [Dataset]. https://huggingface.co/datasets/neo4j/text2cypher-2025v1
    Dataset updated
    Apr 2, 2025
    Dataset authored and provided by
    Neo4j
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    neo4j/text2cypher-2025v1 dataset hosted on Hugging Face and contributed by the HF Datasets community

  8. text2cypher-sft-0.1k

    • huggingface.co
    Updated May 15, 2024
    + more versions
    Cite
    Persistent (2024). text2cypher-sft-0.1k [Dataset]. https://huggingface.co/datasets/persistent/text2cypher-sft-0.1k
    Dataset updated
    May 15, 2024
    Dataset authored and provided by
    Persistent
    Description

    persistent/text2cypher-sft-0.1k dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. text2cypher-recommendations-sft-0.375k

    • huggingface.co
    Updated Aug 9, 2024
    + more versions
    Cite
    Persistent (2024). text2cypher-recommendations-sft-0.375k [Dataset]. https://huggingface.co/datasets/persistent/text2cypher-recommendations-sft-0.375k
    Dataset updated
    Aug 9, 2024
    Dataset authored and provided by
    Persistent
    Description

    persistent/text2cypher-recommendations-sft-0.375k dataset hosted on Hugging Face and contributed by the HF Datasets community

  10. neo4j-text2cypher-2024v1

    • huggingface.co
    Updated Nov 11, 2024
    Cite
    Max (2024). neo4j-text2cypher-2024v1 [Dataset]. https://huggingface.co/datasets/maxromanovsky/neo4j-text2cypher-2024v1
    Dataset updated
    Nov 11, 2024
    Authors
    Max
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    maxromanovsky/neo4j-text2cypher-2024v1 dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. text2cypher-small

    • huggingface.co
    Cite
    Gurveer Singh Virk, text2cypher-small [Dataset]. https://huggingface.co/datasets/Gurveer05/text2cypher-small
    Authors
    Gurveer Singh Virk
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Gurveer05/text2cypher-small dataset hosted on Hugging Face and contributed by the HF Datasets community

  12. translated_text2cypher24_trainset_sampled

    • huggingface.co
    Updated Aug 27, 2025
    Cite
    MGO (2025). translated_text2cypher24_trainset_sampled [Dataset]. https://huggingface.co/datasets/mgoNeo4j/translated_text2cypher24_trainset_sampled
    Dataset updated
    Aug 27, 2025
    Authors
    MGO
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Translated Text2Cypher'24 Training Set - Sampled & Multilingual

    This dataset provides a sampled and translated training set based on the Neo4j Text2Cypher '24 dataset. It is designed to support research on multilingual natural language to Cypher query generation. We offer two versions of the training set:

    1. Multilingual Version (multilang)

    Total examples: ~36,000
    Languages: English (en), Spanish (es), Turkish (tr)
    Samples per language: ~12,000
    Translation… See the full description on the dataset page: https://huggingface.co/datasets/mgoNeo4j/translated_text2cypher24_trainset_sampled.

  13. text2cypher-2024v1-copy

    • huggingface.co
    Updated Nov 11, 2024
    + more versions
    Cite
    Roger yau (2024). text2cypher-2024v1-copy [Dataset]. https://huggingface.co/datasets/chnug/text2cypher-2024v1-copy
    Dataset updated
    Nov 11, 2024
    Authors
    Roger yau
    Description

    chnug/text2cypher-2024v1-copy dataset hosted on Hugging Face and contributed by the HF Datasets community

  14. translated_text2cypher24_testset

    • huggingface.co
    Updated Jun 1, 2025
    Cite
    MGO (2025). translated_text2cypher24_testset [Dataset]. https://huggingface.co/datasets/mgoNeo4j/translated_text2cypher24_testset
    Dataset updated
    Jun 1, 2025
    Authors
    MGO
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Translated Text2Cypher'24 Test Set

    This dataset provides Spanish (es) and Turkish (tr) translations of the test split of the Neo4j Text2Cypher'24 dataset.

    Overview

    Only the question field (the user's natural language input) is translated. The original questions are in English (en), and translated versions are available in Spanish (es) and Turkish (tr). All questions across languages are paired with the same Cypher query for consistent evaluation.

    Usage Example

    from… See the full description on the dataset page: https://huggingface.co/datasets/mgoNeo4j/translated_text2cypher24_testset.

  15. SynthCypher

    • huggingface.co
    Updated Jun 20, 2025
    Cite
    ServiceNow-AI (2025). SynthCypher [Dataset]. https://huggingface.co/datasets/ServiceNow-AI/SynthCypher
    Dataset updated
    Jun 20, 2025
    Dataset provided by
    ServiceNow (http://servicenow.com/)
    Authors
    ServiceNow-AI
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    SynthCypher Dataset Repository

    Overview

    This repository hosts SynthCypher, a novel synthetic dataset designed to bridge the gap in Text-to-Cypher (Text2Cypher) tasks. SynthCypher leverages state-of-the-art large language models (LLMs) to automatically generate and validate high-quality data for training and evaluating models that convert natural language questions into Cypher queries for graph databases like Neo4j. Our dataset and pipeline contribute significantly to… See the full description on the dataset page: https://huggingface.co/datasets/ServiceNow-AI/SynthCypher.


text2cypher-gpt4o-clean

tomasonjo/text2cypher-gpt4o-clean

Clean text2cypher dataset generated with gpt-4o on 16 different graph schemas

2 scholarly articles cite this dataset (View in Google Scholar)
Explore at: Croissant
Dataset updated
May 23, 2024
Authors
Tomaž Bratanič
License

Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically

