2 datasets found
  1. chatgpt-paraphrases

    • huggingface.co
    Updated Mar 17, 2023
    Cite
    Humarin (2023). chatgpt-paraphrases [Dataset]. https://huggingface.co/datasets/humarin/chatgpt-paraphrases
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 17, 2023
    Dataset authored and provided by
    Humarin
    License

    https://choosealicense.com/licenses/openrail/

    Description

    This is a dataset of paraphrases created by ChatGPT. A model based on this dataset is available: model

    We used this prompt to generate paraphrases:

    Generate 5 similar paraphrases for this question, show it like a numbered list without commentaries: {text}

    This dataset is based on the Quora paraphrase questions, texts from SQuAD 2.0, and the CNN news dataset. We generated 5 paraphrases for each sample; in total, the dataset has about 420k rows. You can make 30 rows from a row from… See the full description on the dataset page: https://huggingface.co/datasets/humarin/chatgpt-paraphrases.
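
    The prompt quoted in the description can be filled in and its numbered-list reply parsed roughly as follows. This is a minimal sketch, not the dataset authors' code: the helper names (build_prompt, parse_numbered_list, ordered_pairs) are hypothetical, and the reply format is assumed to be one "N. text" item per line. The ordered_pairs helper illustrates one possible reading of the "30 rows" remark (6 texts give 6 × 5 = 30 ordered pairs); the truncated description does not confirm that this is what the authors mean.

```python
import re
from itertools import permutations

# Prompt template as quoted in the dataset description; {text} is the placeholder.
PROMPT_TEMPLATE = (
    "Generate 5 similar paraphrases for this question, "
    "show it like a numbered list without commentaries: {text}"
)


def build_prompt(text: str) -> str:
    """Fill the {text} placeholder with the sentence to paraphrase."""
    return PROMPT_TEMPLATE.format(text=text)


def parse_numbered_list(response: str) -> list[str]:
    """Extract items from lines like '1. ...' or '2) ...' (assumed reply format)."""
    items = []
    for line in response.splitlines():
        match = re.match(r"\s*\d+[.)]\s*(.+)", line)
        if match:
            items.append(match.group(1).strip())
    return items


def ordered_pairs(original: str, paraphrases: list[str]) -> list[tuple[str, str]]:
    """Assumption: pair every text with every other text, in both directions."""
    texts = [original, *paraphrases]
    return list(permutations(texts, 2))


if __name__ == "__main__":
    # Hypothetical model reply, written by hand for illustration only.
    reply = "1. What's AI?\n2. Can you define AI?\n3. What does AI mean?\n4. How is AI defined?\n5. What is meant by AI?"
    paraphrases = parse_numbered_list(reply)
    print(build_prompt("What is AI?"))
    print(len(ordered_pairs("What is AI?", paraphrases)))
```

    Sending the prompt to a model is left out on purpose, since the description does not say which client or settings were used.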

  2. code_evaluation_prompts

    • huggingface.co
    Updated May 3, 2023
    Cite
    Hugging Face H4 (2023). code_evaluation_prompts [Dataset]. https://huggingface.co/datasets/HuggingFaceH4/code_evaluation_prompts
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 3, 2023
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    Hugging Face H4
    Description

    Dataset Card for H4 Code Evaluation Prompts

    This is a filtered set of prompts for evaluating code instruction models. It will contain a variety of languages and task types. Currently, we used ChatGPT (GPT-3.5-turbo) to generate these, so we encourage using them only for qualitative evaluation and not to train your models. The generation of this data is similar to something like CodeAlpaca, which you can download here, but we intend to make these tasks both a) more challenging… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceH4/code_evaluation_prompts.


chatgpt-paraphrases

humarin/chatgpt-paraphrases

Explore at:
61 scholarly articles cite this dataset (View in Google Scholar)