2 datasets found
  1. openwebtext

    • huggingface.co
    • paperswithcode.com
    • +4 more
    Updated Sep 28, 2020
    Cite
    Aaron Gokaslan (2020). openwebtext [Dataset]. https://huggingface.co/datasets/Skylion007/openwebtext
    Dataset updated
    Sep 28, 2020
    Authors
    Aaron Gokaslan
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    An open-source replication of the WebText dataset from OpenAI.

  2. Data from: dummy-text

    • huggingface.co
    Updated Oct 28, 2023
    Cite
    AbdelRahim Elmadany (2023). dummy-text [Dataset]. https://huggingface.co/datasets/elmadany/dummy-text
    Dataset updated
    Oct 28, 2023
    Authors
    AbdelRahim Elmadany
    Description

    An open-source replication of the WebText dataset from OpenAI.



openwebtext

OpenWebText

Skylion007/openwebtext

7 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Sep 28, 2020
Authors
Aaron Gokaslan
License

https://choosealicense.com/licenses/cc0-1.0/

Description

An open-source replication of the WebText dataset from OpenAI.
