2 datasets found
  1. gpt4all-j-prompt-generations

    • huggingface.co
    • opendatalab.com
    Updated Apr 13, 2023
    Cite
    Nomic AI (2023). gpt4all-j-prompt-generations [Dataset]. https://huggingface.co/datasets/nomic-ai/gpt4all-j-prompt-generations
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 13, 2023
    Dataset authored and provided by
    Nomic AI
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset Card for [GPT4All-J Prompt Generations]

      Dataset Description
    

    Dataset used to train GPT4All-J and GPT4All-J-LoRA. We release several versions of datasets:

    • v1.0: The original dataset we used to finetune GPT-J on
    • v1.1-breezy: A filtered dataset where we removed all instances of "AI language model"
    • v1.2-jazzy: A filtered dataset where we also removed instances like "I'm sorry, I can't answer..." and "AI language model"
    • v1.3-groovy: The v1.2 dataset with ShareGPT and Dolly…

    See the full description on the dataset page: https://huggingface.co/datasets/nomic-ai/gpt4all-j-prompt-generations.
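
The phrase-based filtering described for v1.1-breezy and v1.2-jazzy can be illustrated with a minimal sketch. This is not Nomic AI's actual pipeline: the field name `response`, the case-insensitive substring matching, the helper `drop_matching`, and the sample records are all assumptions made for illustration. Only the two phrases come from the description above.

```python
# Phrases quoted in the dataset description; the tuple names are hypothetical.
BREEZY_PHRASES = ("AI language model",)
JAZZY_PHRASES = BREEZY_PHRASES + ("I'm sorry, I can't answer",)


def drop_matching(records, phrases, field="response"):
    """Return the records whose `field` contains none of `phrases`.

    Matching is a case-insensitive substring test (an assumption; the
    original filtering rules are not given in the description).
    """
    lowered = [p.lower() for p in phrases]
    return [
        r for r in records
        if not any(p in r[field].lower() for p in lowered)
    ]


# Made-up sample records, not taken from the dataset.
records = [
    {"prompt": "a", "response": "As an AI language model, I cannot do that."},
    {"prompt": "b", "response": "I'm sorry, I can't answer that question."},
    {"prompt": "c", "response": "Paris is the capital of France."},
]

breezy_like = drop_matching(records, BREEZY_PHRASES)
jazzy_like = drop_matching(records, JAZZY_PHRASES)
```

Under these assumptions, the jazzy-style filter is strictly stricter than the breezy-style one, since it adds a phrase to the same exclusion list.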

  2. Dans-MemoryCore-CoreCurriculum-Small

    • huggingface.co
    Updated Dec 24, 2024
    Cite
    PocketDoc, Dans-MemoryCore-CoreCurriculum-Small [Dataset]. https://huggingface.co/datasets/PocketDoc/Dans-MemoryCore-CoreCurriculum-Small
    Dataset updated
    Dec 24, 2024
    Authors
    PocketDoc
    License

    AGPL-3.0 (https://choosealicense.com/licenses/agpl-3.0/)

    Description

    Dan's Memory Core: Core Curriculum Small

      Broad strokes
    

    This dataset aims to provide a foundation of knowledge common to a number of fields and areas of study. The question-answer pairs were generated using a RAG implementation and a curated selection of source material. Ideally, this will be the first in a series of datasets that will cover a wide range of topics.

      Nomic Atlas Visualization
    

    Cluster visualization for the dataset available here.

      Topics… See the full description on the dataset page: https://huggingface.co/datasets/PocketDoc/Dans-MemoryCore-CoreCurriculum-Small.
    


gpt4all-j-prompt-generations

nomic-ai/gpt4all-j-prompt-generations

5 scholarly articles cite this dataset (View in Google Scholar)
