Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for [GPT4All-J Prompt Generations]
Dataset Description
Dataset used to train GPT4All-J and GPT4All-J-LoRA. We release several versions of the dataset:
v1.0: The original dataset we used to finetune GPT-J on
v1.1-breezy: A filtered dataset where we removed all instances of "AI language model"
v1.2-jazzy: A filtered dataset where we also removed instances like "I'm sorry, I can't answer..." and "AI language model"
v1.3-groovy: The v1.2 dataset with ShareGPT and Dolly…
See the full description on the dataset page: https://huggingface.co/datasets/nomic-ai/gpt4all-j-prompt-generations.
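The phrase-based filtering described for v1.1-breezy and v1.2-jazzy can be illustrated with a minimal sketch. It is not the authors' pipeline: the record fields `prompt` and `response`, the helper `drop_matching`, the sample records, and the choice of case-insensitive substring matching are all assumptions made for illustration.

```python
# Illustrative sketch only; field names and matching rules are assumptions,
# not the procedure actually used to build the released dataset versions.
BREEZY_PHRASES = ("AI language model",)
JAZZY_PHRASES = BREEZY_PHRASES + ("I'm sorry, I can't answer",)


def drop_matching(records, phrases):
    """Keep only records whose response contains none of the given phrases."""
    lowered = [p.lower() for p in phrases]
    return [
        r for r in records
        if not any(p in r["response"].lower() for p in lowered)
    ]


# Hypothetical sample records standing in for prompt/response pairs.
records = [
    {"prompt": "Write a haiku.", "response": "Autumn wind rises..."},
    {"prompt": "Who are you?", "response": "As an AI language model, I ..."},
    {"prompt": "Predict the lottery.", "response": "I'm sorry, I can't answer that."},
]

breezy = drop_matching(records, BREEZY_PHRASES)  # drops the "AI language model" reply
jazzy = drop_matching(records, JAZZY_PHRASES)    # also drops the refusal
```

Under these assumptions, each later version applies a stricter phrase list than the one before it, so `jazzy` is always a subset of `breezy`.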
https://choosealicense.com/licenses/agpl-3.0/
Dan's Memory Core: Core Curriculum Small
Broad strokes
This dataset aims to provide a foundation of knowledge common to a number of fields and areas of study. The question answer pairs were generated using a RAG implementation and a curated selection of source material. Ideally this will be the first in a series of datasets that will cover a wide range of topics.
Nomic Atlas Visualization
Cluster visualization for the dataset available here.
Topics… See the full description on the dataset page: https://huggingface.co/datasets/PocketDoc/Dans-MemoryCore-CoreCurriculum-Small.