Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Infinity Instruct
Beijing Academy of Artificial Intelligence (BAAI) [Paper][Code][🤗]
The quality and scale of instruction data are crucial for model performance. Recently, open-source models have increasingly relied on fine-tuning datasets comprising millions of instances, necessitating both high quality and large scale. However, the open-source community has long been constrained by the high costs associated with building such extensive and high-quality instruction… See the full description on the dataset page: https://huggingface.co/datasets/BAAI/Infinity-Instruct.
Arcee.ai Modifications
The original dataset (https://huggingface.co/datasets/BAAI/Infinity-Instruct) contained 383,697 samples that used "gpt" tags for system instructions instead of "system" tags. Additionally, 56 samples had empty values for either the human or gpt fields. We have addressed these issues by renaming the tags in the affected samples and removing those with empty values. The remainder of the dataset is unchanged.
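The cleanup described above can be sketched as follows. This is a hypothetical reconstruction, not Arcee.ai's actual script: the field names ("from", "value") and the leading-turn heuristic assume the common ShareGPT-style conversation schema, in which a system instruction appears as the first turn of the conversation list.

```python
# Hypothetical sketch of the described cleanup, assuming a ShareGPT-style
# schema: each sample is a list of turns [{"from": ..., "value": ...}, ...].

def fix_system_tag(conversations):
    """Rename a leading 'gpt' turn (used as a system prompt) to 'system'.

    Heuristic assumption: a conversation whose first turn is tagged 'gpt'
    is carrying a system instruction in that turn.
    """
    if conversations and conversations[0]["from"] == "gpt":
        conversations[0]["from"] = "system"
    return conversations

def is_valid(conversations):
    """Keep only samples where every human/gpt turn has a non-empty value."""
    return all(
        turn["value"].strip()
        for turn in conversations
        if turn["from"] in ("human", "gpt")
    )

def clean(samples):
    """Drop samples with empty human/gpt values, then fix system tags."""
    return [fix_system_tag(s) for s in samples if is_valid(s)]
```

Applied to the full dataset, a pass like this would rename the tags in the affected samples and drop the 56 samples with empty fields while leaving everything else untouched.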
Infinity Instruct
Beijing Academy… See the full description on the dataset page: https://huggingface.co/datasets/arcee-ai/BAAI-Infinity-Instruct-System.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset is built upon the Infinity Instruct project, aiming to match the multi-round dialogue fine-tuning format of MindSpeed-LLM.
Infinity Instruct
Beijing Academy of Artificial Intelligence (BAAI) [Paper][Code][🤗]
The quality and scale of instruction data are crucial for model performance. Recently, open-source models have increasingly relied on fine-tuning datasets comprising millions of instances, necessitating both high quality and large… See the full description on the dataset page: https://huggingface.co/datasets/uukuguy/MindSpeed-Infinity-Instruct-7M.
friendshipkim/Infinity-Instruct-7M-en-old dataset hosted on Hugging Face and contributed by the HF Datasets community
semran1/BAAI-Infinity-Instruct-7M-core-en dataset hosted on Hugging Face and contributed by the HF Datasets community
sgp-bench/infinity-instruct-7M dataset hosted on Hugging Face and contributed by the HF Datasets community
qdr91/infinity-instruct-100k dataset hosted on Hugging Face and contributed by the HF Datasets community
sgp-bench/infinity-instruct-3M dataset hosted on Hugging Face and contributed by the HF Datasets community
extraordinarylab/infinity-instruct-inverse dataset hosted on Hugging Face and contributed by the HF Datasets community
friendshipkim/Infinity-Instruct-3M dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for Evaluation run of BAAI/Infinity-Instruct-7M-Gen-Llama3_1-70B
Dataset automatically created during the evaluation run of model BAAI/Infinity-Instruct-7M-Gen-Llama3_1-70B. The dataset is composed of 38 configurations, each corresponding to one of the evaluated tasks. It has been created from 1 run; each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "train" split always points… See the full description on the dataset page: https://huggingface.co/datasets/open-llm-leaderboard/BAAI_Infinity-Instruct-7M-Gen-Llama3_1-70B-details.
Dataset Card for Evaluation run of BAAI/Infinity-Instruct-7M-Gen-mistral-7B
Dataset automatically created during the evaluation run of model BAAI/Infinity-Instruct-7M-Gen-mistral-7B. The dataset is composed of 38 configurations, each corresponding to one of the evaluated tasks. It has been created from 1 run; each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "train" split always points to… See the full description on the dataset page: https://huggingface.co/datasets/open-llm-leaderboard/BAAI_Infinity-Instruct-7M-Gen-mistral-7B-details.
ShinoharaHare/Infinity-Instruct-Reformatted dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for Evaluation run of BAAI/Infinity-Instruct-3M-0625-Yi-1.5-9B
Dataset automatically created during the evaluation run of model BAAI/Infinity-Instruct-3M-0625-Yi-1.5-9B. The dataset is composed of 38 configurations, each corresponding to one of the evaluated tasks. It has been created from 1 run; each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "train" split always points to… See the full description on the dataset page: https://huggingface.co/datasets/open-llm-leaderboard/BAAI_Infinity-Instruct-3M-0625-Yi-1.5-9B-details.
ChavyvAkvar/Infinity-Instruct-0625-Converted dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for Evaluation run of BAAI/Infinity-Instruct-3M-0625-Qwen2-7B
Dataset automatically created during the evaluation run of model BAAI/Infinity-Instruct-3M-0625-Qwen2-7B. The dataset is composed of 38 configurations, each corresponding to one of the evaluated tasks. It has been created from 1 run; each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "train" split always points to… See the full description on the dataset page: https://huggingface.co/datasets/open-llm-leaderboard/BAAI_Infinity-Instruct-3M-0625-Qwen2-7B-details.
Dataset Card for Evaluation run of BAAI/Infinity-Instruct-3M-0625-Llama3-8B
Dataset automatically created during the evaluation run of model BAAI/Infinity-Instruct-3M-0625-Llama3-8B. The dataset is composed of 38 configurations, each corresponding to one of the evaluated tasks. It has been created from 1 run; each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "train" split always points to… See the full description on the dataset page: https://huggingface.co/datasets/open-llm-leaderboard/BAAI_Infinity-Instruct-3M-0625-Llama3-8B-details.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Soumil30/Infinity-Instruct-0625-Qwen dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for Evaluation run of jlzhou/Qwen2.5-3B-Infinity-Instruct-0625
Dataset automatically created during the evaluation run of model jlzhou/Qwen2.5-3B-Infinity-Instruct-0625. The dataset is composed of 38 configurations, each corresponding to one of the evaluated tasks. It has been created from 1 run; each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "train" split always points to… See the full description on the dataset page: https://huggingface.co/datasets/open-llm-leaderboard/jlzhou_Qwen2.5-3B-Infinity-Instruct-0625-details.