8 datasets found

UniWorld-V1
huggingface.co
Updated Jun 13, 2025
Cite
linbin (2025). UniWorld-V1 [Dataset]. https://huggingface.co/datasets/LanguageBind/UniWorld-V1
Dataset updated
Jun 13, 2025
Authors
linbin
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
The Geneval-style dataset is sourced from BLIP3o-60k.

This dataset is presented in the paper "UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation". More details can be found in UniWorld-V1.

Data preparation

Download the data from LanguageBind/UniWorld-V1. The dataset consists of two parts: source images and annotation JSON files. Prepare a data.txt file in the following format:

The first column is the root path to the image.

The second… See the full description on the dataset page: https://huggingface.co/datasets/LanguageBind/UniWorld-V1.
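The download step described above could be sketched as follows. This is a minimal sketch, not the authors' procedure: it assumes the `huggingface_hub` package is installed, and the helper names `dataset_url` and `download_dataset` and the local directory name are hypothetical. The layout of `data.txt` is truncated in this listing, so it is not reproduced here.

```python
from pathlib import Path

# Dataset id taken from the listing above.
REPO_ID = "LanguageBind/UniWorld-V1"


def dataset_url(repo_id: str) -> str:
    """Build the dataset page URL for a Hugging Face dataset id."""
    return f"https://huggingface.co/datasets/{repo_id}"


def download_dataset(repo_id: str = REPO_ID, local_dir: str = "UniWorld-V1") -> Path:
    """Download every file of the dataset repository into local_dir.

    Hypothetical helper: the local_dir default is an assumption, not a
    path prescribed by the dataset card.
    """
    # Imported lazily so this module loads even without huggingface_hub.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=repo_id,
        repo_type="dataset",  # required for dataset (not model) repositories
        local_dir=local_dir,
    )
    return Path(path)


# Usage (needs network access and enough disk space):
#   root = download_dataset()
#   print(root, dataset_url(REPO_ID))
```

The returned directory should contain the source images and annotation JSON files mentioned in the description; consult the dataset page for the exact `data.txt` format.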
Data from: Video-Bench
huggingface.co
Updated Dec 2, 2023
+ more versions
Cite
linbin (2023). Video-Bench [Dataset]. https://huggingface.co/datasets/LanguageBind/Video-Bench
Dataset updated
Dec 2, 2023
Authors
linbin
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
LanguageBind/Video-Bench dataset hosted on Hugging Face and contributed by the HF Datasets community
MoE-LLaVA
huggingface.co
Updated Jan 30, 2024
Cite
linbin (2024). MoE-LLaVA [Dataset]. https://huggingface.co/datasets/LanguageBind/MoE-LLaVA
Dataset updated
Jan 30, 2024
Authors
linbin
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

If you like our project, please give us a star ⭐ on GitHub for the latest updates.

📰 News

[2024.01.30] The paper is released.
[2024.01.27] 🤗 Hugging Face demo and all code and datasets are available now! Watch 👀 this repository for the latest updates.

😮 Highlights

MoE-LLaVA shows excellent performance in multi-modal learning.

🔥 High performance, but with fewer… See the full description on the dataset page: https://huggingface.co/datasets/LanguageBind/MoE-LLaVA.
VIDAL-Depth-Thermal
huggingface.co
Updated Jan 4, 2024
Cite
linbin (2024). VIDAL-Depth-Thermal [Dataset]. https://huggingface.co/datasets/LanguageBind/VIDAL-Depth-Thermal
Dataset updated
Jan 4, 2024
Authors
linbin
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
【ICLR 2024 🔥】LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

If you like our project, please give us a star ⭐ on GitHub for the latest updates.

📰 News

[2024.01.27] 👀👀👀 Our MoE-LLaVA is released! A sparse model with 3B parameters outperformed the dense model with 7B parameters.
[2024.01.16] 🔥🔥🔥 Our LanguageBind has been accepted at ICLR 2024! We earned a score of 6(3)8(6)6(6)6(6) here.
[2023.12.15] 💪💪💪 We… See the full description on the dataset page: https://huggingface.co/datasets/LanguageBind/VIDAL-Depth-Thermal.
Cambrian737k
huggingface.co
Updated Mar 20, 2025
Cite
linbin (2025). Cambrian737k [Dataset]. https://huggingface.co/datasets/LanguageBind/Cambrian737k
Dataset updated
Mar 20, 2025
Authors
linbin
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
LanguageBind/Cambrian737k dataset hosted on Hugging Face and contributed by the HF Datasets community
StyleVideoDataSet
huggingface.co
Updated Nov 27, 2025
+ more versions
Cite
linbin (2025). StyleVideoDataSet [Dataset]. https://huggingface.co/datasets/LanguageBind/StyleVideoDataSet
Dataset updated
Nov 27, 2025
Authors
linbin
Description
LanguageBind/StyleVideoDataSet dataset hosted on Hugging Face and contributed by the HF Datasets community
LLMBind
huggingface.co
Updated Jun 16, 2024
Cite
linbin (2024). LLMBind [Dataset]. https://huggingface.co/datasets/LanguageBind/LLMBind
Dataset updated
Jun 16, 2024
Authors
linbin
Description
LanguageBind/LLMBind dataset hosted on Hugging Face and contributed by the HF Datasets community
TinyLLaVA-Video-v1-training-data
huggingface.co
Updated Apr 14, 2025
Cite
Zhang Xingjian (2025). TinyLLaVA-Video-v1-training-data [Dataset]. https://huggingface.co/datasets/Zhang199/TinyLLaVA-Video-v1-training-data
Dataset updated
Apr 14, 2025
Authors
Zhang Xingjian
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
TinyLLaVA-Video

This dataset combines data from multiple sources for pre-training and fine-tuning. Pretrain Data: Four subsets of LLaVA-Video-178K (0_30_s_academic_v0_1, 30_60_s_academic_v0_1, 0_30_s_youtube_v0_1, 30_60_s_youtube_v0_1), supplemented with filtered Video-LLaVA data (https://huggingface.co/datasets/LanguageBind/Video-LLaVA) and data from Valley (https://github.com/RupertLuo/Valley). The video data can be downloaded from the linked datasets, and cleaned annotations are provided… See the full description on the dataset page: https://huggingface.co/datasets/Zhang199/TinyLLaVA-Video-v1-training-data.

UniWorld-V1

LanguageBind/UniWorld-V1

15 scholarly articles cite this dataset (View in Google Scholar)

