WebVid contains 10 million video clips with captions, sourced from the web. The videos are diverse and rich in their content.
Both the full 10M set and a 2.5M subset are available for download: https://github.com/m-bain/webvid-dataset
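If you only need the clip metadata (captions and video identifiers), the downloaded release can be inspected with pandas. This is a minimal sketch, assuming the release has been downloaded as a CSV; the filename below (results_2M_train.csv) is an assumption and may differ between releases.

# Minimal sketch: inspect the downloaded WebVid metadata CSV.
# The filename is an assumption; check the release you downloaded.
import pandas as pd

df = pd.read_csv("results_2M_train.csv")
print(len(df), "clip-caption pairs")
print(df.columns.tolist())  # verify the actual column names before relying on them
print(df.head())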
The WebVid-CoVR dataset is a collection of video-text-video triplets for the task of composed video retrieval (CoVR). CoVR is the task of retrieving videos that match both a query image and a query text, where the text typically specifies the desired modification to the query image.
The WebVid-CoVR dataset is automatically generated from web-scraped video-caption pairs, using a language model to generate the modification text. The dataset contains 1.6 million triplets, with diverse content and variations. The dataset also includes a manually annotated test set of 2.5K triplets, which can be used to evaluate CoVR models.
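As a rough illustration of how such triplets could be consumed, the sketch below iterates over a triplet file. It assumes the triplets are distributed as a CSV with one row per (source video, modification text, target video); the filename and column names are hypothetical placeholders, not the dataset's actual schema.

# Minimal sketch: iterate over CoVR triplets from a hypothetical CSV layout.
import csv

with open("webvid_covr_triplets.csv", newline="") as f:  # hypothetical filename
    for row in csv.DictReader(f):
        source_video = row["source_video"]   # hypothetical column name
        modification = row["modification"]   # hypothetical column name
        target_video = row["target_video"]   # hypothetical column name
        # A CoVR model scores candidate videos against (source_video, modification)
        # and should rank target_video first.
        break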
WebVid 10M Classified (100k)
Each description from the WebVid 10M dataset is passed through Llama 3.3 70B, which classifies it as either action or no_action.
Descriptions classified as action are rewritten more clearly; otherwise, the rewritten description is none.
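A minimal sketch of how the classified captions might be filtered, assuming the data is published as a Hugging Face dataset; the dataset identifier and the column names ("classification", "rewritten") are hypothetical placeholders and should be checked against the actual dataset card.

# Minimal sketch: keep only descriptions classified as actions.
from datasets import load_dataset

ds = load_dataset("user/webvid-10m-classified", split="train")  # hypothetical id
print(ds.column_names)  # verify the real field names first
actions = ds.filter(lambda ex: ex.get("classification") == "action")  # hypothetical field
print(len(actions), "action descriptions with rewritten captions")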
qingy2024/webvid-10M-pro-scored dataset hosted on Hugging Face and contributed by the HF Datasets community
A dataset of 3M video-question-answer triplets, automatically generated by applying question-generation neural models to alt-text video captions from the WebVid dataset.
3it/TransVerse-webvid-v1 dataset hosted on Hugging Face and contributed by the HF Datasets community
qingy2024/webvid-mini-100k-scored dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
To obtain the WebVid-2M videos, refer to this issue.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
luoruipu1/Valley-webvid2M-Pretrain-703K dataset hosted on Hugging Face and contributed by the HF Datasets community
CC0 1.0 Universal (CC0 1.0): https://choosealicense.com/licenses/cc0-1.0/
[ECCV 2024] Enhancing Cross-Subject fMRI-to-Video Decoding with Global-Local Functional Alignment
Introduction
An fMRI-video dataset in which 8 subjects participated (6 male and 2 female, aged 23-27; 3 viewed FCVID stimuli and 5 viewed WebVid stimuli). fMRI data were acquired with a 3T scanner and a 32-channel RF head coil, sampled at 1 frame per 0.8 seconds. Stimulus videos of dimensions 256×256 and 596×336 are sourced from the FCVID video dataset and WebVid… See the full description on the dataset page: https://huggingface.co/datasets/Fudan-fMRI/fMRI-Video.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Introduction
A camera motion annotation dataset based on WebVid, containing a total of 36k videos.
Usage
Reassemble the split archive: cat webmotion.tar.gz.part_* > webmotion.tar.gz
Then extract it: tar -xzf webmotion.tar.gz
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
CamVid-30K
Summary
This is the CamVid-30K dataset introduced in our paper, "GenXD: Generating Any 3D and 4D Scenes." CamVid-30K is the first open-sourced, large-scale 4D dataset, designed to support various dynamic 3D tasks. It includes videos sourced from VIPSeg, OpenVid-1M, and WebVid-10M, with camera annotations curated using our data curation pipeline.
Project: https://gen-x-d.github.io/
Paper: https://arxiv.org/pdf/2411.02319
Code:… See the full description on the dataset page: https://huggingface.co/datasets/Yuyang-z/CamVid-30K.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Video sources
In the json files, src indicates the video source; the corresponding videos can be downloaded from the sources listed below (a short reading sketch follows the list).
video-vqa-webvid_qa: WebVid
video-conversation-videochat2: VideoChat2
video-classification-ssv2: SSv2
video-reasoning-clevrer_qa: CLEVRER
video-vqa-tgif_frame_qa: TGIF
video-reasoning-next_qa: NExTQA
video-conversation-videochat1: VideoChat
video-vqa-tgif_transition_qa: TGIF
video-reasoning-clevrer_mc: CLEVRER
video-vqa-ego_qa: EgoQA
video-classification-k710:… See the full description on the dataset page: https://huggingface.co/datasets/pritamqu/self-alignment.
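A minimal sketch of reading one of the json files and counting samples per source, assuming each entry is a dict containing a src key as described above; the filename is a hypothetical placeholder.

# Minimal sketch: count samples per video source via the src field.
import json
from collections import Counter

with open("annotations.json") as f:  # hypothetical filename
    entries = json.load(f)

counts = Counter(entry["src"] for entry in entries)
for src, n in counts.most_common():
    print(src, n)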