13 datasets found
  1.

    WebVid Dataset

    • paperswithcode.com
    Updated Sep 1, 2024
    Cite
    Max Bain; Arsha Nagrani; Gül Varol; Andrew Zisserman (2024). WebVid Dataset [Dataset]. https://paperswithcode.com/dataset/webvid
    Dataset updated
    Sep 1, 2024
    Authors
    Max Bain; Arsha Nagrani; Gül Varol; Andrew Zisserman
    Description

    WebVid contains 10 million video clips with captions, sourced from the web. The videos are diverse and rich in their content.

    Both the full 10M set and a 2.5M subset are available for download: https://github.com/m-bain/webvid-dataset

  2.

    WebVid-CoVR Dataset

    • paperswithcode.com
    Updated Aug 27, 2023
    Cite
    Lucas Ventura; Antoine Yang; Cordelia Schmid; Gül Varol (2023). WebVid-CoVR Dataset [Dataset]. https://paperswithcode.com/dataset/webvid-covr
    Dataset updated
    Aug 27, 2023
    Authors
    Lucas Ventura; Antoine Yang; Cordelia Schmid; Gül Varol
    Description

    The WebVid-CoVR dataset is a collection of video-text-video triplets that can be used for the task of composed video retrieval (CoVR). CoVR is a task that involves searching for videos that match both a query image and a query text. The text typically specifies the desired modification to the query image.

    The WebVid-CoVR dataset is automatically generated from web-scraped video-caption pairs, using a language model to generate the modification text. The dataset contains 1.6 million triplets, with diverse content and variations. The dataset also includes a manually annotated test set of 2.5K triplets, which can be used to evaluate CoVR models.

  3.

    webvid-10M-classified

    • huggingface.co
    Cite
    Qingyun Li, webvid-10M-classified [Dataset]. https://huggingface.co/datasets/qingy2024/webvid-10M-classified
    Authors
    Qingyun Li
    Description

    WebVid 10M Classified (100k)

    Each description from the WebVid 10M dataset is passed through Llama 3.3 70B, which classifies it as either action or no_action.

    If a description is classified as action, it is rewritten more clearly. Otherwise, the rewritten description is none.
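The filtering rule described above can be sketched as follows. This is a minimal illustration, not the dataset's documented schema: the field names `classification` and `rewritten` are assumptions made for the example, and only the label values `action` / `no_action` and the placeholder `none` come from the description.

```python
from typing import Optional

def usable_caption(record: dict) -> Optional[str]:
    """Return the rewritten caption for an action clip, otherwise None.

    The keys "classification" and "rewritten" are hypothetical field names.
    """
    if record.get("classification") != "action":
        return None
    rewritten = record.get("rewritten")
    # Treat a missing value or the literal placeholder "none" as no caption.
    if rewritten is None or str(rewritten).strip().lower() == "none":
        return None
    return str(rewritten).strip()

# Made-up rows, used only to exercise the function.
rows = [
    {"classification": "action", "rewritten": "A man chops vegetables."},
    {"classification": "no_action", "rewritten": "none"},
]
captions = [c for c in map(usable_caption, rows) if c is not None]
```

Keeping the check in one small function makes it easy to adjust if the real column names differ.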

  4.

    webvid-10M-pro-scored

    • huggingface.co
    Cite
    Qingyun Li, webvid-10M-pro-scored [Dataset]. https://huggingface.co/datasets/qingy2024/webvid-10M-pro-scored
    Authors
    Qingyun Li
    Description

    qingy2024/webvid-10M-pro-scored dataset hosted on Hugging Face and contributed by the HF Datasets community

  5.

    WebVidVQA3M Dataset

    • paperswithcode.com
    Updated May 9, 2022
    Cite
    Antoine Yang; Antoine Miech; Josef Sivic; Ivan Laptev; Cordelia Schmid (2022). WebVidVQA3M Dataset [Dataset]. https://paperswithcode.com/dataset/webvidvqa3m
    Dataset updated
    May 9, 2022
    Authors
    Antoine Yang; Antoine Miech; Josef Sivic; Ivan Laptev; Cordelia Schmid
    Description

    A dataset automatically generated using question generation neural models and alt-text video captions from the WebVid dataset, with 3M video-question-answer triplets.

  6.

    TransVerse-webvid-v1

    • huggingface.co
    Updated May 2, 2024
    Cite
    Franklin (2024). TransVerse-webvid-v1 [Dataset]. https://huggingface.co/datasets/3it/TransVerse-webvid-v1
    Dataset updated
    May 2, 2024
    Authors
    Franklin
    Description

    3it/TransVerse-webvid-v1 dataset hosted on Hugging Face and contributed by the HF Datasets community

  7.

    webvid-mini-100k-scored

    • huggingface.co
    Cite
    Qingyun Li, webvid-mini-100k-scored [Dataset]. https://huggingface.co/datasets/qingy2024/webvid-mini-100k-scored
    Authors
    Qingyun Li
    Description

    qingy2024/webvid-mini-100k-scored dataset hosted on Hugging Face and contributed by the HF Datasets community

  8.

    ShareGemini

    • huggingface.co
    Updated Jul 30, 2024
    Cite
    Share14 (2024). ShareGemini [Dataset]. https://huggingface.co/datasets/Share14/ShareGemini
    Dataset updated
    Jul 30, 2024
    Authors
    Share14
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    For the WebVid-2M videos, refer to this issue.

  9.

    Valley-webvid2M-Pretrain-703K

    • huggingface.co
    Cite
    Rupert Luo, Valley-webvid2M-Pretrain-703K [Dataset]. https://huggingface.co/datasets/luoruipu1/Valley-webvid2M-Pretrain-703K
    Authors
    Rupert Luo
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    luoruipu1/Valley-webvid2M-Pretrain-703K dataset hosted on Hugging Face and contributed by the HF Datasets community

  10.

    fMRI-Video

    • huggingface.co
    Updated Jul 13, 2024
    Cite
    Fudan-fMRI-yanwei (2024). fMRI-Video [Dataset]. https://huggingface.co/datasets/Fudan-fMRI/fMRI-Video
    Dataset updated
    Jul 13, 2024
    Dataset authored and provided by
    Fudan-fMRI-yanwei
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    [ECCV 2024] Enhancing Cross-Subject fMRI-to-Video Decoding with Global-Local Functional Alignment

    Introduction

    This fMRI-video dataset was collected from 8 subjects (6 male and 2 female, aged 23-27; 3 for FCVID and 5 for WebVid). The fMRI data were acquired using a 3T scanner and a 32-channel RF head coil, sampled at 1 frame per 0.8 seconds. In detail, stimulus videos of dimensions 256×256 and 596×336 are sourced from the FCVID video dataset and WebVid… See the full description on the dataset page: https://huggingface.co/datasets/Fudan-fMRI/fMRI-Video.

  11.

    WebMotion-36K

    • huggingface.co
    Updated May 11, 2025
    Cite
    Tuyabei (2025). WebMotion-36K [Dataset]. https://huggingface.co/datasets/Tuyabei/WebMotion-36K
    Dataset updated
    May 11, 2025
    Authors
    Tuyabei
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Introduction

    A camera motion annotation dataset based on WebVid, containing a total of 36k videos.

    Usage

    Reassemble the split archive: cat webmotion.tar.gz.part_* > webmotion.tar.gz

    Then extract it: tar -xzf webmotion.tar.gz
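Where `cat` and `tar` are not available, the same two steps can be sketched in Python. This is an illustrative equivalent under stated assumptions, not part of the dataset's instructions: the helper names `join_parts` and `extract_archive` are hypothetical, and the parts are assumed to sort lexicographically in the same order the shell glob would produce.

```python
import glob
import shutil
import tarfile
from pathlib import Path

def join_parts(pattern: str, output: str) -> int:
    """Concatenate all files matching `pattern` (in sorted order) into `output`.

    Returns the number of parts joined. Hypothetical helper mirroring
    `cat webmotion.tar.gz.part_* > webmotion.tar.gz`.
    """
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no files match {pattern!r}")
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so large parts are not loaded into memory.
                shutil.copyfileobj(src, out)
    return len(parts)

def extract_archive(archive: str, dest: str) -> None:
    """Extract a gzip-compressed tar archive into `dest`.

    Hypothetical helper mirroring `tar -xzf webmotion.tar.gz`.
    Only extract archives from a source you trust.
    """
    Path(dest).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
```

A typical call would be `join_parts("webmotion.tar.gz.part_*", "webmotion.tar.gz")` followed by `extract_archive("webmotion.tar.gz", "webmotion")`, where the destination directory name is an arbitrary choice.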

  12.

    CamVid-30K

    • huggingface.co
    Updated Mar 30, 2025
    Cite
    Yuyang (2025). CamVid-30K [Dataset]. https://huggingface.co/datasets/Yuyang-z/CamVid-30K
    Dataset updated
    Mar 30, 2025
    Authors
    Yuyang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CamVid-30K

    Summary

    This is the CamVid-30K dataset introduced in our paper, "GenXD: Generating Any 3D and 4D Scenes." CamVid-30K is the first open-sourced, large-scale 4D dataset, designed to support various dynamic 3D tasks. It includes videos sourced from VIPSeg, OpenVid-1M, and WebVid-10M, with camera annotations curated using our data curation pipeline.

    Project: https://gen-x-d.github.io/
    Paper: https://arxiv.org/pdf/2411.02319
    Code:… See the full description on the dataset page: https://huggingface.co/datasets/Yuyang-z/CamVid-30K.

  13.

    self-alignment

    • huggingface.co
    Updated Jun 2, 2025
    Cite
    Pritam Sarkar (2025). self-alignment [Dataset]. https://huggingface.co/datasets/pritamqu/self-alignment
    Dataset updated
    Jun 2, 2025
    Authors
    Pritam Sarkar
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Video sources

    In the JSON files, the src field indicates the video source for each entry. The sources can be downloaded as follows.

    video-vqa-webvid_qa: WebVid
    video-conversation-videochat2: VideoChat2
    video-classification-ssv2: SSv2
    video-reasoning-clevrer_qa: CLEVRER
    video-vqa-tgif_frame_qa: TGIF
    video-reasoning-next_qa: NExTQA
    video-conversation-videochat1: VideoChat
    video-vqa-tgif_transition_qa: TGIF
    video-reasoning-clevrer_mc: CLEVRER
    video-vqa-ego_qa: EgoQA
    video-classification-k710:… See the full description on the dataset page: https://huggingface.co/datasets/pritamqu/self-alignment.
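As a rough sketch, the src-to-source pairs listed above can be used to work out which video collections a given annotation file needs. The mapping below contains only the pairs visible in the listing; the truncated entry is deliberately left out. The helper name `sources_needed` and the assumption that each record is a dict with a `src` key are illustrative choices, not the dataset's documented layout.

```python
from collections import Counter

# Only the pairs visible in the truncated listing; the full dataset page has more.
SRC_TO_SOURCE = {
    "video-vqa-webvid_qa": "WebVid",
    "video-conversation-videochat2": "VideoChat2",
    "video-classification-ssv2": "SSv2",
    "video-reasoning-clevrer_qa": "CLEVRER",
    "video-vqa-tgif_frame_qa": "TGIF",
    "video-reasoning-next_qa": "NExTQA",
    "video-conversation-videochat1": "VideoChat",
    "video-vqa-tgif_transition_qa": "TGIF",
    "video-reasoning-clevrer_mc": "CLEVRER",
    "video-vqa-ego_qa": "EgoQA",
}

def sources_needed(records):
    """Count records per video source; unmapped src values fall under "unknown".

    Hypothetical helper: assumes each record is a dict with a "src" key.
    """
    counts = Counter()
    for record in records:
        counts[SRC_TO_SOURCE.get(record.get("src"), "unknown")] += 1
    return counts

# Made-up records, used only to exercise the helper.
sample = [
    {"src": "video-vqa-webvid_qa"},
    {"src": "video-vqa-tgif_frame_qa"},
    {"src": "video-vqa-tgif_transition_qa"},
    {"src": "not-in-the-listing"},
]
needed = sources_needed(sample)
```

Two different src values can point to the same collection (both TGIF entries, both CLEVRER entries), so counting by source rather than by src avoids downloading the same videos twice.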
