2 datasets found
  1. MSR-VTT

    • huggingface.co
    Updated Apr 8, 2025
    Cite
    VLM2Vec (2025). MSR-VTT [Dataset]. https://huggingface.co/datasets/VLM2Vec/MSR-VTT
    Dataset authored and provided by
    VLM2Vec
    Description

    Cloned from "friedrichor/MSR-VTT". MSR-VTT contains 10K video clips and 200K captions. We adopt the standard 1K-A split protocol, which was introduced in JSFusion and has since become the de facto benchmark split in the text-video retrieval field.

    Train:

    train_7k: 7,010 videos, 140,200 captions
    train_9k: 9,000 videos, 180,000 captions

    Test:

    test_1k: 1,000 videos, 1,000 captions

    🌟 Citation

    @inproceedings{xu2016msrvtt, title={Msr-vtt: A large video description dataset…

    See the full description on the dataset page: https://huggingface.co/datasets/VLM2Vec/MSR-VTT.

  2. MSR-VTT

    • huggingface.co
    Updated Feb 28, 2025
    Cite
    Kong (2025). MSR-VTT [Dataset]. https://huggingface.co/datasets/friedrichor/MSR-VTT
    Authors
    Kong
    Description

    MSR-VTT contains 10K video clips and 200K captions. We adopt the standard 1K-A split protocol, which was introduced in JSFusion and has since become the de facto benchmark split in the text-video retrieval field.

    Train:

    train_7k: 7,010 videos, 140,200 captions
    train_9k: 9,000 videos, 180,000 captions

    Test:

    test_1k: 1,000 videos, 1,000 captions

    🌟 Citation

    @inproceedings{xu2016msrvtt, title={Msr-vtt: A large video description dataset for bridging video and language}…

    See the full description on the dataset page: https://huggingface.co/datasets/friedrichor/MSR-VTT.
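
The split sizes quoted in the description imply a fixed number of captions per video in each split. As a quick sanity check, the sketch below recomputes that ratio from the listed counts. The `SPLITS` dictionary and the `captions_per_video` helper are hypothetical names introduced here for illustration only; they are not part of the dataset or of any library, and the numbers are simply copied from the listing.

```python
# Split sizes copied from the dataset description in this listing.
# SPLITS and captions_per_video are illustrative names (assumptions),
# not part of the dataset or any library API.
SPLITS = {
    "train_7k": {"videos": 7_010, "captions": 140_200},
    "train_9k": {"videos": 9_000, "captions": 180_000},
    "test_1k": {"videos": 1_000, "captions": 1_000},
}


def captions_per_video(split: str) -> float:
    """Return the average number of captions per video for the named split."""
    counts = SPLITS[split]
    return counts["captions"] / counts["videos"]


if __name__ == "__main__":
    for name in SPLITS:
        print(f"{name}: {captions_per_video(name):g} captions per video")
```

Under these counts, both training splits carry 20 captions per video while the test split pairs each video with a single caption, so retrieval on test_1k is a one-to-one matching between 1,000 captions and 1,000 videos.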


