8 datasets found
  1. YouCook2 Dataset

    • paperswithcode.com
    Updated Dec 28, 2023
    Cite
    Luowei Zhou; Chenliang Xu; Jason J. Corso (2023). YouCook2 Dataset [Dataset]. https://paperswithcode.com/dataset/youcook2
    Dataset updated
    Dec 28, 2023
    Authors
    Luowei Zhou; Chenliang Xu; Jason J. Corso
    Description

    YouCook2 is described by its authors as the largest task-oriented instructional video dataset in the vision community. It contains 2000 long, untrimmed videos covering 89 cooking recipes; on average, each recipe has 22 videos. The procedure steps in each video are annotated with temporal boundaries and described by imperative English sentences. The videos were downloaded from YouTube and are all shot from a third-person viewpoint. They are unconstrained: individuals recorded them in their own homes with unfixed cameras. YouCook2 covers a wide range of recipe types and cooking styles from around the world.
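
    As a rough illustration of the annotation scheme described above (procedure steps with temporal boundaries and imperative sentences), the sketch below models one annotated video in Python. The class names, field names, video ID, and example values are all hypothetical; they are not taken from the official YouCook2 annotation files.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    """One procedure step: a time span plus an imperative sentence (hypothetical layout)."""
    start_s: float
    end_s: float
    sentence: str

    def duration(self) -> float:
        # Length of the annotated segment in seconds.
        return self.end_s - self.start_s


@dataclass
class AnnotatedVideo:
    """One untrimmed video with its annotated steps (hypothetical layout)."""
    video_id: str
    recipe: str
    steps: List[Step] = field(default_factory=list)

    def validate(self) -> None:
        # Every step needs a non-negative start, a later end, and a non-empty sentence.
        for step in self.steps:
            if not 0 <= step.start_s < step.end_s:
                raise ValueError(f"bad temporal boundary: {step}")
            if not step.sentence.strip():
                raise ValueError(f"empty sentence: {step}")


# Made-up example values, for illustration only.
example = AnnotatedVideo(
    video_id="hypothetical_id",
    recipe="grilled cheese",
    steps=[
        Step(12.0, 30.5, "spread butter on the bread"),
        Step(31.0, 58.0, "place cheese between the slices"),
    ],
)
example.validate()
total_annotated_s = sum(step.duration() for step in example.steps)

# The description's average follows from its own counts: 2000 videos over 89 recipes.
avg_videos_per_recipe = 2000 / 89
```

    Rounding `avg_videos_per_recipe` gives the 22 videos per recipe quoted in the description.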

  2. youcook2

    • huggingface.co
    Updated Jul 15, 2024
    Cite
    morpheushoc (2024). youcook2 [Dataset]. https://huggingface.co/datasets/morpheushoc/youcook2
    Dataset updated
    Jul 15, 2024
    Authors
    morpheushoc
    Description

    morpheushoc/youcook2 dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. YouCook2

    • huggingface.co
    Updated May 28, 2024
    Cite
    merve (2024). YouCook2 [Dataset]. https://huggingface.co/datasets/merve/YouCook2
    Dataset updated
    May 28, 2024
    Authors
    merve
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    merve/YouCook2 dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. youcook2

    • huggingface.co
    Updated Apr 10, 2025
    Cite
    Pengxiang Li (2025). youcook2 [Dataset]. https://huggingface.co/datasets/pengxiang/youcook2
    Dataset updated
    Apr 10, 2025
    Authors
    Pengxiang Li
    Description

    Due to requests and inaccessibility of online videos, we are sharing the raw video files. By downloading these files, you are agreeing to use them for non-commercial, research purposes only.

  5. youcook2_internvideo_MM_L14_features_fps8

    • huggingface.co
    Updated May 8, 2024
    Cite
    Jilan Xu (2024). youcook2_internvideo_MM_L14_features_fps8 [Dataset]. https://huggingface.co/datasets/Jazzcharles/youcook2_internvideo_MM_L14_features_fps8
    Dataset updated
    May 8, 2024
    Authors
    Jilan Xu
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    📙 Overview

    YouCook2 video features extracted by InternVideo_MM_L14 at 8 fps. It is used for evaluating the video-text retrieval ability of EgoInstructor. Each file (e.g. 10dZTHlkb8w.pth.tar) is a T×D feature vector, where T refers to the length of the video and D is 768.

    🏋️ How-To-Use

    Please refer to code EgoInstructor for details.

    🎓 Citation

    @article{xu2024retrieval, title={Retrieval-augmented egocentric video captioning}, author={Xu, Jilan and Huang… See the full description on the dataset page: https://huggingface.co/datasets/Jazzcharles/youcook2_internvideo_MM_L14_features_fps8.
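
    A minimal sketch of sanity-checking the shape of one feature file follows. It assumes each row is one sampled frame, so that T is roughly the video duration in seconds times 8; that reading of T, the helper names, and the commented-out loading call are assumptions, and the EgoInstructor code remains the authoritative reference.

```python
from typing import Sequence

FEATURE_DIM = 768   # D, as stated in the dataset description
EXTRACTION_FPS = 8  # sampling rate, as stated in the dataset description


def check_feature_shape(shape: Sequence[int], dim: int = FEATURE_DIM) -> int:
    """Validate a (T, D) shape and return T, the number of feature rows."""
    if len(shape) != 2:
        raise ValueError(f"expected a 2-D (T, D) array, got shape {tuple(shape)}")
    num_rows, width = shape
    if width != dim:
        raise ValueError(f"expected D={dim}, got D={width}")
    return num_rows


def approx_duration_seconds(num_rows: int, fps: float = EXTRACTION_FPS) -> float:
    """Estimate video length, assuming one feature row per sampled frame."""
    return num_rows / fps


# Hypothetical loading step (assumes PyTorch is installed and the file holds one tensor):
#   import torch
#   features = torch.load("10dZTHlkb8w.pth.tar", map_location="cpu")
#   num_rows = check_feature_shape(tuple(features.shape))

num_rows = check_feature_shape((240, 768))
estimated_seconds = approx_duration_seconds(num_rows)
```

    Under these assumptions, a 240×768 feature file would correspond to about 30 seconds of video.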

  6. valid_youcook2

    • kaggle.com
    Updated Nov 30, 2024
    Cite
    hello518123 (2024). valid_youcook2 [Dataset]. https://www.kaggle.com/datasets/hello518123/valid-youcook2/discussion
    Dataset updated
    Nov 30, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    hello518123
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by hello518123

    Released under Apache 2.0


  7. escher-kitchen-action

    • huggingface.co
    Updated May 11, 2025
    Cite
    MAIR Lab (2025). escher-kitchen-action [Dataset]. https://huggingface.co/datasets/mair-lab/escher-kitchen-action
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    MAIR Lab
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Card for escher-kitchen-action

    MPII and YouCook2 dataset

    Dataset Structure

    Data Instances

    Each instance contains:

    source_image: The original image
    edited_image: The edited version of the image
    edit_instruction: The instruction used to edit the image
    source_image_caption: Caption for the source image
    target_image_caption: Caption for the edited image
    Additional metadata fields

    Data Splits

    {}
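
    The instance fields listed under Data Instances can be checked with a small helper such as the sketch below. The helper name and the placeholder record values are hypothetical, and the value types (plain strings here) are an assumption, since the card does not state how images are stored.

```python
from typing import Any, Dict, List

# Field names copied from the dataset card; unnamed metadata fields are not checked.
EXPECTED_FIELDS = (
    "source_image",
    "edited_image",
    "edit_instruction",
    "source_image_caption",
    "target_image_caption",
)


def missing_fields(record: Dict[str, Any]) -> List[str]:
    """Return the expected field names that are absent from `record`."""
    return [name for name in EXPECTED_FIELDS if name not in record]


# Placeholder records, for illustration only.
complete_record = {
    "source_image": "<image placeholder>",
    "edited_image": "<image placeholder>",
    "edit_instruction": "<instruction placeholder>",
    "source_image_caption": "<caption placeholder>",
    "target_image_caption": "<caption placeholder>",
}
incomplete_record = {
    "source_image": "<image placeholder>",
    "edit_instruction": "<instruction placeholder>",
}

missing_in_complete = missing_fields(complete_record)
missing_in_incomplete = missing_fields(incomplete_record)
```

    An empty result means the record carries all five named fields; extra metadata keys are ignored.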

  8. VALUE Dataset

    • paperswithcode.com
    • library.toponeai.link
    Updated Apr 21, 2024
    Cite
    Linjie Li; Jie Lei; Zhe Gan; Licheng Yu; Yen-Chun Chen; Rohit Pillai; Yu Cheng; Luowei Zhou; Xin Eric Wang; William Yang Wang; Tamara Lee Berg; Mohit Bansal; Jingjing Liu; Lijuan Wang; Zicheng Liu (2024). VALUE Dataset [Dataset]. https://paperswithcode.com/dataset/value
    Dataset updated
    Apr 21, 2024
    Authors
    Linjie Li; Jie Lei; Zhe Gan; Licheng Yu; Yen-Chun Chen; Rohit Pillai; Yu Cheng; Luowei Zhou; Xin Eric Wang; William Yang Wang; Tamara Lee Berg; Mohit Bansal; Jingjing Liu; Lijuan Wang; Zicheng Liu
    Description

    VALUE is a Video-And-Language Understanding Evaluation benchmark to test models that are generalizable to diverse tasks, domains, and datasets. It is an assemblage of 11 VidL (video-and-language) datasets over 3 popular tasks: (i) text-to-video retrieval; (ii) video question answering; and (iii) video captioning. VALUE benchmark aims to cover a broad range of video genres, video lengths, data volumes, and task difficulty levels. Rather than focusing on single-channel videos with visual information only, VALUE promotes models that leverage information from both video frames and their associated subtitles, as well as models that share knowledge across multiple tasks.

    The datasets used for the VALUE benchmark are: TVQA, TVR, TVC, How2R, How2QA, VIOLIN, VLEP, YouCook2 (YC2C, YC2R), VATEX


YouCook2 Dataset

477 scholarly articles cite this dataset (Google Scholar).