YouCook2 is the largest task-oriented instructional video dataset in the vision community. It contains 2,000 long, untrimmed videos covering 89 cooking recipes, with an average of 22 videos per recipe. The procedure steps in each video are annotated with temporal boundaries and described by imperative English sentences (see the example below). The videos were downloaded from YouTube and are all shot from a third-person viewpoint. They are unconstrained: they are performed by individuals in their own homes with unfixed cameras. YouCook2 covers a rich set of recipe types and cooking styles from around the world.
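As an illustrative sketch of such a procedure-step annotation (the video id, recipe type, times, and sentence below are invented, and the field layout is an assumption modeled on the commonly distributed YouCook2 annotation JSON), one annotated entry might look like:

```python
# Hedged, illustrative example of one annotated procedure step.
# All values are made up; the field layout is an assumption based on the
# commonly distributed YouCook2 annotation JSON, not an official spec.
example_annotation = {
    "video_id": "GLd3aX16zBg",        # hypothetical YouTube video id
    "recipe_type": "113",             # hypothetical recipe-type code
    "annotations": [
        {
            "id": 0,
            "segment": [14.0, 41.0],  # temporal boundary in seconds
            "sentence": "spread butter on two slices of white bread",
        },
    ],
}
```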
morpheushoc/youcook2 dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
merve/YouCook2 dataset hosted on Hugging Face and contributed by the HF Datasets community
Due to requests and the inaccessibility of some of the original online videos, we are sharing the raw video files. By downloading these files, you agree to use them for non-commercial, research purposes only.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Overview
YouCook2 video features extracted by InternVideo_MM_L14 at 8 fps, used for evaluating the video-text retrieval ability of EgoInstructor. Each file (e.g. 10dZTHlkb8w.pth.tar) stores a T×D feature matrix, where T is the temporal length of the video and D = 768 is the feature dimension.
How-To-Use
Please refer to the EgoInstructor code for details.
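As a minimal usage sketch (assuming each per-video .pth.tar file was written with torch.save and deserializes into a single T×D tensor), loading one feature file might look like:

```python
import torch

# Minimal sketch, assuming each per-video file was written with torch.save and
# deserializes into a single T x D tensor (D = 768, frames sampled at 8 fps).
# The filename is the example given above; the loading approach is an assumption.
features = torch.load("10dZTHlkb8w.pth.tar", map_location="cpu")
print(features.shape)  # expected: torch.Size([T, 768])
```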
Citation
@article{xu2024retrieval, title={Retrieval-augmented egocentric video captioning}, author={Xu, Jilan and Huang… See the full description on the dataset page: https://huggingface.co/datasets/Jazzcharles/youcook2_internvideo_MM_L14_features_fps8.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by hello518123
Released under Apache 2.0
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for escher-kitchen-action
MPII and YouCook2 dataset
Dataset Structure
Data Instances
Each instance contains:
- source_image: The original image
- edited_image: The edited version of the image
- edit_instruction: The instruction used to edit the image
- source_image_caption: Caption for the source image
- target_image_caption: Caption for the edited image
- Additional metadata fields
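As a hedged illustration of accessing these fields (the repository id below is a placeholder, and only the field names come from this card; the exact feature types are not specified), an instance could be inspected like so:

```python
from datasets import load_dataset

# Hypothetical sketch: the repository id is a placeholder, and only the field
# names come from the card above; actual feature types are assumptions.
ds = load_dataset("your-org/escher-kitchen-action", split="train")  # repo id assumed

example = ds[0]
print(example["edit_instruction"])       # instruction used to edit the image
print(example["source_image_caption"])   # caption for the source image
print(example["target_image_caption"])   # caption for the edited image
```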
Data Splits
VALUE is a Video-And-Language Understanding Evaluation benchmark for testing models that generalize across diverse tasks, domains, and datasets. It assembles 11 VidL (video-and-language) datasets over 3 popular tasks: (i) text-to-video retrieval; (ii) video question answering; and (iii) video captioning. The VALUE benchmark aims to cover a broad range of video genres, video lengths, data volumes, and task difficulty levels. Rather than focusing on single-channel videos with visual information only, VALUE promotes models that leverage information from both video frames and their associated subtitles, as well as models that share knowledge across multiple tasks.
The datasets used for the VALUE benchmark are: TVQA, TVR, TVC, How2R, How2QA, VIOLIN, VLEP, YouCook2 (YC2C, YC2R), VATEX