License: https://choosealicense.com/licenses/other/
The Something-Something dataset (version 2) is a collection of 220,847 labeled video clips of humans performing pre-defined, basic actions with everyday objects. It is designed to train machine learning models in fine-grained understanding of human hand gestures, such as putting something into something, turning something upside down, and covering something with something.
emirgocen/Something-Something-v2 dataset hosted on Hugging Face and contributed by the HF Datasets community
License: https://choosealicense.com/licenses/unknown/
morpheushoc/something-something-v2 dataset hosted on Hugging Face and contributed by the HF Datasets community
License: https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Lisa Sharapova
Released under CC0: Public Domain
License: https://choosealicense.com/licenses/unknown/
olarian/something-something-v2 dataset hosted on Hugging Face and contributed by the HF Datasets community
License: Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Neural networks trained on datasets such as ImageNet have led to major advances in visual object classification. One obstacle that prevents networks from reasoning more deeply about complex scenes and situations, and from integrating visual knowledge with natural language, like humans do, is their lack of common sense knowledge about the physical world. Videos, unlike still images, contain a wealth of detailed information about the physical world. However, most labelled video datasets represent high-level concepts rather than detailed physical aspects about actions and scenes. In this work, we describe our ongoing collection of the “something-something” database of video prediction tasks whose solutions require a common sense understanding of the depicted situation. The database currently contains more than 100,000 videos across 174 classes, which are defined as caption-templates. We also describe the challenges in crowd-sourcing this data at scale.
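The 174 classes are defined as caption-templates such as "Putting [something] into [something]". As a minimal illustration of how such a template can be instantiated (the `instantiate` function and the object names are hypothetical, not part of the dataset's tooling):

```python
def instantiate(template, objects):
    """Fill each [something] placeholder in a caption-template, left to right."""
    caption = template
    for obj in objects:
        caption = caption.replace("[something]", obj, 1)
    return caption

# One SSv2-style caption-template, filled with two example objects
print(instantiate("Putting [something] into [something]", ["a spoon", "a cup"]))
# → Putting a spoon into a cup
```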
The test set of the Something-Something-V2 dataset. The class names in this dataset are long and contain characters (e.g. ') that Kaggle does not support in file names. The following code was run to sanitize the class names so the data could be uploaded to Kaggle.
```python
import os

def sanitize_name(name):
    # Strip characters that Kaggle does not allow in file/folder names
    forbidden_chars = ['<', '>', ':', '"', '/', '\\', '|', '?', '*', '\'']
    for char in forbidden_chars:
        name = name.replace(char, '')
    return name

def rename_files_in_directory(directory):
    # Walk bottom-up so renaming a folder does not invalidate deeper paths
    for root, dirs, files in os.walk(directory, topdown=False):
        # Renaming files
        for name in files:
            sanitized_name = sanitize_name(name)
            if name != sanitized_name:
                os.rename(os.path.join(root, name), os.path.join(root, sanitized_name))
        # Renaming folders
        for name in dirs:
            sanitized_name = sanitize_name(name)
            if name != sanitized_name:
                os.rename(os.path.join(root, name), os.path.join(root, sanitized_name))

if __name__ == "__main__":
    directory_path = './test/'
    rename_files_in_directory(directory_path)
    print("Renaming completed!")
```
Exploration/something-something-v2_vla_goal dataset hosted on Hugging Face and contributed by the HF Datasets community
License: https://choosealicense.com/licenses/other/
Something-Something V2 Frame Pairs Dataset
A video dataset derived from Something-Something V2 for future frame prediction tasks.
Dataset Description
This dataset contains short video clips of humans performing basic actions with everyday objects. Each video shows a simple action like "Pushing something from left to right" or "Picking something up".
Statistics
Total videos: 168,913
Video format: WebM
Resolution: Variable (typically 240p)
Duration: 2-6 seconds
… See the full description on the dataset page: https://huggingface.co/datasets/zengxianyu/sthv2-frame-pairs.
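For future frame prediction, each clip yields consecutive (current, next) frame pairs. A minimal sketch of that pairing (not the dataset's own loader; the array layout is an assumption):

```python
import numpy as np

def make_frame_pairs(frames):
    """Turn a decoded clip (T, H, W, C) into consecutive (current, next) frame pairs."""
    return [(frames[t], frames[t + 1]) for t in range(len(frames) - 1)]

clip = np.zeros((5, 240, 320, 3), dtype=np.uint8)  # dummy 5-frame clip
pairs = make_frame_pairs(clip)
print(len(pairs))  # 4 pairs from 5 frames
```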
License: Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Video Test-Time Adaptation for Action Recognition (CVPR 2023)
Project Page GitHub Repo Arxiv Paper
Dataset Description
This dataset repo contains the following two datasets:
Kinetics400_val_corruptions: 12 corruption types for the 19,877 validation videos of Kinetics400.
SSv2_val_corruptions: 12 corruption types for the 24,777 validation videos of Something-Something v2.
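The benchmark's 12 corruption types are not reproduced here; as an illustration of what one video corruption looks like, a Gaussian-noise sketch (the severity-to-sigma mapping is an assumption, not the benchmark's exact parameters):

```python
import numpy as np

def gaussian_noise(frames, severity=1, seed=0):
    """Add zero-mean Gaussian noise to uint8 video frames of shape (T, H, W, C)."""
    sigma = [8, 12, 18, 26, 38][severity - 1]  # illustrative severity scale
    rng = np.random.default_rng(seed)
    noisy = frames.astype(np.float32) + rng.normal(0.0, sigma, frames.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

clip = np.full((4, 32, 32, 3), 128, dtype=np.uint8)  # dummy mid-gray clip
corrupted = gaussian_noise(clip, severity=3)
print(corrupted.shape, corrupted.dtype)  # shape and dtype are preserved
```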
License: Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
## Overview
Bird Species 2 is a dataset for object detection tasks - it contains Bird Animal annotations for 304 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Something-Something V2 (SSV2) Video Prediction Subset Construction Documentation
🚀 1. Project Goal and Task Definition
This project aims to construct a high-quality, small-scale training and validation subset for instruction-driven video prediction tasks from the large Something-Something V2 training set.
Core Task Definition
We adopt the classic video prediction task definition: given a short frame sequence (F_1 to F_20) and a text instruction, predict the next frame of the sequence (F_21).

| Element | Description | Index | Image size |
| --- | --- | --- | --- |
| Input sequence (I_Input) | 20 consecutive frames used as the observation input | F_01 to F_20 | 128 × 128 |
| Target frame (I_Target) | Frame 21 of the sequence, used as the prediction ground truth | F_21 | 128 × 128 |
| Text instruction (T) | The video's original text label (the label field) | SSV2 train.json | string |
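A minimal sketch of slicing one training sample under this task definition, 20 input frames plus the 21st as target (not the project's actual pipeline; the array layout and pre-resized 128 × 128 clip are assumptions):

```python
import numpy as np

def make_sample(frames, instruction, context_len=20):
    """Split a clip (T, H, W, C) into (input sequence, target frame, instruction)."""
    assert len(frames) > context_len, "clip too short for this task definition"
    inputs = frames[:context_len]   # F_01 .. F_20, the observation input
    target = frames[context_len]    # F_21, the prediction ground truth
    return inputs, target, instruction

clip = np.zeros((30, 128, 128, 3), dtype=np.uint8)  # dummy pre-resized clip
inputs, target, text = make_sample(clip, "Pushing something from left to right")
print(inputs.shape, target.shape)  # (20, 128, 128, 3) (128, 128, 3)
```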
🔬 2. Precise Subset Extraction Strategy (Novel Contribution)
Something-Something V2 contains 174 action types and… See the full description on the dataset page: https://huggingface.co/datasets/RuLan03/Sthv2_500_3scope.