WebVid 10M Classified (100k)
Each description from the WebVid 10M dataset is passed through Llama 3.3 70B, which classifies it as either action or no_action.
If a description is classified as action, it is rewritten more clearly. Otherwise, the rewritten description is set to none.
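The classify-then-rewrite step described above can be sketched roughly as follows. This is only an illustration, not the pipeline actually used to build the dataset. The record keys (description, label, rewritten), the function names, and the toy_llm stand-in are all assumptions. The real prompt and the real Llama 3.3 70B call are not given here, so the model is replaced by a crude keyword check.

```python
from typing import Callable, Dict, Tuple

ALLOWED_LABELS = ("action", "no_action")


def classify_description(
    description: str,
    llm: Callable[[str], Tuple[str, str]],
) -> Dict[str, str]:
    """Classify one caption; keep the rewrite only when the label is 'action'.

    `llm` is any callable returning (label, rewritten_text). The record keys
    below are hypothetical and may not match the real dataset's columns.
    """
    label, rewrite = llm(description)
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    # Captions that are not actions get the placeholder string "none".
    rewritten = rewrite.strip() if label == "action" else "none"
    return {"description": description, "label": label, "rewritten": rewritten}


def toy_llm(description: str) -> Tuple[str, str]:
    """Stand-in for the model call (assumption): a crude keyword check."""
    verbs = ("walking", "running", "pouring", "cutting")
    if any(verb in description.lower() for verb in verbs):
        return "action", f"A clip showing: {description.strip().lower()}"
    return "no_action", ""


if __name__ == "__main__":
    for caption in ["Woman walking on the beach", "Aerial view of mountains"]:
        print(classify_description(caption, toy_llm))
```

Passing the model in as a callable keeps the labeling rule (action keeps its rewrite, no_action maps to none) separate from whichever inference backend is used.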
WebVid is a large-scale dataset of short videos with textual descriptions sourced from the web. The videos are diverse and rich in their content.
WebVid-10M contains 10.7M video-caption pairs, totaling 52K video hours.
To use this dataset:
import tensorflow_datasets as tfds

# Load the training split and print the first four examples.
ds = tfds.load('webvid', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
The dataset used for training the video model consists of WebVid-10M, a large-scale dataset of short videos with textual descriptions.