Cleaned-up text for 40+ Wikipedia language editions of pages corresponding to entities. The datasets have train/dev/test splits per language. The dataset is cleaned up by page filtering to remove disambiguation pages, redirect pages, deleted pages, and non-entity pages. Each example contains the Wikidata ID of the entity and the full Wikipedia article after page processing that removes non-content sections and structured objects. The language models trained on this corpus (41 monolingual models and 2 multilingual models) can be found at https://tfhub.dev/google/collections/wiki40b-lm/1.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('wiki40b', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
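Each element is a dictionary of tensors. A minimal sketch for inspecting one example (the feature names 'wikidata_id' and 'text' are assumed from the TFDS catalog; check info.features for your version):
import tensorflow_datasets as tfds
ds = tfds.load('wiki40b', split='train')
for ex in ds.take(1):
  # Feature names assumed: 'wikidata_id' and 'text'; both are raw byte strings.
  wikidata_id = ex['wikidata_id'].numpy().decode('utf-8')
  text = ex['text'].numpy().decode('utf-8')
  print(wikidata_id, text[:200])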
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('cifar10', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/cifar10-3.0.2.png
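For training pipelines it is usually more convenient to get (image, label) pairs directly. A minimal sketch using the as_supervised option with a simple normalization step (batch size and shuffle buffer are illustrative choices):
import tensorflow as tf
import tensorflow_datasets as tfds
# as_supervised=True yields (image, label) tuples instead of feature dicts.
ds = tfds.load('cifar10', split='train', as_supervised=True)
def normalize(image, label):
  # Scale pixel values from [0, 255] to [0, 1].
  return tf.cast(image, tf.float32) / 255.0, label
ds = ds.map(normalize).shuffle(10000).batch(128).prefetch(tf.data.AUTOTUNE)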
A large set of images of flowers.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('tf_flowers', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/tf_flowers-3.0.1.png
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
This dataset was created by Yoga Yudha Tama
Released under Apache 2.0
Wikipedia dataset containing cleaned articles of all languages. The datasets are built from the Wikipedia dump (https://dumps.wikimedia.org/) with one split per language. Each example contains the content of one full Wikipedia article with cleaning to strip markdown and unwanted sections (references, etc.).
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('wikipedia', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
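Each language is exposed as its own config, and each example holds the article title and body. A minimal sketch (the config name '20201201.en' is an assumption; the feature names 'title' and 'text' follow the TFDS catalog, so check info.features and the dump dates available in your TFDS version):
import tensorflow_datasets as tfds
# Load a specific language/dump config (config name is an assumption).
ds = tfds.load('wikipedia/20201201.en', split='train')
for ex in ds.take(1):
  # Feature names assumed: 'title' and 'text'.
  print(ex['title'].numpy().decode('utf-8'))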
The MNIST database of handwritten digits.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('mnist', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/mnist-3.0.1.png
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Tensorflow is a dataset for object detection tasks - it contains Objects annotations for 496 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
TFDS provides a collection of ready-to-use datasets for use with TensorFlow, Jax, and other Machine Learning frameworks.
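These datasets can also be consumed outside of TensorFlow. A minimal sketch converting a split to NumPy arrays via tfds.as_numpy, which is handy for JAX or NumPy-based code (the mnist split and the batch_size=-1 full-batch load are illustrative choices):
import tensorflow_datasets as tfds
# batch_size=-1 loads the whole split as one batch; tfds.as_numpy converts it to NumPy.
ds = tfds.load('mnist', split='test', batch_size=-1, as_supervised=True)
images, labels = tfds.as_numpy(ds)
print(images.shape, labels.shape)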
The dataset is in TFRecord format. Use the following code to parse the data into a TensorFlow-usable format:
import tensorflow as tf

# Path to the TFRecord file (adjust to your environment).
PATH = '/kaggle/working/tf_malaria.tfrecord'

full_data = tf.data.TFRecordDataset(filenames=[PATH])

def parse_tfrecords(example):
  # Each record stores a JPEG-encoded image and an integer label.
  feature_description = {
      "images": tf.io.FixedLenFeature([], tf.string),
      "labels": tf.io.FixedLenFeature([], tf.int64),
  }
  example = tf.io.parse_single_example(example, feature_description)
  example["images"] = tf.io.decode_jpeg(example["images"], channels=3)
  return example["images"], example["labels"]

parsed_full_data = full_data.map(parse_tfrecords)
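From here the parsed dataset can be fed to a model after batching. A short follow-up sketch (the shuffle buffer and batch size are illustrative choices, not part of the original):
# Shuffle, batch, and prefetch before training (values are illustrative).
train_ds = (
    parsed_full_data
    .shuffle(2048)
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)
)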
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
This dataset was created by Fernandosr85
Released under Apache 2.0
This dataset was created by Robert Sizemore
This dataset consists of 101 food categories, with 101,000 images. For each class, 250 manually reviewed test images are provided as well as 750 training images. On purpose, the training images were not cleaned, and thus still contain some amount of noise. This comes mostly in the form of intense colors and sometimes wrong labels. All images were rescaled to have a maximum side length of 512 pixels.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('food101', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/food101-2.0.0.png
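Because only the maximum side length is fixed (512 px), image shapes vary, so a typical first step is resizing to a fixed input size. A minimal sketch (the 224x224 target and batch size are illustrative choices):
import tensorflow as tf
import tensorflow_datasets as tfds
ds = tfds.load('food101', split='train', as_supervised=True)
def resize(image, label):
  # Resize variable-sized images to a fixed 224x224 input (illustrative size).
  return tf.image.resize(image, [224, 224]), label
ds = ds.map(resize).batch(64).prefetch(tf.data.AUTOTUNE)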
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Wildfire Smoke.v1 Raw.tensorflow is a dataset for object detection tasks - it contains Smoke annotations for 1,253 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
TensorFlow 2 is a dataset for object detection tasks - it contains Defect annotations for 851 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
This dataset was created by nadare
A TensorFlow implementation of the UNETR model trained on 24k images from the UW-Madison dataset.
Paper: https://arxiv.org/pdf/2103.10504.pdf
Competition data: https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation
Credit: the dataset image is taken from the competition data page, https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
This dataset was created by Henry Javier
Released under Apache 2.0
This dataset was created by HarryTan
Released under Other (specified in description)
The Oxford-IIIT pet dataset is a 37-category pet image dataset with roughly 200 images for each class. The images have large variations in scale, pose, and lighting. All images have an associated ground-truth annotation of breed and species. Additionally, head bounding boxes are provided for the training split, allowing this dataset to be used for simple object detection tasks. In the test split, the bounding boxes are empty.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('oxford_iiit_pet', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
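Which annotations are exposed (for example, whether the head bounding boxes appear as a feature) depends on the dataset version, so it helps to inspect the metadata first. A minimal sketch using with_info:
import tensorflow_datasets as tfds
# with_info=True also returns the DatasetInfo object describing features and splits.
ds, info = tfds.load('oxford_iiit_pet', split='train', with_info=True)
print(info.features)
print(info.splits)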
Sourced from: https://www.tensorflow.org/datasets/catalog/wider_face
WIDER FACE is a face detection benchmark dataset whose images are selected from the publicly available WIDER dataset. We choose 32,203 images and label 393,703 faces with a high degree of variability in scale, pose, and occlusion, as depicted in the sample images. The WIDER FACE dataset is organized based on 61 event classes. For each event class, we randomly select 40%/10%/50% of the data as training, validation, and testing sets. We adopt the same evaluation metric employed in the PASCAL VOC dataset. Similar to the MALF and Caltech datasets, we do not release bounding-box ground truth for the test images. Users are required to submit final prediction files, which we shall then evaluate.
Homepage: http://shuoyang1213.me/WIDERFACE/
Source code: tfds.object_detection.WiderFace
Versions:
0.1.0 (default): No release notes.
Download size: 3.42 GiB
Dataset size: 3.45 GiB
Auto-cached (documentation): No
Splits:
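A minimal loading sketch; each example pairs an image with a variable-length list of face annotations (the nested 'faces'/'bbox' feature names follow the TFDS catalog but should be treated as assumptions for your installed version):
import tensorflow_datasets as tfds
ds = tfds.load('wider_face', split='train')
for ex in ds.take(1):
  # Feature names assumed: 'image' plus a 'faces' sequence with 'bbox' coordinates.
  print(ex['image'].shape)
  print(ex['faces']['bbox'])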