Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
BIG-Bench, but without the hellish dependencies (tensorflow, pypi-bigbench, protobuf) of the official version. To load a task:
from datasets import load_dataset
dataset = load_dataset("tasksource/bigbench", 'movie_recommendation')
Code to reproduce: https://colab.research.google.com/drive/1MKdLdF7oqrSQCeavAcsEnPdI85kD0LzU?usp=sharing
Datasets are capped at 50k examples to keep things light. When a train split was available, I also removed the default split to save space, since default = train + val.
@article{srivastava2022beyond…
See the full description on the dataset page: https://huggingface.co/datasets/NJUDeepEngine/bigbench_jsonl.
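The capping and split pruning described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from the linked notebook: the helper names `cap_examples` and `prune_splits`, the split name "validation", and the toy rows are all hypothetical.

```python
from typing import Dict, List

MAX_EXAMPLES = 50_000  # cap stated in the description


def cap_examples(examples: List[dict], limit: int = MAX_EXAMPLES) -> List[dict]:
    """Keep at most `limit` examples, preserving order."""
    return examples[:limit]


def prune_splits(
    splits: Dict[str, List[dict]], limit: int = MAX_EXAMPLES
) -> Dict[str, List[dict]]:
    """Drop the redundant "default" split when "train" exists, then cap each split."""
    kept = {
        name: rows
        for name, rows in splits.items()
        # "default" is train + val, so it adds nothing once "train" is present.
        if not (name == "default" and "train" in splits)
    }
    return {name: cap_examples(rows, limit) for name, rows in kept.items()}


# Toy task with hypothetical rows; real tasks come from BIG-Bench.
toy = {
    "train": [{"id": i} for i in range(8)],
    "validation": [{"id": i} for i in range(8, 10)],
    "default": [{"id": i} for i in range(10)],
}
pruned = prune_splits(toy, limit=5)
print({name: len(rows) for name, rows in pruned.items()})
```

A task that ships only a default split keeps it, since there is no train split to make it redundant.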
The PlantVillage dataset consists of 54,303 healthy and unhealthy leaf images divided into 38 categories by species and disease.
NOTE: The original dataset is not available from the original source (plantvillage.org), so we use the unaugmented dataset from a paper that used it and republished it. We also dropped images with the Background_without_leaves label, because these were not present in the original dataset.
Original paper URL: https://arxiv.org/abs/1511.08060
Dataset URL: https://data.mendeley.com/datasets/tywbtsjrjv/1
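The label filtering mentioned in the note can be pictured as follows. This is a minimal sketch, assuming samples are available as (path, label) pairs; the function name `drop_background` and the file names and example labels are hypothetical, and this is not the actual tensorflow_datasets builder code.

```python
from typing import Iterable, Iterator, Tuple

DROPPED_LABEL = "Background_without_leaves"  # label named in the note above


def drop_background(samples: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """Yield (path, label) pairs whose label is not the background class."""
    for path, label in samples:
        if label != DROPPED_LABEL:
            yield path, label


# Hypothetical file names and labels, for illustration only.
samples = [
    ("img_0001.jpg", "Apple___healthy"),
    ("img_0002.jpg", "Background_without_leaves"),
    ("img_0003.jpg", "Tomato___Late_blight"),
]
kept = list(drop_background(samples))
print(len(kept))  # 2
```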
To use this dataset:
import tensorflow_datasets as tfds

# Load the training split and print the first four examples.
ds = tfds.load('plant_village', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/plant_village-1.0.2.png
MNIST is a subset of a larger set available from NIST (copied from http://yann.lecun.com/exdb/mnist/).
The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. Four files are available from that page.
Many methods have been tested with this training set and test set (see http://yann.lecun.com/exdb/mnist/ for more details).
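For readers who download the raw files, here is a rough sketch of reading one, assuming the files use the IDX layout documented on the source page (two zero bytes, a type byte, a dimension count, then big-endian 32-bit dimension sizes) and have already been decompressed. The function name `parse_idx` is hypothetical, and the buffers below are tiny synthetic stand-ins, not real MNIST data.

```python
import struct
from typing import Tuple


def parse_idx(data: bytes) -> Tuple[Tuple[int, ...], bytes]:
    """Parse an IDX buffer of unsigned bytes and return (shape, payload)."""
    zero, dtype, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0 or dtype != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    shape = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    payload = data[4 + 4 * ndim :]
    expected = 1
    for dim in shape:
        expected *= dim
    if len(payload) != expected:
        raise ValueError("payload size does not match header")
    return shape, payload


# Synthetic buffers: two 2x2 "images" and their two labels.
images_blob = struct.pack(">HBBIII", 0, 0x08, 3, 2, 2, 2) + bytes(range(8))
labels_blob = struct.pack(">HBBI", 0, 0x08, 1, 2) + bytes([7, 3])

image_shape, pixels = parse_idx(images_blob)
label_shape, labels = parse_idx(labels_blob)
print(image_shape, label_shape, list(labels))
```

On the real training image file, the same function would be expected to report a shape of (60000, 28, 28), matching the 60,000 training examples mentioned above.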