3 datasets found
  1. bigbench_jsonl

    • huggingface.co
    Updated Feb 5, 2023
    + more versions
    Cite
    NJUDeepEngine (2023). bigbench_jsonl [Dataset]. https://huggingface.co/datasets/NJUDeepEngine/bigbench_jsonl
    Explore at:
    Dataset updated
    Feb 5, 2023
    Dataset authored and provided by
    NJUDeepEngine
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    BIG-Bench, but it doesn't require the hellish dependencies (tensorflow, pypi-bigbench, protobuf) of the official version.

    dataset = load_dataset("tasksource/bigbench", 'movie_recommendation')

    Code to reproduce: https://colab.research.google.com/drive/1MKdLdF7oqrSQCeavAcsEnPdI85kD0LzU?usp=sharing Datasets are capped to 50k examples to keep things light. I also removed the default split when a train split was available, to save space, since default = train + val. @article{srivastava2022beyond… See the full description on the dataset page: https://huggingface.co/datasets/NJUDeepEngine/bigbench_jsonl.

  2. plant_village

    • tensorflow.org
    • opendatalab.com
    • +2more
    Updated Jun 1, 2024
    Cite
    (2024). plant_village [Dataset]. http://identifiers.org/arxiv:1511.08060
    Explore at:
    Dataset updated
    Jun 1, 2024
    Description

    The PlantVillage dataset consists of 54303 healthy and unhealthy leaf images divided into 38 categories by species and disease.

    NOTE: The original dataset is no longer available from the original source (plantvillage.org), so we obtained the unaugmented dataset from a paper that used it and republished it. We also dropped images with the Background_without_leaves label, because these were not present in the original dataset.

    Original paper URL: https://arxiv.org/abs/1511.08060 Dataset URL: https://data.mendeley.com/datasets/tywbtsjrjv/1

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('plant_village', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/plant_village-1.0.2.png

  3. MNIST Dataset

    • kaggle.com
    • opendatalab.com
    • +4more
    zip
    Updated Jan 8, 2019
    Cite
    Hojjat Khodabakhsh (2019). MNIST Dataset [Dataset]. https://www.kaggle.com/datasets/hojjatk/mnist-dataset
    Explore at:
    Available download formats: zip (23112702 bytes)
    Dataset updated
    Jan 8, 2019
    Authors
    Hojjat Khodabakhsh
    Description

    Context

    MNIST is a subset of a larger set available from NIST (this copy is mirrored from http://yann.lecun.com/exdb/mnist/)

    Content

    The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. Four files are available:

    • train-images-idx3-ubyte.gz: training set images (9912422 bytes)
    • train-labels-idx1-ubyte.gz: training set labels (28881 bytes)
    • t10k-images-idx3-ubyte.gz: test set images (1648877 bytes)
    • t10k-labels-idx1-ubyte.gz: test set labels (4542 bytes)

    How to read

    See sample MNIST reader
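
    The idx-ubyte files above can also be parsed with a few lines of standard-library Python. A minimal sketch, assuming the standard idx layout (big-endian 32-bit header fields, unsigned-byte payload); the function names are my own, not from the linked sample reader:

    ```python
    import gzip
    import struct

    def parse_idx_images(data: bytes):
        """Parse idx3-ubyte image bytes into a list of raw per-image byte strings.

        Header: magic 0x00000803 (2051), image count, rows, cols as
        big-endian 32-bit ints; pixel data follows as unsigned bytes.
        """
        magic, n, rows, cols = struct.unpack(">IIII", data[:16])
        assert magic == 2051, "not an idx3-ubyte image file"
        size = rows * cols
        body = data[16:]
        return [body[i * size:(i + 1) * size] for i in range(n)]

    def parse_idx_labels(data: bytes):
        """Parse idx1-ubyte label bytes: magic 0x00000801 (2049), count, then one byte per label."""
        magic, n = struct.unpack(">II", data[:8])
        assert magic == 2049, "not an idx1-ubyte label file"
        return list(data[8:8 + n])

    # Example usage (file path is illustrative; the .gz files are listed above):
    # with gzip.open("train-images-idx3-ubyte.gz", "rb") as f:
    #     images = parse_idx_images(f.read())
    ```

    Decompress the .gz files (or open them with gzip as shown) before parsing; the byte counts listed above are the compressed sizes.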

    Acknowledgements

    • Yann LeCun, Courant Institute, NYU
    • Corinna Cortes, Google Labs, New York
    • Christopher J.C. Burges, Microsoft Research, Redmond

    Inspiration

    Many methods have been tested with this training set and test set (see http://yann.lecun.com/exdb/mnist/ for more details)

