63 datasets found
  1. ImageNet-1k-1

    • kaggle.com
    Updated Apr 2, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sautkin (2023). ImageNet-1k-1 [Dataset]. https://www.kaggle.com/datasets/sautkin/imagenet1k1
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 2, 2023
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Sautkin
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset containing 500-999 classes of ImageNet Is part of the Imagenet dataset, all parts are: ImageNet-1k-0 - https://www.kaggle.com/datasets/sautkin/imagenet1k0 (0-499 classes); ImageNet-1k-1 - this; ImageNet-1k-2 - https://www.kaggle.com/datasets/sautkin/imagenet1k2 (0-499 classes); ImageNet-1k-3 - https://www.kaggle.com/datasets/sautkin/imagenet1k3 (500-999 classes); ImageNet-1k-valid - https://www.kaggle.com/datasets/sautkin/imagenet1kvalid (0-999 classes, test part)

  2. h

    imagenet-1k

    • huggingface.co
    Updated Apr 30, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Large Scale Visual Recognition Challenge (2022). imagenet-1k [Dataset]. https://huggingface.co/datasets/ILSVRC/imagenet-1k
    Explore at:
    Dataset updated
    Apr 30, 2022
    Dataset authored and provided by
    Large Scale Visual Recognition Challenge
    License

    https://choosealicense.com/licenses/other/https://choosealicense.com/licenses/other/

    Description

    Dataset Card for ImageNet

      Dataset Summary
    

    ILSVRC 2012, commonly known as 'ImageNet' is an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a "synonym set" or "synset". There are more than 100,000 synsets in WordNet, majority of them are nouns (80,000+). ImageNet aims to provide on average 1000 images to illustrate each synset. Images of each concept are… See the full description on the dataset page: https://huggingface.co/datasets/ILSVRC/imagenet-1k.

  3. T

    imagenet2012

    • tensorflow.org
    Updated Jun 1, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). imagenet2012 [Dataset]. https://www.tensorflow.org/datasets/catalog/imagenet2012
    Explore at:
    Dataset updated
    Jun 1, 2024
    Description

    ILSVRC 2012, commonly known as 'ImageNet' is an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a "synonym set" or "synset". There are more than 100,000 synsets in WordNet, majority of them are nouns (80,000+). In ImageNet, we aim to provide on average 1000 images to illustrate each synset. Images of each concept are quality-controlled and human-annotated. In its completion, we hope ImageNet will offer tens of millions of cleanly sorted images for most of the concepts in the WordNet hierarchy.

    The test split contains 100K images but no labels because no labels have been publicly released. We provide support for the test split from 2012 with the minor patch released on October 10, 2019. In order to manually download this data, a user must perform the following operations:

    1. Download the 2012 test split available here.
    2. Download the October 10, 2019 patch. There is a Google Drive link to the patch provided on the same page.
    3. Combine the two tar-balls, manually overwriting any images in the original archive with images from the patch. According to the instructions on image-net.org, this procedure overwrites just a few images.

    The resulting tar-ball may then be processed by TFDS.

    To assess the accuracy of a model on the ImageNet test split, one must run inference on all images in the split, export those results to a text file that must be uploaded to the ImageNet evaluation server. The maintainers of the ImageNet evaluation server permits a single user to submit up to 2 submissions per week in order to prevent overfitting.

    To evaluate the accuracy on the test split, one must first create an account at image-net.org. This account must be approved by the site administrator. After the account is created, one can submit the results to the test server at https://image-net.org/challenges/LSVRC/eval_server.php The submission consists of several ASCII text files corresponding to multiple tasks. The task of interest is "Classification submission (top-5 cls error)". A sample of an exported text file looks like the following:

    771 778 794 387 650
    363 691 764 923 427
    737 369 430 531 124
    755 930 755 59 168
    

    The export format is described in full in "readme.txt" within the 2013 development kit available here: https://image-net.org/data/ILSVRC/2013/ILSVRC2013_devkit.tgz Please see the section entitled "3.3 CLS-LOC submission format". Briefly, the format of the text file is 100,000 lines corresponding to each image in the test split. Each line of integers correspond to the rank-ordered, top 5 predictions for each test image. The integers are 1-indexed corresponding to the line number in the corresponding labels file. See labels.txt.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('imagenet2012', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more informations on tensorflow_datasets.

    https://storage.googleapis.com/tfds-data/visualization/fig/imagenet2012-5.1.0.png" alt="Visualization" width="500px">

  4. h

    imagenet-1k-32x32

    • huggingface.co
    Updated Sep 15, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Benjamin Paine (2024). imagenet-1k-32x32 [Dataset]. https://huggingface.co/datasets/benjamin-paine/imagenet-1k-32x32
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 15, 2024
    Authors
    Benjamin Paine
    License

    https://choosealicense.com/licenses/other/https://choosealicense.com/licenses/other/

    Description

    Repack Information

    This repository contains a complete repack of ILSVRC/imagenet-1k in Parquet format with the following data transformations:

    Images were center-cropped to square to the minimum height/width dimension. Images were then rescaled to 256x256 using Lanczos resampling. This dataset is available at benjamin-paine/imagenet-1k-256x256 Images were then rescaled to 128x128 using Lanczos resampling. This dataset is available at benjamin-paine/imagenet-1k-128x128. Images were… See the full description on the dataset page: https://huggingface.co/datasets/benjamin-paine/imagenet-1k-32x32.

  5. T

    imagenet2012_real

    • tensorflow.org
    Updated Jun 1, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). imagenet2012_real [Dataset]. https://www.tensorflow.org/datasets/catalog/imagenet2012_real
    Explore at:
    Dataset updated
    Jun 1, 2024
    Description

    This dataset contains ILSVRC-2012 (ImageNet) validation images augmented with a new set of "Re-Assessed" (ReaL) labels from the "Are we done with ImageNet" paper, see https://arxiv.org/abs/2006.07159. These labels are collected using the enhanced protocol, resulting in multi-label and more accurate annotations.

    Important note: about 3500 examples contain no label, these should be excluded from the averaging when computing the accuracy. One possible way of doing this is with the following NumPy code:

    is_correct = [pred in real_labels[i] for i, pred in enumerate(predictions) if real_labels[i]]
    real_accuracy = np.mean(is_correct)
    

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('imagenet2012_real', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more informations on tensorflow_datasets.

    https://storage.googleapis.com/tfds-data/visualization/fig/imagenet2012_real-1.0.0.png" alt="Visualization" width="500px">

  6. g

    ImageNet Micro

    • gts.ai
    json
    Updated Jul 3, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    GTS (2024). ImageNet Micro [Dataset]. https://gts.ai/dataset-download/imagenet-micro/
    Explore at:
    jsonAvailable download formats
    Dataset updated
    Jul 3, 2024
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Explore the ImageNet Micro dataset, a curated subset of ImageNet Mini featuring 999 classes with approximately 30 training images and 4 validation/testing images per class.

  7. a

    Tiny ImageNet

    • datasets.activeloop.ai
    • huggingface.co
    deeplake
    Updated Apr 2, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ya Le and Xuan S. Yang (2022). Tiny ImageNet [Dataset]. https://datasets.activeloop.ai/docs/ml/datasets/tiny-imagenet-dataset/
    Explore at:
    deeplakeAvailable download formats
    Dataset updated
    Apr 2, 2022
    Authors
    Ya Le and Xuan S. Yang
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The Tiny ImageNet Dataset is a dataset of 100,000 tiny (64x64) images of objects. It is a popular dataset for image classification and object detection research. The dataset consists of 200 different classes, each of which has 500 images.

  8. imagenet classes

    • kaggle.com
    Updated Nov 5, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    SKYap (2019). imagenet classes [Dataset]. https://www.kaggle.com/datasets/skyap79/imagenet-classes
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 5, 2019
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    SKYap
    Description

    Dataset

    This dataset was created by SKYap

    Contents

  9. a

    ImageNet LSVRC 2012 Training Set (Object Detection)

    • academictorrents.com
    bittorrent
    Updated Oct 16, 2015
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei (2015). ImageNet LSVRC 2012 Training Set (Object Detection) [Dataset]. https://academictorrents.com/details/a306397ccf9c2ead27155983c254227c0fd938e2
    Explore at:
    bittorrent(147897477120)Available download formats
    Dataset updated
    Oct 16, 2015
    Dataset authored and provided by
    Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei
    License

    https://academictorrents.com/nolicensespecifiedhttps://academictorrents.com/nolicensespecified

    Description

    A BitTorrent file to download data with the title 'ImageNet LSVRC 2012 Training Set (Object Detection)'

  10. R

    Data from: Imagenet Dataset

    • universe.roboflow.com
    zip
    Updated Jul 3, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    tra (2025). Imagenet Dataset [Dataset]. https://universe.roboflow.com/tra-wxqt8/imagenet-fgrw1
    Explore at:
    zipAvailable download formats
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    tra
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    1 Bounding Boxes
    Description

    ImageNet

    ## Overview
    
    ImageNet is a dataset for object detection tasks - it contains 1 annotations for 1,002 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).
    
  11. a

    Imagenet Full (Fall 2011 release)

    • academictorrents.com
    bittorrent
    Updated Oct 16, 2015
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Jia Deng and Wei Dong and Richard Socher and Li-Jia Li and Kai Li and Li Fei-Fei (2015). Imagenet Full (Fall 2011 release) [Dataset]. https://academictorrents.com/details/564a77c1e1119da199ff32622a1609431b9f1c47
    Explore at:
    bittorrent(1309848811520)Available download formats
    Dataset updated
    Oct 16, 2015
    Dataset authored and provided by
    Jia Deng and Wei Dong and Richard Socher and Li-Jia Li and Kai Li and Li Fei-Fei
    License

    https://academictorrents.com/nolicensespecifiedhttps://academictorrents.com/nolicensespecified

    Description

    ImageNet is an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a "synonym set" or "synset". There are more than 100,000 synsets in WordNet, majority of them are nouns (80,000+). In ImageNet, we aim to provide on average 1000 images to illustrate each synset. Images of each concept are quality-controlled and human-annotated. In its completion, we hope ImageNet will offer tens of millions of cleanly sorted images for most of the concepts in the WordNet hierarchy. For more information, see

  12. h

    imagenet-22k-wds

    • huggingface.co
    Updated Jan 29, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    PyTorch Image Models (2024). imagenet-22k-wds [Dataset]. https://huggingface.co/datasets/timm/imagenet-22k-wds
    Explore at:
    Dataset updated
    Jan 29, 2024
    Dataset authored and provided by
    PyTorch Image Models
    License

    https://choosealicense.com/licenses/other/https://choosealicense.com/licenses/other/

    Description

    Dataset Summary

    This is a copy of the full ImageNet dataset consisting of all of the original 21841 clases. It also contains labels in a separate field for the '12k' subset described at at (https://github.com/rwightman/imagenet-12k, https://huggingface.co/datasets/timm/imagenet-12k-wds) This dataset is from the original fall11 ImageNet release which has been replaced by the winter21 release which removes close to 3000 synsets containing people, a number of these are of an offensive… See the full description on the dataset page: https://huggingface.co/datasets/timm/imagenet-22k-wds.

  13. T

    imagenet_v2

    • tensorflow.org
    • tensorflow.google.cn
    Updated Jun 1, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). imagenet_v2 [Dataset]. https://www.tensorflow.org/datasets/catalog/imagenet_v2
    Explore at:
    Dataset updated
    Jun 1, 2024
    Description

    ImageNet-v2 is an ImageNet test set (10 per class) collected by closely following the original labelling protocol. Each image has been labelled by at least 10 MTurk workers, possibly more, and depending on the strategy used to select which images to include among the 10 chosen for the given class there are three different versions of the dataset. Please refer to section four of the paper for more details on how the different variants were compiled.

    The label space is the same as that of ImageNet2012. Each example is represented as a dictionary with the following keys:

    • 'image': The image, a (H, W, 3)-tensor.
    • 'label': An integer in the range [0, 1000).
    • 'file_name': A unique sting identifying the example within the dataset.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('imagenet_v2', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more informations on tensorflow_datasets.

    https://storage.googleapis.com/tfds-data/visualization/fig/imagenet_v2-matched-frequency-3.0.0.png" alt="Visualization" width="500px">

  14. n

    ImageNet ILSVRC2012

    • scidm.nchc.org.tw
    Updated Oct 15, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). ImageNet ILSVRC2012 [Dataset]. https://scidm.nchc.org.tw/dataset/imagenet-ilsvrc2012
    Explore at:
    Dataset updated
    Oct 15, 2024
    Description

    ImageNet is an image database organized according to the WordNet hierarchy (currently only the nouns), in which each node of the hierarchy is depicted by hundreds and thousands of images. Currently we have an average of over five hundred images per node. We hope ImageNet will become a useful resource for researchers, educators, students and all of you who share our passion for pictures.

  15. T

    imagenet_r

    • tensorflow.org
    Updated Jun 1, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). imagenet_r [Dataset]. https://www.tensorflow.org/datasets/catalog/imagenet_r
    Explore at:
    Dataset updated
    Jun 1, 2024
    Description

    ImageNet-R is a set of images labelled with ImageNet labels that were obtained by collecting art, cartoons, deviantart, graffiti, embroidery, graphics, origami, paintings, patterns, plastic objects, plush objects, sculptures, sketches, tattoos, toys, and video game renditions of ImageNet classes. ImageNet-R has renditions of 200 ImageNet classes resulting in 30,000 images. by collecting new data and keeping only those images that ResNet-50 models fail to correctly classify. For more details please refer to the paper.

    The label space is the same as that of ImageNet2012. Each example is represented as a dictionary with the following keys:

    • 'image': The image, a (H, W, 3)-tensor.
    • 'label': An integer in the range [0, 1000).
    • 'file_name': A unique sting identifying the example within the dataset.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('imagenet_r', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more informations on tensorflow_datasets.

    https://storage.googleapis.com/tfds-data/visualization/fig/imagenet_r-0.2.0.png" alt="Visualization" width="500px">

  16. a

    ImageNet LSVRC 2012 Validation Set (Object Detection)

    • academictorrents.com
    bittorrent
    Updated Oct 16, 2015
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei (2015). ImageNet LSVRC 2012 Validation Set (Object Detection) [Dataset]. https://academictorrents.com/details/5d6d0df7ed81efd49ca99ea4737e0ae5e3a5f2e5
    Explore at:
    bittorrent(6744924160)Available download formats
    Dataset updated
    Oct 16, 2015
    Dataset authored and provided by
    Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei
    License

    https://academictorrents.com/nolicensespecifiedhttps://academictorrents.com/nolicensespecified

    Description

    A BitTorrent file to download data with the title 'ImageNet LSVRC 2012 Validation Set (Object Detection)'

  17. h

    mini-imagenet

    • huggingface.co
    Updated Dec 6, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    PyTorch Image Models (2024). mini-imagenet [Dataset]. https://huggingface.co/datasets/timm/mini-imagenet
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 6, 2024
    Dataset authored and provided by
    PyTorch Image Models
    License

    https://choosealicense.com/licenses/other/https://choosealicense.com/licenses/other/

    Description

    Dataset Description

    A mini version of ImageNet-1k with 100 of 1000 classes present. Unlike some 'mini' variants this one includes the original images at their original sizes. Many such subsets downsample to 84x84 or other smaller resolutions.

      Data Splits
    
    
    
    
    
      Train
    

    50000 samples from ImageNet-1k train split

      Validation
    

    10000 samples from ImageNet-1k train split

      Test
    

    5000 samples from ImageNet-1k validation split (all 50 samples per class)… See the full description on the dataset page: https://huggingface.co/datasets/timm/mini-imagenet.

  18. f

    Tiny_Imagenet

    • figshare.com
    application/x-rar
    Updated May 30, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Xiujian Hu (2023). Tiny_Imagenet [Dataset]. http://doi.org/10.6084/m9.figshare.22012529.v1
    Explore at:
    application/x-rarAvailable download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    figshare
    Authors
    Xiujian Hu
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    iny Imagenet has 200 Classes, each class has 500 traininig images, 50 Validation Images and 50 test images. Label Classes and Bounding Boxes are provided. More details can be found at https://tiny-imagenet.herokuapp.com/",

    This challenge is part of Stanford Class CS 231N

  19. a

    ImageNet Large Scale Visual Recognition Challenge (V2017)

    • academictorrents.com
    bittorrent
    Updated Mar 6, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei (2019). ImageNet Large Scale Visual Recognition Challenge (V2017) [Dataset]. https://academictorrents.com/details/943977d8c96892d24237638335e481f3ccd54cfb
    Explore at:
    bittorrent(166022728827)Available download formats
    Dataset updated
    Mar 6, 2019
    Dataset authored and provided by
    Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei
    License

    https://academictorrents.com/nolicensespecifiedhttps://academictorrents.com/nolicensespecified

    Description

    A BitTorrent file to download data with the title 'ImageNet Large Scale Visual Recognition Challenge (V2017)'

  20. Z

    Data from: PASS: An ImageNet replacement for self-supervised pretraining...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 6, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Asano, Yuki Markus (2022). PASS: An ImageNet replacement for self-supervised pretraining without humans [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_5267181
    Explore at:
    Dataset updated
    Jun 6, 2022
    Dataset provided by
    Vedaldi, Andrea
    Zisserman, Andrew
    Rupprecht, Christian
    Asano, Yuki Markus
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Computer vision has long relied on ImageNet and other large datasets of images sampled from the Internet for pretraining models. However, these datasets have ethical and technical shortcomings, such as containing personal information taken without consent, unclear license usage, biases, and, in some cases, even problematic image content. On the other hand, state-of-the-art pretraining is nowadays obtained with unsupervised methods, meaning that labelled datasets such as ImageNet may not be necessary, or perhaps not even optimal, for model pretraining. We thus propose an unlabelled dataset PASS: Pictures without humAns for Self-Supervision. PASS only contains images with CC-BY license and complete attribution metadata, addressing the copyright issue. Most importantly, it contains no images of people at all, and also avoids other types of images that are problematic for data protection or ethics. We show that PASS can be used for pretraining with methods such as MoCo-v2, SwAV and DINO. In the transfer learning setting, it yields similar downstream performances to ImageNet pretraining even on tasks that involve humans, such as human pose estimation. PASS does not make existing datasets obsolete, as for instance it is insufficient for benchmarking. However, it shows that model pretraining is often possible while using safer data, and it also provides the basis for a more robust evaluation of pretraining methods.

    A simple download script is here: https://github.com/yukimasano/PASS/blob/main/download.sh Visit our webpage here: https://www.robots.ox.ac.uk/~vgg/research/pass/

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Sautkin (2023). ImageNet-1k-1 [Dataset]. https://www.kaggle.com/datasets/sautkin/imagenet1k1
Organization logo

ImageNet-1k-1

the first part of the dataset containing the second 500 classes

Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 2, 2023
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Sautkin
License

https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

Description

This dataset containing 500-999 classes of ImageNet Is part of the Imagenet dataset, all parts are: ImageNet-1k-0 - https://www.kaggle.com/datasets/sautkin/imagenet1k0 (0-499 classes); ImageNet-1k-1 - this; ImageNet-1k-2 - https://www.kaggle.com/datasets/sautkin/imagenet1k2 (0-499 classes); ImageNet-1k-3 - https://www.kaggle.com/datasets/sautkin/imagenet1k3 (500-999 classes); ImageNet-1k-valid - https://www.kaggle.com/datasets/sautkin/imagenet1kvalid (0-999 classes, test part)

Search
Clear search
Close search
Google apps
Main menu