OpenCV implementation of Torchvision transforms, from https://github.com/YU-Zhiyang/opencv_transforms_torchvision
Why? Because OpenCV is faster than PIL.
To use it:

import sys

package_path = '../input/cvtorchvision/opencv_transforms_torchvision-master/'
sys.path.append(package_path)
from cvtorchvision import cvtransforms
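A minimal usage sketch, assuming the transform class names mirror torchvision's (as the repo README suggests) and using a hypothetical image path; cvtransforms operate on numpy arrays loaded with cv2 rather than PIL images:

import cv2
from cvtorchvision import cvtransforms

transform = cvtransforms.Compose([
    cvtransforms.Resize((224, 224)),
    cvtransforms.RandomHorizontalFlip(),
    cvtransforms.ToTensor(),
])

img = cv2.imread('example.jpg')              # hypothetical image path
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)   # cv2 loads BGR; convert to RGB first
tensor = transform(img)                      # the pipeline runs directly on the numpy array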
Dataset Card for "movie_posters-genres-80k-torchvision-transforms"
More Information needed
Tip: if you want to use data augmentation, there is no need to convert these tensors back to images first; you can apply Torchvision transforms (or a Compose of transforms) directly on tensors, and it works.
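For example, a minimal sketch (the tensor shape and the particular transforms are illustrative):

import torch
import torchvision.transforms as T

x = torch.rand(3, 224, 224)          # a CHW image tensor with values in [0, 1]
augment = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomRotation(degrees=10),
])
x_aug = augment(x)                   # applied directly to the tensor, no PIL round-trip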
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Adversarial patches are optimized contiguous pixel blocks in an input image that cause a machine-learning model to misclassify it. However, their optimization is computationally demanding and requires careful hyperparameter tuning. To overcome these issues, we propose ImageNet-Patch, a dataset for benchmarking machine-learning models against adversarial patches. It consists of a set of patches, optimized to generalize across different models, that are applied to ImageNet data after being preprocessed with affine transformations. This process enables an approximate yet faster robustness evaluation, leveraging the transferability of adversarial perturbations.
We release the dataset as a set of folders named after the patch target label (e.g., banana), each containing 1,000 subfolders, one for each ImageNet output class.
An example showing how to use the dataset is shown below.
import os
import torch.utils.data
from torchvision import datasets, transforms, models


class ImageFolderWithEmptyDirs(datasets.ImageFolder):
    """This is required for handling empty folders from the ImageFolder class."""

    def find_classes(self, directory):
        classes = sorted(entry.name for entry in os.scandir(directory) if entry.is_dir())
        if not classes:
            raise FileNotFoundError(f"Couldn't find any class folder in {directory}.")
        class_to_idx = {cls_name: i for i, cls_name in enumerate(classes)
                        if len(os.listdir(os.path.join(directory, cls_name))) > 0}
        return classes, class_to_idx


dataset_folder = 'data/ImageNet-Patch'
available_labels = {487: 'cellular telephone', 513: 'cornet', 546: 'electric guitar', 585: 'hair spray', 804: 'soap dispenser', 806: 'sock', 878: 'typewriter keyboard', 923: 'plate', 954: 'banana', 968: 'cup'}

target_label = 954
dataset_folder = os.path.join(dataset_folder, str(target_label))
normalizer = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
transform = transforms.Compose([transforms.ToTensor(), normalizer])

dataset = ImageFolderWithEmptyDirs(dataset_folder, transform=transform)
model = models.resnet50(pretrained=True)
loader = torch.utils.data.DataLoader(dataset, shuffle=True, batch_size=5)
model.eval()

batches = 10
correct, attack_success, total = 0, 0, 0
for batch_idx, (images, labels) in enumerate(loader):
    if batch_idx == batches:
        break
    pred = model(images).argmax(dim=1)
    correct += (pred == labels).sum().item()
    attack_success += (pred == target_label).sum().item()
    total += pred.shape[0]

accuracy = correct / total
attack_sr = attack_success / total
print("Robust Accuracy: ", accuracy)
print("Attack Success: ", attack_sr)
Example usage. You will have to use a shape-batching dataset when training in batches.

from datasets import load_dataset
from diffusers import AutoencoderDC
import torch
import torchvision.transforms as transforms
from PIL import Image

ds = load_dataset("SwayStar123/imagenet1k_dcae-f64-latents_train")

with torch.inference_mode():
    device = "cuda"
    ae = AutoencoderDC.from_pretrained("mit-han-lab/dc-ae-f64c128-mix-1.0-diffusers", cache_dir="ae", torch_dtype=torch.bfloat16).to(device).eval()
    …

See the full description on the dataset page: https://huggingface.co/datasets/SwayStar123/imagenet1k_dcae-f64-latents.
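The card's example is truncated above. As a rough illustration of decoding a stored latent back to an image, here is a minimal sketch; the split name "train", the column name "latent", and the output value range are assumptions, so check the dataset features before relying on them:

row = ds["train"][0]                                   # split name assumed
latent = torch.tensor(row["latent"], dtype=torch.bfloat16, device=device).unsqueeze(0)  # column name assumed
with torch.inference_mode():
    decoded = ae.decode(latent).sample                 # AutoencoderDC.decode returns an output object with .sample
image = (decoded.float().clamp(-1, 1) + 1) / 2         # assuming outputs in [-1, 1]; rescale to [0, 1] for viewing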
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains only the COCO 2017 train images (118K images) and a caption annotation JSON file, designed to fit within Google Colab's available disk space of approximately 50GB when connected to a GPU runtime.
If you're using PyTorch on Google Colab, you can easily use this dataset as follows:
Manually downloading the file and uploading it to Colab can be time-consuming, so it's more efficient to download the data directly into Colab. Make sure you have first added your Kaggle key to Colab; you can find more details on this process here
from google.colab import userdata
import os
import torch
import torchvision.datasets as dset
import torchvision.transforms as transforms
os.environ["KAGGLE_KEY"] = userdata.get('KAGGLE_KEY')
os.environ["KAGGLE_USERNAME"] = userdata.get('KAGGLE_USERNAME')
# Download the Dataset and unzip it
!kaggle datasets download -d seungjunleeofficial/coco2017-image-caption-train
!mkdir "/content/Dataset"
!unzip "coco2017-image-caption-train" -d "/content/Dataset"
# load the dataset
cap = dset.CocoCaptions(root = '/content/Dataset/COCO2017 Image Captioning Train/train2017',
annFile = '/content/Dataset/COCO2017 Image Captioning Train/captions_train2017.json',
transform=transforms.PILToTensor())
You can then use the dataset in the following way:
print(f"Number of samples: {len(cap)}")
img, target = cap[3]
print(img.shape)
print(target)
# Output example: torch.Size([3, 425, 640])
# ['A zebra grazing on lush green grass in a field.', 'Zebra reaching its head down to ground where grass is.',
# 'The zebra is eating grass in the sun.', 'A lone zebra grazing in some green grass.',
# 'A Zebra grazing on grass in a green open field.']
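If you want to draw mini-batches from it, here is a minimal sketch with a custom collate function; one is needed because the images have different sizes and each sample carries a list of caption strings (the function name is arbitrary):

from torch.utils.data import DataLoader

def collate_keep_lists(batch):
    # Keep variable-sized image tensors and caption lists as plain Python lists
    images, captions = zip(*batch)
    return list(images), list(captions)

loader = DataLoader(cap, batch_size=4, shuffle=True, collate_fn=collate_keep_lists)
images, captions = next(iter(loader))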
Trained with resnet18. The resnet18 accuracy on the clean dataset is 89.27%, and the adversarial attack success rate is above 99%. Attack method: PGD. Clean dataset: https://github.com/fastai/imagenette
The resnet18 model uses the resnet18 from the utils in https://github.com/Harry24k/MAIR/tree/main/mair
The dataset is read with fastai:
from fastai.vision.all import *
from utils import *
from attacks import *
import torchvision.transforms as tt

def get_x(files):
    return files

def get_y(files):
    return files.parent.name

def my_splitter(items):
    train_idx = [i for i, o in enumerate(items) if 'train' in o.parts]
    valid_idx = [i for i, o in enumerate(items) if 'val' in o.parts]
    return train_idx, valid_idx

path = Path("./data/imagenette2/")

attack_pets = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=my_splitter,
    get_x=get_x,
    get_y=get_y,
)

attack_dls = attack_pets.dataloaders(path, shuffle=False, bs=32)
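The attack code itself is not included in this card. As a rough illustration of the PGD method mentioned above, here is a minimal hand-rolled sketch; torchvision's resnet18 stands in for the MAIR utils model, and the eps/alpha/steps values are illustrative, not the settings behind the reported 99% success rate:

import torch
import torch.nn.functional as F
from torchvision.models import resnet18

def pgd_attack(model, images, labels, eps=8/255, alpha=2/255, steps=10):
    # Untargeted PGD: maximize the loss within an L-infinity ball of radius eps
    adv = (images + torch.empty_like(images).uniform_(-eps, eps)).clamp(0, 1).detach()  # random start
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), labels)
        grad = torch.autograd.grad(loss, adv)[0]
        adv = adv.detach() + alpha * grad.sign()                               # gradient ascent step
        adv = (images + (adv - images).clamp(-eps, eps)).clamp(0, 1).detach()  # project back to the eps-ball
    return adv

xb, yb = attack_dls.valid.one_batch()
model = resnet18(num_classes=10).to(xb.device).eval()   # stand-in model; load your trained imagenette weights here
adv = pgd_attack(model, xb, yb)
success_rate = (model(adv).argmax(dim=1) != yb).float().mean()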
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This dataset consists of 4900 images of logograms from the Heptapod B language, at a resolution of 224x224, together with English captions describing their meaning. There are 49 unique logograms and 100 variations (rotation, scaling, translation) of each.
Original source of the data: Wolfram Research GitHub Repository. Distributed under Creative Commons Attribution-NonCommercial 4.0 International License.
The dataset was augmented by merging morphemes of the logograms and by applying geometric transformations to create variations of each image.
The captions.txt file provides captions for each unique logogram and can be interpreted as:
000.png | Abbot is dead is the caption for images 0000.png to 0099.png
001.png | Abbot is the caption for images 0100.png to 0199.png
002.png | Abbot chooses save humanity is the caption for images 0200.png to 0299.png
Suggested loading for PyTorch:
from PIL import Image
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
import os
class TextToImageDataset(Dataset):
    def __init__(self, image_dir, captions_file, transform=None):
        self.image_dir = image_dir  # Path for the images of the dataset
        self.transform = transform
        self.pairs = []  # List to store (caption, image filename) pairs
        with open(captions_file, "r") as f:
            for line in f:
                idx, caption = line.strip().split("|")
                idx = idx.strip().split(".")[0]
                caption = caption.strip()
                for i in range(100):
                    img_file = f"{(int(idx) * 100 + i):04d}.png"  # Image number is idx*100 + i
                    self.pairs.append((caption, img_file))  # Same caption for every variation of the logogram

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        text, img_file = self.pairs[idx]
        image = Image.open(os.path.join(self.image_dir, img_file)).convert("RGB")
        if self.transform:
            image = self.transform(image)
        return text, image  # item = (text, image)
transform = transforms.Compose([
transforms.Resize((224, 224)),
transforms.ToTensor()
])
base_dir = "/kaggle/input/heptapod-dataset/dataset/"
dataset = TextToImageDataset(image_dir=base_dir+"images",captions_file=base_dir+"captions.txt", transform=transform)
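A minimal usage sketch on top of the dataset above (the batch size is arbitrary; since each item is a (text, image) pair, the default collate yields a tuple of captions and a stacked image tensor):

loader = DataLoader(dataset, batch_size=32, shuffle=True)
texts, images = next(iter(loader))
print(len(texts), images.shape)  # 32 captions and a [32, 3, 224, 224] tensor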
import torch
from torchvision import models, transforms
# densenet121
model = models.densenet121()
model.load_state_dict(torch.load('densenet121.pth'))
# densenet201
model = models.densenet201()
model.load_state_dict(torch.load('densenet201.pth'))
# resnet50
model = models.resnet50()
model.load_state_dict(torch.load('resnet50.pth'))
# resnet34
model = models.resnet34()
model.load_state_dict(torch.load('resnet34.pth'))
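For inference with one of these checkpoints, a minimal sketch (the image path is hypothetical, and the standard ImageNet normalization constants are an assumption about how the weights were trained):

from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

model = models.resnet50()
model.load_state_dict(torch.load('resnet50.pth'))
model.eval()

img = preprocess(Image.open('example.jpg').convert('RGB')).unsqueeze(0)  # hypothetical input image
with torch.no_grad():
    probs = model(img).softmax(dim=1)
print(probs.argmax(dim=1))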
From the PyTorch GitHub repo:
From PyTorch:
Copyright (c) 2016- Facebook, Inc (Adam Paszke)
Copyright (c) 2014- Facebook, Inc (Soumith Chintala)
Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
Copyright (c) 2012-2014 Deepmind Technologies (Koray Kavukcuoglu)
Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
Copyright (c) 2011-2013 NYU (Clement Farabet)
Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
Copyright (c) 2006 Idiap Research Institute (Samy Bengio)
Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
From Caffe2:
Copyright (c) 2016-present, Facebook Inc. All rights reserved.
All contributions by Facebook:
Copyright (c) 2016 Facebook Inc.
All contributions by Google:
Copyright (c) 2015 Google Inc.
All rights reserved.
All contributions by Yangqing Jia:
Copyright (c) 2015 Yangqing Jia
All rights reserved.
All contributions from Caffe:
Copyright(c) 2013, 2014, 2015, the respective contributors
All rights reserved.
All other contributions:
Copyright(c) 2015, 2016 the respective contributors
All rights reserved.
Caffe2 uses a copyright model similar to Caffe: each contributor holds
copyright over their contributions to Caffe2. The project versioning records
all such contribution and copyright details. If a contributor wants to further
mark their specific copyright on a particular contribution, they should
indicate their copyright solely in the commit message of the change when it is
committed.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. Neither the names of Facebook, Deepmind Technologies, NYU, NEC Laboratories America
and IDIAP Research Institute nor the names of its contributors may be
used to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.