6 datasets found

Z
Data from: ImageNet-Patch: A Dataset for Benchmarking Machine Learning...
data.niaid.nih.gov
Updated Jun 30, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Daniele Angioni (2022). ImageNet-Patch: A Dataset for Benchmarking Machine Learning Robustness against Adversarial Patches [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6568777
Explore at:
Dataset updated
Jun 30, 2022
Dataset provided by
Daniele Angioni
Battista Biggio
Luca Demetrio
Angelo Sotgiu
Ambra Demontis
Maura Pintor
Fabio Roli
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Adversarial patches are optimized contiguous pixel blocks in an input image that cause a machine-learning model to misclassify it. However, their optimization is computationally demanding and requires careful hyperparameter tuning. To overcome these issues, we propose ImageNet-Patch, a dataset to benchmark machine-learning models against adversarial patches. It consists of a set of patches optimized to generalize across different models and applied to ImageNet data after preprocessing them with affine transformations. This process enables an approximate yet faster robustness evaluation, leveraging the transferability of adversarial perturbations.

We release our dataset as a set of folders indicating the patch target label (e.g., banana), each containing 1000 subfolders as the ImageNet output classes.

An example showing how to use the dataset is shown below.

code for testing robustness of a model

import os.path

from torchvision import datasets, transforms, models import torch.utils.data

class ImageFolderWithEmptyDirs(datasets.ImageFolder): """ This is required for handling empty folders from the ImageFolder Class. """

def find_classes(self, directory): classes = sorted(entry.name for entry in os.scandir(directory) if entry.is_dir()) if not classes: raise FileNotFoundError(f"Couldn't find any class folder in {directory}.") class_to_idx = {cls_name: i for i, cls_name in enumerate(classes) if len(os.listdir(os.path.join(directory, cls_name))) > 0} return classes, class_to_idx

extract and unzip the dataset, then write top folder here

dataset_folder = 'data/ImageNet-Patch'

available_labels = { 487: 'cellular telephone', 513: 'cornet', 546: 'electric guitar', 585: 'hair spray', 804: 'soap dispenser', 806: 'sock', 878: 'typewriter keyboard', 923: 'plate', 954: 'banana', 968: 'cup' }

select folder with specific target

target_label = 954

dataset_folder = os.path.join(dataset_folder, str(target_label)) normalizer = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) transforms = transforms.Compose([ transforms.ToTensor(), normalizer ])

dataset = ImageFolderWithEmptyDirs(dataset_folder, transform=transforms) model = models.resnet50(pretrained=True) loader = torch.utils.data.DataLoader(dataset, shuffle=True, batch_size=5) model.eval()

batches = 10 correct, attack_success, total = 0, 0, 0 for batch_idx, (images, labels) in enumerate(loader): if batch_idx == batches: break pred = model(images).argmax(dim=1) correct += (pred == labels).sum() attack_success += sum(pred == target_label) total += pred.shape[0]

accuracy = correct / total attack_sr = attack_success / total

print("Robust Accuracy: ", accuracy) print("Attack Success: ", attack_sr)
h
SemEval_training_data_emotions
huggingface.co
Updated Feb 7, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
web (2024). SemEval_training_data_emotions [Dataset]. https://huggingface.co/datasets/dim/SemEval_training_data_emotions
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 7, 2024
Authors
web
Description
Dataset Card for "SemEval_traindata_emotions"

Как был получен from datasets import load_dataset import datasets from torchvision.io import read_video import json import torch import os from torch.utils.data import Dataset, DataLoader import tqdm

dataset_path = "./SemEval-2024_Task3/training_data/Subtask_2_train.json"

dataset = json.loads(open(dataset_path).read()) print(len(dataset))

all_conversations = []

for item in dataset: all_conversations.extend(item["conversation"])… See the full description on the dataset page: https://huggingface.co/datasets/dim/SemEval_training_data_emotions.
h
short-metaworld-vla
huggingface.co
Updated Jul 2, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
H.Z (2025). short-metaworld-vla [Dataset]. https://huggingface.co/datasets/hz1919810/short-metaworld-vla
Explore at:
Dataset updated
Jul 2, 2025
Authors
H.Z
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
Short-MetaWorld Dataset

Overview

Short-MetaWorld is a curated dataset from Meta-World containing Multi-Task 10 (MT10) and Meta-Learning 10 (ML10) tasks with 100 successful trajectories per task and 20 steps per trajectory. This dataset is specifically designed for multi-task robot learning, imitation learning, and vision-language robotics research.

🚀 Quick Start

from short_metaworld_loader import load_short_metaworld from torch.utils.data import DataLoader

… See the full description on the dataset page: https://huggingface.co/datasets/hz1919810/short-metaworld-vla.
h
IndicVoices_bengali
huggingface.co
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Subrata Sarkar, IndicVoices_bengali [Dataset]. https://huggingface.co/datasets/subratasarkar32/IndicVoices_bengali
Explore at:
Authors
Subrata Sarkar
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
IndicVoices_bengali

This dataset has been created from ai4bharat/IndicVoices. Since directly trying to load the dataset for bengali was not working with IndicVoices due to errors in some files, this dataset addresses those files by removing them. To use this dataset with lazyloading for training speech to text models, below is sample code with wav2vec2. import pandas as pd import torchaudio from torch.utils.data import Dataset, DataLoader from transformers import Wav2Vec2Processor… See the full description on the dataset page: https://huggingface.co/datasets/subratasarkar32/IndicVoices_bengali.
h
am-nlp-abstrct
huggingface.co
Updated May 18, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
David Nardi (2025). am-nlp-abstrct [Dataset]. https://huggingface.co/datasets/david-inf/am-nlp-abstrct
Explore at:
Dataset updated
May 18, 2025
Authors
David Nardi
Description
Dataset summary

Dataset forked from pie/asbtrct. Here all sentences from AbstRCT dataset abstracts are grouped together with labels:

0: Premise 1: Claim 2: MajorClaim

import random import torch import numpy as np from datasets import load_dataset from transformers import set_seed, AutoTokenizer, DataCollatorWithPadding from torch.utils.data import DataLoader

def get_dataset(tokenizer, max_length=128): dataset = load_dataset("david-inf/am-nlp-abstrct")

def… See the full description on the dataset page: https://huggingface.co/datasets/david-inf/am-nlp-abstrct.
h
malaysian-youtube
huggingface.co
Updated Jan 5, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Malaysia AI (2024). malaysian-youtube [Dataset]. https://huggingface.co/datasets/malaysia-ai/malaysian-youtube
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jan 5, 2024
Dataset authored and provided by
Malaysia AI
Area covered
Malaysia, YouTube
Description
Malaysian Youtube

Malaysian and Singaporean youtube channels, total up to 60k audio files with total 18.7k hours. URLs data at https://github.com/mesolitica/malaya-speech/tree/master/data/youtube/data Notebooks at https://github.com/mesolitica/malaya-speech/tree/master/data/youtube

How to load the data efficiently?

import pandas as pd import json from datasets import Audio from torch.utils.data import DataLoader, Dataset

chunks = 30 sr = 16000

class Train(Dataset):… See the full description on the dataset page: https://huggingface.co/datasets/malaysia-ai/malaysian-youtube.
Not seeing a result you expected?
Learn how you can add new datasets to our index.

Facebook

Twitter

Click to copy link

Link copied

Cite

Daniele Angioni (2022). ImageNet-Patch: A Dataset for Benchmarking Machine Learning Robustness against Adversarial Patches [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6568777

Data from: ImageNet-Patch: A Dataset for Benchmarking Machine Learning Robustness against Adversarial Patches

Explore at:

Dataset updated

Jun 30, 2022

Dataset provided by

Daniele Angioni
Battista Biggio
Luca Demetrio
Angelo Sotgiu
Ambra Demontis
Maura Pintor
Fabio Roli

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Adversarial patches are optimized contiguous pixel blocks in an input image that cause a machine-learning model to misclassify it. However, their optimization is computationally demanding and requires careful hyperparameter tuning. To overcome these issues, we propose ImageNet-Patch, a dataset to benchmark machine-learning models against adversarial patches. It consists of a set of patches optimized to generalize across different models and applied to ImageNet data after preprocessing them with affine transformations. This process enables an approximate yet faster robustness evaluation, leveraging the transferability of adversarial perturbations.

We release our dataset as a set of folders indicating the patch target label (e.g., banana), each containing 1000 subfolders as the ImageNet output classes.

An example showing how to use the dataset is shown below.

code for testing robustness of a model

import os.path

from torchvision import datasets, transforms, models import torch.utils.data

class ImageFolderWithEmptyDirs(datasets.ImageFolder): """ This is required for handling empty folders from the ImageFolder Class. """

def find_classes(self, directory):
  classes = sorted(entry.name for entry in os.scandir(directory) if entry.is_dir())
  if not classes:
    raise FileNotFoundError(f"Couldn't find any class folder in {directory}.")
  class_to_idx = {cls_name: i for i, cls_name in enumerate(classes) if
          len(os.listdir(os.path.join(directory, cls_name))) > 0}
  return classes, class_to_idx

extract and unzip the dataset, then write top folder here

dataset_folder = 'data/ImageNet-Patch'

available_labels = { 487: 'cellular telephone', 513: 'cornet', 546: 'electric guitar', 585: 'hair spray', 804: 'soap dispenser', 806: 'sock', 878: 'typewriter keyboard', 923: 'plate', 954: 'banana', 968: 'cup' }

select folder with specific target

target_label = 954

dataset_folder = os.path.join(dataset_folder, str(target_label)) normalizer = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) transforms = transforms.Compose([ transforms.ToTensor(), normalizer ])

dataset = ImageFolderWithEmptyDirs(dataset_folder, transform=transforms) model = models.resnet50(pretrained=True) loader = torch.utils.data.DataLoader(dataset, shuffle=True, batch_size=5) model.eval()

batches = 10 correct, attack_success, total = 0, 0, 0 for batch_idx, (images, labels) in enumerate(loader): if batch_idx == batches: break pred = model(images).argmax(dim=1) correct += (pred == labels).sum() attack_success += sum(pred == target_label) total += pred.shape[0]

accuracy = correct / total attack_sr = attack_success / total

print("Robust Accuracy: ", accuracy) print("Attack Success: ", attack_sr)

Clear search

Close search

Google apps

Main menu

Data from: ImageNet-Patch: A Dataset for Benchmarking Machine Learning...

code for testing robustness of a model

extract and unzip the dataset, then write top folder here

select folder with specific target

SemEval_training_data_emotions

short-metaworld-vla

… See the full description on the dataset page: https://huggingface.co/datasets/hz1919810/short-metaworld-vla.

IndicVoices_bengali

am-nlp-abstrct

malaysian-youtube

Data from: ImageNet-Patch: A Dataset for Benchmarking Machine Learning Robustness against Adversarial PatchesSee More Versions

code for testing robustness of a model

extract and unzip the dataset, then write top folder here

select folder with specific target

Data from: ImageNet-Patch: A Dataset for Benchmarking Machine Learning Robustness against Adversarial Patches