Facebook
TwitterThis dataset was created by Nandhu Elango
Facebook
TwitterThis is a package developed by xtract.ai containing utilities for the training and evaluation of pytorch models. See it on PyPI and on Github.
Facebook
Twitterhttps://github.com/pytorch/pytorch/raw/main/docs/source/_static/img/pytorch-logo-dark.png" alt="PyTorch Logo">
PyTorch is a Python package that provides two high-level features: - Tensor computation (like NumPy) with strong GPU acceleration - Deep neural networks built on a tape-based autograd system
You can reuse your favorite Python packages such as NumPy, SciPy, and Cython to extend PyTorch when needed.
Our trunk health (Continuous Integration signals) can be found at hud.pytorch.org.
At a granular level, PyTorch is a library that consists of the following components:
| Component | Description |
|---|---|
| torch | A Tensor library like NumPy, with strong GPU support |
| torch.autograd | A tape-based automatic differentiation library that supports all differentiable Tensor operations in torch |
| torch.jit | A compilation stack (TorchScript) to create serializable and optimizable models from PyTorch code |
| torch.nn | A neural networks library deeply integrated with autograd designed for maximum flexibility |
| torch.multiprocessing | Python multiprocessing, but with magical memory sharing of torch Tensors across processes. Useful for data loading and Hogwild training |
| torch.utils | DataLoader and other utility functions for convenience |
Usually, PyTorch is used either as:
Elaborating Further:
If you use NumPy, then you have used Tensors (a.k.a. ndarray).
PyTorch provides Tensors that can live either on the CPU or the GPU and accelerates the computation by a huge amount.
We provide a wide variety of tensor routines to accelerate and fit your scientific computation needs such as slicing, indexing, mathematical operations, linear algebra, reductions. And they are fast!
PyTorch has a unique way of building neural networks: using and replaying a tape recorder.
Most frameworks such as TensorFlow, Theano, Caffe, and CNTK have a static view of the world. One has to build a neural network and reuse the same structure again and again. Changing the way the network behaves means that one has to start from scratch.
With PyTorch, we use a technique called reverse-mode auto-differentiation, which allows you to change the way your network behaves arbitrarily with zero lag or overhead. Our inspiration comes from several research papers on this topic, as well as current and past work such as torch-autograd, autograd, Chainer, etc.
While this technique is not unique to PyTorch, it's one of the fastest implementations of it to date. You get the best of speed and flexibility for your crazy resear...
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A Benchmark Dataset for Deep Learning for 3D Topology Optimization
This dataset represents voxelized 3D topology optimization problems and solutions. The solutions have been generated in cooperation with the Ariane Group and Synera using the Altair OptiStruct implementation of SIMP within the Synera software. The SELTO dataset consists of four different 3D datasets for topology optimization, called disc simple, disc complex, sphere simple and sphere complex. Each of these datasets is further split into a training and a validation subset.
The following paper provides full documentation and examples:
Dittmer, S., Erzmann, D., Harms, H., Maass, P., SELTO: Sample-Efficient Learned Topology Optimization (2022) https://arxiv.org/abs/2209.05098.
The Python library DL4TO (https://github.com/dl4to/dl4to) can be used to download and access all SELTO dataset subsets. Each TAR.GZ file container consists of multiple enumerated pairs of CSV files. Each pair describes a unique topology optimization problem and contains an associated ground truth solution. Each problem-solution pair consists of two files, where one contains voxel-wise information and the other file contains scalar information. For example, the i-th sample is stored in the files i.csv and i_info.csv, where i.csv contains all voxel-wise information and i_info.csv contains all scalar information. We define all spatially varying quantities at the center of the voxels, rather than on the vertices or surfaces. This allows for a shape-consistent tensor representation.
For the i-th sample, the columns of i_info.csv correspond to the following scalar information:
E - Young's modulus [Pa]
ν - Poisson's ratio [-]
σ_ys - a yield stress [Pa]
h - discretization size of the voxel grid [m]
The columns of i.csv correspond to the following voxel-wise information:
x, y, z - the indices that state the location of the voxel within the voxel mesh
Ω_design - design space information for each voxel. This is a ternary variable that indicates the type of density constraint on the voxel. 0 and 1 indicate that the density is fixed at 0 or 1, respectively. -1 indicates the absence of constraints, i.e., the density in that voxel can be freely optimized
Ω_dirichlet_x, Ω_dirichlet_y, Ω_dirichlet_z - homogeneous Dirichlet boundary conditions for each voxel. These are binary variables that define whether the voxel is subject to homogeneous Dirichlet boundary constraints in the respective dimension
F_x, F_y, F_z - floating point variables that define the three spacial components of external forces applied to each voxel. All forces are body forces given in [N/m^3]
density - defines the binary voxel-wise density of the ground truth solution to the topology optimization problem
How to Import the Dataset
with DL4TO: With the Python library DL4TO (https://github.com/dl4to/dl4to) it is straightforward to download and access the dataset as a customized PyTorch torch.utils.data.Dataset object. As shown in the tutorial this can be done via:
from dl4to.datasets import SELTODataset
dataset = SELTODataset(root=root, name=name, train=train)
Here, root is the path where the dataset should be saved. name is the name of the SELTO subset and can be one of "disc_simple", "disc_complex", "sphere_simple" and "sphere_complex". train is a boolean that indicates whether the corresponding training or validation subset should be loaded. See here for further documentation on the SELTODataset class.
without DL4TO: After downloading and unzipping, any of the i.csv files can be manually imported into Python as a Pandas dataframe object:
import pandas as pd
root = ... file_path = f'{root}/{i}.csv' columns = ['x', 'y', 'z', 'Ω_design','Ω_dirichlet_x', 'Ω_dirichlet_y', 'Ω_dirichlet_z', 'F_x', 'F_y', 'F_z', 'density'] df = pd.read_csv(file_path, names=columns)
Similarly, we can import a i_info.csv file via:
file_path = f'{root}/{i}_info.csv' info_column_names = ['E', 'ν', 'σ_ys', 'h'] df_info = pd.read_csv(file_path, names=info_columns)
We can extract PyTorch tensors from the Pandas dataframe df using the following function:
import torch
def get_torch_tensors_from_dataframe(df, dtype=torch.float32): shape = df[['x', 'y', 'z']].iloc[-1].values.astype(int) + 1 voxels = [df['x'].values, df['y'].values, df['z'].values]
Ω_design = torch.zeros(1, *shape, dtype=int)
Ω_design[:, voxels[0], voxels[1], voxels[2]] = torch.from_numpy(data['Ω_design'].values.astype(int))
Ω_Dirichlet = torch.zeros(3, *shape, dtype=dtype)
Ω_Dirichlet[0, voxels[0], voxels[1], voxels[2]] = torch.tensor(df['Ω_dirichlet_x'].values, dtype=dtype)
Ω_Dirichlet[1, voxels[0], voxels[1], voxels[2]] = torch.tensor(df['Ω_dirichlet_y'].values, dtype=dtype)
Ω_Dirichlet[2, voxels[0], voxels[1], voxels[2]] = torch.tensor(df['Ω_dirichlet_z'].values, dtype=dtype)
F = torch.zeros(3, *shape, dtype=dtype)
F[0, voxels[0], voxels[1], voxels[2]] = torch.tensor(df['F_x'].values, dtype=dtype)
F[1, voxels[0], voxels[1], voxels[2]] = torch.tensor(df['F_y'].values, dtype=dtype)
F[2, voxels[0], voxels[1], voxels[2]] = torch.tensor(df['F_z'].values, dtype=dtype)
density = torch.zeros(1, *shape, dtype=dtype)
density[:, voxels[0], voxels[1], voxels[2]] = torch.tensor(df['density'].values, dtype=dtype)
return Ω_design, Ω_Dirichlet, F, density
Facebook
TwitterTo make it easy to use the dataset with pytorch here is a utility script utils.py
This is the Caltech UCSD Birds (CUB) 200 2011 extended dataset (http://www.vision.caltech.edu/visipedia/CUB-200-2011.html).
The original CUB dataset doesn't have captions. A separate paper (https://arxiv.org/pdf/1605.05395.pdf) collected 10 captions for each image in CUB.
The class id for each image is given in metadata.pth file. The captions were encoded using spacy tokenizer and saved in metadata.pth file.
The images were resized (maintaining aspect ratio) and center cropped to 64x64, 128x128 and 256x256 resolutions.
There are 2 official dataset splits: train_val and test.
The word_id_to_word and word_to_word_id mappings in metadata.pth file can be used to encode and decode the captions.
The class_id_to_class_name and class_name_to_class_id mappings in metadata.pth file can be used to encode and decode the class ids.
All .pth files were created using pytorch torch.save().
They can be loaded using pytorch torch.load()
When processing captions and creating the vocabulary, words that appeared less than 5 times were replaced with `
Facebook
Twitterhttps://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides image segmentation data for feral cats, designed for computer vision and machine learning tasks. It builds upon the original public domain dataset by Paul Cashman from Roboflow, with additional preprocessing and multiple data formats for easier consumption.
The dataset is organized into three standard splits: - Train set - Validation set - Test set
Each split contains data in multiple formats: 1. Original JPG images 2. Segmentation mask JPG images 3. Parquet files containing flattened image and mask data 4. Pickle files containing serialized image and mask data
train/: Original training imagesvalid/: Original validation imagestest/: Original test imagestrain_mask/: Corresponding segmentation masks for trainingvalid_mask/: Corresponding segmentation masks for validationtest_mask/: Corresponding segmentation masks for testingtrain_dataset.parquet, valid_dataset.parquet, test_dataset.parquetsplit_at = image_size[0] * image_size[1] * image_channels
[-1, 224, 224, 3])[-1, 224, 224, 1])train_dataset.pkl, valid_dataset.pkl, test_dataset.pklsplit_at = image_size[0] * image_size[1] * image_channelstrain_dataset.csv, valid_dataset.csv, test_dataset.csvAll images were preprocessed with the following operations: - Resized to 224×224 pixels using bilinear interpolation - Segmentation masks were also resized to match the images using nearest neighbor interpolation - Original RLE (Run-Length Encoding) segmentation data converted to binary masks
When used with the provided PyTorch dataset class, images are normalized with: - Mean: [0.48235, 0.45882, 0.40784] - Standard Deviation: [0.00392156862745098, 0.00392156862745098, 0.00392156862745098]
A custom CatDataset class is included for easy integration with PyTorch:
from cat_dataset import CatDataset
# Load from parquet format
dataset = CatDataset(
root="path/to/dataset",
split="train", # Options: "train", "valid", "test"
format="parquet", # Options: "parquet", "pkl"
image_size=[224, 224],
image_channels=3,
mask_channels=1
)
# Use with PyTorch DataLoader
from torch.utils.data import DataLoader
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
Loading time benchmarks from the original implementation: - Parquet format: ~1.29 seconds per iteration - Pickle format: ~0.71 seconds per iteration
The pickle format provides the fastest loading times and is recommended for most use cases.
If you use this dataset in your research or projects, please cite:
@misc{feral-cat-segmentation_dataset,
title = {feral-cat-segmentation Dataset},
type = {Open Source Dataset},
author = {Paul Cashman},
howpublished = {\url{https://universe.roboflow.com/paul-cashman-mxgwb/feral-cat-segmentation}},
url = {https://universe.roboflow.com/paul-cashman-mxgwb/feral-cat-segmentation},
journal = {Roboflow Universe},
publisher = {Roboflow},
year = {2025},
month = {mar},
note = {visited on 2025-03-19},
}
from ca...
Facebook
TwitterSimple chatbot implementation with PyTorch. A chatbot made in Python that features various data about the Star Wars universe. This is a generic chatbot. Can be trained on pretty much any conversation as long as formatted correctly JSON file. I used it for a final project in Artificial Intelligence. To use just run the script training first, then run your chatbot. For more please have a look on GitHub
Chatbots are extremely helpful for business organizations and also the customers. The majority of people prefer to talk directly from a chatbox instead of calling service centers. Today I am going to build an exciting project on Chatbot. I will implement a chatbot from scratch that will be able to understand what the user is talking about and give an appropriate response. Chatbots are nothing but an intelligent piece of software that can interact and communicate with people just like humans. Here in this project we created an AI Chatbot which is focused for The Star Wars Cinematic Universe and trying training it in such a way that it can answer some of the basics queries about Star Wars.
Chatbots are basically AI intelligence bots which can interact with the user or customers depends upon the usage. It is an application of Artificial Intelligence and Machine Learning¬. Now-a-days technology is increasing rapidly. In this technological world every industry is trying to automate things to provide better services. One of the great application of automation would be chatbot.
All chatbots come under the NLP (Natural Language Processing) concepts. NLP is composed of two things: - NLU (Natural Language Understanding): The ability of machines to understand human language like English. - NLG (Natural Language Generation): The ability of a machine to generate text similar to human written sentences Imagine a user asking a question to a chatbot: “Hey, what’s on the news today?” The chatbot will break down the user sentence into two things: intent and an entity. The intent for this sentence could be get_news as it refers to an action the user wants to perform. The entity tells specific details about the intent, so "today" will be the entity. So this way, a machine learning model is used to recognize the intents and entities of the chat.
I created a new python file and name it as chatbot.py and then import all the required modules. After that I loaded starwarsintents.json data file in our Python program.
import numpy as np
import nltk
from nltk.stem.porter import PorterStemmer
stemmer = PorterStemmer()
import torch
import torch.nn as nn
import random
import json
from torch.utils.data import Dataset, DataLoader
from tkinter import *
with open("starwarsintents.json", "r") as f:
intents = json.load(f)
```
## Preprocessing the Data
- Creating Custom Functions:
We will create custom Functions so that it is easy for us to implement afterwards. Natural language (nltk) took kit is a really useful library that contains important classes that will be useful in any of your NLP task. To know a bit more about Natural language (nltk). Please click [here](https://machinelearningmastery.com/natural-language-processing/) for more information.
- Stemming:
If we have 3 words like “walk”, “walked”, “walking”, these might seem different words but they generally have the same meaning and also have the same base form; “walk”. So, in order for our model to understand all different form of the same words we need to train our model with that form. This is called Stemming. There are different methods that we can use for stemming. Here we will use Porter Stemmer model form our NLTK Library. For more information click [here](http://snowball.tartarus.org/algorithms/porter/stemmer.html).
- Bag of Words:
We will be splitting each word in the sentences and adding it to an array. We will be using bag of words. Which will initially be a list of zeros with the size equal to the length of the all words array.If we have a array of sentences = ["hello", "how", "are", "you"] and an array of total words = ["hi", "hel...
Facebook
TwitterMIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains 1,004 labeled images from the classic NES game "Duck Hunt" (1984), specifically prepared for YOLO (You Only Look Once) object detection training. The dataset includes sprites of the iconic hunting dog and ducks in various states, augmented to provide a balanced and comprehensive training set for computer vision models.
Perfect for: - Object detection model training - Computer vision research - Retro gaming AI projects - YOLO algorithm benchmarking - Educational purposes
| Metric | Value |
|---|---|
| Total Images | 1,004 |
| Dataset Size | 12 MB |
| Image Format | PNG |
| Annotation Format | YOLO (.txt) |
| Classes | 4 |
| Train/Val Split | 711/260 (73%/27%) |
| Class ID | Class Name | Count | Description |
|---|---|---|---|
| 0 | dog | 252 | The hunting dog in various poses (jumping, laughing, sniffing, etc.) |
| 1 | duck_dead | 256 | Dead ducks (both black and red variants) |
| 2 | duck_shot | 248 | Ducks in the moment of being shot |
| 3 | duck_flying | 248 | Flying ducks in all directions (left, right, diagonal) |
yolo_dataset_augmented/
├── images/
│ ├── train/ # 711 training images
│ └── val/ # 260 validation images
├── labels/
│ ├── train/ # 711 YOLO annotation files
│ └── val/ # 260 YOLO annotation files
├── classes.txt # Class names mapping
├── dataset.yaml # YOLO configuration file
└── augmented_dataset_stats.json # Detailed statistics
The original 47 images were enhanced using advanced data augmentation techniques to create a balanced dataset:
{
'rotation_range': (-15, 15), # Small rotations for game sprites
'brightness_range': (0.7, 1.3), # Brightness variations
'contrast_range': (0.8, 1.2), # Contrast adjustments
'saturation_range': (0.8, 1.2), # Color saturation
'noise_intensity': 0.02, # Gaussian noise
'horizontal_flip_prob': 0.5, # 50% chance horizontal flip
'scaling_range': (0.8, 1.2), # Scale variations
}
from ultralytics import YOLO
# Load and train
model = YOLO('yolov8n.pt') # Load pretrained model
results = model.train(data='dataset.yaml', epochs=100, imgsz=640)
# Validate
metrics = model.val()
# Predict
results = model('path/to/test/image.png')
import torch
from torch.utils.data import Dataset, DataLoader
from PIL import Image
import os
class DuckHuntDataset(Dataset):
def _init_(self, images_dir, labels_dir, transform=None):
self.images_dir = images_dir
self.labels_dir = labels_dir
self.transform = transform
self.images = os.listdir(images_dir)
def _len_(self):
return len(self.images)
def _getitem_(self, idx):
img_path = os.path.join(self.images_dir, self.images[idx])
label_path = os.path.join(self.labels_dir,
self.images[idx].replace('.png', '.txt'))
image = Image.open(img_path)
# Load YOLO annotations
with open(label_path, 'r') as f:
labels = f.readlines()
if self.transform:
image = self.transform(image)
return image, labels
# Usage
dataset = DuckHuntDataset('images/train', 'labels/train')
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
Each .txt file contains one line per object:
class_id center_x center_y width height
Example annotation:
0 0.492 0.403 0.212 0.315
Where values are normalized (0-1) relative to image dimensions.
This dataset is based on sprites from the iconic 1984 NES game "Duck Hunt," one of the most recognizable video games in history. The game featured:
Facebook
TwitterThe MELD Preprocessed Dataset is a multi-modal dataset designed for research on emotion recognition from audio, video, and textual data. The dataset builds upon the original MELD dataset and applies extensive preprocessing steps to extract features from different modalities. Each sample is saved as a .pt file containing a dictionary of preprocessed features, making it easy for developers to load and integrate into PyTorch-based workflows.
The preprocessing script performs several key steps:
Text Cleaning:
fix_encoding_with_bytes(text): Decodes text from bytes using UTF-8, Latin-1, or cp1252, ensuring correct encoding.replace_double_encoding(text): Fixes issues related to double-encoded characters (e.g., replacing "Â’" with the proper apostrophe).Audio Processing:
torchaudio.transforms.MelSpectrogram with 64 mel bins (VGGish format).Video Processing:
Saving Processed Samples:
.pt file in a directory structure split by data type (train, dev, and test).dia0_utt1.mp4 becomes dia0_utt1.pt).Each preprocessed sample is stored in a .pt file and contains a dictionary with the following keys:
utterance (str): The cleaned textual utterance.emotion (str/int): The corresponding emotion label.video_path (str): Original path to the video file from which the sample was extracted.audio (Tensor): Raw audio waveform tensor of shape [channels, time].audio_sample_rate (int): The sampling rate of the audio waveform.audio_mel (Tensor): The computed log-scaled Mel-spectrogram with shape [channels, n_mels, time].face (NumPy array): The extracted face image (RGB format) of shape (224, 224, 3). If no face was detected, a default black image is provided.The preprocessed files are organized into splits:
preprocessed_data/
├── train/
│ ├── dia0_utt0.pt
│ ├── dia1_utt1.pt
│ └── ...
├── dev/
│ ├── dia0_utt0.pt
│ ├── dia1_utt1.pt
│ └── ...
└── test/
│ ├── dia0_utt0.pt
│ ├── dia1_utt1.pt
└── ...
A custom PyTorch dataset and DataLoader are provided to facilitate easy integration:
from torch.utils.data import Dataset
import os
import torch
class PreprocessedMELDDataset(Dataset):
def _init_(self, data_dir):
"""
Args:
data_dir (str): Directory where preprocessed .pt files are stored.
"""
self.data_dir = data_dir
self.files = [os.path.join(data_dir, f) for f in os.listdir(data_dir) if f.endswith('.pt')]
def _len_(self):
return len(self.files)
def _getitem_(self, idx):
sample_path = self.files[idx]
sample = torch.load(sample_path)
return sample
def preprocessed_collate_fn(batch):
"""
Collates a list of sample dictionaries into a single dictionary with keys mapping to lists.
Modify this function to pad or stack tensor data if needed.
"""
collated = {}
collated['utterance'] = [sample['utterance'] for sample in batch]
collated['emotion'] = [sample['emotion'] for sample in batch]
collated['video_path'] = [sample['video_path'] for sample in batch]
collated['audio'] = [sample['audio'] for sample in batch]
collated['audio_sample_rate'] = batch[0]['audio_sample_rate']
collated['audio_mel'] = [sample['audio_mel'] for sample in batch]
collated['face'] = [sample['face'] for sample in batch]
return collated
from torch.utils.data import DataLoader
# Define paths for each split
train_data_dir = "preprocessed_data/train"
dev_data_dir = "preproces...
Not seeing a result you expected?
Learn how you can add new datasets to our index.
Facebook
TwitterThis dataset was created by Nandhu Elango