Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset contains 3,400 records of fashion retail sales, capturing various details about customer purchases, including item details, purchase amounts, ratings, and payment methods. It is useful for analyzing customer buying behavior, product popularity, and payment preferences.
Column Name | Data Type | Non-Null Count | Description |
---|---|---|---|
Customer Reference ID | Integer | 3,400 | A unique identifier for each customer. |
Item Purchased | String | 3,400 | The name of the fashion item purchased. |
Purchase Amount (USD) | Float | 2,750 | The purchase price of the item in USD (650 missing values). |
Date Purchase | String | 3,400 | The date on which the purchase was made (format: DD-MM-YYYY). |
Review Rating | Float | 3,076 | The customer review rating (scale: 1 to 5, 324 missing values). |
Payment Method | String | 3,400 | The payment method used (e.g., Credit Card, Cash). |
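
The columns in the table above can be loaded and checked with a short script. What follows is a minimal sketch using only the Python standard library. The sample rows are invented for illustration and are not taken from the dataset. It also assumes that missing values appear as empty fields, which the description does not state. Only the column names and the DD-MM-YYYY date format come from the table.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows; the real file has 3,400 records.
SAMPLE_CSV = """Customer Reference ID,Item Purchased,Purchase Amount (USD),Date Purchase,Review Rating,Payment Method
4018,Handbag,4619.0,05-02-2023,,Credit Card
4115,Tunic,,11-07-2023,2.0,Credit Card
4019,Tank Top,97.0,23-03-2023,4.1,Cash
"""


def to_float(text):
    """Convert a CSV field to float; empty fields (assumed missing) become None."""
    return float(text) if text else None


def load_sales(text):
    """Parse CSV text into dicts, converting dates and numeric fields."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "customer_id": int(raw["Customer Reference ID"]),
            "item": raw["Item Purchased"],
            "amount": to_float(raw["Purchase Amount (USD)"]),
            # Dates are documented as DD-MM-YYYY.
            "date": datetime.strptime(raw["Date Purchase"], "%d-%m-%Y").date(),
            "rating": to_float(raw["Review Rating"]),
            "payment": raw["Payment Method"],
        })
    return rows


def count_missing(rows, field):
    """Count the rows whose value for `field` is None."""
    return sum(1 for row in rows if row[field] is None)


rows = load_sales(SAMPLE_CSV)
missing_amount = count_missing(rows, "amount")
missing_rating = count_missing(rows, "rating")
print(missing_amount, missing_rating)
```

On the full file, the two counts would be expected to match the 650 and 324 missing values reported in the table, provided the empty-field assumption holds.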
- Purchase Amount (USD): 650 missing values.
- Review Rating: 324 missing values.
- Payment Method: includes multiple categories, allowing analysis of payment trends.
- Date Purchase: in DD-MM-YYYY format, which can be useful for time-series analysis.

eileennoonan/paramaggarwal-kaggle-fashion-product-images-small: a dataset hosted on Hugging Face and contributed by the HF Datasets community.
CC0 1.0 (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
This dataset captures 100,000 sales transactions in the fashion industry, featuring extreme outliers, missing values, and a multiclass classification target (Sales_Category). With 9 categorical and 10 numerical attributes, this dataset is ideal for exploratory data analysis (EDA), data visualization, and machine learning tasks. It includes details such as product names, brands, gender-specific clothing, pricing, discounts, stock levels, and customer behavior.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is a curated collection of fashion product images paired with their titles and descriptions, designed for training and fine-tuning multimodal AI models. Originally derived from Param Aggarwal's "Fashion Product Images Dataset," it has undergone extensive preprocessing to improve usability and efficiency.
Preprocessing steps include:
1. Resizing all images to 256 x 256 px while preserving their original aspect ratio.
2. Streamlining the reference CSV file to retain only essential fields: image file name, display name, product description, and category.
3. Removing redundant style JSON files to minimize dataset complexity.
These optimizations have reduced the dataset size by 95%, making it lighter and faster to use without compromising data quality. This refined dataset is ideal for research and applications in multimodal AI, including tasks like product recommendation, image-text matching, and domain-specific fine-tuning.
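
Fitting an image into a 256 x 256 px frame while preserving its aspect ratio comes down to a small piece of arithmetic, sketched below. The description does not say whether the images were padded or cropped to reach a square, so the centered-padding (letterbox) offsets here are an assumption. The function name `fit_within` is hypothetical.

```python
def fit_within(width, height, target=256):
    """Return (new_w, new_h, pad_left, pad_top) for letterboxing an image.

    The longer side is scaled to `target` and the shorter side is scaled by
    the same factor, so the aspect ratio is preserved. The padding offsets
    center the result in a target x target square (an assumed strategy).
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    longest = max(width, height)
    new_w = max(1, round(width * target / longest))
    new_h = max(1, round(height * target / longest))
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top


# A 1800 x 2400 portrait image scales to 192 x 256 with 32 px of side padding.
print(fit_within(1800, 2400))
```

The returned size could then be passed to any image library's resize call. That step is left out here so the sketch does not depend on a particular library.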
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for H&M Clothes captions
Dataset used to train/finetune [Clothes text to image model]. Captions are generated using the 'detail_desc' and 'colour_group_name' or 'perceived_colour_master_name' fields from kaggle/competitions/h-and-m-personalized-fashion-recommendations. The original images were also obtained from https://www.kaggle.com/competitions/h-and-m-personalized-fashion-recommendations/data?select=images.
For each row the dataset contains image and text… See the full description on the dataset page: https://huggingface.co/datasets/wbensvage/clothes_desc.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by harsh
Released under MIT
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This synthetic dataset simulates two years of transactional data for a multinational fashion retailer, featuring:
- 4+ million sales records
- 35 stores across 7 countries:
United States | China | Germany | United Kingdom | France | Spain | Portugal
Currencies Covered:
Each transaction includes detailed currency information, covering multiple currencies:
USD (United States) | EUR (Eurozone) | CNY (China) | GBP (United Kingdom)
Geographic Sales Comparison
Gain insights into how sales performance varies between regions and countries, and identify trends that drive success in different markets.
Analyze Staffing and Performance
Evaluate store staffing ratios and analyze the impact of employee performance on store success.
Customer Behavior and Segmentation
Understand regional customer preferences, analyze demographic factors such as age and occupation, and segment customers based on their purchasing habits.
Multi-Currency Analysis
Explore how transactions in different currencies (USD, EUR, CNY, GBP) are handled, analyze currency exchange effects, and compare sales across regions using multiple currencies.
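
Comparing sales across regions usually means converting every amount to one currency first. The sketch below shows that step under stated assumptions: the exchange rates are placeholder values chosen for illustration, not rates from the dataset, and the `(country, currency, amount)` tuple layout and the name `totals_in_usd` are hypothetical, since the dataset's actual column names are not given here.

```python
from collections import defaultdict

# Placeholder rates (USD per one unit of each currency), for illustration only.
ASSUMED_USD_RATES = {"USD": 1.0, "EUR": 1.10, "CNY": 0.14, "GBP": 1.25}


def totals_in_usd(transactions, rates=ASSUMED_USD_RATES):
    """Sum (country, currency, amount) transactions per country, in USD."""
    totals = defaultdict(float)
    for country, currency, amount in transactions:
        # A KeyError here signals a currency without a configured rate.
        totals[country] += amount * rates[currency]
    return dict(totals)


# Hypothetical transactions, not rows from the dataset.
sample = [
    ("United States", "USD", 100.0),
    ("Germany", "EUR", 50.0),
    ("Germany", "EUR", 50.0),
    ("China", "CNY", 1000.0),
]
usd_totals = totals_in_usd(sample)
print(usd_totals)
```

A fixed rate table keeps the example short. For a two-year series, rates that vary by transaction date would likely be more appropriate.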
Product Trends
Assess how product categories (e.g., Feminine, Masculine, Children) and specific product attributes (size, color) perform across different regions.
Pricing and Discount Analysis
Study how different pricing models and discounts affect sales and customer decisions across diverse geographies.
Advanced Cross-Country & Currency Analysis
Conduct complex, multi-dimensional analytics that interconnect countries, currencies, and sales data, identifying hidden correlations between economic factors, regional demand, and financial performance.
Generated using algorithms, it simulates real-world retail dynamics while ensuring privacy.
This dataset is an ideal resource for retail analysts, data scientists, and business intelligence professionals aiming to explore multinational retail data, optimize operations, and uncover new insights into customer behavior, sales trends, and employee efficiency.
Description:
This dataset offers a detailed collection of Khaadi's fashion line, providing a wide variety of high-quality images and metadata of clothing items such as dresses, shirts, and trousers. It is a powerful resource for exploring fashion trends, customer preferences, and product analysis.
Applications in Fashion Industry
This dataset is ideal for applications in retail analytics, trend identification, and predictive modeling. It supports tasks like developing personalized recommendation systems, inventory management, and understanding customer buying behavior. Businesses can utilize it to optimize their fashion offerings, enhance customer satisfaction, and increase sales by aligning with trends and preferences.
Machine Learning Applications
This dataset supports various machine learning and AI models, especially in the areas of image classification, visual search engines, and style-based recommendations. It can be used for training convolutional neural networks (CNNs) to recognize patterns in clothing styles, materials, and other relevant features, helping to build automated systems that assist in inventory categorization, demand prediction, and customer recommendations.
Conclusion
The Khaadi Fashion Dataset is a versatile and powerful resource for both the fashion and tech industries. Whether you're a retailer looking to enhance your customer offerings or a machine learning practitioner aiming to build robust models, this dataset equips you with the essential data for an in-depth analysis of fashion products.
This dataset is sourced from Kaggle.
Fashion-MNIST is a dataset of Zalando's article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.
To use this dataset:
import tensorflow_datasets as tfds

# Load the training split and print the first four examples.
ds = tfds.load('fashion_mnist', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/fashion_mnist-3.0.1.png
This dataset was created by Fares Dyab
Released under Apache 2.0
This dataset was created by mobin alhassan
It contains the following files:
Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. We intend Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.
The original MNIST dataset contains a lot of handwritten digits. Members of the AI/ML/Data Science community love this dataset and use it as a benchmark to validate their algorithms. In fact, MNIST is often the first dataset researchers try. "If it doesn't work on MNIST, it won't work at all", they said. "Well, if it does work on MNIST, it may still fail on others."
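
The description above mentions 10 classes without naming them. The lookup below is a small sketch that assumes the standard Fashion-MNIST label order published with the original dataset. The helper name `label_name` is hypothetical.

```python
# Standard Fashion-MNIST class names, indexed by integer label (0 to 9).
FASHION_MNIST_LABELS = (
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
)

IMAGE_SHAPE = (28, 28)  # grayscale, one channel
PIXELS_PER_IMAGE = IMAGE_SHAPE[0] * IMAGE_SHAPE[1]


def label_name(label_id):
    """Map an integer label to its class name, rejecting out-of-range ids."""
    if not 0 <= label_id < len(FASHION_MNIST_LABELS):
        raise ValueError(f"label must be in 0..9, got {label_id}")
    return FASHION_MNIST_LABELS[label_id]


print(label_name(9), PIXELS_PER_IMAGE)
```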
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by Fares Dyab
Released under MIT
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Pedro Klein
Released under Apache 2.0
Each image in the separate image set has a unique six-digit file name, such as 000001.jpg. A corresponding annotation file in txt format, such as 000001.txt, is provided in the annotation set. Each annotation file is organized as follows:
category_id: a number which corresponds to the category name. In category_id, 1 represents short sleeve top, 2 represents long sleeve top, 3 represents short sleeve outwear, 4 represents long sleeve outwear, 5 represents vest, 6 represents sling, 7 represents shorts, 8 represents trousers, 9 represents skirt, 10 represents short sleeve dress, 11 represents long sleeve dress, 12 represents vest dress and 13 represents sling dress.
bounding_box: [x1,y1,x2,y2], where x1 and y1 are the coordinates of the upper-left corner of the bounding box and x2 and y2 are the coordinates of the lower-right corner (width = x2 - x1; height = y2 - y1).
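
The two fields above can be interpreted with a few lines of code. This sketch works on already-parsed values because the exact layout of the annotation files is not shown here. The category names are copied from the list above, while the function name `describe_box` and the sample coordinates are hypothetical.

```python
# category_id -> category name, as listed in the annotation description.
CATEGORY_NAMES = {
    1: "short sleeve top",
    2: "long sleeve top",
    3: "short sleeve outwear",
    4: "long sleeve outwear",
    5: "vest",
    6: "sling",
    7: "shorts",
    8: "trousers",
    9: "skirt",
    10: "short sleeve dress",
    11: "long sleeve dress",
    12: "vest dress",
    13: "sling dress",
}


def describe_box(category_id, bounding_box):
    """Return the category name plus the box width and height in pixels."""
    x1, y1, x2, y2 = bounding_box
    if x2 < x1 or y2 < y1:
        raise ValueError("expected [x1, y1, x2, y2] with x2 >= x1 and y2 >= y1")
    return {
        "category": CATEGORY_NAMES[category_id],
        "width": x2 - x1,
        "height": y2 - y1,
    }


# Hypothetical annotation values for illustration.
info = describe_box(8, [120, 40, 360, 500])
print(info)
```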
The dataset is split into a training set (10K images), a validation set (2k images)
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by DILLIP MEHER
Released under Apache 2.0
This dataset has been created to support a set of experiments on using TFRecord for full GPU usage. I took the Fashion Classification Dataset, reduced the images to 256x256, and stored them in a set of TFRecord files that can be easily used in TF2 code for fast processing on GPU and TPU.
It is a set of files in TFRecord format. Each record contains an image of an item of clothing, a boot, or something similar, together with a set of metadata. This is the format:
import tensorflow as tf

# Feature spec for one record: encoded image bytes, file name, and two integer codes.
LABELED_TFREC_FORMAT = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "image_name": tf.io.FixedLenFeature([], tf.string),
    "base_colour": tf.io.FixedLenFeature([], tf.int64),
    "target": tf.io.FixedLenFeature([], tf.int64),
}
base_colour and target have been codified.
This is the list of Master Categories (target), codified with [0, .. 6]:
* Accessories
* Apparel
* Footwear
* Free Items
* Home
* Personal Care
* Sporting Goods
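
To turn a predicted target code back into a readable category, a lookup like the following may help. It assumes that the codes 0 to 6 follow the order in which the Master Categories are listed above, which the description implies but does not state outright. The helper names `decode_target` and `encode_target` are hypothetical.

```python
# Assumed mapping: code i corresponds to the i-th category in the listed order.
MASTER_CATEGORIES = [
    "Accessories",
    "Apparel",
    "Footwear",
    "Free Items",
    "Home",
    "Personal Care",
    "Sporting Goods",
]


def decode_target(code):
    """Map an integer target code (0..6) to its Master Category name."""
    if not 0 <= code < len(MASTER_CATEGORIES):
        raise ValueError(f"target code must be in 0..6, got {code}")
    return MASTER_CATEGORIES[code]


def encode_target(name):
    """Map a Master Category name to its integer code."""
    return MASTER_CATEGORIES.index(name)


print(decode_target(2), encode_target("Sporting Goods"))
```

It is worth checking the mapping against a few labeled records before relying on it.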
Original images taken from Param Aggarwal dataset: https://www.kaggle.com/paramaggarwal/fashion-product-images-dataset
I created this dataset to support a set of experiments around TFRecords. With it, you can easily create a Fashion Classifier that can be quickly trained. On a machine with two P100 GPUs, an epoch takes about 120 seconds, and around 20 epochs are enough to reach 0.994 accuracy on the test set.