The human-labelled product image dataset named “Products-10K” is so far the largest product recognition dataset, containing 10,000 products frequently bought by online customers at JD.com and covering a full spectrum of categories including fashion, 3C, food, healthcare, household commodities, etc. Moreover, the large-scale product labels are organized as a graph to indicate the complex hierarchy and interdependency among products.
Citation: Yalong Bai, Yuxiang Chen, Wei Yu, Linfang Wang, Wei Zhang. "Products-10K: A Large-scale Product Recognition Dataset". [arXiv]
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Because this dataset was used in a competition, we had to hide some of the data to prepare the competition's test set. Thus, in the previous version of the dataset, only the train.csv file was included.
This dataset represents 10 different physical poses that can be used to distinguish 5 exercises. The exercises are Push-up, Pull-up, Sit-up, Jumping Jack and Squat. For every exercise, 2 different classes have been used to represent the terminal positions of that exercise (e.g., “up” and “down” positions for push-ups).
About 500 videos of people performing the exercises were used to collect this data. The videos come from the Countix dataset, which contains YouTube links to several human activity videos. Using a simple Python script, the videos of the 5 different physical exercises were downloaded. From every video, at least 2 frames were manually extracted; the extracted frames represent the terminal positions of the exercise.
For every frame, the MediaPipe framework is used to apply pose estimation, which detects the skeleton of the person in the frame. The landmark model in MediaPipe Pose predicts the location of 33 pose landmarks (see the figure below). Visit the MediaPipe Pose Classification page for more details.
Figure (33 pose landmarks): https://mediapipe.dev/images/mobile/pose_tracking_full_body_landmarks.png
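For illustration, here is a minimal sketch of how such landmarks can be extracted from a single frame with the MediaPipe Python API; the frame path is a hypothetical example and not part of the dataset.

```python
# Minimal sketch: run MediaPipe Pose on one extracted frame and print the
# 33 landmark coordinates. The image path below is hypothetical.
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

image = cv2.imread("frames/pushup_up_001.jpg")  # hypothetical frame file
with mp_pose.Pose(static_image_mode=True) as pose:
    # MediaPipe expects RGB input, while OpenCV loads images as BGR.
    results = pose.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

if results.pose_landmarks:
    for i, lm in enumerate(results.pose_landmarks.landmark):
        # x, y are normalized to the image width/height; z is relative depth.
        print(i, lm.x, lm.y, lm.z, lm.visibility)
```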
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Letter Recognition is a dataset for object detection tasks - it contains annotations for the letter classes A through P across 2,860 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
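As a hedged example of the download step, the sketch below uses the Roboflow Python package; the API key, workspace and project slugs, and export format are placeholders, not values taken from this listing.

```python
# Minimal sketch: download a Roboflow dataset into a local folder.
# The API key, workspace, project slug, and export format are placeholders.
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("your-workspace").project("letter-recognition")  # hypothetical slugs
dataset = project.version(1).download("coco")  # export format is an assumption
print(dataset.location)  # local path of the downloaded dataset
```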
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Infrared Face Detection Dataset
The dataset contains 125,500+ images, including infrared images, from 4,484 individuals of various races, genders, and ages, with and without masks. It is specifically designed for research in face detection and facial recognition technology, focusing on the unique challenges posed by thermal infrared imaging. By utilizing this dataset, researchers and developers can enhance their understanding of recognition systems and improve recognition accuracy… See the full description on the dataset page: https://huggingface.co/datasets/UniDataPro/infrared-face-recognition-dataset.
https://cubig.ai/store/terms-of-service
1) Data Introduction • The Flowers Recognition Dataset is a multi-class image classification dataset consisting of flower images. It includes five categories: daisy, dandelion, rose, sunflower, and tulip.
2) Data Utilization (1) Characteristics of the Flowers Recognition Dataset: • The dataset contains real-world images with varying resolutions and aspect ratios, making it suitable for practicing tasks such as preprocessing, normalization, and data augmentation. • The diversity in background, lighting, and composition makes the dataset appropriate for evaluating model performance in realistic scenarios.
(2) Applications of the Flowers Recognition Dataset: • Development of plant recognition applications: Can be used as training data for mobile apps or photo-based plant identification services. • Training multi-class image classification models: Useful for building deep learning models that automatically classify flower types and for experimenting with transfer learning.
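As one illustration of the transfer-learning use case above, here is a minimal sketch that fine-tunes a pretrained ResNet on the five flower classes; the "flowers/" folder layout (one subfolder per class) and the hyperparameters are assumptions, not part of the dataset description.

```python
# Minimal transfer-learning sketch for the five flower classes.
# The "flowers/<class>/" folder layout is an assumption.
import torch
from torch import nn
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),  # normalize the varying resolutions/aspect ratios
    transforms.ToTensor(),
])
train_set = datasets.ImageFolder("flowers/", transform=transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 5)  # daisy, dandelion, rose, sunflower, tulip

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)  # train only the new head
criterion = nn.CrossEntropyLoss()
model.train()
for images, labels in loader:  # one epoch shown for brevity
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```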
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Includes face images of 11 subjects with 3 sets of images: one of the subject with no occlusion, one of them wearing a hat, and one of them wearing glasses. Each set consists of 5 subject positions (subject's two profile positions, one central position, and two positions angled between the profile and central positions), with 7 lighting angles for each position (completing a 180 degree arc around the subject), and 5 light settings for each angle (warm, cold, low, medium, and bright). Images are 5184 pixels tall by 3456 pixels wide and are saved in .JPG format.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The database used in this study comprises 44,406 fruit images, which we collected over a period of 6 months. The images were captured in our lab environment under the different scenarios mentioned below. All images were captured against a clear background at a resolution of 320×258 pixels, using an HD Logitech web camera. While collecting this database, we introduced the kinds of challenges that arise in real-world recognition scenarios in supermarkets and fruit shops, such as light, shadow, sunshine, and pose variation; to be robust to these, a model must cope with illumination variation, camera capture artifacts, specular reflection, shading, and shadows. We tested our model's robustness in all scenarios and it performed quite well.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Image dataset of face images for computer vision tasks
Dataset comprises 500,600+ images of individuals representing various races, genders, and ages, with each person having a single face image. It is designed for facial recognition and face detection research, supporting the development of advanced recognition systems. By leveraging this dataset, researchers and developers can enhance deep learning models, improve face verification and face identification techniques, and refine… See the full description on the dataset page: https://huggingface.co/datasets/UniDataPro/face-recognition-image-dataset.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The dataset contains a comprehensive collection of human activity videos spanning 7 distinct classes: clapping, meeting and splitting, sitting, standing still, walking, walking while reading a book, and walking while using a phone.
Each video clip in the dataset showcases a specific human activity and has been labeled with the corresponding class to facilitate supervised learning.
The primary inspiration behind creating this dataset is to enable machines to recognize and classify human activities accurately. With the advent of computer vision and deep learning techniques, it has become increasingly important to train machine learning models on large and diverse datasets to improve their accuracy and robustness.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Facial recognition datasets consist solely of images of faces, with no additional annotations. They include diverse examples of facial features, poses, and lighting conditions, and are used to train and evaluate facial recognition systems for tasks like face detection and recognition.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Crowd Recognition Dataset is a dataset for object detection tasks - it contains Frame The Red Blue And Black Ch annotations for 200 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Explore the Wildlife Recognition Dataset with 9,638 high-resolution images across 17 species, perfect for deep learning, wildlife research, and conservation.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This is a remote sensing Military Aircraft Recognition dataset that includes 3,842 images, 20 aircraft types, and 22,341 instances annotated with both horizontal and oriented bounding boxes.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Includes videos of 11 subjects, each showing 18 different angles of their face for one second each. The process was repeated with 5 light settings (warm, cold, low, medium, and bright). Videos are recorded at a resolution of 3840 pixels tall by 2160 pixels wide and are saved in .MP4 format.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The dataset covers a wide range of indoor environments, including residential spaces, offices, classrooms, libraries, kitchens, and more.
Video dataset featuring five hand gestures for sign language recognition, gesture-based controls, VR interaction, and robotics
https://www.gnu.org/licenses/gpl-3.0-standalone.html
Mudestreda Multimodal Device State Recognition Dataset
Obtained from a real industrial milling device, with time series and image data for classification, regression, anomaly detection, remaining useful life (RUL) estimation, signal drift measurement, zero-shot flank tool wear, and feature engineering purposes.
The official dataset used in the paper "Multimodal Isotropic Neural Architecture with Patch Embedding" (ICONIP 2023).
Official repository: https://github.com/hubtru/Minape
Conference paper: https://link.springer.com/chapter/10.1007/978-981-99-8079-6_14
Mudestreda (MD) | Size: 512 samples (instances, observations) | Modalities: 4 | Classes: 3
Future research: Regression, Remaining Useful Life (RUL) estimation, Signal Drift detection, Anomaly Detection, Multivariate Time Series Prediction, and Feature Engineering.
Notice: Tables and images do not render properly.
Recommended: see README.md, which includes the Mudestreda description and the images Mudestreda.png and Mudestreda_Stage.png.
Data Overview
Task: Uni/Multi-Modal Classification
Domain: Industrial Flank Tool Wear of the Milling Machine
Input (sample): 4 Images: 1 Tool Image, 3 Spectrograms (X, Y, Z axis)
Output: Machine state classes: Sharp, Used, Dulled
Evaluation: Accuracy, Precision, Recall, F1-score, ROC curve
Each tool's wear is categorized sequentially: Sharp → Used → Dulled.
The dataset includes measurements from ten tools: T1 to T10.
Data splitting options include random or chronological distribution, without shuffling.
Options:
Original data or Augmented data
Random distribution or Tool Distribution (see Dataset Splitting)
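To make the per-sample structure concrete, below is a minimal PyTorch sketch of a loader that pairs the tool image with the three spectrograms and a state label. The directory layout and file names are assumptions for illustration, not the official Mudestreda layout described in the README.

```python
# Minimal sketch of a multimodal sample loader in the spirit of Mudestreda:
# 1 tool image + 3 spectrograms (X, Y, Z) per sample, one of 3 state classes.
# The "<class>/sample_*/" layout and file names below are hypothetical.
from pathlib import Path
from PIL import Image
import torch
from torch.utils.data import Dataset

CLASSES = {"sharp": 0, "used": 1, "dulled": 2}

class MultimodalToolWearDataset(Dataset):
    def __init__(self, root, transform=None):
        self.samples = sorted(Path(root).glob("*/sample_*"))  # hypothetical layout
        self.transform = transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        sample_dir = self.samples[idx]
        label = CLASSES[sample_dir.parent.name]  # class taken from the parent folder name
        # Four modalities: the tool photo plus the X/Y/Z spectrograms.
        names = ["tool.png", "spec_x.png", "spec_y.png", "spec_z.png"]
        images = [Image.open(sample_dir / n).convert("RGB") for n in names]
        if self.transform:
            images = [self.transform(img) for img in images]
        return images, torch.tensor(label)
```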
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Includes videos of 11 subjects, each showing 18 different angles of their face for one second each. The process was repeated with 5 light settings (warm, cold, low, medium, and bright). Videos are recorded at a resolution of 3840 pixels tall by 2160 pixels wide and are saved in .MP4 format.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains images for gesture recognition, divided into two main sets: dataset0608 and data_synthetic_variab. The data was collected using a wooden hand.

dataset0608
This set covers two RIS modes, ris_random and ris_optimized, which differ only in the configuration of the RIS (random or optimized). It is organized into four subfolders that differ in the format of the data:
- ris_random and ris_optimized: data is stored in individual files for each frame, named 'frame_{i}{posture}{n_med}'.
- ris_random2 and ris_optimized2: data has already been processed and combined into single files for all frames using the compact_files_frames.txt function, named 'all_frames_{posture}_{n_med}'.
For each gesture in {close, two, open}, there are n_med values from 0 to 114 and 10 frames. Therefore, the ris_random and ris_optimized folders contain 10 frames × 115 measurements × 3 gestures = 3450 files, while the ris_random2 and ris_optimized2 folders contain 1 × 115 measurements × 3 gestures = 345 files. A small enumeration sketch of this naming scheme follows this description.

data_synthetic_variab
This set also covers the two RIS modes, ris_random and ris_optimized, and is organized into the same four subfolders with the same file formats:
- ris_random and ris_optimized: individual files per frame, named 'frame_{i}{posture}{n_med}'.
- ris_random2 and ris_optimized2: combined files for all frames, named 'all_frames_{posture}_{n_med}'.
For each gesture in {close, two, open}, there are n_med values from 0 to 8 and 10 frames. This set provides additional synthetic data with variations in hand position to increase the dataset's diversity: each gesture is represented in 8 different ways, with the hand position slightly modified between samples. These real data were used as a basis for generating synthetic data; using the functions in the files "multiply_files.txt" and "add_gaussian_noise.txt", the dataset was expanded and made more realistic by adding Gaussian noise to the images. Therefore, the ris_random and ris_optimized folders contain 10 frames × 8 measurements × 3 gestures = 240 files, while the ris_random2 and ris_optimized2 folders contain 1 × 8 measurements × 3 gestures = 24 files.

Functions
- add_gaussian_noise.txt: adds Gaussian noise to the images to simulate real-world conditions and improve the robustness of the model.
- compact_files_frames.txt: combines multiple frames into a single image, which can be useful for certain types of analysis.
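The enumeration sketch below makes the dataset0608 naming scheme and file counts concrete; the ".npy" extension is an assumption and should be adjusted to the actual file format.

```python
# Enumerate the expected file names for the ris_random / ris_optimized folders
# of dataset0608 (10 frames x 115 measurements x 3 gestures = 3450 files) and
# for the combined ris_random2 / ris_optimized2 folders (345 files).
# The ".npy" extension is an assumption.
gestures = ["close", "two", "open"]

frame_files = [
    f"frame_{i}{posture}{n_med}.npy"
    for posture in gestures
    for n_med in range(115)  # n_med values 0..114
    for i in range(10)       # 10 frames per measurement
]
combined_files = [
    f"all_frames_{posture}_{n_med}.npy"
    for posture in gestures
    for n_med in range(115)
]
print(len(frame_files), len(combined_files))  # 3450 345
```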
Facial Expression Recognition dataset helps AI interpret human emotions for improved sentiment analysis and recognition