- Secure Implementation: An NDA is signed to guarantee secure implementation, and Annotated Imagery Data is destroyed upon delivery.
- Quality: Multiple rounds of quality inspections ensure high-quality data output, certified to ISO 9001.
https://www.wiseguyreports.com/pages/privacy-policy
| Attribute | Details |
|---|---|
| BASE YEAR | 2024 |
| HISTORICAL DATA | 2019 - 2023 |
| REGIONS COVERED | North America, Europe, APAC, South America, MEA |
| REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
| MARKET SIZE 2024 | 3.75 (USD Billion) |
| MARKET SIZE 2025 | 4.25 (USD Billion) |
| MARKET SIZE 2035 | 15.0 (USD Billion) |
| SEGMENTS COVERED | Application, Deployment Type, End Use Industry, Type of Annotation, Regional |
| COUNTRIES COVERED | US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA |
| KEY MARKET DYNAMICS | Growing AI adoption, increasing data volume, demand for automation, enhanced accuracy requirements, need for regulatory compliance |
| MARKET FORECAST UNITS | USD Billion |
| KEY COMPANIES PROFILED | Cognizant, Health Catalyst, Microsoft Azure, Slydian, Scale AI, Lionbridge AI, Samarthanam Trust, DataRobot, Clarifai, SuperAnnotate, Amazon Web Services, Appen, Google Cloud, iMerit, TAGSYS, Labelbox |
| MARKET FORECAST PERIOD | 2025 - 2035 |
| KEY MARKET OPPORTUNITIES | Increased AI adoption, Demand for automated solutions, Advancements in machine learning, Expanding IoT data sources, Need for regulatory compliance |
| COMPOUND ANNUAL GROWTH RATE (CAGR) | 13.4% (2025 - 2035) |
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset consists of drone images collected for agricultural field monitoring to detect weeds and crops through computer vision and machine learning approaches. The images were captured by high-resolution UAVs and annotated using the LabelImg and Roboflow tools. Each image has a corresponding YOLO annotation file that contains bounding box information and class IDs for the detected objects. The dataset includes:
Original images in .jpg format with a resolution of 585 × 438 pixels.
Annotation files (.txt) corresponding to each image, following the YOLO format: class_id x_center y_center width height.
A classes.txt file listing the object categories used in labeling (e.g., Weed, Crop).
The dataset is intended for use in machine learning model development, particularly for precision agriculture, weed detection, and plant health monitoring. It can be directly used for training YOLOv7 and other object detection models.
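As a quick illustration of how these files fit together, here is a minimal Python sketch that reads a classes.txt file and one YOLO .txt label file of the kind described above; the file names are hypothetical.

```python
from pathlib import Path

# Minimal sketch: read classes.txt and the matching YOLO label file for
# one image. The file names used below are illustrative assumptions.
def load_yolo_labels(label_path, classes_path):
    class_names = Path(classes_path).read_text().splitlines()
    boxes = []
    for line in Path(label_path).read_text().splitlines():
        if not line.strip():
            continue
        class_id, x_center, y_center, width, height = line.split()
        boxes.append({
            "class": class_names[int(class_id)],
            "x_center": float(x_center),  # normalized 0-1
            "y_center": float(y_center),
            "width": float(width),
            "height": float(height),
        })
    return boxes

# Example usage (hypothetical file names):
# boxes = load_yolo_labels("images/field_001.txt", "classes.txt")
```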
https://www.wiseguyreports.com/pages/privacy-policy
| Attribute | Details |
|---|---|
| BASE YEAR | 2024 |
| HISTORICAL DATA | 2019 - 2023 |
| REGIONS COVERED | North America, Europe, APAC, South America, MEA |
| REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
| MARKET SIZE 2024 | 2.72 (USD Billion) |
| MARKET SIZE 2025 | 3.06 (USD Billion) |
| MARKET SIZE 2035 | 10.0 (USD Billion) |
| SEGMENTS COVERED | Application, Deployment Type, End Use, Annotation Type, Regional |
| COUNTRIES COVERED | US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA |
| KEY MARKET DYNAMICS | Rising demand for AI models, Increasing data volume, Need for cost-effective solutions, Advancements in machine learning, Growing partnerships and collaborations |
| MARKET FORECAST UNITS | USD Billion |
| KEY COMPANIES PROFILED | IBM, Edgecase, AWS, Mighty AI, CrowdFlower, NVIDIA, Clarifai, Gigantum, Microsoft, Labelbox, Zegami, Scale AI, Google, SuperAnnotate, DataRobot |
| MARKET FORECAST PERIOD | 2025 - 2035 |
| KEY MARKET OPPORTUNITIES | Rising demand for AI training data, Increasing adoption of machine learning, Expansion in autonomous vehicle technology, Growth in healthcare automation, Surge in data-driven decision making |
| COMPOUND ANNUAL GROWTH RATE (CAGR) | 12.6% (2025 - 2035) |
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
The SkySeaLand Dataset is a high-resolution satellite imagery collection developed for object detection, classification, and aerial analysis tasks. It focuses on transportation-related objects observed from diverse geospatial contexts, offering precise YOLO-formatted annotations for four categories: airplane, boat, car, and ship.
This dataset bridges terrestrial, maritime, and aerial domains, providing a unified resource for developing and benchmarking computer vision models in complex real-world environments.
Annotations are provided in YOLO format (one .txt file per image). The SkySeaLand Dataset is divided into the following subsets for training, validation, and testing:
This split ensures a balanced distribution for training, validating, and testing models, facilitating robust model evaluation and performance analysis.
| Class Name | Object Count |
|---|---|
| Airplane | 4,847 |
| Boat | 3,697 |
| Car | 6,932 |
| Ship | 3,627 |
The dataset maintains a moderately balanced distribution among categories, ensuring stable model performance during multi-class training and evaluation.
Each label file contains normalized bounding box annotations in YOLO format.
The format for each line is: class_id x_center y_center width height
Where:
- class_id: the class of the object (refer to the table below).
- x_center, y_center: the center coordinates of the bounding box, normalized between 0 and 1 relative to the image width and height.
- width, height: the width and height of the bounding box, also normalized between 0 and 1.
| Class ID | Category |
|---|---|
| 0 | Airplane |
| 1 | Boat |
| 2 | Car |
| 3 | Ship |
All coordinates are normalized between 0 and 1 relative to the image width and height.
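Because every coordinate is normalized, a small conversion step is needed before drawing or cropping boxes. The sketch below shows that conversion, assuming the image width and height are known; the example values are illustrative.

```python
# Minimal sketch: convert one normalized YOLO box to pixel coordinates
# (x_min, y_min, x_max, y_max). Image size values are illustrative.
def yolo_to_pixels(x_center, y_center, width, height, img_w, img_h):
    x_min = (x_center - width / 2) * img_w
    y_min = (y_center - height / 2) * img_h
    x_max = (x_center + width / 2) * img_w
    y_max = (y_center + height / 2) * img_h
    return x_min, y_min, x_max, y_max

# Example: a box near the middle of a 1024x1024 satellite tile
print(yolo_to_pixels(0.5, 0.5, 0.2, 0.1, 1024, 1024))  # (409.6, 460.8, 614.4, 563.2)
```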
Data Source:
- Satellite imagery was obtained from Google Earth Pro under fair-use and research guidelines.
- The dataset was prepared solely for academic and educational computer vision research.
Annotation Tools:
- Manual annotations were performed and verified using:
- CVAT (Computer Vision Annotation Tool)
- Roboflow
These tools were used to ensure consistent annotation quality and accurate bounding box placement across all object classes.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this project, we aim to annotate car images captured on highways. The annotated data will be used to train machine learning models for various computer vision tasks, such as object detection and classification.
For this project, we will be using Roboflow, a powerful platform for data annotation and preprocessing. Roboflow simplifies the annotation process and provides tools for data augmentation and transformation.
Roboflow offers data augmentation capabilities, such as rotation, flipping, and resizing. These augmentations can help improve the model's robustness.
Once the data is annotated and augmented, Roboflow allows us to export the dataset in various formats suitable for training machine learning models, such as YOLO, COCO, or TensorFlow Record.
By completing this project, we will have a well-annotated dataset ready for training machine learning models. This dataset can be used for a wide range of applications in computer vision, including car detection and tracking on highways.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The dataset comprises annotated video frames from a camera positioned in a public space. Each individual in the camera's view has been tracked using the rectangle tool in the Computer Vision Annotation Tool (CVAT).
The images directory houses the original video frames, serving as the primary source of raw data. The annotations.xml file provides the detailed annotation data for the images. The boxes directory contains frames that visually represent the bounding box annotations, showing the locations of the tracked individuals within each frame; these images can be used to understand how the tracking has been implemented and to visualize the marked areas for each individual. The annotations are represented as rectangular bounding boxes placed around each individual. Each bounding box annotation contains the position (xtl, ytl, xbr, ybr coordinates) of the respective box within the frame.
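For readers who want to work with annotations.xml programmatically, the following sketch parses CVAT track annotations with the standard library. It assumes the usual CVAT video-export layout of <track> elements containing <box> elements with xtl/ytl/xbr/ybr attributes; adjust it if the export differs.

```python
import xml.etree.ElementTree as ET

# Minimal sketch: read CVAT track annotations from annotations.xml.
# Assumes the CVAT "for video" XML layout; adapt names if needed.
def load_cvat_tracks(xml_path):
    tracks = {}
    root = ET.parse(xml_path).getroot()
    for track in root.iter("track"):
        track_id = int(track.get("id"))
        boxes = []
        for box in track.iter("box"):
            boxes.append({
                "frame": int(box.get("frame")),
                "xtl": float(box.get("xtl")),
                "ytl": float(box.get("ytl")),
                "xbr": float(box.get("xbr")),
                "ybr": float(box.get("ybr")),
            })
        tracks[track_id] = boxes
    return tracks

# tracks = load_cvat_tracks("annotations.xml")
```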
keywords: multiple people tracking, human detection dataset, object detection dataset, people tracking dataset, tracking human object interactions, human Identification tracking dataset, people detection annotations, detecting human in a crowd, human trafficking dataset, deep learning object tracking, multi-object tracking dataset, labeled web tracking dataset, large-scale object tracking dataset
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This project involves annotating players on a rugby league field in a set of video frames. The goal is to label each player with a bounding box in each frame.
We have extracted around 1500 frames from rugby league videos, and we need to annotate the players in each frame. The labels should be accurate and consistent across all frames.
I've uploaded the dataset so you can use the built-in annotation tool to label each player with a bounding box. To get started, follow these steps:
1. Open the annotation tool and select the first frame in the dataset.
2. Use the rectangle tool to draw a bounding box around each player in the frame.
3. Add the label 'Player' to each bounding box.
4. Move to the next frame in the dataset and repeat steps 2-3.
5. Continue annotating all frames in the dataset until all players are labeled.
We recommend exporting the labels in the YOLO format.
If you have any questions or concerns about the annotation process, please don't hesitate to reach out to us.
Thank you for your help with this project!
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The dataset was created for bee object detection based on images. Videos were taken at the entrance of 25 beehives in three apiaries in San Jose, Cupertino, and Gilroy in CA, USA. The videos were taken above the landing pad of different beehives. The camera was placed at a distinct angle to provide a clear view of the hive entrance.
The images were saved at one frame per second from the videos. The annotation platform Label Studio was selected to annotate bees in each image due to its user-friendly interface and high annotation quality. The following criteria were followed in the labeling process. First, at least 50% of the bee's body must be visible. Second, the image cannot be too blurry. After tagging each bee with a rectangle box in the annotation tool, output label files in YOLO labeling format were generated for each image. The output label files contain one set of bounding-box (BBox) coordinates for each bee in the image. If there are multiple objects in the image, there is one line per object in the label file. Each line records the object ID, X-axis center, Y-axis center, BBox width, and BBox height, normalized to the image size from 0 to 1.
Please cite the paper if you used the data in your research: Liang, A. (2024). Developing a multimodal system for bee object detection and health assessment. IEEE Access, 12, 158703 - 15871. https://doi.org/10.1109/ACCESS.2024.3464559.
Large Public Aquaria are complex ecosystems that require constant monitoring to detect and correct anomalies that may affect the habitat and their species. Many of those anomalies can be directly or indirectly spotted by monitoring the behavior of fish. This can be a quite laborious task to be done by biologists alone. Automated fish tracking methods, especially of the non-intrusive type, can help biologists in the timely detection of such events. These systems require annotated data of fish to be trained. We used footage collected from the main aquarium of Oceanário de Lisboa to create a novel dataset with fish annotations from the shark and ray species. The dataset has the following characteristics:
- 66 shark training tracks with a total of 15812 bounding boxes
- 88 shark testing tracks with a total of 15978 bounding boxes
- 133 ray training tracks with a total of 28168 bounding boxes
- 192 ray testing tracks with a total of 31529 bounding boxes
The training set corresponds to a calm enviro...
The dataset was collected using a stationary camera positioned outside the main tank of Oceanário de Lisboa aiming at the fish. Additionally, this data was processed using the CVAT annotation tool to create the sharks and rays annotations.
# Sharks and rays swimming in a large public aquarium
Each set has 2 folders: gt and img1. The gt folder contains 3 txt files: gt, gt_out and labels. The gt and gt_out files contain the bounding box annotations sorted in two distinct ways. The former has the annotations sorted by frame number, while the latter is sorted by the track ID. Each line of the ground truth files represents one bounding box of a fish trajectory. The bounding boxes are represented with the following format: frame id, track id, x, y, w, h, not ignored, class id, visibility. The folder img1 contains all the annotated frames.
frame id points to the frame where the bounding box was obtained;
track id identifies the track of a fish with which the bounding box is associated;
x and y are the pixel coordinates of the top left corner of the bounding box;
w and h are the width and height of the bounding box respectively. These variables are measured in terms of pixels o...
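A minimal sketch for reading a ground-truth file in the comma-separated layout described above might look like this; the file path is an assumption.

```python
import csv

# Minimal sketch: parse a ground-truth file whose lines follow the layout
# frame id, track id, x, y, w, h, not ignored, class id, visibility.
# The file path below is an assumption.
def load_tracks(gt_path):
    rows = []
    with open(gt_path, newline="") as f:
        for frame, track, x, y, w, h, keep, cls, vis in csv.reader(f):
            rows.append({
                "frame": int(frame), "track": int(track),
                "x": float(x), "y": float(y), "w": float(w), "h": float(h),
                "not_ignored": int(keep), "class": int(cls),
                "visibility": float(vis),
            })
    return rows

# rows = load_tracks("train/gt/gt.txt")
```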
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains high-quality MRI images of brain tumors with detailed annotations. The dataset is meticulously curated, cleaned, and annotated to aid in the development and evaluation of machine learning models for brain tumor detection and classification.
The dataset includes a total of 5,249 MRI images divided into training and validation sets. Each image is annotated with bounding boxes in YOLO format, and labels corresponding to one of the four classes of brain tumors.
The images in the dataset are from different angles of MRI scans including sagittal, axial, and coronal views. This variety ensures comprehensive coverage of brain anatomy, enhancing the robustness of models trained on this dataset.
The bounding boxes were manually annotated using the LabelImg tool by a dedicated team. This rigorous process ensures high accuracy and reliability of the annotations.
This dataset was inspired by two existing datasets:
1. https://www.kaggle.com/datasets/masoudnickparvar/brain-tumor-mri-dataset
2. https://www.kaggle.com/datasets/sartajbhuvaji/brain-tumor-classification-mri
A thorough cleaning process was performed to remove noisy, mislabeled, and poor-quality images, resulting in a high-quality and well-labeled dataset.
This dataset is suitable for training and validating deep learning models for the detection and classification of brain tumors. The variety in MRI scan angles and the precision of annotations provide an excellent foundation for developing robust computer vision applications in medical imaging.
If you use this dataset in your research or project, please consider citing it appropriately to acknowledge the effort put into its creation and annotation.
https://creativecommons.org/publicdomain/zero/1.0/
This project demonstrates the process of creating a labeled dataset for computer vision tasks using web scraping and the CVAT annotation tool. Web scraping was employed to gather images from the web, and CVAT was utilized to annotate these images with bounding boxes around objects of interest. This dataset can then be used to train object detection models.
The requests and Beautiful Soup libraries were likely used for this task. This dataset can be used to train object detection models for bird species identification. It can also be used to evaluate the performance of existing object detection models on a specific dataset.
The code used for this project is available in the attached notebook. It demonstrates how to perform the following tasks:
This project provides a comprehensive guide to data annotation for computer vision tasks. By combining web scraping and CVAT, we were able to create a high-quality labeled dataset for training object detection models.
Sources: github.com/cvat-ai/cvat, opencv.org/blog/data-annotation/
{"version":"1.1"}
{"type":"images"}
{"name":"Spot-billed_Pelican_-_Pelecanus_philippensis_-_Media_Search_-_Macaulay_Library_and_eBirdMacaulay_Library_logoMacaulay_Library_lo/10001","extension":".jpg","width":480,"height":360,"meta":{"related_images":[]}}
{"name":"Spot-billed_Pelican_-_Pelecanus_philippensis_-_Media_Search_-_Macaulay_Library_and_eBirdMacaulay_Library_logoMacaulay_Library_lo/10002","extension":".jpg","width":480,"height":320,"meta":{"related_images":[]}}
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The QT-MSTR dataset is a text detection and recognition dataset focused on multi-lingual scenes in the Qinghai-Tibet Plateau region of China. It aims to provide high-quality benchmark data for research areas such as Tibetan OCR, multi-lingual scene text recognition, and low-resource language processing through real-world street-view images. Data were collected between 2020 and 2023, covering key urban areas in the Qinghai-Tibet region, including Xining and Haidong in Qinghai Province, Gannan Tibetan Autonomous Prefecture and Tianzhu Tibetan Autonomous County in Gansu Province, as well as Lhasa in the Tibet Autonomous Region. The collection focused on public spaces where multi-lingual text commonly appears, such as commercial streets, tourist service points, transportation hubs, and areas around public facilities, to accurately reflect the "Tibetan-Chinese-English" multilingual environment of the region. Data were captured using mainstream smartphone rear cameras and portable digital cameras under natural lighting conditions, with all images saved at their original resolution (primarily 4032×3024 pixels). In terms of data processing, we established a standardized annotation pipeline. First, all images underwent strict privacy protection processing, with faces and license plates that could involve personal identity information being blurred. Subsequently, annotators proficient in Tibetan, Chinese, and English performed initial annotations using the LabelMe tool. The annotation content includes not only precise bounding boxes (quadrilateral annotations) for text lines but also language information (Tibetan, Chinese, English, numeric, or mixed text) and the corresponding transcribed text. To strictly control data quality, we implemented a dual process of automated script validation and expert review, focusing on checking the structural integrity of JSON files, the validity of bounding boxes, and the accuracy of language tags, with manual emphasis on reviewing ambiguous samples identified by the automated process. The final dataset consists of 1,000 original images and exactly 1,000 paired annotation files in JSON format. Each data file is named according to the "QT[category]_[sequence number]" rule (e.g., QTdor_001.jpg and QTdor_001.json), ensuring a one-to-one correspondence between images and annotations. The annotation files adopt a standard structure that clearly defines the geometric location, language attribute, and text content of each text instance in the image. The dataset is complete, with no missing values or invalid samples. Potential errors introduced during the annotation process mainly stem from text blurring under extreme lighting or partial occlusion in complex backgrounds; the bounding box annotations for such samples have all been reviewed by experts to ensure overall annotation accuracy. The dataset uses common .jpg (image) and .json (annotation) formats and can be read and processed using any deep learning framework (such as PyTorch or TensorFlow) and common annotation tools that support these formats, with no need for specific niche software.
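The JSON annotations can be read with any standard JSON library. The sketch below assumes the conventional LabelMe layout (a "shapes" list whose entries hold a label and the four corner points of each quadrilateral); the exact attribute names used for language tags and transcriptions in QT-MSTR are assumptions and should be adapted to the released schema.

```python
import json

# Minimal sketch: read one QT-MSTR style annotation file, assuming the
# standard LabelMe JSON layout. The attribute names for language and
# transcription are assumptions; adapt them to the released schema.
def load_text_regions(json_path):
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)
    regions = []
    for shape in data.get("shapes", []):
        regions.append({
            "label": shape.get("label"),    # e.g. language tag or transcription
            "points": shape.get("points"),  # four [x, y] corners of the quadrilateral
        })
    return regions

# regions = load_text_regions("QTdor_001.json")
```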
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Semantic PASCAL-Part dataset
The Semantic PASCAL-Part dataset is the RDF version of the famous PASCAL-Part dataset used for object detection in Computer Vision. Each image is annotated with bounding boxes containing a single object. Couples of bounding boxes are annotated with the part-whole relationship. For example, the bounding box of a car has the part-whole annotation with the bounding boxes of its wheels.
This original release joins Computer Vision with Semantic Web as the objects in the dataset are aligned with concepts from:
the provided supporting ontology;
the WordNet database through its synsets;
the Yago ontology.
The provided Python 3 code (see the GitHub repo) can browse the dataset and convert it into RDF knowledge graph format. This new format fosters research in both the Semantic Web and Machine Learning fields.
Structure of the semantic PASCAL-Part Dataset
This is the folder structure of the dataset:
semanticPascalPart: it contains the refined images and annotations (e.g., small specific parts are merged into bigger parts) of the PASCAL-Part dataset in Pascal-voc style.
Annotations_set: the test set annotations in .xml format. For further information, see the PASCAL VOC format here.
Annotations_trainval: the train and validation set annotations in .xml format. For further information, see the PASCAL VOC format here.
JPEGImages_test: the test set images in .jpg format.
JPEGImages_trainval: the train and validation set images in .jpg format.
test.txt: the 2416 image filenames in the test set.
trainval.txt: the 7687 image filenames in the train and validation set.
The PASCAL-Part Ontology
The PASCAL-Part OWL ontology formalizes, through logical axioms, the part-of relationship between whole objects (22 classes) and their parts (39 classes). The ontology contains 85 logical axioms in Description Logic of (for example) the following form:
Every potted_plant has exactly 1 plant AND has exactly 1 pot
We provide two versions of the ontology: with and without cardinality constraints in order to allow users to experiment with or without them. The WordNet alignment is encoded in the ontology as annotations. We further provide the WordNet_Yago_alignment.csv file with both WordNet and Yago alignments.
The ontology can be browsed with many Semantic Web tools such as:
Protégé: a graphical tool for ontology modelling;
OWLAPI: Java API for manipulating OWL ontologies;
rdflib: Python API for working with the RDF format.
RDF stores: databases for storing and semantically retrieving RDF triples. See here for some examples.
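As a small illustration of the rdflib route, the sketch below loads the ontology and lists its OWL classes; the file name pascal_part.owl and the RDF/XML serialization are assumptions, so substitute the ontology file shipped with the dataset.

```python
from rdflib import Graph, RDF, OWL

# Minimal sketch: load the PASCAL-Part ontology with rdflib and list the
# declared OWL classes. File name and serialization are assumptions.
g = Graph()
g.parse("pascal_part.owl", format="xml")

for cls in g.subjects(RDF.type, OWL.Class):
    print(cls)
```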
Citing semantic PASCAL-Part
If you use semantic PASCAL-Part in your research, please use the following BibTeX entry:
@article{DBLP:journals/ia/DonadelloS16,
  author  = {Ivan Donadello and Luciano Serafini},
  title   = {Integration of numeric and symbolic information for semantic image interpretation},
  journal = {Intelligenza Artificiale},
  volume  = {10},
  number  = {1},
  pages   = {33--47},
  year    = {2016}
}
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The Tasmanian Orange Roughy Stereo Image Machine Learning Dataset is a collection of annotated stereo image pairs collected by a net-attached Acoustic and Optical System (AOS) during orange roughy (Hoplostethus atlanticus) biomass surveys off the northeast coast of Tasmania, Australia in July 2019. The dataset consists of expertly annotated imagery from six AOS deployments (OP12, OP16, OP20, OP23, OP24, and OP32), representing a variety of conditions including different fish densities, benthic substrates, and altitudes above the seafloor. Each image was manually annotated with bounding boxes identifying orange roughy and other marine species. For all annotated images, paired stereo images from the opposite camera have been included where available to enable stereo vision analysis. This dataset was specifically developed to investigate the effectiveness of machine learning-based object detection techniques for automating fish detection under variable real-world conditions, providing valuable resources for advancing automated image processing in fisheries science.
Lineage: Data were obtained onboard the 32 m Fishing Vessel Saxon Onward during an orange roughy acoustic biomass survey off the northeast coast of Tasmania in July 2019. Stereo image pairs were collected using a net-attached Acoustic and Optical System (AOS), which is a self-contained autonomous system with multi-frequency and optical capabilities mounted on the headline of a standard commercial orange roughy demersal trawl. Images were acquired by a pair of Prosilica GX3300 Gigabit Ethernet cameras with Zeiss F2.8 lenses (25 mm focal length), separated by 90 cm and angled inward at 7° to provide 100% overlap at a 5 m range. Illumination was provided by two synchronised quantum trio strobes. Stereo pairs were recorded at 1 Hz in JPG format with a resolution of 3296 x 2472 pixels and a 24-bit depth.
Human experts manually annotated images from the six deployments using both the CVAT annotation tool (producing COCO format annotations) and LabelImg tool (producing XML format annotations). Only port camera views were annotated for all deployments. Annotations included bounding boxes for "orange roughy" and "orange roughy edge" (for partially visible fish), as well as other marine species (brittle star, coral, eel, miscellaneous fish, etc.). Prior to annotation, under-exposed images were enhanced based on altitude above the seafloor using a Dark Channel Prior (DCP) approach, and images taken above 10 m altitude were discarded due to poor visibility.
For all annotated images, the paired stereo images (from the opposite camera) have been included where available to enable stereo vision applications. The dataset represents varying conditions of fish density (1-59 fish per image), substrate types (light vs. dark), and altitudes (2.0-10.0 m above seafloor), making it particularly valuable for training and evaluating object detection models under variable real-world conditions.
The final standardised COCO dataset contains 1051 annotated port-side images, 849 paired images (without annotations), and 14414 total annotations across 17 categories. The dataset's category distribution includes orange roughy (9887), orange roughy edge (2928), mollusc (453), cnidaria (359), misc fish (337), sea anemone (136), sea star (105), sea feather (100), sea urchin (45), coral (22), eel (15), oreo (10), brittle star (8), whiptail (4), chimera (2), siphonophore (2), and shark (1).
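To get a quick overview of such a COCO-format file, a short sketch like the following can tally annotations per category; the file name annotations.json is an assumption.

```python
import json
from collections import Counter

# Minimal sketch: tally annotations per category in a COCO-format file.
# The file name "annotations.json" is an assumption.
with open("annotations.json") as f:
    coco = json.load(f)

names = {c["id"]: c["name"] for c in coco["categories"]}
counts = Counter(names[a["category_id"]] for a in coco["annotations"])
for name, n in counts.most_common():
    print(f"{name}: {n}")
```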
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
I3D Tools Dataset
This is the official dataset for the "I3D Tools Dataset" paper. The dataset contains a diverse collection of 16 hand tool categories, curated for applications in object detection, segmentation, and synthetic data generation. Codebase:
📊 Dataset Statistics
Number of Tool Classes: 16
Total Images: ~35,000
Image Resolution: 1024x1024
Annotations per Image:
YOLOv8 bounding box format
Pixel-level segmentation mask
Natural language caption… See the full description on the dataset page: https://huggingface.co/datasets/i3dlabiisc/I3D-Tools-Dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This study presents a comprehensive dataset comprising 2,517 images of apple fruitlets, each uniformly rescaled to 500×500 pixels to maintain dimensional consistency. Manual annotation was performed using LabelImg software, with bounding box coordinates stored in XML format. To enhance usability across different platforms, we additionally provide annotations in TXT format. Figure 3 illustrates a representative annotation example. During the annotation process, special attention was given to: (1) precise localization of small fruitlets, and (2) accurate annotation of partially occluded targets, where only visible portions were labeled to minimize false positives in subsequent analyses.
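The XML annotations produced by LabelImg follow the Pascal VOC layout, so they can be read with the standard library as in the sketch below; the file name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Minimal sketch: read LabelImg-style Pascal VOC XML and return the boxes
# as (name, xmin, ymin, xmax, ymax). The file name is an assumption.
def load_voc_boxes(xml_path):
    boxes = []
    root = ET.parse(xml_path).getroot()
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bb = obj.find("bndbox")
        boxes.append((
            name,
            int(float(bb.findtext("xmin"))), int(float(bb.findtext("ymin"))),
            int(float(bb.findtext("xmax"))), int(float(bb.findtext("ymax"))),
        ))
    return boxes

# boxes = load_voc_boxes("fruitlet_0001.xml")
```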
https://spdx.org/licenses/CC0-1.0.html
Our dataset, Nest Monitoring of the Kagu, consists of around ten days (253 hours) of continuous monitoring sampled at 25 frames per second. Our proposed dataset aims to facilitate computer vision research that relates to event detection and localization. We fully annotated the entire dataset (23M frames) with spatial localization labels in the form of a tight bounding box. Additionally, we provide temporal event segmentation labels of five unique bird activities: Feeding, Pushing leaves, Throwing leaves, Walk-In, and Walk-Out. The feeding event represents the period of time when the birds feed the chick. The nest-building events (pushing/throwing leaves) occur when the birds work on the nest during incubation. Pushing leaves is a nest-building behavior during which the birds form a crater by pushing leaves with their legs toward the edges of the nest while sitting on the nest. Throwing leaves is another nest-building behavior during which the birds throw leaves with the bill towards the nest while being, most of the time, outside the nest. Walk-in and walk-out events represent the transitioning events from an empty nest to incubation or brooding, and vice versa. We also provide five additional labels that are based on time-of-day and lighting conditions: Day, Night, Sunrise, Sunset, and Shadows. In our manuscript, we provide a baseline approach that detects events and spatially localizes the bird in each frame using an attention mechanism. Our approach does not require any labels and uses a predictive deep learning architecture that is inspired by cognitive psychology studies, specifically, Event Segmentation Theory (EST). We split the dataset such that the first two days are used for validation, and performance evaluation is done on the last eight days.
Methods
The video monitoring system consisted of a commercial infrared illuminator surveillance camera (Sony 1/3′′ CCD image sensor), and an Electret mini microphone with built-in SMD amplifier (Henri Electronic, Germany), connected to a recording device via a 6.4-mm multicore cable. The transmission cable consisted of a 3-mm coaxial cable for the video signal, a 2.2-mm coaxial cable for the audio signal and two 2-mm (0.75 mm2) cables to power the camera and microphone. We powered the systems with 25-kg deep cycle, lead-acid batteries with a storage capacity of 100 Ah. We used both Archos™ 504 DVRs (with 80 GB hard drives) and Archos 700 DVRs (with 100 GB hard drives). All cameras were equipped with 12 infrared light emitting diodes (LEDs) for night vision. We have manually annotated the dataset with temporal events, time-of-day/lighting conditions, and spatial bounding boxes without relying on any object detection/tracking algorithms. The temporal annotations were initially created by experts who study the behavior of the Kagu bird and later refined to improve the precision of the temporal boundaries. Additional labels, such as lighting conditions, were added during the refinement process. The spatial bounding box annotations of 23M frames were created manually using professional video editing software (Davinci Resolve). We attempted to use available data annotation software tools, but they did not work for the scale of our video (10 days of continuous monitoring). We resorted to video editing software, which helped us annotate and export bounding box masks as videos. The masks were then post-processed to convert annotations from binary mask frames to bounding box coordinates for storage.
It is worth noting that the video editing software allowed us to linearly interpolate between keyframes of the bounding box annotations, which helped save time and effort when the bird's motion was linear. Both temporal and spatial annotations were verified by two volunteer graduate students. The process of creating spatial and temporal annotations took approximately two months.
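A minimal sketch of that mask-to-box post-processing step, assuming each exported mask frame is available as a binary NumPy array (an assumption about the intermediate representation), could look like this:

```python
import numpy as np

# Minimal sketch: derive a tight bounding box (x_min, y_min, x_max, y_max)
# from one binary mask frame. Returns None if the mask is empty.
def mask_to_bbox(mask: np.ndarray):
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Example with a toy 100x100 mask containing a filled rectangle
mask = np.zeros((100, 100), dtype=np.uint8)
mask[20:40, 30:70] = 1
print(mask_to_bbox(mask))  # (30, 20, 69, 39)
```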
Our aim is to identify hand gestures from a given image and display the result in text format or audio, which will be useful for hearing-impaired people. To train the CNN model, we have prepared our own dataset. The dataset details are as follows:
Image Resolution: 12 megapixels; Image Size: 1920 × 1080
The dataset has 9 hand gestures. The following are the hand gestures:
| Class ID | Class Name |
|---|---|
| 1 | Have |
| 2 | Nice |
| 3 | Day |
| 4 | Early |
| 5 | Morning |
| 6 | Wakeup |
| 7 | Love |
| 8 | Funny |
| 9 | You |
Train dataset has 232 images and Validation Dataset has 55 images.
All the images are annotated using the VGG Annotator tool.
Annotation Details:
- Each hand gesture is annotated with polygon coordinates.
- Only the hand region (palm and fingers) is annotated.
- Annotation information is stored in a JSON file (via_region_annotation.json).
Each hand gesture has 20 images in the train dataset and 5 images in the validation dataset.
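The via_region_annotation.json file mentioned above can be read with a short script such as the sketch below, which assumes the VIA 2.x export layout where each image entry carries a "regions" list with polygon shape_attributes; adapt it if the export uses an older layout.

```python
import json

# Minimal sketch: read polygon regions from a VGG Image Annotator (VIA)
# export, assuming the VIA 2.x JSON layout.
def load_via_polygons(json_path):
    with open(json_path) as f:
        data = json.load(f)
    polygons = {}
    for entry in data.values():
        points = []
        for region in entry.get("regions", []):
            sa = region["shape_attributes"]
            points.append(list(zip(sa["all_points_x"], sa["all_points_y"])))
        polygons[entry["filename"]] = points
    return polygons

# polygons = load_via_polygons("via_region_annotation.json")
```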
In our project, we have used Mask R-CNN to detect the hand gestures. It gives three results: the class name, the bounding box regression output, and the segmentation mask.
Accuracy Score: Intersection over Union (IoU) of 0.875 and mAP (mean Average Precision) of 0.95.
If you have any queries, please reach out to us via email (HSL.Queries@gmail.com).
VOC2012 Image and Annotation Visualization Notebook
GitHub: https://github.com/ikaankeskin/MLXdatasets/tree/main/ObjectDetection/PASCAL
HuggingFace: https://huggingface.co/datasets/ikaankeskin/PASCAL_MLX
This repository contains a tool that facilitates the download, extraction, and visualization of the VOC2012 dataset, complete with bounding box annotations extracted from associated XML files.
Features
Automated Dataset Download: Fetches the VOC2012 dataset from… See the full description on the dataset page: https://huggingface.co/datasets/ikaankeskin/PASCAL_MLX.
- Secure Implementation: An NDA is signed to guarantee secure implementation, and Annotated Imagery Data is destroyed upon delivery.
- Quality: Multiple rounds of quality inspections ensure high-quality data output, certified to ISO 9001.