This dataset contains images and a set of labels that expose certain characteristics of those images, such as varroa-mite infections, bees carrying pollen packets, or bees that are cooling the hive by flapping their wings. Additionally, this dataset contains images of wasps so that bees and wasps can be distinguished.
The images of the bees are taken from above and rotated. The bee is vertical and either its head or the trunk is on top. All images were taken with a green background and the distance to the bees was always the same, thus all bees have the same size.
Each image can have multiple labels assigned to it. E.g. a bee can be cooling the hive and have a varroa-mite infection at the same time.
This dataset is designed as a multi-label dataset, where each label, e.g. varroa_output, contains 1 if the characteristic was present in the image and 0 if it wasn't. All images are provided at 300 pixel height and 150 pixel width. By default the dataset provides the images as 150x75 (h,w) pixels. You can select 300 pixel height by loading the dataset with the name "bee_dataset/bee_dataset_300" and 200 pixel height with "bee_dataset/bee_dataset_200".
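As a minimal sketch of how such multi-label targets can be read, the snippet below turns one label record into the list of characteristics that are present. Only `varroa_output` is named in the description; the other keys (`pollen_output`, `cooling_output`, `wasps_output`) are assumed names used for illustration, and the example record is made up.

```python
# Label keys: only "varroa_output" appears in the description above;
# the remaining names are assumptions for illustration.
LABEL_KEYS = ("varroa_output", "pollen_output", "cooling_output", "wasps_output")


def active_labels(record):
    """Return the names of all labels set to 1 in a multi-label record."""
    return [key for key in LABEL_KEYS if int(record.get(key, 0)) == 1]


def to_multi_hot(record):
    """Encode a record as a fixed-order 0/1 vector, one entry per label."""
    return [1 if int(record.get(key, 0)) == 1 else 0 for key in LABEL_KEYS]


# Hypothetical record: a bee that is cooling the hive and carries a mite.
example = {"varroa_output": 1, "pollen_output": 0, "cooling_output": 1, "wasps_output": 0}
print(active_labels(example))
print(to_multi_hot(example))
```

Because the labels are independent 0/1 flags, any combination can occur in one image, which is why a fixed-order vector is used here rather than a single class index.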
License: GNU GENERAL PUBLIC LICENSE
Author: Fabian Hickert Fabian.Hickert@raspbee.de
To use this dataset:
import tensorflow_datasets as tfds

# Load the training split with the default configuration.
ds = tfds.load('bee_dataset', split='train')
# Print the first four examples.
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/bee_dataset-bee_dataset_300-1.0.0.png
License: https://cdla.io/permissive-1-0/
Boxes on Bees and Pollen
Overview
The goal of the BeeLivingSensor project is to non-invasively track honey bees at hive entrances, and to track the type and volume of pollen they bring into the hive. By analyzing the color of the pollen and aggregating it with other data, the project aims to determine the plant biodiversity around the beehive.
This data set contains approximately 5,000 images of bees annotated with bounding boxes on both bees and pollen, for a total of around 50,000 annotations.
Data format
The zipfile linked below contains 4993 image files, each of which is associated with an .xml file of the same name. For example, “Chueried_Churied_01_ST_216.xml” contains the annotations for the image “Chueried_Churied_01_ST_216.jpg”.
Annotations are in the Pascal VOC XML format for object detection. A typical individual bee object, for example, would be annotated as:
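As a minimal sketch of what such an object entry looks like and how to read it, the snippet below assumes the standard Pascal VOC layout (an `object` element with a `name` and a `bndbox` holding `xmin`, `ymin`, `xmax`, `ymax`). The file name comes from the example above; the class name `bee` and the coordinates are illustrative placeholders, not values taken from the dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the standard Pascal VOC layout; the class name
# and box coordinates are made up for illustration.
SAMPLE_XML = """
<annotation>
  <filename>Chueried_Churied_01_ST_216.jpg</filename>
  <object>
    <name>bee</name>
    <bndbox>
      <xmin>100</xmin><ymin>50</ymin><xmax>180</xmax><ymax>140</ymax>
    </bndbox>
  </object>
</annotation>
"""


def read_voc_boxes(xml_text):
    """Return a list of (class_name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bndbox = obj.find("bndbox")
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((name, *coords))
    return boxes


print(read_voc_boxes(SAMPLE_XML))
```

To read a file from the archive instead of a string, `ET.parse(path).getroot()` could replace the `ET.fromstring` call.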
Citation
If you use these data in a publication or report, please use the following citation:
Noninvasive bee tracking in videos: deep learning algorithms and cloud platform design specifications. Dataset, 2021.
License: Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The Bee Image Object Detection dataset was generated for the purpose of detecting bee objects within images. The dataset comprises videos captured at the entrances of 25 beehives situated in three separate apiaries in San Jose, Cupertino, and Gilroy, CA, USA. These videos were recorded directly above the landing pads of various beehives. The camera was positioned at a unique angle to capture distinct and clear images of bees engaged in activities such as taking off, landing, or moving around on the landing pad.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Honey bee dataset
The dataset contains 157,529 high quality images of centred and cropped honey bees from video data.
Specification
85 individuals (10 drones, 75 worker bees)
Individuals only, centred and cropped to 512x512 pixels
Good colour fidelity, uniform lighting
Files grouped by contiguous path segments with an average length of 46 frames and a standard deviation of 51 frames
Video captured at 60 frames per second
A total of 3430 trajectories and 157529 images are available
Folder structure
The first level separates the 87 individuals and gives information on sex and pollen-bearing trait:
[worker/drone]_[pollen/nopollen]_YYYY-MM-DDTHH-MM-SS_
The second level enumerates the trajectories of each individual:
[01234]
The data is contained in the third level as .jpg files. A .csv file contains the x and y coordinates of the cropped bee in the original video file.
Capturing procedure
Each individual was recorded for 5 minutes. Trajectories were broken when the individual was close to the edge of the image, making it impossible to capture the centre of the image.
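The first-level folder naming pattern can be parsed mechanically. The sketch below assumes folder names start exactly with the `[worker/drone]_[pollen/nopollen]_YYYY-MM-DDTHH-MM-SS_` pattern given above; whatever follows the trailing underscore is ignored, and the sample folder name is hypothetical.

```python
import re
from datetime import datetime

# Pattern taken from the folder-structure description; anything after the
# trailing underscore is not specified there and is therefore ignored.
FOLDER_RE = re.compile(
    r"^(?P<kind>worker|drone)_(?P<pollen>pollen|nopollen)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2})_"
)


def parse_individual_folder(name):
    """Extract bee kind, pollen flag, and recording time from a folder name."""
    match = FOLDER_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected folder name: {name!r}")
    return {
        "kind": match.group("kind"),
        "pollen": match.group("pollen") == "pollen",
        "recorded_at": datetime.strptime(match.group("stamp"), "%Y-%m-%dT%H-%M-%S"),
    }


# Hypothetical folder name that follows the documented pattern.
info = parse_individual_folder("worker_nopollen_2021-06-15T10-30-00_")
print(info)
```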
License: http://www.gnu.org/licenses/lgpl-3.0.html
This dataset was captured in July 2022 using the second generation of Bee Visual Inspector. Please check the project repository on GitHub.
It contains image, sound, and sensor data. Images without bees were manually removed. Sound and sensor data were collected every 900 s. Sensor data cover the following categories:
Bees were collected in 24 fields across eastern Iowa in summer 2019. This data collection was part of a pesticide study funded by the USGS Ecosystems Mission Area- Environmental Health Program. Bees were collected using the sweep net method and then were immediately placed on dry ice in the field. Bees were kept frozen to prevent degradation. In the lab, each wild bee was photographed from one or more angles using an AmScope microscope fitted with an MU1400 digital camera at 20x magnification. Bees were then morphologically identified based on the images. All images were checked for quality control before they were archived on this site. This data release includes 1) a txt file with bee identifications and image names, 2) zipped images of the bees, and 3) a text file with alternative text for each image.
License: https://creativecommons.org/publicdomain/zero/1.0/
Every third bite of food relies on pollination by bees. At the same time, this past winter honeybee hive losses have exceeded 60% in some states. How can we address this issue? How can we better understand our bees? And most importantly, how can we save them before it's too late?
While many indications of hive strength and health are visible on the inside of the hive, frequent check-ups on the hive are time-consuming and disruptive to the bees' workflow and hive in general. By investigating the bees that leave the hive, we can gain a more complete understanding of the hive itself. For example, an unhealthy hive infected with varroa mites will have bees with deformed wings or mites on their backs. These characteristics can be observed without opening the hive. To protect against robber bees, we could track the ratio of pollen-carrying bees vs those without. A large influx of bees without pollen may be an indication of robber bees. This dataset aims to provide basic visual data to train machine learning models to classify bees in these categories, paving the way for more intelligent hive monitoring or beekeeping in general.
This dataset contains 5,100+ bee images annotated with location, date, time, subspecies, health condition, caste, and pollen.
The original batch of images was extracted from still time-lapse videos of bees. A background image was calculated by averaging the frames, and each frame of the video was then subtracted against that background to bring out the bees in the foreground. The bees were then cropped out of the frame so that each image has only one bee. Because each video is accompanied by a form with information about the bees and hive, the labeling process is semi-automated. Image crop quality differs from video to video. This dataset will be updated as more videos and data become available.
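The averaging-and-subtraction step described above can be sketched as follows. This is an illustration under stated assumptions rather than the author's actual pipeline: frames are assumed to be grayscale arrays of equal size, and the `threshold` value is a hypothetical parameter chosen for the synthetic example.

```python
import numpy as np


def foreground_masks(frames, threshold=30):
    """Mark pixels that deviate from the time-averaged background.

    frames: array of shape (n_frames, height, width), grayscale.
    threshold: assumed cut-off on the absolute difference (hypothetical value).
    """
    frames = np.asarray(frames, dtype=np.float32)
    background = frames.mean(axis=0)      # average over all frames
    diff = np.abs(frames - background)    # per-frame deviation from background
    return diff > threshold


# Synthetic video: a static background of value 100 and one bright pixel
# standing in for a bee in the first frame.
frames = np.full((10, 4, 4), 100, dtype=np.uint8)
frames[0, 1, 2] = 250
masks = foreground_masks(frames)
print(masks[0].astype(int))
```

A bounding box around the connected foreground pixels of each mask would then give the crop for a single bee.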
-1 means the information is coming soon.
Thank you to everyone who has submitted a video:
James Temple
Ken McKenzie
Howard Wetsman
Daniel Long
Michael J. Gras
John Therriault
Cal Hansen
Jim Davis
Jack Goral
How can we improve our understanding of a hive through images of bees?
How can we expedite the hive checkup process?
How can bee image data help us recognize problems earlier?
How can bee image data help us save our bees?
If you would like to contribute or learn more, please fill out this form to be added to the email list: https://goo.gl/forms/FzSUhw6z9QMSTpaH2, or contact jy2k16@gmail.com
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The bee detection and segmentation datasets contain directories of images and labels. The detection dataset contains 7200 frames. The segmentation dataset contains 2300 cropped images of bees, labeled with a triangle shape for direction vector estimation. Labels are saved in YOLO format. Frame resolution: 1920x1080. Frames were captured 30 cm above 8 different beehive landing boards. Time: June-July 2023. Location: Lithuania.
In version no. 5, labels were corrected in the detection dataset, and a pose directory was added for bee direction estimation. Annotations in the pose dataset are saved in YOLO format. If the bee is fully visible, the first point [px1, py1] marks the head and the second point [px2, py2] marks the stinger. If only part of the bee is visible, the first point marks the front and the second point marks the back. Annotation format in the pose dataset: [class=0, x, y, w, h, px1, py1, px2, py2]. The pose dataset contains a total of 400 frames of 8 beehive entrances, 50 frames per beehive.
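A bee's direction can be derived from the two keypoints of a pose annotation. The sketch below assumes whitespace-separated values normalized to the frame size (the usual YOLO convention) and reuses the 1920x1080 resolution stated for the frames; the sample line is made up.

```python
import math


def parse_pose_line(line, frame_w=1920, frame_h=1080):
    """Parse [class, x, y, w, h, px1, py1, px2, py2] and compute the heading."""
    values = line.split()
    class_id = int(values[0])
    x, y, w, h, px1, py1, px2, py2 = (float(v) for v in values[1:9])
    # Convert the normalized keypoints to pixel coordinates.
    head = (px1 * frame_w, py1 * frame_h)
    tail = (px2 * frame_w, py2 * frame_h)
    # The direction vector points from the back (stinger) to the front (head).
    dx, dy = head[0] - tail[0], head[1] - tail[1]
    # Image y grows downward, so positive angles turn clockwise on screen.
    heading_deg = math.degrees(math.atan2(dy, dx))
    return {
        "class_id": class_id,
        "box": (x, y, w, h),
        "direction": (dx, dy),
        "heading_deg": heading_deg,
    }


# Hypothetical annotation: a bee facing right along the image x-axis.
pose = parse_pose_line("0 0.5 0.5 0.1 0.2 0.55 0.5 0.45 0.5")
print(pose["direction"], pose["heading_deg"])
```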
In version no. 6, the ramp detection dataset was added. It contains 156 images. Annotation format in the ramp detection dataset: [class=0, x, y, w, h, px1, py1, px2, py2, px3, py3, px4, py4]. Here, x, y, w, h are the coordinates of the bounding box and px1, py1, px2, py2, px3, py3, px4, py4 are the coordinates of 4 keypoints: 1 - top left, 2 - top right, 3 - bottom right, 4 - bottom left.
The tracking and behavior datasets consist of annotated MP4 files with tracks of bees during foraging, defense, fanning, and washboarding. They contain 17162 tracks and 21946 frames. The tracks and behavior are annotated only for the bees that are in the entrance zone. The bounding boxes are normalized to the width and height of the frame. Annotation format in the tracking dataset: [frame-id, track-id, bb-left, bb-top, bb-width, bb-height]. Annotation format in the behavior dataset: [frame-id, track-id, bb-left, bb-top, bb-width, bb-height, class-for, class-def, class-fan, class-wash]
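Since the tracking boxes are normalized, they need to be scaled back to pixels before drawing or cropping. The sketch below groups annotations by track and converts them to pixel boxes; it assumes comma-separated lines and a 1920x1080 video frame, neither of which is stated for the MP4 files, and the sample lines are made up.

```python
from collections import defaultdict


def load_tracks(lines, frame_w=1920, frame_h=1080):
    """Group [frame-id, track-id, left, top, width, height] rows by track.

    The comma separator and the frame size are assumptions. Extra columns,
    such as the behavior classes, are ignored.
    """
    tracks = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        frame_id, track_id, left, top, width, height = line.split(",")[:6]
        box = (
            round(float(left) * frame_w),
            round(float(top) * frame_h),
            round(float(width) * frame_w),
            round(float(height) * frame_h),
        )
        tracks[int(track_id)].append((int(frame_id), box))
    return dict(tracks)


# Hypothetical annotation lines for two tracks.
sample = [
    "1,7,0.25,0.5,0.125,0.25",
    "2,7,0.5,0.5,0.125,0.25",
    "2,9,0.0,0.0,0.5,0.5",
]
tracks = load_tracks(sample)
print(tracks)
```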
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
(C) 2017 Ivan Rodriguez, Rémi Mégret, Edgar Acuña, José Agosto, Tugrul Giray
This image dataset has been created from videos captured at the entrance of a bee colony in June 2017 at the Bee facility of the Gurabo Agricultural Experimental Station of the University of Puerto Rico.
images/ contains images of pollen-bearing and non-pollen-bearing honey bees.
File names look like NP1268-15r.jpg for non-pollen and P7797-103r.jpg for pollen-bearing bees. The numbers correspond to the frame and item number, respectively; note that they are not numbered sequentially. The Read-skimage.ipynb Jupyter notebook contains a simple script to load the data and create the dataset using the skimage library.
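The label, frame, and item number can be recovered from the file name alone. The sketch below is inferred only from the two example names above (it is not the code in Read-skimage.ipynb) and assumes that every name follows the same `NP`/`P`, frame, `-`, item, `r.jpg` pattern.

```python
import re

# Pattern inferred from the two example names; the trailing "r" is treated
# as a fixed literal because its meaning is not documented.
NAME_RE = re.compile(r"^(?P<prefix>NP|P)(?P<frame>\d+)-(?P<item>\d+)r\.jpg$")


def parse_image_name(name):
    """Return the pollen label, frame number, and item number of an image."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return {
        "pollen": match.group("prefix") == "P",
        "frame": int(match.group("frame")),
        "item": int(match.group("item")),
    }


print(parse_image_name("NP1268-15r.jpg"))
print(parse_image_name("P7797-103r.jpg"))
```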
This dataset is based upon work supported by the National Science Foundation under Grant No. 1707355 and 1633184.
If you publish work based on this dataset, please cite the following publication:
Thanks to UPR students Grace Rodriguez, Christian Esteves and Emmanuel Nieves for their help in the video annotations. Thanks to UPR students Stephanie Feliciano and Janpierre Aleman for their help in the development of the camera system.
This dataset is shared on Kaggle under licenses CC-BY 4.0 and ODC-ODbL 1.0
This dataset contains synthetically generated honeybee images produced using a Deep Convolutional Generative Adversarial Network (DCGAN). These images were created as part of a research project on improving honeybee health monitoring and supporting machine-learning models designed to detect Varroa destructor infestations.
All images in this dataset are artificially generated and do not depict real bees. The DCGAN was trained on publicly available honeybee image datasets collected from the following sources:
Yang, J. (2018). The BeeImage Dataset: Annotated Honey Bee Images; Kaggle. https://www.kaggle.com/datasets/jenny18/honey-bee-annotated-images
Schurischuster, S., & Martin, K. (2020, October). VarroaDataset; Zenodo. https://doi.org/10.5281/zenodo.4085043
Hickert, F. (2021). Dataset for a camera based bee-hive monitoring; BeeAlarmed - a Camera Based Bee-Hive Monitoring. https://github.com/BeeAlarmed
These authors created the original real-image datasets. This synthetic dataset is intended solely for research and educational use, and all credit for the foundational imagery belongs to the above creators.
The source code used to train the DCGAN and generate images can be found here: https://github.com/sammyjnor/VarroaDCGAN
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We used 6,332 images from the iNaturalist platform (https://www.inaturalist.org) to train, validate, and test two AI models. One classifies bee vs. non-bee insects and the other classifies bumble bee vs. non-bumble-bee insects.
Dataset Card for "dreambooth-bee-images"
More Information needed
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The BEEHIVE dataset has been created for Precision Agriculture, Measurement Science, and Entomology research specifically dealing with Apis mellifera (common honeybee) image analysis. The contents of the dataset include data acquired from two cameras located at different observation points: the "Frame" dataset, acquired with a camera placed inside a frame of the beehive and depicting very close-range images of the bees (potentially a few with Varroa destructor mites on their backs), and the "Bottom" dataset, acquired with a camera positioned at the bottom of the beehive. In this case, a metallic grid partially occludes the view.
The two datasets are already subdivided into training, validation, and test sub-datasets following a 70%-20%-10% splitting protocol. The "Frame" dataset includes 1,440 training images, 411 validation images, and 206 test images. The "Bottom" dataset includes 1,044 training images, 303 validation images, and 147 test images. Each dataset includes annotations obtained in RoboFlow for the task of object detection, considering two classes: "bee" and "blurred_bee" for the "Frame" dataset, "bee" and "occluded_bee" for the "Bottom" dataset.
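As a quick arithmetic check, the stated image counts can be compared with the nominal 70%-20%-10% protocol. The helper name below is hypothetical; the counts are the ones given above.

```python
def split_percentages(train, val, test):
    """Return the train/validation/test shares in percent, one decimal each."""
    total = train + val + test
    return tuple(round(100 * n / total, 1) for n in (train, val, test))


frame_split = split_percentages(1440, 411, 206)   # "Frame" dataset counts
bottom_split = split_percentages(1044, 303, 147)  # "Bottom" dataset counts
print(frame_split, bottom_split)
```

Both splits land within about half a percentage point of the nominal shares, which is expected when whole images are assigned to each subset.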
The data is valuable for the fields of Precision Agriculture, Entomology, Measurement Science, and Computer Vision, especially for bee monitoring, counting, and the detection of potential parasites by training image-based Deep Learning models. It is also useful as a reference dataset for benchmarking models.
If you use this dataset for your work, please cite the related paper: https://doi.org/10.3390/s24165270
This dataset was created by Shaylynn Morphew
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset is composed of 91 images of bees on sunflowers. The photos were taken in the new community of Taichung City, Taiwan, from 3:35 to 3:40 pm on Monday, November 4, 2019, with an iPhone XR. Each image is 1478*1108 pixels, in .jpg format. To train the model more effectively, we used multi-angle rotation for data augmentation, expanding the dataset to three times its original size. Regions of interest were marked manually in Matlab, for a total of 412 annotations, and the annotations are saved in a .mat file.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset supports the analyses and experiments of the paper:
Blaha et al., "Effective Searching for the Honeybee Queen in a Living Colony," CASE 2024. DOI: 10.1109/CASE59546.2024.10711366
The dataset provides comprehensive observations of honeybee queen behaviour within an observation hive captured using an automated robotic system. The primary objective is to facilitate research in precision beekeeping by offering detailed spatial and temporal data on bee activities.
The data was collected using a robotic system equipped with sensors and cameras designed to non-invasively monitor bee activity within the hive. The system performed systematic scans of predefined locations, capturing images and detecting bee positions and orientations. This approach minimizes disturbance to the bees and ensures consistent data quality. For more details, see Ulrich et al., "Autonomous tracking of honey bee behaviours over long-term periods with cooperating robots," Science Robotics 2024. DOI: 10.1126/scirobotics.adn6848.
The dataset consists of one file 2023-month-queenpos-short.txt which is a CSV formatted file containing the following columns:
| Column Number | Column Name | Units | Description |
| 1 | Timestamp | ns | Timestamp of the detection in Unix epoch seconds (UTC). |
| 2 | Camera | - | String representing the particular camera (e.g., "/hive_2/xy_1/camera_1" represents hive n. 2, robot on side 1, camera is duplicated) |
| 3 | Heading | rad | Orientation of the Whycon marker in the image. |
| 4 | X - Queen | m | X-coordinate of the honeybee queen's position in meters. |
| 5 | Y - Queen | m | Y-coordinate of the honeybee queen's position in meters. |
| 6 | X - Camera | m | X-coordinate of the camera's position in meters. |
| 7 | Y - Camera | m | Y-coordinate of the camera's position in meters. |
| 8 | X - Queen in Image | m | X-coordinate of the queen's metric position in the image in meters. |
| 9 | Y - Queen in Image | m | Y-coordinate of the queen's metric position in the image in meters. |
Note: the observation hive has two sides, each monitored by a separate robot.
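The column layout above can be read with the standard `csv` module. This sketch assumes the file has no header row and uses comma separators; the short field names and the sample row are hypothetical. The timestamp is kept as text because the table lists its unit as ns while describing it as epoch seconds.

```python
import csv
import io
import math

# Short field names for the nine documented columns (names are assumptions).
COLUMNS = (
    "timestamp", "camera", "heading",
    "x_queen", "y_queen", "x_camera", "y_camera",
    "x_queen_img", "y_queen_img",
)
TEXT_FIELDS = {"timestamp", "camera"}


def read_queen_rows(text):
    """Parse CSV text into dicts, converting the numeric columns to float."""
    rows = []
    for raw in csv.reader(io.StringIO(text)):
        if len(raw) != len(COLUMNS):
            continue  # skip blank or malformed lines
        row = dict(zip(COLUMNS, raw))
        for key in COLUMNS:
            if key not in TEXT_FIELDS:
                row[key] = float(row[key])
        rows.append(row)
    return rows


def queen_camera_distance(row):
    """Euclidean distance in meters between the queen and the camera."""
    return math.hypot(row["x_queen"] - row["x_camera"], row["y_queen"] - row["y_camera"])


# Hypothetical row that follows the documented column order.
sample = "1690000000,/hive_2/xy_1/camera_1,1.57,0.30,0.40,0.00,0.00,0.01,0.02\n"
rows = read_queen_rows(sample)
print(rows[0]["camera"], queen_camera_distance(rows[0]))
```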
To attribute this dataset in your research, please cite the two corresponding papers:
Blaha et al., "Effective Searching for the Honeybee Queen in a Living Colony," CASE 2024. DOI: 10.1109/CASE59546.2024.10711366
Ulrich et al., "Autonomous tracking of honey bee behaviours over long-term periods with cooperating robots," Science Robotics 2024. DOI: 10.1126/scirobotics.adn6848
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
In 2014–2022, USDA-ARS Tucson, AZ, by itself and in collaboration with other precision apiculture (PA) research programs, including the PA program at Utah State University, and several commercial operations, acquired a large reservoir of multi-sensor data, including thousands of frame photographs and sensor measurements, from field experiments with managed honey bee colonies. This reservoir is a loose collection of hive frame photos, CSV files, spreadsheets, and hive inspection text logs. Our project explores and exploits this reservoir and makes public its curated subsets. This dataset is the first such subset we curated in 2024-25 under USDA-NIFA Award 205732 "DSFAS - Exploration and Exploitation of the 2014-2022 USDA-ARS Tucson, AZ Digital Data Reservoir of Field Experiments with Managed Honey Bee Colonies."
The zipped directory ANNOTATED_HIVE_FRAMES includes 13 image subdirectories with annotated images.
1) 2013_07_28_CHBRC -- 57 files
2) 2014_07_30_12_CHBRC -- 111 files
3) 2015_02_11_MAC_RR -- 660 files
4) 2016_03_30_HOOPS -- 153 files
5) 2017_02_01_SRER_BEAR_CAGE -- 87 files
6) 2018_02_13_SRER_SC_complete_3_9_25 -- 195 files
7) 2018_04_18_SRER_SC_Methoxy -- 366 files
8) 2019_07_11_SRER_BC_Neonic -- 60 files
9) 2020_02_27_RR_Hive_Directions -- 36 files
10) 2021_06_08_CHBRC_VLAD -- 282 files
11) 2021_09_27_RR_ColdStor -- 855 files
12) 2021_02_11_CT_ColdStor -- 111 files
13) 2014_12_15_50_CHBRC -- 30 files
The name of each subfolder includes the year, month, and day on which the frame photos were taken, followed by the location of the apiary where the photos were taken.
The de-abbreviations are as follows:
CHBRC -- Carl Hayden Bee Research Center
MAC -- Maricopa Agriculture Center
RR -- Red Rock Agriculture Center
HOOPS -- one of the apiaries at CHBRC
SRER -- Santa Rita Experimental Range
SC -- Shipping Corrals
CT -- Cow Town
Each of the 13 subdirectories has three subsubdirectories: PNG/, XML/, TXT/.
PNG/ -- hive frame photos in PNG format
XML/ -- XML annotations of images in PNG/ with LabelImg
TXT/ -- TXT annotations of images in PNG/ for YOLO training
Thus, in each of the 13 folders, each PNG image has two annotation files. E.g.,
2020_02_27_RR_Hive_Directions_IMG_2540_VK.PNG
2020_02_27_RR_Hive_Directions_IMG_2540_VK.xml
2020_02_27_RR_Hive_Directions_IMG_2540_VK.txt
Each PNG is annotated for the following categories:
(1) CappedHoneyCell
(2) CappedWorkerBroodCell
(3) EmptyCombCell
(4) PollenCell
(5) UncappedNectarCell
(6) UncappedWorkerLarvaCell
(7) BeeHiveFrame
The counts of annotated regions of interest (ROIs) are as follows:
CappedHoneyCell: 19,723
CappedWorkerBroodCell: 21,456
EmptyCombCell: 20,655
PollenCell: 13,406
UncappedNectarCell: 11,009
UncappedWorkerLarvaCell: 18,283
BeeHiveFrame: 1,001
Each such ROI can be extracted into a separate image and used in training machine learning algorithms.
The subdirectory SRC/ contains two Python scripts that can convert XML to TXT and TXT to XML: xml_to_txt_converter.py and txt_to_xml_converter.py.
USDA_ARZ_DATA_YOLO_19june2025.zip is a 3GB zip version of these images prepared for YOLO training. It is available at https://usu.box.com/s/dh75xkinwfyl3sqgb9vugy1ahf6z9mrh.
SRC/ also contains the following Python scripts that we used for training YOLO networks:
(a) train_valid_split.py -- splits alldata.txt in USDA_ARZ_DATA_YOLO_19june2025.zip into train.txt and valid.txt for YOLO training
(b) tune_y8n.py -- tunes YOLOv8-nano
(c) tune_y8s.py -- tunes YOLOv8-small
(d) tune_y11n.py -- tunes YOLOv11-nano
(e) tune_y11s.py -- tunes YOLOv11-small
The folder METADATA/ contains two files: METADATA.txt and PapersDataSets_DrMeikle.xlsx. These files provide the metadata on the USDA-ARS Tucson, AZ reservoir.
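The relationship between the XML and TXT annotations can be illustrated with the usual box conversion from corner coordinates to the normalized YOLO form. This is a generic sketch, not the code in xml_to_txt_converter.py; it assumes class indices are zero-based in the order the categories are listed above, and the box and image size are made up.

```python
# Category names from the list above; the zero-based index order is an
# assumption, not something confirmed by the dataset's TXT files.
CLASSES = (
    "CappedHoneyCell", "CappedWorkerBroodCell", "EmptyCombCell",
    "PollenCell", "UncappedNectarCell", "UncappedWorkerLarvaCell",
    "BeeHiveFrame",
)


def voc_to_yolo(class_name, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a corner-style box to 'class x_center y_center width height'."""
    class_id = CLASSES.index(class_name)
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical pollen-cell box in a 1000x800 image.
line = voc_to_yolo("PollenCell", 100, 200, 300, 400, 1000, 800)
print(line)
```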
RGB images containing honey bees and bumble bees (with stationary nadir view) collected in cultivars: (i) phacelia, (ii) phacelia-maize intercropping, (iii) flower mix, (iv) flower mix - maize intercropping, and also images taken using a smartphone with honey bees and bumble bees in focus. We manually annotated the RGB images with bounding boxes. Each bounding box categorises the bee as a honey bee, a bumble bee, or an unknown bee. Unknown bees are bees that could be either a bumble bee or a honey bee but could not be assigned to a category based on the image. Flies that look similar to bees are also labelled as unknown bees. The labels are saved in the YOLO format.
License: Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset, meticulously created by Jordan Bird, Leah Bird, Carrie Ijichi, Aurelie Jolivald, Salisu Wada, Kay Owa, and Chloe Barnes from Nottingham Trent University (United Kingdom), forms a critical part of the RF100 initiative, an Intel-sponsored project aimed at developing a new object detection benchmark to assess model generalizability.
Comprising a rich collection of bee detection images, this dataset is structured to challenge and refine AI models in object detection tasks. With 5,640 images in the training set (70%), 1,604 images in the validation set (20%), and 836 images in the test set (10%), it offers a comprehensive resource for model training, validation, and testing.
By participating in the RF100 benchmark, researchers and developers can contribute to advancing the field of AI while ensuring that models are robust, accurate, and capable of generalizing across diverse scenarios.
Explore the RF100 initiative and access more resources in the GitHub repository: https://github.com/roboflow-ai/roboflow-100-benchmark
Source: https://universe.roboflow.com/roboflow-100/bees-jt5in