Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Birds Classification is a dataset for classification tasks - it contains Bird Labels annotations for 896 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
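For example, a minimal download sketch using the roboflow Python package is shown below; the API key, workspace, project slug, and version number are placeholders to be replaced with the values shown on this dataset's Roboflow page.

```python
# Minimal sketch of pulling a Roboflow dataset into a local folder.
# The api_key, workspace, project slug, and version number are placeholders.
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("your-workspace").project("birds-classification")
dataset = project.version(1).download("folder")  # "folder" = classification export format
print(dataset.location)  # path to the downloaded train/valid/test folders
```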
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Image classification of 525 bird species is a computer vision project aimed at developing a system capable of identifying and categorizing various bird species from images.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
This dataset is a good example of a complex multi-class classification problem. It has 200 classes, divided into train and test data, where each class can be identified by its folder name.
It is derived from the Caltech-UCSD Birds-200-2011 dataset, which was further cleaned and split into train and test folders.
http://www.vision.caltech.edu/visipedia/CUB-200-2011.html
This dataset produces good results with ImageNet-pretrained models, since it contains images that overlap with ImageNet.
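As a minimal sketch (the folder paths and choice of backbone are placeholders, not part of the dataset), the train and test folders can be loaded with torchvision's ImageFolder, which infers the 200 classes from the folder names, and an ImageNet-pretrained model can then be fine-tuned:

```python
# Load the train/test folders (one sub-folder per class) and swap the final
# layer of an ImageNet-pretrained backbone for the 200 bird classes.
import torch
from torch import nn
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),  # ImageNet stats
])
train_ds = datasets.ImageFolder("CUB_200_2011/train", transform=tfm)  # class = folder name
test_ds = datasets.ImageFolder("CUB_200_2011/test", transform=tfm)

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, len(train_ds.classes))  # 200 bird classes
```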
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Bird Feeder Bird Classification 2 is a dataset for object detection tasks - it contains Birds 9mSy RbVP annotations for 914 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Bird Classification VGG16 is a dataset for classification tasks - it contains Birds annotations for 9,515 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data accompanying the paper: Jeantet and Dufourq (2023). Empowering Deep Learning Acoustic Classifiers with Human-like Ability to Utilize Contextual Information for Wildlife Monitoring. Ecological Informatics. 77, 15749541, DOI: 10.1016/j.ecoinf.2023.102256
Our investigation contributes to the field of deep learning and bioacoustics by highlighting the potential for improved classification performance through the incorporation of contextual information such as time and location.
To test whether spatio-temporal information can enhance a deep learning classifier, we developed a subset derived from Xeno-canto that includes location metadata as input alongside the spectrogram. The primary purpose of this dataset is a bird song classification task in which species were carefully selected to share similar vocal characteristics while having distinct geographical distributions. We only considered recordings of category 'A', corresponding to the best quality score in the database.
The dataset contains songs of 22 bird species from 5 different families and genera. The recordings were downloaded from the Xeno-canto database in .wav format, and each recording was manually annotated by labelling the start and stop time of every vocalisation occurrence using Sonic Visualiser. In total, the database contains 6537 occurrences of bird songs of various lengths from 967 recordings. A precise description of the distribution by species and country can be found in the associated article.
The audio files are provided in "Audio.zip" and the manually verified annotations in "Annotations.zip". Each file name follows this nomenclature: Family_genus_species_country of recording_date of recording_Xeno-canto ID_type of song.wav/svl. The metadata for each file can be found in the provided csv file (Xenocanto_metadata_qualityA_selection), keyed by the Xeno-canto ID number. The annotations can be viewed using the Sonic Visualiser software. The Python code to process these files and train neural networks can be found here: github
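Purely as an illustration, the snippet below splits a filename of this form into its fields; the example filename is invented, and the real files may format the country, date, or ID fields slightly differently.

```python
# Parse the Family_genus_species_country_date_XenocantoID_songtype.wav nomenclature.
from pathlib import Path

def parse_recording_name(path):
    # Split the stem on underscores in the order given in the description above.
    family, genus, species, country, date, xc_id, song_type = Path(path).stem.split("_")[:7]
    return {"family": family, "genus": genus, "species": species,
            "country": country, "date": date, "xc_id": xc_id, "song_type": song_type}

# Hypothetical example filename:
print(parse_recording_name("Fringillidae_Fringilla_coelebs_France_2020-05-01_XC123456_song.wav"))
```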
The files were divided into a training folder and a validation folder to train and evaluate the efficiency of each method. For each species and country, we randomly selected 70% of the downloaded recordings for the training dataset and kept the remaining 30% for validation.
Process to select the species: We selected the ten most recorded families in the Passeriformes order, the most represented order in the Xeno-canto database. From each of the ten families, we again sub-sampled the ten most recorded genera. For each genus, we examined the countries of the recordings and the number of available recordings per species and country. From these observations, we selected genera containing species with similar songs but recorded in different regions, with enough recordings available per species and country to form a dataset. In the end, 5 genera containing 22 species were selected. We considered only recordings associated with bird songs; specifically, within Xeno-canto we selected the 'song' type. To balance the number of recordings between species of the same genus, we reduced the number of recordings for the most represented species. Thus, for each genus we calculated the average number of recordings available per species and per country, and capped the species/country pairs that exceeded this average at that value plus two.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
YOLO-based Segmented Dataset for Drone vs. Bird Detection for Deep and Machine Learning Algorithms
Unmanned aerial vehicles (UAVs), or drones, have witnessed a sharp rise in both commercial and recreational use, but this surge has brought about significant security concerns. Drones, when misidentified or undetected, can pose risks to people, infrastructure, and air traffic, especially when confused with other airborne objects, such as birds. To overcome this challenge, accurate detection systems are essential. However, a reliable dataset for distinguishing between drones and birds has been lacking, hindering the progress of effective models in this field.
This dataset is designed to fill this gap, enabling the development and fine-tuning of models to better identify drones and birds in various environments. The dataset comprises a diverse collection of images, sourced from Pexel’s website, representing birds and drones in motion. These images were captured from video frames and are segmented, augmented, and pre-processed to simulate different environmental conditions, enhancing the model's training process.
Formatted in accordance with the YOLOv7 PyTorch specification, the dataset is organized into three folders: Test, Train, and Valid. Each folder contains two sub-folders, Images and Labels, with the Labels folder holding the associated metadata in plain-text format (a label-parsing sketch follows the folder descriptions below). This metadata describes the objects within each image, allowing a model to learn to detect drones and birds in varying circumstances. The dataset contains a total of 20,925 images, all with a resolution of 640 x 640 pixels in JPEG format, providing comprehensive training and validation opportunities for machine learning models.
Test Folder: Contains 889 images (both drone and bird images). The folder has sub-categories marked as BT (Bird Test Images) and DT (Drone Test Images).
Train Folder: With a total of 18,323 images, this folder includes both drone and bird images, also categorized as BT and DT.
Valid Folder: Consisting of 1,740 images, the images in this folder are similarly categorized into BT and DT.
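Assuming the label files follow the usual YOLO text convention (one line per object: class index followed by normalized center coordinates and box size), the sketch below converts one such file to pixel coordinates for the 640 x 640 images; the file path is a placeholder.

```python
# Read one YOLO-format label file and convert normalized boxes to pixel coordinates.
def read_yolo_labels(label_path, img_w=640, img_h=640):
    boxes = []
    with open(label_path) as f:
        for line in f:
            cls, xc, yc, w, h = line.split()
            xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
            x_min = (xc - w / 2) * img_w
            y_min = (yc - h / 2) * img_h
            x_max = (xc + w / 2) * img_w
            y_max = (yc + h / 2) * img_h
            boxes.append((int(cls), x_min, y_min, x_max, y_max))
    return boxes

# boxes = read_yolo_labels("Train/Labels/DT_0001.txt")  # placeholder path
```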
This dataset is essential for training more accurate models that can differentiate between drones and birds in real-time applications, thereby improving the reliability of drone detection systems for enhanced security and efficiency.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Indian-Birds-Species-Image-Classification dataset consists of images of 25 bird species found in India, including the Asian Green Bee-eater and Brown-headed Barbet.
Fine-grained spatio-temporal dataset specifically designed for bird behavior detection and species classification. The data were acquired in the Alicante wetlands, specifically the wetlands of La Mata Natural Park and El Hondo Natural Park (southeastern Spain). This dataset was collected as part of the contributions to the CHAN-TWIN project. Only the annotations are available in the first release of the dataset; we are working to release the full set of videos soon.
The dataset comprises the following items:
bounding_boxes.csv: Annotations for each of the frames composing the videos of the dataset. These annotations are given in CSV format, where the last field holds the bounding boxes that appear in each frame. Each bounding box is a tuple of 6 fields with the structure [(X_max,Y_max,X_min,Y_min,Behavior_id,Bird_id)]. As several birds can appear in one frame, each is assigned an ID, given by the Bird_id field of the tuple (see the parsing sketch after this list).
behaviors_ID.csv: This file contains a mapping of the seven behavior classes that make up the data set and their numeric identifiers.
species_ID.csv: This file contains the numerical identifiers associated with each bird species.
videos: This folder contains the videos that compose the dataset.
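As a hedged sketch of reading bounding_boxes.csv, the snippet below parses the final field of each row into its box tuples; the presence or absence of a header row and the exact column layout are assumptions, and only the tuple structure comes from the description above.

```python
# Parse the per-frame bounding-box field of bounding_boxes.csv.
import ast
import csv

with open("bounding_boxes.csv", newline="") as f:
    for row in csv.reader(f):  # if the file has a header row, skip it first
        boxes = ast.literal_eval(row[-1])  # last field holds the list of box tuples
        for x_max, y_max, x_min, y_min, behavior_id, bird_id in boxes:
            print(bird_id, behavior_id, (x_min, y_min, x_max, y_max))
```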
Within this dataset 13 different bird species are identified: White Wagtail, Glossy Ibis, Squacco Heron, Black-winged Stilt, Yellow-legged Gull, Common Gallinule, Black-headed Gull, Eurasian Coot, Little Ringed Plover, Eurasian Moorhen, Eurasian Magpie, Gadwall, Mallard and Northern Shoveler.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Birds Multiclass Image Classification is a dataset for object detection tasks - it contains Flying Land Standing Swimming annotations for 250 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
https://spdx.org/licenses/CC0-1.0.html
An automatic bird sound recognition system is a useful tool for collecting data on different bird species for ecological analysis. Together with autonomous recording units (ARUs), such a system makes it possible to collect bird observations on a scale that no human observer could ever match. During the last decades progress has been made in the field of automatic bird sound recognition, but recognizing bird species from untargeted soundscape recordings remains a challenge. In this article we demonstrate the workflow for building a global identification model and adjusting it to perform well on the data of autonomous recorders from a specific region. We show how data augmentation and a combination of global and local data can be used to train a convolutional neural network to classify vocalizations of 101 bird species. We construct a model and train it with a global data set to obtain a base model. The base model is then fine-tuned with local data from Southern Finland in order to adapt it to the sound environment of a specific location, and tested with two data sets: one originating from the same Southern Finnish region and another originating from a different region in the German Alps. Our results suggest that fine-tuning with local data significantly improves network performance. Classification accuracy improved for test recordings from the same area as the local training data (Southern Finland) but not for recordings from a different region (German Alps). Data augmentation enables training with a limited amount of training data, and even with few local data samples a significant improvement over the base model can be achieved. Our model outperforms the current state-of-the-art tool for automatic bird sound classification. Using local data to adjust the recognition model for the target domain leads to improvement over general, non-tailored solutions. The process introduced in this article can be applied to build a fine-tuned bird sound classification model for a specific environment.
Methods: This repository contains the data and recognition models described in the paper "Domain-specific neural networks improve automated bird sound recognition already with small amount of local data" (Lauha et al., 2022).
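As a rough, non-authoritative sketch of the fine-tuning idea (not the actual architecture or training code from the paper), the snippet below continues training a generic ImageNet-pretrained CNN stand-in on local spectrogram batches with a small learning rate.

```python
# Fine-tune a "global" base model on local spectrogram data.
# The base model here is a generic CNN stand-in, not the network from the paper.
import torch
from torch import nn
from torchvision import models

NUM_SPECIES = 101
base = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
base.fc = nn.Linear(base.fc.in_features, NUM_SPECIES)
# ... assume `base` has already been trained on the global dataset ...

optimizer = torch.optim.Adam(base.parameters(), lr=1e-4)  # small lr keeps global knowledge
criterion = nn.CrossEntropyLoss()

def fine_tune_step(spectrograms, labels):
    """One optimization step on a batch of local spectrograms shaped (N, 3, H, W)."""
    optimizer.zero_grad()
    loss = criterion(base(spectrograms), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```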
This dataset was created by Keshavi Aggarwal
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
greenarcade/wav2vec2-vd-bird-sound-classification-dataset is a dataset hosted on Hugging Face, contributed by the HF Datasets community.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A Simple Bird Classification Algorithm
A simple model capable of identifying a handful of bird species, each listed in the project classes. The list will be updated as needed.
Produced for a Computer Vision course project.
This dataset was created by Gaurav Dutta
Original provider: Caroline Fox, Dalhousie University and Raincoast Conservation Foundation
Dataset credits: Caroline Fox, Dalhousie University and Raincoast Conservation Foundation
Abstract (from the associated publication): Increasingly disrupted and altered, the world’s oceans are subject to immense and intensifying anthropogenic pressures. Of the biota inhabiting these ecosystems, marine birds are among the most threatened. For conservation efforts targeting marine birds to be effective, quantitative information relating to their at-sea density and distribution is typically a crucial knowledge component. In this study, we generated predictive machine learning ensemble models for 13 marine bird species and 7 groups (representing 24 additional species) in Canada’s Pacific coast waters, including several species listed under Canada’s Species at Risk Act. Predictive models were based on systematic marine bird line transect survey information collected in spring, summer, and fall on Canada’s Pacific coast (2005−2008). Multiple Covariate Distance Sampling (MCDS) was used to estimate marine bird density along transect segments. Spatial and temporal environmental predictors, including remote sensing information, were used in model ensembles, which were constructed using 4 machine learning algorithms in Salford Systems Predictive Modeler v7.0 (SPM7): Random Forests, TreeNet, Multivariate Adaptive Regression Splines, and Classification and Regression Trees. Predictive models were subsequently combined to generate seasonal and overall predictions of areas important to marine birds based on normalized marine bird species or group richness and densities. Our results employ open access data sharing and are intended to better inform marine bird conservation efforts and management planning on Canada’s Pacific coast and for broader-scale geographic initiatives across North America and elsewhere.
Supplemental information: Marine bird line-transect survey information collected using Distance Sampling in coastal British Columbia, Canada (2005-2008) is provided in three forms: (1) raw, unadjusted marine bird sightings; (2) for a subset of species, marine bird density estimates along 1km transect segments using Multiple Covariates Distance Sampling (MCDS), and; (3) for a subset of species, surface density estimates per ~14km2 hexagon using machine learning ensemble modeling. For data products 2 and 3, the marine bird subsets were restricted to species sighted in sufficient numbers for analysis. Surveys were completed by Raincoast Conservation Foundation.
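Salford Systems Predictive Modeler is proprietary, so as a loose, non-authoritative analogue of the ensemble step, the sketch below averages the predictions of several scikit-learn regressors trained on placeholder environmental predictors and placeholder MCDS density estimates; none of the variable names or model choices correspond to the actual SPM7 workflow.

```python
# Rough open-source analogue of ensemble density prediction: fit several
# regressors on environmental predictors and average their predictions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

X_train = np.random.rand(200, 5)   # stand-in environmental predictors per segment
y_train = np.random.rand(200)      # stand-in MCDS density estimates per segment
X_new = np.random.rand(10, 5)      # segments/hexagons to predict

models = [
    RandomForestRegressor(n_estimators=200, random_state=0),
    GradientBoostingRegressor(random_state=0),
    DecisionTreeRegressor(max_depth=6, random_state=0),
]
ensemble_pred = np.mean([m.fit(X_train, y_train).predict(X_new) for m in models], axis=0)
```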
Note that several species alpha codes are non-standard, due to grouping of species identifications (e.g., large gulls and dark shearwaters).
ANMU = Ancient Murrelet
ANMUf = Ancient Murrelet family (varying #s of parents and chicks, or just chicks)
BAEA = Bald Eagle
BEKI = Belted Kingfisher
BFAL = Black-footed Albatross
BLKI = Black-legged Kittiwake
BLOY = Black Oystercatcher
BLSC = Black Scoter
BLTU = Black Turnstone
BOGU = Bonaparte's Gull
BRAC = Brandt's Cormorant
BRAN = Brant Goose
BUFF = Bufflehead Duck
BULS = Buller's Shearwater
CAAU = Cassin's Auklet
CAGU = California Gull
CANG = Canada Goose
COLO = Common Loon
COME = Common Merganser
COMU = Common Murre
COMUf = Common Murre family (parent with chick, or just chicks)
CORA = Common Raven
DARK = Sooty Shearwater, Short-tailed Shearwater, Flesh-footed Shearwater
DCCO = Double-crested Cormorant
DEJU = Dark-eyed Junco
DUNL = Dunlin
FTSP = Fork-tailed Storm Petrel
GBHE = Great Blue Heron
GWGU = Glaucous-winged Gull
HADU = Harlequin Duck
HETHGU = Herring Gull/Thayer's Gull
HOGR = Horned Grebe
HOPU = Horned Puffin
LAAL = Laysan Albatross
LEFTSP = mixed flock Fork-tailed and Leach's Storm-petrels
LESP = Leach's Storm Petrel
LTDU = Longtail Duck
LTJA = Long-tailed Jaeger
MALL = Mallard Duck
MAMU = Marbled Murrelet
MEGU = Mew Gull
NOCR = Northwestern Crow
NOFU = Northern Fulmar
NSHO = Northern Shoveler
OSPR = Osprey
PAJA = Parasitic Jaeger
PALO = Pacific Loon
PECO = Pelagic Cormorant
PFSH = Pink-footed Shearwater
PIGU = Pigeon Guillemot
POJA = Pomarine Jaeger
RBME = Red-breasted Merganser
RHAU = Rhinoceros Auklet
RNGR = Red-necked Grebe
RNPH = Red-necked Phalarope
RTLO = Red-throated Loon
RUHU = Rufous Hummingbird
SAGU = Sabine's Gull
SNGO = Snow Goose
STAL = Short-tailed Albatross
SUSC = Surf Scoter
THGU = Thayer's Gull
TOWA = Townsend's Warbler
TRES = Tree Swallow
TUPU = Tufted Puffin
TUPUf = Tufted Puffin family (parent with chick)
WEGR = Western Grebe
WEGU = Western Gull
WHIM = Whimbrel
WWSC = White-winged Scoter
YBLO = Yellow-billed Loon
UNAL = Unidentified Alcid
UNCO = Unidentified cormorant
UNDU = Unidentified ducks in the distance
UNGE = Unidentified Geese in the distance
UNGO = Unidentified Goldeneye
UNGR = Unidentified Grebe
ULGU = Unidentified Larus Gull
UNJA = Unidentified Jaeger
UNLO = Unidentified Loon
UNSO = Unidentified Scoter
UNSW = Unidentified Shearwater
UNSH = Unidentified Shorebirds
UNST = Unidentified Storm-petrel
UNTE = Unidentified Tern
UNTU = Unidentified Turnstone
Note that several species alpha codes are non-standard, due to grouping of species identifications (e.g., large gulls and dark shearwaters).
ANMU = Ancient Murrelet BFAL = Black-footed Albatross CAAU = Cassin's Auklet COMU = Common Murre CORM = Cormorants (Brandt's, Double-crested, Pelagic) DARK = Dark shearwaters (Flesh-footed, Short-tailed, Sooty) FTSP = Fork-tailed Storm-petrel GREB = Grebes (Horned, Red-necked, Western) LESP = Leach's Storm-petrel lgGULL = large Larus spp. gulls (California, Glaucous-winged, American, Thayer's) LOON = Loons (Yellow-billed, Common, Red-throated, Pacific) MAMU = Marbled Murrelet NOFU = Northern Fulmar PFSH = Pink-footed Shearwater PIGU = Pigeon Guillemot RHAU = Rhinoceros Auklet RNPH = Red-necked Phalarope SCOT = Scoters (Black, White-winged, Surf) smGULL = small gulls (Black-legged Kittiwake, Bonaparte's, Mew, Sabine's) TUPU = Tufted Puffin
Field names represent, using ANMU and BFAL as the examples:
Shape file name represents the bird species (e.g., ANMU = Ancient Murrelet) plus "w" (w = density estimates of birds on water only) or "sw" (sw = density estimates of combination of birds in flight and on water).
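As a small illustration of this naming convention, the helper below splits a layer name into its species alpha code and density-type suffix; the example names are hypothetical.

```python
# Split a shapefile name into species alpha code and "w"/"sw" suffix.
def split_shapefile_name(name):
    suffix = "sw" if name.endswith("sw") else "w"
    return name[: -len(suffix)], suffix

print(split_shapefile_name("ANMUw"))   # ('ANMU', 'w')  -> densities on water only
print(split_shapefile_name("BFALsw"))  # ('BFAL', 'sw') -> flight + water densities
```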
Note that several species alpha codes are non-standard, due to grouping of species identifications (e.g., large gulls and dark shearwaters).
ANMU = Ancient Murrelet BFAL = Black-footed Albatross CAAU = Cassin's Auklet COMU = Common Murre CORM = Cormorants (Brandt's, Double-crested, Pelagic) DARK = Dark shearwaters (Flesh-footed, Short-tailed, Sooty) FTSP = Fork-tailed Storm-petrel GREB = Grebes (Horned, Red-necked, Western) LESP = Leach's Storm-petrel lgGULL = large Larus spp. gulls (California, Glaucous-winged, American, Thayer's) LOON = Loons (Yellow-billed, Common, Red-throated, Pacific) MAMU = Marbled Murrelet NOFU = Northern Fulmar PFSH = Pink-footed Shearwater PIGU = Pigeon Guillemot RHAU = Rhinoceros Auklet RNPH = Red-necked Phalarope SCOT = Scoters (Black, White-winged, Surf) smGULL = small gulls (Black-legged Kittiwake, Bonaparte's, Mew, Sabine's) TUPU = Tufted Puffin
Field names represent, using ANMUw as the example:
The emergence of continental to global scale biodiversity data has led to growing understanding of patterns in species distributions, and the determinants of these distributions, at large spatial scales. However, identifying the specific mechanisms, including demographic processes, determining species distributions remains difficult, as large-scale data are typically restricted to observations of species presence only. New remote automated approaches for collecting data, such as automated recording units (ARUs), provide a promising avenue towards direct measurement of demographic processes, such as reproduction, that cannot feasibly be measured at scale by traditional survey methods. In this study, we analyze data collected by ARUs from 452 survey points across an approximately 1500 km study region to compare patterns in adult and juvenile distributions in the Great Horned Owl (Bubo virginianus). We specifically examine whether habitat associated with successful reproduction is the ...
Owl surveys: Nighttime autonomous acoustic recordings were collected from 452 survey locations across 1500 km of the eastern United States. Two convolutional neural networks were developed to classify the adult song and juvenile begging call of the Great Horned Owl (Bubo virginianus). These classifiers were run on the recordings, and the ten highest-scoring five-second clips occurring on ten separate days at each survey location were extracted. These clips were manually reviewed by a human listener to ensure they contained the relevant owl sounds. Presence/absence was translated into 1/0 detection histories to be used in occupancy models.
Covariates: GPS coordinates were collected at each survey location (these are not provided, to protect landowner identity). National Land Cover Database information was extracted for the amount of forest and agricultural land cover within a 1750 m radius of each survey location for use as occupancy covariates. Tree basal area and < 10 cm DBH stem dens...
# Evaluating the predictors of habitat use and successful reproduction in a model bird species using a large scale automated acoustic array
https://doi.org/10.5061/dryad.5hqbzkhcz
Data are provided as a single CSV file owl_data.csv with columns
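As a purely illustrative sketch, the snippet below shows how clip-level detections of the kind described above could be arranged into a 1/0 detection history for occupancy models; the site, day, and detected column names are invented and are not the actual fields of owl_data.csv.

```python
# Build a site-by-day 1/0 detection history from reviewed clip detections.
import pandas as pd

clips = pd.DataFrame({
    "site": ["A", "A", "B", "B"],      # hypothetical survey locations
    "day":  [1, 2, 1, 2],              # hypothetical survey days
    "detected": [1, 0, 0, 1],          # 1 = verified owl sound in the reviewed clip
})
detection_history = clips.pivot_table(index="site", columns="day",
                                      values="detected", aggfunc="max")
print(detection_history)
```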
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Bird Recognition is a dataset for classification tasks - it contains Bird Species annotations for 4,893 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This archive includes sound clips (.wav files) and associated mel-scale spectrograms of bird vocalizations for 54 species in Sonoma County, California, USA. These data were used for training and validating convolutional neural network (CNN) models for bird species detection. We also include xeno-canto training and validation mel spectrograms used to pretrain CNNs. Details on these data are explained in the paper by Clark et al. (2023) titled "The effect of soundscape composition on bird vocalization classification in a citizen science biodiversity monitoring project". These data are available for use without restrictions, with no warranty on data quality or utility for a given application. We request that any work that does use these data cite the Clark et al. (2023) paper.
Clark, M.L., Salas, L., Baligar, S., Quinn, C., Snyder, R.L., Leland, D., Schackwitz, W., Goetz, S.J., Newsam, S. (2023). The effect of soundscape composition on bird vocalization classification in a citizen science biodiversity monitoring project. Ecological Informatics. https://doi.org/10.1016/j.ecoinf.2023.102065
Associated code for training CNN models, performing inference, and applying post-classification corrections can be found in the GitHub archive https://github.com/pointblue/Soundscapes2Landscapes/tree/master/CNN_Bird_Species
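For orientation, a minimal sketch of producing a 2-second mel spectrogram from one of the wav clips is shown below, assuming the librosa library; the file name, sample rate, mel-band count, and hop length are illustrative defaults rather than the parameters used in the paper.

```python
# Compute a log-scaled mel spectrogram from a 2-second wav clip.
import librosa
import numpy as np

y, sr = librosa.load("example_clip.wav", sr=22050, duration=2.0)  # placeholder file
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128, hop_length=512)
mel_db = librosa.power_to_db(mel, ref=np.max)  # log scale for CNN input
```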
Raw sound data from the Soundscapes to Landscapes project are available upon request: Dr. Matthew Clark, matthew.clark@sonoma.edu
These data were collected as part of the Soundscapes to Landscapes project (soundscapes2landscapes.org), funded by NASA’s Citizen Science for Earth Systems Program (CSESP) 16-CSESP 2016-0009 under cooperative agreement 80NSSC18M0107.
This repository includes the following archives:
mel_specs.zip: contains 2-sec mel spectrograms split into training (“tr”), validation (“val”), testing (“test”) data for each target bird species (n = 54) used to fine-tune the CNNs. Select spectrogram files are appended with “aug” if they are augmented versions for the training data.
wav.zip: contains the associated wav-format sound recordings used to generate the training, validation, testing mel spectrograms found in mel_specs.zip.
Xeno-canto_pretrain.tar: contains 2-sec mel spectrograms split into training and validation data for 40 bird species used for CNN pre-training that were generated using a warbleR segmentation methodology described in the paper. The sound files used to generate these mel spectrograms came from the Kaggle competition, https://www.kaggle.com/datasets/imoore/xenocanto-bird-recordings-dataset Mel spectrogram naming reflects the XC number used for cataloging on Xeno-canto in the format XC123456_2.png. The six numbers following the XC characters can be used to search for unique recordings on Xeno-canto (https://xeno-canto.org/) using the search query “nr:123456” in the search tool or queried using the Xeno-canto API (https://xeno-canto.org/explore/api). Unique recording names can be extracted from the mel spectrogram filenames.
soundscape_test_wavs.zip: the wav-format sound recordings used to perform soundscape testing.