31 datasets found
  1. CarDD with YOLO Annotations (Images + Labels)

    • kaggle.com
    zip
    Updated Aug 5, 2025
    Cite
    Gabriel Fernandes Carvalho (2025). CarDD with YOLO Annotations (Images + Labels) [Dataset]. https://www.kaggle.com/datasets/gabrielfcarvalho/cardd-with-yolo-annotations-images-labels/data
    Explore at:
    zip (3010616273 bytes)
    Dataset updated
    Aug 5, 2025
    Authors
    Gabriel Fernandes Carvalho
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    CarDD with YOLO Annotations (Images + Labels)

    This dataset packages the Car Damage Detection (CarDD) images together with YOLO-format labels converted from the original COCO/SOD annotations.

    • Images & original annotations: CarDD (PIC Lab, CAS).
    • YOLO labels: Converted by Gabriel Fernandes Carvalho.
    • Task: Object detection of car damage categories.
    • Format: YOLO text files — each line is class x_center y_center width height (normalized).

    Classes

    • 0 dent
    • 1 scratch
    • 2 crack
    • 3 glass shatter
    • 4 lamp broken
    • 5 tire flat
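
    A minimal sketch of how one of these label files can be read back into pixel-space boxes (the label path and image dimensions are placeholders, not files prescribed by the dataset):

     # Sketch: parse a YOLO-format label file; class indices follow the list above.
     from pathlib import Path

     CLASSES = ["dent", "scratch", "crack", "glass shatter", "lamp broken", "tire flat"]

     def read_yolo_labels(label_path, img_w, img_h):
         """Return (class_name, x_min, y_min, x_max, y_max) tuples in pixels."""
         boxes = []
         for line in Path(label_path).read_text().splitlines():
             if not line.strip():
                 continue
             cls, xc, yc, w, h = line.split()
             xc, yc = float(xc) * img_w, float(yc) * img_h
             w, h = float(w) * img_w, float(h) * img_h
             boxes.append((CLASSES[int(cls)], xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2))
         return boxes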
  2. Data from: X-ray CT data with semantic annotations for the paper "A workflow...

    • catalog.data.gov
    • datasetcatalog.nlm.nih.gov
    • +1 more
    Updated Jun 5, 2025
    Cite
    Agricultural Research Service (2025). X-ray CT data with semantic annotations for the paper "A workflow for segmenting soil and plant X-ray CT images with deep learning in Google’s Colaboratory" [Dataset]. https://catalog.data.gov/dataset/x-ray-ct-data-with-semantic-annotations-for-the-paper-a-workflow-for-segmenting-soil-and-p-d195a
    Explore at:
    Dataset updated
    Jun 5, 2025
    Dataset provided by
    Agricultural Research Service
    Description

    Leaves from genetically unique Juglans regia plants were scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA. Soil samples were collected in Fall 2017 from the riparian oak forest located at the Russell Ranch Sustainable Agricultural Institute at the University of California, Davis. The soil was sieved through a 2 mm mesh and air dried before imaging. A single soil aggregate was scanned at 23 keV using the 10x objective lens with a pixel resolution of 650 nanometers on beamline 8.3.2 at the ALS. Additionally, a drought-stressed almond flower bud (Prunus dulcis) from a plant housed at the University of California, Davis, was scanned using a 4x lens with a pixel resolution of 1.72 µm on beamline 8.3.2 at the ALS.

    Raw tomographic image data was reconstructed using TomoPy. Reconstructions were converted to 8-bit tif or png format using ImageJ or the PIL package in Python before further processing. Images were annotated using Intel's Computer Vision Annotation Tool (CVAT) and ImageJ; both CVAT and ImageJ are free to use and open source.

    Leaf images were annotated following Théroux-Rancourt et al. (2020). Hand labeling was done directly in ImageJ by drawing around each tissue, with 5 images annotated per leaf. Care was taken to cover a range of anatomical variation to help improve the generalizability of the models to other leaves. All slices were labeled by Dr. Mina Momayyezi and Fiona Duong.

    To annotate the flower bud and soil aggregate, images were imported into CVAT. The exterior border of the bud (i.e. bud scales) and flower were annotated in CVAT and exported as masks. Similarly, the exterior of the soil aggregate and particulate organic matter identified by eye were annotated in CVAT and exported as masks.

    To annotate air spaces in both the bud and soil aggregate, images were imported into ImageJ. A Gaussian blur was applied to the image to decrease noise, and the air space was then segmented using thresholding. After applying the threshold, the selected air space region was converted to a binary image, with white representing the air space and black representing everything else. This binary image was overlaid upon the original image and the air space within the flower bud and aggregate was selected using the "free hand" tool; air space outside the region of interest for both image sets was eliminated. The quality of the air space annotation was then visually inspected for accuracy against the underlying original image; incomplete annotations were corrected using the brush or pencil tool to paint missing air space white and incorrectly identified air space black. Once the annotation was satisfactorily corrected, the binary image of the air space was saved. Finally, the annotations of the bud and flower, or aggregate and organic matter, were opened in ImageJ and the associated air space mask was overlaid on top of them, forming a three-layer mask suitable for training the fully convolutional network. All labeling of the soil aggregate and soil aggregate images was done by Dr. Devin Rippner.

    These images and annotations are for training deep learning models to identify different constituents in leaves, almond buds, and soil aggregates.

    Limitations: For the walnut leaves, some tissues (stomata, etc.) are not labeled, and the annotated slices represent only a small portion of a full leaf. Similarly, both the almond bud and the aggregate represent a single sample each. The bud tissues are only divided into bud scales, flower, and air space; many other tissues remain unlabeled. For the soil aggregate, labels were assigned by eye with no actual chemical information, so particulate organic matter identification may be incorrect.

    Resources in this dataset:

    • Resource Title: Annotated X-ray CT images and masks of a Forest Soil Aggregate. File Name: forest_soil_images_masks_for_testing_training.zip. Description: This aggregate was collected from the riparian oak forest at the Russell Ranch Sustainable Agricultural Facility and was scanned using X-ray microCT on beamline 8.3.2 at the ALS (LBNL, Berkeley, CA, USA) using the 10x objective lens with a pixel resolution of 650 nanometers. For masks, the background has a value of 0,0,0; pore spaces have a value of 250,250,250; mineral solids have a value of 128,0,0; and particulate organic matter has a value of 0,128,0. These files were used for training a model to segment the forest soil aggregate and for testing the accuracy, precision, recall, and F1 score of the model.

    • Resource Title: Annotated X-ray CT images and masks of an Almond bud (P. dulcis). File Name: Almond_bud_tube_D_P6_training_testing_images_and_masks.zip. Description: A drought-stressed almond flower bud (Prunus dulcis) from a plant housed at the University of California, Davis, was scanned by X-ray microCT on beamline 8.3.2 at the ALS (LBNL, Berkeley, CA, USA) using the 4x lens with a pixel resolution of 1.72 µm. For masks, the background has a value of 0,0,0; air spaces have a value of 255,255,255; bud scales have a value of 128,0,0; and flower tissues have a value of 0,128,0. These files were used for training a model to segment the almond bud and for testing the accuracy, precision, recall, and F1 score of the model. Recommended software: Fiji (ImageJ), https://imagej.net/software/fiji/downloads

    • Resource Title: Annotated X-ray CT images and masks of Walnut leaves (J. regia). File Name: 6_leaf_training_testing_images_and_masks_for_paper.zip. Description: Stems were collected from genetically unique J. regia accessions at the USDA-ARS-NCGR in Wolfskill Experimental Orchard, Winters, California, USA to use as scion, and were grafted by Sierra Gold Nursery onto a commonly used commercial rootstock, RX1 (J. microcarpa × J. regia). We used a common rootstock to eliminate any own-root effects and to simulate conditions for a commercial walnut orchard setting, where rootstocks are commonly used. The grafted saplings were repotted and transferred to the Armstrong lathe house facility at the University of California, Davis in June 2019, and kept under natural light and temperature. Leaves from each accession and treatment were scanned using X-ray microCT on beamline 8.3.2 at the ALS (LBNL, Berkeley, CA, USA) using the 10x objective lens with a pixel resolution of 650 nanometers. For masks, the background has a value of 170,170,170; epidermis 85,85,85; mesophyll 0,0,0; bundle sheath extension 152,152,152; vein 220,220,220; and air 255,255,255. Recommended software: Fiji (ImageJ), https://imagej.net/software/fiji/downloads
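
    Because the masks encode classes as RGB triplets, a typical first step before training is to convert them to integer class maps. A minimal sketch, assuming NumPy and Pillow and using the forest soil aggregate colour code listed above (the file path is a placeholder):

     # Sketch: convert an RGB mask into an integer class map (colour code as listed above).
     import numpy as np
     from PIL import Image

     COLOR_TO_CLASS = {
         (0, 0, 0): 0,        # background
         (250, 250, 250): 1,  # pore space
         (128, 0, 0): 2,      # mineral solids
         (0, 128, 0): 3,      # particulate organic matter
     }

     def rgb_mask_to_classes(mask_path):
         rgb = np.array(Image.open(mask_path).convert("RGB"))
         classes = np.zeros(rgb.shape[:2], dtype=np.uint8)
         for color, idx in COLOR_TO_CLASS.items():
             classes[np.all(rgb == color, axis=-1)] = idx
         return classes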

  3. AAU RainSnow Traffic Surveillance Dataset

    • kaggle.com
    zip
    Updated Sep 21, 2018
    Cite
    Aalborg University (2018). AAU RainSnow Traffic Surveillance Dataset [Dataset]. https://www.kaggle.com/datasets/aalborguniversity/aau-rainsnow/discussion
    Explore at:
    zip (3391982600 bytes)
    Dataset updated
    Sep 21, 2018
    Dataset authored and provided by
    Aalborg University
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Rain, Snow, and Bad Weather in Traffic Surveillance

    Computer vision-based image analysis lays the foundation for automatic traffic surveillance. This works well in daylight, when road users are clearly visible to the camera, but often struggles when the visibility of the scene is impaired by insufficient lighting or bad weather conditions such as rain, snow, haze, and fog.

    In this dataset, we have focused on collecting traffic surveillance video in rainfall and snowfall, capturing 22 five-minute videos from seven different traffic intersections. The illumination of the scenes varies from broad daylight to twilight and night. The scenes feature glare from the headlights of cars, reflections from puddles, and blur from raindrops on the camera lens.

    We have collected the data using a conventional RGB colour camera and a thermal infrared camera. If combined, these modalities should enable robust detection and classification of road users even under challenging weather conditions.

    100 frames have been selected randomly from each five-minute sequence, and every road user in these frames is annotated at the pixel level, per instance, with a corresponding category label. In total, 2,200 frames are annotated, containing 13,297 objects.

    Content and Annotations

    The dataset is collected from seven intersections in the Danish cities of Aalborg and Viborg. The RGB and thermal cameras are placed on a street lamp, observing the traffic from above. The resolution of both cameras is 640x480 pixels and the frame rate is fixed at 20 frames/second. With 21 five-minute sequences (and one four-minute sequence), this results in 130,800 RGB-thermal image pairs.

    The two video streams are synchronized in time. The images of one modality are transferred to the other by a fixed homography; a sample implementation is shown in the file aauRainSnowUtility.py.

    Annotations

    Each instance of a road user is annotated at the pixel level with a corresponding category label. We have used the AAU VAP Multimodal Pixel Annotator as our annotation tool. The category labels are compatible with the MSCOCO category labels, and the entire annotated dataset is converted to a JSON file that can be used directly with the COCO API. In our kernel, we have used pycocotools to showcase the dataset and annotations.
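
    A minimal sketch of loading such a COCO-style annotation file with pycocotools (the JSON filename below is a placeholder; use the file shipped with the dataset):

     # Sketch: load the COCO-style annotations and print the categories for one image.
     from pycocotools.coco import COCO

     coco = COCO("aauRainSnow-rgb.json")      # placeholder path
     img_ids = coco.getImgIds()
     anns = coco.loadAnns(coco.getAnnIds(imgIds=img_ids[0]))
     for ann in anns:
         cat = coco.loadCats(ann["category_id"])[0]["name"]
         print(cat, ann["bbox"])              # MSCOCO-compatible category labels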

    Acknowledgements

    Please cite the following paper if you find the dataset useful:

    @article{bahnsen2018rain,
     title={Rain Removal in Traffic Surveillance: Does it Matter?},
     author={Bahnsen, Chris H. and Moeslund, Thomas B.},
     journal={IEEE Transactions on Intelligent Transportation Systems},
     year={2018},
     publisher={IEEE},
     doi={10.1109/TITS.2018.2872502},
     pages={1--18}
    }
    
  4. cvpdl_detr_dataset

    • huggingface.co
    Updated Oct 6, 2024
    Cite
    Tsui (2024). cvpdl_detr_dataset [Dataset]. https://huggingface.co/datasets/tsui10902118/cvpdl_detr_dataset
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 6, 2024
    Authors
    Tsui
    Description

    Image Source: cvpdl 2024 hw1. This dataset is created by:

    • resizing all images so that the longer side is 1333 px (scale.py)
    • converting labels to COCO format and attaching them to the annotations column (dataset.ipynb)

      dataset_info:
        features:
          - name: image
            dtype: image
          - name: annotations
            struct:
              - name: annotations
                list:
                  - name: area
                    dtype: int64
                  - name: bbox
                    sequence: int64
                  - name: category_id
                    dtype: int64
                  - name: image_id
                    …

    See the full description on the dataset page: https://huggingface.co/datasets/tsui10902118/cvpdl_detr_dataset.

  5. A collection of fully-annotated soundscape recordings from the Island of...

    • data.niaid.nih.gov
    Updated Jul 16, 2024
    Cite
    Amanda Navine; Stefan Kahl; Ann Tanimoto-Johnson; Holger Klinck; Patrick Hart (2024). A collection of fully-annotated soundscape recordings from the Island of Hawai'i [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7078498
    Explore at:
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Cornell Lab of Ornithology (http://birds.cornell.edu/)
    Listening Observatory for Hawaiian Ecosystems, University of Hawai'i at Hilo
    Authors
    Amanda Navine; Stefan Kahl; Ann Tanimoto-Johnson; Holger Klinck; Patrick Hart
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Island of Hawai'i
    Description

    This collection contains 635 soundscape recordings with a total duration of almost 51 hours, which have been annotated by expert ornithologists who provided 59,583 bounding box labels for 27 different bird species from the Hawaiian Islands, including 6 threatened or endangered native birds. The data were recorded between 2016 and 2022 at four sites across Hawai‘i Island. This collection has partially been featured as test data in the 2022 BirdCLEF competition and can primarily be used for training and evaluation of machine learning algorithms.

    Data collection

    Soundscapes for this collection were recorded for various research projects by the Listening Observatory for Hawaiian Ecosystems (LOHE) at the University of Hawai‘i at Hilo. The recordings were collected using Wildlife Acoustics Inc. Song Meters (models 2, 4, or Mini), as 16-bit wav files at a sampling rate of 44.1 kHz, using the default gain settings of each model. Further specifics for each recording, such as recording location and habitat type, can be found in the metadata provided. Soundscapes in this collection vary in length, ranging from just under a minute to 9 minutes in duration. All audio was unified, converted to FLAC, and resampled to 32 kHz for this collection. Parts of this dataset have previously been used in the 2022 BirdCLEF competition.

    Sampling and annotation protocol

    This collection is a subset of the files recorded over the course of the LOHE lab’s respective studies. The data were subsampled for annotation by aurally scanning the recordings and visually scanning spectrograms generated using Raven Pro software for target species of interest to the individual research project for which each recording was collected. Recordings that did not contain vocalizations of the species of interest were excluded from full annotation and thus this collection.

    Using Raven Pro, annotators were asked to create a selection box around every bird call they could recognize, ignoring those that were too faint or unidentifiable at a spectrogram window size of 700 points. Provided labels contain full bird calls that are boxed in time and frequency. Annotators were allowed to combine multiple consecutive calls of the same species into one bounding box label if pauses between calls were shorter than 0.5 seconds. We converted labels to eBird species codes, following the 2021 eBird taxonomy (Clements list).

    Files in this collection

    Audio recordings can be accessed by downloading and extracting the “soundscape_data.zip” file. Soundscape recording filenames contain a sequential file ID, site ID, recording date, and timestamp in HST. As an example, the file “UHH_001_S01_20161121_150000.flac” has sequential ID 001 and was recorded at site S01 on Nov 21st, 2016 at 15:00:00 HST. Ground truth annotations are listed in “annotations.csv” where each line specifies the corresponding filename, start and end time in seconds, low and high frequency in Hertz, and an eBird species code. These species codes can be assigned to the scientific and common name of a species with the “species.csv” file. The approximate recording location with Universal Transverse Mercator (UTM) coordinates and other metadata can be found in the “recording_location.csv” file.
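
    A minimal sketch of joining the annotations with the species table in pandas (the column names are assumptions based on the description above; check the actual CSV headers):

     # Sketch: attach species names to the bounding-box annotations.
     import pandas as pd

     annotations = pd.read_csv("annotations.csv")  # filename, start/end time (s), low/high freq (Hz), species code
     species = pd.read_csv("species.csv")          # maps eBird codes to scientific and common names
     labeled = annotations.merge(species, on="species_code", how="left")  # join column name is an assumption
     print(labeled.head())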

    Acknowledgements

    Compiling this extensive dataset was a major undertaking, and we are very thankful to the domain experts who helped to collect and manually annotate the data for this collection. Specifically, we want to thank Charlotte Forbes-Perry with the Pacific Cooperative Studies Unit, University of Hawai'i at Hawai‘i Volcanoes National Park as well as the following current and past members of the LOHE lab (in alphabetical order): Keith Burnett, Saxony Charlot, Noah Hunt, Caleb Kow, Elizabeth Lough, and Bret Mossman.

    Access and permits to record soundscapes were provided by (in alphabetical order): Hakalau Forest National Wildlife Refuge, the State of Hawai‘i Department of Land and Natural Resources Division of Forestry and Wildlife, and the U.S. Fish and Wildlife Service.

    We would also like to acknowledge our funding sources (in alphabetical order): The National Park Service Inventory and Monitoring Division, the National Science Foundation, and the U.S. Army Engineer Research and Development Center.

  6. Expert and AI-generated annotations of the tissue types for the...

    • data.niaid.nih.gov
    Updated Dec 20, 2024
    Cite
    Bridge, Christopher; Brown, G. Thomas; Jung, Hyun; Lisle, Curtis; Clunie, David; Milewski, David; Liu, Yanling; Collins, Jack; Linardic, Corinne M.; Hawkins, Douglas S.; Venkatramani, Rajkumar; Fedorov, Andrey; Khan, Javed (2024). Expert and AI-generated annotations of the tissue types for the RMS-Mutation-Prediction microscopy images [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10462857
    Explore at:
    Dataset updated
    Dec 20, 2024
    Dataset provided by
    National Cancer Institute (http://www.cancer.gov/)
    Seattle Children's Hospital
    Texas Children's Cancer Center
    Massachusetts General Hospital
    Duke University School of Medicine
    Brigham and Women's Hospital
    KnowledgeVis, LLC
    Frederick National Laboratory for Cancer Research
    Authors
    Bridge, Christopher; Brown, G. Thomas; Jung, Hyun; Lisle, Curtis; Clunie, David; Milewski, David; Liu, Yanling; Collins, Jack; Linardic, Corinne M.; Hawkins, Douglas S.; Venkatramani, Rajkumar; Fedorov, Andrey; Khan, Javed
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset corresponds to a collection of images and/or image-derived data available from the National Cancer Institute Imaging Data Commons (IDC) [1]. This dataset was converted into DICOM representation and ingested by the IDC team. You can explore and visualize the corresponding images using the IDC Portal here: https://portal.imaging.datacommons.cancer.gov/explore/filters/?analysis_results_id=RMS-Mutation-Prediction-Expert-Annotations. You can use the manifests included in this Zenodo record to download the content of the collection following the Download instructions below.

    Collection description

    This dataset contains 2 components:

    Annotations of multiple regions of interest performed by an expert pathologist with eight years of experience for a subset of hematoxylin and eosin (H&E) stained images from the RMS-Mutation-Prediction image collection [1,2]. Annotations were generated manually, using the Aperio ImageScope tool, to delineate regions of alveolar rhabdomyosarcoma (ARMS), embryonal rhabdomyosarcoma (ERMS), stroma, and necrosis [3]. The resulting planar contour annotations were originally stored in ImageScope-specific XML format, and subsequently converted into Digital Imaging and Communications in Medicine (DICOM) Structured Report (SR) representation using the open source conversion tool [4].

    AI-generated annotations stored as probabilistic segmentations. WARNING: After the release of v20, it was discovered that a mistake had been made during data conversion that affected the newly-released segmentations accompanying the "RMS-Mutation-Prediction" collection. Segmentations released in v20 for this collection have the segment labels for alveolar rhabdomyosarcoma (ARMS) and embryonal rhabdomyosarcoma (ERMS) switched in the metadata relative to the correct labels. Thus segment 3 in the released files is labelled in the metadata (the SegmentSequence) as ARMS but should correctly be interpreted as ERMS, and conversely segment 4 in the released files is labelled as ERMS but should be correctly interpreted as ARMS. We apologize for the mistake and any confusion that it has caused, and will be releasing a corrected version of the files in the next release as soon as possible.
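
    Until corrected files are released, consumers of the v20 segmentations can swap the two segment labels when reading the metadata. A rough sketch with pydicom (attribute names follow the DICOM SEG standard; the file path and label strings are illustrative):

     # Sketch: correct the swapped ARMS/ERMS labels of segments 3 and 4 in a v20 segmentation.
     import pydicom

     seg = pydicom.dcmread("rms_segmentation.dcm")  # placeholder path
     for item in seg.SegmentSequence:
         if item.SegmentNumber == 3:
             item.SegmentLabel = "Embryonal rhabdomyosarcoma"  # labelled ARMS in v20 metadata
         elif item.SegmentNumber == 4:
             item.SegmentLabel = "Alveolar rhabdomyosarcoma"   # labelled ERMS in v20 metadata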

    Many pixels from the whole slide images annotated by this dataset are not contained inside any annotation contours and are considered to belong to the background class. Other pixels are contained inside only one annotation contour and are assigned to a single class. However, cases also exist in this dataset where annotation contours overlap. In these cases, the pixels contained in multiple contours could be assigned membership in multiple classes. One example is a necrotic tissue contour overlapping an internal subregion of an area designated by a larger ARMS or ERMS annotation.

    The ordering of annotations in this DICOM dataset preserves the order in the original XML generated using ImageScope. These annotations were converted, in sequence, into segmentation masks and used in the training of several machine learning models. Details on the training methods and model results are presented in [1]. In the case of overlapping contours, the order in which annotations are processed may affect the generated segmentation mask if prior contours are overwritten by later contours in the sequence. It is up to the application consuming this data to decide how to interpret tissue regions annotated with multiple classes.

    The annotations included in this dataset are available for visualization and exploration from the National Cancer Institute Imaging Data Commons (IDC) [5] as of data release v18. Direct link to open the collection in the IDC Portal: https://portal.imaging.datacommons.cancer.gov/explore/filters/?analysis_results_id=RMS-Mutation-Prediction-Expert-Annotations.

    Files included

    A manifest file's name indicates the IDC data release in which a version of the collection data was first introduced. For example, pan_cancer_nuclei_seg_dicom-collection_id-idc_v19-aws.s5cmd corresponds to the annotations for the images in the collection_id collection introduced in IDC data release v19. DICOM binary segmentations were introduced in IDC v20. If there is a subsequent version of this Zenodo page, it will indicate when a subsequent version of the corresponding collection was introduced.

    For each of the collections, the following manifest files are provided:

    rms_mutation_prediction_expert_annotations-idc_v20-aws.s5cmd: manifest of files available for download from public IDC Amazon Web Services buckets

    rms_mutation_prediction_expert_annotations-idc_v20-gcs.s5cmd: manifest of files available for download from public IDC Google Cloud Storage buckets

    rms_mutation_prediction_expert_annotations-idc_v20-dcf.dcf: Gen3 manifest (for details see https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids)

    Note that manifest files that end in -aws.s5cmd reference files stored in Amazon Web Services (AWS) buckets, while -gcs.s5cmd reference files in Google Cloud Storage. The actual files are identical and are mirrored between AWS and GCP.

    Download instructions

    Each of the manifests include instructions in the header on how to download the included files.

    To download the files using .s5cmd manifests:

    install idc-index package: pip install --upgrade idc-index

    download the files referenced by manifests included in this dataset by passing the .s5cmd manifest file: idc download manifest.s5cmd

    To download the files using .dcf manifest, see manifest header.

    Acknowledgments

    Imaging Data Commons team has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Task Order No. HHSN26110071 under Contract No. HHSN261201500003l.

    If you use the files referenced in the attached manifests, we ask you to cite this dataset, as well as the publication describing the original dataset [2] and publication acknowledging IDC [5].

    References

    [1] D. Milewski et al., "Predicting molecular subtype and survival of rhabdomyosarcoma patients using deep learning of H&E images: A report from the Children's Oncology Group," Clin. Cancer Res., vol. 29, no. 2, pp. 364–378, Jan. 2023, doi: 10.1158/1078-0432.CCR-22-1663.

    [2] Clunie, D., Khan, J., Milewski, D., Jung, H., Bowen, J., Lisle, C., Brown, T., Liu, Y., Collins, J., Linardic, C. M., Hawkins, D. S., Venkatramani, R., Clifford, W., Pot, D., Wagner, U., Farahani, K., Kim, E., & Fedorov, A. (2023). DICOM converted whole slide hematoxylin and eosin images of rhabdomyosarcoma from Children's Oncology Group trials [Data set]. Zenodo. https://doi.org/10.5281/zenodo.8225132

    [3] Agaram NP. Evolving classification of rhabdomyosarcoma. Histopathology. 2022 Jan;80(1):98-108. doi: 10.1111/his.14449. PMID: 34958505; PMCID: PMC9425116,https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9425116/

    [4] Chris Bridge. (2024). ImagingDataCommons/idc-sm-annotations-conversion: v1.0.0 (v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.10632182

    [5] Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W. L., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National cancer institute imaging data commons: Toward transparency, reproducibility, and scalability in imaging artificial intelligence. Radiographics 43, (2023).

  7. A collection of fully-annotated soundscape recordings from the Western...

    • data.niaid.nih.gov
    Updated Jul 16, 2024
    Cite
    Stefan Kahl; Connor M. Wood; Philip Chaon; M. Zachariah Peery; Holger Klinck (2024). A collection of fully-annotated soundscape recordings from the Western United States [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7050013
    Explore at:
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Cornell Lab of Ornithology (http://birds.cornell.edu/)
    San Jose State Research Foundation
    Department of Forest and Wildlife Ecology, University of Wisconsin - Madison
    Authors
    Stefan Kahl; Connor M. Wood; Philip Chaon; M. Zachariah Peery; Holger Klinck
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Western United States, United States
    Description

    This collection contains 33 hour-long soundscape recordings, which have been annotated with 20,147 bounding box labels for 56 different bird species from the Western United States. The data were recorded in 2018 in the Sierra Nevada, California, USA. This collection has partially been featured as test data in the 2021 BirdCLEF competition and can primarily be used for training and evaluation of machine learning algorithms.

    Data collection

    Measuring the effects of forest management activities in the Sierra Nevada, California, USA can reveal potential correlations with avian population density and diversity. For this dataset, passive acoustic surveys were conducted in the Lassen and Plumas National Forests in May-August 2018. Survey grid cells (4 km2) were randomly selected from a 6,000-km2 area, and SWIFT recording units were deployed at locations conducive to sound propagation (e.g., ridges rather than gullies) within those cells. The sensitivity of the used microphones was -44 (+/-3) dB re 1 V/Pa. The microphone's frequency response was not measured, but is assumed to be flat (+/- 2 dB) in the frequency range 100 Hz to 7.5 kHz. The analog signal was amplified by 38 dB and digitized (16-bit resolution) using an analog-to-digital converter (ADC) with a clipping level of -/+ 0.9 V. Recording units recorded continuously from 17:00 to 23:59 and from 0:00 to 10:00; one-hour files were stored as uncompressed WAVE sampled at 32 kHz and later converted to FLAC. Parts of this dataset have previously been used in the 2021 BirdCLEF competition.

    Sampling and annotation protocol

    We subsampled data for this collection by selecting locations that spanned the full elevational and latitudinal gradients of our study area (~840–1700 m asl and 39.41–40.71°N), and thus represented a broad range of plant communities. A single annotator boxed every bird call he could recognize, ignoring those that were too faint or unidentifiable. Raven Pro software was used to annotate the data. Provided labels contain full bird calls that are boxed in time and frequency. The annotator was allowed to combine multiple consecutive calls of one species into one bounding box label if pauses between calls were shorter than five seconds. We use eBird species codes as labels, following the 2021 eBird taxonomy (Clements list).

    Files in this collection

    Audio recordings can be accessed by downloading and extracting the “soundscape_data.zip” file. Soundscape recording filenames contain a sequential file ID, recording date and timestamp in PDT. As an example, the file “SNE_001_20180509_050002.flac” has sequential ID 001 and was recorded on May 9th 2018 at 05:00:02 PDT. Ground truth annotations are listed in “annotations.csv” where each line specifies the corresponding filename, start and end time in seconds, low and high frequency in Hertz and an eBird species code. These species codes can be assigned to scientific and common name of a species with the “species.csv” file. The approximate recording location with longitude and latitude can be found in the “recording_location.txt” file.

    Acknowledgements

    The collection and annotation of this dataset was funded by the U.S. Forest Service Region 5 and the California Department of Fish and Wildlife.

  8. on-tree mango-branch instance segmentation dataset

    • researchdata.edu.au
    • acquire.cqu.edu.au
    Updated Jul 19, 2024
    Cite
    Chiranjivi Neupane (2024). on-tree mango-branch instance segmentation dataset [Dataset]. http://doi.org/10.25946/26212598.V1
    Explore at:
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Central Queensland University
    Authors
    Chiranjivi Neupane
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset has been prepared for use in machine vision-based mango fruit and branch localisation for detection of fruit-branch occlusion. Images are from Honey Gold and Keitt mango varieties. The dataset contains:

    - 250 RGB images (200 training + 50 test images) of mango tree canopies acquired using an Azure Kinect camera under artificial lighting conditions.

    - COCO JSON format label files with multi-class (mango + branch) and single-class (mango only, branch only) polygon annotations.

    - Labels converted to txt format for training YOLOv8-seg and other models.

    Annotation: The VGG Image Annotator (VIA) was used for ground-truth labeling of the images with its polygon labelling tool.
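
    A minimal sketch of the kind of conversion from a COCO polygon annotation to a normalized YOLO-seg label line (key names follow the COCO specification; the filename, image size, and class mapping are assumptions):

     # Sketch: turn one COCO polygon annotation into a YOLO-seg line (class_id then normalized x,y pairs).
     import json

     def coco_poly_to_yolo_line(ann, img_w, img_h, class_id=0):
         xy = ann["segmentation"][0]   # flat [x1, y1, x2, y2, ...]
         norm = [xy[i] / (img_w if i % 2 == 0 else img_h) for i in range(len(xy))]
         return " ".join([str(class_id)] + [f"{v:.6f}" for v in norm])

     with open("annotations_coco.json") as f:   # placeholder filename
         coco = json.load(f)
     print(coco_poly_to_yolo_line(coco["annotations"][0], img_w=1920, img_h=1080))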

  9. Benthic megafaunal assemblages from the Mayotte island outer slope: a case...

    • seanoe.org
    csv, pdf
    Updated 2023
    Cite
    Mélissa Hanafi-Portier; Catherine Borremans; Olivier Soubigou; Sarah Samadi; Laure Corbari; Karine Olu (2023). Benthic megafaunal assemblages from the Mayotte island outer slope: a case study illustrating workflow from annotation on images to georeferenced densities in sampling units [Dataset]. http://doi.org/10.17882/97234
    Explore at:
    csv, pdf
    Dataset updated
    2023
    Dataset provided by
    SEANOE
    Authors
    Mélissa Hanafi-Portier; Catherine Borremans; Olivier Soubigou; Sarah Samadi; Laure Corbari; Karine Olu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Description

    The Mayotte island outer slopes have been explored using a towed camera to describe the megabenthic communities in the bathyal domain. The present dataset focuses on the eastern slope, at depths between 500 and 1100 m. It is presented as a case study of a workflow from annotations on images, made with the BIIGLE online platform, to georeferenced densities in standardized sampling units produced with the ADELIE post-processing software. Data have been acquired during the BIOMAGLO cruise led by MNHN and Ifremer (Corbari, 2017).

    Workflow to obtain a density matrix from a raw observation export:

    The file "1_dive05_raw" corresponds to the raw data export of the BIIGLE software for a specific dive. This raw CSV format contains one row for each annotation label. Since an annotation can have multiple labels, there may be multiple rows for a single annotation. The first row always contains the column headers. The columns are as follows:

    1: annotation label id (not the annotation id)
    2: label id
    3: label name
    4: label hierarchy (see the extended report on how to interpret a label hierarchy)
    5: id of the user who created/attached the annotation label
    6: user firstname
    7: user lastname
    8: image id
    9: image filename
    10: image longitude
    11: image latitude
    12: annotation shape id
    13: annotation shape name
    14: annotation points, encoded as a JSON array of alternating x and y values (e.g. [x1,y1,x2,y2,...]); for circles, the third value of the points array is the radius of the circle
    15: additional attributes of the image, encoded as a JSON object; the content may vary depending on the BIIGLE modules that are installed and the operations performed on the image (e.g. a laser point detection to calculate the area of an image)
    16: annotation id

    The file "2_dive05_biomaglo_taxo_abund.csv" corresponds to a "matrix of abundance" format obtained after cleaning the dataset, reordering the columns, and calculating the number of individuals per taxon per image in the R environment. This file is then formatted as the file "3_dive05_biomaglo_taxo_abund_sig.csv", which corresponds to the format required by ADELIE (Ifremer software), used to calculate the density of taxa per sampling unit in the GIS. Taxon names are converted to an "id" because the software requires ".dbf" format files that truncate names beyond 10 characters.

    In the ADELIE software, we perform a join between the abundance matrix and the "dim" file of the dive, which includes the georeferenced metadata of the images (latitude, longitude, altitude, filename). After the join, we proceed to the density calculation. The input files are the join file (i.e. the georeferenced abundance matrix) and the navigation file, which includes the altitude data ("nav"). The polygon area (sampling unit) must be specified (here 200 m2).

    The file "4_output_dive05.csv" is the output after calculating the abundance per sampling unit. The column "objectid,n,9,0" contains the polygon numbers. This file is then cleaned in the R environment to obtain the file "5_dive05_density_final": the value of the average image surface for the dive is added, and from the number of replicates (images) per polygon the abundance is standardized per polygon surface (~200 m2). Taxon names are reassigned instead of the "id".
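
    The per-image abundance step (file 2) was done by the authors in R; a rough Python equivalent of that aggregation, using the column names listed above (check them against the actual export headers), might look like this:

     # Sketch: count individuals per taxon per image from a raw BIIGLE annotation export.
     import pandas as pd

     raw = pd.read_csv("1_dive05_raw.csv")  # placeholder filename
     abundance = (
         raw.groupby(["image filename", "label name"])["annotation id"]
            .nunique()                      # one annotation may carry several label rows
            .unstack(fill_value=0)          # rows: images, columns: taxa, values: counts
     )
     abundance.to_csv("2_dive05_biomaglo_taxo_abund.csv")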

  10. A collection of annotated soundscape recordings from western Kenya

    • data.niaid.nih.gov
    Updated Jul 6, 2024
    Cite
    Kahl, Stefan; Reers, Hendrik; Cherutich, Francis; Jacot, Alain; Klinck, Holger (2024). A collection of annotated soundscape recordings from western Kenya [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10943499
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    Independent
    Swiss Ornithological Institute
    OekoFor GbR
    K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University
    Authors
    Kahl, Stefan; Reers, Hendrik; Cherutich, Francis; Jacot, Alain; Klinck, Holger
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Kenya
    Description

    This collection contains 35 soundscape recordings of 32 hours total duration, which have been annotated with 10,294 labels for 176 different bird species from western Kenya. The data were recorded in 2021 and 2022 west and southwest of Lake Baringo in Baringo County, Kenya. This collection has partially been featured as test data in the 2023 BirdCLEF competition and can primarily be used for training and evaluation of machine learning algorithms.

    Data collection

    For this collection, AudioMoth and SWIFT recording units were deployed at multiple locations west and southwest of Lake Baringo, Baringo County, Kenya between December 2021 and February 2022. Recording locations cover a variety of habitats, from open grasslands to semi-arid scrubland and mountain forests. Recordings were originally sampled at 48 kHz and converted to MP3 for faster file transfer. For publication, all files were resampled to 32 kHz and converted to FLAC.

    Sampling and annotation protocol

    A total of 32 hours of audio from various sites west and southwest of Lake Baringo were selected for annotation. Annotators were tasked with identifying and labeling each bird call they could discern, excluding any calls that were too weak or indiscernible. The annotation process was carried out using Audacity. Provided labels mark the center of each bird call. In this collection, we use eBird species codes as labels, following the 2021 eBird taxonomy (Clements list). Parts of this dataset have previously been used in the 2023 BirdCLEF competition.

    Files in this collection

    Audio recordings can be accessed by downloading and extracting the “soundscape_data.zip” file. Soundscape recording filenames contain a sequential file ID, recording date and timestamp in EAT (UTC+3). As an example, the file “KEN_001_20211207_153852.flac” has sequential ID 001 and was recorded on December 7th 2021 at 15:38:52 EAT. Ground truth annotations are listed in “annotations.csv” where each line specifies the corresponding filename, start and end time in seconds, and an eBird species code. These species codes can be assigned to scientific and common name of a species with the “species.csv” file. The approximate recording location with longitude and latitude can be found in the “recording_location.txt” file.

    Acknowledgements

    Compiling this extensive dataset was a major undertaking, and we are very thankful to the domain experts who helped to collect and manually annotate the data for this collection. In particular, our thanks go to Francis Cherutich for setting up recording units, collecting and annotating data, and to Alain Jacot for assisting in programming the units and transporting the recorders to Kenya.

  11. Rana sierrae annotated aquatic soundscapes (2022)

    • data-staging.niaid.nih.gov
    • dataone.org
    • +3 more
    zip
    Updated Nov 20, 2023
    Cite
    Sam Lapp; Justin Kitzes (2023). Rana sierrae annotated aquatic soundscapes (2022) [Dataset]. http://doi.org/10.5061/dryad.9s4mw6mn3
    Explore at:
    zip
    Dataset updated
    Nov 20, 2023
    Dataset provided by
    University of Pittsburgh
    Authors
    Sam Lapp; Justin Kitzes
    License

    CC0 1.0 Universal: https://spdx.org/licenses/CC0-1.0.html

    Description

    This dataset is associated with the following manuscript, which contains details on the methodology of data collection and annotation: Lapp, S., Smith, T. C., Wilhelm, A., Knapp, R., Kitzes, J. In press. Aquatic soundscape recordings reveal diverse vocalizations and nocturnal activity of an endangered frog. The American Naturalist.

    Rana sierrae (the Sierra Nevada yellow-legged frog) is an endangered species residing in high-elevation lakes in the Sierra Nevada mountains. The species is highly aquatic and, unlike most amphibians, primarily vocalizes while underwater. As a result, its vocalizations have rarely been recorded and its vocal repertoire is not well studied. This dataset contains an annotated set of underwater soundscape recordings containing 1236 annotations of R. sierrae vocalizations. We annotated five distinct vocalization types of R. sierrae, only two of which have been previously documented for this species. Besides the calls of R. sierrae, these audio recordings also contain stridulation sounds (not annotated), which were most likely produced by members of the family Corixidae or other aquatic invertebrates that stridulate underwater.

    Methods

    The audio in this dataset is a set of 672 10-second audio files recorded at a spacing of 15 minutes over the course of 7 days on a single underwater audio recorder. The recorder, an AudioMoth version 1.2.0 in an underwater case, was deployed approximately 0.5 m from the shoreline, on the bottom of a lake in the Sierra Nevada in which R. sierrae breed and overwinter. The annotations of the five call types correspond to the descriptions in the associated manuscript:

    • A: primary vocalization ("meow") described in Vredenburg et al. 2007
    • B: stuttered vocalization, also described in Vredenburg et al. 2007
    • C: chuck, double/triple chuck calls
    • D: short downward single note
    • E: frequency-modulated call
    • X: could not determine if the sound is R. sierrae or not; these were excluded from training and validation of the CNN in the manuscript

    Files were annotated by Sam Lapp using Raven Pro with closed-back headphones while viewing the spectrogram. Only calls that could both be heard and seen on the spectrogram were annotated.

    This dataset also contains one-hot labels (0/1 per class per audio clip) for 2-second segments of audio. To generate these labels, we considered R. sierrae vocalizations to be present in a 2-second sample if any R. sierrae annotation overlapped with the sample by at least 0.2 seconds or if greater than 50% of an annotation box overlapped in time with the sample. A notebook in the associated GitHub repository demonstrates how the Raven annotations were converted to one-hot labels.

    Works Cited

    Vredenburg VT, Bingham R, Knapp R, Morgan JAT, Moritz C, Wake D. 2007. Concordant molecular and phenotypic data delineate new taxonomy and conservation priorities for the endangered mountain yellow-legged frog. Journal of Zoology 271:361–374.
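
    The one-hot labeling rule described above (a 2-second clip is positive if an annotation overlaps it by at least 0.2 s, or if more than 50% of the annotation falls inside the clip) can be sketched as follows; the function name and example values are illustrative:

     # Sketch of the clip-labeling overlap rule described in the dataset description.
     def clip_is_positive(clip_start, clip_end, annotations):
         """annotations: list of (start, end) times in seconds for one call type."""
         for ann_start, ann_end in annotations:
             overlap = min(clip_end, ann_end) - max(clip_start, ann_start)
             if overlap <= 0:
                 continue
             if overlap >= 0.2 or overlap / (ann_end - ann_start) > 0.5:
                 return True
         return False

     # Example: a 0.3 s call overlapping the 2 s clip by 0.25 s counts as present.
     print(clip_is_positive(0.0, 2.0, [(1.75, 2.05)]))  # True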

  12. Solar photovoltaic annotations for computer vision related to the...

    • figshare.com
    zip
    Updated May 30, 2023
    Cite
    Simiao Ren; Jordan Malof; T. Robert Fetter; Robert Beach; Jay Rineer; Kyle Bradbury (2023). Solar photovoltaic annotations for computer vision related to the "Classification Training Dataset for Crop Types in Rwanda" drone imagery dataset [Dataset]. http://doi.org/10.6084/m9.figshare.18094043.v1
    Explore at:
    zip
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Simiao Ren; Jordan Malof; T. Robert Fetter; Robert Beach; Jay Rineer; Kyle Bradbury
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Rwanda
    Description

    This dataset contains annotations (i.e. polygons) for solar photovoltaic (PV) objects in the previously published dataset "Classification Training Dataset for Crop Types in Rwanda" published by RTI International (DOI: 10.34911/rdnt.r4p1fr [1]). These polygons are intended to enable the use of this dataset as a machine learning training dataset for solar PV identification in drone imagery. Note that this dataset contains ONLY the solar panel polygon labels and needs to be used with the original RGB UAV imagery "Drone Imagery Classification Training Dataset for Crop Types in Rwanda" (https://mlhub.earth/data/rti_rwanda_crop_type). The original dataset contains UAV imagery (RGB) in .tiff format in six provinces in Rwanda, each with three phases imaged, and our solar PV annotation dataset follows the same data structure, with province and phase labels in each subfolder.

    Data processing: Please refer to this GitHub repository for further details: https://github.com/BensonRen/Drone_based_solar_PV_detection. The original dataset is divided into 8000x8000 pixel image tiles and manually labeled with polygons (mainly rectangles) to indicate the presence of solar PV. These polygons are converted into pixel-wise, binary class annotations.

    Other information:

    1. The six provinces that the UAV imagery came from are: (1) Cyampirita, (2) Kabarama, (3) Kaberege, (4) Kinyaga, (5) Ngarama, (6) Rwakigarati. The original data collections were staged across 18 phases, each collecting a set of imagery from a given province (each province had 3 phases of collection). We have annotated 15 out of 18 phases; the missing ones are Kabarama-Phase2, Kaberege-Phase3, and Kinyaga-Phase3, due to data compatibility issues with the unused phases.

    2. The annotated polygons are transformed into binary maps the size of the image tiles, where each pixel is either 0 or 1: 0 represents background and 1 represents solar PV pixels. These binary maps are in .png format, and each province/phase set has between 9 and 49 annotation patches. Using the code provided in the above repository, the same image patches can be cropped from the original RGB imagery.

    3. Solar PV densities vary across the image patches. In total, 214 solar PV instances were labeled in the 15 phases.

    Associated publication: "Utilizing geospatial data for assessing energy security: Mapping small solar home systems using unmanned aerial vehicles and deep learning" (https://arxiv.org/abs/2201.05548)

    This dataset is published under the CC-BY-NC-SA-4.0 license (https://creativecommons.org/licenses/by-nc-sa/4.0/).
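
    The polygon-to-binary-mask step described under "Data processing" can be illustrated with a short sketch (the authors' actual code lives in the repository linked above; this version uses Pillow and a made-up polygon):

     # Sketch: rasterize labeled polygons into a binary mask (1 = solar PV, 0 = background).
     import numpy as np
     from PIL import Image, ImageDraw

     def polygons_to_mask(polygons, width=8000, height=8000):
         """polygons: list of [(x1, y1), (x2, y2), ...] in pixel coordinates."""
         mask = Image.new("L", (width, height), 0)
         draw = ImageDraw.Draw(mask)
         for poly in polygons:
             draw.polygon(poly, outline=1, fill=1)
         return np.array(mask, dtype=np.uint8)

     mask = polygons_to_mask([[(100, 100), (180, 100), (180, 160), (100, 160)]])
     Image.fromarray(mask * 255).save("example_mask.png")  # PV pixels rendered white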

  13. A collection of fully-annotated soundscape recordings from the Southwestern...

    • zenodo.org
    • data.niaid.nih.gov
    csv, pdf, txt, zip
    Updated Jul 16, 2024
    Cite
    W. Alexander Hopping; Stefan Kahl; Holger Klinck (2024). A collection of fully-annotated soundscape recordings from the Southwestern Amazon Basin [Dataset]. http://doi.org/10.5281/zenodo.7079124
    Explore at:
    csv, txt, pdf, zip
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    W. Alexander Hopping; Stefan Kahl; Holger Klinck
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This collection contains 21 hour-long soundscape recordings, which have been annotated with 14,798 bounding box labels for 132 different bird species from the Southwestern Amazon Basin. The data were recorded in 2019 in the Inkaterra Reserva Amazonica, Madre de Dios, Peru. This collection has partially been featured as test data in the 2020 BirdCLEF competition and can primarily be used for training and evaluation of machine learning algorithms.

    Data collection

    This acoustic data was collected at the Inkaterra Reserva Amazonica (ITRA) between January 14th and February 2nd, 2019, during the rainy season. ITRA is a 2 km2 lowland rainforest reserve on the banks of the Madre de Dios river, approximately 20 km east of the frontier town of Puerto Maldonado. The region's extraordinary biodiversity is threatened by accelerating rates of deforestation, degradation, and fragmentation, which are driven primarily by expanding road networks, mining, agriculture, and an increasing population. The acoustic data from this site were collected as part of a study designed to assess spatio-temporal variation in avian species richness and vocal activity levels across intact, degraded, and edge forest, and between different days at the same point locations.

    Ten SWIFT recording units, provided by the K. Lisa Yang Center for Conservation Bioacoustics at the Cornell Lab of Ornithology, were placed at separate sites spanning edge habitat, degraded forest, and intact forest within the reserve. These omnidirectional recorders were set to record uncompressed WAVE files continuously for the duration of their deployment, with a sampling rate of 48 kHz. The sensitivity of the used microphones was -44 (+/-3) dB re 1 V/Pa. The microphone's frequency response was not measured but is assumed to be flat (+/- 3 dB) in the frequency range 100 Hz to 7.5 kHz. The analog signal was amplified by 35 dB and digitized (16-bit resolution) using an analog-to-digital converter (ADC) with a clipping level of -/+ 0.9 V. For this collection, recordings were resampled at 32 kHz and converted to FLAC. Recorders were placed at a consistent height of approximately 1.5 m above the ground. To minimize background noise, all sites used for data analysis were located at a minimum distance of 450 m from the river.

    Sampling and annotation protocol

    A total of 21 dawn-hours, from 05:00-06:00 PET (10:00-11:00 UTC), representing 7 of the 10 sites on three randomly-selected dates, were manually annotated. Many neotropical bird species sing almost exclusively during the dawn hour, so this time window was selected to maximize the number of species present in the recordings. A single annotator boxed every bird call he could identify and ignored those that were too faint. Raven Pro software was used to annotate the data. Provided labels contain full bird calls that are boxed in time and frequency. The annotator was allowed to combine multiple consecutive calls of one species into one bounding box label if pauses between calls were shorter than five seconds. In this collection, we use eBird species codes as labels, following the 2021 eBird taxonomy (Clements list). Parts of this dataset have previously been featured in the 2020 BirdCLEF competition.

    Files in this collection

    Audio recordings can be accessed by downloading and extracting the “soundscape_data.zip” file. Soundscape recording filenames contain a sequential file ID, recording site, date, and timestamp in UTC. As an example, the file “PER_001_S01_20190116_100007Z.flac” has sequential ID 001 and was recorded at site S01 on Jan 16th, 2019 at 10:00:07 UTC. Ground truth annotations are listed in “annotations.csv” where each line specifies the corresponding filename, start and end time in seconds, low and high frequency in Hertz, and an eBird species code. These species codes can be assigned to scientific and common name of a species with the “species.csv” file. Unidentifiable calls have been marked with “????” and are included in the ground truth annotations. The approximate recording location and a short habitat description for all sites can be found in the “recording_location.txt” file.

    Acknowledgements

    We would like to thank the Inkaterra Association (ITA) staff for providing logistical support and excellent field station facilities, particularly Noe Huaraca, Dennis Osorio, and Kevin Jiménez Gonzales, who helped set up recorders. Noe Huaraca, John Fitzpatrick, Fernando Angulo, Will Sweet, Ken Rosenburg, and Alex Wiebe helped identify unknown vocalizations. Funding for equipment was provided by the K. Lisa Yang Center for Conservation Bioacoustics at the Cornell Lab of Ornithology, with support from Innóvate Perú, CORBIDI, and the Inkaterra Association. Travel expenses were funded by the Cornell Lab of Ornithology.

  14. schematic_images

    • huggingface.co
    Updated Jul 26, 2025
    Cite
    chun yen huang (2025). schematic_images [Dataset]. https://huggingface.co/datasets/hanky2397/schematic_images
    Explore at:
    Dataset updated
    Jul 26, 2025
    Authors
    chun yen huang
    Description

    Schematic to HSPICE Netlist Dataset

    If you want to train a wire detection model, you can refer to our GitHub repository: Netlistfy. This dataset is designed for the task of converting schematic images into HSPICE netlists. It includes various annotations and labels required for object detection and net extraction. The dataset is structured as follows:

      Dataset Structure
    

    ├── images.zip    # Schematic images
    ├── components.zip  # Component annotations (YOLO format) and…

    See the full description on the dataset page: https://huggingface.co/datasets/hanky2397/schematic_images.

  15. SKU110K Dataset

    • kaggle.com
    • universe.roboflow.com
    • +1 more
    zip
    Updated Jun 9, 2022
    Cite
    Francisco 'Cisco' Zabala (2022). SKU110K Dataset [Dataset]. https://www.kaggle.com/datasets/thedatasith/sku110k-annotations/code
    Explore at:
    zip (14123122669 bytes)
    Dataset updated
    Jun 9, 2022
    Authors
    Francisco 'Cisco' Zabala
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    Context

    Here you will find the annotations (labels), images, and a notebook containing the process to use the SKU110K dataset, as well as to convert the annotations to a variety of formats (e.g., YOLOv5).

    Content

    This dataset contains the labels comprising the bounding boxes (only one class for all detections), as well as all images for the SKU110K dataset. The included notebook is meant to be run locally, and the code shows how to use the Kaggle CLI for downloading the images to their corresponding location.

    Acknowledgements

    I'd love to acknowledge the original authors: https://github.com/eg4000/SKU110K_CVPR19

  16. VSAI Dataset (YOLO11-OBB format)

    • kaggle.com
    zip
    Updated Aug 29, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Mridankan Mandal (2025). VSAI Dataset (YOLO11-OBB format) [Dataset]. https://www.kaggle.com/datasets/redzapdos123/vsai-dataset-yolo11-obb-format/code
    Explore at:
    zip(8332516716 bytes)Available download formats
    Dataset updated
    Aug 29, 2025
    Authors
    Mridankan Mandal
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    VSAI Aerial Vehicle Detection Dataset (YOLO OBB Format):

    A cleaned and reformatted version of the VSAI Dataset, specifically adapted for Oriented Bounding Box (OBB) vehicle detection using the YOLOv11 format.

    Overview:

    This dataset is designed for aerial/drone-based vehicle detection tasks. It is a modified version of the original VSAI Dataset v1 by the DroneVision Team. This version has been modified by Mridankan Mandal to make it easier to train object detection models such as the YOLO11-OBB models.

    The dataset contains two classes: small-vehicle and large-vehicle. All annotations have been converted to the YOLOv11-OBB format, and the data is organized into training, validation, and testing sets.

    Key Features and Modifications:

    This dataset improves upon the original by incorporating several key modifications to make it more accessible and useful for modern computer vision tasks:

    • Format Conversion: The annotations have been converted to the YOLOv11-OBB format, which uses four corner points to define an oriented bounding box.
    • Data Cleaning: All image and annotation pairs where the label file was empty have been removed to ensure dataset quality.
    • Structured Splits: The dataset is pre-split into train (80%), validation (10%), and test (10%) sets, with the following image counts:
      • Train: 4,297 images
      • Validation: 537 images
      • Test: 538 images
      • Total: 5,372 images
    • Coordinate Normalization: All bounding box coordinates are normalized to a range of [0.0 - 1.0], making them ready for training without preprocessing.

    Directory Structure

    The dataset is organized in a standard YOLO format for easy integration with popular training frameworks.

    YOLOOBBVSAIDataset/
    ├── train/
    │  ├── images/   #Contains 4,297 image files.
    │  └── labels/   #Contains 4,297 .txt label files.
    ├── val/
    │  ├── images/   #Contains 537 image files.
    │  └── labels/   #Contains 537 .txt label files.
    ├── test/
    │  ├── images/   #Contains 538 image files.
    │  └── labels/   #Contains 538 .txt label files.
    ├── data.yaml    #Dataset configuration file.
    ├── license.md   #Full license details.
    └── ReadMe.md    #Dataset README file.
    

    Annotation Format:

    Each .txt label file contains one or more lines, with each line representing a single object in the YOLOv11-OBB format:

    class_id x1 y1 x2 y2 x3 y3 x4 y4

    • class_id: An integer representing the object class (0 for small-vehicle, 1 for large-vehicle).
    • (x1, y1)...(x4, y4): The four corner points of the oriented bounding box, with coordinates normalized between 0 and 1.
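
    As a minimal sketch of how such a label file can be read (the helper name and file path below are illustrative, not part of the dataset):

    from pathlib import Path

    def read_obb_labels(label_path):
        # Each line: class_id followed by four normalized (x, y) corner points.
        objects = []
        for line in Path(label_path).read_text().splitlines():
            parts = line.split()
            if not parts:
                continue
            class_id = int(parts[0])
            xs = [float(v) for v in parts[1::2]]  # x1, x2, x3, x4
            ys = [float(v) for v in parts[2::2]]  # y1, y2, y3, y4
            objects.append({"class_id": class_id, "corners": list(zip(xs, ys))})
        return objects

    print(read_obb_labels("train/labels/example.txt"))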

    data.yaml:

    To begin training a YOLO model with this dataset, you can use the provided data.yaml file. Simply update the path to the location of the dataset on your local machine.

    #The path to the root dataset directory.
    path: /path/to/YOLOOBBVSAIDataset/
    train: train/images
    val: val/images
    test: test/images
    
    #Number of classes.
    nc: 2
    
    #The class names.
    names:
     0: small-vehicle
     1: large-vehicle
    
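    As a hedged sketch only (the dataset itself does not prescribe a training framework), fine-tuning with the Ultralytics package could look roughly like this:

    from ultralytics import YOLO

    # Start from a pretrained OBB checkpoint and fine-tune on this dataset.
    model = YOLO("yolo11n-obb.pt")
    model.train(data="/path/to/YOLOOBBVSAIDataset/data.yaml", epochs=100, imgsz=640)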

    License and Attribution:

    This dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

    • You are free to: Use, modify, and redistribute this dataset for non-commercial research and educational purposes.
    • You must: Provide proper attribution to both the original creators and the modifier, and release any derivative works under the same license.

    Proper Attribution:

    When using this dataset, attribute as follows:

    • Original VSAI Dataset v1 by DroneVision Team, licensed under CC BY-NC-SA 4.0.
    • Modified VSAI Dataset (YOLOv11-OBB Format) by Mridankan Mandal, licensed under CC BY-NC-SA 4.0.

    Citation:

    If you use this dataset in your research, use the following BibTeX entry to cite it:

    @dataset{vsai_yolo_obb_2025,
     title={VSAI Dataset (YOLOv11-OBB Format)},
     author={Mridankan Mandal},
     year={2025},
     note={Modified from original VSAI v1 dataset by DroneVision},
     license={CC BY-NC-SA 4.0}
    }
    
  17. A collection of fully-annotated soundscape recordings from the southern...

    • zenodo.org
    • data.niaid.nih.gov
    csv, pdf, txt, zip
    Updated Jul 12, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Mary Clapp; Mary Clapp; Stefan Kahl; Stefan Kahl; Erik Meyer; Megan McKenna; Holger Klinck; Holger Klinck; Gail Patricelli; Gail Patricelli; Erik Meyer; Megan McKenna (2024). A collection of fully-annotated soundscape recordings from the southern Sierra Nevada mountain range [Dataset]. http://doi.org/10.5281/zenodo.7525805
    Explore at:
    csv, txt, zip, pdfAvailable download formats
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Mary Clapp; Mary Clapp; Stefan Kahl; Stefan Kahl; Erik Meyer; Megan McKenna; Holger Klinck; Holger Klinck; Gail Patricelli; Gail Patricelli; Erik Meyer; Megan McKenna
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Sierra Nevada
    Description

    This collection contains 100 soundscape recordings of 10 minutes duration, which have been annotated with 10,296 bounding box labels for 21 different bird species from the Western United States. The data were recorded in 2015 in the southern end of the Sierra Nevada mountain range in California, USA. This collection has been featured as test data in the 2020 BirdCLEF and Kaggle Birdcall Identification competition and can primarily be used for training and evaluation of machine learning algorithms.

    Data collection

    The recordings were made in Sequoia and Kings Canyon National Parks, two contiguous national parks in the southern Sierra Nevada mountain range in California, USA. The focus of the acoustic study was the high-elevation region of the Parks; specifically, the headwater lake basins above 3,000 m in elevation. The original intent of the study was to monitor seasonal activity of birds and bats at lakes containing trout and lakes without trout, because the cascading impacts of trout on the adjacent terrestrial zone remain poorly understood. Soundscapes were recorded for 24 h continuously at 10 lakes (5 fishless, 5 fish-containing) throughout Sequoia and Kings Canyon National Parks during June-September 2015. Song Meter SM2+ units (Wildlife Acoustics, USA) powered by custom-made solar panels were used to obviate the need to swap batteries, due to the recording locations being extremely difficult to access. Song Meters continuously recorded mono-channel, 16-bit uncompressed WAVE files at 48 kHz sampling rate. For this collection, recordings were resampled to 32 kHz and converted to FLAC.

    Sampling and annotation protocol

    A total of 100 10-minute segments of audio between July 9 and 12, 2015 from morning hours (06:10-09:10 PDT) from all 10 sites were selected at random. Annotators were asked to box every bird call they could recognize, ignoring those that are too faint or unidentifiable. Every sound that could not be confidently assigned an identity was reviewed with 1-2 other experts in bird identification. To minimize observer bias, all identifying information about the location, date and time of the recordings was hidden from the annotator. Raven Pro software was used to annotate the data. Provided labels contain full bird calls that are boxed in time and frequency. In this collection, we use eBird species codes as labels, following the 2021 eBird taxonomy (Clements list). Unidentifiable calls have been marked with “????” and were added as bounding box labels to the ground truth annotations. Parts of this dataset have previously been used in the 2020 BirdCLEF and Kaggle Birdcall Identification competition.

    Files in this collection

    Audio recordings can be accessed by downloading and extracting the “soundscape_data.zip” file. Soundscape recording filenames contain a sequential file ID, recording date and timestamp in PDT (UTC-7). As an example, the file “HSN_001_20150708_061805.flac” has sequential ID 001 and was recorded on July 8th, 2015 at 06:18:05 PDT. Ground truth annotations are listed in “annotations.csv” where each line specifies the corresponding filename, start and end time in seconds, low and high frequency in Hertz and an eBird species code. These species codes can be mapped to the scientific and common name of each species using the “species.csv” file. The approximate recording location with longitude and latitude can be found in the “recording_location.txt” file.

    Acknowledgements

    Compiling this extensive dataset was a major undertaking, and we are very thankful to the domain experts who helped to collect and manually annotate the data for this collection (individual contributors in alphabetic order): Anna Calderón, Thomas Hahn, Ruoshi Huang, Angelly Tovar

  18. Annotated recordings of two captive groups of rooks, with individual...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jun 14, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Killian Martin; Killian Martin; Francesca M. Cornero; Francesca M. Cornero; Nicola S. Clayton; Nicola S. Clayton; Olivier Adam; Olivier Adam; Nicolas Obin; Nicolas Obin; Valérie Dufour; Valérie Dufour (2023). Annotated recordings of two captive groups of rooks, with individual identity and context [Dataset]. http://doi.org/10.2306/rook.comp
    Explore at:
    zipAvailable download formats
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Killian Martin; Killian Martin; Francesca M. Cornero; Francesca M. Cornero; Nicola S. Clayton; Nicola S. Clayton; Olivier Adam; Olivier Adam; Nicolas Obin; Nicolas Obin; Valérie Dufour; Valérie Dufour
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was used in our paper "Vocal complexity in a socially complex corvid: gradation, diversity, and lack of common call repertoire in male rooks" (DOI will be added on publication).

    If you use this dataset in your work, please cite the article (citation to be added on publication).

    The dataset includes audio recordings (stored in "audio.zip") and annotations (stored in "labels.zip" and "clean_labels.zip"), collected between 2020 and 2022 in two captive groups of rooks in outdoor aviaries, one in Strasbourg (France) and one in Cambridge (UK). The audio was compressed losslessly to FLAC files from the original uncompressed WAV to fit within the Zenodo 50 GB limit. The files can be converted back to WAV with the Python soundfile package or with FFmpeg from the command line if needed (though some conversions will probably fail due to the ~4 GB file size limit on WAV).
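
    For example, a single recording could be converted back roughly as follows (the filename is a placeholder):

    import soundfile as sf

    # Read the FLAC recording and write it back out as uncompressed WAV at the same sample rate.
    audio, sample_rate = sf.read("20200101_120000.flac")
    sf.write("20200101_120000.wav", audio, sample_rate)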

    NOTE: To ease checking data formatting without downloading the entire dataset, the "example.zip" folder contains one audio file and its associated annotations.

    Two different versions of the annotations files are included: "clean_labels.zip" includes the TSV files used in the analysis for the paper, and "labels.zip" includes the TXT files used for annotation, which can be opened along with the audio files in the Audacity software for reviewing. Each audio file corresponds to one TXT and one TSV file, with corresponding files sharing the same filename. Filename format is 'YYYYMMDD_HHMMSS(_StartXXXX)', meaning the date and time of the beginning of the recording; optionally, "StartXXXX" means that the original recording was split into multiple files, with each file starting XXXX seconds after the start of the original recording.

    TSV annotations include, for each recorded rook vocalisation: time stamps (Start, End columns), emitter identity (Source column), context of emission (Event column; these annotations are often abbreviations, but the most important distinction is between calls and songs, denoted by the presence or absence of "sing" in the Event cell), and additional comments (Comment column). Special cases for annotations include Inc (an unknown single individual vocalised but could not be identified), Pls (several individuals vocalised but overlapped too much to be separated), Comment (for events of note that were not vocalisations), and Ignore (used for sections that could not be checked for annotations for any reason; vocalisations may be included but were not annotated).
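
    As an illustrative sketch only, assuming the TSV files carry a header row with the column names listed above, calls and songs could be separated with pandas like this:

    import pandas as pd

    labels = pd.read_csv("20200101_120000.tsv", sep="\t")

    # Songs are marked by "sing" appearing in the Event column; everything else is treated as a call.
    is_song = labels["Event"].astype(str).str.contains("sing", case=False)
    songs, calls = labels[is_song], labels[~is_song]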

    TXT annotations include the same information but the Source and Event columns are merged and the corresponding Comments are additional markers between the vocalisation timestamps, to be compatible with Audacity. This is best viewed in the example file.

    Additional info regarding the individuals can be found in Table S1 of the paper.

    The code used for the analysis is hosted at https://gitlab.com/kimartin/cluster_rook_vocs.

    For questions on the dataset, please reach out to killian.martin@ens-lyon.fr

  19. A collection of fully-annotated soundscape recordings from the Northeastern...

    • zenodo.org
    • data.niaid.nih.gov
    csv, pdf, txt, zip
    Updated Jul 16, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Stefan Kahl; Stefan Kahl; Russell Charif; Russell Charif; Holger Klinck; Holger Klinck (2024). A collection of fully-annotated soundscape recordings from the Northeastern United States [Dataset]. http://doi.org/10.5281/zenodo.7018484
    Explore at:
    pdf, csv, zip, txtAvailable download formats
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Stefan Kahl; Stefan Kahl; Russell Charif; Russell Charif; Holger Klinck; Holger Klinck
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Northeastern United States, United States
    Description

    This collection contains 285 hour-long soundscape recordings which have been annotated by expert ornithologists who provided 50,760 bounding box labels for 81 different bird species from the Northeastern USA. The data were recorded in 2017 in the Sapsucker Woods bird sanctuary in Ithaca, NY, USA. This collection has (partially) been used as test data in the 2019, 2020 and 2021 BirdCLEF competition and can primarily be used for training and evaluation of machine learning algorithms.

    Data collection

    As part of the Sapsucker Woods Acoustic Monitoring Project (SWAMP), the K. Lisa Yang Center for Conservation Bioacoustics at the Cornell Lab of Ornithology deployed 30 first-generation SWIFT recorders in the surrounding bird sanctuary area in Ithaca, NY, USA. The sensitivity of the microphones used was -44 (+/-3) dB re 1 V/Pa. The microphone's frequency response was not measured, but is assumed to be flat (+/- 2 dB) in the frequency range 100 Hz to 7.5 kHz. The analog signal was amplified by 33 dB and digitized (16-bit resolution) using an analog-to-digital converter (ADC) with a clipping level of +/- 0.9 V. This ongoing study aims to investigate the vocal activity patterns and seasonally changing diversity of local bird species. The data are also used to assess the impact of noise pollution on the behavior of birds. Audio was recorded 24 h/day in 1-hour uncompressed WAVE files at 48 kHz, then converted to FLAC and resampled to 32 kHz for this collection. Parts of this dataset have previously been used in the 2019, 2020 and 2021 BirdCLEF competitions.

    Sampling and annotation protocol

    We subsampled data for this collection by randomly selecting one 1-hour file from one of the 30 different recording units for each hour of one day per week between Feb and Aug 2017. For this collection, we excluded recordings that were shorter than one hour or did not contain a bird vocalization. Annotators were asked to box every bird call they could recognize, ignoring those that are too faint or unidentifiable. Raven Pro software was used to annotate the data. Provided labels contain full bird calls that are boxed in time and frequency. Annotators were allowed to combine multiple consecutive calls of one species into one bounding box label if pauses between calls were shorter than five seconds. We use eBird species codes as labels, following the 2021 eBird taxonomy (Clements list).

    Files in this collection

    Audio recordings can be accessed by downloading and extracting the “soundscape_data.zip” file. Soundscape recording filenames contain a sequential file ID, recording date and timestamp in UTC. As an example, the file “SSW_001_20170225_010000Z.flac” has sequential ID 001 and was recorded on Feb 25th, 2017 at 01:00:00 UTC. Ground truth annotations are listed in “annotations.csv” where each line specifies the corresponding filename, start and end time in seconds, low and high frequency in Hertz and an eBird species code. These species codes can be mapped to the scientific and common name of each species using the “species.csv” file. The approximate recording location with longitude and latitude can be found in the “recording_location.txt” file.
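
    As an illustrative sketch (the column and key names below are assumptions; check the CSV headers before use), the annotations can be joined with the species table like this:

    import pandas as pd

    annotations = pd.read_csv("annotations.csv")
    species = pd.read_csv("species.csv")

    # Attach scientific and common names to each bounding box via the shared eBird species code.
    merged = annotations.merge(species, on="species_code", how="left")
    print(merged.head())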

    Acknowledgements

    Compiling this extensive dataset was a major undertaking, and we are very thankful to the domain experts who helped to collect and manually annotate the data for this collection (individual contributors in alphabetic order): Jessie Barry, Sarah Dzielski, Cullen Hanks, Robert Koch, Jim Lowe, Jay McGowan, Ashik Rahaman, Yu Shiu, Laurel Symes, and Matt Young.

  20. Converter Dataset

    • universe.roboflow.com
    zip
    Updated May 2, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    label (2024). Converter Dataset [Dataset]. https://universe.roboflow.com/label-a6heb/converter-iolcu/dataset/1
    Explore at:
    zipAvailable download formats
    Dataset updated
    May 2, 2024
    Dataset authored and provided by
    label
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Label Bounding Boxes
    Description

    Converter

    ## Overview
    
    Converter is a dataset for object detection tasks - it contains Label annotations for 400 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    