3 datasets found
  1. stage1-Cnn14-16k-valid2-weights

    • kaggle.com
    Updated Oct 13, 2020
    Cite
    Gopi Durgaprasad (2020). stage1-Cnn14-16k-valid2-weights [Dataset]. https://www.kaggle.com/gopidurgaprasad/stage1-cnn14-16k-valid2-weights/discussion
    Dataset updated
    Oct 13, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Gopi Durgaprasad
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Gopi Durgaprasad

    Released under CC0: Public Domain

  2. Data from: DCASE 2024 Challenge Task 7 Development Dataset : Environmental...

    • explore.openaire.eu
    • zenodo.org
    Updated Mar 25, 2024
    Cite
    Keunwoo Choi; Laurie M. Heller; Keisuke Imoto; Mathieu Lagrange; Junwon Lee; Brian McFee; Yuki Okamoto; Modan Tailleur (2024). DCASE 2024 Challenge Task 7 Development Dataset : Environmental Sound Scene Synthesis [Dataset]. http://doi.org/10.5281/zenodo.10869643
    Dataset updated
    Mar 25, 2024
    Authors
    Keunwoo Choi; Laurie M. Heller; Keisuke Imoto; Mathieu Lagrange; Junwon Lee; Brian McFee; Yuki Okamoto; Modan Tailleur
    Description

    This dataset comprises embeddings and captions used as the development dataset for DCASE 2024 Challenge Task 7, 'Environmental Sound Scene Synthesis'. The embeddings are derived from 60 different 4-second audio files (mono, 32-bit, 32 kHz) and are contained in the 'embeddings.tar.xz' file. The caption for each audio file can be found in 'caption.csv'. The dataset contains only the embeddings, not the audio files.

    Three types of embeddings are provided: VGGish (vggish), MS-CLAP (clap-2023), and PANNs CNN14 Wavegram-Logmel (panns-wavegram-logmel). Only the PANNs CNN14 Wavegram-Logmel (panns-wavegram-logmel) embeddings are used for evaluation in the challenge. For further details, please refer to the challenge website.

    Contact: Modan Tailleur (modan.tailleur@ls2n.fr), Mathieu Lagrange (mathieu.lagrange@ls2n.fr)
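As a rough illustration of how the 'embeddings.tar.xz' archive described above might be inspected, the sketch below groups archive members by their first path component. The internal layout of the archive is not stated in the description, so the idea that each embedding type (vggish, clap-2023, panns-wavegram-logmel) has its own top-level folder is an assumption, and `group_members_by_top_dir` is a hypothetical helper name.

```python
import tarfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_members_by_top_dir(archive_path):
    """Group regular files in a .tar.xz archive by their first path component.

    Assumption: one top-level folder per embedding type. The real archive
    layout may differ, in which case the keys will simply reflect it.
    """
    groups = defaultdict(list)
    # "r:xz" opens an LZMA-compressed tar archive for reading.
    with tarfile.open(archive_path, mode="r:xz") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories, links, etc.
            parts = PurePosixPath(member.name).parts
            # Files stored at the archive root are grouped under "".
            top = parts[0] if len(parts) > 1 else ""
            groups[top].append(member.name)
    return dict(groups)
```

Calling `group_members_by_top_dir("embeddings.tar.xz")` and printing the keys with their member counts is a quick way to check whether the three embedding types are actually separated this way before writing any loading code.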

  3. Soundscape Datasets for Few-Shot Bird Sound Classification

    • zenodo.org
    Updated Oct 29, 2024
    Cite
    Moummad Ilyass; Moummad Ilyass (2024). Soundscape Datasets for Few-Shot Bird Sound Classification [Dataset]. http://doi.org/10.5281/zenodo.13994373
    Available download formats: zip
    Dataset updated
    Oct 29, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Moummad Ilyass; Moummad Ilyass
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This repository provides easy access to open-source soundscape datasets of bird sounds, specifically optimized for few-shot classification.

    soundscapes.zip contains the evaluation soundscape datasets from the BIRB benchmark (https://arxiv.org/abs/2312.07439). The recordings are downsampled to 16 kHz, preprocessed with CNN14 from PANNs (https://arxiv.org/abs/1912.10211) to select a 6-second window with the highest bird activation, and converted to PyTorch (.pt) format to simplify evaluating deep neural networks.
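The description does not spell out how the 6-second window is chosen, so the following is only a sketch of one plausible approach: given a sequence of per-frame bird-activation scores (for example, from a sound event detection model), slide a fixed-length window over it and keep the start position with the highest total score. The function name, the frame-rate parameter, and the use of a summed score are all assumptions made for illustration.

```python
def best_window_start(frame_scores, frames_per_second, window_seconds=6.0):
    """Return the start frame of the window with the highest total score.

    frame_scores: per-frame activation values (assumed input format).
    frames_per_second: how many score frames cover one second of audio.
    """
    window = int(round(window_seconds * frames_per_second))
    if window <= 0:
        raise ValueError("window must cover at least one frame")
    if len(frame_scores) <= window:
        return 0  # recording is shorter than the window: keep it whole

    # Rolling sum: add the frame entering the window, drop the one leaving.
    current = sum(frame_scores[:window])
    best_sum, best_start = current, 0
    for start in range(1, len(frame_scores) - window + 1):
        current += frame_scores[start + window - 1] - frame_scores[start - 1]
        if current > best_sum:
            best_sum, best_start = current, start
    return best_start
```

At a 16 kHz sample rate, the returned frame index would map to a sample offset of `int(best_start / frames_per_second * 16000)`. Ties are resolved in favour of the earliest window.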

    These preprocessed datasets are employed in the work "Domain-Invariant Representation Learning of Bird Sounds" (https://arxiv.org/abs/2409.08589), which evaluates the few-shot learning capabilities of deep learning models trained on focal recordings (e.g., Xeno-Canto) and tested on soundscape recordings.

    Dataset Structure

    Validation Dataset

    • POW (pow.pt): The validation dataset consists of 16,047 examples across 43 classes and is organized as a dictionary with 'data' and 'label' keys representing bird sounds and their corresponding labels. Storing the entire validation dataset in a single tensor enables rapid loading and efficient processing, significantly accelerating the validation process. Classes with only one example are removed, as they are insufficient for one-shot classification tasks. Source: https://zenodo.org/records/4656848#.Y7ijhOxudhE
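The removal of single-example classes mentioned above can be sketched with plain Python lists. This is not the authors' code, only a minimal illustration of the filtering rule; `drop_singleton_classes` is a hypothetical name, and the real data are stored as tensors under the 'data' and 'label' keys rather than as lists.

```python
from collections import Counter


def drop_singleton_classes(examples, labels):
    """Keep only examples whose class occurs more than once.

    A class with a single example cannot provide both a support example
    and a query example, so it is unusable for one-shot evaluation.
    """
    counts = Counter(labels)
    kept = [(x, y) for x, y in zip(examples, labels) if counts[y] > 1]
    kept_examples = [x for x, _ in kept]
    kept_labels = [y for _, y in kept]
    return kept_examples, kept_labels
```

For instance, with labels `[0, 1, 0, 2]`, classes 1 and 2 each have one example and are dropped, leaving only the two examples of class 0.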

    Test Datasets

    Each test dataset is structured with multiple subfolders, each labeled with an eBird species code to represent data for a specific bird species.
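A folder-per-species layout like the one described above can be indexed with the standard library alone. The sketch below is an assumption-laden illustration, not the loader from the linked repository: it treats every subfolder name as a species code, assigns integer class indices in alphabetical order, and makes no assumption about file extensions inside each folder.

```python
from pathlib import Path


def index_species_folders(root):
    """Build (file_path, class_index) pairs from a folder-per-species layout.

    Assumption: each direct subfolder of `root` is named after one species
    code and contains that species' files directly inside it.
    """
    root = Path(root)
    species = sorted(p.name for p in root.iterdir() if p.is_dir())
    code_to_index = {code: i for i, code in enumerate(species)}

    items = []
    for code in species:
        for path in sorted((root / code).iterdir()):
            if path.is_file():
                items.append((path, code_to_index[code]))
    return items, code_to_index
```

The returned `code_to_index` mapping makes it easy to translate predictions back to species codes, and the sorted ordering keeps class indices stable across runs.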

    Code and detailed instructions, including data loading, model implementation, and few-shot evaluation, can be found at: https://github.com/ilyassmoummad/ProtoCLR
