9 datasets found
  1. Urban Sound 8k (Tabular)

    • kaggle.com
    zip
    Updated Feb 23, 2025
    Cite
    Orvile (2025). Urban Sound 8k (Tabular) [Dataset]. https://www.kaggle.com/datasets/orvile/urban-sound-8k-tabular-form
    Explore at:
    Available download formats: zip (1772628 bytes)
    Dataset updated
    Feb 23, 2025
    Authors
    Orvile
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Urban Sound 8K Feature Extracted Dataset 🎵📊

    A Processed Version of the UrbanSound8K Dataset for Machine Learning Applications

    📌 Overview

    This dataset is derived from the UrbanSound8K dataset, which contains 8,732 labeled sound clips across 10 urban sound classes. The original dataset has been processed to extract key audio features that are commonly used in machine learning and deep learning applications for sound classification.

    The original dataset contains 8,732 clips; this version contains 8,674 rows, so 58 samples were lost during processing.

    🔍 Extracted Features

    The dataset has 35 columns: 34 extracted audio features plus the target variable, the class column, which contains the sound category.

    • ✔️ MFCC (Mel-Frequency Cepstral Coefficients) – Captures timbral features of the sound
    • ✔️ Chroma Features – Represents harmonic content
    • ✔️ Spectral Contrast – Measures the difference between peaks and valleys in the spectrum
    • ✔️ Zero Crossing Rate (ZCR) – Counts the rate of sign changes in the waveform
    • ✔️ Spectral Centroid – Indicates the center of mass of the spectrum
    • ✔️ Class Label – The urban sound category (Target Variable)
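
    As a rough illustration of two of the features listed above, the sketch below computes a zero crossing rate and a spectral centroid in plain Python. It is not the code used to build this dataset: the function names are hypothetical, and the spectrum is assumed to be given as matching lists of bin frequencies and magnitudes.

```python
def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def spectral_centroid(frequencies, magnitudes):
    """Magnitude-weighted mean frequency (the spectrum's center of mass)."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(frequencies, magnitudes)) / total


# Toy inputs, not taken from the dataset.
signal = [1.0, -1.0, 1.0, -1.0, 1.0]
print(zero_crossing_rate(signal))
print(spectral_centroid([100.0, 200.0, 300.0], [1.0, 2.0, 1.0]))
```

    Libraries such as librosa compute these per frame and the results are usually averaged per clip; whether this dataset does so is not stated in the description.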

    🔊 Classes & Labels

    ClassID  Class Name
    0        Air Conditioner
    1        Car Horn
    2        Children Playing
    3        Dog Bark
    4        Drilling
    5        Engine Idling
    6        Gun Shot
    7        Jackhammer
    8        Siren
    9        Street Music

    📊 Dataset Structure

    • Total Columns: 35
    • Feature Columns: 34 extracted audio features
    • Target Column: class (categorical)
    • Original Data Source: UrbanSound8K
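
    Given that structure, separating the 34 feature columns from the class target might look like the sketch below. Only the "class" column name comes from the description; the feature column names, the sample values, and the helper name are made up for illustration, and the real file may store the class as an ID rather than a name.

```python
import csv
import io

# Stand-in for the real CSV file. Only the "class" column name is taken from
# the dataset description; the feature names and values are hypothetical.
SAMPLE_CSV = """mfcc_1,zcr,class
-210.5,0.08,dog_bark
-180.2,0.15,siren
"""


def split_features_target(csv_text, target="class"):
    """Return (feature_names, X, y) from CSV text that has a header row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return [], [], []
    feature_names = [name for name in rows[0] if name != target]
    X = [[float(row[name]) for name in feature_names] for row in rows]
    y = [row[target] for row in rows]
    return feature_names, X, y


names, X, y = split_features_target(SAMPLE_CSV)
print(names)
print(X)
print(y)
```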

    🚀 Use Cases

    • ✔️ Deep Learning & AI-based Audio Classification
    • ✔️ Urban Noise Detection & Sound Event Recognition
    • ✔️ Music & Environmental Sound Analysis
    • ✔️ Audio Feature Engineering for Machine Learning

    ⚖️ License & Citation

    This dataset is a processed version of UrbanSound8K. If you use it, please cite the original dataset:

    📄 J. Salamon, C. Jacoby, and J. P. Bello, "A Dataset and Taxonomy for Urban Sound Research," 22nd ACM International Conference on Multimedia, Orlando, USA, Nov. 2014.

    🔗 More Info: UrbanSound8K Website

  2. UrbanSound8K

    • kaggle.com
    • opendatalab.com
    zip
    Updated Feb 4, 2020
    + more versions
    Cite
    Chris Gorgolewski (2020). UrbanSound8K [Dataset]. https://www.kaggle.com/datasets/chrisfilo/urbansound8k/code
    Explore at:
    Available download formats: zip (6026232524 bytes)
    Dataset updated
    Feb 4, 2020
    Authors
    Chris Gorgolewski
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    This dataset contains 8732 labeled sound excerpts (<=4s) of urban sounds from 10 classes: air_conditioner, car_horn, children_playing, dog_bark, drilling, engine_idling, gun_shot, jackhammer, siren, and street_music. The classes are drawn from the urban sound taxonomy. For a detailed description of the dataset and how it was compiled please refer to our paper. All excerpts are taken from field recordings uploaded to www.freesound.org. The files are pre-sorted into ten folds (folders named fold1-fold10) to help in the reproduction of and comparison with the automatic classification results reported in the article above.

    In addition to the sound excerpts, a CSV file containing metadata about each excerpt is also provided.

    AUDIO FILES INCLUDED

    8732 audio files of urban sounds (see description above) in WAV format. The sampling rate, bit depth, and number of channels are the same as those of the original file uploaded to Freesound (and hence may vary from file to file).
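
    Because these properties vary from file to file, it can help to inspect each file before feature extraction. The sketch below reads them with Python's standard wave module, which handles uncompressed PCM WAV only; files in other encodings may raise wave.Error. The helper names are hypothetical, and a tiny in-memory file stands in for a real clip.

```python
import io
import wave


def wav_properties(source):
    """Return sample rate, bit depth and channel count of a PCM WAV file.

    `source` may be a path string or a binary file object.
    """
    with wave.open(source, "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "bit_depth": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
        }


def make_silent_wav(sample_rate, sample_width, channels, frames=10):
    """Build a tiny in-memory WAV so the example runs without the dataset."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00" * frames * sample_width * channels)
    buffer.seek(0)
    return buffer


print(wav_properties(make_silent_wav(44100, 2, 2)))
print(wav_properties(make_silent_wav(8000, 1, 1)))
```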

    META-DATA FILES INCLUDED

    UrbanSound8k.csv
    
    

    This file contains meta-data information about every audio file in the dataset. This includes:

    • slice_file_name: The name of the audio file. The name takes the following format: [fsID]-[classID]-[occurrenceID]-[sliceID].wav, where:
      [fsID] = the Freesound ID of the recording from which this excerpt (slice) is taken
      [classID] = a numeric identifier of the sound class (see description of classID below for further details)
      [occurrenceID] = a numeric identifier to distinguish different occurrences of the sound within the original recording
      [sliceID] = a numeric identifier to distinguish different slices taken from the same occurrence

    • fsID: The Freesound ID of the recording from which this excerpt (slice) is taken

    • start: The start time of the slice in the original Freesound recording

    • end: The end time of the slice in the original Freesound recording

    • salience: A (subjective) salience rating of the sound. 1 = foreground, 2 = background.

    • fold: The fold number (1-10) to which this file has been allocated.

    • classID: A numeric identifier of the sound class:
      0 = air_conditioner
      1 = car_horn
      2 = children_playing
      3 = dog_bark
      4 = drilling
      5 = engine_idling
      6 = gun_shot
      7 = jackhammer
      8 = siren
      9 = street_music

    • class: The class name: air_conditioner, car_horn, children_playing, dog_bark, drilling, engine_idling, gun_shot, jackhammer, siren, street_music.
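
    The file name format and the classID mapping described above can be decoded without the CSV. The sketch below is one minimal way to do it; the function name and the example file name are illustrative assumptions.

```python
# classID -> class name, as listed in the metadata description.
CLASS_NAMES = {
    0: "air_conditioner",
    1: "car_horn",
    2: "children_playing",
    3: "dog_bark",
    4: "drilling",
    5: "engine_idling",
    6: "gun_shot",
    7: "jackhammer",
    8: "siren",
    9: "street_music",
}


def parse_slice_file_name(name):
    """Split '[fsID]-[classID]-[occurrenceID]-[sliceID].wav' into its fields."""
    stem = name[:-4] if name.lower().endswith(".wav") else name
    parts = stem.split("-")
    if len(parts) != 4:
        raise ValueError(f"unexpected file name format: {name!r}")
    fs_id, class_id, occurrence_id, slice_id = (int(p) for p in parts)
    return {
        "fsID": fs_id,
        "classID": class_id,
        "class": CLASS_NAMES[class_id],
        "occurrenceID": occurrence_id,
        "sliceID": slice_id,
    }


# Illustrative file name following the documented pattern.
print(parse_slice_file_name("100032-3-0-0.wav"))
```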

    BEFORE YOU DOWNLOAD: AVOID COMMON PITFALLS!

    Since releasing the dataset we have noticed a couple of common mistakes that could invalidate your results, potentially leading to manuscripts being rejected or the publication of incorrect results. To avoid this, please read the following carefully:

    1. Don't reshuffle the data! Use the predefined 10 folds and perform 10-fold (not 5-fold) cross validation.
       The experiments conducted by the vast majority of publications using UrbanSound8K (by ourselves and others) evaluate classification models via 10-fold cross validation using the predefined splits. We strongly recommend following this procedure.

    Why? If you reshuffle the data (e.g. combine the data from all folds and generate a random train/test split) you will be incorrectly placing related samples in both the train and test sets, leading to inflated scores that don't represent your model's performance on unseen data. Put simply, your results will be wrong. Your results will NOT be comparable to previous results in the literature, meaning any claims to an improvement on previous research will be invalid. Even if you don't reshuffle the data, evaluating using different splits (e.g. 5-fold cross validation) will mean your results are not comparable to previous research.

    2. Don't evaluate just on one split! Use 10-fold (not 5-fold) cross validation and average the scores.
       We have seen reports that only provide results for a single train/test split, e.g. train on folds 1-9, test on fold 10 and report a single accuracy score. We strongly advise against this. Instead, perform 10-fold cross validation using the provided folds and report the average score.

    Why? Not all the splits are as "easy". That is, models tend to obtain much higher scores when trained on folds 1-9 and tested on fold 10, compared to (e.g.) training on folds 2-10 and testing on fold 1. For this reason, it is important to evaluate your model on each of the 10 splits and report the average accuracy. Again, your results will NOT be comparable to previous results in the literature.
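
    The recommended procedure can be sketched as follows: hold out each predefined fold in turn, train on the other nine, and average the ten scores. This is a minimal sketch, not the authors' evaluation code. The function names are hypothetical, `folds` is assumed to hold the fold column of the metadata CSV (one value from 1 to 10 per clip), and `dummy_evaluate` is a placeholder for real training and scoring.

```python
from statistics import mean


def predefined_fold_splits(folds, n_folds=10):
    """Yield (test_fold, train_indices, test_indices) for each predefined fold."""
    for test_fold in range(1, n_folds + 1):
        train_idx = [i for i, f in enumerate(folds) if f != test_fold]
        test_idx = [i for i, f in enumerate(folds) if f == test_fold]
        yield test_fold, train_idx, test_idx


def cross_validate(folds, evaluate, n_folds=10):
    """Average the score from evaluate(train_idx, test_idx) over all folds."""
    scores = [
        evaluate(train_idx, test_idx)
        for _, train_idx, test_idx in predefined_fold_splits(folds, n_folds)
    ]
    return mean(scores), scores


# Toy fold assignment: 20 clips, two per fold. Real values come from the CSV.
toy_folds = [(i % 10) + 1 for i in range(20)]


def dummy_evaluate(train_idx, test_idx):
    # Placeholder: a real version would fit a model on train_idx and
    # return its accuracy on test_idx.
    return len(train_idx) / (len(train_idx) + len(test_idx))


average, per_fold = cross_validate(toy_folds, dummy_evaluate)
print(len(per_fold), average)
```

    The fold assignment is never shuffled or regenerated here; the splits are taken as given, which is the point of the advice above.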

    Acknowledgements

    We kindly request that articles and other works in which this dataset is used cite the following paper:

    J. Salamon, C. Jacoby and J. P. Bello, "A Dataset and Taxonomy for Urban Sound Research", 22nd ACM International Conference on Multimedia, Orlando USA, Nov. 2014.

    More information at https://urbansounddataset.weebly.com/urbansound8k.html

  3. UrbanSound8K

    • zenodo.org
    application/gzip
    Updated Jan 24, 2020
    + more versions
    Cite
    Justin Salamon; Christopher Jacoby; Juan Pablo Bello (2020). UrbanSound8K [Dataset]. http://doi.org/10.5281/zenodo.1203745
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Justin Salamon; Christopher Jacoby; Juan Pablo Bello
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    Please visit urbansounddataset.weebly.com for a detailed description.

    We kindly request that articles and other works in which this dataset is used cite the following paper:

    J. Salamon, C. Jacoby and J. P. Bello, "A Dataset and Taxonomy for Urban Sound Research", 22nd ACM International Conference on Multimedia, Orlando USA, Nov. 2014.

  4. UrbanSound8K Images

    • kaggle.com
    zip
    Updated Jun 6, 2021
    Cite
    Gokul Rejithkumar (2021). UrbanSound8K Images [Dataset]. https://www.kaggle.com/datasets/gokulrejith/urban-sound-8k-images
    Explore at:
    Available download formats: zip (226944585 bytes)
    Dataset updated
    Jun 6, 2021
    Authors
    Gokul Rejithkumar
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    This dataset contains mel spectrogram images of 8732 labeled sound excerpts (<=4s) of urban sounds from 10 classes: air_conditioner, car_horn, children_playing, dog_bark, drilling, engine_idling, gun_shot, jackhammer, siren, and street_music. The classes are drawn from the urban sound taxonomy. All excerpts are taken from field recordings uploaded to www.freesound.org. The images correspond to the sounds in the UrbanSound8K dataset.

  5. urban sound converted

    • kaggle.com
    zip
    Updated Nov 7, 2021
    Cite
    Kyrylo Chorniy (2021). urban sound converted [Dataset]. https://www.kaggle.com/kirillchorniy/urban-sound-8k-with-extracted-features
    Explore at:
    Available download formats: zip (5471736 bytes)
    Dataset updated
    Nov 7, 2021
    Authors
    Kyrylo Chorniy
    License

    GNU General Public License v2.0 (http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html)

    Description

    UrbanSound8K converted to features using the mel-frequency cepstrum. Original dataset: https://www.kaggle.com/chrisfilo/urbansound8k. All credit goes to the original authors.

  6. urbansounds_melspectrograms

    • huggingface.co
    Updated Jul 8, 2025
    Cite
    Ethan Edwards (2025). urbansounds_melspectrograms [Dataset]. https://huggingface.co/datasets/EthanGLEdwards/urbansounds_melspectrograms
    Explore at:
    Dataset updated
    Jul 8, 2025
    Authors
    Ethan Edwards
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Processing Details

    The audio files from the original dataset have been converted to log-mel spectrograms using the following parameters:

    • Mel frequency bins: 128 (n_mels=128)
    • Power to dB conversion: Using librosa.power_to_db() with reference to maximum power
    • Library used: librosa for audio processing

    Processing Pipeline

    • Load audio data from the original Urban Sounds 8K dataset
    • Extract mel spectrograms using librosa.feature.melspectrogram()
    • Convert to log scale using…

    See the full description on the dataset page: https://huggingface.co/datasets/EthanGLEdwards/urbansounds_melspectrograms.
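
    The "power to dB with reference to maximum power" step can be illustrated without librosa. The sketch below is a simplified pure-Python stand-in, not librosa's implementation and not this dataset's actual code: the function name is hypothetical, and the `amin` and `top_db` values are assumptions chosen to resemble librosa's defaults, so check them against the librosa documentation before relying on them.

```python
import math


def power_to_db_max_ref(power_spec, amin=1e-10, top_db=80.0):
    """Convert a 2-D power spectrogram (list of rows) to dB relative to its peak.

    Values are floored at `amin` before the logarithm, and the result is
    clipped so that nothing lies more than `top_db` below the maximum.
    """
    ref = max(max(row) for row in power_spec)
    ref_db = 10.0 * math.log10(max(amin, ref))
    db = [
        [10.0 * math.log10(max(amin, value)) - ref_db for value in row]
        for row in power_spec
    ]
    floor = max(max(row) for row in db) - top_db
    return [[max(value, floor) for value in row] for row in db]


# Toy 2x2 "spectrogram": the peak maps to 0 dB, a tenth of it to about -10 dB.
toy = [[1.0, 0.1], [0.01, 0.0]]
for row in power_to_db_max_ref(toy):
    print([round(value, 2) for value in row])
```

    With the peak as reference, every clip's spectrogram tops out at 0 dB, which removes absolute loudness differences between recordings.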

  7. Segregated-urban8K-sounds

    • kaggle.com
    zip
    Updated Mar 31, 2021
    Cite
    Raghav Rawat (2021). Segregated-urban8K-sounds [Dataset]. https://www.kaggle.com/raghavrawat/segregatedurban8ksounds
    Explore at:
    Available download formats: zip (6026697381 bytes)
    Dataset updated
    Mar 31, 2021
    Authors
    Raghav Rawat
    Description

    Dataset

    This dataset was created by Raghav Rawat


  8. IDMT-FL Dataset

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Nov 24, 2023
    Cite
    David S. Johnson; Wolfgang Lorenz; Michael Taenzer; Stylianos Mimilakis; Sascha Grollmisch; Jakob Abeßer; Hanna Lukashevich (2023). IDMT-FL Dataset [Dataset]. http://doi.org/10.5281/zenodo.7551584
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 24, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    David S. Johnson; Wolfgang Lorenz; Michael Taenzer; Stylianos Mimilakis; Sascha Grollmisch; Jakob Abeßer; Hanna Lukashevich
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0/)
    License information was derived automatically

    Description

    The IDMT-DESED-FL and IDMT-URBAN-FL datasets enable research in sound event detection (SED) within a federated learning (FL) context. IDMT-DESED-FL and IDMT-URBAN-FL consist of sound events sourced from the well-known DESED and URBAN-8K datasets. Each source dataset contains sound events from ten classes for the use cases of SED in domestic and urban environments, respectively. To simulate an FL scenario, the source events are mixed with background noise to generate 30,000 ten-second soundscapes, which are partitioned across 100 edge devices. Each soundscape is generated by mixing up to five sound events (possibly overlapping) with background noise. Both datasets contain independent and identically distributed (IID) and non-IID versions, to provide a more realistic distribution of event classes.

    • IDMT-DESED-FL sound event classes include alarm/bell/ringing, blender, cat, dog, dishes, electric shaver/toothbrush, frying, running water, speech, and vacuum cleaner. The background classes include apartment room, computer interior, computer lab, emergency staircase, and library.
    • IDMT-URBAN-FL sound event classes include air conditioner, car horn, children playing, dog bark, drilling, engine idling, gun shot, jackhammer, siren, and street music. Background classes for IDMT-URBAN-FL are sourced from the Isolated Urban Sound Database (IUSB), and include birds, crowd, fountain, rain, and traffic.
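
    To make the partitioning concrete, the sketch below deals 30,000 soundscape indices across 100 simulated edge devices in an IID fashion. It is only an illustration: the function name, the seed, and the even 300-per-device split are assumptions, the dataset's own generation scripts define the actual scheme, and the non-IID variant is not covered here.

```python
import random


def partition_iid(num_items, num_devices, seed=0):
    """Shuffle item indices and deal them round-robin across devices."""
    indices = list(range(num_items))
    random.Random(seed).shuffle(indices)
    return [indices[d::num_devices] for d in range(num_devices)]


shards = partition_iid(30_000, 100)
print(len(shards), len(shards[0]))
```

    Because the indices are shuffled before dealing, each device receives a random sample of the whole pool, which is what makes the split IID with respect to event classes.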

    Due to the size of the datasets, with this download we provide the scripts and details necessary to generate the FL datasets using the source material from DESED, URBAN-8K, and IUSB.

    See the above referenced paper and README contained with the data folder for further details.

  9. urban_sound_8k

    • kaggle.com
    zip
    Updated Jun 25, 2022
    + more versions
    Cite
    Tariq Blecher (2022). urban_sound_8k [Dataset]. https://www.kaggle.com/datasets/tariqblecher/urban-sound-8k
    Explore at:
    Available download formats: zip (396020145 bytes)
    Dataset updated
    Jun 25, 2022
    Authors
    Tariq Blecher
    Description

    TensorFlow could not read the original wave files, so I rewrote them locally and re-uploaded them. For my own purposes (https://www.kaggle.com/tariqblecher/speech-enhancement-tensorflow-unet-softmask) I rewrote the data at a sampling rate of 8 kHz.
