53 datasets found
  1. audiomentations

    • kaggle.com
    zip
    Updated Jan 10, 2023
    Cite
    HyeongChan Kim (2023). audiomentations [Dataset]. https://www.kaggle.com/kozistr/audiomentations
    Explore at:
    zip (35911 bytes). Available download formats
    Dataset updated
    Jan 10, 2023
    Authors
    HyeongChan Kim
    Description

    Audiomentations

    A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

    Official: https://github.com/iver56/audiomentations
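
    For context, here is a minimal usage sketch in the style of the library's README (the transform names below exist in audiomentations; exact parameter defaults may vary across versions):

    import numpy as np
    from audiomentations import AddGaussianNoise, Compose, PitchShift, Shift, TimeStretch

    # Chain of randomized transforms; each is applied with probability p.
    augment = Compose([
        AddGaussianNoise(min_amplitude=0.001, max_amplitude=0.015, p=0.5),
        TimeStretch(min_rate=0.8, max_rate=1.25, p=0.5),
        PitchShift(min_semitones=-4, max_semitones=4, p=0.5),
        Shift(p=0.5),
    ])

    # Placeholder mono signal standing in for real audio (2 s at 16 kHz).
    samples = np.random.uniform(low=-0.2, high=0.2, size=32000).astype(np.float32)
    augmented = augment(samples=samples, sample_rate=16000)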

  2. A literature review of data augmentation techniques for audio...

    • plos.figshare.com
    xls
    Updated Jun 16, 2023
    Cite
    Jane Saldanha; Shaunak Chakraborty; Shruti Patil; Ketan Kotecha; Satish Kumar; Anand Nayyar (2023). A literature review of data augmentation techniques for audio classification. [Dataset]. http://doi.org/10.1371/journal.pone.0266467.t002
    Explore at:
    xls. Available download formats
    Dataset updated
    Jun 16, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jane Saldanha; Shaunak Chakraborty; Shruti Patil; Ketan Kotecha; Satish Kumar; Anand Nayyar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A literature review of data augmentation techniques for audio classification.

  3. Data Augmentation Tools Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 23, 2025
    Cite
    Growth Market Reports (2025). Data Augmentation Tools Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/data-augmentation-tools-market
    Explore at:
    csv, pdf, pptx. Available download formats
    Dataset updated
    Aug 23, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Augmentation Tools Market Outlook



    As per our latest research, the global Data Augmentation Tools market size reached USD 1.47 billion in 2024, reflecting the rapidly increasing adoption of artificial intelligence and machine learning across diverse sectors. The market is experiencing robust momentum, registering a CAGR of 25.3% from 2025 to 2033. By the end of 2033, the Data Augmentation Tools market is forecasted to reach a substantial value of USD 11.6 billion. This impressive growth is primarily driven by the escalating need for high-quality, diverse datasets to train advanced AI models, coupled with the proliferation of digital transformation initiatives across industries.




    The primary growth factor fueling the Data Augmentation Tools market is the exponential rise in AI and machine learning applications, which require vast amounts of labeled data for effective training. As organizations strive to develop more accurate and robust models, the demand for data augmentation solutions that can synthetically expand and diversify datasets has surged. This trend is particularly pronounced in sectors such as healthcare, automotive, and retail, where the quality and quantity of data directly impact the performance and reliability of AI systems. The market is further propelled by the increasing complexity of data types, including images, text, audio, and video, necessitating sophisticated augmentation tools capable of handling multimodal data.




    Another significant driver is the growing focus on reducing model bias and improving generalization capabilities. Data augmentation tools enable organizations to generate synthetic samples that account for various real-world scenarios, thereby minimizing overfitting and enhancing the robustness of AI models. This capability is critical in regulated industries like BFSI and healthcare, where the consequences of biased or inaccurate models can be severe. Furthermore, the rise of edge computing and IoT devices has expanded the scope of data augmentation, as organizations seek to deploy AI solutions in resource-constrained environments that require optimized and diverse training datasets.




    The proliferation of cloud-based solutions has also played a pivotal role in shaping the trajectory of the Data Augmentation Tools market. Cloud deployment offers scalability, flexibility, and cost-effectiveness, allowing organizations of all sizes to access advanced augmentation capabilities without significant infrastructure investments. Additionally, the integration of data augmentation tools with popular machine learning frameworks and platforms has streamlined adoption, enabling seamless workflow integration and accelerating time-to-market for AI-driven products and services. These factors collectively contribute to the sustained growth and dynamism of the global Data Augmentation Tools market.




    From a regional perspective, North America currently dominates the Data Augmentation Tools market, accounting for the largest revenue share in 2024, followed closely by Europe and Asia Pacific. The strong presence of leading technology companies, robust investment in AI research, and early adoption of digital transformation initiatives have established North America as a key hub for data augmentation innovation. Meanwhile, Asia Pacific is poised for the fastest growth over the forecast period, driven by the rapid expansion of the IT and telecommunications sector, burgeoning e-commerce industry, and increasing government initiatives to promote AI adoption. Europe also maintains a significant market presence, supported by stringent data privacy regulations and a strong focus on ethical AI development.





    Component Analysis



    The Component segment of the Data Augmentation Tools market is bifurcated into Software and Services, each playing a critical role in enabling organizations to leverage data augmentation for AI and machine learning initiatives. The software sub-segment comprises

  4. Data Augmentation Tools Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Oct 1, 2025
    Cite
    Dataintelo (2025). Data Augmentation Tools Market Research Report 2033 [Dataset]. https://dataintelo.com/report/data-augmentation-tools-market
    Explore at:
    pdf, pptx, csv. Available download formats
    Dataset updated
    Oct 1, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Augmentation Tools Market Outlook



    According to our latest research, the global Data Augmentation Tools market size reached USD 1.62 billion in 2024, with a robust year-on-year growth trajectory. The market is poised for accelerated expansion, projected to achieve a CAGR of 26.4% from 2025 to 2033. By the end of 2033, the market is forecasted to reach approximately USD 12.34 billion. This dynamic growth is primarily driven by the rising demand for artificial intelligence (AI) and machine learning (ML) applications across diverse industry verticals, which necessitate vast quantities of high-quality training data. The proliferation of data-centric AI models and the increasing complexity of real-world datasets are compelling enterprises to invest in advanced data augmentation tools to enhance data diversity and model robustness, as per the latest research insights.




    One of the principal growth factors fueling the Data Augmentation Tools market is the intensifying adoption of AI-driven solutions across industries such as healthcare, automotive, retail, and finance. Organizations are increasingly leveraging data augmentation to overcome the challenges posed by limited or imbalanced datasets, which are often a bottleneck in developing accurate and reliable AI models. By synthetically expanding training datasets through augmentation techniques, enterprises can significantly improve the generalization capabilities of their models, leading to enhanced performance and reduced risk of overfitting. Furthermore, the surge in computer vision, natural language processing, and speech recognition applications is creating a fertile environment for the adoption of specialized augmentation tools tailored to image, text, and audio data.




    Another significant factor contributing to market growth is the rapid evolution of augmentation technologies themselves. Innovations such as Generative Adversarial Networks (GANs), automated data labeling, and domain-specific augmentation pipelines are making it easier for organizations to deploy and scale data augmentation strategies. These advancements are not only reducing the manual effort and expertise required but also enabling the generation of highly realistic synthetic data that closely mimics real-world scenarios. As a result, businesses across sectors are able to accelerate their AI/ML development cycles, reduce costs associated with data collection and labeling, and maintain compliance with stringent data privacy regulations by minimizing the need to use sensitive real-world data.




    The growing integration of data augmentation tools within cloud-based AI development platforms is also acting as a major catalyst for market expansion. Cloud deployment offers unparalleled scalability, accessibility, and collaboration capabilities, allowing organizations of all sizes to harness the power of data augmentation without significant upfront infrastructure investments. This democratization of advanced data engineering tools is especially beneficial for small and medium enterprises (SMEs) and academic research institutes, which often face resource constraints. The proliferation of cloud-native augmentation solutions is further supported by strategic partnerships between technology vendors and cloud service providers, driving broader market penetration and innovation.




    From a regional perspective, North America continues to dominate the Data Augmentation Tools market, driven by the presence of leading AI technology companies, a mature digital infrastructure, and substantial investments in research and development. However, the Asia Pacific region is emerging as the fastest-growing market, fueled by rapid digital transformation initiatives, a burgeoning startup ecosystem, and increasing government support for AI innovation. Europe also holds a significant share, underpinned by strong regulatory frameworks and a focus on ethical AI development. Meanwhile, Latin America and the Middle East & Africa are witnessing steady adoption, particularly in sectors such as BFSI and healthcare, where data-driven insights are becoming increasingly critical.



    Component Analysis



    The Data Augmentation Tools market by component is bifurcated into Software and Services. The software segment currently accounts for the largest share of the market, owing to the widespread deployment of standalone and integrated augmentation solutions across enterprises and research institutions. These software plat

  5. Audio Data Preparation and Augmentation

    • kaggle.com
    zip
    Updated Aug 28, 2024
    Cite
    Subho117 (2024). Audio Data Preparation and Augmentation [Dataset]. https://www.kaggle.com/datasets/subho117/audio-data-preparation-and-augmentation
    Explore at:
    zip (109957 bytes). Available download formats
    Dataset updated
    Aug 28, 2024
    Authors
    Subho117
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Subho117

    Released under MIT


  6. Audio Piano Triads Dataset

    • data.niaid.nih.gov
    Updated Aug 18, 2021
    + more versions
    Cite
    Agustín Macaya (2021). Audio Piano Triads Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4740876
    Explore at:
    Dataset updated
    Aug 18, 2021
    Dataset provided by
    Pontificia Universidad Católica de Chile
    Authors
    Agustín Macaya
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Created by: Agustín Macaya Valladares. Date: May 5th, 2021.

    • Dataset contains 43,200 examples of piano triads in .wav format.

    Details:
    • Sample rate: 16000 Hz
    • Data type: 16-bit PCM (int16)
    • File size: 128 kB per example (5.53 GB for the complete dataset)
    • Duration: 4 seconds (3 seconds pressed, 1 second released)
    • Sound: digital piano; chords were played by a human on a velocity-sensitive piano keyboard

    • 3 octaves (2,3,4).
    • 12 base notes per octave: Cn, Df, Dn, Ef, En, Fn, Gf, Gn, Af, An, Bf, Bn. (n is natural, f is flat).
    • 4 triad types per note: major (j), minor (n), diminished (d), augmented (a). No inversions.
    • 3 volumes per triad: forte (f), mezzoforte (m), piano (p).

    • 10 original examples per combination of octave, base note, triad type, and volume (10 × 3 × 12 × 4 × 3 = 4,320 examples).

    • x10 data augmentation for each example (4,320 × 10 = 43,200 total examples).

    • Data augmentation through random temporal and amplitude shifts.

    • Metadata is in the name of the chord. For example: "piano_3_Af_d_m_45.wav" is a piano chord, (3) 3rd octave, (Af) A flat base note, (d) diminished, (m) mezzoforte, 45th example.

    Note: The audio files are stored as 16-bit PCM (int16) to reduce file size. This means the dynamic range of values in the array is -32768 to 32767, integers. To normalize the audio to the range -1 to 1, just divide by 32768.
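
    A minimal loading sketch based on the note above (the filename is the example given in the description; SciPy is an assumed dependency):

    import numpy as np
    from scipy.io import wavfile

    # Read one 4-second example: 16-bit PCM at 16000 Hz.
    sr, pcm = wavfile.read("piano_3_Af_d_m_45.wav")
    assert pcm.dtype == np.int16 and sr == 16000

    # Normalize to [-1, 1) as described: divide by 32768.
    audio = pcm.astype(np.float32) / 32768.0

    # The metadata is encoded in the filename:
    # <sound>_<octave>_<base note>_<triad type>_<volume>_<example index>
    sound, octave, base, triad, volume, idx = "piano_3_Af_d_m_45".split("_")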

  7. AugLy Facebook Research

    • kaggle.com
    zip
    Updated Dec 6, 2021
    Cite
    Mathurin Aché (2021). AugLy Facebook Research [Dataset]. https://www.kaggle.com/mathurinache/augly-facebook-research
    Explore at:
    zip (87198042 bytes). Available download formats
    Dataset updated
    Dec 6, 2021
    Authors
    Mathurin Aché
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    AugLy: A new data augmentation library to help build more robust AI models

    Content

    AugLy was developed by Facebook researchers and engineers based in our Seattle and Paris offices. It has four sub-libraries, each corresponding to a different modality. Each library follows the same interface: we provide transforms in both function-based and class-based formats, and we provide intensity functions that help you understand how intense a transformation is (based on the given parameters). AugLy can also generate useful metadata to help you understand how your data was transformed.

    Acknowledgements

    Code comes from https://github.com/facebookresearch/AugLy

    Inspiration

    See the notebooks to see AugLy in practice.
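
    As an illustration of the function-based format, a sketch using AugLy's audio sub-library (function names follow the project's audio README; treat the exact signatures as assumptions and verify against the repository):

    import numpy as np
    import augly.audio as audaugs

    # Placeholder one-second mono clip at 16 kHz.
    sr = 16000
    audio = np.random.uniform(-0.1, 0.1, sr).astype(np.float32)

    # Each functional transform returns (augmented_audio, sample_rate).
    audio, sr = audaugs.pitch_shift(audio, sample_rate=sr, n_steps=2.0)
    audio, sr = audaugs.time_stretch(audio, sample_rate=sr, rate=0.9)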

  8. Firearms Audio Dataset – 58 Gun Types

    • kaggle.com
    zip
    Updated Aug 30, 2025
    Cite
    ZACKY_ZAC (2025). Firearms Audio Dataset – 58 Gun Types [Dataset]. https://www.kaggle.com/zackyzac/firearms-audio-dataset-58-gun-types
    Explore at:
    zip (55975198 bytes). Available download formats
    Dataset updated
    Aug 30, 2025
    Authors
    ZACKY_ZAC
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    🔫 Firearms Audio Classification Dataset – 58 Gun Types (Standardized & Augmented)

    This dataset is a comprehensive collection of firearm audio recordings curated and standardized from multiple open-source datasets, including:

    📌 Dataset Highlights

    • 58 firearm classes (e.g., AK-47, M16, Glock, Desert Eagle, MP5, M249, etc.).
    • Each class contains 300 standardized audio files.
    • Audio standardized to:
      • 0.5 seconds per clip
      • 44.1 kHz sampling rate
      • .wav format (mono)
    • Data augmentation applied (pitch shifting, time stretching, noise injection) to balance classes.
    • Focused on peak-segment extraction (0.5s window around the loudest gunshot).

    🎯 Purpose

    The main goal is to provide a clean and balanced dataset for firearm audio classification and clustering tasks.

    Potential applications include:
    - 🔊 Gunshot sound recognition
    - 🎯 Firearm type classification
    - 📊 Acoustic clustering of firearms
    - 🛡️ Forensic audio analysis
    - 🤖 Deep learning experiments (CNNs, RNNs, Transformers on audio features)

    ⚡ Usage Notes

    • Recommended preprocessing: MFCCs, Mel-spectrograms, or wav2vec embeddings.
    • Each audio file contains a single gunshot or burst segment, normalized in duration.
    • Designed for both supervised classification and unsupervised clustering.

    📊 Dataset Size

    • 58 classes × 300 files each = 17,400 audio files
    • Total duration ≈ 2.5 hours of gunshot audio

    🚀 Example Workflow

    1. Load an audio file with librosa.
    2. Extract MFCCs or Mel-spectrograms.
    3. Train a classifier or cluster embeddings.

    import librosa, librosa.display
    import matplotlib.pyplot as plt
    
    # Load audio
    y, sr = librosa.load("Firearms_Dataset/ak-47/ak-47_001.wav", sr=44100)
    
    # Extract MFCCs
    mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    
    # Plot
    plt.figure(figsize=(10, 4))
    librosa.display.specshow(mfccs, x_axis='time', sr=sr)  # pass sr so the time axis matches 44.1 kHz
    plt.colorbar()
    plt.title("MFCC - AK-47 Gunshot")
    plt.tight_layout()
    plt.show()
    
  9. Deepfake Audio Mel-spectrograms

    • kaggle.com
    zip
    Updated Nov 12, 2024
    Cite
    codeofwhite (2024). Deepfake Audio Mel-spectrograms [Dataset]. https://www.kaggle.com/datasets/hejuncheung/mel-plot-processed
    Explore at:
    zip (440039753 bytes). Available download formats
    Dataset updated
    Nov 12, 2024
    Authors
    codeofwhite
    Description

    Folder Structure

    1. output_hifigen_augmentated

    This subdirectory contains Mel-spectrograms generated using Hifi-GAN augmented data. The data is split into training, validation, and test subsets:

    mel_spectrograms/
      test/
        fake/
        real/
      train/
        fake/
        real/
      val/
        fake/
        real/

    2. output_jp_augmentated

    This subdirectory includes Mel-spectrograms with a different augmentation strategy (likely JP augmentation). It follows the same structure as above:

    mel_spectrograms/ with test/, train/, and val/ subfolders, each split into fake/ and real/.

    Highlights

    Custom Mel-Spectrograms: All audio samples are represented in Mel-spectrogram format, facilitating their use in deep learning tasks like binary classification of fake vs. real audio.

    Augmented Data: Two augmentation techniques are applied, offering diverse data for robust model training and evaluation.

    Balanced Splits: Each dataset partition (train, val, test) includes separate folders for fake and real samples to simplify dataset handling.
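
    Because each split keeps fake and real samples in separate folders, a directory-based image loader maps straight onto this layout. A minimal sketch, assuming the Mel-spectrograms are stored as image files (paths are illustrative):

    from torchvision import datasets, transforms

    tfm = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ])

    # Labels are inferred from the folder names: ['fake', 'real'].
    train_ds = datasets.ImageFolder("output_hifigen_augmentated/mel_spectrograms/train", transform=tfm)
    val_ds = datasets.ImageFolder("output_hifigen_augmentated/mel_spectrograms/val", transform=tfm)
    print(train_ds.classes)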

  10. Z

    WaivOps WRLD-SMB: Open Audio Resources for Machine Learning in Music

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    Updated Oct 12, 2024
    Cite
    Patchbanks; WaivOps (2024). WaivOps WRLD-SMB: Open Audio Resources for Machine Learning in Music [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_13921289
    Explore at:
    Dataset updated
    Oct 12, 2024
    Authors
    Patchbanks; WaivOps
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    WRLD-SMB Dataset

    WRLD-SMB is an open audio dataset featuring a collection of synthetic drum recordings in the style of Brazilian samba music. It includes 1,100 audio loops recorded in uncompressed stereo WAV format, along with paired JSON files intended for the supervised training of generative AI audio models.

    Overview

    This dataset was developed using multi-velocity audio samples and a paired MIDI dataset. The intended use of this dataset is to train or fine-tune AI models in learning high-performance drum notations, aiming to replicate the live sound of a small drum ensemble. To facilitate augmentation and supervised training with labeled audio data, a dropout technique was employed on the rendered audio files to generate variational mixes of the drum tracks.

    The primary purpose of this dataset is to provide accessible content for machine learning applications in music and audio. Potential use cases include generative music, feature extraction, tempo detection, audio classification, rhythm analysis, drum synthesis, music information retrieval (MIR), sound design and signal processing.

    Specifications

    1,100 audio loops (approximately 5.5 hours)

    16-bit 44.1kHz WAV format

    Tempo range: 90–120 BPM

    Paired label data (WAV + JSON)

    Variational drum patterns

    Subgenre styles (Traditional and modern samba, bossa nova, fusion)

    A JSON file is provided for referencing and converting MIDI note numbers to text labels. You can update the text labels to suit your preferences.
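
    A small sketch of loading and editing such a mapping (the filename and JSON structure here are hypothetical; check the dataset for the actual file):

    import json

    # Hypothetical filename; the dataset ships a JSON file mapping MIDI note
    # numbers to text labels.
    with open("midi_note_labels.json") as f:
        note_labels = {int(k): v for k, v in json.load(f).items()}

    # Update a label to suit your preferences, e.g. rename note 36.
    note_labels[36] = "kick"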

    License

    This dataset was compiled by WaivOps, a crowdsourced music project managed by the sound label company Patchbanks. All recordings have been compiled by verified sources for copyright clearance.

    The WRLD-SMB dataset is licensed under Creative Commons Attribution 4.0 International (CC BY 4.0).

    Additional Info

    For audio examples or more information about this dataset, please refer to the GitHub repository.

  11. Audio Cartography

    • openneuro.org
    Updated Aug 8, 2020
    Cite
    Megen Brittell (2020). Audio Cartography [Dataset]. http://doi.org/10.18112/openneuro.ds001415.v1.0.0
    Explore at:
    Dataset updated
    Aug 8, 2020
    Dataset provided by
    OpenNeuro (https://openneuro.org/)
    Authors
    Megen Brittell
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The Audio Cartography project investigated the influence of temporal arrangement on the interpretation of information from a simple spatial data set. I designed and implemented three auditory map types (audio types), and evaluated differences in the responses to those audio types.

    The three audio types represented simplified raster data (eight rows x eight columns). First, a "sequential" representation read values one at a time from each cell of the raster, following an English reading order, and encoded the data value as the loudness of a single fixed-duration, fixed-frequency note. Second, an augmented-sequential ("augmented") representation used the same reading order, but encoded the data value as volume, the row as frequency, and the column as the rate at which the notes play (constant total cell duration). Third, a "concurrent" representation used the same encoding as the augmented type, but allowed the notes to overlap in time.

    Participants completed a training session in a computer-lab setting, where they were introduced to the audio types and practiced making a comparison between data values at two locations within the display based on what they heard. The training sessions, including associated paperwork, lasted up to one hour. In a second study session, participants listened to the auditory maps and made decisions about the data they represented while the fMRI scanner recorded digital brain images.

    The task consisted of listening to an auditory representation of geospatial data ("map"), and then making a decision about the relative values of data at two specified locations. After listening to the map ("listen"), a graphic depicted two locations within a square (white background). Each location was marked with a small square (size: 2x2 grid cells); one square had a black solid outline and transparent black fill, the other had a red dashed outline and transparent red fill. The decision ("response") was made under one of two conditions. Under the active listening condition ("active") the map was played a second time while participants made their decision; in the memory condition ("memory"), a decision was made in relative quiet (general scanner noises and intermittent acquisition noise persisted). During the initial map listening, participants were aware of neither the locations of the response options within the map extent, nor the response conditions under which they would make their decision. Participants could respond any time after the graphic was displayed; once a response was entered, the playback stopped (active response condition only) and the presentation continued to the next trial.

    Data was collected in accordance with a protocol approved by the Institutional Review Board at the University of Oregon.

    • Additional details about the specific maps used in this are available through University of Oregon's ScholarsBank (DOI 10.7264/3b49-tr85).

    • Details of the design process and evaluation are provided in the associated dissertation, which is available from ProQuest and University of Oregon's ScholarsBank.

    • Scripts that created the experimental stimuli and automated processing are available through University of Oregon's ScholarsBank (DOI 10.7264/3b49-tr85).

    Preparation of fMRI Data

    Conversion of the DICOM files produced by the scanner to NIfTI format was performed by MRIConvert (LCNI). Orientation to standard axes was performed and recorded in the NIfTI header (FMRIB, fslreorient2std). The excess slices in the anatomical images that represented tissue in the neck were trimmed (FMRIB, robustfov). Participant identity was protected through automated defacing of the anatomical data (FreeSurfer, mri_deface), with additional post-processing to ensure that no brain voxels were erroneously removed from the image (FMRIB, BET; brain mask dilated with three iterations of "fslmaths -dilM").

    Preparation of Metadata

    The dcm2niix tool (Rorden) was used to create draft JSON sidecar files with metadata extracted from the DICOM headers. The draft sidecar files were revised to augment the JSON elements with additional tags (e.g., "Orientation" and "TaskDescription") and to make a more human-friendly version of tag contents (e.g., "InstitutionAddress" and "DepartmentName"). The device serial number was constant throughout the data collection (i.e., all data collection was conducted on the same scanner), and the respective metadata values were replaced with an anonymous identifier: "Scanner1".

    Preparation of Behavioral Data

    The stimuli consisted of eighteen auditory maps. Spatial data were generated with the rgeos, sp, and spatstat libraries in R; auditory maps were rendered with the Pyo (Belanger) library for Python and prepared for presentation in Audacity. Stimuli were presented using PsychoPy (Peirce, 2007), which produced log files from which event details were extracted. The log files included timestamped entries for stimulus timing and trigger pulses from the scanner.

    • Log files are available in "sourcedata/behavioral".
    • Extracted event details accompany BOLD images in "sub-NN/func/*events.tsv".
    • Three-column explanatory variable files are in "derivatives/ev/sub-NN".

    References

    Audacity® software is copyright © 1999-2018 Audacity Team. Web site: https://audacityteam.org/. The name Audacity® is a registered trademark of Dominic Mazzoni.

    FMRIB (Functional Magnetic Resonance Imaging of the Brain). FMRIB Software Library (FSL; fslreorient2std, robustfov, BET). Oxford, v5.0.9, Available: https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/

    FreeSurfer (mri_deface). Harvard, v1.22, Available: https://surfer.nmr.mgh.harvard.edu/fswiki/AutomatedDefacingTools

    LCNI (Lewis Center for Neuroimaging). MRIConvert (mcverter), v2.1.0 build 440, Available: https://lcni.uoregon.edu/downloads/mriconvert/mriconvert-and-mcverter

    Peirce, JW. PsychoPy–psychophysics software in Python. Journal of Neuroscience Methods, 162(1–2):8 – 13, 2007. Software Available: http://www.psychopy.org/

    Python software is copyright © 2001-2015 Python Software Foundation. Web site: https://www.python.org

    Pyo software is copyright © 2009-2015 Olivier Belanger. Web site: http://ajaxsoundstudio.com/software/pyo/.

    R Core Team (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Available: https://www.R-project.org/.

    rgeos software is copyright © 2016 Bivand and Rundel. Web site: https://CRAN.R-project.org/package=rgeos

    Rorden, C. dcm2niix, v1.0.20171215, Available: https://github.com/rordenlab/dcm2niix

    spatstat software is copyright © 2016 Baddeley, Rubak, and Turner. Web site: https://CRAN.R-project.org/package=spatstat

    sp software is copyright © 2016 Pebesma and Bivand. Web site: https://CRAN.R-project.org/package=sp

  12. Data from: Birds of a Feather Augment Together: Exploring Sonic Links...

    • zenodo.org
    Updated Oct 8, 2025
    Cite
    Jacob Bhattacharyya (2025). Birds of a Feather Augment Together: Exploring Sonic Links Between Real and Virtual Worlds in Audio Augmented Reality [Dataset]. http://doi.org/10.5281/zenodo.16614796
    Explore at:
    Dataset updated
    Oct 8, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jacob Bhattacharyya
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Quantitative dataset presented in ISMAR 2025 submission "Birds of a Feather Augment Together: Exploring Sonic Links Between Real and Virtual Worlds in Audio Augmented Reality".

    Dataset covers the evaluation questionnaires presented to participants. Data has been pre-processed to flip negatively phrased questions as appropriate, and ensure that 7-step Likert data is scaled from -3 to 3.
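
    A tiny sketch of that pre-processing with hypothetical item names (on a 7-step scale recoded to -3..3, flipping a negatively phrased question is simply negation):

    import pandas as pd

    # Hypothetical questionnaire items on a 1..7 Likert scale.
    df = pd.DataFrame({"item_pos": [7, 4, 1], "item_neg": [1, 4, 6]})

    df = df - 4                        # rescale 1..7 to -3..3
    df["item_neg"] = -df["item_neg"]   # flip the negatively phrased item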

  13. MIT_environmental_impulse_responses

    • huggingface.co
    • opendatalab.com
    Updated Aug 19, 2023
    Cite
    David Scripka (2023). MIT_environmental_impulse_responses [Dataset]. https://huggingface.co/datasets/davidscripka/MIT_environmental_impulse_responses
    Explore at:
    Croissant. (Croissant is a format for machine-learning datasets; learn more at mlcommons.org/croissant.)
    Dataset updated
    Aug 19, 2023
    Authors
    David Scripka
    License

    https://choosealicense.com/licenses/unknown/

    Description

    MIT Environmental Impulse Response Dataset

    The audio recordings in this dataset were originally created by the Computational Audition Lab at MIT. The source of the data can be found at https://mcdermottlab.mit.edu/Reverb/IR_Survey.html. The audio files have been resampled to 16 kHz to reduce the size of the dataset while making it more suitable for various tasks, including data augmentation. The dataset consists of 271 audio files… See the full description on the dataset page: https://huggingface.co/datasets/davidscripka/MIT_environmental_impulse_responses.
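
    For augmentation, reverberation is typically applied by convolving dry audio with one of these impulse responses. A minimal sketch with placeholder arrays (SciPy assumed):

    import numpy as np
    from scipy.signal import fftconvolve

    speech = np.random.randn(16000).astype(np.float32)  # 1 s of dry audio at 16 kHz
    ir = np.random.randn(8000).astype(np.float32)       # one impulse response

    # Convolve, trim back to the original length, and renormalize to avoid clipping.
    wet = fftconvolve(speech, ir, mode="full")[: len(speech)]
    wet /= np.max(np.abs(wet)) + 1e-9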

  14. Synthetic Training Data Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 29, 2025
    Cite
    Growth Market Reports (2025). Synthetic Training Data Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/synthetic-training-data-market
    Explore at:
    csv, pdf, pptx. Available download formats
    Dataset updated
    Aug 29, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Synthetic Training Data Market Outlook



    According to our latest research, the global synthetic training data market size in 2024 is valued at USD 1.45 billion, demonstrating robust momentum as organizations increasingly adopt artificial intelligence and machine learning solutions. The market is projected to grow at a remarkable CAGR of 38.7% from 2025 to 2033, reaching an estimated USD 22.46 billion by 2033. This exponential growth is primarily driven by the rising demand for high-quality, diverse, and privacy-compliant datasets that fuel advanced AI models, as well as the escalating need for scalable data solutions across various industries.




    One of the primary growth factors propelling the synthetic training data market is the escalating complexity and diversity of AI and machine learning applications. As organizations strive to develop more accurate and robust AI models, the need for vast amounts of annotated and high-quality training data has surged. Traditional data collection methods are often hampered by privacy concerns, high costs, and time-consuming processes. Synthetic training data, generated through advanced algorithms and simulation tools, offers a compelling alternative by providing scalable, customizable, and bias-mitigated datasets. This enables organizations to accelerate model development, improve performance, and comply with evolving data privacy regulations such as GDPR and CCPA, thus driving widespread adoption across sectors like healthcare, finance, autonomous vehicles, and robotics.




    Another significant driver is the increasing adoption of synthetic data for data augmentation and rare event simulation. In sectors such as autonomous vehicles, manufacturing, and robotics, real-world data for edge-case scenarios or rare events is often scarce or difficult to capture. Synthetic training data allows for the generation of these critical scenarios at scale, enabling AI systems to learn and adapt to complex, unpredictable environments. This not only enhances model robustness but also reduces the risk associated with deploying AI in safety-critical applications. The flexibility to generate diverse data types, including images, text, audio, video, and tabular data, further expands the applicability of synthetic data solutions, making them indispensable tools for innovation and competitive advantage.




    The synthetic training data market is also experiencing rapid growth due to the heightened focus on data privacy and regulatory compliance. As data protection regulations become more stringent worldwide, organizations face increasing challenges in accessing and utilizing real-world data for AI training without violating user privacy. Synthetic data addresses this challenge by creating realistic yet entirely artificial datasets that preserve the statistical properties of original data without exposing sensitive information. This capability is particularly valuable for industries such as BFSI, healthcare, and government, where data sensitivity and compliance requirements are paramount. As a result, the adoption of synthetic training data is expected to accelerate further as organizations seek to balance innovation with ethical and legal responsibilities.




    From a regional perspective, North America currently leads the synthetic training data market, driven by the presence of major technology companies, robust R&D investments, and early adoption of AI technologies. However, the Asia Pacific region is anticipated to witness the highest growth rate during the forecast period, fueled by expanding AI initiatives, government support, and the rapid digital transformation of industries. Europe is also emerging as a key market, particularly in sectors where data privacy and regulatory compliance are critical. Latin America and the Middle East & Africa are gradually increasing their market share as awareness and adoption of synthetic data solutions grow. Overall, the global landscape is characterized by dynamic regional trends, with each region contributing uniquely to the market's expansion.



    The introduction of a Synthetic Data Generation Engine has revolutionized the way organizations approach data creation and management. This engine leverages cutting-edge algorithms to produce high-quality synthetic datasets that mirror real-world data without compromising privacy. By sim

  15. Audiomentations

    • kaggle.com
    zip
    Updated Apr 22, 2022
    Cite
    atfujita (2022). Audiomentations [Dataset]. https://www.kaggle.com/datasets/atsunorifujita/audiomentations
    Explore at:
    zip (62619 bytes). Available download formats
    Dataset updated
    Apr 22, 2022
    Authors
    atfujita
    Description

    A Python library for audio data augmentation. Inspired by albumentations. Useful for deep learning.

    • Runs on CPU.
    • Supports mono audio and multichannel audio.
    • Can be integrated into training pipelines in e.g. TensorFlow/Keras or PyTorch.
    • Has helped people get world-class results in Kaggle competitions.
    • Is used by companies making next-generation audio products.

    Need a Pytorch-specific alternative with GPU support? Check out torch-audiomentations!

  16. multi_accent_speech

    • huggingface.co
    Updated Sep 10, 2025
    Cite
    Cagatay Nufer (2025). multi_accent_speech [Dataset]. https://huggingface.co/datasets/cagatayn/multi_accent_speech
    Explore at:
    Dataset updated
    Sep 10, 2025
    Authors
    Cagatay Nufer
    Description

    Multi-Accent English Speech Corpus (Augmented & Speaker-Disjoint)

    This dataset is a curated and augmented multi-accent English speech corpus designed for speech recognition, accent classification, and representation learning. It consolidates multiple open-source accent corpora, converts all audio to a unified format, applies targeted data augmentation, and exports in a tidy, Hugging Face–ready structure.

    ✨ Key Features

    Accents covered (12 total): american_english… See the full description on the dataset page: https://huggingface.co/datasets/cagatayn/multi_accent_speech.

  17. Dataset Licensing For AI Training Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). Dataset Licensing For AI Training Market Research Report 2033 [Dataset]. https://dataintelo.com/report/dataset-licensing-for-ai-training-market
    Explore at:
    pdf, pptx, csv. Available download formats
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Dataset Licensing for AI Training Market Outlook



    According to our latest research, the global Dataset Licensing for AI Training market size reached USD 2.1 billion in 2024, with a robust CAGR of 22.4% projected through the forecast period. By 2033, the market is expected to achieve a value of USD 15.2 billion. This remarkable growth is primarily fueled by the exponential rise in demand for high-quality, diverse, and ethically sourced datasets required to train increasingly sophisticated artificial intelligence (AI) models across industries. As organizations continue to scale their AI initiatives, the need for compliant, scalable, and customizable licensing solutions has never been more critical, driving significant investments and innovation in the dataset licensing ecosystem.




    A primary growth factor for the Dataset Licensing for AI Training market is the proliferation of AI applications across sectors such as healthcare, finance, automotive, and government. As AI models become more complex, their hunger for diverse and representative datasets intensifies, making data acquisition and licensing a strategic priority for enterprises. The increasing adoption of machine learning, deep learning, and generative AI technologies further amplifies the need for specialized datasets, pushing both data providers and consumers to seek flexible and secure licensing arrangements. Additionally, regulatory developments such as GDPR in Europe and similar data privacy frameworks worldwide are compelling organizations to prioritize licensed, compliant datasets over ad hoc or unlicensed data sources, further accelerating market growth.




    Another significant driver is the growing sophistication of dataset licensing models themselves. Vendors are moving beyond traditional open-source or proprietary licenses, introducing hybrid, creative commons, and custom-negotiated agreements tailored to specific use cases and industries. This evolution is enabling AI developers to access a broader variety of data types—text, image, audio, video, and multimodal—while ensuring legal clarity and minimizing risk. Moreover, the rise of data marketplaces and third-party platforms is streamlining the process of dataset discovery, negotiation, and compliance monitoring, making it easier for organizations of all sizes to source and license the data they need for AI training at scale.




    The surging demand for high-quality annotated datasets is also fostering partnerships between data providers, annotation service vendors, and AI developers. These collaborations are leading to the creation of bespoke datasets that cater to niche applications, such as autonomous driving, medical diagnostics, and advanced robotics. At the same time, advances in synthetic data generation and data augmentation are expanding the universe of licensable datasets, offering new avenues for licensing and monetization. As the market matures, we expect to see increased standardization, transparency, and interoperability in licensing frameworks, further lowering barriers to entry and accelerating innovation in AI model development.




    Regionally, North America continues to dominate the Dataset Licensing for AI Training market, accounting for the largest share in 2024, driven by the presence of leading technology companies, robust regulatory frameworks, and a mature AI ecosystem. Europe follows closely, with significant investments in ethical AI and data governance initiatives. Asia Pacific is emerging as a high-growth region, fueled by rapid digital transformation, government-backed AI strategies, and a burgeoning startup landscape. Latin America and the Middle East & Africa are also witnessing increased adoption of licensed datasets, particularly in sectors such as healthcare and public administration, although their market shares remain comparatively smaller. This global momentum underscores the universal need for high-quality, licensed datasets as the foundation of responsible and effective AI training.



    License Type Analysis



    The License Type segment in the Dataset Licensing for AI Training market is characterized by a diverse range of options, including Open Source, Proprietary, Creative Commons, and Custom/Negotiated licenses. Open source licenses have long been favored by academic and research communities due to their accessibility and collaborative ethos. However, their adoption in commercial AI projects is often tempered by concerns over data provenance, usage restrictions, a

  18. singaporean_district_noise

    • huggingface.co
    Updated Jul 10, 2025
    + more versions
    Cite
    DANG VAN THUC (2025). singaporean_district_noise [Dataset]. https://huggingface.co/datasets/thucdangvan020999/singaporean_district_noise
    Explore at:
    Dataset updated
    Jul 10, 2025
    Authors
    DANG VAN THUC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Singapore
    Description

    Singaporean district with noise

    Dataset Description

    Singaporean district speech dataset with controlled noise augmentation for ASR training.

    Dataset Summary

    • Language: EN
    • Task: Automatic Speech Recognition
    • Total Samples: 2,288
    • Audio Sample Rate: 16 kHz
    • Base Dataset: Custom dataset
    • Processing: Noise-augmented

    Data Fields

    • audio: Audio file (16 kHz WAV format)
    • text: Transcription text
    • noise_type: Type of background noise…

    See the full description on the dataset page: https://huggingface.co/datasets/thucdangvan020999/singaporean_district_noise.
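
    The fields listed above map directly onto the Hugging Face datasets API; a minimal loading sketch (the split name "train" is an assumption):

    from datasets import Audio, load_dataset

    ds = load_dataset("thucdangvan020999/singaporean_district_noise", split="train")
    ds = ds.cast_column("audio", Audio(sampling_rate=16000))

    sample = ds[0]
    print(sample["text"], sample["noise_type"])
    print(sample["audio"]["array"].shape, sample["audio"]["sampling_rate"])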

  19. Sudden Queen Loss Event in an Africanized Honeybee Colony

    • data.mendeley.com
    Updated May 21, 2024
    + more versions
    Cite
    Ícaro de Lima Rodrigues (2024). Sudden Queen Loss Event in an Africanized Honeybee Colony [Dataset]. http://doi.org/10.17632/j97khfj656.1
    Explore at:
    Dataset updated
    May 21, 2024
    Authors
    Ícaro de Lima Rodrigues
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset consists of 23 features extracted from audio recordings of an Africanized honeybee hive in Fortaleza-CE, Brazil. The first feature is the recording date, and the last is the label indicating the queen's presence status. The label can take two values: "QR" for queenright (presence of queen) or "QL" for queenless (absence of queen). The remaining features are directly extracted from the audio signal, divided into three groups: time-domain features (zcr, energy, and energy entropy), spectral features (centroid, spread, entropy, flux, and rolloff), and 13 MFCC coefficients. For further details on the meaning of each feature, please refer to https://doi.org/10.1371/journal.pone.0144610.t002.

    The data were collected from daily recordings over a 6-day period, with the queen bee removed from the hive on the last day. Consequently, the QR and QL classes are unbalanced, with QL representing only 1/6 of the data. This situation is common in this type of monitoring, where the hive's functioning is expected to remain within normal well-being parameters most of the time. Naturally, anomalies such as sudden queen loss are uncommon and therefore represent a smaller portion of the data. The experiment and the data aim to replicate and incorporate these conditions for greater fidelity to the addressed problem.

    Such issues can be addressed using techniques such as anomaly detection, one-class classification, or incremental learning. Additionally, techniques for handling unbalanced data in classification problems, such as data augmentation and resampling, can be employed. Using OC-SVM, we achieved results with 96% accuracy and 99% precision.
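
    A sketch of the one-class setup described above, using scikit-learn and synthetic placeholders for the 21 acoustic features (3 time-domain + 5 spectral + 13 MFCC) per recording:

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import OneClassSVM

    rng = np.random.default_rng(0)
    X_qr = rng.normal(0.0, 1.0, size=(500, 21))  # queenright (normal) recordings
    X_ql = rng.normal(1.5, 1.0, size=(100, 21))  # queenless (anomalous) recordings

    # Train on queenright data only; queenless recordings should surface as anomalies.
    scaler = StandardScaler().fit(X_qr)
    ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(scaler.transform(X_qr))

    pred = ocsvm.predict(scaler.transform(X_ql))  # -1 flags an anomaly (queenless)
    print((pred == -1).mean())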

  20. Data from: ARAUS: A Large-Scale Dataset and Baseline Models of Affective...

    • researchdata.ntu.edu.sg
    zip
    Updated Nov 1, 2023
    Cite
    Kenneth Ooi; Zhen-Ting Ong; Karn N. Watcharasupat; Bhan Lam; Joo Young Hong; Woon-Seng Gan (2023). ARAUS: A Large-Scale Dataset and Baseline Models of Affective Responses to Augmented Urban Soundscapes [Dataset]. http://doi.org/10.21979/N9/9OTEVX
    Explore at:
    zip (9362005, 29322354, 1854668318, 586577999, 221886207, 12150441, 3891054, and 2242857218 bytes). Available download formats
    Dataset updated
    Nov 1, 2023
    Dataset provided by
    DR-NTU (Data)
    Authors
    Kenneth Ooi; Zhen-Ting Ong; Karn N. Watcharasupat; Bhan Lam; Joo Young Hong; Woon-Seng Gan
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Dataset funded by
    Ministry of National Development (MND)
    National Research Foundation (NRF)
    Description

    This repository contains the ARAUS dataset, a publicly-available dataset (comprising a 5-fold training/validation set and an independent test set) of 25,440 unique subjective perceptual responses to augmented soundscapes presented as audio-visual stimuli. Each augmented soundscape is made by digitally adding "maskers" (bird, water, wind, traffic, construction, or silence) to urban soundscape recordings at fixed soundscape-to-masker ratios. This mimics a real-life soundscape augmentation system, whereby a speaker (or some other sound source) is used to add "maskers" to an actual urban soundscape. Responses were then collected by asking participants to rate how pleasant, annoying, eventful, uneventful, vibrant, monotonous, chaotic, calm, and appropriate each augmented soundscape was.

    The data in this repository aims to form a benchmark for fair comparisons of models for the prediction and analysis of perceptual attributes of soundscapes. Please refer to our publication submitted to IEEE Transactions on Affective Computing for more details regarding the data collection, annotation, and processing methodologies for the creation of the dataset: Kenneth Ooi, Zhen-Ting Ong, Karn N. Watcharasupat, Bhan Lam, Joo Young Hong, Woon-Seng Gan, "ARAUS: A large-scale dataset and baseline models of affective responses to augmented urban soundscapes," IEEE Transactions on Affective Computing, doi: 10.1109/TAFFC.2023.3247914.

    Replication code and baseline models that we have trained using the ARAUS dataset can be found at our GitHub repository: https://github.com/ntudsp/araus-dataset-baseline-models
