Overview: With extensive experience in speech recognition, Nexdata has a resource pool covering more than 50 countries and regions. Our linguist team works closely with clients to assist them with dictionary and text corpus construction, speech quality inspection, linguistics consulting, and more.
Our Capacity:
- Global Resources: resources covering hundreds of languages worldwide
- Compliance: all Machine Learning (ML) data is collected with proper authorization
- Quality: multiple rounds of quality inspection ensure high-quality data output
- Secure Implementation: an NDA is signed to guarantee secure implementation, and Machine Learning (ML) data is destroyed upon delivery.
Nexdata is equipped with professional recording equipment, has a resource pool covering 70+ countries and regions, and provides various types of speech recognition data collection services for Machine Learning (ML) data.
Nexdata provides multi-language, multi-timbre, multi-domain and multi-style speech synthesis data collection services for Deep Learning data.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the English Language General Conversation Speech Dataset, a comprehensive and diverse collection of voice data specifically curated to advance the development of English language speech recognition models, with a particular focus on Canadian accents and dialects.
With high-quality audio recordings, detailed metadata, and accurate transcriptions, it empowers researchers and developers to enhance natural language processing, conversational AI, and Generative Voice AI algorithms. Moreover, it facilitates the creation of sophisticated voice assistants and voice bots tailored to the unique linguistic nuances found in the English language spoken in Canada.
Speech Data: This training dataset comprises 30 hours of audio recordings covering a wide range of topics and scenarios, ensuring robustness and accuracy in speech technology applications. To achieve this, we collaborated with a diverse network of 40 native English speakers from different provinces of Canada. This collaborative effort guarantees a balanced representation of Canadian accents, dialects, and demographics, reducing biases and promoting inclusivity.
Each audio recording captures the essence of spontaneous, unscripted conversations between two individuals, with an average duration ranging from 15 to 60 minutes. The speech data is available in WAV format, with stereo channel files having a bit depth of 16 bits and a sample rate of 8 kHz. The recording environment is generally quiet, without background noise and echo.
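To confirm that a given file matches the specification above (stereo WAV, 16-bit, 8 kHz) before feeding it into a training pipeline, a quick check with Python's standard wave module is usually enough. The file name below is hypothetical; this is only a sketch of the kind of validation one might run.

```python
import wave

# Hypothetical path to one of the conversation recordings (illustrative only).
AUDIO_PATH = "conversation_0001.wav"

with wave.open(AUDIO_PATH, "rb") as wav:
    n_channels = wav.getnchannels()    # expected: 2 (stereo)
    sample_width = wav.getsampwidth()  # expected: 2 bytes = 16-bit
    sample_rate = wav.getframerate()   # expected: 8000 Hz
    duration_s = wav.getnframes() / sample_rate

print(f"channels={n_channels}, bit_depth={sample_width * 8}, "
      f"sample_rate={sample_rate} Hz, duration={duration_s / 60:.1f} min")

# Sanity checks against the specification quoted above.
assert n_channels == 2 and sample_width == 2 and sample_rate == 8000
```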
Metadata: In addition to the audio recordings, our dataset provides comprehensive metadata for each participant. This metadata includes the participant's age, gender, country, state, and dialect. Furthermore, additional metadata such as recording device details, topic of recording, bit depth, and sample rate will be provided.
The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of English language speech recognition models.
Transcription: This dataset provides a manual verbatim transcription of each audio file to enhance your workflow efficiency. The transcriptions are available in JSON format and are speaker-wise, with time-coded segmentation and non-speech labels and tags.
Our goal is to expedite the deployment of English language conversational AI and NLP models by offering ready-to-use transcriptions, ultimately saving valuable time and resources in the development process.
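The exact JSON schema used by FutureBeeAI is not documented here, so the structure below is an assumption for illustration only: it shows how speaker-wise, time-coded segments with non-speech tags might be consumed once the real field names are known.

```python
import json

# Hypothetical transcription structure; the actual JSON schema may use
# different field names. This only illustrates consuming speaker-wise,
# time-coded segments.
example = """
{
  "audio_file": "conversation_0001.wav",
  "segments": [
    {"speaker": "SPK1", "start": 0.00, "end": 4.25, "text": "Hi, how are you?"},
    {"speaker": "SPK2", "start": 4.40, "end": 7.10, "text": "Good, thanks."},
    {"speaker": "SPK1", "start": 7.25, "end": 8.00, "text": "[laughter]"}
  ]
}
"""

data = json.loads(example)
for seg in data["segments"]:
    # Non-speech events such as [laughter] would appear as tags in the text.
    print(f'{seg["speaker"]} [{seg["start"]:.2f}-{seg["end"]:.2f}s]: {seg["text"]}')
```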
Updates and Customization: We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our voice dataset is regularly updated with new audio data captured in diverse real-world conditions.
If you require a custom training dataset with specific environmental conditions such as in-car, busy street, restaurant, or any other scenario, we can accommodate your request. We can provide voice data with customized sample rates ranging from 8 kHz to 48 kHz, allowing you to fine-tune your models for different audio recording setups. Additionally, we can customize the transcriptions to follow your specific guidelines and requirements, further supporting your ASR development process.
License: This audio dataset, created by FutureBeeAI, is now available for commercial use.
Conclusion: Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, exploring generative voice AI, or building cutting-edge voice assistants and bots, our dataset serves as a reliable and valuable resource.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Wolof Audio Dataset
The Wolof Audio Dataset is a collection of audio recordings and their corresponding transcriptions in Wolof. This dataset is designed to support the development of Automatic Speech Recognition (ASR) models for the Wolof language. It was created by combining three existing datasets:
- ALFFA: Available at serge-wilson/wolof_speech_transcription
- FLEURS: Available at vonewman/fleurs-wolof-dataset
- Urban Bus Wolof Speech Dataset: Available at vonewman/urban-bus-wolof…

See the full description on the dataset page: https://huggingface.co/datasets/vonewman/wolof-audio-data.
Welcome to the Japanese Language General Conversation Speech Dataset, a comprehensive and diverse collection of voice data specifically curated to advance the development of Japanese language speech recognition models, with a particular focus on Japanese accents and dialects.
With high-quality audio recordings, detailed metadata, and accurate transcriptions, it empowers researchers and developers to enhance natural language processing, conversational AI, and Generative Voice AI algorithms. Moreover, it facilitates the creation of sophisticated voice assistants and voice bots tailored to the unique linguistic nuances found in the Japanese language spoken in Japan.
Speech Data: This training dataset comprises 50 hours of audio recordings covering a wide range of topics and scenarios, ensuring robustness and accuracy in speech technology applications. To achieve this, we collaborated with a diverse network of 70 native Japanese speakers from different prefectures of Japan. This collaborative effort guarantees a balanced representation of Japanese accents, dialects, and demographics, reducing biases and promoting inclusivity.
Each audio recording captures the essence of spontaneous, unscripted conversations between two individuals, with an average duration ranging from 15 to 60 minutes. The speech data is available in WAV format, with stereo channel files having a bit depth of 16 bits and a sample rate of 8 kHz. The recording environment is generally quiet, without background noise and echo.
Metadata: In addition to the audio recordings, our dataset provides comprehensive metadata for each participant. This metadata includes the participant's age, gender, country, state, and dialect. Furthermore, additional metadata such as recording device details, topic of recording, bit depth, and sample rate will be provided.
The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of Japanese language speech recognition models.
Transcription: This dataset provides a manual verbatim transcription of each audio file to enhance your workflow efficiency. The transcriptions are available in JSON format and are speaker-wise, with time-coded segmentation and non-speech labels and tags.
Our goal is to expedite the deployment of Japanese language conversational AI and NLP models by offering ready-to-use transcriptions, ultimately saving valuable time and resources in the development process.
Updates and Customization: We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our voice dataset is regularly updated with new audio data captured in diverse real-world conditions.
If you require a custom training dataset with specific environmental conditions such as in-car, busy street, restaurant, or any other scenario, we can accommodate your request. We can provide voice data with customized sample rates ranging from 8 kHz to 48 kHz, allowing you to fine-tune your models for different audio recording setups. Additionally, we can customize the transcriptions to follow your specific guidelines and requirements, further supporting your ASR development process.
License: This audio dataset, created by FutureBeeAI, is now available for commercial use.
Conclusion: Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, exploring generative voice AI, or building cutting-edge voice assistants and bots, our dataset serves as a reliable and valuable resource.
https://spdx.org/licenses/CC0-1.0.html
An automatic bird sound recognition system is a useful tool for collecting data on different bird species for ecological analysis. Together with autonomous recording units (ARUs), such a system makes it possible to collect bird observations on a scale that no human observer could ever match. During the last decades, progress has been made in the field of automatic bird sound recognition, but recognizing bird species from untargeted soundscape recordings remains a challenge.

In this article we demonstrate the workflow for building a global identification model and adjusting it to perform well on the data of autonomous recorders from a specific region. We show how data augmentation and a combination of global and local data can be used to train a convolutional neural network to classify vocalizations of 101 bird species. We construct a model and train it with a global data set to obtain a base model. The base model is then fine-tuned with local data from Southern Finland in order to adapt it to the sound environment of a specific location, and tested with two data sets: one originating from the same Southern Finnish region and another originating from a different region in the German Alps.

Our results suggest that fine-tuning with local data significantly improves the network performance. Classification accuracy was improved for test recordings from the same area as the local training data (Southern Finland) but not for recordings from a different region (German Alps). Data augmentation enables training with a limited number of training samples, and even with few local data samples a significant improvement over the base model can be achieved. Our model outperforms the current state-of-the-art tool for automatic bird sound classification. Using local data to adjust the recognition model for the target domain leads to improvement over general, non-tailored solutions. The process introduced in this article can be applied to build a fine-tuned bird sound classification model for a specific environment.

Methods: This repository contains data and recognition models described in the paper "Domain-specific neural networks improve automated bird sound recognition already with small amount of local data" (Lauha et al., 2022).
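The global-then-local workflow described above can be sketched roughly as follows. This is not the authors' code: the architecture (a torchvision ResNet-18 stand-in), the checkpoint path, the layer-freezing choice, and the data loader are all placeholders, and the paper's own CNN and training details differ.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

NUM_CLASSES = 101  # bird species, as in the paper

# 1) Base model trained on the global data set (placeholder architecture and
#    checkpoint name; the paper uses its own CNN, not necessarily a ResNet).
model = resnet18(num_classes=NUM_CLASSES)
model.load_state_dict(torch.load("base_model_global.pt"))  # hypothetical file

# 2) Fine-tune on local (Southern Finland) spectrograms. Freezing early layers
#    is one common choice; the paper may instead fine-tune all weights.
for name, param in model.named_parameters():
    if not name.startswith(("layer4", "fc")):
        param.requires_grad = False

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
criterion = nn.CrossEntropyLoss()

def fine_tune(local_loader, epochs=5):
    """Fine-tune the base model on a loader of (spectrogram, label) batches.

    Spectrograms are assumed to be 3-channel image-like tensors here; a
    single-channel spectrogram would need to be repeated across channels.
    """
    model.train()
    for _ in range(epochs):
        for spectrograms, labels in local_loader:  # augmented local data
            optimizer.zero_grad()
            loss = criterion(model(spectrograms), labels)
            loss.backward()
            optimizer.step()
```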
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This CNVVE Dataset contains clean audio samples encompassing six distinct classes of voice expressions, namely “Uh-huh” or “mm-hmm”, “Uh-uh” or “mm-mm”, “Hush” or “Shh”, “Psst”, “Ahem”, and continuous humming, e.g., “hmmm.” Audio samples of each class are found in the respective folders. These audio samples have undergone a thorough cleaning process; the raw samples are published at https://doi.org/10.18419/darus-3897. Initially, we applied the Google WebRTC voice activity detection (VAD) algorithm to the audio files to remove noise and silence from the collected voice signals. The intensity was set to "2", on a scale from "1" to "3". However, because of variations in the data, some files required additional manual cleaning; these outliers, characterized by sharp click sounds (such as those occurring at the end of recordings), were addressed. The samples were recorded through a dedicated data collection website that explained the purpose and type of voice data by providing participants with example recordings as well as each expression's written equivalent, e.g., “Uh-huh”. Audio recordings were automatically saved in .wav format and kept anonymous, with a sampling rate of 48 kHz and a bit depth of 32 bits. For more information, please check the paper or contact the authors with any inquiries.
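The VAD step described above can be reproduced approximately with the py-webrtcvad bindings. Note the library's own constraints: it expects 16-bit mono PCM at 8, 16, 32, or 48 kHz in 10/20/30 ms frames, so the 32-bit recordings would need conversion first. The file name and frame length below are illustrative.

```python
import wave
import webrtcvad

vad = webrtcvad.Vad(2)  # aggressiveness "2", matching the setting quoted above

def speech_frames(path, frame_ms=30):
    """Yield (is_speech, frame_bytes) for a 16-bit mono PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()  # must be 8000, 16000, 32000 or 48000 Hz
        assert wav.getnchannels() == 1 and wav.getsampwidth() == 2
        frame_len = int(rate * frame_ms / 1000)  # samples per frame
        while True:
            frame = wav.readframes(frame_len)
            if len(frame) < frame_len * 2:  # 2 bytes per 16-bit sample
                break
            yield vad.is_speech(frame, rate), frame

# Keep only the voiced portions of a (pre-converted) recording.
voiced = b"".join(
    frame for is_speech, frame in speech_frames("sample_16bit_mono.wav") if is_speech
)
```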
https://www.marketresearchforecast.com/privacy-policy
The global speech and audio data market is experiencing robust growth, driven by the increasing adoption of voice assistants, the proliferation of smart devices, and the expanding use of speech analytics in various sectors. The market, estimated at $15 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 18% from 2025 to 2033, reaching an estimated $50 billion by 2033. Key drivers include advancements in artificial intelligence (AI), particularly in natural language processing (NLP) and machine learning (ML), which are enhancing the accuracy and efficiency of speech recognition and analysis. Furthermore, the growing demand for personalized user experiences, coupled with the rise of multilingual applications, is fueling market expansion.

The market is segmented by language (Chinese Mandarin, English, Spanish, French, and Others) and application (Commercial Use and Academic Use). Commercial applications, including customer service, market research, and healthcare, currently dominate, but the academic sector is showing significant growth potential as research into speech technology advances. Geographic distribution shows North America and Europe currently holding the largest market shares, but the Asia-Pacific region is expected to experience the fastest growth in the coming years, fueled by increasing smartphone penetration and digitalization in emerging economies like India and China. Restraints include data privacy concerns, the need for high-quality data collection, and the challenges associated with handling diverse accents and dialects.

The competitive landscape is characterized by a mix of large technology companies like Google, Amazon, and Microsoft, and specialized speech technology providers such as Nuance and VoiceBase. These companies are engaged in intense R&D to improve the accuracy and performance of speech recognition and synthesis technologies. Strategic partnerships and acquisitions are expected to shape the market further, as companies seek to expand their product portfolios and geographic reach. The ongoing innovation in speech-to-text and text-to-speech technologies, alongside the integration of speech data with other data types (like text and image data), will unlock new applications and further accelerate market growth. The demand for real-time transcription and translation services is also contributing to this upward trend, driving investment in innovative solutions and pushing the boundaries of what's possible with speech and audio data.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The English Deep South Media Audio Dataset project is designed to develop a comprehensive audio dataset focusing on the unique accents and dialects of the English Deep South.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The main purpose of the AXIOM Voice Dataset is to gather audio recordings from native Italian speakers. This voice data collection was intended to obtain audio recording samples for training and testing the VIMAR algorithm implemented for the Smart Home scenario on the AXIOM board. The final goal was to develop an efficient voice recognition system using machine learning algorithms. A team of UX researchers from the University of Siena collected data for five months and tested the voice recognition system on the AXIOM board [1]. The data acquisition process involved native Italian speakers who provided their written consent to participate in the research project. Participants were selected to maintain a cluster with varied characteristics in gender, age, region of origin and background.
Welcome to the Polish Language General Conversation Speech Dataset, a comprehensive and diverse collection of voice data specifically curated to advance the development of Polish language speech recognition models, with a particular focus on Polish accents and dialects.
With high-quality audio recordings, detailed metadata, and accurate transcriptions, it empowers researchers and developers to enhance natural language processing, conversational AI, and Generative Voice AI algorithms. Moreover, it facilitates the creation of sophisticated voice assistants and voice bots tailored to the unique linguistic nuances found in the Polish language spoken in Poland.
Speech Data: This training dataset comprises 50 hours of audio recordings covering a wide range of topics and scenarios, ensuring robustness and accuracy in speech technology applications. To achieve this, we collaborated with a diverse network of 70 native Polish speakers from different regions of Poland. This collaborative effort guarantees a balanced representation of Polish accents, dialects, and demographics, reducing biases and promoting inclusivity.
Each audio recording captures the essence of spontaneous, unscripted conversations between two individuals, with an average duration ranging from 15 to 60 minutes. The speech data is available in WAV format, with stereo channel files having a bit depth of 16 bits and a sample rate of 8 kHz. The recording environment is generally quiet, without background noise and echo.
Metadata: In addition to the audio recordings, our dataset provides comprehensive metadata for each participant. This metadata includes the participant's age, gender, country, state, and dialect. Furthermore, additional metadata such as recording device details, topic of recording, bit depth, and sample rate will be provided.
The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of Polish language speech recognition models.
Transcription: This dataset provides a manual verbatim transcription of each audio file to enhance your workflow efficiency. The transcriptions are available in JSON format and are speaker-wise, with time-coded segmentation and non-speech labels and tags.
Our goal is to expedite the deployment of Polish language conversational AI and NLP models by offering ready-to-use transcriptions, ultimately saving valuable time and resources in the development process.
Updates and Customization: We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our voice dataset is regularly updated with new audio data captured in diverse real-world conditions.
If you require a custom training dataset with specific environmental conditions such as in-car, busy street, restaurant, or any other scenario, we can accommodate your request. We can provide voice data with customized sample rates ranging from 8 kHz to 48 kHz, allowing you to fine-tune your models for different audio recording setups. Additionally, we can customize the transcriptions to follow your specific guidelines and requirements, further supporting your ASR development process.
License: This audio dataset, created by FutureBeeAI, is now available for commercial use.
Conclusion: Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, exploring generative voice AI, or building cutting-edge voice assistants and bots, our dataset serves as a reliable and valuable resource.
FSD50K is an open dataset of human-labeled sound events containing 51,197 Freesound clips unequally distributed in 200 classes drawn from the AudioSet Ontology. FSD50K has been created at the Music Technology Group of Universitat Pompeu Fabra.
Citation
If you use the FSD50K dataset, or part of it, please cite our TASLP paper (available from [arXiv] [TASLP]):
@article{fonseca2022FSD50K,
  title={{FSD50K}: an open dataset of human-labeled sound events},
  author={Fonseca, Eduardo and Favory, Xavier and Pons, Jordi and Font, Frederic and Serra, Xavier},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  volume={30},
  pages={829--852},
  year={2022},
  publisher={IEEE}
}
Paper update: This paper has been published in TASLP at the beginning of 2022. The accepted camera-ready version includes a number of improvements with respect to the initial submission. The main updates include: estimation of the amount of label noise in FSD50K, SNR comparison between FSD50K and AudioSet, improved description of evaluation metrics including equations, clarification of experimental methodology and some results, some content moved to Appendix for readability. The TASLP-accepted camera-ready version is available from arXiv (in particular, it is v2 in arXiv, displayed by default).
Data curators
Eduardo Fonseca, Xavier Favory, Jordi Pons, Mercedes Collado, Ceren Can, Rachit Gupta, Javier Arredondo, Gary Avendano and Sara Fernandez
Contact
You are welcome to contact Eduardo Fonseca should you have any questions, at efonseca@google.com.
ABOUT FSD50K
Freesound Dataset 50k (or FSD50K for short) is an open dataset of human-labeled sound events containing 51,197 Freesound clips unequally distributed in 200 classes drawn from the AudioSet Ontology [1]. FSD50K has been created at the Music Technology Group of Universitat Pompeu Fabra.
What follows is a brief summary of FSD50K's most important characteristics. Please have a look at our paper (especially Section 4) to extend the basic information provided here with relevant details for its usage, as well as discussion, limitations, applications and more.
Basic characteristics:
The list of 200 sound classes is provided in vocabulary.csv (see Files section below).

Dev set:

Eval set:

Note: All classes in FSD50K are represented in AudioSet, except Crash cymbal, Human group actions, Human voice, Respiratory sounds, and Domestic sounds, home sounds.
LICENSE
All audio clips in FSD50K are released under Creative Commons (CC) licenses. Each clip has its own license as defined by the clip uploader in Freesound, some of them requiring attribution to their original authors and some forbidding further commercial reuse. Specifically:
The development set consists of 40,966 clips with the following licenses:
The evaluation set consists of 10,231 clips with the following licenses:
For attribution purposes and to facilitate attribution of these files to third parties, we include a mapping from the audio clips to their corresponding licenses. The licenses are specified in the files dev_clips_info_FSD50K.json and eval_clips_info_FSD50K.json.

In addition, FSD50K as a whole is the result of a curation process and it has an additional license: FSD50K is released under CC-BY. This license is specified in the LICENSE-DATASET file downloaded with the FSD50K.doc zip file. We note that the choice of one license for the dataset as a whole is not straightforward as it comprises items with different licenses (such as audio clips, annotations, or data split). The choice of a global license in these cases may warrant further investigation (e.g., by someone with a background in copyright law).
Usage of FSD50K for commercial purposes:
If you'd like to use FSD50K for commercial purposes, please contact Eduardo Fonseca and Frederic Font at efonseca@google.com and frederic.font@upf.edu.
Also, if you are interested in using FSD50K for machine learning competitions, please contact Eduardo Fonseca and Frederic Font at efonseca@google.com and frederic.font@upf.edu.
FILES
FSD50K can be downloaded as a series of zip files with the following directory structure:
root
│
└───FSD50K.dev_audio/                    Audio clips in the dev set
│
└───FSD50K.eval_audio/                   Audio clips in the eval set
│
└───FSD50K.ground_truth/                 Files for FSD50K's ground truth
│   │
│   └─── dev.csv                         Ground truth for the dev set
│   │
│   └─── eval.csv                        Ground truth for the eval set
│   │
│   └─── vocabulary.csv                  List of 200 sound classes in FSD50K
│
└───FSD50K.metadata/                     Files for additional metadata
│   │
│   └─── class_info_FSD50K.json          Metadata about the sound classes
│   │
│   └─── dev_clips_info_FSD50K.json      Metadata about the dev clips
│   │
│   └─── eval_clips_info_FSD50K.json     Metadata about the eval clips
│   │
│   └─── pp_pnp_ratings_FSD50K.json      PP/PNP ratings
│   │
│   └─── collection/                     Files for the *sound collection* format
│
└───FSD50K.doc/
    │
    └───README.md                        The dataset description file that you are reading
    │
    └───LICENSE-DATASET                  License of the FSD50K dataset as an entity
Each row (i.e. audio clip) of dev.csv contains the following information:

- fname: the file name without the .wav extension, e.g., the fname 64760 corresponds to the file 64760.wav on disk. This number is the Freesound id. We always use Freesound ids as filenames.
- labels: the class labels (i.e., the ground truth). Note these
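Given the directory layout and the dev.csv columns above, the ground truth can be joined to audio paths roughly as follows. The comma separator for multi-label entries is an assumption; inspect dev.csv before relying on it.

```python
import os
import pandas as pd

ROOT = "FSD50K"  # directory containing the unzipped dataset

dev = pd.read_csv(os.path.join(ROOT, "FSD50K.ground_truth", "dev.csv"))

# fname is the Freesound id without the .wav extension.
dev["path"] = dev["fname"].astype(str).apply(
    lambda fid: os.path.join(ROOT, "FSD50K.dev_audio", f"{fid}.wav")
)

# labels holds the ground-truth classes; assumed comma-separated here.
dev["label_list"] = dev["labels"].str.split(",")

print(dev[["fname", "label_list", "path"]].head())
```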
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Techsalerator’s Location Sentiment Data for Uganda
Techsalerator’s Location Sentiment Data for Uganda offers an extensive collection of data that is crucial for businesses, researchers, and technology developers. This dataset provides deep insights into public sentiment across various locations in Uganda, enabling data-driven decision-making for development, marketing, and social research.
For access to the full dataset, contact us at info@techsalerator.com or visit Techsalerator Contact Us.
Techsalerator’s Location Sentiment Data for Uganda delivers a comprehensive analysis of public sentiment across urban, rural, and industrial locations. This dataset is essential for businesses, government agencies, and researchers looking to understand the sentiment trends in different regions of Uganda.
To obtain Techsalerator’s Location Sentiment Data for Uganda, contact info@techsalerator.com with your specific requirements. Techsalerator offers customized datasets based on requested fields, with delivery available within 24 hours. Ongoing access options can also be discussed.
For deep insights into public sentiment across Uganda, Techsalerator’s dataset is an invaluable resource for businesses, policymakers, and researchers.
https://dataintelo.com/privacy-and-policy
The global market size for data collection and labelling was estimated at USD 1.3 billion in 2023, with forecasts predicting it will reach approximately USD 7.8 billion by 2032, showcasing a robust CAGR of 20.8% during the forecast period. Several factors are driving this significant growth, including the rising adoption of artificial intelligence (AI) and machine learning (ML) across various industries, the increasing demand for high-quality annotated data, and the proliferation of data-driven decision-making processes.
One of the primary growth factors in the data collection and labelling market is the rapid advancement and integration of AI and ML technologies across various industry verticals. These technologies require vast amounts of accurately annotated data to train algorithms and improve their accuracy and efficiency. As AI and ML applications become more prevalent in sectors such as healthcare, automotive, and retail, the demand for high-quality labelled data is expected to grow exponentially. Furthermore, the increasing need for automation and the ability to extract valuable insights from large datasets are driving the adoption of data labelling services.
Another significant factor contributing to the market's growth is the rising focus on enhancing customer experiences and personalisation. Companies are leveraging data collection and labelling to gain deeper insights into customer behaviour, preferences, and trends. This enables them to develop more targeted marketing strategies, improve product recommendations, and deliver personalised services. As businesses strive to stay competitive in a rapidly evolving digital landscape, the demand for accurate and comprehensive data labelling solutions is expected to rise.
The growing importance of data privacy and security is also playing a crucial role in driving the data collection and labelling market. With the implementation of stringent data protection regulations, such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), organisations are increasingly focusing on ensuring the accuracy and integrity of their data. This has led to a greater emphasis on data labelling processes, as they help maintain data quality and compliance with regulatory requirements. Additionally, the rising awareness of the potential risks associated with biased or inaccurate data is further propelling the demand for reliable data labelling services.
Regionally, North America is expected to dominate the data collection and labelling market during the forecast period. The region's strong technological infrastructure, high adoption rate of AI and ML technologies, and the presence of major market players contribute to its leading position. Additionally, the Asia Pacific region is anticipated to witness significant growth, driven by the increasing investments in AI and ML technologies, the expanding IT and telecommunications sector, and the growing focus on digital transformation in countries such as China, India, and Japan. Europe is also expected to experience steady growth, supported by the rising adoption of AI-driven applications across various industries and the implementation of data protection regulations.
The data collection and labelling market can be segmented by data type into text, image/video, and audio. Each type has its unique applications and demands, creating diverse opportunities and challenges within the market. Text data labelling is particularly crucial for natural language processing (NLP) applications, such as chatbots, sentiment analysis, and language translation. The growing adoption of NLP technologies across various industries, including healthcare, finance, and customer service, is driving the demand for high-quality text data labelling services.
Image and video data labelling is essential for computer vision applications, such as facial recognition, object detection, and autonomous vehicles. The increasing deployment of these technologies in industries such as automotive, retail, and surveillance is fuelling the demand for accurate image and video annotation. Additionally, the growing popularity of augmented reality (AR) and virtual reality (VR) applications is further contributing to the demand for labelled image and video data. The rising need for real-time video analytics and the development of advanced visual search engines are also driving the growth of this segment.
Audio data labelling is critical for speech recognition and audio analysis appli
These are audio recordings taken by an Eclipse Soundscapes (ES) Data Collector during the week of the April 08, 2024 Total Solar Eclipse.
Data Site location information:
Latitude: 36.089254
Longitude: -92.54893
Type of Eclipse: Total Solar Eclipse
Eclipse %: 100
WAV files Time & Date Settings: Set with Automated AudioMoth Time chime
Included Data:
Audio files in WAV format with the date and time in UTC within the file name, formatted YYYYMMDD_HHMMSS (YearMonthDay_HourMinuteSecond). For example, 20240411_141600.WAV means that this audio file starts on April 11, 2024 at 14:16:00 Coordinated Universal Time (UTC); see the filename-parsing sketch after this list.
CONFIG Text file: Includes AudioMoth device setting information, such as sample rate in Hertz (Hz), gain, firmware, etc.
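Because each WAV filename encodes its UTC start time as YYYYMMDD_HHMMSS, the timestamp can be recovered directly from the name. A minimal sketch:

```python
from datetime import datetime, timezone
from pathlib import Path

def start_time_utc(wav_name: str) -> datetime:
    """Parse the UTC start time encoded in an AudioMoth WAV filename."""
    stem = Path(wav_name).stem  # e.g. "20240411_141600"
    return datetime.strptime(stem, "%Y%m%d_%H%M%S").replace(tzinfo=timezone.utc)

print(start_time_utc("20240411_141600.WAV"))  # 2024-04-11 14:16:00+00:00
```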
Eclipse Information for this location:
Eclipse Date: 04/08/2024
Eclipse Start Time (UTC): 17:35:18
Totality Start Time (UTC): [N/A if partial eclipse] 18:52:19
Eclipse Maximum (when the greatest possible amount of the Sun is blocked): 18:54:05
Totality End Time (UTC): [N/A if partial eclipse] 18:55:51
Eclipse End Time (UTC): [N/A if partial eclipse] 20:12:14
Audio Data Collection During Eclipse Week
ES Data Collectors used AudioMoth devices to record audio data, known as soundscapes, over a 5-day period during the eclipse week: 2 days before the eclipse, the day of the eclipse, and 2 days after. The complete raw audio data collected by the Data Collector at the location mentioned above is provided here. This data may or may not cover the entire requested timeframe due to factors such as availability, technical issues, or other unforeseen circumstances.
ES ID# Information:
Each AudioMoth recording device was assigned a unique Eclipse Soundscapes Identification Number (ES ID#). This identifier connects the audio data, submitted via a MicroSD card, with the latitude and longitude information provided by the data collector through an online form. The ES team used the ES ID# to link the audio data with its corresponding location information and then uploaded this raw audio data and location details to Zenodo. This process ensures the anonymity of the ES Data Collectors while allowing them to easily search for and access their audio data on Zenodo.
TimeStamp Information:
The ES team and the Data Collectors took care to set the date and time on the AudioMoth recording devices using an AudioMoth time chime before deployment, ensuring that the recordings would have an automatic timestamp. However, participants also manually noted the date and start time as a backup in case the time chime setup failed. The notes above indicate whether the WAV audio files for this site were timestamped manually or with the automated AudioMoth time chime.
Common Timestamp Error:
Some AudioMoth devices experienced a malfunction where the timestamp on audio files reverted to a date in 1970 or before, even after initially recording correctly. Despite this issue, the affected data was still included in this ES site’s collected raw audio dataset.
Latitude & Longitude Information:
The latitude and longitude for each site were recorded manually by data collectors and submitted to the ES team, either via a web form or on paper. They are shared in Decimal Degrees format.
General Project Information:
The Eclipse Soundscapes Project is a NASA Volunteer Science project funded by NASA Science Activation that is studying how eclipses affect life on Earth during the October 14, 2023 annular solar eclipse and the April 8, 2024 total solar eclipse. Eclipse Soundscapes revisits an eclipse study from almost 100 years ago that showed that animals and insects are affected by solar eclipses! Like this study from 100 years ago, ES asked for the public's help. ES uses modern technology to continue to study how solar eclipses affect life on Earth! You can learn more at www.EclipseSoundscapes.org.
Eclipse Soundscapes is an enterprise of ARISA Lab, LLC and is supported by NASA award No. 80NSSC21M0008.
Eclipse Data Version Definitions
{1st digit = year, 2nd digit = eclipse type (1 = Total Solar Eclipse, 9 = Annular Solar Eclipse, 0 = Partial Solar Eclipse), 3rd digit is unused and reserved for future use} (see the decoding sketch after this list)
2023.9.0 = Week of October 14, 2023 Annular Eclipse Audio Data, Path of Annularity (Annular Eclipse)
2023.0.0 = Week of October 14, 2023 Annular Eclipse Audio Data, OFF the Path of Annularity (Partial Eclipse)
2024.1.0 = Week of April 8, 2024 Total Solar Eclipse Audio Data, Path of Totality (Total Solar Eclipse)
2024.0.0 = Week of April 8, 2024 Total Solar Eclipse Audio Data, OFF the Path of Totality (Partial Solar Eclipse)
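A small sketch of how the version string can be decoded under this scheme (the digit-to-type mapping is taken directly from the definitions listed above):

```python
ECLIPSE_TYPE = {
    "1": "Total Solar Eclipse",
    "9": "Annular Solar Eclipse",
    "0": "Partial Solar Eclipse",
}

def decode_version(version: str) -> dict:
    """Decode an Eclipse Soundscapes dataset version such as '2024.1.0'."""
    year, eclipse_code, _unused = version.split(".")
    return {"year": int(year), "eclipse_type": ECLIPSE_TYPE[eclipse_code]}

print(decode_version("2024.1.0"))  # {'year': 2024, 'eclipse_type': 'Total Solar Eclipse'}
```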
*Please note that this dataset's version number is listed below.
Individual Site Citation: APA Citation (7th edition)
ARISA Lab, L.L.C., Winter, H., Severino, M., & Volunteer Scientist. (2025). 2024 solar eclipse soundscapes audio data [Audio dataset, ES ID# 504]. Zenodo. {Insert DOI}. Collected by volunteer scientists as part of the Eclipse Soundscapes Project. This project is supported by NASA award No. 80NSSC21M0008.
Eclipse Community Citation
ARISA Lab, L.L.C., Winter, H., Severino, M., & Volunteer Scientists. 2023 and 2024 solar eclipse soundscapes audio data [Collection of audio datasets]. Eclipse Soundscapes Community, Zenodo. https://zenodo.org/communities/eclipsesoundscapes/. Collected by volunteer scientists as part of the Eclipse Soundscapes Project. This project is supported by NASA award No. 80NSSC21M0008.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These are audio recordings taken by an Eclipse Soundscapes (ES) Data Collector during the week of the April 08, 2024 Total Solar Eclipse.
Data Site location information:
Latitude: 34.53804
Longitude: -93.03621
Type of Eclipse: Total Solar Eclipse
Eclipse %: 100
WAV files Time & Date Settings: Set with Automated AudioMoth Time chime
Included Data:
Audio files in WAV format with the date and time in UTC within the file name, formatted YYYYMMDD_HHMMSS (YearMonthDay_HourMinuteSecond). For example, 20240411_141600.WAV means that this audio file starts on April 11, 2024 at 14:16:00 Coordinated Universal Time (UTC).
CONFIG Text file: Includes AudioMoth device setting information, such as sample rate in Hertz (Hz), gain, firmware, etc.
Eclipse Information for this location:
Eclipse Date: 04/08/2024
Eclipse Start Time (UTC): 17:31:56
Totality Start Time (UTC): [N/A if partial eclipse] 18:49:24
Eclipse Maximum (when the greatest possible amount of the Sun is blocked): 18:51:15
Totality End Time (UTC): [N/A if partial eclipse] 18:53:05
Eclipse End Time (UTC): [N/A if partial eclipse] 20:10:10
Audio Data Collection During Eclipse Week
ES Data Collectors used AudioMoth devices to record audio data, known as soundscapes, over a 5-day period during the eclipse week: 2 days before the eclipse, the day of the eclipse, and 2 days after. The complete raw audio data collected by the Data Collector at the location mentioned above is provided here. This data may or may not cover the entire requested timeframe due to factors such as availability, technical issues, or other unforeseen circumstances.
ES ID# Information:
Each AudioMoth recording device was assigned a unique Eclipse Soundscapes Identification Number (ES ID#). This identifier connects the audio data, submitted via a MicroSD card, with the latitude and longitude information provided by the data collector through an online form. The ES team used the ES ID# to link the audio data with its corresponding location information and then uploaded this raw audio data and location details to Zenodo. This process ensures the anonymity of the ES Data Collectors while allowing them to easily search for and access their audio data on Zenodo.
TimeStamp Information:
The ES team and the Data Collectors took care to set the date and time on the AudioMoth recording devices using an AudioMoth time chime before deployment, ensuring that the recordings would have an automatic timestamp. However, participants also manually noted the date and start time as a backup in case the time chime setup failed. The notes above indicate whether the WAV audio files for this site were timestamped manually or with the automated AudioMoth time chime.
Common Timestamp Error:
Some AudioMoth devices experienced a malfunction where the timestamp on audio files reverted to a date in 1970 or before, even after initially recording correctly. Despite this issue, the affected data was still included in this ES site’s collected raw audio dataset.
Latitude & Longitude Information:
The latitude and longitude for each site were recorded manually by data collectors and submitted to the ES team, either via a web form or on paper. They are shared in Decimal Degrees format.
General Project Information:
The Eclipse Soundscapes Project is a NASA Volunteer Science project funded by NASA Science Activation that is studying how eclipses affect life on Earth during the October 14, 2023 annular solar eclipse and the April 8, 2024 total solar eclipse. Eclipse Soundscapes revisits an eclipse study from almost 100 years ago that showed that animals and insects are affected by solar eclipses! Like this study from 100 years ago, ES asked for the public's help. ES uses modern technology to continue to study how solar eclipses affect life on Earth! You can learn more at www.EclipseSoundscapes.org.
Eclipse Soundscapes is an enterprise of ARISA Lab, LLC and is supported by NASA award No. 80NSSC21M0008.
Eclipse Data Version Definitions
{1st digit = year, 2nd digit = eclipse type (1 = Total Solar Eclipse, 9 = Annular Solar Eclipse, 0 = Partial Solar Eclipse), 3rd digit is unused and reserved for future use}
2023.9.0 = Week of October 14, 2023 Annular Eclipse Audio Data, Path of Annularity (Annular Eclipse)
2023.0.0 = Week of October 14, 2023 Annular Eclipse Audio Data, OFF the Path of Annularity (Partial Eclipse)
2024.1.0 = Week of April 8, 2024 Total Solar Eclipse Audio Data, Path of Totality (Total Solar Eclipse)
2024.0.0 = Week of April 8, 2024 Total Solar Eclipse Audio Data, OFF the Path of Totality (Partial Solar Eclipse)
*Please note that this dataset's version number is listed below.
Individual Site Citation: APA Citation (7th edition)
ARISA Lab, L.L.C., Winter, H., Severino, M., & Volunteer Scientist. (2025). 2024 solar eclipse soundscapes audio data [Audio dataset, ES ID# 001]. Zenodo.{Insert DOI}. Collected by volunteer scientists as part of the Eclipse Soundscapes Project. This project is supported by NASA award No. 80NSSC21M0008.
Eclipse Community Citation
ARISA Lab, L.L.C., Winter, H., Severino, M., & Volunteer Scientists. 2023 and 2024 solar eclipse soundscapes audio data [Collection of audio datasets]. Eclipse Soundscapes Community, Zenodo. https://zenodo.org/communities/eclipsesoundscapes/. Collected by volunteer scientists as part of the Eclipse Soundscapes Project. This project is supported by NASA award No. 80NSSC21M0008.
Techsalerator’s Location Sentiment Data for Palestine State
Techsalerator’s Location Sentiment Data for Palestine State offers a detailed collection of insights vital for businesses, researchers, and technology developers. This dataset provides in-depth information about the emotional sentiment across different regions, capturing the mood and opinions of people in various environments within Palestine State.
For access to the full dataset, contact us at info@techsalerator.com or visit Techsalerator Contact Us.
Techsalerator’s Location Sentiment Data for Palestine State delivers a thorough analysis of sentiment across urban, rural, and industrial locations. This dataset is crucial for AI development, social studies, marketing strategies, and telecommunications.
To obtain Techsalerator’s Location Sentiment Data for Palestine State, contact info@techsalerator.com with your specific requirements. Techsalerator provides customized datasets based on requested fields, with delivery available within 24 hours. Ongoing access options can also be discussed.
For in-depth insights into sentiment trends and regional opinions in Palestine State, Techsalerator’s dataset is an invaluable resource for researchers, policymakers, marketers, and urban developers.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Techsalerator’s Location Sentiment Data for Vanuatu
Techsalerator’s Location Sentiment Data for Vanuatu provides a detailed collection of data, offering crucial insights for businesses, researchers, and technology developers. This dataset delivers a comprehensive analysis of public sentiment and environmental conditions across different regions of Vanuatu, helping to understand local opinions, behaviors, and perceptions.
For access to the full dataset, contact us at info@techsalerator.com or visit Techsalerator Contact Us.
Techsalerator’s Location Sentiment Data for Vanuatu offers an in-depth analysis of public sentiment across urban, rural, and remote locations. This data is essential for market research, tourism development, social studies, and governmental decision-making.
To obtain Techsalerator’s Location Sentiment Data for Vanuatu, contact info@techsalerator.com with your specific requirements. Techsalerator provides customized datasets based on requested fields, with delivery available within 24 hours. Ongoing access options can also be discussed.
Techsalerator’s dataset is an invaluable resource for businesses, governments, and researchers seeking to understand public sentiment in Vanuatu. It provides actionable insights for decision-making, policy development, and market strategies.
Overview
The Broadcast Audio Fingerprinting (BAF) dataset is an open, annotated dataset, available upon request, for the task of music monitoring in broadcast. It contains 2,000 tracks from Epidemic Sound's private catalogue as reference tracks, representing 74 hours of audio. As queries, it contains over 57 hours of TV broadcast audio from 23 countries and 203 channels, distributed across 3,425 one-minute audio excerpts.
It has been annotated by six annotators in total, and each query has been cross-annotated by three of them, obtaining high inter-annotator agreement percentages, which validates the annotation methodology and ensures the reliability of the annotations.
Purpose of the dataset
This dataset aims to become the standard dataset to evaluate Audio Fingerprinting algorithms since it’s built on real data, without the use of any data-augmentation techniques. It is also the first dataset to address background music fingerprinting, which is a real problem in royalties distribution.
Dataset use
This dataset is available for conducting non-commercial research related to audio analysis. It shall not be used for music generation or music synthesis.
About the data
All audio files are monophonic, 8 kHz, 128 kb/s, pcm_s16le-encoded WAV. Annotations mark which reference tracks sound (either in the foreground or the background) in each query, if any, along with the specific times at which each track starts and stops sounding in the query.
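If other recordings need to be brought into the same format (mono, 8 kHz, 16-bit PCM WAV) before being compared against the references, one common route is ffmpeg. The flags below are standard ffmpeg options; the input and output paths are hypothetical.

```python
import subprocess

def to_baf_format(src: str, dst: str) -> None:
    """Convert an audio file to mono, 8 kHz, pcm_s16le WAV (the BAF format)."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", "8000",
         "-c:a", "pcm_s16le", dst],
        check=True,
    )

to_baf_format("my_recording.mp3", "my_recording_8k_mono.wav")
```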
Note that there are 88 queries that do not have any matches.
For more information check the dedicated Github repository: https://github.com/guillemcortes/baf-dataset and the dataset datasheet included in the files.
Dataset contents
The dataset is structured according to the following schema:
baf-dataset/
├── baf_datasheet.pdf
├── annotations.csv
├── changelog.md
├── cross_annotations.csv
├── queries
│ ├── query_0001.wav
│ ├── query_0002.wav
│ ├── …
│ └── query_3425.wav
├── queries_info.csv
└── references
├── ref_0001.wav
├── ref_0002.wav
├── …
└── ref_2000.wav
There are two folders named queries and references containing the wav files of TV broadcast recordings and the reference tracks, respectively.
annotations.csv contains the annotations made by the six annotators, with the following columns:

| query          | reference    | query_start | query_end | annotator   |
|----------------|--------------|-------------|-----------|-------------|
| query_0692.wav | ref_1235.wav | 0.0         | 59.904    | annotator_6 |
cross_annotations.csv contains the resulting annotations after merging the overlapping annotations in the annotations.csv file. x_tag has three different values:

- single: the segment has been annotated by only one annotator.
- majority: the segment has been annotated by two annotators.
- unanimity: the segment has been annotated by all three annotators.

| query          | reference    | query_start | query_end | annotators                                    | x_tag     |
|----------------|--------------|-------------|-----------|-----------------------------------------------|-----------|
| query_0693.wav | ref_1834.wav | 37.53       | 38.07     | ['annotator_3']                               | single    |
| query_0693.wav | ref_1834.wav | 18.18       | 37.48     | ['annotator_3', 'annotator_5', 'annotator_3'] | unanimity |
| query_0693.wav | ref_1834.wav | 37.48       | 37.53     | ['annotator_5', 'annotator_3']                | majority  |
queries_info.csv contains information about the queries for citation purposes: the country, the channel and the date and time when the broadcast took place.

| filename       | country | channel           | datetime            |
|----------------|---------|-------------------|---------------------|
| query_0001.wav | Norway  | Discovery Channel | 2021-02-26 14:45:26 |
changelog.md contains a curated, chronologically ordered list of notable changes for each version of the dataset.
baf_datasheet.pdf contains standardized documentation (a datasheet) for the dataset.
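The CSV files listed above can be loaded and combined with pandas along these lines. Column names are taken from the tables shown in this section, the paths assume the directory layout above, and the x_tag filter keeps only the segments all three annotators agreed on.

```python
import pandas as pd

annotations = pd.read_csv("baf-dataset/annotations.csv")
cross = pd.read_csv("baf-dataset/cross_annotations.csv")
queries_info = pd.read_csv("baf-dataset/queries_info.csv")

# Keep only segments confirmed by all three annotators.
unanimous = cross[cross["x_tag"] == "unanimity"]

# Attach broadcast metadata (country, channel, datetime) to each query match.
matches = unanimous.merge(queries_info, left_on="query", right_on="filename")

print(matches[["query", "reference", "query_start", "query_end",
               "country", "channel"]].head())
```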
Ownership of the data
Next, we specify the ownership of all the data included in BAF: Broadcast Audio Fingerprinting dataset. For licensing information, please refer to the “License” section.
Reference tracks
The reference tracks are owned by Epidemic Sound AB, which has given a worldwide, revocable, non-exclusive, royalty-free licence to use and reproduce this data collection consisting of 2,000 low-quality monophonic 8kHz downsampled audio recordings.
Query tracks
The query tracks come from publicly available TV broadcast emissions so the ownership of each recording belongs to the channel that emitted the content. We publish them under the right of quotation provided by the Berne Convention.
Annotations
Guillem Cortès, together with Alex Ciurana and Emilio Molina from BMAT Music Licensing S.L., managed the annotation process; therefore, the annotations belong to BMAT.
Accessing the dataset
The dataset is available upon request. Please include, in the justification field, your academic affiliation (if you have one) and a brief description of your research topics and why you would like to use this dataset. Bear in mind that this information is important for the evaluation of every access request.
License
This dataset is available for conducting non-commercial research related to audio analysis. It shall not be used for music generation or music synthesis. Given the different ownership of the elements of the dataset, the dataset is licensed under the following conditions:
- User's access request
- Research only, non-commercial purposes
- No adaptations nor derivative works
- Attribution to Epidemic Sound and the authors, as indicated in the “citation” section.
Please include, in the justification field, your academic affiliation (if you have one) and a brief description of your research topics and why you would like to use this dataset.
Acknowledgments
With the support of the Ministerio de Ciencia, Innovación y Universidades through the Retos-Colaboración call (reference: RTC2019-007248-7), and also with the support of the Industrial Doctorates Plan of the Secretariat of Universities and Research of the Department of Business and Knowledge of the Generalitat de Catalunya (reference: DI46-2020).