15 datasets found
  1. big_bench_audio

    • huggingface.co
    Updated Dec 20, 2024
    Cite
    Artificial Analysis (2024). big_bench_audio [Dataset]. https://huggingface.co/datasets/ArtificialAnalysis/big_bench_audio
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 20, 2024
    Dataset authored and provided by
    Artificial Analysis
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Artificial Analysis Big Bench Audio

      Dataset Summary
    

    Big Bench Audio is an audio version of a subset of Big Bench Hard questions. The dataset can be used for evaluating the reasoning capabilities of models that support audio input. The dataset includes 1000 audio recordings for all questions from the following Big Bench Hard categories. Descriptions are taken from Suzgun et al. (2022):

    Formal Fallacies Syllogisms Negation (Formal Fallacies) - 250 questions Given a context… See the full description on the dataset page: https://huggingface.co/datasets/ArtificialAnalysis/big_bench_audio.
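    For models evaluated through the Hugging Face `datasets` library, loading this benchmark might look like the sketch below. The dataset ID comes from the listing above; the split name and the `category` field are assumptions, not confirmed by the dataset card.

```python
def load_big_bench_audio(split="train"):
    """Fetch Big Bench Audio from the Hugging Face Hub.

    Requires the `datasets` package and network access; the split
    name is an assumption.
    """
    from datasets import load_dataset  # deferred import: heavy and network-bound
    return load_dataset("ArtificialAnalysis/big_bench_audio", split=split)


def count_by_category(rows, key="category"):
    """Tally rows per Big Bench Hard category (field name assumed)."""
    counts = {}
    for row in rows:
        counts[row[key]] = counts.get(row[key], 0) + 1
    return counts
```

    With 250 questions per category, `count_by_category` over the full set should report four categories of 250 rows each, if the assumed field name is right.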

  2. CMI-bench

    • huggingface.co
    Updated Jun 18, 2025
    Cite
    Yinghao Ma (2025). CMI-bench [Dataset]. https://huggingface.co/datasets/nicolaus625/CMI-bench
    Explore at:
    Dataset updated
    Jun 18, 2025
    Authors
    Yinghao Ma
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for Test-Audio of CMI-Bench

    CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following
    🔗 Paper (arXiv) | 🧪 Evaluation Toolkit | 📊 License: CC BY-NC 4.0

      Dataset Summary
    

    The CMI-Bench/test-audio dataset provides the complete test split audio files used in the CMI-Bench benchmark. CMI-Bench evaluates the instruction-following capabilities of audio-text large language models (LLMs) on a wide range of Music Information Retrieval (MIR) tasks.… See the full description on the dataset page: https://huggingface.co/datasets/nicolaus625/CMI-bench.

  3. adu-bench

    • huggingface.co
    Updated Jun 26, 2024
    Cite
    Anonymous09 (2024). adu-bench [Dataset]. https://huggingface.co/datasets/adu-bench/adu-bench
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 26, 2024
    Authors
    Anonymous09
    Description

    ADU-Bench: Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models

    If you use ADU-Bench in your project, please kindly cite:

        @article{adubench2025,
          title={Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models},
          author={Anonymous ACL submission},
          journal={Under Review},
          year={2025}
        }

  4. blab_long_audio

    • huggingface.co
    Updated Jun 12, 2025
    Cite
    Orevaoghene Ahia (2025). blab_long_audio [Dataset]. https://huggingface.co/datasets/oreva/blab_long_audio
    Explore at:
    Dataset updated
    Jun 12, 2025
    Authors
    Orevaoghene Ahia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    BLAB: Brutally Long Audio Bench

      Dataset Summary
    

    Brutally Long Audio Bench (BLAB) is a challenging long-form audio benchmark that evaluates audio LMs on localization, duration estimation, emotion, and counting tasks using audio segments averaging 51 minutes in length. BLAB consists of 833+ hours of diverse, full-length YouTube audio clips, each paired with human-annotated, text-based natural language questions and answers. Our audio data were collected from permissively… See the full description on the dataset page: https://huggingface.co/datasets/oreva/blab_long_audio.

  5. WildSpeech-Bench

    • huggingface.co
    Updated Jul 2, 2025
    Cite
    Tencent (2025). WildSpeech-Bench [Dataset]. https://huggingface.co/datasets/tencent/WildSpeech-Bench
    Explore at:
    Dataset updated
    Jul 2, 2025
    Dataset authored and provided by
    Tencent (https://tencent.com/)
    Description

    WildSpeech-Bench: Benchmarking Audio LLMs in Natural Speech Conversation

    🤗 Dataset | 🐙 GitHub | 📖 arXiv

    This repository contains the evaluation code for the paper "WildSpeech-Bench: Benchmarking Audio LLMs in Natural Speech Conversation".

      🔔 Introduction
    

    WildSpeech-Bench is the first end-to-end, systematic benchmark for evaluating the capabilities of audio-to-audio speech dialogue models. The dataset is designed with three key features:

    Realistic and Diverse… See the full description on the dataset page: https://huggingface.co/datasets/tencent/WildSpeech-Bench.

  6. Bench-top Psophometer Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Jun 28, 2025
    Cite
    Archive Market Research (2025). Bench-top Psophometer Report [Dataset]. https://www.archivemarketresearch.com/reports/bench-top-psophometer-210668
    Explore at:
    ppt, pdf, doc (available download formats)
    Dataset updated
    Jun 28, 2025
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The bench-top psophometer market is experiencing robust growth, driven by increasing demand across various sectors. While precise figures for market size and CAGR were not provided, a reasonable estimation, considering the involvement of established players like Siemens and Keysight Technologies and the consistent need for precision noise measurement in industries like telecommunications and audio engineering, suggests a market size of approximately $250 million in 2025. Considering typical growth trends in specialized testing equipment markets, a conservative Compound Annual Growth Rate (CAGR) of 7% is estimated for the forecast period (2025-2033).

    This growth is fueled by several key drivers, including the rising adoption of 5G networks necessitating stringent noise level testing, advancements in audio technology demanding high-fidelity measurements, and a growing focus on regulatory compliance for electromagnetic interference (EMI) and noise emissions. The market is segmented by application (telecommunications, audio testing, industrial quality control, etc.) and geography, with North America and Europe currently holding significant market share. However, emerging economies in Asia-Pacific are expected to witness rapid growth owing to increased infrastructure development and industrialization.

    The competitive landscape is characterized by the presence of both established industry giants and specialized manufacturers. Key players are focusing on product innovation, strategic partnerships, and expanding their global reach to maintain their market position. Future growth will depend on continuous technological advancements such as improved accuracy, enhanced functionality, and the integration of smart features. Factors like the high initial investment cost of these instruments and the potential for substitute technologies could pose challenges to market expansion. However, the long-term outlook for the bench-top psophometer market remains positive, reflecting the increasing importance of precise noise level measurements in various applications.

  7. Data from: Creating a multi-track classical music performance dataset for...

    • datasetcatalog.nlm.nih.gov
    • search.dataone.org
    • +2more
    Updated Mar 20, 2019
    Cite
    Dinesh, Karthik; Liu, Xinzhao; Sharma, Gaurav; Li, Bochen; Duan, Zhiyao (2019). Creating a multi-track classical music performance dataset for multi-modal music analysis: challenges, insights, and applications [Dataset]. http://doi.org/10.5061/dryad.ng3r749
    Explore at:
    Dataset updated
    Mar 20, 2019
    Authors
    Dinesh, Karthik; Liu, Xinzhao; Sharma, Gaurav; Li, Bochen; Duan, Zhiyao
    Description

    We introduce a dataset for facilitating audio-visual analysis of musical performances. The dataset comprises 44 simple multi-instrument classical music pieces assembled from coordinated but separately recorded performances of individual tracks. For each piece, we provide the musical score in MIDI format, the audio recordings of the individual tracks, the audio and video recording of the assembled mixture, and ground-truth annotation files including frame-level and note-level transcriptions. We describe our methodology for the creation of the dataset, particularly highlighting our approaches for addressing the challenges involved in maintaining synchronization and expressiveness. We demonstrate the high quality of synchronization achieved with our proposed approach by comparing the dataset against existing widely-used music audio datasets. We anticipate that the dataset will be useful for the development and evaluation of existing music information retrieval (MIR) tasks, as well as for novel multi-modal tasks. We benchmark two existing MIR tasks (multi-pitch analysis and score-informed source separation) on the dataset and compare against other existing music audio datasets. Additionally, we consider two novel multi-modal MIR tasks (visually informed multi-pitch analysis and polyphonic vibrato analysis) enabled by the dataset and provide evaluation measures and baseline systems for future comparisons (from our recent work). Finally, we propose several emerging research directions that the dataset enables.

  8. MusicBench

    • huggingface.co
    Updated Nov 16, 2023
    + more versions
    Cite
    AMAAI Lab (2023). MusicBench [Dataset]. https://huggingface.co/datasets/amaai-lab/MusicBench
    Explore at:
    Dataset updated
    Nov 16, 2023
    Dataset authored and provided by
    AMAAI Lab
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    MusicBench Dataset

    The MusicBench dataset is a music audio-text pair dataset designed for text-to-music generation and released alongside the Mustango text-to-music model. MusicBench is based on the MusicCaps dataset, expanding it from 5,521 samples to 52,768 training and 400 test samples.

      Dataset Details
    

    MusicBench expands MusicCaps by:

    Including music features of chords, beats, tempo, and key that are extracted from the audio. Describing these music… See the full description on the dataset page: https://huggingface.co/datasets/amaai-lab/MusicBench.

  9. SynSpeech Dataset (Small Version)

    • figshare.com
    csv
    Updated Nov 7, 2024
    + more versions
    Cite
    Yusuf Brima (2024). SynSpeech Dataset (Small Version) [Dataset]. http://doi.org/10.6084/m9.figshare.27627840.v1
    Explore at:
    csv (available download formats)
    Dataset updated
    Nov 7, 2024
    Dataset provided by
    figshare
    Authors
    Yusuf Brima
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The SynSpeech Dataset (Small Version) is an English-language synthetic speech dataset created using OpenVoice and LibriSpeech-100 for benchmarking disentangled speech representation learning methods. It includes 50 unique speakers, each with 500 distinct sentences spoken in a “default” style at a 16kHz sampling rate. Data is organized by speaker ID, with a synspeech_Small_Metadata.csv file detailing speaker information, gender, speaking style, text, and file paths. This dataset is ideal for tasks in representation learning, speaker and content factorization, and TTS synthesis.
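    Given the metadata layout described above, the CSV can be read with the Python standard library. A minimal sketch follows; the column names are illustrative guesses based on the description (speaker ID, gender, speaking style, text, file path), not the file's actual header.

```python
import csv
import io

def read_synspeech_metadata(fileobj):
    """Parse a SynSpeech-style metadata CSV into a list of row dicts.

    Column names are hypothetical, inferred from the dataset description.
    """
    return list(csv.DictReader(fileobj))

# Inline sample standing in for synspeech_Small_Metadata.csv:
sample = io.StringIO(
    "speaker_id,gender,style,text,path\n"
    "spk_001,F,default,Hello world,spk_001/utt_0001.wav\n"
)
rows = read_synspeech_metadata(sample)
```

    Pointing `read_synspeech_metadata` at the real file (opened with `newline=""`) would yield one dict per utterance, keyed by whatever the actual header row contains.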

  10. SAVVY-Bench

    • huggingface.co
    Updated Jul 16, 2025
    Cite
    Zijun Cui (2025). SAVVY-Bench [Dataset]. https://huggingface.co/datasets/ZijunCui/SAVVY-Bench
    Explore at:
    Dataset updated
    Jul 16, 2025
    Authors
    Zijun Cui
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    SAVVY-Bench

    This repository contains SAVVY-Bench, the first benchmark for dynamic 3D spatial reasoning in audio-visual environments, introduced in SAVVY: Spatial Awareness via Audio-Visual LLMs through Seeing and Hearing.

      SAVVY-Bench Dataset
    

    The benchmark dataset is also available on Hugging Face:

        from datasets import load_dataset
        dataset = load_dataset("ZijunCui/SAVVY-Bench")

    This repository provides both the benchmark data and tools to… See the full description on the dataset page: https://huggingface.co/datasets/ZijunCui/SAVVY-Bench.

  11. TTA-Bench

    • huggingface.co
    Cite
    hui wang, TTA-Bench [Dataset]. https://huggingface.co/datasets/Hui519/TTA-Bench
    Explore at:
    Authors
    hui wang
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    TTA-Bench Dataset

      🎯 Overview
    

    Welcome to TTA-Bench! This repository contains our comprehensive evaluation framework for text-to-audio (TTA) systems. We've carefully curated 2,999 prompts across six different evaluation dimensions, creating a standardized benchmark for assessing text-to-audio generation capabilities.

      📚 Dataset Structure
    

    Each prompt in our dataset contains these essential fields:

    id: Unique identifier for each prompt (format: prompt_XXXX)… See the full description on the dataset page: https://huggingface.co/datasets/Hui519/TTA-Bench.
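    The `prompt_XXXX` id format described above can be checked mechanically. A small sketch, assuming `XXXX` is a zero-padded four-digit number (the description only gives the `prompt_XXXX` pattern, so the digit width is a guess):

```python
import re

# Pattern for TTA-Bench prompt ids of the form prompt_XXXX, where XXXX is
# assumed to be a zero-padded four-digit number (e.g. prompt_0042).
PROMPT_ID = re.compile(r"^prompt_\d{4}$")

def is_valid_prompt_id(prompt_id: str) -> bool:
    """Return True if the id matches the assumed prompt_XXXX format."""
    return bool(PROMPT_ID.match(prompt_id))
```

    With 2,999 prompts, four digits comfortably cover the assumed range (prompt_0001 through prompt_2999).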

  12. AIR-Bench-Dataset

    • huggingface.co
    Updated May 15, 2024
    Cite
    qianyang (2024). AIR-Bench-Dataset [Dataset]. https://huggingface.co/datasets/qyang1021/AIR-Bench-Dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 15, 2024
    Authors
    qianyang
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    AIR-Bench

    arXiv: https://arxiv.org/html/2402.07729v1. This is the AIR-Bench dataset download page. AIR-Bench encompasses two dimensions: foundation and chat benchmarks. The former consists of 19 tasks with approximately 19k single-choice questions; the latter contains 2k instances of open-ended question-and-answer data. For how to run AIR-Bench, please refer to the AIR-Bench GitHub page (https://github.com/OFA-Sys/AIR-Bench) (will be public soon).

      Data Sources (all come from… See the full description on the dataset page: https://huggingface.co/datasets/qyang1021/AIR-Bench-Dataset.
    
  13. CSEU-Bench

    • huggingface.co
    Updated May 23, 2025
    Cite
    Qiuchi Li (2025). CSEU-Bench [Dataset]. https://huggingface.co/datasets/smart9/CSEU-Bench
    Explore at:
    Dataset updated
    May 23, 2025
    Authors
    Qiuchi Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Chinese Speech Emotional Understanding Benchmark (CSEU-Bench)

    The benchmark aims to evaluate the ability to understand psycho-linguistic emotion labels in Chinese speech. It contains Chinese speech audio with diverse syntactic structures, along with 83 psycho-linguistic emotion entities as classification labels.

    Github: https://github.com/qiuchili/CSEU-Bench

      CSEU-Bench Components:
    

    CSEU-Bench.csv: all speech samples CSEU-monosyllabic.csv: speech samples with… See the full description on the dataset page: https://huggingface.co/datasets/smart9/CSEU-Bench.

  14. XMAD-Bench

    • huggingface.co
    Updated Jun 21, 2025
    Cite
    Department of Computer Science, University of Bucharest (2025). XMAD-Bench [Dataset]. https://huggingface.co/datasets/unibuc-cs/XMAD-Bench
    Explore at:
    Dataset updated
    Jun 21, 2025
    Dataset authored and provided by
    Department of Computer Science, University of Bucharest
    Description

    XMAD-Bench: Cross-Domain Multilingual Audio Deepfake Benchmark

      by Ioan-Paul Ciobanu, Andrei-Iulian Hiji, Nicolae-Catalin Ristea, Paul Irofti, Cristian Rusu, Radu Tudor Ionescu

      License

    The source code and models are released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

      Reference
    

    If you use this dataset or code in your research, please cite the corresponding paper:

    Ioan-Paul Ciobanu… See the full description on the dataset page: https://huggingface.co/datasets/unibuc-cs/XMAD-Bench.

  15. AV_Odyssey_Bench

    • huggingface.co
    Updated Nov 24, 2024
    + more versions
    Cite
    AV-Odyssey Bench (2024). AV_Odyssey_Bench [Dataset]. https://huggingface.co/datasets/AV-Odyssey/AV_Odyssey_Bench
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 24, 2024
    Dataset authored and provided by
    AV-Odyssey Bench
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Official dataset for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?". 🌟 For more details, please refer to the project page with data examples: https://av-odyssey.github.io/. [🌐 Webpage] [📖 Paper] [🤗 Huggingface AV-Odyssey Dataset] [🤗 Huggingface Deaftest Dataset] [🏆 Leaderboard]

      🔥 News
    

    2024.11.24 🌟 We release AV-Odyssey, the first-ever comprehensive evaluation benchmark to explore whether MLLMs really understand audio-visual… See the full description on the dataset page: https://huggingface.co/datasets/AV-Odyssey/AV_Odyssey_Bench.

