100+ datasets found
  1. Speaker Recognition - CMU ARCTIC

    • kaggle.com
    Updated Nov 21, 2022
    Cite
    Gabriel Lins (2022). Speaker Recognition - CMU ARCTIC [Dataset]. https://www.kaggle.com/datasets/mrgabrielblins/speaker-recognition-cmu-arctic
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 21, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Gabriel Lins
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description
    • Can you predict which speaker is talking?
    • Can you predict what they are saying?

    This dataset makes both possible. It is well suited to a school project, research project, or resume builder.

    File information

    • train.csv - contains all the data you need for training, with 4 columns: id (file id), file_path (path to the .wav file), speech (transcription of the audio file), and speaker (target column)
    • test.csv - contains all the data you need to test your model (20% of the total audio files); it has the same columns as train.csv
    • train/ - folder with the training data, subdivided into one folder per speaker
      • aew/ - folder containing audio files in .wav format for speaker aew
      • ...
    • test/ - folder containing the audio files for the test data

    Column description

    Column       Description
    id           file id (string)
    file_path    path to the .wav file (string)
    speech       transcription of the audio file (string)
    speaker      speaker name; use this as the target variable for audio classification (string)
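
    As a quick illustration of how these files fit together (a minimal sketch, assuming the dataset has been downloaded into the working directory; the column names follow the table above):

      import pandas as pd

      # Load the metadata described above and inspect the class balance.
      train = pd.read_csv("train.csv")   # columns: id, file_path, speech, speaker
      test = pd.read_csv("test.csv")

      print(train.shape, test.shape)
      print(train["speaker"].value_counts())  # clips per speaker (the target column)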

    More Details

    The CMU_ARCTIC databases were constructed at the Language Technologies Institute at Carnegie Mellon University as phonetically balanced, US-English single-speaker databases designed for unit selection speech synthesis research. A detailed report on the structure and content of the database and the recording environment is available as Carnegie Mellon University Language Technologies Institute Tech Report CMU-LTI-03-177.

    The databases consist of around 1150 utterances carefully selected from out-of-copyright texts from Project Gutenberg. The databases include US English male (bdl) and female (slt) speakers (both experienced voice talent) as well as other accented speakers.

    The 1,132-sentence prompt list is available from cmuarctic.data.

    The distributions include 16 kHz waveforms and simultaneous EGG signals. Full phonetic labeling was performed with CMU Sphinx using the FestVox-based labeling scripts. Complete runnable Festival voices are included with the database distributions as examples, though better voices can be built by improving the labeling.

    Acknowledgements

    This work was partially supported by the U.S. National Science Foundation under Grant No. 0219687, "ITR/CIS Evaluation and Personalization of Synthetic Voices". Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

  2. Data from: ASR database ARTUR 1.0 (audio)

    • live.european-language-grid.eu
    binary format
    Updated Feb 26, 2023
    + more versions
    Cite
    (2023). ASR database ARTUR 1.0 (audio) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/21520
    Explore at:
    Available download formats: binary format
    Dataset updated
    Feb 26, 2023
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Artur 1.0 is a speech database designed for the needs of automatic speech recognition for the Slovenian language. The database includes 1,067 hours of speech: 884 hours are transcribed, while the remaining 183 hours are recordings only. This repository entry includes the audio files only; the transcriptions are available at http://hdl.handle.net/11356/1772.

    The data are structured as follows:

    (1) Artur-B, read speech, 573 hours in total. It includes:
    • (1a) Artur-B-Brani, 485 hours: readings of sentences pre-selected from a 10% increment of the Gigafida 2.0 corpus. The sentences were chosen so that they reflect the natural, i.e. actual, distribution of triphones in the words. They were distributed among 1,000 speakers, so that approx. 30 minutes of read speech were recorded from each speaker. The speakers were balanced by gender, age and region, and a small proportion were non-native speakers of Slovene. Each sentence is its own audio file and has a corresponding transcription file.
    • (1b) Artur-B-Crkovani, 10 hours: spellings. Speakers were asked to spell abbreviations and personal names and surnames, all chosen so that all Slovene letters were covered, plus the most common foreign letters.
    • (1c) Artur-B-Studio, 51 hours: designed for the development of speech synthesis. The sentences were read in a studio by a single speaker. Each sentence is its own audio file and has a corresponding transcription file.
    • (1d) Artur-B-Izloceno, 27 hours: recordings containing different types of errors, typically incorrect reading of sentences or a noisy environment.

    (2) Artur-J, public speech, 62 hours in total. It includes: (2a) Artur-J-Splosni, 62 hours: media recordings, online recordings of conferences, workshops, education videos, etc.

    (3) Artur-N, private speech, 74 hours in total. It includes:
    • (3a) Artur-N-Obrazi, 6 hours: speakers were asked to describe faces in pictures. Designed for face-description domain-specific speech recognition.
    • (3b) Artur-N-PDom, 7 hours: speakers were asked to read pre-written sentences, as well as to freely phrase instructions for a potential smart-home system. Designed for smart-home domain-specific speech recognition.
    • (3c) Artur-N-Prosti, 61 hours: monologues and dialogues between two persons, recorded for the purposes of the Artur database creation. Speakers were asked to converse or talk freely on casual topics.

    (4) Artur-P, parliamentary speech, 201 hours in total. It includes: (4a) Artur-P-SejeDZ, 201 hours: Speech from the Slovene National Assembly.

    Further information on the database is available in the Artur-DOC file, which is part of this repository entry.

  3. Podcast Database - Complete Podcast Metadata, All Countries & Languages

    • datarade.ai
    Updated Mar 20, 2020
    Cite
    Listen Notes (2020). Podcast Database - Complete Podcast Metadata, All Countries & Languages [Dataset]. https://datarade.ai/data-categories/podcast-data/datasets
    Explore at:
    Dataset updated
    Mar 20, 2020
    Dataset authored and provided by
    Listen Notes
    Area covered
    Panama, Costa Rica, Solomon Islands, Turks and Caicos Islands, Bhutan, Switzerland, United States of America, Fiji
    Description

    == Quick facts ==

    • The most up-to-date and comprehensive podcast database available
    • All languages and all countries
    • Includes over 3,500,000 podcasts
    • Features 35+ data fields, such as basic metadata, global rank, RSS feed (with audio URLs), Spotify links, and more
    • Delivered in SQLite format

    Learn how we build a high-quality podcast database: https://www.listennotes.help/article/105-high-quality-podcast-database-from-listen-notes

    == Use Cases ==

    • AI training, including speech recognition, generative AI, voice cloning / synthesis, and news analysis
    • Alternative data for investment research, such as sentiment analysis of executive interviews, market research, and tracking investment themes
    • PR and marketing, including social monitoring, content research, outreach, and guest booking
    • ...

    == Data Attributes ==

    See the full list of data attributes on this page: https://www.listennotes.com/podcast-datasets/fields/?filter=podcast_only

    How to access podcast audio files: Our dataset includes RSS feed URLs for all podcasts. You can retrieve audio for over 170 million episodes directly from these feeds. With access to the raw audio, you’ll have high-quality podcast speech data ideal for AI training and related applications.
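
    As an illustration of that workflow, the sketch below uses the third-party feedparser library (our own choice of tool, not part of the Listen Notes product) to pull audio enclosure URLs out of one RSS feed URL taken from the dataset; the feed URL shown is a placeholder.

      import feedparser  # third-party: pip install feedparser

      def episode_audio_urls(rss_url):
          # Parse the podcast's RSS feed and collect the audio enclosure URLs.
          feed = feedparser.parse(rss_url)
          urls = []
          for entry in feed.entries:
              for enclosure in entry.get("enclosures", []):
                  if enclosure.get("type", "").startswith("audio"):
                      urls.append(enclosure.get("href"))
          return urls

      # Placeholder feed URL, for illustration only:
      # print(episode_audio_urls("https://example.com/podcast/feed.xml"))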

    == Custom Offers ==

    We can provide custom datasets based on your needs, such as language-specific data, daily/weekly/monthly update frequency, or one-time purchases.

    We also provide a RESTful API at PodcastAPI.com

    Contact us: hello@listennotes.com

    == Need Help? ==

    If you have any questions about our products, feel free to reach out at hello@listennotes.com.

    == About Listen Notes, Inc. ==

    Since 2017, Listen Notes, Inc. has provided the leading podcast search engine and podcast database.

  4. Data from: The HIWIRE database, a noisy and non-native English speech corpus for cockpit communication

    • catalogue.elra.info
    • live.european-language-grid.eu
    Updated Nov 25, 2008
    Cite
    ELRA (European Language Resources Association) (2008). The HIWIRE database, a noisy and non-native English speech corpus for cockpit communication [Dataset]. https://catalogue.elra.info/en-us/repository/browse/ELRA-S0293/
    Explore at:
    Dataset updated
    Nov 25, 2008
    Dataset provided by
    ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency)
    ELRA (European Language Resources Association)
    License

    https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf

    https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf

    Description

    This database has been collected and packaged under the auspices of the IST-EU STREP project HIWIRE (Human Input that Works In Real Environments). The database was designed to be used as a tool for the development and testing of speech processing and recognition techniques dealing with robust non-native speech recognition.

    The database contains 8,099 English utterances pronounced by non-native speakers (31 French, 20 Greek, 20 Italian, and 10 Spanish speakers). The collected utterances correspond to human input in a command-and-control aeronautics application. The data was recorded in a studio with a close-talking microphone, and real noise recorded in an airplane cockpit was artificially added to the data. The signals are provided in clean (studio recordings with a close-talking microphone), low, mid and high noise conditions; the three noise levels correspond approximately to signal-to-noise ratios of 10 dB, 5 dB and -5 dB respectively.

    Clean audio data was recorded in different office rooms using a close-talking microphone (Plantronics USB-45) chosen for minimal ambient acoustic effects. The sampling frequency is 16 kHz and the data is stored in Windows PCM WAV 16-bit mono format.

    Recordings correspond to prompts extracted from an aeronautic command-and-control application. A total of 8,099 utterances were recorded, corresponding to 81 speakers pronouncing 100 utterances each. The speaker distribution is as follows:

    Country   # Speakers    # Utterances
    France    31 (38.3%)    3,100
    Greece    20 (24.7%)    2,000
    Italy     20 (24.7%)    2,000
    Spain     10 (12.3%)    999
    Total     81            8,099
    To generate the noisy utterances, the speech level is maintained and only the noise amplitude is modified to obtain the desired SNR. The noise amplitude is adjusted to obtain three different averaged SNR values of 10 dB, 5 dB and -5 dB, referred to as the low noise (LN), mid noise (MN) and high noise (HN) conditions. For each condition the noise level remains constant. The speech data are PCM WAV files (16 kHz / 16-bit / mono) stored on one DVD. The total size is 3.03 GB for 33,053 files.
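
    The mixing rule described above (speech level fixed, noise scaled to a target SNR) can be sketched as follows; this is our own NumPy illustration, not the HIWIRE tooling:

      import numpy as np

      def mix_at_snr(speech, noise, snr_db):
          """Keep the speech level fixed and scale the noise to reach snr_db."""
          reps = int(np.ceil(len(speech) / len(noise)))
          noise = np.tile(noise, reps)[: len(speech)]        # match lengths
          p_speech = np.mean(np.asarray(speech, float) ** 2)
          p_noise = np.mean(np.asarray(noise, float) ** 2)
          # choose gain so that p_speech / (gain**2 * p_noise) == 10**(snr_db / 10)
          gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
          return np.asarray(speech, float) + gain * np.asarray(noise, float)

      # The three HIWIRE conditions: low, mid and high noise
      # noisy_ln = mix_at_snr(clean, cockpit_noise, 10)
      # noisy_mn = mix_at_snr(clean, cockpit_noise, 5)
      # noisy_hn = mix_at_snr(clean, cockpit_noise, -5)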

  5. EmoDB Dataset

    • kaggle.com
    Updated Sep 24, 2020
    + more versions
    Cite
    Piyush Agnihotri (2020). EmoDB Dataset [Dataset]. https://www.kaggle.com/piyushagni5/berlin-database-of-emotional-speech-emodb/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 24, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Piyush Agnihotri
    Description

    Emo-DB Database

    EmoDB is a freely available German emotional speech database, created by the Institute of Communication Science at the Technical University of Berlin, Germany. Ten professional speakers (five male and five female) participated in the data recording. The database contains a total of 535 utterances and covers seven emotions: 1) anger; 2) boredom; 3) anxiety; 4) happiness; 5) sadness; 6) disgust; and 7) neutral. The data was recorded at a 48 kHz sampling rate and then down-sampled to 16 kHz.

    Additional Information

    Every utterance is named according to the same scheme:

    • Positions 1-2: speaker number
    • Positions 3-5: code for the text
    • Position 6: emotion (the letter is the initial of the German word for the emotion)
    • Position 7: if there are more than two versions, these are labelled a, b, c, ...

    Example: 03a01Fa.wav is the audio file from Speaker 03 speaking text a01 with the emotion "Freude" (Happiness).

    Information about the speakers

    • 03 - male, 31 years old
    • 08 - female, 34 years
    • 09 - female, 21 years
    • 10 - male, 32 years
    • 11 - male, 26 years
    • 12 - male, 30 years
    • 13 - female, 32 years
    • 14 - female, 35 years
    • 15 - male, 25 years
    • 16 - female, 31 years


    Code of emotions:

    Letter (English)   Emotion (English)   Letter (German)   Emotion (German)
    A                  anger               W                 Ärger (Wut)
    B                  boredom             L                 Langeweile
    D                  disgust             E                 Ekel
    F                  anxiety/fear        A                 Angst
    H                  happiness           F                 Freude
    S                  sadness             T                 Trauer
    N                  neutral             N                 neutral version
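
    Putting the naming scheme and the German code letters together, a small helper like the following (our own sketch, not part of EmoDB) can decode a filename such as 03a01Fa.wav:

      GERMAN_CODES = {
          "W": "anger", "L": "boredom", "E": "disgust", "A": "anxiety/fear",
          "F": "happiness", "T": "sadness", "N": "neutral",
      }

      def parse_emodb_name(filename):
          stem = filename.rsplit(".", 1)[0]      # "03a01Fa"
          return {
              "speaker": stem[0:2],              # positions 1-2
              "text": stem[2:5],                 # positions 3-5
              "emotion": GERMAN_CODES[stem[5]],  # position 6 (German code letter)
              "version": stem[6:] or None,       # position 7, if present
          }

      print(parse_emodb_name("03a01Fa.wav"))
      # {'speaker': '03', 'text': 'a01', 'emotion': 'happiness', 'version': 'a'}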

    Inspiration

    Emotion classification from speech is of increasing interest in the speech processing field. The objective of emotion classification is to classify different emotions from the speech signal. A person’s emotional state affects the speech production mechanism: breathing rate and muscle tension change from the neutral condition, so the resulting speech signal may have characteristics that differ from those of neutral speech.

    The performance of speech recognition or speaker recognition decreases significantly if a model is trained on neutral speech and tested on emotional speech. This makes the dataset a good starting point for building robust speech emotion recognition models.

  6. TORGO Dataset for Dysarthric Speech - Audio Files

    • kaggle.com
    Updated Jun 14, 2023
    Cite
    Pranay Koppula (2023). TORGO Dataset for Dysarthric Speech - Audio Files [Dataset]. https://www.kaggle.com/datasets/pranaykoppula/torgo-audio
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Pranay Koppula
    Description

    Citation: DOI 10.1007/s10579-011-9145-0

    A collection of audio recordings by the Department of Computer Science at the University of Toronto from speakers with and without dysarthria. Useful for tasks like audio classification, disease detection, speech processing, etc.

    Directory Structure:

    F_Con: audio samples of female speakers from the control group, i.e., female speakers without dysarthria.

    F_Dys: audio samples of female speakers with dysarthria.

    M_Con: audio samples of male speakers from the control group, i.e., male speakers without dysarthria.

    M_Dys: audio samples of male speakers with dysarthria.

    In the folder names and filenames, 'FC01' refers to the first female control speaker, 'FC02' to the second, and so on; dysarthric female speakers are labelled 'F01', 'F03', etc., and male speakers follow the same pattern with 'MC01'/'M01'. 'S01' refers to the first recording session with a speaker, 'S02' to the second session, and so on. 'arrayMic' indicates that the audio was recorded with an array microphone, whereas 'headMic' indicates a headpiece microphone.
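
    One possible way to index the dataset is to walk the four folders and pull the speaker, session and microphone tokens out of each path; the sketch below assumes the layout described above (the root folder name and exact nesting in a downloaded copy may differ):

      import re
      from pathlib import Path

      ROOT = Path("torgo-audio")  # hypothetical download location

      rows = []
      for wav in ROOT.rglob("*.wav"):
          path = str(wav)
          speaker = re.search(r"[FM]C?\d{2}", path)     # e.g. FC01, F01, MC02, M03
          session = re.search(r"S\d{2}", path)          # e.g. S01, S02
          mic = "arrayMic" if "arrayMic" in path else ("headMic" if "headMic" in path else None)
          rows.append({
              "file": path,
              "speaker": speaker.group(0) if speaker else None,
              "session": session.group(0) if session else None,
              "mic": mic,
              "dysarthric": "_Dys" in path,             # F_Dys / M_Dys folders
          })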

  7. AURORA-5

    • catalogue.elra.info
    • live.european-language-grid.eu
    Updated Aug 16, 2017
    Cite
    ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency) (2017). AURORA-5 [Dataset]. https://catalogue.elra.info/en-us/repository/browse/ELRA-AURORA-CD0005/
    Explore at:
    Dataset updated
    Aug 16, 2017
    Dataset provided by
    ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency)
    ELRA (European Language Resources Association)
    License

    https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf

    Description

    The Aurora project was originally set up to establish a worldwide standard for the feature extraction software which forms the core of the front-end of a DSR (Distributed Speech Recognition) system. The AURORA-5 database was mainly developed to investigate the influence on the performance of automatic speech recognition of hands-free speech input in noisy room environments. Furthermore, two test conditions are included to study the influence of transmitting the speech over a mobile communication system.

    The earlier three Aurora experiments focused on additive noise and the influence of some telephone frequency characteristics. Aurora-5 tries to cover all effects as they occur in realistic application scenarios. The focus was put on two scenarios. The first is hands-free speech input in the noisy car environment, with the intention of either controlling devices in the car itself or retrieving information from a remote speech server over the telephone. The second covers hands-free speech input in an office or living room to control, e.g., a telephone device or some audio/video equipment.

    The AURORA-5 database contains the following data:
    • Artificially distorted versions of the recordings from adult speakers in the TI-Digits speech database, downsampled to a sampling frequency of 8000 Hz. The distortions consist of: additive background noise, the simulation of hands-free speech input in rooms, and the simulation of transmitting speech over cellular telephone networks.
    • A subset of recordings from the meeting recorder project at the International Computer Science Institute. The recordings contain sequences of digits uttered by different speakers in hands-free mode in a meeting room.
    • A set of scripts for running recognition experiments on the above-mentioned speech data. The experiments are based on the freely available software package HTK; HTK is not part of this resource.

    Further information is available at the following address: http://aurora.hsnr.de

  8. NST Norwegian ASR Database (16 kHz)

    • data.europa.eu
    Cite
    NST Norwegian ASR Database (16 kHz) [Dataset]. https://data.europa.eu/data/datasets/sbr-13?locale=en
    Explore at:
    Available download formats: application/x-gtar, application/pdf
    License

    https://hdl.handle.net/21.11146/13/.well-known/skolem/bead126e-5168-3e2b-8959-07eaf5d458d5

    Description

    This database was originally developed by Nordic Language Technology in the 1990s in order to facilitate automatic speech recognition (ASR) in Norwegian. A reorganized and more user-friendly version of this database is also available from The Language Bank; type "sbr-54" in the search bar to find the updated version.

  9. Data from: ASR database ARTUR 1.0 (transcriptions)

    • live.european-language-grid.eu
    binary format
    Updated Feb 21, 2023
    + more versions
    Cite
    (2023). ASR database ARTUR 1.0 (transcriptions) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/21519
    Explore at:
    Available download formats: binary format
    Dataset updated
    Feb 21, 2023
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Artur 1.0 is a speech database designed for the needs of developing automatic speech recognition for the Slovenian language. The complete database includes 1,067 hours of speech, of which 884 hours are transcribed, while the remaining 183 hours are recordings only. This repository entry includes the transcriptions only, while the audio files are available at http://hdl.handle.net/11356/1776.

    Transcriptions are provided in the original TRS format of the Transcriber 1.5.1 tool, which was used for making them. All transcriptions were made manually or manually corrected.

    The data are structured as follows:

    (1) Artur-B, read speech, 573 hours in total. It includes:
    • (1a) Artur-B-Brani, 485 hours: readings of sentences pre-selected from a 10% increment of the Gigafida 2.0 corpus. The sentences were chosen so that they reflect the natural, i.e. actual, distribution of triphones in the words. They were distributed among 1,000 speakers, so that approx. 30 minutes of read speech were recorded from each speaker. The speakers were balanced by gender, age and region, and a small proportion were non-native speakers of Slovene. Each sentence is its own transcription file and has a corresponding audio file.
    • (1b) Artur-B-Crkovani, 10 hours: spellings. Speakers were asked to spell abbreviations and personal names and surnames, all chosen so that all Slovene letters were covered, plus the most common foreign letters. The transcriptions were corrected manually.
    • (1c) Artur-B-Studio, 51 hours: designed for the development of speech synthesis. The sentences were read in a studio by a single speaker. Each sentence is its own transcription file and has a corresponding recording.
    • (1d) Artur-B-Izloceno, 27 hours: in TRS format only. The recordings that correspond to these transcriptions include different types of errors, typically incorrect reading of sentences or a noisy environment.

    (2) Artur-J, public speech, 62 hours in total. It includes: (2a) Artur-J-Splosni, 62 hours: manual transcriptions of media recordings, online recordings of conferences, workshops, education videos, etc. Transcriptions were made in two modes:
    • 'pog' files include the pronunciation-based or citation-phonemic transcriptions (containing the output phoneme string derived from the orthographic form by letter-to-sound rules)
    • 'std' files include standardised or expanded orthographic transcriptions (the standard Slovenian spelling is used to indicate the spoken words, but there are additional rules and word-lists for non-standard lexis)

    (3) Artur-N, private speech, 74 hours in total. It includes:
    • (3a) Artur-N-Obrazi, 6 hours: speakers were asked to describe faces in pictures. Designed for face-description domain-specific speech recognition.
    • (3b) Artur-N-PDom, 7 hours: speakers were asked to read pre-written sentences, as well as to freely phrase instructions for a potential smart-home system. Designed for smart-home domain-specific speech recognition.
    • (3c) Artur-N-Prosti, 61 hours: monologues and dialogues between two persons, recorded for the purposes of the Artur database creation. Speakers were asked to converse or talk freely on casual topics. The manual transcriptions were done in two modes, the same as for Artur-J.

    (4) Artur-P, parliamentary speech, 201 hours in total. It includes: (4a) Artur-P-SejeDZ, 201 hours: Transcriptions of speech from the Slovene National Assembly. Manual transcriptions were done in two modes, the same as for Artur-J.

    Further information on the database, including various statistics, is available in the Artur-DOC directory, which is part of Artur_1.0_TRS.

  10. Speech and Noise Corpora for Pitch Estimation of Human Speech

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jun 30, 2020
    Cite
    Bastian Bechtold (2020). Speech and Noise Corpora for Pitch Estimation of Human Speech [Dataset]. http://doi.org/10.5281/zenodo.3920591
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 30, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Bastian Bechtold
    Description

    This dataset contains common speech and noise corpora for evaluating fundamental frequency estimation algorithms, packaged as convenient JBOF dataframes. Each corpus is freely available on its own and allows redistribution.

    These files are published as part of my dissertation, "Pitch of Voiced Speech in the Short-Time Fourier Transform: Algorithms, Ground Truths, and Evaluation Methods", and in support of the Replication Dataset for Fundamental Frequency Estimation.

    References:

    1. John Kominek and Alan W Black. CMU ARCTIC database for speech synthesis, 2003.
    2. Paul C Bagshaw, Steven Hiller, and Mervyn A Jack. Enhanced Pitch Tracking and the Processing of F0 Contours for Computer Aided Intonation Teaching. In EUROSPEECH, 1993.
    3. F Plante, Georg F Meyer, and William A Ainsworth. A Pitch Extraction Reference Database. In Fourth European Conference on Speech Communication and Technology, pages 837–840, Madrid, Spain, 1995.
    4. Alan Wrench. MOCHA MultiCHannel Articulatory database: English, November 1999.
    5. Gregor Pirker, Michael Wohlmayr, Stefan Petrik, and Franz Pernkopf. A Pitch Tracking Corpus with Evaluation on Multipitch Tracking Scenario. page 4, 2011.
    6. John S. Garofolo, Lori F. Lamel, William M. Fisher, Jonathan G. Fiscus, David S. Pallett, Nancy L. Dahlgren, and Victor Zue. TIMIT Acoustic-Phonetic Continuous Speech Corpus, 1993.
    7. Andrew Varga and Herman J.M. Steeneken. Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems. Speech Communication, 12(3):247–251, July 1993.
    8. David B. Dean, Sridha Sridharan, Robert J. Vogt, and Michael W. Mason. The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms. Proceedings of Interspeech 2010, 2010.
  11. Conversational Skills in Language Learning Games: A Speech Recognition Technology Dataset

    • data.mendeley.com
    Updated Dec 8, 2023
    Cite
    Murat Kuvvetli (2023). Conversational Skills in Language Learning Games: A Speech Recognition Technology Dataset [Dataset]. http://doi.org/10.17632/bhvd9z5jjr.1
    Explore at:
    Dataset updated
    Dec 8, 2023
    Authors
    Murat Kuvvetli
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The "SpeechRec_LanguageLearning_ConversationalSkills" dataset is a collection of data generated in a game-based language learning environment, aiming to explore the impact of Speech Recognition Technology (SRT) on the development of conversational skills. The dataset encompasses speaking test results conducted within the context of language learning games utilizing SRT.

  12. A Replication Dataset for Fundamental Frequency Estimation

    • zenodo.org
    • live.european-language-grid.eu
    • +1more
    bin
    Updated Apr 24, 2025
    Cite
    Bastian Bechtold (2025). A Replication Dataset for Fundamental Frequency Estimation [Dataset]. http://doi.org/10.5281/zenodo.3904389
    Explore at:
    Available download formats: bin
    Dataset updated
    Apr 24, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Bastian Bechtold
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Part of the dissertation Pitch of Voiced Speech in the Short-Time Fourier Transform: Algorithms, Ground Truths, and Evaluation Methods.
    © 2020, Bastian Bechtold. All rights reserved.

    Estimating the fundamental frequency of speech remains an active area of research, with varied applications in speech recognition, speaker identification, and speech compression. A vast number of algorithms for estimating this quantity have been proposed over the years, and a number of speech and noise corpora have been developed for evaluating their performance. The present dataset contains estimated fundamental frequency tracks of 25 algorithms, six speech corpora, two noise corpora, at nine signal-to-noise ratios between -20 and 20 dB SNR, as well as an additional evaluation of synthetic harmonic tone complexes in white noise.

    The dataset also contains pre-calculated performance measures, both novel and traditional, in reference to each speech corpus’ ground truth, the algorithms’ own clean-speech estimate, and our own consensus truth. It can thus serve as the basis for a comparison study, to replicate existing studies from a larger dataset, or as a reference for developing new fundamental frequency estimation algorithms. All source code and data are available to download and entirely reproducible, albeit requiring about one year of processor time.

    Included Code and Data

    • ground truth data.zip is a JBOF dataset of fundamental frequency estimates and ground truths of all speech files in the following corpora:
      • CMU-ARCTIC (consensus truth) [1]
      • FDA (corpus truth and consensus truth) [2]
      • KEELE (corpus truth and consensus truth) [3]
      • MOCHA-TIMIT (consensus truth) [4]
      • PTDB-TUG (corpus truth and consensus truth) [5]
      • TIMIT (consensus truth) [6]
    • noisy speech data.zip is a JBOF dataset of fundamental frequency estimates of speech files mixed with noise from the two noise corpora, NOISEX [7] and QUT-NOISE-TIMIT [8].
    • synthetic speech data.zip is a JBOF dataset of fundamental frequency estimates of synthetic harmonic tone complexes in white noise.
    • noisy_speech.pkl and synthetic_speech.pkl are pickled Pandas dataframes of performance metrics derived from the above data for the evaluated fundamental frequency estimation algorithms (see the references below).
    • noisy speech evaluation.py and synthetic speech evaluation.py are Python programs to calculate the above Pandas dataframes from the above JBOF datasets. They calculate the following performance measures:
      • Gross Pitch Error (GPE), the percentage of pitches where the estimated pitch deviates from the true pitch by more than 20%.
      • Fine Pitch Error (FPE), the mean error of grossly correct estimates.
      • High/Low Octave Pitch Error (OPE), the percentage of pitches that are GPEs and happen to be at an integer multiple of the true pitch.
      • Gross Remaining Error (GRE), the percentage of pitches that are GPEs but not OPEs.
      • Fine Remaining Bias (FRB), the median error of GREs.
      • True Positive Rate (TPR), the percentage of true positive voicing estimates.
      • False Positive Rate (FPR), the percentage of false positive voicing estimates.
      • False Negative Rate (FNR), the percentage of false negative voicing estimates.
      • F₁, the harmonic mean of precision and recall of the voicing decision.
    • Pipfile is a pipenv-compatible pipfile for installing all prerequisites necessary for running the above Python programs.

    The Python programs take about an hour to compute on a fast 2019 computer, and require at least 32 GB of memory.
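
    For orientation, two of the measures above can be computed from an estimated and a ground-truth pitch track as in the following sketch (our own reimplementation of the stated definitions, not the dissertation code; unvoiced frames are marked by a pitch of 0):

      import numpy as np

      def gross_and_fine_pitch_error(f0_est, f0_true):
          f0_est = np.asarray(f0_est, float)
          f0_true = np.asarray(f0_true, float)
          voiced = (f0_est > 0) & (f0_true > 0)       # frames voiced in both tracks
          rel_err = np.abs(f0_est[voiced] - f0_true[voiced]) / f0_true[voiced]
          gross = rel_err > 0.20                      # deviates by more than 20%
          gpe = 100.0 * np.mean(gross)                # Gross Pitch Error, in percent
          fpe = 100.0 * np.mean(rel_err[~gross])      # Fine Pitch Error of grossly correct frames
          return gpe, fpe

      print(gross_and_fine_pitch_error([100, 210, 102, 0], [100, 100, 100, 0]))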

    References:

    1. John Kominek and Alan W Black. CMU ARCTIC database for speech synthesis, 2003.
    2. Paul C Bagshaw, Steven Hiller, and Mervyn A Jack. Enhanced Pitch Tracking and the Processing of F0 Contours for Computer Aided Intonation Teaching. In EUROSPEECH, 1993.
    3. F Plante, Georg F Meyer, and William A Ainsworth. A Pitch Extraction Reference Database. In Fourth European Conference on Speech Communication and Technology, pages 837–840, Madrid, Spain, 1995.
    4. Alan Wrench. MOCHA MultiCHannel Articulatory database: English, November 1999.
    5. Gregor Pirker, Michael Wohlmayr, Stefan Petrik, and Franz Pernkopf. A Pitch Tracking Corpus with Evaluation on Multipitch Tracking Scenario. page 4, 2011.
    6. John S. Garofolo, Lori F. Lamel, William M. Fisher, Jonathan G. Fiscus, David S. Pallett, Nancy L. Dahlgren, and Victor Zue. TIMIT Acoustic-Phonetic Continuous Speech Corpus, 1993.
    7. Andrew Varga and Herman J.M. Steeneken. Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems. Speech Communication, 12(3):247–251, July 1993.
    8. David B. Dean, Sridha Sridharan, Robert J. Vogt, and Michael W. Mason. The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms. Proceedings of Interspeech 2010, 2010.
    9. Man Mohan Sondhi. New methods of pitch extraction. Audio and Electroacoustics, IEEE Transactions on, 16(2):262—266, 1968.
    10. Myron J. Ross, Harry L. Shaffer, Asaf Cohen, Richard Freudberg, and Harold J. Manley. Average magnitude difference function pitch extractor. Acoustics, Speech and Signal Processing, IEEE Transactions on, 22(5):353—362, 1974.
    11. Na Yang, He Ba, Weiyang Cai, Ilker Demirkol, and Wendi Heinzelman. BaNa: A Noise Resilient Fundamental Frequency Detection Algorithm for Speech and Music. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(12):1833–1848, December 2014.
    12. Michael Noll. Cepstrum Pitch Determination. The Journal of the Acoustical Society of America, 41(2):293–309, 1967.
    13. Jong Wook Kim, Justin Salamon, Peter Li, and Juan Pablo Bello. CREPE: A Convolutional Representation for Pitch Estimation. arXiv:1802.06182 [cs, eess, stat], February 2018. arXiv: 1802.06182.
    14. Masanori Morise, Fumiya Yokomori, and Kenji Ozawa. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications. IEICE Transactions on Information and Systems, E99.D(7):1877–1884, 2016.
    15. Kun Han and DeLiang Wang. Neural Network Based Pitch Tracking in Very Noisy Speech. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(12):2158–2168, December 2014.
    16. Pegah Ghahremani, Bagher BabaAli, Daniel Povey, Korbinian Riedhammer, Jan Trmal, and Sanjeev Khudanpur. A pitch extraction algorithm tuned for automatic speech recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pages 2494–2498. IEEE, 2014.
    17. Lee Ngee Tan and Abeer Alwan. Multi-band summary correlogram-based pitch detection for noisy speech. Speech Communication, 55(7-8):841–856, September 2013.
    18. Jesper Kjær Nielsen, Tobias Lindstrøm Jensen, Jesper Rindom Jensen, Mads Græsbøll Christensen, and Søren Holdt Jensen. Fast fundamental frequency estimation: Making a statistically

  13. Data from: DiPCo -- Dinner Party Corpus

    • zenodo.org
    application/gzip, pdf
    Updated Jul 11, 2024
    + more versions
    Cite
    Maarten Van Segbroeck; Ahmed Zaid; Ksenia Kutsenko; Cirenia Huerta; Tinh Nguyen; Xuewen Luo; Björn Hoffmeister; Jan Trmal; Maurizio Omologo; Roland Maas (2024). DiPCo -- Dinner Party Corpus [Dataset]. http://doi.org/10.21437/interspeech.2020-2800
    Explore at:
    Available download formats: application/gzip, pdf
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Maarten Van Segbroeck; Ahmed Zaid; Ksenia Kutsenko; Cirenia Huerta; Tinh Nguyen; Xuewen Luo; Björn Hoffmeister; Jan Trmal; Maurizio Omologo; Roland Maas
    License

    https://cdla.io/permissive-1-0

    Description

    We present a speech data corpus that simulates a "dinner party" scenario taking place in an everyday home environment. The corpus was created by recording multiple groups of four Amazon employee volunteers having a natural conversation in English around a dining table. The participants were recorded by a single-channel close-talk microphone and by five far-field 7-microphone array devices positioned at different locations in the recording room. The dataset contains the audio recordings and human-labeled transcripts of a total of 10 sessions, each with a duration between 15 and 45 minutes. The corpus was created to advance the field of noise-robust and distant speech processing and is intended to serve as a public research and benchmarking data set.

  14. Data from: 🎤 Gender Recognition by Voice

    • kaggle.com
    Updated Oct 2, 2023
    Cite
    mexwell (2023). 🎤 Gender Recognition by Voice [Dataset]. https://www.kaggle.com/datasets/mexwell/gender-recognition-by-voice
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 2, 2023
    Dataset provided by
    Kaggle
    Authors
    mexwell
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    In order to analyze gender by voice and speech, a training database was required. A database was built using thousands of samples of male and female voices, each labeled with the speaker's gender. Voice samples were collected from several existing speech resources.

    The output from the pre-processed WAV files was saved into a CSV file containing 3,168 rows and 21 columns (20 feature columns and one label column for the male/female classification). The pre-processed dataset can be downloaded in CSV format.

    Acoustic Properties Measured

    The following acoustic properties of each voice are measured:

    • duration: length of signal
    • meanfreq: mean frequency (in kHz)
    • sd: standard deviation of frequency
    • median: median frequency (in kHz)
    • Q25: first quantile (in kHz)
    • Q75: third quantile (in kHz)
    • IQR: interquantile range (in kHz)
    • skew: skewness (see note in specprop description)
    • kurt: kurtosis (see note in specprop description)
    • sp.ent: spectral entropy
    • sfm: spectral flatness
    • mode: mode frequency
    • centroid: frequency centroid (see specprop)
    • peakf: peak frequency (frequency with highest energy)
    • meanfun: average of fundamental frequency measured across acoustic signal
    • minfun: minimum fundamental frequency measured across acoustic signal
    • maxfun: maximum fundamental frequency measured across acoustic signal
    • meandom: average of dominant frequency measured across acoustic signal
    • mindom: minimum of dominant frequency measured across acoustic signal
    • maxdom: maximum of dominant frequency measured across acoustic signal
    • dfrange: range of dominant frequency measured across acoustic signal
    • modindx: modulation index, calculated as the accumulated absolute difference between adjacent measurements of fundamental frequency divided by the frequency range

    Note: the duration and peak frequency (peakf) features were removed from training. Duration refers to the length of the recording, which for training is cut off at 20 seconds. Peakf was omitted due to time and CPU constraints in calculating the value. As a result, all records have the same value for duration (20) and peak frequency (0).
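
    A baseline classifier on these acoustic features can be sketched as follows (the file name voice.csv and the label column name "label" are assumptions about the pre-processed CSV, not confirmed by the description above):

      import pandas as pd
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.metrics import accuracy_score
      from sklearn.model_selection import train_test_split

      df = pd.read_csv("voice.csv")              # assumed file name of the pre-processed CSV
      X = df.drop(columns=["label"])             # the acoustic feature columns
      y = df["label"]                            # "male" / "female"

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)
      clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
      print("test accuracy:", accuracy_score(y_te, clf.predict(X_te)))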

    Original Data

    Acknowledgement

    Photo by Jason Rosewell on Unsplash

  15. Dvoice: An open source dataset for Automatic Speech Recognition on Moroccan dialectal Arabic

    • data.niaid.nih.gov
    • explore.openaire.eu
    Updated Jan 13, 2022
    Cite
    Imade Benelallam (2022). Dvoice : An open source dataset for Automatic Speech Recognition on Moroccan dialectal Arabic [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5482550
    Explore at:
    Dataset updated
    Jan 13, 2022
    Dataset provided by
    Abdou Mohamed Naira
    Imade Benelallam
    Anass Allak
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Morocco
    Description

    Dialectal Voice is a community project initiated by AIOX Labs to facilitate voice recognition by intelligent systems. The need for AI systems capable of recognizing the human voice is increasingly expressed within communities, yet for some languages, such as Darija, there are not enough voice technology solutions. To meet this need, the project proposes an iterative and interactive construction of a dialectal database, open to all, in order to help improve voice recognition and generation models.

  16. TIMIT Dataset

    • paperswithcode.com
    Updated Jul 5, 2012
    + more versions
    Cite
    (2012). TIMIT Dataset [Dataset]. https://paperswithcode.com/dataset/timit
    Explore at:
    Dataset updated
    Jul 5, 2012
    Description

    The TIMIT Acoustic-Phonetic Continuous Speech Corpus is a standard dataset used for evaluation of automatic speech recognition systems. It consists of recordings of 630 speakers of 8 dialects of American English each reading 10 phonetically-rich sentences. It also comes with the word and phone-level transcriptions of the speech.

  17. SNABI database for continuous speech recognition 1.2

    • live.european-language-grid.eu
    binary format
    Updated Mar 1, 2002
    Cite
    (2002). SNABI database for continuous speech recognition 1.2 [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/20237
    Explore at:
    Available download formats: binary format
    Dataset updated
    Mar 1, 2002
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    The SNABI speech database can be used to train continuous speech recognition for the Slovenian language. The database comprises 1,530 sentences, 150 words and the alphabet. 132 speakers were recorded, each reading 200 sentences or more, resulting in more than 15,000 speech recordings in the database. The recordings were made in a studio (SNABI SI_SSQ) and over a telephone line (SNABI SI_SFN).

  18. THCHS-30 Dataset

    • paperswithcode.com
    Updated Feb 26, 2025
    Cite
    Dong Wang; Xuewei Zhang (2025). THCHS-30 Dataset [Dataset]. https://paperswithcode.com/dataset/thchs-30
    Explore at:
    Dataset updated
    Feb 26, 2025
    Authors
    Dong Wang; Xuewei Zhang
    Description

    THCHS-30 is a free Chinese speech database that can be used to build a full-fledged Chinese speech recognition system.

  19. Indonesian Media Audio Database

    • gts.ai
    json
    Updated Jan 31, 2024
    Cite
    GTS (2024). Indonesian Media Audio Database [Dataset]. https://gts.ai/case-study/indonesian-media-audio-database-custom-ai-data-collection/
    Explore at:
    Available download formats: json
    Dataset updated
    Jan 31, 2024
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Our project, “Indonesian Media Audio Database,” is designed to establish a rich and diverse dataset tailored for training advanced machine learning models in language processing, speech recognition, and cultural analysis.

  20. Data from: Written and spoken digits database for multimodal learning

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1more
    Updated Jan 21, 2021
    Cite
    Khacef, Lyes (2021). Written and spoken digits database for multimodal learning [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3515934
    Explore at:
    Dataset updated
    Jan 21, 2021
    Dataset provided by
    Rodriguez, Laurent
    Miramond, Benoit
    Khacef, Lyes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Database description:

    The written and spoken digits database is not a new database but one constructed from existing databases, in order to provide a ready-to-use resource for multimodal fusion [1].

    The written digits database is the original MNIST handwritten digits database [2] with no additional processing. It consists of 70000 images (60000 for training and 10000 for test) of 28 x 28 = 784 dimensions.

    The spoken digits database was extracted from Google Speech Commands [3], an audio dataset of spoken words that was proposed to train and evaluate keyword spotting systems. It consists of 105829 utterances of 35 words, amongst which 38908 utterances of the ten digits (34801 for training and 4107 for test). A pre-processing was done via the extraction of the Mel Frequency Cepstral Coefficients (MFCC) with a framing window size of 50 ms and frame shift size of 25 ms. Since the speech samples are approximately 1 s long, we end up with 39 time slots. For each one, we extract 12 MFCC coefficients with an additional energy coefficient. Thus, we have a final vector of 39 x 13 = 507 dimensions. Standardization and normalization were applied on the MFCC features.
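
    The pre-processing above can be approximated with librosa roughly as follows (a sketch under assumptions: 13 MFCCs are extracted with the 0th coefficient standing in for the energy term, and the exact framing that yields the 39 time slots of the published files may differ):

      import librosa

      def spoken_digit_features(wav_path, sr=16000, n_frames=39):
          y, sr = librosa.load(wav_path, sr=sr)
          mfcc = librosa.feature.mfcc(
              y=y, sr=sr, n_mfcc=13,
              n_fft=int(0.050 * sr),              # 50 ms framing window
              hop_length=int(0.025 * sr),         # 25 ms frame shift
          )                                       # shape: (13, time_slots)
          mfcc = librosa.util.fix_length(mfcc, size=n_frames, axis=1)  # pad/trim to 39 slots
          mfcc = (mfcc - mfcc.mean()) / (mfcc.std() + 1e-8)            # standardize
          return mfcc.T.reshape(-1)               # 39 x 13 = 507-dimensional vector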

    To construct the multimodal digits dataset, we associated written and spoken digits of the same class, respecting the initial partitioning in [2] and [3] for the training and test subsets. Since we have fewer samples for the spoken digits, we duplicated some random samples to match the number of written digits, giving a multimodal digits database of 70000 samples (60000 for training and 10000 for test).

    The dataset is provided in six files as described below. Therefore, if a shuffle is performed on the training or test subsets, it must be performed in unison with the same order for the written digits, spoken digits and labels.

    Files:

    data_wr_train.npy: 60000 samples of 784-dimensional written digits for training;

    data_sp_train.npy: 60000 samples of 507-dimensional spoken digits for training;

    labels_train.npy: 60000 labels for the training subset;

    data_wr_test.npy: 10000 samples of 784-dimensional written digits for test;

    data_sp_test.npy: 10000 samples of 507-dimensional spoken digits for test;

    labels_test.npy: 10000 labels for the test subset.
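
    A minimal sketch of the unison shuffle mentioned above, applying one random permutation to the three training files (the test files would be handled the same way):

      import numpy as np

      data_wr = np.load("data_wr_train.npy")   # (60000, 784) written digits
      data_sp = np.load("data_sp_train.npy")   # (60000, 507) spoken digits
      labels = np.load("labels_train.npy")     # (60000,) shared labels

      rng = np.random.default_rng(seed=0)
      perm = rng.permutation(len(labels))      # one permutation for all three arrays
      data_wr, data_sp, labels = data_wr[perm], data_sp[perm], labels[perm]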

    References:

    1. Khacef, L. et al. (2020), “Brain-Inspired Self-Organization with Cellular Neuromorphic Computing for Multimodal Unsupervised Learning”.

    2. LeCun, Y. & Cortes, C. (1998), “MNIST handwritten digit database”.

    3. Warden, P. (2018), “Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition”.
