5 datasets found
  1. In-Car Speech Dataset: English (US)

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). In-Car Speech Dataset: English (US) [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/in-car-speech-dataset-english-us
    Available download formats
    wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    United States
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the US English Language In-car Speech Dataset, a comprehensive collection of audio recordings designed to facilitate the development of speech recognition models specifically tailored for in-car environments. This dataset aims to support research and innovation in automotive speech technology, enabling seamless and robust voice interactions within vehicles for drivers and co-passengers.

    Speech Data

    This dataset comprises over 5,000 high-quality audio recordings collected from various in-car environments. These recordings include scripted wake words and command-type prompts.

    Participant Diversity:
    Speakers: 50+ native English speakers from the FutureBeeAI Community.
    Regions: Balanced representation of accents, dialects, and demographics across the United States of America.
    Participant Profile: Participants range from 18 to 70 years old, with a male-to-female ratio of 60:40.
    Recording Nature: Scripted wake-word and command-type audio recordings.
    Duration: Recordings average between 5 and 20 seconds each.
    Formats: WAV, mono, 16-bit depth. The dataset includes recordings sampled at 16 kHz and 48 kHz.
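
The format stated above (mono, 16-bit WAV at 16 kHz or 48 kHz) can be checked per file before training. The following is a minimal sketch using Python's standard `wave` module; the function name `check_wav` and the constant names are illustrative assumptions, not part of the dataset or of any FutureBeeAI tooling.

```python
import wave

# Properties taken from the dataset description.
EXPECTED_CHANNELS = 1              # mono
EXPECTED_SAMPLE_WIDTH_BYTES = 2    # 16-bit depth
EXPECTED_SAMPLE_RATES = {16_000, 48_000}


def check_wav(path):
    """Return (problems, duration_seconds) for a single WAV file.

    `problems` is a list of human-readable mismatches against the
    stated format; it is empty when the file matches.
    """
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()   # bytes per sample
        rate = wav.getframerate()           # samples per second
        frames = wav.getnframes()

    problems = []
    if channels != EXPECTED_CHANNELS:
        problems.append(f"expected mono, found {channels} channels")
    if sample_width != EXPECTED_SAMPLE_WIDTH_BYTES:
        problems.append(f"expected 16-bit samples, found {8 * sample_width}-bit")
    if rate not in EXPECTED_SAMPLE_RATES:
        problems.append(f"unexpected sample rate {rate} Hz")
    return problems, frames / rate
```

The returned duration can also be compared against the stated 5 to 20 second range to flag unusually short or long clips.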

    Dataset Diversity

    Beyond participant diversity, the dataset varies in wake words, voice commands, and recording environments.

    Automobile-Related Wake Words: Hey Mercedes, Hey BMW, Hey Porsche, Hey Volvo, Hey Audi, Hi Genesis, Hey Mini, Hey Toyota, Ok Ford, Hey Hyundai, Ok Honda, Hello Kia, Hey Dodge.
    Cars: Data was collected in different types and models of cars.
    Types of Voice Commands:
      Navigational Voice Commands
      Mobile Control Voice Commands
      Car Control Voice Commands
      Multimedia & Entertainment Commands
      General, Question-Answer, and Search Commands
    Recording Time: Participants recorded the prompts at different times of day:
      Morning
      Afternoon
      Evening
    Recording Environment: Recordings were captured in varied environments to yield realistic data covering many types of noise. Environment variables include:
      Noise Level: Silent, Low Noise, Moderate Noise, High Noise
      Parking Location: Indoor, Outdoor
      Car Windows: Open, Closed
      Car AC: On, Off
      Car Engine: On, Off
      Car Movement: Stationary, Moving
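
The environment variables listed above lend themselves to a small typed record that can be used to select subsets, for example only noisy recordings made in a moving car. The sketch below is one possible representation; the class name, field names, the `select` helper, and the file names in the usage note are all hypothetical, since the listing does not show the dataset's actual metadata schema.

```python
from dataclasses import dataclass

# Allowed values as listed in the dataset description.
NOISE_LEVELS = ("Silent", "Low Noise", "Moderate Noise", "High Noise")
PARKING_LOCATIONS = ("Indoor", "Outdoor")


@dataclass(frozen=True)
class RecordingEnvironment:
    """Hypothetical per-recording environment record."""
    noise_level: str        # one of NOISE_LEVELS
    parking_location: str   # one of PARKING_LOCATIONS
    windows_open: bool      # Car Windows: Open / Closed
    ac_on: bool             # Car AC: On / Off
    engine_on: bool         # Car Engine: On / Off
    moving: bool            # Car Movement: Moving / Stationary

    def __post_init__(self):
        if self.noise_level not in NOISE_LEVELS:
            raise ValueError(f"unknown noise level: {self.noise_level!r}")
        if self.parking_location not in PARKING_LOCATIONS:
            raise ValueError(f"unknown parking location: {self.parking_location!r}")


def select(recordings, **conditions):
    """Keep (file_name, environment) pairs matching every given condition."""
    return [
        (name, env)
        for name, env in recordings
        if all(getattr(env, field) == value for field, value in conditions.items())
    ]
```

With a list such as `[("rec_001.wav", RecordingEnvironment("Silent", "Indoor", False, False, False, False)), ...]`, calling `select(recordings, moving=True, noise_level="High Noise")` would return only the entries recorded in a moving car under high noise.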

    Metadata

    The dataset provides comprehensive metadata for each audio recording and participant:

    Participant Metadata: Unique identifier, age, gender, country, state, district, accent, and dialect.
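
The participant metadata fields make it possible to check the stated participant profile (ages 18 to 70, roughly 60:40 male to female) against what is actually delivered. A minimal sketch follows, assuming each participant is available as a dictionary with `age` and `gender` keys; those key names and the function name are assumptions for illustration only.

```python
from collections import Counter


def summarize_participants(participants):
    """Return (min_age, max_age, gender_shares) for a list of participant dicts.

    `gender_shares` maps each gender label to its fraction of participants.
    """
    ages = [p["age"] for p in participants]
    counts = Counter(p["gender"] for p in participants)
    total = sum(counts.values())
    shares = {gender: n / total for gender, n in counts.items()}
    return min(ages), max(ages), shares
```

Comparing the returned shares with the documented ratio gives a quick sanity check before relying on the dataset's demographic balance.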

  2. In-Car Speech Dataset: English (UK)

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). In-Car Speech Dataset: English (UK) [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/in-car-speech-dataset-english-british
    Available download formats
    wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    United Kingdom
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the UK English Language In-car Speech Dataset, a comprehensive collection of audio recordings designed to facilitate the development of speech recognition models specifically tailored for in-car environments. This dataset aims to support research and innovation in automotive speech technology, enabling seamless and robust voice interactions within vehicles for drivers and co-passengers.

    Speech Data

    This dataset comprises over 5,000 high-quality audio recordings collected from various in-car environments. These recordings include scripted wake words and command-type prompts.

    Participant Diversity:
    Speakers: 50+ native English speakers from the FutureBeeAI Community.
    Regions: Balanced representation of accents, dialects, and demographics across the United Kingdom.
    Participant Profile: Participants range from 18 to 70 years old, with a male-to-female ratio of 60:40.
    Recording Nature: Scripted wake-word and command-type audio recordings.
    Duration: Recordings average between 5 and 20 seconds each.
    Formats: WAV, mono, 16-bit depth. The dataset includes recordings sampled at 16 kHz and 48 kHz.

    Dataset Diversity

    Beyond participant diversity, the dataset varies in wake words, voice commands, and recording environments.

    Automobile-Related Wake Words: Hey Mercedes, Hey BMW, Hey Porsche, Hey Volvo, Hey Audi, Hi Genesis, Hey Mini, Hey Toyota, Ok Ford, Hey Hyundai, Ok Honda, Hello Kia, Hey Dodge.
    Cars: Data was collected in different types and models of cars.
    Types of Voice Commands:
      Navigational Voice Commands
      Mobile Control Voice Commands
      Car Control Voice Commands
      Multimedia & Entertainment Commands
      General, Question-Answer, and Search Commands
    Recording Time: Participants recorded the prompts at different times of day:
      Morning
      Afternoon
      Evening
    Recording Environment: Recordings were captured in varied environments to yield realistic data covering many types of noise. Environment variables include:
      Noise Level: Silent, Low Noise, Moderate Noise, High Noise
      Parking Location: Indoor, Outdoor
      Car Windows: Open, Closed
      Car AC: On, Off
      Car Engine: On, Off
      Car Movement: Stationary, Moving

    Metadata

    The dataset provides comprehensive metadata for each audio recording and participant:

    Participant Metadata: Unique identifier, age, gender, country, state, district, accent, and dialect.

  3. In-Car Speech Dataset: English (New Zealand)

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). In-Car Speech Dataset: English (New Zealand) [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/in-car-speech-dataset-english-new
    Available download formats
    wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    New Zealand
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the New Zealand English Language In-car Speech Dataset, a comprehensive collection of audio recordings designed to facilitate the development of speech recognition models specifically tailored for in-car environments. This dataset aims to support research and innovation in automotive speech technology, enabling seamless and robust voice interactions within vehicles for drivers and co-passengers.

    Speech Data

    This dataset comprises over 5,000 high-quality audio recordings collected from various in-car environments. These recordings include scripted wake words and command-type prompts.

    Participant Diversity:
    Speakers: 50+ native English speakers from the FutureBeeAI Community.
    Regions: Balanced representation of accents, dialects, and demographics across New Zealand.
    Participant Profile: Participants range from 18 to 70 years old, with a male-to-female ratio of 60:40.
    Recording Nature: Scripted wake-word and command-type audio recordings.
    Duration: Recordings average between 5 and 20 seconds each.
    Formats: WAV, mono, 16-bit depth. The dataset includes recordings sampled at 16 kHz and 48 kHz.

    Dataset Diversity

    Beyond participant diversity, the dataset varies in wake words, voice commands, and recording environments.

    Automobile-Related Wake Words: Hey Mercedes, Hey BMW, Hey Porsche, Hey Volvo, Hey Audi, Hi Genesis, Hey Mini, Hey Toyota, Ok Ford, Hey Hyundai, Ok Honda, Hello Kia, Hey Dodge.
    Cars: Data was collected in different types and models of cars.
    Types of Voice Commands:
      Navigational Voice Commands
      Mobile Control Voice Commands
      Car Control Voice Commands
      Multimedia & Entertainment Commands
      General, Question-Answer, and Search Commands
    Recording Time: Participants recorded the prompts at different times of day:
      Morning
      Afternoon
      Evening
    Recording Environment: Recordings were captured in varied environments to yield realistic data covering many types of noise. Environment variables include:
      Noise Level: Silent, Low Noise, Moderate Noise, High Noise
      Parking Location: Indoor, Outdoor
      Car Windows: Open, Closed
      Car AC: On, Off
      Car Engine: On, Off
      Car Movement: Stationary, Moving

    Metadata

    The dataset provides comprehensive metadata for each audio recording and participant:

    Participant Metadata: Unique identifier, age, gender, country, state, district, accent, and dialect.

  4. In-Car Speech Dataset: English (Canada)

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). In-Car Speech Dataset: English (Canada) [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/in-car-speech-dataset-english-canada
    Available download formats
    wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    Canada
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Canadian English Language In-car Speech Dataset, a comprehensive collection of audio recordings designed to facilitate the development of speech recognition models specifically tailored for in-car environments. This dataset aims to support research and innovation in automotive speech technology, enabling seamless and robust voice interactions within vehicles for drivers and co-passengers.

    Speech Data

    This dataset comprises over 5,000 high-quality audio recordings collected from various in-car environments. These recordings include scripted wake words and command-type prompts.

    Participant Diversity:
    Speakers: 50+ native English speakers from the FutureBeeAI Community.
    Regions: Balanced representation of accents, dialects, and demographics across Canada.
    Participant Profile: Participants range from 18 to 70 years old, with a male-to-female ratio of 60:40.
    Recording Nature: Scripted wake-word and command-type audio recordings.
    Duration: Recordings average between 5 and 20 seconds each.
    Formats: WAV, mono, 16-bit depth. The dataset includes recordings sampled at 16 kHz and 48 kHz.

    Dataset Diversity

    Beyond participant diversity, the dataset varies in wake words, voice commands, and recording environments.

    Automobile-Related Wake Words: Hey Mercedes, Hey BMW, Hey Porsche, Hey Volvo, Hey Audi, Hi Genesis, Hey Mini, Hey Toyota, Ok Ford, Hey Hyundai, Ok Honda, Hello Kia, Hey Dodge.
    Cars: Data was collected in different types and models of cars.
    Types of Voice Commands:
      Navigational Voice Commands
      Mobile Control Voice Commands
      Car Control Voice Commands
      Multimedia & Entertainment Commands
      General, Question-Answer, and Search Commands
    Recording Time: Participants recorded the prompts at different times of day:
      Morning
      Afternoon
      Evening
    Recording Environment: Recordings were captured in varied environments to yield realistic data covering many types of noise. Environment variables include:
      Noise Level: Silent, Low Noise, Moderate Noise, High Noise
      Parking Location: Indoor, Outdoor
      Car Windows: Open, Closed
      Car AC: On, Off
      Car Engine: On, Off
      Car Movement: Stationary, Moving

    Metadata

    The dataset provides comprehensive metadata for each audio recording and participant:

    Participant Metadata: Unique identifier, age, gender, country, state, district, accent, and dialect.

  5. In-Car Speech Dataset: English (Australia)

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). In-Car Speech Dataset: English (Australia) [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/in-car-speech-dataset-english-australia
    Available download formats
    wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    Australia
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Australian English Language In-car Speech Dataset, a comprehensive collection of audio recordings designed to facilitate the development of speech recognition models specifically tailored for in-car environments. This dataset aims to support research and innovation in automotive speech technology, enabling seamless and robust voice interactions within vehicles for drivers and co-passengers.

    Speech Data

    This dataset comprises over 5,000 high-quality audio recordings collected from various in-car environments. These recordings include scripted wake words and command-type prompts.

    Participant Diversity:
    Speakers: 50+ native English speakers from the FutureBeeAI Community.
    Regions: Balanced representation of accents, dialects, and demographics across Australia.
    Participant Profile: Participants range from 18 to 70 years old, with a male-to-female ratio of 60:40.
    Recording Nature: Scripted wake-word and command-type audio recordings.
    Duration: Recordings average between 5 and 20 seconds each.
    Formats: WAV, mono, 16-bit depth. The dataset includes recordings sampled at 16 kHz and 48 kHz.

    Dataset Diversity

    Beyond participant diversity, the dataset varies in wake words, voice commands, and recording environments.

    Automobile-Related Wake Words: Hey Mercedes, Hey BMW, Hey Porsche, Hey Volvo, Hey Audi, Hi Genesis, Hey Mini, Hey Toyota, Ok Ford, Hey Hyundai, Ok Honda, Hello Kia, Hey Dodge.
    Cars: Data was collected in different types and models of cars.
    Types of Voice Commands:
      Navigational Voice Commands
      Mobile Control Voice Commands
      Car Control Voice Commands
      Multimedia & Entertainment Commands
      General, Question-Answer, and Search Commands
    Recording Time: Participants recorded the prompts at different times of day:
      Morning
      Afternoon
      Evening
    Recording Environment: Recordings were captured in varied environments to yield realistic data covering many types of noise. Environment variables include:
      Noise Level: Silent, Low Noise, Moderate Noise, High Noise
      Parking Location: Indoor, Outdoor
      Car Windows: Open, Closed
      Car AC: On, Off
      Car Engine: On, Off
      Car Movement: Stationary, Moving

    Metadata

    The dataset provides comprehensive metadata for each audio recording and participant:

    Participant Metadata: Unique identifier, age, gender, country, state, district, accent, and dialect.


In-Car Speech Dataset: English (US)

American English In-car Audio corpus

