10 datasets found
  1. In-Car Speech Dataset: English (US)

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). In-Car Speech Dataset: English (US) [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/in-car-speech-dataset-english-us
    Available download formats
    wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    United States
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the US English Language In-car Speech Dataset, a comprehensive collection of audio recordings designed to facilitate the development of speech recognition models specifically tailored for in-car environments. This dataset aims to support research and innovation in automotive speech technology, enabling seamless and robust voice interactions within vehicles for drivers and co-passengers.

    Speech Data

    This dataset comprises over 5,000 high-quality audio recordings collected from various in-car environments. These recordings include scripted wake words and command-type prompts.

    Participant Diversity:
    Speakers: 50+ native English speakers from the FutureBeeAI Community.
    Regions: Ensures a balanced representation of United States of America accents, dialects, and demographics.
    Participant Profile: Participants range from 18 to 70 years old, representing both males and females in a 60:40 ratio, respectively.
    Recording Nature: Scripted wake word and command type of audio recordings.
    Duration: Average duration of 5 to 20 seconds per audio recording.
    Formats: WAV format, mono channel, 16-bit depth. The dataset contains recordings at 16 kHz and 48 kHz sample rates.
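A quick way to confirm that a downloaded file matches the format listed above is to read its WAV header. The following is a minimal sketch using Python's standard `wave` module; the function name `check_wav` and the constant `EXPECTED_RATES` are illustrative and not part of the dataset.

```python
import wave

# Sample rates named in the listing (16 kHz and 48 kHz).
EXPECTED_RATES = {16000, 48000}


def check_wav(path):
    """Return a list of ways the file deviates from the listed format."""
    problems = []
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample
        rate = wav.getframerate()
    if channels != 1:
        problems.append(f"expected mono, found {channels} channels")
    if sample_width != 2:
        problems.append(f"expected 16-bit samples, found {8 * sample_width}-bit")
    if rate not in EXPECTED_RATES:
        problems.append(f"unexpected sample rate: {rate} Hz")
    return problems
```

An empty list means the header agrees with the listing; the check does not inspect the audio content itself.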

    Dataset Diversity

    Apart from participant diversity, the dataset is diverse in terms of different wake words, voice commands, and recording environments.

    Different Automobile Related Wake Words: Hey Mercedes, Hey BMW, Hey Porsche, Hey Volvo, Hey Audi, Hi Genesis, Hey Mini, Hey Toyota, Ok Ford, Hey Hyundai, Ok Honda, Hello Kia, Hey Dodge.
    Different Cars: Data collection was carried out in different types and models of cars.
    Different Types of Voice Commands:
    Navigational Voice Commands
    Mobile Control Voice Commands
    Car Control Voice Commands
    Multimedia & Entertainment Commands
    General, Question Answer, Search Commands
    Recording Time: Participants recorded the given prompts at various times to make the dataset more diverse.
    Morning
    Afternoon
    Evening
    Recording Environment: Various recording environments were captured to acquire more realistic data and to make the dataset inclusive of various types of noises. Some of the environment variables are as follows:
    Noise Level: Silent, Low Noise, Moderate Noise, High Noise
    Parking Location: Indoor, Outdoor
    Car Windows: Open, Closed
    Car AC: On, Off
    Car Engine: On, Off
    Car Movement: Stationary, Moving
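To audit how well a set of recordings covers these conditions, the listed variables can be expanded into a grid of combinations. This is only a sketch: the dictionary keys are hypothetical field names (the listing does not show the per-recording metadata schema), and the listing describes these as only some of the environment variables.

```python
from itertools import product

# Values copied from the listing; key names are hypothetical.
ENVIRONMENT = {
    "noise_level": ["Silent", "Low Noise", "Moderate Noise", "High Noise"],
    "parking_location": ["Indoor", "Outdoor"],
    "car_windows": ["Open", "Closed"],
    "car_ac": ["On", "Off"],
    "car_engine": ["On", "Off"],
    "car_movement": ["Stationary", "Moving"],
}


def condition_grid(variables=ENVIRONMENT):
    """Return every combination of the listed environment values."""
    keys = list(variables)
    return [dict(zip(keys, values))
            for values in product(*(variables[k] for k in keys))]


def uncovered(observed, variables=ENVIRONMENT):
    """Return the combinations that no observed recording matches."""
    keys = list(variables)
    seen = {tuple(row[k] for k in keys) for row in observed}
    return [cell for cell in condition_grid(variables)
            if tuple(cell[k] for k in keys) not in seen]
```

Here `observed` is any iterable of dictionaries carrying the same keys, for example rows loaded from a metadata file.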

    Metadata

    The dataset provides comprehensive metadata for each audio recording and participant:

    Participant Metadata: Unique identifier, age, gender, country, state, district, accent, and dialect.

  2. In-Car Speech Dataset: Bulgarian (Bulgaria)

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). In-Car Speech Dataset: Bulgarian (Bulgaria) [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/in-car-speech-dataset-bulgarian-bulgaria
    Available download formats
    wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    Bulgaria
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the US Spanish Language In-car Speech Dataset, a comprehensive collection of audio recordings designed to facilitate the development of speech recognition models specifically tailored for in-car environments. This dataset aims to support research and innovation in automotive speech technology, enabling seamless and robust voice interactions within vehicles for drivers and co-passengers.

    Speech Data

    This dataset comprises over 5,000 high-quality audio recordings collected from various in-car environments. These recordings include scripted wake words and command-type prompts.

    Participant Diversity:
    Speakers: 50+ native Spanish speakers from the FutureBeeAI Community.
    Regions: Ensures a balanced representation of USA accents, dialects, and demographics.
    Participant Profile: Participants range from 18 to 70 years old, representing both males and females in a 60:40 ratio, respectively.
    Recording Nature: Scripted wake word and command type of audio recordings.
    Duration: Average duration of 5 to 20 seconds per audio recording.
    Formats: WAV format, mono channel, 16-bit depth. The dataset contains recordings at 16 kHz and 48 kHz sample rates.

    Dataset Diversity

    Apart from participant diversity, the dataset is diverse in terms of different wake words, voice commands, and recording environments.

    Different Automobile Related Wake Words: Hey Mercedes, Hey BMW, Hey Porsche, Hey Volvo, Hey Audi, Hi Genesis, Hey Mini, Hey Toyota, Ok Ford, Hey Hyundai, Ok Honda, Hello Kia, Hey Dodge.
    Different Cars: Data collection was carried out in different types and models of cars.
    Different Types of Voice Commands:
    Navigational Voice Commands
    Mobile Control Voice Commands
    Car Control Voice Commands
    Multimedia & Entertainment Commands
    General, Question Answer, Search Commands
    Recording Time: Participants recorded the given prompts at various times to make the dataset more diverse.
    Morning
    Afternoon
    Evening
    Recording Environment: Various recording environments were captured to acquire more realistic data and to make the dataset inclusive of various types of noises. Some of the environment variables are as follows:
    Noise Level: Silent, Low Noise, Moderate Noise, High Noise
    Parking Location: Indoor, Outdoor
    Car Windows: Open, Closed
    Car AC: On, Off
    Car Engine: On, Off
    Car Movement: Stationary, Moving

    Metadata

    The dataset provides comprehensive metadata for each audio recording and participant:

    Participant Metadata: Unique identifier, age, gender, country, state, district, accent, and dialect.

  3. In-Car Speech Dataset: Spanish (USA)

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). In-Car Speech Dataset: Spanish (USA) [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/in-car-speech-dataset-spanish-usa
    Available download formats
    wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    United States
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Canadian French Language In-car Speech Dataset, a comprehensive collection of audio recordings designed to facilitate the development of speech recognition models specifically tailored for in-car environments. This dataset aims to support research and innovation in automotive speech technology, enabling seamless and robust voice interactions within vehicles for drivers and co-passengers.

    Speech Data

    This dataset comprises over 5,000 high-quality audio recordings collected from various in-car environments. These recordings include scripted wake words and command-type prompts.

    Participant Diversity:
    Speakers: 50+ native French speakers from the FutureBeeAI Community.
    Regions: Ensures a balanced representation of Canada accents, dialects, and demographics.
    Participant Profile: Participants range from 18 to 70 years old, representing both males and females in a 60:40 ratio, respectively.
    Recording Nature: Scripted wake word and command type of audio recordings.
    Duration: Average duration of 5 to 20 seconds per audio recording.
    Formats: WAV format, mono channel, 16-bit depth. The dataset contains recordings at 16 kHz and 48 kHz sample rates.

    Dataset Diversity

    Apart from participant diversity, the dataset is diverse in terms of different wake words, voice commands, and recording environments.

    Different Automobile Related Wake Words: Hey Mercedes, Hey BMW, Hey Porsche, Hey Volvo, Hey Audi, Hi Genesis, Hey Mini, Hey Toyota, Ok Ford, Hey Hyundai, Ok Honda, Hello Kia, Hey Dodge.
    Different Cars: Data collection was carried out in different types and models of cars.
    Different Types of Voice Commands:
    Navigational Voice Commands
    Mobile Control Voice Commands
    Car Control Voice Commands
    Multimedia & Entertainment Commands
    General, Question Answer, Search Commands
    Recording Time: Participants recorded the given prompts at various times to make the dataset more diverse.
    Morning
    Afternoon
    Evening
    Recording Environment: Various recording environments were captured to acquire more realistic data and to make the dataset inclusive of various types of noises. Some of the environment variables are as follows:
    Noise Level: Silent, Low Noise, Moderate Noise, High Noise
    Parking Location: Indoor, Outdoor
    Car Windows: Open, Closed
    Car AC: On, Off
    Car Engine: On, Off
    Car Movement: Stationary, Moving

    Metadata

    The dataset provides comprehensive metadata for each audio recording and participant:

    Participant Metadata: Unique identifier, age, gender, country, state, district, accent, and dialect.

  4. Derelict Vehicle Dispositions - Vehicles

    • data.cityofnewyork.us
    • data.amerigeoss.org
    • +1more
    application/rdfxml +5
    Updated Sep 9, 2019
    Cite
    Department of Sanitation (DSNY) (2019). Derelict Vehicle Dispositions - Vehicles [Dataset]. https://data.cityofnewyork.us/City-Government/Derelict-Vehicle-Dispositions-Vehicles/bjuu-44hx
    Available download formats
    csv, application/rssxml, json, application/rdfxml, xml, tsv
    Dataset updated
    Sep 9, 2019
    Dataset authored and provided by
    Department of Sanitation (DSNY)
    Description

    Data on operations to remove derelict vehicles from city streets. Gives the disposition (complaints) of derelict vehicles reported to DSNY from 311. For information on how to report an abandoned vehicle, go to: http://www1.nyc.gov/nyc-resources/service/989/abandoned-vehicle

    Related datasets:
    • https://data.cityofnewyork.us/City-Government/Derelict-Vehicle-Dispositions-Tow/vr8p-8shw
    • https://data.cityofnewyork.us/City-Government/Derelict-Vehicles-Dispositions-Complaints/pq5i-thsu

  5. Derelict Vehicle Dispositions - Tow

    • catalog.data.gov
    • datadiscoverystudio.org
    • +1more
    Updated Oct 8, 2021
    Cite
    data.cityofnewyork.us (2021). Derelict Vehicle Dispositions - Tow [Dataset]. https://catalog.data.gov/uk_UA/dataset/derelict-vehicle-dispositions-tow
    Dataset updated
    Oct 8, 2021
    Dataset provided by
    data.cityofnewyork.us
    Description

    Provides data on abandoned vehicles on city streets that did not meet DSNY guidelines to be classified as derelict and that were tagged and reported to the NYPD Rotation Tow (ro-tow) Program for towing. For information on how to report an abandoned vehicle, go to: http://www1.nyc.gov/nyc-resources/service/989/abandoned-vehicle

    Related datasets:
    • https://data.cityofnewyork.us/City-Government/Derelict-Vehicles-Dispositions-Complaints/pq5i-thsu
    • https://data.cityofnewyork.us/City-Government/Derelict-Vehicle-Dispositions-Vehicles/bjuu-44hx

  6. DCASE 2024 Challenge Task 10 Development Dataset: Acoustic-based Traffic...

    • zenodo.org
    • explore.openaire.eu
    bin, zip
    Updated Sep 3, 2024
    Cite
    Luca Bondi; Shabnam Ghaffarzadegan; Stefano Damiano; Abinaya Kumar; Ho-Hsiang Wu; Wei-Cheng Lin; Samarjit Das; Hans-Georg Horst; Toon van Waterschoot (2024). DCASE 2024 Challenge Task 10 Development Dataset: Acoustic-based Traffic Monitoring [Dataset]. http://doi.org/10.5281/zenodo.10700792
    Available download formats
    zip, bin
    Dataset updated
    Sep 3, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Luca Bondi; Shabnam Ghaffarzadegan; Stefano Damiano; Abinaya Kumar; Ho-Hsiang Wu; Wei-Cheng Lin; Samarjit Das; Hans-Georg Horst; Toon van Waterschoot
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Directory structure:

    engine-sounds.zip
    |----- car [car engine sounds]
    |----- cv [commercial vehicles engine sounds]

    locX.zip
    |----- meta.json [contains meta information of traffic condition and sensor setup corresponding to the location]
    |----- train [train flac files inside]
    |----- train.csv
    |----- val [val flac files inside]
    |----- val.csv

    simulation.zip
    |----- locX
           |----- car
           |      |_ left [flac files and label csv inside]
           |      |_ right [flac files and label csv inside]
           |----- cv
                  |_ left [flac files and label csv inside]
                  |_ right [flac files and label csv inside]
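Once a `locX.zip` archive has been extracted, the layout above can be sanity-checked before training. The sketch below assumes the extracted folder mirrors the listed structure (`meta.json`, `train/`, `train.csv`, `val/`, `val.csv`); the function name `summarize_location` is illustrative and the CSV columns are not inspected because the listing does not describe them.

```python
from pathlib import Path


def summarize_location(loc_dir):
    """Report which expected files exist and how many flac files each split holds."""
    loc = Path(loc_dir)
    summary = {"meta.json": (loc / "meta.json").is_file()}
    for split in ("train", "val"):
        split_dir = loc / split
        summary[f"{split}.csv"] = (loc / f"{split}.csv").is_file()
        # Count flac files recursively; a missing split folder counts as zero.
        summary[f"{split}_flac"] = (
            sum(1 for _ in split_dir.rglob("*.flac")) if split_dir.is_dir() else 0
        )
    return summary
```

A call such as `summarize_location("loc1")` returns a small dictionary that makes a missing label file or an empty split easy to spot.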

    NOTE: We split zip files for large size folders (i.e., loc1, loc3, loc6). Please make sure to download all splits before unzipping the data! For example, run

    zip -s 0 loc1.zip --out unsplit-loc1.zip

    unzip unsplit-loc1.zip

    in your terminal once you have downloaded loc1.z01, loc1.z02, loc1.z03 and loc1.zip.
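Because the `zip -s 0` step needs every split part, it can help to check that all parts are present first. The following is a minimal sketch; the helper name `missing_split_parts` is hypothetical, and the number of parts must be supplied by the caller (the note above names three `.zNN` parts for loc1 only).

```python
from pathlib import Path


def missing_split_parts(directory, stem, n_parts):
    """List the split-archive files for `stem` that are absent from `directory`."""
    expected = [f"{stem}.z{i:02d}" for i in range(1, n_parts + 1)]
    expected.append(f"{stem}.zip")  # the final segment keeps the .zip suffix
    return [name for name in expected
            if not (Path(directory) / name).is_file()]
```

For loc1, `missing_split_parts(".", "loc1", 3)` checks for loc1.z01, loc1.z02, loc1.z03 and loc1.zip; an empty list means the merge command can be run.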

    Contact:

    If there is any question, please contact:

    Acknowledgement:

    This work has partially received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 956962 and from the European Research Council under the European Union's Horizon 2020 research and innovation program / ERC Consolidator Grant: SONORA (no. 773268). This work reflects only the authors' views and the Union is not liable for any use that may be made of the contained information.

  7. PhysicalAI-Autonomous-Vehicles-NuRec

    • huggingface.co
    Updated Aug 30, 2025
    Cite
    NVIDIA (2025). PhysicalAI-Autonomous-Vehicles-NuRec [Dataset]. https://huggingface.co/datasets/nvidia/PhysicalAI-Autonomous-Vehicles-NuRec
    Dataset updated
    Aug 30, 2025
    Dataset provided by
    Nvidia (http://nvidia.com/)
    Authors
    NVIDIA
    Description

    🚀 Update (August 8, 2025 – NuRec 25.07 Release)!!

    We've added new neural reconstructions with improved quality and a wider range of scenarios, including:

    • All-way stop and 2-way stop intersections
    • Vulnerable road users (VRUs) at crosswalks
    • Protected turns, curved lanes, and more

    The reconstructions were generated using 6 camera views (front-wide 120 deg, front-tele 30 deg, cross right/left 120 deg and rear right/left 70 deg). Find the new samples in the… See the full description on the dataset page: https://huggingface.co/datasets/nvidia/PhysicalAI-Autonomous-Vehicles-NuRec.

  8. Tesla Fire

    • tesla-fire.com
    • search.dataone.org
    • +3more
    csv
    Updated Feb 19, 2024
    Cite
    I Capulet (2024). Tesla Fire [Dataset]. http://doi.org/10.5281/zenodo.5520568
    Available download formats
    csv
    Dataset updated
    Feb 19, 2024
    Dataset provided by
    TSLAQ
    Authors
    I Capulet
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Time period covered
    Apr 2, 2013 - Present
    Variables measured
    fires
    Description

    A digital record of all Tesla fires - including cars and other products, e.g. Tesla MegaPacks - that are corroborated by news articles or confirmed primary sources. Latest version hosted at https://www.tesla-fire.com.

  9. Data from: National Highway System

    • data.ca.gov
    • gis.data.ca.gov
    • +1more
    Updated Jul 18, 2023
    Cite
    Caltrans (2023). National Highway System [Dataset]. https://data.ca.gov/dataset/national-highway-system
    Available download formats
    zip, kml, arcgis geoservices rest api, geojson, csv, html
    Dataset updated
    Jul 18, 2023
    Dataset authored and provided by
    Caltrans (http://dot.ca.gov/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The National Highway System consists of a network of roads important to the economy, defense and mobility. On October 1, 2012 the existing National Highway System (NHS) was expanded to include all existing Principal Arterials (i.e. Functional Classifications 1, 2 and 3) to the new Enhanced NHS.

    Under MAP-21, the Enhanced NHS is composed of rural and urban roads nationwide serving major population centers, international border crossings, intermodal transportation facilities, and major travel destinations. The NHS includes:

    • The Interstate System.
    • Other Principal arterials and border crossings on those routes (including other urban and rural principal arterial routes, and border crossings on those routes, that were not included on the NHS before the date of enactment of the MAP-21).
    • Intermodal connectors -- highways that provide motor vehicle access between the NHS and major intermodal transportation facilities.
    • STRAHNET -- the network of highways important to U.S. strategic defense.
    • STRAHNET connectors to major military installations.

  10. WATV Roadways

    • internal.open.piercecountywa.gov
    • open.piercecountywa.gov
    Updated Jul 5, 2024
    Cite
    (2024). WATV Roadways [Dataset]. https://internal.open.piercecountywa.gov/dataset/WATV-Roadways/xk6b-6m26
    Available download formats
    kml, application/geo+json, xml, kmz, csv, xlsx
    Dataset updated
    Jul 5, 2024
    Description

    Pierce County Ord. 2018-59 authorizes operation of wheeled all-terrain vehicles (WATV) on approved Pierce County roadways, effective Jan. 1, 2019 (https://online.co.pierce.wa.us:443/cfapps/council/model/otDocDownload.cfm?id=3611940&fileName=2018-59%20Signed%20Final%20Ordinance%20with%20Exhibits.pdf).

    Pierce County Ord. 2019-29 amends operation of wheeled all-terrain vehicles (WATV) on approved Pierce County roadways, effective Aug. 1, 2019 (https://online.co.pierce.wa.us:443/cfapps/council/model/otDocDownload.cfm?id=6706190&fileName=2019-29%20Signed%20Final%20Ordinance.pdf).

    Pierce County Ord. 2020-90s amends operation of wheeled all-terrain vehicles (WATV) on approved Pierce County roadways, effective Jan. 1, 2021 (https://online.co.pierce.wa.us/cfapps/council/model/otDocDownload.cfm?id=12523609&fileName=2020-90s%20Signed%20Final%20Ordinance%20with%20Exhibit.pdf).

    Pierce County Ord. 2022-64 amends operation of wheeled all-terrain vehicles (WATV) on approved Pierce County roadways, effective Jan. 1, 2023 (https://online.co.pierce.wa.us:443/cfapps/council/model/otDocDownload.cfm?id=24225109&fileName=2022-64%20Signed%20Final%20Ord%20with%20Exhibit.pdf).

Cite
FutureBee AI (2022). In-Car Speech Dataset: English (US) [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/in-car-speech-dataset-english-us

In-Car Speech Dataset: English (US)

American English In-car Audio corpus

