Annual State-reported licensed driver data for the 50 States and DC, from Highway Statistics table DL-22.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The graph illustrates the number of truck drivers in the United States from 1997 to 2024. The x-axis represents the years, ranging from 1997 to 2024, while the y-axis denotes the number of truck drivers, spanning from 2,247,000 in 2010 to 3,064,890 in 2023. Throughout this period, the number of truck drivers generally increased, starting at 264,258 in 1997 and reaching its highest point in 2024. Notable fluctuations include significant decreases in 1998 and 2002, followed by steady growth in subsequent years. Overall, the data exhibits an upward trend in the number of truck drivers over the 27-year span. This information is presented in a line graph format, effectively highlighting the annual changes and long-term growth in truck driver numbers in the United States.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States Average Vehicles per Household: 4 or More Licensed Drivers data was reported at 4.100 Unit in 2017. This records an increase from the previous number of 3.900 Unit for 2009. United States Average Vehicles per Household: 4 or More Licensed Drivers data is updated yearly, averaging 3.850 Unit from Dec 1991 (Median) to 2017, with 4 observations. The data reached an all-time high of 4.100 Unit in 2017 and a record low of 3.800 Unit in 2001. United States Average Vehicles per Household: 4 or More Licensed Drivers data remains in active status in CEIC and is reported by the Center for Transportation Analysis. The data is categorized under Global Database’s United States – Table US.TA003: Number of Vehicles per Household.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the US English Language In-car Speech Dataset, a comprehensive collection of audio recordings designed to facilitate the development of speech recognition models specifically tailored for in-car environments. This dataset aims to support research and innovation in automotive speech technology, enabling seamless and robust voice interactions within vehicles for drivers and co-passengers.
This dataset comprises over 5,000 high-quality audio recordings collected from various in-car environments. These recordings include scripted wake words and command-type prompts.
Participant Diversity:
- Speakers: 50+ native English speakers from the FutureBeeAI Community.
- Regions: Ensures a balanced representation of United States of America accents, dialects, and demographics.
- Participant Profile: Participants range from 18 to 70 years old, representing both males and females in a 60:40 ratio, respectively.
Recording Nature: Scripted wake word and command type of audio recordings.
- Duration: Average duration of 5 to 20 seconds per audio recording.
- Formats: WAV format with mono channels and a bit depth of 16 bits. The dataset contains recordings at both 16 kHz and 48 kHz sample rates.
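The stated format (mono, 16-bit WAV at 16 kHz or 48 kHz) can be checked per file before training. The following is a minimal sketch using only Python's standard `wave` module; the helper names `describe_wav` and `matches_stated_format` are hypothetical, and the one-second silent clip is generated in memory purely for illustration rather than taken from the dataset.

```python
import io
import wave


def describe_wav(fileobj):
    """Read basic format parameters from a WAV file or file-like object."""
    with wave.open(fileobj, "rb") as w:
        return {
            "channels": w.getnchannels(),
            "bit_depth": w.getsampwidth() * 8,  # sample width is in bytes
            "sample_rate_hz": w.getframerate(),
            "duration_s": w.getnframes() / w.getframerate(),
        }


def matches_stated_format(info):
    """True if the clip is mono, 16-bit, and sampled at 16 kHz or 48 kHz."""
    return (
        info["channels"] == 1
        and info["bit_depth"] == 16
        and info["sample_rate_hz"] in (16000, 48000)
    )


# Illustrative input only: one second of 16 kHz mono silence built in memory.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(b"\x00\x00" * 16000)
buf.seek(0)

info = describe_wav(buf)
print(info, matches_stated_format(info))
```

For a real recording, pass the file path to `describe_wav` instead of the in-memory buffer.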
Apart from participant diversity, the dataset is diverse in terms of different wake words, voice commands, and recording environments.
Different Automobile Related Wake Words: Hey Mercedes, Hey BMW, Hey Porsche, Hey Volvo, Hey Audi, Hi Genesis, Hey Mini, Hey Toyota, Ok Ford, Hey Hyundai, Ok Honda, Hello Kia, Hey Dodge.
Different Cars: Data collection was carried out in different types and models of cars.
Different Types of Voice Commands:
- Navigational Voice Commands
- Mobile Control Voice Commands
- Car Control Voice Commands
- Multimedia & Entertainment Commands
- General, Question Answer, Search Commands
Recording Time: Participants recorded the given prompts at various times to make the dataset more diverse.
- Morning
- Afternoon
- Evening
Recording Environment: Various recording environments were captured to acquire more realistic data and to make the dataset inclusive of various types of noises. Some of the environment variables are as follows:
- Noise Level: Silent, Low Noise, Moderate Noise, High Noise
- Parking Location: Indoor, Outdoor
- Car Windows: Open, Closed
- Car AC: On, Off
- Car Engine: On, Off
- Car Movement: Stationary, Moving
The dataset provides comprehensive metadata for each audio recording and participant:
Participant Metadata: Unique identifier, age, gender, country, state, district, accent, and dialect.
Other Metadata: Recording transcript, recording environment, device details, sample rate, bit depth, file format, recording time.
This metadata is a powerful tool for understanding and characterizing the data, enabling informed decision-making in the development of English voice assistant speech recognition models.
This US English In-car audio dataset is created by FutureBeeAI and is available for commercial use.
This study focuses on the drinking and driving habits of Americans. The questionnaire contained 51 questions. Respondents were interviewed over the telephone and asked about their frequency of consumption of alcoholic beverages, where they most often drank, their mode of transportation to and from this location, their driving and drinking experiences, and their age, sex, educational attainment, and socioeconomic status.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Key information about US Number of Registered Vehicles
The ORNL Driver Identification Dataset was created to collect and analyze driving behavior data from 50 different drivers. Each driver operated a 2014 Kenworth T270 Class 6 truck around Fort Collins, Colorado while various data sources recorded their driving behavior and vehicle performance. The dataset includes CANbus (Controller Area Network) data, GPS data, inertial measurement data, and biometric data from a heart rate monitor. A cyberattack was executed during each drive, which caused multiple dashboard warning lights to illuminate and set the tachometer and speedometer to zero, regardless of actual speed. The attack was stopped either after one minute or if the driver pulled over.
Sample Data: https://cloud.drivertechnologies.com/shared?s=146&t=4:03&token=0f469c88-d578-4b4f-80b2-f53f195683b2
At Driver Technologies, we are dedicated to harnessing advanced technology to gather anonymized critical driving data through our innovative dash cam app, which operates seamlessly on end users' smartphones. Our Speed Over Limit Driver Behavior Data offering is a key resource for understanding driver behavior and improving safety on the roads, making it an essential tool for various industries.
What Makes Our Data Unique? Our Speed Over Limit Driver Behavior Data is distinguished by its real-time collection capabilities, utilizing our built-in computer vision technology to identify and capture instances where a driver nearly gets into an accident. This data reflects critical safety events that are indicative of potential risks and non-compliance with traffic regulations. By providing data on these significant events, our dataset empowers clients to perform in-depth analysis.
How Is the Data Generally Sourced? Our data is sourced directly from users who utilize our dash cam app, which harnesses the smartphone’s camera and sensors to record during a trip. This direct sourcing method ensures that our data is unbiased and represents a wide variety of conditions and environments. The data is not only authentic and reflective of current road conditions but is also abundant in volume, offering millions of miles of recorded trips that cover diverse scenarios. For our Speed Over Limit Driver Behavior Data, we leverage computer vision models to read speed limit signs as the driver drives past them, then compare that to speed data captured using the phone's sensor.
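The comparison described above, a sign-derived speed limit checked against sensor speed, can be sketched roughly as follows. This is an illustration only and not Driver Technologies' implementation: the `Sample` record, its field names, the mph units, and the rule that a detected limit stays in force until the next sign is read are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sample:
    t: float                             # seconds since trip start
    speed_mph: float                     # speed from the phone's sensor
    detected_limit_mph: Optional[float]  # limit read from a sign here, else None


def speeding_samples(
    samples: List[Sample], tolerance_mph: float = 0.0
) -> List[Tuple[float, float, float]]:
    """Return (time, speed, limit) for samples above the last detected limit."""
    current_limit = None
    flagged = []
    for s in samples:
        if s.detected_limit_mph is not None:
            # Assumption: a newly read sign replaces the previous limit.
            current_limit = s.detected_limit_mph
        if current_limit is not None and s.speed_mph > current_limit + tolerance_mph:
            flagged.append((s.t, s.speed_mph, current_limit))
    return flagged


# Made-up trip data for illustration.
trip = [
    Sample(0, 30, None),
    Sample(1, 33, 35),
    Sample(2, 38, None),
    Sample(3, 41, None),
    Sample(4, 28, 25),
    Sample(5, 24, None),
]
events = speeding_samples(trip)
print(events)
```

Samples recorded before any sign has been read are never flagged, since no limit is known at that point.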
Primary Use-Cases and Verticals
Driver Behavior Analysis: Organizations can leverage our dataset to analyze driving habits and identify trends in driver behavior. This analysis can help in understanding patterns related to rule compliance and potential risk factors.
Training Computer Vision Models: Clients can utilize our annotated data to develop and refine their own computer vision models for applications in autonomous vehicles, ensuring better decision-making capabilities in complex driving environments.
Improving Risk Assessment: Insurers can utilize our dataset to refine their risk assessment models. By understanding the frequency and context of significant events, they can better evaluate driver risk profiles, leading to more accurate premium pricing and improved underwriting processes.
Integration with Our Broader Data Offering
The Speed Over Limit Driver Behavior Data is a crucial component of our broader data offerings at Driver Technologies. It complements our extensive library of driving data collected from various vehicles and road users, creating a comprehensive data ecosystem that supports multiple verticals, including insurance, automotive technology, and smart city planning.
In summary, Driver Technologies' Speed Over Limit Driver Behavior Data provides a unique opportunity for data buyers to access high-quality, actionable insights that drive innovation across mobility. By integrating our Speed Over Limit Driver Behavior Data with other datasets, clients can gain a holistic view of transportation dynamics, enhancing their analytical capabilities and decision-making processes.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
This repository hosts the Testing Roads for Autonomous VEhicLes (TRAVEL) dataset. TRAVEL is an extensive collection of virtual roads that have been used for testing lane assist/keeping systems (i.e., driving agents), together with data from their execution in a state-of-the-art, physically accurate driving simulator called BeamNG.tech. Virtual roads consist of sequences of road points interpolated using cubic splines.
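To make the idea of interpolating road points with cubic splines concrete, here is a small pure-Python sketch. It uses a uniform Catmull-Rom spline, which is one kind of cubic spline chosen here because it needs no external libraries; the source does not say which spline formulation the TRAVEL tooling uses, so this should not be read as its implementation. The function names and the sample road are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def catmull_rom(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate the uniform Catmull-Rom cubic between p1 and p2 at t in [0, 1]."""
    def axis(a: float, b: float, c: float, d: float) -> float:
        return 0.5 * (
            2 * b
            + (c - a) * t
            + (2 * a - 5 * b + 4 * c - d) * t ** 2
            + (3 * b - a - 3 * c + d) * t ** 3
        )
    return (axis(p0[0], p1[0], p2[0], p3[0]), axis(p0[1], p1[1], p2[1], p3[1]))


def interpolate_road(road_points: List[Point], samples_per_segment: int = 5) -> List[Point]:
    """Densify a road by sampling a cubic curve through consecutive road points."""
    if len(road_points) < 2:
        raise ValueError("need at least two road points")
    # Repeat the end points so the first and last segments have four controls.
    padded = [road_points[0]] + list(road_points) + [road_points[-1]]
    out = []
    for i in range(len(padded) - 3):
        p0, p1, p2, p3 = padded[i:i + 4]
        for s in range(samples_per_segment):
            out.append(catmull_rom(p0, p1, p2, p3, s / samples_per_segment))
    out.append(road_points[-1])
    return out


# Made-up road points for illustration.
road = [(0, 0), (10, 0), (20, 10), (30, 10)]
dense = interpolate_road(road, samples_per_segment=4)
print(len(dense), dense[:3])
```

The curve passes through every original road point, so the sparse points remain recoverable from the dense list at every `samples_per_segment`-th index.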
Along with the data, this repository contains instructions on how to install the tooling necessary to generate new data (i.e., test cases) and analyze them in the context of test regression. We focus on test selection and test prioritization, given their importance for developing high-quality software following the DevOps paradigms.
This dataset builds on top of our previous work in this area, including work on
Dataset Overview
The TRAVEL dataset is available under the data folder and is organized as a set of experiment folders. Each of these folders is generated by running the test-generator (see below) and contains the configuration used for generating the data (experiment_description.csv), various statistics on generated tests (generation_stats.csv), and found faults (oob_stats.csv). Additionally, the folders contain the raw test cases generated and executed during each experiment (test.).
The following sections describe what each of those files contains.
Experiment Description
The experiment_description.csv contains the settings used to generate the data, including:
Experiment Statistics
The generation_stats.csv contains statistics about the test generation, including:
The TRAVEL dataset also contains statistics about the failed tests, including the overall number of failed tests (total oob) and its breakdown into OOB that happened while driving left or right. Further statistics about the diversity (i.e., sparseness) of the failures are also reported.
Test Cases and Executions
Each test. contains information about a test case and, if the test case is valid, the data observed during its execution in the driving simulation.
The data about the test case definition include:
the road contains sharp turns or the road self intersects)
The test data are organized according to the following JSON Schema and can be interpreted as RoadTest objects provided by the tests_generation.py module.
{
  "type": "object",
  "properties": {
    "id": { "type": "integer" },
    "is_valid": { "type": "boolean" },
    "validation_message": { "type": "string" },
    "road_points": {
      "type": "array",
      "items": { "$ref": "schemas/pair" }
    },
    "interpolated_points": {
      "type": "array",
      "items": { "$ref": "schemas/pair" }
    },
    "test_outcome": { "type": "string" },
    "description": { "type": "string" },
    "execution_data": {
      "type": "array",
      "items": { "$ref": "schemas/simulationdata" }
    }
  },
  "required": [
    "id", "is_valid", "validation_message",
    "road_points", "interpolated_points"
  ]
}
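As a rough illustration of how a test case file could be read and checked against the required fields listed in the schema above, consider the sketch below. It uses only the standard `json` module and does not use the repository's tests_generation.py; the helper `load_test_case` and the sample record are hypothetical, and representing each "schemas/pair" as a two-element `[x, y]` array is an assumption, since that sub-schema is not shown here.

```python
import json

# Required keys, copied from the "required" list of the schema above.
REQUIRED_FIELDS = (
    "id", "is_valid", "validation_message",
    "road_points", "interpolated_points",
)


def load_test_case(text: str) -> dict:
    """Parse a test case and fail early if a required field is absent."""
    record = json.loads(text)
    missing = [key for key in REQUIRED_FIELDS if key not in record]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return record


# Made-up record for illustration; pairs are assumed to be [x, y] arrays.
sample = """
{
  "id": 1,
  "is_valid": true,
  "validation_message": "",
  "road_points": [[0, 0], [10, 5]],
  "interpolated_points": [[0, 0], [5, 2.5], [10, 5]]
}
"""
record = load_test_case(sample)
print(record["id"], record["is_valid"], len(record["interpolated_points"]))
```

Optional fields such as test_outcome and execution_data are left unchecked because the schema does not list them as required.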
Finally, the execution data contain a list of timestamped state information recorded by the driving simulation. State information is collected at a constant frequency and includes the absolute position, rotation, and velocity of the ego-car, its speed in km/h, and control inputs from the driving agent (steering, throttle, and braking). Additionally, execution data contain OOB-related data, such as the lateral distance between the car and the lane center and the OOB percentage (i.e., how much the car is outside the lane).
The simulation data adhere to the following (simplified) JSON Schema and can be interpreted as Python objects using the simulation_data.py module.
{
  "$id": "schemas/simulationdata",
  "type": "object",
  "properties": {
    "timer": { "type": "number" },
    "pos": {
      "type": "array",
      "items": { "$ref": "schemas/triple" }
    },
    "vel": {
      "type": "array",
      "items": { "$ref": "schemas/triple" }
    },
    "vel_kmh": { "type": "number" },
    "steering": { "type": "number" },
    "brake": { "type": "number" },
    "throttle": { "type": "number" },
    "is_oob": { "type": "number" },
    "oob_percentage": { "type": "number" }
  },
  "required": [
    "timer", "pos", "vel", "vel_kmh",
    "steering", "brake", "throttle",
    "is_oob", "oob_percentage"
  ]
}
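A short sketch of how execution data shaped like this schema might be summarized is given below. It does not use the repository's simulation_data.py; the function `summarize_oob` is hypothetical, the sample states are made up and carry only the scalar fields the function reads, and two interpretations are assumed: a non-zero is_oob means the car is out of bounds, and oob_percentage is a fraction between 0 and 1.

```python
from typing import Dict, List, Optional, Union

Number = Union[int, float]


def summarize_oob(states: List[Dict[str, Number]]) -> Dict[str, Optional[Number]]:
    """Summarize out-of-bound (OOB) behaviour over a list of simulation states."""
    if not states:
        raise ValueError("execution data is empty")
    # Assumption: any non-zero is_oob value marks an out-of-bound sample.
    oob_states = [s for s in states if s["is_oob"]]
    return {
        "samples": len(states),
        "oob_samples": len(oob_states),
        "max_oob_percentage": max(s["oob_percentage"] for s in states),
        "first_oob_time": oob_states[0]["timer"] if oob_states else None,
        "max_speed_kmh": max(s["vel_kmh"] for s in states),
    }


# Made-up states for illustration; real records also carry pos, vel, and controls.
states = [
    {"timer": 0.0, "vel_kmh": 20.0, "is_oob": 0, "oob_percentage": 0.0},
    {"timer": 0.5, "vel_kmh": 45.0, "is_oob": 0, "oob_percentage": 0.1},
    {"timer": 1.0, "vel_kmh": 60.0, "is_oob": 1, "oob_percentage": 0.6},
    {"timer": 1.5, "vel_kmh": 55.0, "is_oob": 1, "oob_percentage": 0.9},
]
summary = summarize_oob(states)
print(summary)
```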
Dataset Content
The TRAVEL dataset is an active initiative, so its content is subject to change. Currently, the dataset contains the data collected during the SBST CPS tool competition, and data collected in the context of our recent work on test selection (SDC-Scissor work and tool) and test prioritization (automated test cases prioritization work for SDCs).
SBST CPS Tool Competition Data
The data collected during the SBST CPS tool competition are stored inside data/competition.tar.gz. The file contains the test cases generated by Deeper, Frenetic, AdaFrenetic, and Swat, the open-source test generators submitted to the competition and executed against BeamNG.AI with an aggression factor of 0.7 (i.e., conservative driver).
Name | Map Size (m x m) | Max Speed (km/h) | Budget (h) | OOB Tolerance |
---|---|---|---|---|
This dataset provides information on motor vehicle operators (drivers) involved in traffic collisions occurring on county and local roadways. The dataset reports details of all traffic collisions occurring on county and local roadways within Montgomery County, as collected via the Automated Crash Reporting System (ACRS) of the Maryland State Police, and reported by the Montgomery County Police, Gaithersburg Police, Rockville Police, or the Maryland-National Capital Park Police. This dataset lists each recorded collision and the drivers involved. Please note that these collision reports are based on preliminary information supplied to the Police Department by the reporting parties. Therefore, the collision data available on this web page may reflect:
- Information not yet verified by further investigation
- Information that may include verified and unverified collision data
- Preliminary collision classifications that may be changed at a later date based upon further investigation
- Information that may include mechanical or human error
This dataset can be joined with the other 2 Crash Reporting datasets (see URLs below) by the State Report Number.
- Crash Reporting - Incidents Data at https://data.montgomerycountymd.gov/Public-Safety/Crash-Reporting-Incidents-Data/bhju-22kf
- Crash Reporting - Non-Motorists Data at https://data.montgomerycountymd.gov/Public-Safety/Crash-Reporting-Non-Motorists-Data/n7fk-dce5
Update Frequency: Weekly
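Joining the driver records to the incident records on the State Report Number could look roughly like the sketch below, which uses only Python's standard `csv` module. The column header `Report Number`, the other column names, and the sample rows are assumptions for illustration; check the actual headers of the downloaded CSV files before relying on them.

```python
import csv
import io
from typing import Dict, List

# Assumed header of the join column; verify against the real files.
KEY = "Report Number"


def join_on_report_number(incidents_csv: str, drivers_csv: str) -> List[Dict[str, str]]:
    """Attach incident-level fields to every driver row sharing a report number."""
    incidents = {row[KEY]: row for row in csv.DictReader(io.StringIO(incidents_csv))}
    joined = []
    for driver in csv.DictReader(io.StringIO(drivers_csv)):
        incident = incidents.get(driver[KEY])
        if incident is not None:  # inner join: skip drivers without an incident row
            joined.append({**incident, **driver})
    return joined


# Made-up rows for illustration only.
incidents_csv = "Report Number,Weather\nA1,CLEAR\nA2,RAIN\n"
drivers_csv = "Report Number,Vehicle Year\nA1,2015\nA1,2019\nA3,2008\n"

joined = join_on_report_number(incidents_csv, drivers_csv)
for row in joined:
    print(row)
```

Because one collision can involve several drivers, a single incident row may appear in multiple joined rows.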
https://creativecommons.org/publicdomain/zero/1.0/
A comprehensive, real-world–anchored synthetic dataset capturing 2,133 luxury beauty pop-up events across global retail hotspots. It focuses on limited-edition product drops, experiential formats, and performance KPIs—especially footfall and sell‑through. The data is designed for analytics use cases such as demand forecasting, footfall modeling, merchandising optimization, pricing analysis, and market expansion studies across regions and venue types.
Column | Type | Example | Description |
---|---|---|---|
event_id | string | POP100282 | Unique identifier for each pop‑up event. |
brand | string | Charlotte Tilbury | Luxury/premium cosmetics brand running the pop‑up. |
region | string | North America | Macro market region (North America, Europe, Middle East, Asia‑Pacific, Latin America). |
city | string | Miami | City of the event; occasionally null to simulate real‑world data gaps. |
location_type | string | Art/Design District | Venue archetype: High‑Street, Luxury Mall, Dept Store Atrium, Airport Duty‑Free, Art/Design District. |
event_type | string | Flash Event | Pop‑up format: Standalone, Shop‑in‑Shop, Mobile Truck, Flash Event, Mall Kiosk. |
start_date | date | 2024-02-25 | Event start date. |
end_date | date | 2024-03-02 | Event end date; can be null (e.g., ongoing/TBC) to reflect operational uncertainty. |
lease_length_days | integer | 6 | Duration of the activation (days), aligned with short‑term pop‑up leases. |
sku | string | LE-UQYNQA1A | Limited‑release product code tied to the event/dataset scope. |
product_name | string | Charlotte Tilbury Glow Mascara | Branded product listing (luxury‑oriented descriptors + category). |
price_usd | float | 62.21 | Ticket price (USD) aligned with luxury cosmetics price bands by category. |
avg_daily_footfall | integer | 1107 | Estimated average daily visitors based on venue, format, and activation intensity. |
units_sold | integer | 3056 | Total units sold during the event window; capped by allocation dynamics. |
sell_through_pct | float | 98.9 | Share of allocated inventory sold (%), proxy for demand strength and launch success. |
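The relationship between start_date, end_date, and lease_length_days can be checked with a few lines of Python. This is a sketch under one assumption: that the lease length is the plain difference in days between the two dates, which is consistent with the example row in the table (2024-02-25 to 2024-03-02, 6 days) but is not stated explicitly. The function name is hypothetical.

```python
from datetime import date
from typing import Optional


def lease_length_days(start: date, end: Optional[date]) -> Optional[int]:
    """Days between start and end; None when end_date is null (ongoing/TBC)."""
    if end is None:
        return None
    return (end - start).days


# Example row from the column table above.
example = lease_length_days(date(2024, 2, 25), date(2024, 3, 2))
print(example)
```

Rows with a null end_date return `None`, so they can be filtered out or handled separately before duration-based analysis.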
Registration information on interstate, intrastate non-hazmat, and intrastate truck and bus companies that operate in the United States and have registered with FMCSA. Contains contact information and demographic information (number of drivers, vehicles, commodities carried, etc.).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Car Registrations in the United States increased to 240.90 Thousand in August from 221.50 Thousand in July of 2025. This dataset provides United States Car Registrations: actual values, historical data, forecast, chart, statistics, economic calendar, and news.
Over the course of the 20th century, the number of operational motor vehicles in the United States grew significantly, from just 8,000 automobiles in the year 1900 to more than 183 million private and commercial vehicles in the late 1980s. Generally, the number of vehicles increased in each year, with the most notable exceptions during the Great Depression and Second World War.
This is a list of authorized providers who offer the 24-hour TLC Driver Education Course and exam for the TLC Driver License. All TLC Driver License applicants must complete the course and pass an 80-question multiple-choice exam on a computer with a grade of 70% or higher (you must answer 56 out of 80 questions correctly in order to pass). The course covers the following topics: TLC rules and regulations; geography; safe driving skills; traffic rules; and customer service.
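The pass mark quoted above follows directly from the percentage: 70% of 80 questions is 56. A tiny sketch of that arithmetic, using integer math to avoid floating-point rounding, is shown below; the function name is hypothetical.

```python
def min_correct_answers(total_questions: int, pass_percent: int) -> int:
    """Smallest number of correct answers that reaches the passing percentage."""
    # Ceiling division on integers: -(-a // b) rounds a / b up.
    return -(-total_questions * pass_percent // 100)


needed = min_correct_answers(80, 70)
print(needed)
```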
This dataset features over 1,000,000 high-quality images of cars, sourced globally from photographers, enthusiasts, and automotive content creators. Optimized for AI and machine learning applications, it provides richly annotated and visually diverse automotive imagery suitable for a wide array of use cases in mobility, computer vision, and retail.
Key Features:
1. Comprehensive Metadata: each image includes full EXIF data and detailed annotations such as car make, model, year, body type, view angle (front, rear, side, interior), and condition (e.g., showroom, on-road, vintage, damaged). Ideal for training in classification, detection, OCR for license plates, and damage assessment.
2. Unique Sourcing Capabilities: the dataset is built from images submitted through a proprietary gamified photography platform with auto-themed competitions. Custom datasets can be delivered within 72 hours targeting specific brands, regions, lighting conditions, or functional contexts (e.g., race cars, commercial vehicles, taxis).
3. Global Diversity: contributors from over 100 countries ensure broad coverage of car types, manufacturing regions, driving orientations, and environmental settings, from luxury sedans in urban Europe to pickups in rural America and tuk-tuks in Southeast Asia.
4. High-Quality Imagery: images range from standard to ultra-HD and include professional-grade automotive photography, dealership shots, roadside captures, and street-level scenes. A mix of static and dynamic compositions supports diverse model training.
5. Popularity Scores: each image includes a popularity score derived from GuruShots competition performance, offering valuable signals for consumer appeal, aesthetic evaluation, and trend modeling.
6. AI-Ready Design: this dataset is structured for use in applications like vehicle detection, make/model recognition, automated insurance assessment, smart parking systems, and visual search. It’s compatible with all major ML frameworks and edge-device deployments.
7. Licensing & Compliance: fully compliant with privacy and automotive content use standards, offering transparent and flexible licensing for commercial and academic use.
Use Cases:
1. Training AI for vehicle recognition in smart city, surveillance, and autonomous driving systems.
2. Powering car search engines, automotive e-commerce platforms, and dealership inventory tools.
3. Supporting damage detection, condition grading, and automated insurance workflows.
4. Enhancing mobility research, traffic analytics, and vision-based safety systems.
This dataset delivers a large-scale, high-fidelity foundation for AI innovation in transportation, automotive tech, and intelligent infrastructure. Custom dataset curation and region-specific filters are available. Contact us to learn more!
https://fred.stlouisfed.org/legal/#copyright-citation-required
Graph and download economic data for Automobile Registrations, Passenger Cars, Total for United States from 1895 to 1944 about car registrations, vehicles, and USA.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
USA ID CARD FRONT is a dataset for object detection tasks - it contains TEXT annotations for 29 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
When data and analytics leaders throughout Europe and the United States were asked in 2021 about the key business drivers behind their company's data and analytics priorities, over half cited generating revenue as their number one reason. Other popular business drivers included digital transformation, customer intimacy, and regulatory compliance, to name a few.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Moving 12-Month Total Vehicle Miles Traveled (M12MTVUSM227NFWA) from Dec 1970 to Jul 2025 about miles, travel, vehicles, and USA.