Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SDCC Traffic Congestion Saturation Flow Data for January to June 2023. Traffic volumes, traffic saturation, and congestion data for sites across South Dublin County. Used by traffic management to control stage timings on junctions. It is recommended that this dataset is read in conjunction with the ‘Traffic Data Site Names SDCC’ dataset.
A detailed description of each column heading can be referenced below;
scn: Site Serial number
region: A group of Nodes that are operated under SCOOT control at the same common cycle time. Normally these will be nodes between which co-ordination is desirable. Some of the nodes may be double cycling at half of the region cycle time.
system: SCOOT STC UTC (UTC-MX)
locn: Location
ssite: Site number
sday: Days of the week Monday to Sunday. Abbreviations; MO,TU,WE,TH,FR,SA,SU.
date: Reflects correct actual Date of when data was collected.
start_time: NOTE - Please ignore the date displayed in this column. The actual data collection date is correctly displayed in the 'date' column. The date displayed here is the date of when report was run and extracted from the system, but correctly reflects start time of 15 minute intervals.
end_time: End time of 15 minute intervals.
flow: A representation of demand (flow) for each link built up over several minutes by the SCOOT model. SCOOT has two profiles:
(1) Short – Raw data representing the actual values over the previous few minutes
(2) Long – A smoothed average of values over a longer period
SCOOT will choose to use the appropriate profile depending on a number of factors.
flow_pc: Same as above ref PC SCOOT
cong: Congestion is directly measured from the detector. If the detector is placed beyond the normal end of queue in the street it is rarely covered by stationary traffic, except of course when congestion occurs. If any detector shows standing traffic for the whole of an interval this is recorded. The number of intervals of congestion in any cycle is also recorded.
The percentage congestion is calculated from:
(No of congested intervals x 4 x 100) / cycle time in seconds
This percentage of congestion is available to view and more importantly for the optimisers to take into account.
cong_pc: Same as above ref PC SCOOT
dsat: The ratio of the demand flow to the maximum possible discharge flow, i.e. it is the ratio of the demand to the discharge rate (Saturation Occupancy) multiplied by the duration of the effective green time. The Split optimiser will try to minimise the maximum degree of saturation on links approaching the node.
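The percentage congestion calculation in the cong description can be sketched in a few lines of Python. This is only an illustration, assuming the formula reads as congested intervals times 4 times 100, divided by the cycle time in seconds, and that the factor 4 is the interval length in seconds. The function and parameter names are hypothetical and do not come from the dataset.

```python
def congestion_percentage(congested_intervals: int,
                          cycle_time_s: float,
                          interval_s: float = 4.0) -> float:
    """Percentage of a signal cycle during which the detector saw standing traffic.

    Assumption: each congestion interval lasts `interval_s` seconds
    (the factor 4 in the dataset description).
    """
    if cycle_time_s <= 0:
        raise ValueError("cycle_time_s must be positive")
    return congested_intervals * interval_s * 100.0 / cycle_time_s


# Illustrative values only: 6 congested intervals in a 96 second cycle.
example = congestion_percentage(6, 96)
print(example)
```

With these illustrative inputs, 24 of the 96 seconds are congested, so the function returns 25.0.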
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here are a few use cases for this project:
Traffic Flow Analysis: The dataset can be used in machine learning models to analyze traffic flow in cities. It can identify the type of vehicles on the city roads at different times of the day, helping in planning and traffic management.
Vehicle Class Based Toll Collection: Toll booths can use this model to automatically classify and charge vehicles based on their type, enabling a more efficient and automated system.
Parking Management System: Parking lot owners can use this model to easily classify vehicles as they enter for better space management. Knowing the vehicle type can help assign it to the most suitable parking spot.
Traffic Rule Enforcement: The dataset can be used to create a computer vision model to automatically detect any traffic violations like wrong lane driving by different vehicle types, and notify law enforcement agencies.
Smart Ambulance Tracking: The system can help in identifying and tracking ambulances and other emergency vehicles, enabling traffic management systems to provide priority routing during emergencies.
You can also access an API version of this dataset.
TMS (traffic monitoring system) daily-updated traffic counts API
Important note: due to the size of this dataset, you won't be able to open it fully in Excel. Use notepad / R / any software package which can open more than a million rows.
Data reuse caveats: as per license.
Data quality statement: please read the accompanying user manual, explaining:
how this data is collected
identification of count stations
traffic monitoring technology
monitoring hierarchy and conventions
typical survey specification
data calculation
TMS operation.
Traffic monitoring for state highways: user manual [PDF 465 KB]
The data is at daily granularity. However, the actual update
frequency of the data depends on the contract the site falls within. For telemetry
sites it's once a week on a Wednesday. Some regional sites are fortnightly, and
some monthly or quarterly. Some are only 4 weeks a year, with timing depending
on contractors’ programme of work.
Data quality caveats: you must use this data in conjunction with the user manual and the following caveats.
The road sensors used in data collection are subject to both technical errors and environmental interference.
Data is compiled from a variety of sources. Accuracy may vary and the data should only be used as a guide.
As not all road sections are monitored, a direct calculation of Vehicle Kilometres Travelled (VKT) for a region is not possible.
Data is sourced from Waka Kotahi New Zealand Transport Agency TMS data.
For sites that use dual loops, classification is by length. Vehicles with a length of less than 5.5m are classed as light vehicles. Vehicles over 11m long are classed as heavy vehicles. Vehicles between 5.5 and 11m are split 50:50 into light and heavy.
In September 2022, the National Telemetry contract was handed to a new contractor. During the handover process, due to some missing documents and aged technology, 40 of the 96 national telemetry traffic count sites went offline. The current contractor has continued to upload data from all active sites and has gradually worked to bring most offline sites back online. Please note and account for possible gaps in data from National Telemetry Sites.
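The dual-loop length classification described in the caveats can be illustrated with a short Python sketch. It assumes that vehicles of exactly 5.5m or 11m fall into the split band, which the text does not state explicitly. The function names are hypothetical and are not part of the TMS system.

```python
def length_class_weights(length_m: float) -> tuple[float, float]:
    """Return (light, heavy) weights for one vehicle by measured length.

    Under 5.5 m counts as light, over 11 m as heavy, and anything in
    between is split 50:50. Treating the exact boundaries as part of
    the split band is an assumption.
    """
    if length_m < 5.5:
        return 1.0, 0.0
    if length_m > 11.0:
        return 0.0, 1.0
    return 0.5, 0.5


def tally_by_length(lengths_m: list[float]) -> dict[str, float]:
    """Sum the light and heavy weights over a list of vehicle lengths."""
    light = heavy = 0.0
    for length in lengths_m:
        l, h = length_class_weights(length)
        light += l
        heavy += h
    return {"light": light, "heavy": heavy}


# Illustrative lengths in metres, not real count data.
counts = tally_by_length([4.2, 7.0, 12.5, 5.5])
print(counts)
```

Because of the 50:50 split, light and heavy totals from such sites can be fractional and should be read as estimates rather than exact vehicle counts.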
The NZTA Vehicle Classification Relationships diagram below shows the length classification (typically dual loops) and axle classification (typically pneumatic tube counts), and how these map to the Monetised benefits and costs manual, table A37, page 254.
Monetised benefits and costs manual [PDF 9 MB]
For the full TMS classification schema see Appendix A of the traffic counting manual vehicle classification scheme (NZTA 2011), below.
Traffic monitoring for state highways: user manual [PDF 465 KB]
State highway traffic monitoring (map)
State highway traffic monitoring sites
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Update Notes
Mar 16 2024, remove spaces in the file and folder names.
Mar 31 2024, delete the underscore in the city names with a space (such as San Francisco) in the '02_TransCAD_results' folder to ensure correct data loading by TransCAD (software version: 9.0).
Aug 31 2024, add the 'cityname_link_LinkFlows.csv' file in the '02_TransCAD_results' folder to match the link from input data and the link from TransCAD results (LinkFlows) with the same Link_ID.
Introduction
This is a unified and validated traffic dataset for 20 US cities. There are 3 folders for each city.
01 Input data
the initial network data obtained from OpenStreetMap (OSM)
the visualization of the OSM data
processed node / link / od data
02 TransCAD results (software version: 9.0)
cityname.dbd : geographical network database of the city supported by TransCAD (version 9.0)
cityname_link.shp / cityname_node.shp : network data supported by GIS software, which can be imported into TransCAD manually. Then the corresponding '.dbd' file can be generated for TransCAD with a version lower than 9.0
od.mtx : OD matrix supported by TransCAD
LinkFlows.bin / LinkFlows.csv : traffic assignment results by TransCAD
cityname_link_LinkFlows.csv : the input link attributes with the traffic assignment results by TransCAD
ShortestPath.mtx / ue_travel_time.csv : the travel time (min) between OD pairs by TransCAD
03 AequilibraE results (software version: 0.9.3)
cityname.shp : shapefile network data of the city supported by QGIS or other GIS software
od_demand.aem : OD matrix supported by AequilibraE
network.csv : the network file used for traffic assignment in AequilibraE
assignment_result.csv : traffic assignment results by AequilibraE
Publication
Xu, X., Zheng, Z., Hu, Z. et al. (2024). A unified dataset for the city-scale traffic assignment model in 20 U.S. cities. Sci Data 11, 325. https://doi.org/10.1038/s41597-024-03149-8
Usage Notes
If you use this dataset in your research or any other work, please cite both the dataset and paper above.
A brief introduction about how to use this dataset can be found in GitHub. More detailed illustration for compiling the traffic dataset on AequilibraE can be referred to GitHub code or Colab code.
Contact
If you have any inquiries, please contact Xiaotong Xu (email: kid-a.xu@connect.polyu.hk).
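The Aug 31 2024 update note describes matching input links to TransCAD LinkFlows results through a shared Link_ID. A minimal sketch of that kind of join, using only the Python standard library, might look like the following. Only the Link_ID column name comes from the description. The other column names (length_km, flow) and the inline sample rows are hypothetical stand-ins for the real files.

```python
import csv
import io


def join_on_link_id(link_rows, flow_rows, key="Link_ID"):
    """Attach assignment results to input link rows that share the same key.

    Links without a matching flow row are skipped in this sketch.
    """
    flows_by_id = {row[key]: row for row in flow_rows}
    joined = []
    for link in link_rows:
        flow = flows_by_id.get(link[key])
        if flow is None:
            continue
        merged = dict(link)
        merged.update({k: v for k, v in flow.items() if k != key})
        joined.append(merged)
    return joined


# Hypothetical sample content; real files have different columns.
links_csv = "Link_ID,length_km\n1,0.8\n2,1.5\n3,0.4\n"
flows_csv = "Link_ID,flow\n1,1200\n3,450\n"

links = list(csv.DictReader(io.StringIO(links_csv)))
flows = list(csv.DictReader(io.StringIO(flows_csv)))
joined = join_on_link_id(links, flows)
print(joined)
```

In practice the published 'cityname_link_LinkFlows.csv' file already provides this combined view, so a manual join is mainly useful when working from the separate input and LinkFlows files.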
This traffic-count data is provided by the City of Pittsburgh's Department of Mobility & Infrastructure (DOMI). Counters were deployed as part of traffic studies, including intersection studies, and studies covering where or whether to install speed humps. In some cases, data may have been collected by the Southwestern Pennsylvania Commission (SPC) or BikePGH.
Data is currently available for only the most-recent count at each location.
Traffic count data is important to the process for deciding where to install speed humps. According to DOMI, speed humps may only be legally installed on streets where traffic counts fall below a minimum threshold. Residents can request an evaluation of their street as part of DOMI's Neighborhood Traffic Calming Program. The City has also shared data on the impact of the Neighborhood Traffic Calming Program in reducing speeds.
Different studies may collect different data. Speed hump studies capture counts and speeds. SPC and BikePGH conduct counts of cyclists. Intersection studies included in this dataset may not include traffic counts, but reports of individual studies may be requested from the City. Despite the lack of count data, intersection studies are included to facilitate data requests.
Data captured by different types of counting devices are included in this data. StatTrak counters are in use by the City, and capture data on counts and speeds. More information about these devices may be found on the company's website. Data includes traffic counts and average speeds, and may also include separate counts of bicycles.
Tubes are deployed by both SPC and BikePGH and used to count cyclists. SPC may also deploy video counters to collect data.
NOTE: The data in this dataset has not been updated since 2021 because of a broken data feed. We're working to fix it.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SDCC Traffic Congestion Saturation Flow Data for January to June 2022. Traffic volumes, traffic saturation, and congestion data for sites across South Dublin County. Used by traffic management to control stage timings on junctions. It is recommended that this dataset is read in conjunction with the ‘Traffic Data Site Names SDCC’ dataset.
A detailed description of each column heading can be referenced below;
scn: Site Serial number
region: A group of Nodes that are operated under SCOOT control at the same common cycle time. Normally these will be nodes between which co-ordination is desirable. Some of the nodes may be double cycling at half of the region cycle time.
system: SCOOT STC UTC (UTC-MX)
locn: Location
ssite: Site number
sday: Days of the week Monday to Sunday. Abbreviations; MO,TU,WE,TH,FR,SA,SU.
date: Reflects correct actual Date of when data was collected.
start_time: NOTE - Please ignore the date displayed in this column. The actual data collection date is correctly displayed in the 'date' column. The date displayed here is the date of when report was run and extracted from the system, but correctly reflects start time of 15 minute intervals.
end_time: End time of 15 minute intervals.
flow: A representation of demand (flow) for each link built up over several minutes by the SCOOT model. SCOOT has two profiles:
(1) Short – Raw data representing the actual values over the previous few minutes
(2) Long – A smoothed average of values over a longer period
SCOOT will choose to use the appropriate profile depending on a number of factors.
flow_pc: Same as above ref PC SCOOT
cong: Congestion is directly measured from the detector. If the detector is placed beyond the normal end of queue in the street it is rarely covered by stationary traffic, except of course when congestion occurs. If any detector shows standing traffic for the whole of an interval this is recorded. The number of intervals of congestion in any cycle is also recorded.
The percentage congestion is calculated from:
(No of congested intervals x 4 x 100) / cycle time in seconds
This percentage of congestion is available to view and more importantly for the optimisers to take into account.
cong_pc: Same as above ref PC SCOOT
dsat: The ratio of the demand flow to the maximum possible discharge flow, i.e. it is the ratio of the demand to the discharge rate (Saturation Occupancy) multiplied by the duration of the effective green time. The Split optimiser will try to minimise the maximum degree of saturation on links approaching the node.
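The start_time note says the date part of that column is the report extraction date and should be ignored in favour of the 'date' column. One way to rebuild a usable interval timestamp is sketched below in Python. The string formats used here are assumptions, since the description does not specify how either column is formatted, and the function name is hypothetical.

```python
from datetime import datetime


def interval_start(date_str: str,
                   start_time_str: str,
                   date_fmt: str = "%Y-%m-%d",
                   ts_fmt: str = "%Y-%m-%d %H:%M:%S") -> datetime:
    """Combine the collection date with the clock time from start_time.

    The date embedded in start_time is discarded, as the dataset note
    recommends. Both format strings are assumptions; adjust them to the
    actual file contents.
    """
    collection_day = datetime.strptime(date_str, date_fmt).date()
    clock_time = datetime.strptime(start_time_str, ts_fmt).time()
    return datetime.combine(collection_day, clock_time)


# Illustrative row: data collected on 3 January, report run on 11 July.
ts = interval_start("2022-01-03", "2022-07-11 08:15:00")
print(ts.isoformat())
```

The same approach applies to end_time if that column carries the extraction date as well, which the description does not confirm.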
Be ready for a cookieless internet while capturing anonymous website traffic data!
By installing the resolve pixel on their website, business owners can start to put a name to the activity seen in analytics sources (e.g. GA4). With capture/resolve, you can identify up to 40% or more of your website traffic. Reach customers BEFORE they are ready to reveal themselves to you and customize messaging toward the right product or service.
This product will include Anonymous IP Data and Web Traffic Data for B2B2C.
Get a 360 view of the web traffic consumer with their business data such as business email, title, company, revenue, and location.
Super easy to implement and extraordinarily fast at processing, business owners are thrilled with the enhanced identity resolution capabilities powered by VisitIQ's First Party Opt-In Identity Platform. Capture/resolve and identify your Ideal Customer Profiles to customize marketing. Identify WHO is looking, WHAT they are looking at, WHERE they are located and HOW the web traffic came to your site.
Create segments based on specific demographic or behavioral attributes and export the data as a .csv or through S3 integration.
Check our product that has the most accurate Web Traffic Data for the B2B2C market.
https://dataintelo.com/privacy-and-policy
The global real-time traffic data market size is anticipated to reach USD 15.3 billion by 2032 from an estimated USD 6.5 billion in 2023, exhibiting a robust CAGR of 10.1% over the forecast period. This substantial growth is driven by the increasing need for efficient traffic management systems and the rising adoption of smart city initiatives worldwide. Governments and commercial entities are investing heavily in advanced technologies to optimize traffic flow and enhance urban mobility, thus fostering market expansion.
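The growth rate quoted above can be checked with the standard compound annual growth rate formula. The sketch below assumes nine compounding periods between 2023 and 2032, which the report does not state explicitly. The function name is illustrative.

```python
def cagr(start_value: float, end_value: float, periods: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / periods) - 1."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1.0 / periods) - 1.0


# Figures from the text: USD 6.5 billion (2023) to USD 15.3 billion (2032).
# Assumption: 9 annual compounding periods.
rate = cagr(6.5, 15.3, 9)
print(f"{rate:.1%}")
```

Under that assumption the result is roughly 10% per year, in line with the quoted figure. A different period count would shift it slightly.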
The surge in urbanization and the consequent rise in vehicle ownership have led to severe traffic congestion issues in many metropolitan areas. This has necessitated the implementation of real-time traffic data systems that can provide accurate and timely information to manage traffic effectively. With the integration of sophisticated technologies such as IoT, AI, and big data analytics, these systems are becoming more efficient, thereby driving market growth. Furthermore, the growing emphasis on reducing carbon emissions and enhancing road safety is also propelling the adoption of real-time traffic data solutions.
Technological advancements are playing a pivotal role in shaping the real-time traffic data market. Innovations in sensor technology, the proliferation of GPS devices, and the widespread use of mobile data are providing rich sources of real-time traffic information. The ability to integrate data from multiple sources and deliver actionable insights is significantly enhancing traffic management capabilities. Additionally, the development of cloud-based solutions is enabling scalable and cost-effective deployment of traffic data systems, further contributing to market growth.
Another critical growth factor is the increasing investment in smart city projects. Governments across the globe are prioritizing the development of smart transportation infrastructure to improve urban mobility and reduce traffic-related issues. Real-time traffic data systems are integral to these initiatives, providing essential data for optimizing traffic flow, enabling route optimization, and enhancing public transport efficiency. The involvement of private sector players in these projects is also fueling market growth by introducing innovative solutions and fostering public-private partnerships.
The exponential rise in Mobile Data Traffic is another significant factor influencing the real-time traffic data market. As more people rely on smartphones and mobile applications for navigation and traffic updates, the demand for real-time data has surged. Mobile data provides a wealth of information about traffic patterns and congestion levels, enabling more accurate and timely traffic management. The integration of mobile data with other data sources, such as GPS and sensor data, enhances the overall effectiveness of traffic data systems. This trend is particularly evident in urban areas where mobile devices are ubiquitous, and the need for efficient traffic management is critical. The ability to harness mobile data for traffic insights is driving innovation and growth in the market, as companies develop new solutions to leverage this valuable resource.
Regionally, North America and Europe are leading the market due to their early adoption of advanced traffic management technologies and significant investments in smart city projects. However, the Asia Pacific region is expected to witness the highest growth rate over the forecast period, driven by rapid urbanization, increasing vehicle ownership, and growing government initiatives to develop smart transportation infrastructure. Emerging economies in Latin America and the Middle East & Africa are also showing promising growth potential, fueled by ongoing infrastructure development and increasing awareness of the benefits of real-time traffic data solutions.
The real-time traffic data market by component is segmented into software, hardware, and services. Each component plays a crucial role in the overall functionality and effectiveness of traffic data systems. The software segment includes traffic management software, route optimization software, and other analytical tools that help process and analyze traffic data. The hardware segment comprises sensors, GPS devices, and other data collection tools. The services segment includes installation, maintenance, and consulting services that support the deployment and operation of traffic data systems.
Unlock the Potential of Your Web Traffic with Advanced Data Resolution
In the digital age, understanding and leveraging web traffic data is crucial for businesses aiming to thrive online. Our pioneering solution transforms anonymous website visits into valuable B2B and B2C contact data, offering unprecedented insights into your digital audience. By integrating our unique tag into your website, you unlock the capability to convert 25-50% of your anonymous traffic into actionable contact rows, directly deposited into an S3 bucket for your convenience. This process, known as "Web Traffic Data Resolution," is at the forefront of digital marketing and sales strategies, providing a competitive edge in understanding and engaging with your online visitors.
Comprehensive Web Traffic Data Resolution
Our product stands out by offering a robust solution for "Web Traffic Data Resolution," a process that demystifies the identities behind your website traffic. By deploying a simple tag on your site, our technology goes to work, analyzing visitor behavior and leveraging proprietary data matching techniques to reveal the individuals and businesses behind the clicks. This innovative approach not only enhances your data collection but does so with respect for privacy and compliance standards, ensuring that your business gains insights ethically and responsibly.
Deep Dive into Web Traffic Data
At the core of our solution is the sophisticated analysis of "Web Traffic Data." Our system meticulously collects and processes every interaction on your site, from page views to time spent on each section. This data, once anonymous and perhaps seen as abstract numbers, is transformed into a detailed ledger of potential leads and customer insights. By understanding who visits your site, their interests, and their contact information, your business is equipped to tailor marketing efforts, personalize customer experiences, and streamline sales processes like never before.
Benefits of Our Web Traffic Data Resolution Service
Enhanced Lead Generation: By converting anonymous visitors into identifiable contact data, our service significantly expands your pool of potential leads. This direct enhancement of your lead generation efforts can dramatically increase conversion rates and ROI on marketing campaigns.
Targeted Marketing Campaigns: Armed with detailed B2B and B2C contact data, your marketing team can create highly targeted and personalized campaigns. This precision in marketing not only improves engagement rates but also ensures that your messaging resonates with the intended audience.
Improved Customer Insights: Gaining a deeper understanding of your web traffic enables your business to refine customer personas and tailor offerings to meet market demands. These insights are invaluable for product development, customer service improvement, and strategic planning.
Competitive Advantage: In a digital landscape where understanding your audience can make or break your business, our Web Traffic Data Resolution service provides a significant competitive edge. By accessing detailed contact data that others in your industry may overlook, you position your business as a leader in customer engagement and data-driven strategies.
Seamless Integration and Accessibility: Our solution is designed for ease of use, requiring only the placement of a tag on your website to start gathering data. The contact rows generated are easily accessible in an S3 bucket, ensuring that you can integrate this data with your existing CRM systems and marketing tools without hassle.
How It Works: A Closer Look at the Process
Our Web Traffic Data Resolution process is streamlined and user-friendly, designed to integrate seamlessly with your existing website infrastructure:
Tag Deployment: Implement our unique tag on your website with simple instructions. This tag is lightweight and does not impact your site's loading speed or user experience.
Data Collection and Analysis: As visitors navigate your site, our system collects web traffic data in real-time, analyzing behavior patterns, engagement metrics, and more.
Resolution and Transformation: Using advanced data matching algorithms, we resolve the collected web traffic data into identifiable B2B and B2C contact information.
Data Delivery: The resolved contact data is then securely transferred to an S3 bucket, where it is organized and ready for your access. This process occurs daily, ensuring you have the most up-to-date information at your fingertips.
Integration and Action: With the resolved data now in your possession, your business can take immediate action. From refining marketing strategies to enhancing customer experiences, the possibilities are endless.
Security and Privacy: Our Commitment
Understanding the sensitivity of web traffic data and contact information, our solution is built with security and privacy at its core. We adhere to strict data protection regulat...
This dataset contains the current estimated speed for about 1250 segments covering 300 miles of arterial roads. For a more detailed description, please go to https://tas.chicago.gov, click the About button at the bottom of the page, and then the MAP LAYERS tab.
The Chicago Traffic Tracker estimates traffic congestion on Chicago’s arterial streets (nonfreeway streets) in real-time by continuously monitoring and analyzing GPS traces received from Chicago Transit Authority (CTA) buses. Two types of congestion estimates are produced every ten minutes: 1) by Traffic Segments and 2) by Traffic Regions or Zones. Congestion estimate by traffic segments gives the observed speed typically for one-half mile of a street in one direction of traffic.
Traffic Segment level congestion is available for about 300 miles of principal arterials. Congestion by Traffic Region gives the average traffic condition for all arterial street segments within a region. A traffic region comprises two or three community areas with comparable traffic patterns. 29 regions are created to cover the entire city (except the O’Hare airport area). There is much volatility in traffic segment speed. However, the congestion estimates for the traffic regions remain consistent for a relatively longer period. Most volatility in arterial speed comes from the very nature of the arterials themselves. Due to a myriad of factors, including but not limited to frequent intersections, traffic signals, transit movements, availability of alternative routes, crashes, and the short length of the segments, speed on individual arterial segments can fluctuate from heavily congested to no congestion and back in a few minutes. The segment speed and traffic region congestion estimates together may give a better understanding of the actual traffic conditions.
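To illustrate how a region-level figure smooths out volatile segment speeds, the Python sketch below averages segment speeds per region. It is only an illustration: the Traffic Tracker's actual aggregation method is not described here, an unweighted mean is an assumption, and the region labels and speed values are made up.

```python
from collections import defaultdict
from typing import Optional


def region_average_speeds(
    segments: list[tuple[str, Optional[float]]]
) -> dict[str, float]:
    """Average the available segment speeds within each region.

    Each item is (region, speed). Segments with no estimate (None) are
    skipped. An unweighted mean is an assumption of this sketch.
    """
    totals: dict[str, list[float]] = defaultdict(lambda: [0.0, 0.0])
    for region, speed in segments:
        if speed is None:
            continue
        totals[region][0] += speed
        totals[region][1] += 1
    return {region: total / n for region, (total, n) in totals.items() if n}


# Hypothetical segment readings in miles per hour.
averages = region_average_speeds(
    [("A", 20.0), ("A", 30.0), ("B", 12.0), ("B", None)]
)
print(averages)
```

A length-weighted mean would be a reasonable alternative when segments differ a lot in length.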
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Popular Website Traffic Over Time’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/popular-website-traffice on 13 February 2022.
--- Dataset description provided by original source is as follows ---
Background
Have you ever been in a conversation where the question comes up, who uses Bing? This question comes up occasionally because people wonder if these sites have any views. For this research study, we are going to be exploring website traffic for many popular websites.
Methodology
The data collected originates from SimilarWeb.com.
Source
For the analysis and study, go to The Concept Center
This dataset was created by Chase Willden and contains around 0 samples along with 1/1/2017, Social Media, technical information and other features such as: - 12/1/2016 - 3/1/2017 - and more.
- Analyze 11/1/2016 in relation to 2/1/2017
- Study the influence of 4/1/2017 on 1/1/2017
- More datasets
If you use this dataset in your research, please credit Chase Willden
--- Original source retains full ownership of the source dataset ---
Open Government Licence - Canada 2.0https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Linear network representing the estimated traffic flows for roads and highways managed by the Ministry of Transport and Sustainable Mobility (MTMD). These flows are obtained using a statistical estimation method applied to data from more than 4,500 collection sites spread over the main roads of Quebec. It includes DJMA (annual average daily flow), DJME (summer average daily flow: June, July, August, September) and DJMH (winter average daily flow: December, January, February, March), as well as other traffic data. It is important to note that these values are calculated for all directions of traffic combined. Interactive map: some files can be accessed by querying a traffic section on the map with a click (the file links appear in the descriptive table that is displayed when clicking): • Historical aggregated data (PDF) • Annual reports for permanent sites (PDF and Excel) • Hourly data (hourly average per weekday per month) (Excel) **This third party metadata element was translated using an automated translation tool (Amazon Translate).**
Abstract: The task for this dataset is to forecast the spatio-temporal traffic volume based on the historical traffic volume and other features in neighboring locations.
Data Set Characteristics | Number of Instances | Area | Attribute Characteristics | Number of Attributes | Date Donated | Associated Tasks | Missing Values |
---|---|---|---|---|---|---|---|
Multivariate | 2101 | Computer | Real | 47 | 2020-11-17 | Regression | N/A |
Source: Liang Zhao, liang.zhao '@' emory.edu, Emory University.
Data Set Information: The task for this dataset is to forecast the spatio-temporal traffic volume based on the historical traffic volume and other features in neighboring locations. Specifically, the traffic volume is measured every 15 minutes at 36 sensor locations along two major highways in Northern Virginia/Washington D.C. capital region. The 47 features include: 1) the historical sequence of traffic volume sensed during the 10 most recent sample points (10 features), 2) week day (7 features), 3) hour of day (24 features), 4) road direction (4 features), 5) number of lanes (1 feature), and 6) name of the road (1 feature). The goal is to predict the traffic volume 15 minutes into the future for all sensor locations. With a given road network, we know the spatial connectivity between sensor locations. For the detailed data information, please refer to the file README.docx.
Attribute Information: The 47 features include: (1) the historical sequence of traffic volume sensed during the 10 most recent sample points (10 features), (2) week day (7 features), (3) hour of day (24 features), (4) road direction (4 features), (5) number of lanes (1 feature), and (6) name of the road (1 feature).
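The 47-feature layout adds up as 10 + 7 + 24 + 4 + 1 + 1. The Python sketch below assembles one such vector to make the breakdown concrete. The one-hot encodings, the feature order, the direction labels, and the numeric road identifier are all assumptions. The README.docx shipped with the dataset is the authority on the real encoding.

```python
# Assumed direction labels; the dataset may use different ones.
DIRECTIONS = ("N", "S", "E", "W")


def one_hot(index: int, size: int) -> list[float]:
    """Return a list of `size` zeros with a single 1.0 at `index`."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} out of range for size {size}")
    return [1.0 if i == index else 0.0 for i in range(size)]


def build_features(history: list[float], weekday: int, hour: int,
                   direction: str, lanes: int, road_id: int) -> list[float]:
    """Assemble a 47-element feature vector in an assumed order."""
    if len(history) != 10:
        raise ValueError("history must hold the 10 most recent volumes")
    return (
        list(history)                                # 10 features
        + one_hot(weekday, 7)                        # 7 features
        + one_hot(hour, 24)                          # 24 features
        + one_hot(DIRECTIONS.index(direction), 4)    # 4 features
        + [float(lanes)]                             # 1 feature
        + [float(road_id)]                           # 1 feature (assumed numeric id)
    )


# Illustrative values only.
vec = build_features([0.1] * 10, weekday=2, hour=17,
                     direction="E", lanes=3, road_id=1)
print(len(vec))
```

The length check on the result is a quick way to confirm that a custom encoding still matches the 47 attributes listed in the table.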
Relevant Papers: Liang Zhao, Olga Gkountouna, and Dieter Pfoser. 2019. Spatial Auto-regressive Dependency Interpretable Learning Based on Spatial Topological Constraints. ACM Trans. Spatial Algorithms Syst. 5, 3, Article 19 (August 2019), 28 pages. DOI:[Web Link]
Citation Request: To use these datasets, please cite the papers:
Liang Zhao, Olga Gkountouna, and Dieter Pfoser. 2019. Spatial Auto-regressive Dependency Interpretable Learning Based on Spatial Topological Constraints. ACM Trans. Spatial Algorithms Syst. 5, 3, Article 19 (August 2019), 28 pages. DOI:[Web Link]
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Traffic information is crucial for managing transportation and city planning, but obtaining national-scale data is difficult due to privacy concerns. Consequently, most current traffic datasets have limitations in terms of time and location coverage, leading to a lack of comprehensive public access to national traffic data. To address this issue, a multi-source highway traffic dataset has been created, featuring 2042 sensors in New Zealand over a 9-year period with 15-minute intervals and accompanying metadata. The dataset includes data of both light-duty and heavy-duty vehicles, as well as weather information like temperature and precipitation. This dataset has diverse potential research applications such as traffic flow prediction and congestion management.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
DESCRIPTION OF THE RESEARCH AND DATA: This work presents the Madrid Traffic Dataset (MTD), a comprehensive resource for the analysis and modeling of traffic patterns in Madrid. The dataset integrates data from traffic sensors, weather observations, calendar information, road infrastructure, and geolocation data to support advanced studies of urban mobility and predictive modeling.
In addition to the core data sources, the dataset includes temporal sequences and a traffic adjacency matrix, enabling the application of time-series analysis and graph-based modeling approaches.
-COMPLETE DATASET: The complete version of the MTD includes data from 554 traffic sensors distributed across the Madrid region, covering a total of 30 months (from June 2022 to November 2024).
-SUBSET DATASET: A more compact version derived from the complete dataset, focused on a subset of 300 traffic sensors with 17 months of data (from June 2022 to October 2023). This subset is designed for researchers requiring a lighter dataset.
DATA ORGANIZATION: The dataset is organized in a main directory containing a subfolder identified by the configuration data hash. This subfolder includes all key components: datasets, temporal sequences, adjacency matrices, and configuration files. The structure ensures that all resources are clearly arranged to facilitate easy access and reproducibility for researchers.
For more details, see [Submitted to IEEE Internet of Things Journal].
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Please refer to the original data article for further data description: Jan Luxemburk et al. CESNET-QUIC22: A large one-month QUIC network traffic dataset from backbone lines, Data in Brief, 2023, 108888, ISSN 2352-3409, https://doi.org/10.1016/j.dib.2023.108888. We recommend using the CESNET DataZoo Python library, which facilitates the work with large network traffic datasets. More information about the DataZoo project can be found in the GitHub repository https://github.com/CESNET/cesnet-datazoo.
The QUIC (Quick UDP Internet Connection) protocol has the potential to replace TLS over TCP, which is the standard choice for reliable and secure Internet communication. Due to its design that makes the inspection of QUIC handshakes challenging and its usage in HTTP/3, there is an increasing demand for research in QUIC traffic analysis. This dataset contains one month of QUIC traffic collected in an ISP backbone network, which connects 500 large institutions and serves around half a million people. The data are delivered as enriched flows that can be useful for various network monitoring tasks. The provided server names and packet-level information allow research in the encrypted traffic classification area. Moreover, included QUIC versions and user agents (smartphone, web browser, and operating system identifiers) provide information for large-scale QUIC deployment studies.
Data capture
The data was captured in the flow monitoring infrastructure of the CESNET2 network. The capturing was done for four weeks between 31.10.2022 and 27.11.2022. The following list provides per-week flow count, capture period, and uncompressed size:
W-2022-44: Uncompressed Size: 19 GB; Capture Period: 31.10.2022 - 6.11.2022; Number of flows: 32.6M
W-2022-45: Uncompressed Size: 25 GB; Capture Period: 7.11.2022 - 13.11.2022; Number of flows: 42.6M
W-2022-46: Uncompressed Size: 20 GB; Capture Period: 14.11.2022 - 20.11.2022; Number of flows: 33.7M
W-2022-47: Uncompressed Size: 25 GB; Capture Period: 21.11.2022 - 27.11.2022; Number of flows: 44.1M
CESNET-QUIC22: Uncompressed Size: 89 GB; Capture Period: 31.10.2022 - 27.11.2022; Number of flows: 153M
Data description
The dataset consists of network flows describing encrypted QUIC communications. Flows were created using ipfixprobe flow exporter and are extended with packet metadata sequences, packet histograms, and with fields extracted from the QUIC Initial Packet, which is the first packet of the QUIC connection handshake. The extracted handshake fields are the Server Name Indication (SNI) domain, the used version of the QUIC protocol, and the user agent string that is available in a subset of QUIC communications.
Packet Sequences
Flows in the dataset are extended with sequences of packet sizes, directions, and inter-packet times. For the packet sizes, we consider payload size after transport headers (UDP headers for the QUIC case). Packet directions are encoded as ±1, +1 meaning a packet sent from client to server, and -1 a packet from server to client. Inter-packet times depend on the location of communicating hosts, their distance, and on the network conditions on the path. However, it is still possible to extract relevant information that correlates with user interactions and, for example, with the time required for an API/server/database to process the received data and generate the response to be sent in the next packet. Packet metadata sequences have a length of 30, which is the default setting of the used flow exporter. We also derive three fields from each packet sequence: its length, time duration, and the number of roundtrips. The roundtrips are counted as the number of changes in the communication direction (from packet directions data); in other words, each client request and server response pair counts as one roundtrip.
Flow statistics
Flows also include standard flow statistics, which represent aggregated information about the entire bidirectional flow. The fields are: the number of transmitted bytes and packets in both directions, the duration of flow, and packet histograms.
Packet histograms include binned counts of packet sizes and inter-packet times of the entire flow in both directions (more information in the PHISTS plugin documentation). There are eight bins with a logarithmic scale; the intervals are 0-15, 16-31, 32-63, 64-127, 128-255, 256-511, 512-1024, >1024 [ms or B]. The units are milliseconds for inter-packet times and bytes for packet sizes. Moreover, each flow has its end reason: either it was idle, reached the active timeout, or ended due to other reasons. This corresponds with the official IANA IPFIX-specified values. The FLOW_ENDREASON_OTHER field represents the forced end and lack of resources reasons. The end of flow detected reason is not considered because it is not relevant for UDP connections.
Dataset structure
The dataset flows are delivered in compressed CSV files. CSV files contain one flow per row; data columns are summarized in the provided list below. For each flow data file, there is a JSON file with the number of saved and seen (before sampling) flows per service and total counts of all received (observed on the CESNET2 network), service (belonging to one of the dataset's services), and saved (provided in the dataset) flows. There is also the stats-week.json file aggregating flow counts of a whole week and the stats-dataset.json file aggregating flow counts for the entire dataset. Flow counts before sampling can be used to compute sampling ratios of individual services and to resample the dataset back to the original service distribution. Moreover, various dataset statistics, such as feature distributions and value counts of QUIC versions and user agents, are provided in the dataset-statistics folder. The mapping between services and service providers is provided in the servicemap.csv file, which also includes SNI domains used for ground truth labeling. The following list describes flow data fields in CSV files:
ID: Unique identifier
SRC_IP: Source IP address
DST_IP: Destination IP address
DST_ASN: Destination Autonomous System number
SRC_PORT: Source port
DST_PORT: Destination port
PROTOCOL: Transport protocol
QUIC_VERSION: QUIC protocol version
QUIC_SNI: Server Name Indication domain
QUIC_USER_AGENT: User agent string, if available in the QUIC Initial Packet
TIME_FIRST: Timestamp of the first packet in format YYYY-MM-DDTHH-MM-SS.ffffff
TIME_LAST: Timestamp of the last packet in format YYYY-MM-DDTHH-MM-SS.ffffff
DURATION: Duration of the flow in seconds
BYTES: Number of transmitted bytes from client to server
BYTES_REV: Number of transmitted bytes from server to client
PACKETS: Number of packets transmitted from client to server
PACKETS_REV: Number of packets transmitted from server to client
PPI: Packet metadata sequence in the format: [[inter-packet times], [packet directions], [packet sizes]]
PPI_LEN: Number of packets in the PPI sequence
PPI_DURATION: Duration of the PPI sequence in seconds
PPI_ROUNDTRIPS: Number of roundtrips in the PPI sequence
PHIST_SRC_SIZES: Histogram of packet sizes from client to server
PHIST_DST_SIZES: Histogram of packet sizes from server to client
PHIST_SRC_IPT: Histogram of inter-packet times from client to server
PHIST_DST_IPT: Histogram of inter-packet times from server to client
APP: Web service label
CATEGORY: Service category
FLOW_ENDREASON_IDLE: Flow was terminated because it was idle
FLOW_ENDREASON_ACTIVE: Flow was terminated because it reached the active timeout
FLOW_ENDREASON_OTHER: Flow was terminated for other reasons
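The eight logarithmic histogram bins described above (0-15, 16-31, ..., 512-1024, >1024) can be illustrated with a short sketch. This is not the code of the flow exporter; the function names are hypothetical, and the handling of the bin edges simply follows the intervals as written in the description.

```python
# Inclusive upper bounds of the first seven bins, taken from the description.
# Values above the last bound fall into the eighth bin (>1024).
BIN_UPPER_BOUNDS = [15, 31, 63, 127, 255, 511, 1024]


def bin_index(value):
    """Return the 0-based bin index for a packet size (B) or inter-packet time (ms)."""
    for i, upper in enumerate(BIN_UPPER_BOUNDS):
        if value <= upper:
            return i
    return len(BIN_UPPER_BOUNDS)


def histogram(values):
    """Count how many values fall into each of the eight bins."""
    counts = [0] * (len(BIN_UPPER_BOUNDS) + 1)
    for v in values:
        counts[bin_index(v)] += 1
    return counts
```

Applied separately to the packet sizes and inter-packet times of each direction, a function like `histogram` would yield four eight-element lists comparable in shape to the PHIST_* fields.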
Link to other CESNET datasets
https://www.liberouter.org/technology-v2/tools-services-datasets/datasets/
https://github.com/CESNET/cesnet-datazoo
Please cite the original data article:
@article{CESNETQUIC22,
  author = {Jan Luxemburk and Karel Hynek and Tomáš Čejka and Andrej Lukačovič and Pavel Šiška},
  title = {CESNET-QUIC22: a large one-month QUIC network traffic dataset from backbone lines},
  journal = {Data in Brief},
  pages = {108888},
  year = {2023},
  issn = {2352-3409},
  doi = {https://doi.org/10.1016/j.dib.2023.108888},
  url = {https://www.sciencedirect.com/science/article/pii/S2352340923000069}
}
In July 2019, the Metropolitan Police Department (MPD) implemented new data collection methods that enabled officers to collect more comprehensive information about each police stop in an aggregated manner. More specifically, these changes have allowed for more detailed data collection on stops, protective pat downs (PPDs), searches, and arrests. (For a complete list of terms, see the glossary on page 2.) These changes support data collection requirements in the Neighborhood Engagement Achieves Results Amendment Act of 2016 (NEAR Act).
The accompanying data cover all MPD stops including vehicle, pedestrian, bicycle, and harbor stops for the period from July 22, 2019 to December 31, 2022. A stop may involve a ticket (actual or warning), investigatory stop, protective pat down, search, or arrest.
If the final outcome of a stop results in an actual or warning ticket, the ticket serves as the official documentation for the stop. The information provided in the ticket includes the subject’s name, race, gender, reason for the stop, and duration. All stops resulting in additional law enforcement actions (e.g., pat down, search, or arrest) are documented in MPD’s Record Management System (RMS). This dataset includes records pulled from both the ticket (District of Columbia Department of Motor Vehicles [DMV]) and RMS sources. Data variables not applicable to a particular stop are indicated as “NULL.” For example, if the stop type (“stop_type” field) is a “ticket stop,” then the fields “stop_reason_nonticket” and “stop_reason_harbor” will be “NULL.” Each row in the data represents an individual stop of a single person, and that row reveals any and all recorded outcomes of that stop (including information about any actual or warning tickets issued, searches conducted, arrests made, etc.). A single traffic stop may generate multiple tickets, including actual, warning, and/or voided tickets. Additionally, an individual who is stopped and receives a traffic ticket may also be stopped for investigatory purposes, patted down, searched, and/or arrested. If any of these situations occur, the “stop_type” field would be labeled “Ticket and Non-Ticket Stop.” If an individual is searched, MPD differentiates between person and property searches. The “stop_location_block” field represents the block-level location of the stop and/or a street name. The age of the person being stopped is calculated based on the time between the person’s date of birth and the date of the stop.
There are certain locations that have a high prevalence of non-ticket stops. These can be attributed to some centralized processing locations. Additionally, there is a time lag for data on some ticket stops as roughly 20 percent of tickets are handwritten. In these instances, the handwritten traffic tickets are delivered by MPD to the DMV, and then entered into data systems by DMV contractors.
On August 1, 2021, MPD transitioned to a new version of its current records management system, Mark43 RMS. Due to this transition, the data collection and structures for the period between August 1, 2021 and December 31, 2021 were changed. The list below provides explanatory notes to consider when using this dataset.
New fields for data collection resulted in an increase of outliers in stop duration (affecting 0.98% of stops). In order to mitigate the disruption of outliers on any analysis, these values have been set to null as consistent with past practices.
Due to changes to the data structure that occurred after August 1, 2021, six attributes pertaining to reasons for searches of property and person are only available for the first seven months of 2021. These attributes are: Individual’s Actions, Information Obtained from Law Enforcement Sources, Information Obtained from Witnesses or Informants, Characteristics of an Armed Individual, Nature of the Alleged Crime, Prior Knowledge. These data structure changes have been updated to include these attributes going forward (as of April 23, 2022).
Out of the four attributes for types of property search, warrant property search is only available for the first seven months of 2021. Data structure changes were made to include this type of property search in future datasets.
The following chart shows how certain property search fields were aligned prior to and after August 1, 2021. A glossary is also provided following the chart. As of August 2, 2022, these fields have reverted to the original alignment. https://mpdc.dc.gov/sites/default/files/dc/sites/mpdc/publication/attachments/Explanatory%20Notes%202021%20Data.pdf
In October 2022 several fields were added to the dataset to provide additional clarity differentiating NOIs issued to bicycles (including Personal Mobility Devices, aka stand-on scooters), pedestrians, and vehicles as well as stops related specifically to MPD’s Harbor Patrol Unit and stops of an investigative nature where a police report was written. Please refer to the Data Dictionary for field definitions.
In March 2023 an indicator was added to the data which reflects stops related to traffic enforcement and/or traffic violations. This indicator will be 1 if a stop originated as a traffic stop (including both stops where only a ticket was issued as well as stops that ultimately resulted in police action such as a search or arrest), involved an arrest for a traffic violation, and/or if the reason for the stop was Response to Crash, Observed Moving Violation, Observed Equipment Violation, or Traffic Violation.
Between November 2021 and February 2022 several fields pertaining to items seized during searches of a person were not available for officers to use, leading to the data showing that no objects were seized pursuant to person searches during this time period.
Finally, MPD is conducting on-going data audits on all data for thorough and complete information.
For more information regarding police stops, please see: https://mpdc.dc.gov/stopdata
Figures are subject to change due to delayed reporting, on-going data quality audits, and data improvement processes.
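The rule for the traffic indicator added in March 2023 can be paraphrased as a small function. This is only a sketch of the rule as worded above, not MPD's implementation; the function name and its three parameters are hypothetical and do not correspond to documented field names in the dataset.

```python
# Stop reasons that, per the description, mark a stop as traffic-related.
TRAFFIC_REASONS = {
    "Response to Crash",
    "Observed Moving Violation",
    "Observed Equipment Violation",
    "Traffic Violation",
}


def traffic_indicator(originated_as_traffic_stop, arrest_for_traffic_violation, stop_reasons):
    """Return 1 if any of the three described conditions holds, else 0.

    originated_as_traffic_stop   : bool (hypothetical input)
    arrest_for_traffic_violation : bool (hypothetical input)
    stop_reasons                 : iterable of reason strings for the stop
    """
    if originated_as_traffic_stop or arrest_for_traffic_violation:
        return 1
    return int(any(reason in TRAFFIC_REASONS for reason in stop_reasons))
```

Because the conditions are joined by "and/or", a single matching condition is enough for the indicator to be 1.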
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset is a set of network traffic traces in pcap/csv format captured from a single user. The traffic is classified in 5 different activities (Video, Bulk, Idle, Web, and Interactive) and the label is shown in the filename. There is also a file (mapping.csv) with the mapping of the host's IP address, the csv/pcap filename and the activity label.
Activities:
Interactive: applications that perform real-time interactions in order to provide a suitable user experience, such as editing a file in Google Docs and remote CLI sessions over SSH.
Bulk data transfer: applications that transfer large files over the network. Some examples are SCP/FTP applications and direct downloads of large files from web servers like Mediafire, Dropbox or the university repository, among others.
Web browsing: contains all the traffic generated while searching and consuming different web pages. Examples of those pages are several blogs and news sites and the university's Moodle.
Video playback: contains traffic from applications that consume video in streaming or pseudo-streaming. The best-known services used are Twitch and YouTube, but the university online classroom has also been used.
Idle behaviour: is composed of the background traffic generated by the user's computer when the user is idle. This traffic has been captured with every application closed and with some pages open, such as Google Docs, YouTube and several web pages, but always without user interaction.
The capture is performed in a network probe, attached to the router that forwards the user network traffic, using a SPAN port. The traffic is stored in pcap format with all the packet payload. In the csv file, every non TCP/UDP packet is filtered out, as well as every packet with no payload. The fields in the csv files are the following (one line per packet): Timestamp, protocol, payload size, IP address source and destination, UDP/TCP port source and destination. The fields are also included as a header in every csv file.
The amount of data is stated as follows:
Bulk: 19 traces, 3599 s of total duration, 8704 MBytes of pcap files
Video: 23 traces, 4496 s, 1405 MBytes
Web: 23 traces, 4203 s, 148 MBytes
Interactive: 42 traces, 8934 s, 30.5 MBytes
Idle: 52 traces, 6341 s, 0.69 MBytes
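To get a feel for how different the five activities are, the figures above can be turned into a mean data rate per activity (pcap MBytes divided by total duration). The sketch below only restates the numbers from the list; the names `TRACES` and `mean_rate_mbytes_per_s` are made up for the example.

```python
# activity -> (number of traces, total duration in s, pcap size in MBytes),
# copied from the list above.
TRACES = {
    "Bulk": (19, 3599, 8704.0),
    "Video": (23, 4496, 1405.0),
    "Web": (23, 4203, 148.0),
    "Interactive": (42, 8934, 30.5),
    "Idle": (52, 6341, 0.69),
}


def mean_rate_mbytes_per_s(activity):
    """Mean pcap data rate of an activity in MBytes per second."""
    _, duration_s, size_mbytes = TRACES[activity]
    return size_mbytes / duration_s


if __name__ == "__main__":
    for name in TRACES:
        print(f"{name}: {mean_rate_mbytes_per_s(name):.5f} MBytes/s")
```

Note that the pcap sizes include full packet payloads, so these rates describe the stored captures rather than application-level throughput.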
The code of our machine learning approach is also included. There is a README.txt file with the documentation of how to use the code.
http://reference.data.gov.uk/id/open-government-licence
15 smart sensors were installed on Mill Road and surrounding streets to record numbers of pedestrians, bicycles, cars and other vehicles. The data being collated and analysed by the Smart Cambridge programme will help the Greater Cambridge Partnership understand how people use the road network.
Data will be released monthly for these locations until the end of 2020. Please note that due to the level of insight that can be gained from these sensors, additional sensors in more locations have been installed in Cambridge since the summer of 2019. Some sensors will remain beyond 2020 in strategic locations and the network is expected to grow. Data for those more permanent sites, outside of the Mill Road project will be published here: https://data.cambridgeshireinsight.org.uk/dataset/cambridge-city-smart-s...
Mill Road Bridge was closed for eight weeks from 1 July 2019 for crucial work being carried out to improve rail services. Pedestrians and cyclists will still be able to cross the railway for most of the working time.
A high concentration of sensors was installed for approximately 18 months to gather data before the closure, during the time when there is no vehicle traffic coming over Mill Road Bridge and then after the bridge is re-opened. This has allowed engineers to see the impact of the closure on surrounding roads, including on air quality. Keeping the sensors in place for this long has also allowed teams to make greater comparisons, by taking into account daily, weekly, monthly and annual variations in traffic levels.
The data release below offers counts for each sensor over 1 hour periods. The current data covers the period 03/06/2019 to 13/12/2020.
Hourly counts are broken down by inbound and outbound journeys.
Counts are also broken down by vehicle type. This includes:
Pedestrians
Cyclists
Buses
LGV
OGV 1
OGV 2
The release also includes a full list of sensor sites with geographic point location data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
NOTE: The Historic Traffic Data Dashboard & Feature Hosted Service have been retired.
Network operations traffic data from Main Roads Western Australia for 2015 to 2019. The data provided includes data collected on the Perth Metropolitan State Road Network (PMSRN) at 15 minute intervals. The Historic Traffic Data is provided in CSV format per year. Each table has over 34 million rows and can be linked to the M-Links Road Network using the M-Links ID. A data dictionary for M-Links Road Network and the Historic Traffic Data is at the following link: https://bit.ly/2S86uSn
Network Operations traffic data can also be accessed via the Daily Traffic Data API at the following link: https://bit.ly/34ZsyAK
The network operations traffic data provided here is of variable quality and has not been checked, quality assured or manually corrected. An automated process is used to patch over missing or suspect data with the most representative data available within the database. Patches may be reapplied as new data becomes available and patched data may change over time. Note that you are accessing this data pursuant to a Creative Commons (Attribution) Licence which has a disclaimer of warranties and limitation of liability. You accept that the data provided pursuant to the Licence is subject to changes. Pursuant to section 3 of the Licence you are provided with the following notice to be included when you Share the Licenced Material: “The Commissioner of Main Roads is the creator and owner of the data and Licenced Material, which is accessed pursuant to a Creative Commons (Attribution) Licence, which has a disclaimer of warranties and limitation of liability.”
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SDCC Traffic Congestion Saturation Flow Data for January to June 2023. Traffic volumes, traffic saturation, and congestion data for sites across South Dublin County. Used by traffic management to control stage timings on junctions. It is recommended that this dataset is read in conjunction with the ‘Traffic Data Site Names SDCC’ dataset.
A detailed description of each column heading can be referenced below:
scn: Site serial number
region: A group of nodes that are operated under SCOOT control at the same common cycle time. Normally these will be nodes between which co-ordination is desirable. Some of the nodes may be double cycling at half of the region cycle time.
system: SCOOT STC UTC (UTC-MX)
locn: Location
ssite: Site number
sday: Days of the week Monday to Sunday. Abbreviations: MO, TU, WE, TH, FR, SA, SU.
date: Reflects the correct actual date on which the data was collected.
start_time: NOTE - Please ignore the date displayed in this column. The actual data collection date is correctly displayed in the 'date' column. The date displayed here is the date on which the report was run and extracted from the system, but the column correctly reflects the start time of the 15 minute intervals.
end_time: End time of 15 minute intervals.
flow: A representation of demand (flow) for each link built up over several minutes by the SCOOT model. SCOOT has two profiles:
(1) Short – Raw data representing the actual values over the previous few minutes
(2) Long – A smoothed average of values over a longer period
SCOOT will choose to use the appropriate profile depending on a number of factors.
flow_pc: Same as above ref PC SCOOT
cong: Congestion is directly measured from the detector. If the detector is placed beyond the normal end of queue in the street it is rarely covered by stationary traffic, except of course when congestion occurs. If any detector shows standing traffic for the whole of an interval this is recorded. The number of intervals of congestion in any cycle is also recorded. The percentage congestion is calculated from:
(No. of congested intervals x 4 x 100) / cycle time in seconds
This percentage of congestion is available to view and, more importantly, for the optimisers to take into account.
cong_pc: Same as above ref PC SCOOT
dsat: The ratio of the demand flow to the maximum possible discharge flow, i.e. it is the ratio of the demand to the discharge rate (Saturation Occupancy) multiplied by the duration of the effective green time. The Split optimiser will try to minimise the maximum degree of saturation on links approaching the node.
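As a worked example of the percentage congestion calculation described for the cong column, the sketch below evaluates the quotient for one cycle. Reading the factor 4 as the length of one congestion interval in seconds is an assumption, as are the function and constant names; the description itself only gives the numbers.

```python
# Assumed meaning of the factor 4 in the formula: seconds per congestion interval.
CONGESTION_INTERVAL_SECONDS = 4


def percent_congestion(congested_intervals, cycle_time_s):
    """Percentage congestion = intervals x 4 x 100 / cycle time in seconds."""
    if cycle_time_s <= 0:
        raise ValueError("cycle time must be positive")
    return congested_intervals * CONGESTION_INTERVAL_SECONDS * 100 / cycle_time_s


if __name__ == "__main__":
    # 6 congested intervals in a 120 s cycle
    print(percent_congestion(6, 120))
```

Under that reading, 6 congested intervals cover 24 s of a 120 s cycle, i.e. one fifth of the cycle.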