37 datasets found

Network traffic datasets created by Single Flow Time Series Analysis

zenodo.org
explore.openaire.eu

csv, pdf

Updated Jul 11, 2024

Cite

Josef Koumar; Karel Hynek; Tomáš Čejka (2024). Network traffic datasets created by Single Flow Time Series Analysis [Dataset]. http://doi.org/10.5281/zenodo.8035724

Available download formats: csv, pdf

Unique identifier

https://doi.org/10.5281/zenodo.8035724

Dataset updated

Jul 11, 2024

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Josef Koumar; Karel Hynek; Tomáš Čejka

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Network traffic datasets created by Single Flow Time Series Analysis

The datasets were created for the paper "Network Traffic Classification Based on Single Flow Time Series Analysis" by Josef Koumar, Karel Hynek, and Tomáš Čejka, published at the 19th International Conference on Network and Service Management (CNSM) 2023. Please cite the datasets as:

J. Koumar, K. Hynek and T. Čejka, "Network Traffic Classification Based on Single Flow Time Series Analysis," 2023 19th International Conference on Network and Service Management (CNSM), Niagara Falls, ON, Canada, 2023, pp. 1-7, doi: 10.23919/CNSM59352.2023.10327876.

This Zenodo repository contains 23 datasets created from 15 well-known published datasets, which are cited in the table below. Each dataset contains 69 features created by time series analysis of single-flow time series. The features are described in detail in the file feature_description.pdf.

The following table describes each dataset file:

File name	Detection problem	Citation of original raw dataset
botnet_binary.csv	Binary detection of botnet	S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
botnet_multiclass.csv	Multi-class classification of botnet	S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
cryptomining_design.csv	Binary detection of cryptomining; the design part	Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
cryptomining_evaluation.csv	Binary detection of cryptomining; the evaluation part	Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
dns_malware.csv	Binary detection of malware DNS	Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021.
doh_cic.csv	Binary detection of DoH	Mohammadreza MontazeriShatoori et al. Detection of doh tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020
doh_real_world.csv	Binary detection of DoH	Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022
dos.csv	Binary detection of DoS	Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019.
edge_iiot_binary.csv	Binary detection of IoT malware	Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
edge_iiot_multiclass.csv	Multi-class classification of IoT malware	Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
https_brute_force.csv	Binary detection of HTTPS Brute Force	Jan Luxemburk et al. HTTPS Brute-force dataset with extended network flows, November 2020
ids_cic_binary.csv	Binary detection of intrusion in IDS	Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp, 1:108–116, 2018.
ids_cic_multiclass.csv	Multi-class classification of intrusion in IDS	Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp, 1:108–116, 2018.
ids_unsw_nb_15_binary.csv	Binary detection of intrusion in IDS	Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
ids_unsw_nb_15_multiclass.csv	Multi-class classification of intrusion in IDS	Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
iot_23.csv	Binary detection of IoT malware	Sebastian Garcia et al. IoT-23: A labeled dataset with malicious and benign IoT network traffic, January 2020. More details here: https://www.stratosphereips.org/datasets-iot23
ton_iot_binary.csv	Binary detection of IoT malware	Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021
ton_iot_multiclass.csv	Multi-class classification of IoT malware	Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021
tor_binary.csv	Binary detection of TOR	Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017.
tor_multiclass.csv	Multi-class classification of TOR	Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017.
vpn_iscx_binary.csv	Binary detection of VPN	Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related. In ICISSP, pages 407–414, 2016.
vpn_iscx_multiclass.csv	Multi-class classification of VPN	Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related. In ICISSP, pages 407–414, 2016.
vpn_vnat_binary.csv	Binary detection of VPN	Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022
vpn_vnat_multiclass.csv	Multi-class classification of VPN	Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022
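
As a rough illustration of how the table above could be used programmatically, the sketch below reads a few of its rows as tab-separated text and labels each file as a binary or multi-class task based on the "Detection problem" wording. The `TABLE` excerpt and the helper name `task_types` are assumptions made for this example; they are not part of the Zenodo repository.

```python
import csv
import io

# Three rows copied from the table above (tab-separated); the rest are omitted.
TABLE = (
    "File name\tDetection problem\n"
    "botnet_binary.csv\tBinary detection of botnet\n"
    "botnet_multiclass.csv\tMulti-class classification of botnet\n"
    "cryptomining_design.csv\tBinary detection of cryptomining; the design part\n"
)


def task_types(tsv_text):
    """Map each file name to 'binary' or 'multiclass' from the problem text."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    result = {}
    for row in reader:
        problem = row["Detection problem"].lower()
        is_multi = problem.startswith("multi-class")
        result[row["File name"]] = "multiclass" if is_multi else "binary"
    return result


tasks = task_types(TABLE)
for name, kind in tasks.items():
    print(f"{name}: {kind}")
```

Reading the problem description rather than the file-name suffix matters here, because some binary tasks (for example cryptomining_design.csv) carry no `_binary` suffix.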

LoRaWAN Traffic Analysis Dataset
data.niaid.nih.gov
Updated Aug 28, 2023
Cite
Kral, Jan (2023). LoRaWAN Traffic Analysis Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7919212
Dataset updated
Aug 28, 2023
Dataset provided by
Kral, Jan
Povalac, Ales
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset was created by a LoRaWAN sniffer and contains packets, which are thoroughly analyzed in the paper Exploring LoRaWAN Traffic: In-Depth Analysis of IoT Network Communications. Data from the LoRaWAN sniffer was collected in four cities: Liege (Belgium), Graz (Austria), Vienna (Austria), and Brno (Czechia).

Gateway ID: b827ebafac000001

Uplink reception (end-device => gateway)

Only packets containing CRC, inverted IQ

RX0: 867.1 MHz, 867.3 MHz, 867.5 MHz, 867.7 MHz, 867.9 MHz - BW 125 kHz and all SF

RX1: 868.1 MHz, 868.3 MHz, 868.5 MHz - BW 125 kHz and all SF

Gateway ID: b827ebafac000002

Downlink reception (gateway => end-device)

Includes packets without CRC, non-inverted IQ

RX0: 867.1 MHz, 867.3 MHz, 867.5 MHz, 867.7 MHz, 867.9 MHz - BW 125 kHz and all SF

RX1: 868.1 MHz, 868.3 MHz, 868.5 MHz - BW 125 kHz and all SF

Gateway ID: b827ebafac000003

Downlink reception (gateway => end-device) and Class-B beacon on 869.525 MHz

Includes packets without CRC, non-inverted IQ

RX0: 869.525 MHz - BW 125 kHz and all SF, BW 125 kHz and SF9 with implicit header, CR 4/5 and length 17 B

To open the pcap files, you need Wireshark with current support for LoRaTap and LoRaWAN protocols. This support will be available in the official 4.1.0 release. A working version for Windows is accessible in the automated build system.

The source data is available in the log.zip file, which contains the complete dataset obtained by the sniffer. A set of conversion tools for log processing is available on Github. The converted logs, available in Wireshark format, are stored in pcap.zip. For the LoRaWAN decoder, you can use the attached root and session keys. The processed outputs are stored in csv.zip, and graphical statistics are available in png.zip.

This data represents a unique, geographically identifiable selection from the full log, cleaned of any errors. The records from Brno include communication between the gateway and a node with known keys.

Test file :: 00_Test

short test file for parser verification

comparison of LoRaTap version 0 and version 1 formats

Brno, Czech Republic :: 01_Brno

49.22685N, 16.57536E, ASL 306m

lines 150873 to 529796

time 1.8.2022 15:04:28 to 17.8.2022 13:05:32

preliminary experiment

experimental device

Device EUI: 70b3d5cee0000042

Application key: d494d49a7b4053302bdcf96f1defa65a

Device address: 00d85395

Network session key: c417540b8b2afad8930c82fcf7ea54bb

Application session key: 421fea9bedd2cc497f63303edf5adf8e

Liege, Belgium :: 02_Liege :: evaluated in the paper

50.66445N, 5.59276E, ASL 151m

lines 636205 to 886868

time 25.8.2022 10:12:24 to 12.9.2022 06:20:48

Brno, Czech Republic :: 03_Brno_join

49.22685N, 16.57536E, ASL 306m

lines 947787 to 979382

time 30.9.2022 15:21:27 to 4.10.2022 10:46:31

record contains OTAA activation (Join Request / Join Accept)

experimental device:

Device EUI: 70b3d5cee0000042

Application key: d494d49a7b4053302bdcf96f1defa65a

Device address: 01e65ddc

Network session key: e2898779a03de59e2317b149abf00238

Application session key: 59ca1ac91922887093bc7b236bd1b07f

Graz, Austria :: 04_Graz :: evaluated in the paper

47.07049N, 15.44506E, ASL 364m

lines 1015139 to 1178855

time 26.10.2022 06:21:07 to 29.11.2022 10:03:00

Vienna, Austria :: 05_Wien :: evaluated in the paper

48.19666N, 16.37101E, ASL 204m

lines 1179308 to 3657105

time 1.12.2022 10:42:19 to 4.1.2023 14:00:05

contains a total of 14 short restarts (under 90 seconds)

Brno, Czech Republic :: 07_Brno :: evaluated in the paper

49.22685N, 16.57536E, ASL 306m

lines 4969648 to 6919392

time 16.2.2023 8:53:43 to 30.3.2023 9:00:11
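
The capture windows listed above use a day.month.year timestamp format. A minimal sketch of computing how long each capture ran, assuming the timestamps are naive local times (the source does not state a time zone) and using only three of the listed captures:

```python
from datetime import datetime

# Day.month.year format as written in the capture list above.
FMT = "%d.%m.%Y %H:%M:%S"

# Start and end timestamps copied from the list; time zone is not stated.
CAPTURES = {
    "01_Brno": ("1.8.2022 15:04:28", "17.8.2022 13:05:32"),
    "02_Liege": ("25.8.2022 10:12:24", "12.9.2022 06:20:48"),
    "07_Brno": ("16.2.2023 8:53:43", "30.3.2023 9:00:11"),
}


def capture_duration(start, end):
    """Return the elapsed time between two capture timestamps."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)


durations = {name: capture_duration(*span) for name, span in CAPTURES.items()}
for name, delta in durations.items():
    print(f"{name}: {delta}")
```

The dictionary name `CAPTURES` and the helper `capture_duration` are illustrative only; the dataset's own conversion tools may handle timestamps differently.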
VLC Data: A Multi-Class Network Traffic Dataset Covering Diverse...
data.europa.eu
Cite
The citation is currently not available for this dataset.
Available download formats: unknown (1205388)
Dataset authored and provided by
Zenodo (http://zenodo.org/)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
VLC Data: A Multi-Class Network Traffic Dataset Covering Diverse Applications and Platforms

Valencia Data (VLC Data) is a network traffic dataset collected from various applications and platforms. It includes both encrypted and, when applicable, unencrypted protocols, capturing realistic usage scenarios and application-specific behavior. The dataset covers 18.5 hours, 58 pcapng files, and 24.26 GB, with traffic from:

Video streaming: Netflix and Prime Video (10–50 min) via Firefox.
Gaming: Roblox sessions on Windows (20–35 min), recorded outside of virtual machines, despite VM support.
Video conferencing: Microsoft Teams (20 min) via Firefox.
Web browsing: Wikipedia, BBC, Google, LinkedIn, Amazon, and OWIN6G (2–5 min) via Firefox or Chrome.
Audio streaming: Spotify (30–33 min) on multiple OS.
Web streaming: YouTube in 4K and Full HD (20–30 min).

This dataset is publicly available for traffic analysis across different apps, protocols, and systems.

Table Description:

Type	Applications	Platform	Time [min]	Comments	Filename	Size (MB)
Video Streaming	Netflix	Linux	10	Running Netflix on Firefox Browser	netflix_linux_10m_01	95.1
Video Streaming	Netflix	Linux	20	Running Netflix on Firefox Browser	netflix_linux_20m_01	167.7
Video Streaming	Netflix	Linux	20	Running Netflix on Firefox Browser	netflix_linux_20m_02	237.9
Video Streaming	Netflix	Linux	20	Running Netflix on Firefox Browser	netflix_linux_20m_03	212.6
Video Streaming	Netflix	Linux	25	Running Netflix on Firefox, but 2 min in Menu	netflix_linux_25m_01	610.7
Video Streaming	Netflix	Linux	35	Running Netflix on Firefox, but 1 min in Menu	netflix_linux_35m_01	534.8
Video Streaming	Netflix	Linux	50	Running Netflix on Firefox Browser	netflix_linux_50m_01	660.9
Video Streaming	Netflix	Windows	10	Running Netflix on Firefox Browser	netflix_windows_10m_01	132.1
Video Streaming	Netflix	Windows	20	Running Netflix on Firefox Browser	netflix_windows_20m_01	506.4
Video Streaming	Prime Video	Linux	20	Running Prime Video on Firefox Browser	prime_linux_20m_01	767.3
Video Streaming	Prime Video	Linux	20	Running Prime Video on Firefox Browser	prime_linux_20m_02	569.3
Video Streaming	Prime Video	Windows	20	Running Prime Video on Firefox Browser	prime_windows_20m_01	512.3
Video Streaming	Prime Video	Windows	20	Running Prime Video on Firefox Browser	prime_windows_20m_02	364.2
Gaming	Roblox	Windows	20	Doesn't run in VM	roblox_windows_20m_01	127.5
Gaming	Roblox	Windows	20	Doesn't run in VM	roblox_windows_20m_02	378.5
Gaming	Roblox	Windows	20	Doesn't run in VM	roblox_windows_20m_03	458.9
Gaming	Roblox	Windows	30	Doesn't run in VM	roblox_windows_30m_01	519.8
Gaming	Roblox	Windows	30	Doesn't run in VM	roblox_windows_30m_02	357.3
Gaming	Roblox	Windows	35	Doesn't run in VM	roblox_windows_35m_01	880.4
Audio Streaming	Spotify	Linux	30	Running Spotify app on Ubuntu-Linux	spotify_linux_30m_01	98.2
Audio Streaming	Spotify	Linux	30	Running Spotify app on Ubuntu-Linux	spotify_linux_30m_02	112.2
Audio Streaming	Spotify	Linux	30	Running Spotify app on Ubuntu-Linux	spotify_linux_30m_03	175.5
Audio Streaming	Spotify	Windows	30	Running Spotify app on Windows	spotify_windows_30m_01	50.7
Audio Streaming	Spotify	Windows	30	Doesn't run in VM	spotify_windows_30m_02	63.2
Audio Streaming	Spotify	Windows	33	Running Spotify app on Windows	spotify_windows_33m_01	70.9
Video Conferencing	Teams	Linux	20	Running Teams on Firefox Browser	teams_linux_20m_01	134.6
Video Conferencing	Teams	Linux	20	Running Teams on Firefox Browser	teams_linux_20m_02	343.3
Video Conferencing	Teams	Linux	20	Running Teams on Firefox Browser	teams_linux_20m_03	376.6
Video Conferencing	Teams	Windows	20	Running Teams on Firefox Browser	teams_windows_20m_01	634.1
Video Conferencing	Teams	Windows	20	Running Teams on Firefox Browser	teams_windows_20m_02	517.8
Video Conferencing	Teams	Windows	20	Running Teams on Firefox Browser	teams_windows_20m_03	629.9
Web Browsing	Web	Linux	2	OWIN6G website on Firefox Browser	web_linux_2m_owin6g	1.2
Web Browsing	Web	Linux	2	Wikipedia website on Firefox Browser	web_linux_2m_wikipedia	19.7
Web Browsing	Web	Linux	3	OWIN6G website on Firefox Browser	web_linux_3m_owin6g	4.5
Web Browsing	Web	Linux	3	Wikipedia website on Firefox Browser	web_linux_3m_wikipedia	23.5
Web Browsing	Web	Linux	5	Amazon website on Chrome Browser	web_linux_5m_amazon	262.9
Web Browsing	Web	Linux	5	BBC website on Firefox Browser	web_linux_5m_bbc	55.7
Web Browsing	Web	Linux	5	Google website on Firefox Browser	web_linux_5m_google	22.6
Web Browsing	Web	Linux	5	Linkedin website on Firefox Browser	web_linux_5m_linkedin	39.8
Web Browsing	Web	Windows	3	OWIN6G website on Firefox Browser	web_windows_3m_owin6g	32.6
Web Browsing	Web
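
The file names in the table appear to follow an `<app>_<platform>_<minutes>m_<tag>` pattern, where the tag is either a run index (such as `01`) or a website name (such as `owin6g`). The sketch below parses that pattern; the regular expression and the helper name `parse_capture_name` are inferred from the listed names only and may not hold for files missing from this truncated listing.

```python
import re

# Pattern inferred from the file names listed above; not an official spec.
NAME_RE = re.compile(
    r"^(?P<app>[a-z]+)_(?P<platform>linux|windows)_"
    r"(?P<minutes>\d+)m_(?P<tag>[a-z0-9]+)$"
)


def parse_capture_name(name):
    """Split a capture file name into app, platform, minutes, and tag."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized capture name: {name!r}")
    info = match.groupdict()
    info["minutes"] = int(info["minutes"])
    return info


print(parse_capture_name("netflix_linux_20m_01"))
print(parse_capture_name("web_linux_2m_owin6g"))
```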
Enriched Traffic Datasets for Madrid
data.mendeley.com
Updated Jan 27, 2025
Cite
Iván Gómez (2025). Enriched Traffic Datasets for Madrid [Dataset]. http://doi.org/10.17632/697ht4f65b.2
Unique identifier
https://doi.org/10.17632/697ht4f65b.2
Dataset updated
Jan 27, 2025
Authors
Iván Gómez
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Madrid
Description
DESCRIPTION OF THE RESEARCH AND DATA: This work presents the Madrid Traffic Dataset (MTD), a comprehensive resource for the analysis and modeling of traffic patterns in Madrid. The dataset integrates data from traffic sensors, weather observations, calendar information, road infrastructure, and geolocation data to support advanced studies of urban mobility and predictive modeling.

In addition to the core data sources, the dataset includes temporal sequences and a traffic adjacency matrix, enabling the application of time-series analysis and graph-based modeling approaches.

-COMPLETE DATASET: The complete version of the MTD includes data from 554 traffic sensors distributed across the Madrid region, covering a total of 30 months (from June 2022 to November 2024).

-SUBSET DATASET: A more compact version derived from the complete dataset, focused on a subset of 300 traffic sensors with 17 months of data (from June 2022 to October 2023). This subset is designed for researchers requiring a lighter dataset.
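
The month counts quoted for the two versions can be checked with simple inclusive-month arithmetic. A minimal sketch, assuming both the first and last named months are counted in full (the helper name `months_inclusive` is introduced here for illustration):

```python
def months_inclusive(start_year, start_month, end_year, end_month):
    """Count calendar months from start to end, including both endpoints."""
    return (end_year - start_year) * 12 + (end_month - start_month) + 1


# Complete dataset: June 2022 to November 2024.
complete_months = months_inclusive(2022, 6, 2024, 11)
# Subset dataset: June 2022 to October 2023.
subset_months = months_inclusive(2022, 6, 2023, 10)

print(complete_months, subset_months)
```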

DATA ORGANIZATION: The dataset is organized in a main directory containing a subfolder identified by the configuration data hash. This subfolder includes all key components: datasets, temporal sequences, adjacency matrices, and configuration files. The structure ensures that all resources are clearly arranged to facilitate easy access and reproducibility for researchers.

For more details, see [Submitted to IEEE Internet of Things Journal].
isom5240-td-application-traffic-analysis
huggingface.co
Cite
Gordon, isom5240-td-application-traffic-analysis [Dataset]. https://huggingface.co/datasets/slliac/isom5240-td-application-traffic-analysis
Authors
Gordon
Description
Split:

application: 38 samples

Class Distribution:

car (ID: 3): 67 (46.2%)
motorcycle (ID: 4): 14 (9.7%)
airplane (ID: 5): 62 (42.8%)
truck (ID: 8): 2 (1.4%)

Annotation Files:

Latest: application/application_labels.json
Timestamped: application/application_labels_20250309_212205.json
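
The percentages in the first class distribution can be reproduced from the raw counts. The counts sum to 145, which is more than the 38 samples in the split, so they presumably refer to annotated objects rather than samples; that reading is an assumption, not something the dataset card states.

```python
# Counts copied from the first class distribution above.
counts = {"car": 67, "motorcycle": 14, "airplane": 62, "truck": 2}

total = sum(counts.values())
# Percentage share of each class, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(total)
print(shares)
```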

Split:

application: 49 samples

Class Distribution:

car (ID: 3): 120 (60.0%)
motorcycle (ID: 4): 14 (7.0%)
airplane (ID: 5): 62 (31.0%)
truck (ID: 8):…

See the full description on the dataset page: https://huggingface.co/datasets/slliac/isom5240-td-application-traffic-analysis.
Current Traffic Analysis Zones for Sandoval County, New Mexico, 2006se TIGER...
catalog.data.gov
gstore.unm.edu
Updated Dec 2, 2020
Cite
Earth Data Analysis Center (Point of Contact) (2020). Current Traffic Analysis Zones for Sandoval County, New Mexico, 2006se TIGER [Dataset]. https://catalog.data.gov/dataset/current-traffic-analysis-zones-for-sandoval-county-new-mexico-2006se-tiger
Dataset updated
Dec 2, 2020
Dataset provided by
Earth Data Analysis Center (Point of Contact)
Area covered
Sandoval County, New Mexico
Description
The 2006 Second Edition TIGER/Line files are an extract of selected geographic and cartographic information from the Census TIGER database. The geographic coverage for a single TIGER/Line file is a county or statistical equivalent entity, with the coverage area based on the latest available governmental unit boundaries. The Census TIGER database represents a seamless national file with no overlaps or gaps between parts. However, each county-based TIGER/Line file is designed to stand alone as an independent data set or the files can be combined to cover the whole Nation. The 2006 Second Edition TIGER/Line files consist of line segments representing physical features and governmental and statistical boundaries. This shapefile represents the current Traffic Analysis Zones for Sandoval County stored in the 2006 TIGER Second Edition dataset.
IP Network Traffic Flows Labeled with 75 Apps
kaggle.com
Updated Sep 15, 2018
Cite
Juan Sebastián Rojas (2018). IP Network Traffic Flows Labeled with 75 Apps [Dataset]. https://www.kaggle.com/datasets/jsrojas/ip-network-traffic-flows-labeled-with-87-apps/discussion
Dataset updated
Sep 15, 2018
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Juan Sebastián Rojas
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Context

The data presented here were collected in a network section of Universidad Del Cauca, Popayán, Colombia, by performing packet captures at different hours, during the morning and afternoon, over six days (April 26, 27, 28 and May 9, 11 and 15) of 2017. A total of 3,577,296 instances were collected and are currently stored in a CSV (Comma Separated Values) file.

Content

This dataset contains 87 features. Each instance holds the information of an IP flow generated by a network device i.e., source and destination IP addresses, ports, interarrival times, layer 7 protocol (application) used on that flow as the class, among others. Most of the attributes are numeric type but there are also nominal types and a date type due to the Timestamp.

The flow statistics (IP addresses, ports, inter-arrival times, etc) were obtained using CICFlowmeter (http://www.unb.ca/cic/research/applications.html - https://github.com/ISCX/CICFlowMeter). The application layer protocol was obtained by performing a DPI (Deep Packet Inspection) processing on the flows with ntopng (https://www.ntop.org/products/traffic-analysis/ntop/ - https://github.com/ntop/ntopng).
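
To illustrate the structure described above (one row per IP flow, with the application detected by DPI as the class), here is a small sketch that counts flows per application label. The column names `Source.IP`, `Destination.IP`, and `ProtocolName`, and the sample rows themselves, are assumptions made up for this example; check the actual CSV header before relying on them.

```python
import csv
import io
from collections import Counter

# Made-up sample rows; column names are assumed, not taken from the dataset.
SAMPLE = """Source.IP,Destination.IP,ProtocolName
192.0.2.10,198.51.100.5,YOUTUBE
192.0.2.11,198.51.100.6,FACEBOOK
192.0.2.10,198.51.100.7,YOUTUBE
"""


def flows_per_app(csv_text, label_column="ProtocolName"):
    """Count how many flow records carry each application label."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[label_column] for row in reader)


app_counts = flows_per_app(SAMPLE)
print(app_counts.most_common())
```

For the full file with millions of rows, the same counting idea applies, though a streaming read from disk would replace the in-memory string.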

For further information and if you find this dataset useful, please read and cite the following papers:

Research Gate: https://www.researchgate.net/publication/326150046_Personalized_Service_Degradation_Policies_on_OTT_Applications_Based_on_the_Consumption_Behavior_of_Users

Research Gate: https://www.researchgate.net/publication/335954240_Consumption_Behavior_Analysis_of_Over_The_Top_Services_Incremental_Learning_or_Traditional_Methods

Springer: https://link.springer.com/chapter/10.1007/978-3-319-95168-3_37

IEEE Xplore: https://ieeexplore.ieee.org/document/8845576

Research Gate: https://www.researchgate.net/publication/345990587_Smart_User_Consumption_Profiling_Incremental_Learning-based_OTT_Service_Degradation

IEEE Xplore: https://ieeexplore.ieee.org/document/9258898

Acknowledgements

I would like to thank Universidad Del Cauca for supporting the research that generated this dataset and Colciencias for my PhD scholarship.

Inspiration

Considering that most network traffic classification datasets aim only at identifying the type of application an IP flow holds (WWW, DNS, FTP, P2P, Telnet, etc.), this dataset goes a step further by supporting machine learning models capable of detecting specific applications such as Facebook, YouTube, Instagram, etc., from IP flow statistics (currently 75 applications).
Traffic_train Dataset
universe.roboflow.com
zip
Updated Apr 12, 2021
Cite
ViWhiVN (2021). Traffic_train Dataset [Dataset]. https://universe.roboflow.com/viwhivn/traffic_train
Available download formats: zip
Dataset updated
Apr 12, 2021
Dataset authored and provided by
ViWhiVN
Variables measured
TrafficTrain Bounding Boxes
Description
Here are a few use cases for this project:

Traffic Monitoring Systems: The model could be leveraged by city planning departments or traffic control centers to automatically identify and monitor different types of traffic on roads in real-time. This could assist in efficient traffic management, congestion detection, and traffic light timing adjustment.

Autonomous Vehicles: Companies developing self-driving cars or drones could utilize the model to improve their vehicle's ability to recognize different types of vehicles on the road, ensuring safer navigation.

Security and Surveillance: The model could be used in CCTV camera systems to detect, classify, and track vehicles around sensitive areas like government buildings, airports, or high-security areas for security enhancement and crime prevention.

Traffic Analysis for Urban Planning: Urban planners and researchers can use the model to study traffic patterns based on vehicle type over time, informing future infrastructure and transportation planning.

Enhanced Vehicle-based Augmented Reality (AR): Game developers or AR app creators who focus on city or traffic scenarios can use the model to enhance their system's ability to accurately detect and interact with real-world vehicles, promoting a more immersive experience for users.
Datasets for Computational Methods and GIS Applications in Social Science
search.dataone.org
Updated Sep 25, 2024
Cite
Fahui Wang; Lingbo Liu (2024). Datasets for Computational Methods and GIS Applications in Social Science [Dataset]. http://doi.org/10.7910/DVN/4CM7V4
Unique identifier
https://doi.org/10.7910/DVN/4CM7V4
Dataset updated
Sep 25, 2024
Dataset provided by
Harvard Dataverse
Authors
Fahui Wang; Lingbo Liu
Description
Dataset for the textbook Computational Methods and GIS Applications in Social Science (3rd Edition), 2023
Fahui Wang, Lingbo Liu

Main Book Citation: Wang, F., & Liu, L. (2023). Computational Methods and GIS Applications in Social Science (3rd ed.). CRC Press. https://doi.org/10.1201/9781003292302

KNIME Lab Manual Citation: Liu, L., & Wang, F. (2023). Computational Methods and GIS Applications in Social Science - Lab Manual. CRC Press. https://doi.org/10.1201/9781003304357

KNIME Hub Dataset and Workflow for Computational Methods and GIS Applications in Social Science-Lab Manual

Update Log

If Python package not found in Package Management, use ArcGIS Pro's Python Command Prompt to install them, e.g., conda install -c conda-forge python-igraph leidenalg
NetworkCommDetPro in CMGIS-V3-Tools was updated on July 10, 2024
Add spatial adjacency table into Florida on June 29, 2024
The dataset and tool for ABM Crime Simulation were updated on August 3, 2023
The toolkits in CMGIS-V3-Tools was updated on August 3rd, 2023

Report Issues on GitHub: https://github.com/UrbanGISer/Computational-Methods-and-GIS-Applications-in-Social-Science

Following the website of Fahui Wang: http://faculty.lsu.edu/fahui

Contents

Chapter 1. Getting Started with ArcGIS: Data Management and Basic Spatial Analysis Tools
Case Study 1: Mapping and Analyzing Population Density Pattern in Baton Rouge, Louisiana
Chapter 2. Measuring Distance and Travel Time and Analyzing Distance Decay Behavior
Case Study 2A: Estimating Drive Time and Transit Time in Baton Rouge, Louisiana
Case Study 2B: Analyzing Distance Decay Behavior for Hospitalization in Florida
Chapter 3. Spatial Smoothing and Spatial Interpolation
Case Study 3A: Mapping Place Names in Guangxi, China
Case Study 3B: Area-Based Interpolations of Population in Baton Rouge, Louisiana
Case Study 3C: Detecting Spatiotemporal Crime Hotspots in Baton Rouge, Louisiana
Chapter 4. Delineating Functional Regions and Applications in Health Geography
Case Study 4A: Defining Service Areas of Acute Hospitals in Baton Rouge, Louisiana
Case Study 4B: Automated Delineation of Hospital Service Areas in Florida
Chapter 5. GIS-Based Measures of Spatial Accessibility and Application in Examining Healthcare Disparity
Case Study 5: Measuring Accessibility of Primary Care Physicians in Baton Rouge
Chapter 6. Function Fittings by Regressions and Application in Analyzing Urban Density Patterns
Case Study 6: Analyzing Population Density Patterns in Chicago Urban Area
Chapter 7. Principal Components, Factor and Cluster Analyses and Application in Social Area Analysis
Case Study 7: Social Area Analysis in Beijing
Chapter 8. Spatial Statistics and Applications in Cultural and Crime Geography
Case Study 8A: Spatial Distribution and Clusters of Place Names in Yunnan, China
Case Study 8B: Detecting Colocation Between Crime Incidents and Facilities
Case Study 8C: Spatial Cluster and Regression Analyses of Homicide Patterns in Chicago
Chapter 9. Regionalization Methods and Application in Analysis of Cancer Data
Case Study 9: Constructing Geographical Areas for Mapping Cancer Rates in Louisiana
Chapter 10. System of Linear Equations and Application of Garin-Lowry in Simulating Urban Population and Employment Patterns
Case Study 10: Simulating Population and Service Employment Distributions in a Hypothetical City
Chapter 11. Linear and Quadratic Programming and Applications in Examining Wasteful Commuting and Allocating Healthcare Providers
Case Study 11A: Measuring Wasteful Commuting in Columbus, Ohio
Case Study 11B: Location-Allocation Analysis of Hospitals in Rural China
Chapter 12. Monte Carlo Method and Applications in Urban Population and Traffic Simulations
Case Study 12A: Examining Zonal Effect on Urban Population Density Functions in Chicago by Monte Carlo Simulation
Case Study 12B: Monte Carlo-Based Traffic Simulation in Baton Rouge, Louisiana
Chapter 13. Agent-Based Model and Application in Crime Simulation
Case Study 13: Agent-Based Crime Simulation in Baton Rouge, Louisiana
Chapter 14. Spatiotemporal Big Data Analytics and Application in Urban Studies
Case Study 14A: Exploring Taxi Trajectory in ArcGIS
Case Study 14B: Identifying High Traffic Corridors and Destinations in Shanghai

Dataset File Structure

1 BatonRouge Census.gdb BR.gdb
2A BatonRouge BR_Road.gdb Hosp_Address.csv TransitNetworkTemplate.xml BR_GTFS Google API Pro.tbx
2B Florida FL_HSA.gdb R_ArcGIS_Tools.tbx (RegressionR)
3A China_GX GX.gdb
3B BatonRouge BR.gdb
3C BatonRouge BRcrime R_ArcGIS_Tools.tbx (STKDE)
4A BatonRouge BRRoad.gdb
4B Florida FL_HSA.gdb HSA Delineation Pro.tbx Huff Model Pro.tbx FLplgnAdjAppend.csv
5 BRMSA BRMSA.gdb Accessibility Pro.tbx
6 Chicago ChiUrArea.gdb R_ArcGIS_Tools.tbx (RegressionR)
7 Beijing BJSA.gdb bjattr.csv R_ArcGIS_Tools.tbx (PCAandFA, BasicClustering)
8A Yunnan YN.gdb R_ArcGIS_Tools.tbx (SaTScanR)
8B Jiangsu JS.gdb
8C Chicago ChiCity.gdb cityattr.csv ...
Current Traffic Analysis Zones for Bernalillo County, New Mexico, 2006se...
catalog.data.gov
gstore.unm.edu
Updated Dec 2, 2020
Cite
Earth Data Analysis Center (Point of Contact) (2020). Current Traffic Analysis Zones for Bernalillo County, New Mexico, 2006se TIGER [Dataset]. https://catalog.data.gov/dataset/current-traffic-analysis-zones-for-bernalillo-county-new-mexico-2006se-tiger
Dataset updated
Dec 2, 2020
Dataset provided by
Earth Data Analysis Center (Point of Contact)
Area covered
Bernalillo County, New Mexico
Description
The 2006 Second Edition TIGER/Line files are an extract of selected geographic and cartographic information from the Census TIGER database. The geographic coverage for a single TIGER/Line file is a county or statistical equivalent entity, with the coverage area based on the latest available governmental unit boundaries. The Census TIGER database represents a seamless national file with no overlaps or gaps between parts. However, each county-based TIGER/Line file is designed to stand alone as an independent data set or the files can be combined to cover the whole Nation. The 2006 Second Edition TIGER/Line files consist of line segments representing physical features and governmental and statistical boundaries. This shapefile represents the current Traffic Analysis Zones for Bernalillo County stored in the 2006 TIGER Second Edition dataset.
USA POI & Foot Traffic Enriched Geospatial Dataset by Predik Data-Driven
app.mobito.io
USA POI & Foot Traffic Enriched Geospatial Dataset by Predik Data-Driven [Dataset]. https://app.mobito.io/data-product/usa-enriched-geospatial-framework-dataset
Area covered
United States
Description
Our dataset provides detailed and precise insights into the business, commercial, and industrial aspects of any given area in the USA, including Point of Interest (POI) data and foot traffic. The dataset is divided into 150 m x 150 m areas (geohash 7) and has over 50 variables.

- Use it for different applications: Our combined dataset, which includes POI and foot traffic data, can be employed for various purposes. Different data teams use it to guide retailers and FMCG brands in site selection, fuel marketing intelligence, analyze trade areas, and assess company risk. Our dataset has also proven to be useful for real estate investment.
- Get reliable data: Our datasets have been processed, enriched, and tested so your data team can use them more quickly and accurately.
- Ideal for training ML models: The high quality of our geographic information layers results from more than seven years of work dedicated to the deep understanding and modeling of geospatial Big Data. Among the features that distinguish this dataset is the use of anonymized and user-compliant mobile device GPS location, enriched with other alternative and public data.
- Easy to use: Our dataset is user-friendly and can be easily integrated into your current models. Also, we can deliver your data in different formats, like .csv, according to your analysis requirements.
- Get personalized guidance: In addition to providing reliable datasets, we advise your analysts on their correct implementation. Our data scientists can guide your internal team on the optimal algorithms and models to get the most out of the information we provide (without compromising the security of your internal data).

Answer questions like:
- What places does my target user visit in a particular area?
- Which are the best areas to place a new POS?
- What is the average yearly income of users in a particular area?
- What is the influx of visits that my competition receives?
- What is the volume of traffic surrounding my current POS?

This dataset is useful for getting insights from industries like:
- Retail & FMCG
- Banking, Finance, and Investment
- Car Dealerships
- Real Estate
- Convenience Stores
- Pharma and medical laboratories
- Restaurant chains and franchises
- Clothing chains and franchises

Our dataset includes more than 50 variables, such as:
- Number of pedestrians seen in the area.
- Number of vehicles seen in the area.
- Average speed of movement of the vehicles seen in the area.
- Points of Interest (POIs) (in number and type) seen in the area (supermarkets, pharmacies, recreational locations, restaurants, offices, hotels, parking lots, wholesalers, financial services, pet services, shopping malls, among others).
- Average yearly income range (anonymized and aggregated) of the devices seen in the area.

Notes to better understand this dataset:
- POI confidence means the average confidence of POIs in the area. In this case, POIs are any kind of location, such as a restaurant, a hotel, or a library.
- Category confidences, for example "food_drinks_tobacco_retail_confidence", indicate how confident we are in the existence of food/drink/tobacco retail locations in the area.
- We added predictions for The Home Depot and Lowe's Home Improvement stores in the dataset sample. These predictions were the result of a machine-learning model that was trained with the data. Knowing where the current stores are, we can find the most similar areas for new stores to open.

How efficient is a Geohash?
Geohash is a faster, cost-effective geofencing option that reduces input data load and provides actionable information. Its benefits include faster querying, reduced cost, minimal configuration, and ease of use. Geohash ranges from 1 to 12 characters. The dataset can be split into variable-size geohashes, with the default being geohash7 (150m x 150m).
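
The geohash cells described above can be illustrated with a minimal sketch of the standard public geohash encoding. This is not the vendor's code, and the function names `geohash_encode` and `cell_size_degrees` are hypothetical. A 7-character geohash carries 35 bits, so one cell spans 360/2^18 degrees of longitude by 180/2^17 degrees of latitude, which is roughly 153 m on each side at the equator. The east-west extent shrinks at higher latitudes, so "150m x 150m" is best read as an approximation.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=7):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def cell_size_degrees(precision):
    """Return (latitude height, longitude width) of one cell in degrees."""
    total_bits = 5 * precision
    lon_bits = (total_bits + 1) // 2  # longitude receives the first bit
    lat_bits = total_bits // 2
    return 180.0 / 2 ** lat_bits, 360.0 / 2 ** lon_bits


code = geohash_encode(57.64911, 10.40744, precision=7)
height_deg, width_deg = cell_size_degrees(7)
print(code, height_deg, width_deg)
```

Because a shorter geohash is always a prefix of a longer one for the same point, coarser cells can be obtained by truncating the string, which is one way the "variable-size geohashes" mentioned above could be derived.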
Current Traffic Analysis Zones for Torrance County, New Mexico, 2006se TIGER...
s.cnmilf.com
cloud.csiss.gmu.edu
Updated Dec 2, 2020
Earth Data Analysis Center (Point of Contact) (2020). Current Traffic Analysis Zones for Torrance County, New Mexico, 2006se TIGER [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/current-traffic-analysis-zones-for-torrance-county-new-mexico-2006se-tiger
Dataset updated
Dec 2, 2020
Dataset provided by
Earth Data Analysis Center (Point of Contact)
Area covered
Torrance County, New Mexico
Description
The 2006 Second Edition TIGER/Line files are an extract of selected geographic and cartographic information from the Census TIGER database. The geographic coverage for a single TIGER/Line file is a county or statistical equivalent entity, with the coverage area based on the latest available governmental unit boundaries. The Census TIGER database represents a seamless national file with no overlaps or gaps between parts. However, each county-based TIGER/Line file is designed to stand alone as an independent data set or the files can be combined to cover the whole Nation. The 2006 Second Edition TIGER/Line files consist of line segments representing physical features and governmental and statistical boundaries. This shapefile represents the current Traffic Analysis Zones for Torrance County stored in the 2006 TIGER Second Edition dataset.
Current Traffic Analysis Zones for Otero County, New Mexico, 2006se TIGER
catalog.data.gov
datasets.ai
Updated Dec 2, 2020
Earth Data Analysis Center (Point of Contact) (2020). Current Traffic Analysis Zones for Otero County, New Mexico, 2006se TIGER [Dataset]. https://catalog.data.gov/dataset/current-traffic-analysis-zones-for-otero-county-new-mexico-2006se-tiger
Dataset updated
Dec 2, 2020
Dataset provided by
Earth Data Analysis Center (Point of Contact)
Area covered
Otero County, New Mexico
Description
The 2006 Second Edition TIGER/Line files are an extract of selected geographic and cartographic information from the Census TIGER database. The geographic coverage for a single TIGER/Line file is a county or statistical equivalent entity, with the coverage area based on the latest available governmental unit boundaries. The Census TIGER database represents a seamless national file with no overlaps or gaps between parts. However, each county-based TIGER/Line file is designed to stand alone as an independent data set or the files can be combined to cover the whole Nation. The 2006 Second Edition TIGER/Line files consist of line segments representing physical features and governmental and statistical boundaries. This shapefile represents the current Traffic Analysis Zones for Otero County stored in the 2006 TIGER Second Edition dataset.
Traffic Dataset
universe.roboflow.com
zip
Updated Sep 19, 2022
joseva (2022). Traffic Dataset [Dataset]. https://universe.roboflow.com/joseva/traffic-kyalq/dataset/2
Available download formats: zip
Dataset updated
Sep 19, 2022
Dataset authored and provided by
joseva
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Traffic Signals Bounding Boxes
Description
Here are a few use cases for this project:

Self-Driving Vehicles System: This model can be implemented in autonomous vehicles technology to identify traffic signs and signals, thus enabling the vehicle to make intelligent and safety-compliant decisions as per road conditions.

Smart Traffic Management: The model can be used in urban planning and traffic management systems to analyze, comprehend, and report traffic indications in real-time, aiding in better road traffic control and congestion avoidance.

Driving Assistance Applications: There is potential to integrate this model into GPS navigation systems or dedicated driving assistance applications. These apps could provide real-time traffic rule alerts to drivers, enhancing safety and rule adherence.

Road Condition Analysis: Use the model to collect road condition data based on signs for construction, slippery road, uneven road, etc. This critical information could support road maintenance planning by relevant authorities.

Traffic Rule Training Software: This model can be used in developing training software for beginner drivers or trucking companies. The software could explain and demonstrate various traffic rules, greatly improving the quality of road safety education.
Ransomware and user samples for training and validating ML models
data.mendeley.com
Updated Sep 17, 2021
Eduardo Berrueta (2021). Ransomware and user samples for training and validating ML models [Dataset]. http://doi.org/10.17632/yhg5wk39kf.2
Unique identifier
https://doi.org/10.17632/yhg5wk39kf.2
Dataset updated
Sep 17, 2021
Authors
Eduardo Berrueta
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Ransomware has been considered a significant threat to most enterprises for the past few years. In scenarios where users can access all files on a shared server, one infected host can lock access to all shared files. In the article related to this repository, we detect ransomware infection based on file-sharing traffic analysis, even in the case of encrypted traffic. We compare three machine learning models and choose the best for validation. We train and test the detection model using more than 70 ransomware binaries from 26 different families and more than 2500 h of ‘not infected’ traffic from real users. The results reveal that the proposed tool can detect all ransomware binaries, including those not used in the training phase (zero-days). This paper provides a validation of the algorithm by studying the false positive rate and the amount of information from user files that the ransomware could encrypt before being detected.

This dataset directory contains the 'infected' and 'not infected' samples and the models used for each T configuration, each in a separate folder.

The folders are named NxSy, where x is the number of 1-second intervals per sample and y is the sliding step in seconds.

Each folder (for example N10S10/) contains:
- tree.py -> Python script with the Tree model.
- ensemble.json -> JSON file with the information about the Ensemble model.
- NN_XhiddenLayer.json -> JSON file with the information about the NN model with X hidden layers (1, 2 or 3).
- N10S10.csv -> All samples used for training each model in this folder, in CSV format for use in the bigML application.
- zeroDays.csv -> All zero-day samples used for testing each model in this folder, in CSV format for use in the bigML application.
- userSamples_test -> All samples used for validating each model in this folder, in CSV format for use in the bigML application.
- userSamples_train -> User samples used for training the models.
- ransomware_train -> Ransomware samples used for training the models.
- scaler.scaler -> Standard Scaler from a Python library, used to scale the samples.
- zeroDays_notFiltered -> Folder with the zero-day samples.

The N30S30 folder contains an additional folder (SMBv2SMBv3NFS) with the samples extracted from the SMBv2, SMBv3 and NFS traffic traces. It holds more binaries than the article presents because some of them are not "unseen" binaries (their families are present in the training set).

The files containing samples (NxSy.csv, zeroDays.csv and userSamples_test.csv) are structured as follows:
- Each line is one sample.
- Each sample has 3*T features and the label (1 if it is an 'infected' sample and 0 if it is not).
- The features are separated by ',' because it is a CSV file.
- The last column is the label of the sample.
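
The folder naming and sample-file layout described above can be read with a short sketch like the following. It is only an illustration under stated assumptions: the helper names `parse_folder_name` and `read_samples` are hypothetical, the files are assumed to have no header row, and the demo rows are made-up values rather than data from the repository.

```python
import csv
import io
import re


def parse_folder_name(name):
    """Split a folder name such as 'N10S10' into (x, y).

    x is the number of 1-second intervals per sample and
    y is the sliding step in seconds.
    """
    match = re.fullmatch(r"N(\d+)S(\d+)/?", name)
    if match is None:
        raise ValueError(f"not an NxSy folder name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def read_samples(handle):
    """Read rows of 3*T comma-separated features followed by a 0/1 label.

    Assumption: the file has no header row.
    """
    features, labels = [], []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        *values, label = row
        if len(values) % 3 != 0:
            raise ValueError("expected a multiple of 3 features per sample")
        features.append([float(v) for v in values])
        labels.append(int(float(label)))  # accept '1' as well as '1.0'
    return features, labels


# Made-up demo data: two samples with 3*T = 6 features each (T = 2).
demo = io.StringIO("0.1,0.2,0.3,0.4,0.5,0.6,1\n0,0,0,0,0,0,0\n")
X, y = read_samples(demo)
intervals, step = parse_folder_name("N10S10")
print(len(X), len(X[0]) // 3, y, intervals, step)
```

To read a real file, the same function could be given an open file handle, for example `open("N10S10/N10S10.csv", newline="")`.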

Additionally, we have placed two pcap files in the root directory. These are the traces used to compare both versions of SMB.
Applications for traffic analysis study.
plos.figshare.com
xls
Updated Jun 5, 2023
Mun-Suk Kim; Yena Kim; SeungSeob Lee; SuKyoung Lee; Nada Golmie (2023). Applications for traffic analysis study. [Dataset]. http://doi.org/10.1371/journal.pone.0210738.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0210738.t002
Dataset updated
Jun 5, 2023
Dataset provided by
PLOS ONE
Authors
Mun-Suk Kim; Yena Kim; SeungSeob Lee; SuKyoung Lee; Nada Golmie
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Applications for traffic analysis study.
Outdoor Finetune Dataset
universe.roboflow.com
zip
Updated Jun 24, 2022
Usama Amir (2022). Outdoor Finetune Dataset [Dataset]. https://universe.roboflow.com/usama-amir/outdoor-finetune
Available download formats: zip
Dataset updated
Jun 24, 2022
Dataset authored and provided by
Usama Amir
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Objects Bounding Boxes
Description
Here are a few use cases for this project:

Traffic Surveillance: The computer vision model can be applied to monitor real-time traffic situations. It can identify different vehicle types such as bikes, cars, trucks (ltvs), and any other unusual items on the road, which can help in traffic analysis and management.

Autonomous Vehicles: The model can be integrated into the AI systems of self-driving cars to help them recognize and respond appropriately to the various entities in the outdoor environment, like different types of vehicles, bicycles, and people.

Outdoor Security Systems: This model can enhance the capabilities of security cameras installed outdoors. With its ability to identify various outdoor objects and people, it can improve the effectiveness and responsiveness of such systems.

Pedestrian Safety Application: The model can be integrated into apps designed for enhancing pedestrian safety. These apps can alert users when a vehicle or a bike is approaching.

Smart City Planning: City planners can use the data generated by this model to understand traffic flow, pedestrian activities, and vehicle type distributions, supporting more informed infrastructure planning and development.
Network Traffic Android Malware
kaggle.com
zip
Updated Sep 12, 2019
Christian Urcuqui (2019). Network Traffic Android Malware [Dataset]. https://www.kaggle.com/xwolf12/network-traffic-android-malware
Available download formats: zip (116603 bytes)
Dataset updated
Sep 12, 2019
Authors
Christian Urcuqui
Description
Introduction

Android is one of the most used mobile operating systems worldwide. Due to its technological impact, its open-source code and the possibility of installing applications from third parties without any central control, Android has recently become a malware target. Even though it includes security mechanisms, recent news about malicious activities and Android's vulnerabilities points to the importance of continuing to develop methods and frameworks to improve its security.

To prevent malware attacks, researchers and developers have proposed different security solutions, applying static analysis, dynamic analysis, and artificial intelligence. Indeed, data science has become a promising area in cybersecurity, since analytical models based on data allow for the discovery of insights that can help to predict malicious activities.

In this work, we propose to consider some network layer features as the basis for machine learning models that can successfully detect malware applications, using open datasets from the research community.

Content

This dataset is based on another dataset (DroidCollector), where you can get all the network traffic in pcap files. In our research, we preprocessed the files to extract the network features described in the following article:

López, C. C. U., Villarreal, J. S. D., Belalcazar, A. F. P., Cadavid, A. N., & Cely, J. G. D. (2018, May). Features to Detect Android Malware. In 2018 IEEE Colombian Conference on Communications and Computing (COLCOM) (pp. 1-6). IEEE.

Acknowledgements

Cao, D., Wang, S., Li, Q., Cheny, Z., Yan, Q., Peng, L., & Yang, B. (2016, August). DroidCollector: A High Performance Framework for High Quality Android Traffic Collection. In Trustcom/BigDataSE/I SPA, 2016 IEEE (pp. 1753-1758). IEEE
Current Traffic Analysis Zones for Valencia County, New Mexico, 2006se TIGER...
data.wu.ac.at
datadiscoverystudio.org
csv, excel, geojson +9
Updated Jun 25, 2014
Earth Data Analysis Center, University of New Mexico (2014). Current Traffic Analysis Zones for Valencia County, New Mexico, 2006se TIGER [Dataset]. https://data.wu.ac.at/schema/data_gov/NmIzMmMzYTQtZjU1ZS00NzFhLWFhNTctMzc1MTIwZmYyMTkz
Available download formats: xml, html, shp, csv, gml, geojson, json, excel, wfs, wms, kml, zip
Dataset updated
Jun 25, 2014
Dataset provided by
Earth Data Analysis Center, University of New Mexico
Area covered
Valencia County
Description
The 2006 Second Edition TIGER/Line files are an extract of selected geographic and cartographic information from the Census TIGER database. The geographic coverage for a single TIGER/Line file is a county or statistical equivalent entity, with the coverage area based on the latest available governmental unit boundaries. The Census TIGER database represents a seamless national file with no overlaps or gaps between parts. However, each county-based TIGER/Line file is designed to stand alone as an independent data set or the files can be combined to cover the whole Nation. The 2006 Second Edition TIGER/Line files consist of line segments representing physical features and governmental and statistical boundaries.

This shapefile represents the current Traffic Analysis Zones for Valencia County stored in the 2006 TIGER Second Edition dataset.
Analysis of the route safety of abnormal vehicle from the perspective of...
repod.icm.edu.pl
json, tsv, txt
Updated Feb 14, 2023
Betkier, Igor (2023). Analysis of the route safety of abnormal vehicle from the perspective of traffic parameters and infrastructure characteristics with the use of web technologies and machine learning [Dataset]. http://doi.org/10.18150/U9NPVL
Available download formats: txt(1061), txt(135312), txt(36279), txt(1237), tsv(49700), txt(4657), txt(1274), txt(474), json(223876718), json(142231883), txt(42976), txt(364), json(16510649), json(176705), txt(1316), txt(4420), txt(8577220), json(220646926), json(259936249)
Unique identifier
https://doi.org/10.18150/U9NPVL
Dataset updated
Feb 14, 2023
Dataset provided by
RepOD
Authors
Betkier, Igor
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Dataset funded by
Narodowe Centrum Nauki
Description
Dear Scientist!

This database contains data collected for the study "Analysis of the route safety of abnormal vehicle from the perspective of traffic parameters and infrastructure characteristics with the use of web technologies and machine learning", funded by the National Science Centre, Poland (grant reference 2021/05/X/ST8/01669). The file structure follows from the aims of the study and from the numerous sources needed to assemble data suitable as the input layer of a neural network. You can find the following folders and files:

1. Road_Parameters_Data (.csv) - data collected by the author before the study (2021). It contains information about the technical quality and types of the main roads located in Mazovia province (Poland). The data source was the Polish General Directorate for National Roads and Motorways.
2. Google_Maps_Data (.json) - data collected by the authors' web service, written in Python, which downloaded data from the Distance Matrix API service of Google Maps at two-hour intervals from 25 May 2022 to 22 June 2022. The application retrieved the TRAFFIC FACTOR parameter, which is the ratio of the actual travel time to the historical travel time for a particular road.
3. Geocoding_Roads_Data (.json) - data obtained by reverse geocoding based on geographical coordinates, using the latlng request parameter. In response, Google Maps returned the postal code for the field type postal_code and the name of the lowest possible level of territorial unit for the field administrative_area_level.
4. Population_Density_Data (.csv) - data for territorial units. The territorial units assigned to individual records were used to search the database of the Polish Postal Service using the authors' original web service written in Python. Records that contained a postal code were assigned the name of the corresponding municipality. Finally, postal codes and names of territorial units were compared with the database of Statistics Poland (GUS), which contains population density for individual municipalities, and the density was assigned to the existing records.
5. Roads_Incidents_Data (.json) - data collected by a web service, written in Python, that analysed the reported obstructions published on the website of the General Directorate for National Roads and Motorways. When a traffic obstruction occurred in Mazovia Province, the application could later associate it with the appropriate records, using the links parameters, based on the number and kilometre of the road on which it occurred. The data was collected from 26 May to 22 June 2022.
6. Weather_For_Roads_Data (.json) - data on the weather conditions on the roads during the days of the study. A web service written in Python retrieved selected items from the response returned by the www.timeanddate.com server for the corresponding input parameters: the geographical coordinates of the midpoint between the nodes of each road. The data was collected for the days between 27 May and 22 June 2022.
7. data_v_1 (.csv) - road parameters only
8. data_v_2 (.csv) - road parameters + population density
9. data_v_3 (.json) - road parameters + population density + traffic
10. data_v_4 (.json) - road parameters + population density + traffic + weather + road incidents
11. data_v_5 (.csv) - VALIDATED and cleaned data for road parameters + population density + traffic + weather + road incidents. At this stage, the road sections for which the traffic factor was judged to have been estimated incorrectly were eliminated. These were sections for which the traffic factor remained the same regardless of the time of day, or took only several repeated values during the whole study. Moreover, it was assumed that the final database should consist of road sections for which traffic factor values less than 1.2 constitute at least 10% of all results. Thus, the sections with no tendency to become congested and characterized by a small number of road traffic users were eliminated.

Good luck with your research!
Igor Betkier, PhD
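
The traffic factor and the cleaning checks described above can be sketched roughly as follows. This is an illustration only, not the author's code: the function names, the `min_distinct` cutoff and the sample readings are hypothetical, and the description does not state exactly how many repeated values led to a section being dropped.

```python
def traffic_factor(actual_seconds, historical_seconds):
    """Ratio of the actual travel time to the historical travel time."""
    if historical_seconds <= 0:
        raise ValueError("historical travel time must be positive")
    return actual_seconds / historical_seconds


def looks_misestimated(factors, min_distinct=3):
    """Flag a road section whose factor barely changes over the study.

    min_distinct is a hypothetical cutoff chosen for this sketch.
    """
    return len(set(factors)) < min_distinct


def share_below(factors, threshold=1.2):
    """Fraction of readings below the threshold (1.2 in the description)."""
    if not factors:
        raise ValueError("no readings for this road section")
    return sum(f < threshold for f in factors) / len(factors)


# Made-up readings for one road section, taken at different times of day.
readings = [traffic_factor(a, 600) for a in (600, 660, 780, 900)]
print(readings, looks_misestimated(readings), share_below(readings))
```

A section would then be kept or dropped by comparing `share_below` against the 10% figure given in the description; the comparison is left out here because it depends on how that rule is read.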

Network traffic datasets created by Single Flow Time Series Analysis

In the following table is a description of each dataset file:

File name	Detection problem	Citation of original raw dataset
botnet_binary.csv	Binary detection of botnet	S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
botnet_multiclass.csv	Multi-class classification of botnet	S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
cryptomining_design.csv	Binary detection of cryptomining; the design part	Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
cryptomining_evaluation.csv	Binary detection of cryptomining; the evaluation part	Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
dns_malware.csv	Binary detection of malware DNS	Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021.
doh_cic.csv	Binary detection of DoH	Mohammadreza MontazeriShatoori et al. Detection of doh tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020
doh_real_world.csv	Binary detection of DoH	Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022
dos.csv	Binary detection of DoS	Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019.
edge_iiot_binary.csv	Binary detection of IoT malware	Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
edge_iiot_multiclass.csv	Multi-class classification of IoT malware	Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
https_brute_force.csv	Binary detection of HTTPS Brute Force	Jan Luxemburk et al. HTTPS Brute-force dataset with extended network flows, November 2020
ids_cic_binary.csv	Binary detection of intrusion in IDS	Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp, 1:108–116, 2018.
ids_cic_multiclass.csv	Multi-class classification of intrusion in IDS	Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp, 1:108–116, 2018.
ids_unsw_nb_15_binary.csv	Binary detection of intrusion in IDS	Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
ids_unsw_nb_15_multiclass.csv	Multi-class classification of intrusion in IDS	Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
iot_23.csv	Binary detection of IoT malware	Sebastian Garcia et al. IoT-23: A labeled dataset with malicious and benign IoT network traffic, January 2020. More details here https://www.stratosphereips.org/datasets-iot23
ton_iot_binary.csv	Binary detection of IoT malware	Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021
ton_iot_multiclass.csv	Multi-class classification of IoT malware	Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021
tor_binary.csv	Binary detection of TOR	Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017.
tor_multiclass.csv	Multi-class classification of TOR	Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017.
vpn_iscx_binary.csv	Binary detection of VPN	Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related. In ICISSP, pages 407–414, 2016.
vpn_iscx_multiclass.csv	Multi-class classification of VPN	Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related. In ICISSP, pages 407–414, 2016.
vpn_vnat_binary.csv	Binary detection of VPN	Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022
vpn_vnat_multiclass.csv	Multi-class classification of VPN	Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022
