19 datasets found

Anonymized Internet Traces 2019
catalog.caida.org
Updated Jan 15, 2019
Cite
CAIDA (2019). Anonymized Internet Traces 2019 [Dataset]. https://catalog.caida.org/dataset/passive_2019_pcap
Explore at:
Dataset updated
Jan 15, 2019
Dataset authored and provided by
CAIDA
License
https://www.caida.org/about/legal/aua/
Time period covered
Jan 2019
Description
Packet headers (up to and including the transport layer) for the Anonymized Internet Traces 2019 Dataset. Derived from 10G traces on the Equinix NYC monitor.
Network traffic and code for machine learning classification
data.mendeley.com
Updated Feb 20, 2020
Cite
Víctor Labayen (2020). Network traffic and code for machine learning classification [Dataset]. http://doi.org/10.17632/5pmnkshffm.2
Explore at:
Unique identifier
https://doi.org/10.17632/5pmnkshffm.2
Dataset updated
Feb 20, 2020
Authors
Víctor Labayen
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset is a set of network traffic traces in pcap/csv format captured from a single user. The traffic is classified in 5 different activities (Video, Bulk, Idle, Web, and Interactive) and the label is shown in the filename. There is also a file (mapping.csv) with the mapping of the host's IP address, the csv/pcap filename and the activity label.

Activities:

Interactive: applications that perform real-time interactions in order to provide a suitable user experience, such as editing a file in Google Docs and remote CLI sessions over SSH.

Bulk data transfer: applications that transfer large files over the network. Examples are SCP/FTP applications and direct downloads of large files from web servers such as Mediafire, Dropbox, or the university repository, among others.

Web browsing: all the traffic generated while searching for and consuming different web pages. Examples of those pages are several blogs and news sites and the university's Moodle.

Video playback: traffic from applications that consume video in streaming or pseudo-streaming. The best-known servers used are Twitch and YouTube, but the university online classroom has also been used.

Idle behaviour: the background traffic generated by the user's computer when the user is idle. This traffic was captured with every application closed and with some pages open, such as Google Docs, YouTube, and several web pages, but always without user interaction.

The capture is performed in a network probe, attached to the router that forwards the user network traffic, using a SPAN port. The traffic is stored in pcap format with all the packet payload. In the csv file, every non TCP/UDP packet is filtered out, as well as every packet with no payload. The fields in the csv files are the following (one line per packet): Timestamp, protocol, payload size, IP address source and destination, UDP/TCP port source and destination. The fields are also included as a header in every csv file.
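As a rough illustration of how the per-packet CSV files described above could be consumed, the following sketch sums payload bytes per protocol. It assumes the columns appear in the order listed in the description and reads them by position, because the exact header names are not given here; the sample rows and header names are invented for the example.

```python
import csv
import io
from collections import Counter

def payload_bytes_by_protocol(csv_file):
    """Sum payload bytes per protocol from one per-packet CSV trace."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row that every file is said to carry
    totals = Counter()
    for row in reader:
        if len(row) < 7:
            continue  # ignore blank or malformed lines
        # Assumed column order, following the description:
        # timestamp, protocol, payload size, src IP, dst IP, src port, dst port
        protocol, payload_size = row[1], int(row[2])
        totals[protocol] += payload_size
    return totals

# Hypothetical sample; header names and values are made up for illustration.
sample = io.StringIO(
    "timestamp,protocol,payload_size,ip_src,ip_dst,port_src,port_dst\n"
    "1582196400.000100,TCP,1448,10.0.0.2,192.0.2.10,51234,443\n"
    "1582196400.000350,UDP,512,10.0.0.2,192.0.2.53,40000,53\n"
    "1582196400.000900,TCP,100,192.0.2.10,10.0.0.2,443,51234\n"
)
print(payload_bytes_by_protocol(sample))
```

For a real trace, the same function could be called on an open file object; the activity label would come from the filename or from mapping.csv.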

The amount of data is stated as follows:

Bulk: 19 traces, 3599 s of total duration, 8704 MBytes of pcap files
Video: 23 traces, 4496 s, 1405 MBytes
Web: 23 traces, 4203 s, 148 MBytes
Interactive: 42 traces, 8934 s, 30.5 MBytes
Idle: 52 traces, 6341 s, 0.69 MBytes
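To give a feel for how different the five classes are, the stated totals can be turned into approximate mean rates. This is only a back-of-the-envelope sketch: it divides pcap file size by capture duration, so the result includes pcap and header overhead and is not a measured throughput.

```python
# (traces, total seconds, total MBytes of pcap) per activity, as stated above
AMOUNTS = {
    "Bulk": (19, 3599, 8704),
    "Video": (23, 4496, 1405),
    "Web": (23, 4203, 148),
    "Interactive": (42, 8934, 30.5),
    "Idle": (52, 6341, 0.69),
}

def mean_rate_mbit_s(activity):
    """Approximate mean rate in Mbit/s: pcap volume over total duration."""
    _, seconds, mbytes = AMOUNTS[activity]
    return mbytes * 8 / seconds

for name in AMOUNTS:
    print(f"{name:12s} {mean_rate_mbit_s(name):10.4f} Mbit/s")

total_traces = sum(traces for traces, _, _ in AMOUNTS.values())
print("total traces:", total_traces)
```

The classes span several orders of magnitude in volume per second, which is worth keeping in mind when balancing training data across labels.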

The code of our machine learning approach is also included. There is a README.txt file with the documentation of how to use the code.
Passive Metadata
catalog.caida.org
Updated Feb 25, 2021
Cite
CAIDA (2021). Passive Metadata [Dataset]. https://catalog.caida.org/dataset/passive_metadata
Explore at:
Dataset updated
Feb 25, 2021
Dataset authored and provided by
CAIDA
License
https://www.caida.org/about/legal/aua/public_aua/
Time period covered
Mar 2008 - Jan 2019
Description
Metadata for all passive monthly traces, including the Chicago and San Jose monitors. This includes the files used to generate the public trace stats.
YouTube Dataset on Mobile Streaming for Internet Traffic Modeling, Network...
figshare.com
txt
Updated Apr 14, 2022
Cite
Frank Loh; Florian Wamser; Fabian Poignée; Stefan Geißler; Tobias Hoßfeld (2022). YouTube Dataset on Mobile Streaming for Internet Traffic Modeling, Network Management, and Streaming Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.19096823.v2
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.19096823.v2
Dataset updated
Apr 14, 2022
Dataset provided by
figshare
Authors
Frank Loh; Florian Wamser; Fabian Poignée; Stefan Geißler; Tobias Hoßfeld
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
YouTube
Description
Streaming is by far the predominant type of traffic in communication networks. With this public dataset, we provide 1,081 hours of time-synchronous video measurements at the network, transport, and application layers with the native YouTube streaming client on mobile devices. The dataset includes 80 network scenarios with 171 different individual bandwidth settings measured in 5,181 runs with limited bandwidth, 1,939 runs with emulated 3G/4G traces, and 4,022 runs with pre-defined bandwidth changes. This corresponds to 332 GB of video payload. We present the most relevant quality indicators for scientific use, i.e., initial playback delay, streaming video quality, adaptive video quality changes, video rebuffering events, and streaming phases.
Data from: Packet-level and IEEE 802.11 MAC frame-level Network Traffic...
data.mendeley.com
narcis.nl
Updated Jan 14, 2021
Cite
Rajarshi Roy Chowdhury (2021). Packet-level and IEEE 802.11 MAC frame-level Network Traffic Traces Data of the D-Link IoT devices [Dataset]. http://doi.org/10.17632/84cc8grtkt.1
Explore at:
Unique identifier
https://doi.org/10.17632/84cc8grtkt.1
Dataset updated
Jan 14, 2021
Authors
Rajarshi Roy Chowdhury
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset presents network traffic traces from 14 D-Link IoT devices of different types, including camera, network camera, smart plug, door-window sensor, and home hub. It consists of:

• Network packet traces (inbound and outbound traffic), and
• IEEE 802.11 MAC frame traces.

The experimental testbed was set up in the Network Systems and Signal Processing (NSSP) laboratory at Universiti Brunei Darussalam (UBD), with an access point running on a laptop, and collected all the network traffic traces from 9th September 2020 to 10th January 2021. The traces were captured by passively observing the Ethernet interface and the WiFi interface at the access point.

The packet traces capture data from typical communication protocols, such as TCP, UDP, IP, ICMP, ARP, DNS, SSDP, and TLS/SSL, which the IoT devices use to communicate over the Internet. The probe request frame traces (probe requests are a subtype of management frames) record the data that the IoT devices use to connect to the access point on the local area network.

The authors would like to thank the Faculty of Integrated Technologies, Universiti Brunei Darussalam, for the support to conduct this research experiment in the Network Systems and Signal Processing laboratory.
Data from: In-browser and network traffic based web response time...
ieee-dataport.org
Updated May 18, 2022
Cite
Carlos Lopez (2022). In-browser and network traffic based web response time measurements [Dataset]. https://ieee-dataport.org/open-access/browser-and-network-traffic-based-web-response-time-measurements
Explore at:
Dataset updated
May 18, 2022
Authors
Carlos Lopez
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
out of which 20 used plaintext HTTP browsing
Anonymized Two-Way Traffic Packet Header Traces 100G (5 sec) sampler
catalog.caida.org
Updated Feb 16, 2025
Cite
CAIDA (2025). Anonymized Two-Way Traffic Packet Header Traces 100G (5 sec) sampler [Dataset]. https://catalog.caida.org/dataset/passive_100g_sampler/cite
Explore at:
Dataset updated
Feb 16, 2025
Dataset authored and provided by
CAIDA
License
https://www.caida.org/about/legal/aua/
Time period covered
Nov 2024
Description
This dataset contains anonymized layer 1-4 packet headers of two-way passive traces captured on a 100 GB link between Los Angeles and San Jose. These data are useful for research on the characteristics of Internet traffic, including application breakdown, security events, geographic and topological distribution, flow volume and duration.

Passive 100G sampler is offered to researchers at commercial organizations when they request Anonymized Internet Traces. These data are part of the 2024 Anonymized Traces 100G dataset. The files consist of 5 second snapshots of a bidirectional capture taken in November 2024.
CAIDA Internet Traces 2016 Chicago - Dataset - LDM
service.tib.eu
Updated Nov 25, 2024
Cite
(2024). CAIDA Internet Traces 2016 Chicago - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/caida-internet-traces-2016-chicago
Explore at:
Dataset updated
Nov 25, 2024
Area covered
Chicago
Description
The traffic data is collected at a backbone link of a Tier1 ISP, aiming to estimate the number of packets for each network flow identified by IP addresses and application ports.
GTT23: A 2023 Dataset of Genuine Tor Traces
zenodo.org
Updated Jul 31, 2024
Cite
Rob Jansen; Ryan Wails; Aaron Johnson (2024). GTT23: A 2023 Dataset of Genuine Tor Traces [Dataset]. http://doi.org/10.5281/zenodo.10869889
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.10869889
Dataset updated
Jul 31, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Rob Jansen; Ryan Wails; Aaron Johnson
Time period covered
2023
Description
The GTT23 dataset contains network metadata of encrypted traffic measured from exit relays in the Tor network over a 13-week measurement period in 2023. The metadata is suitable for analyzing and evaluating website fingerprinting attacks and defenses.

Our dataset measurement process was designed to prioritize safety and privacy and was developed through consultation with the Tor Research Safety Board (TRSB, submission #37). Our TRSB interaction resulted in a “No Objections” score.

The measurement process, additional safety and ethical considerations, and a statistical analysis of the dataset are presented in further detail in the article "A Measurement of Genuine Tor Traces for Realistic Website Fingerprinting", arXiv:2404.07892 [cs.CR], https://doi.org/10.48550/arXiv.2404.07892.
Netflix
ieee-dataport.org
Updated Oct 1, 2021
Cite
Danil Shamsimukhametov (2021). Netflix [Dataset]. https://ieee-dataport.org/documents/youtube-netflix-web-dataset-encrypted-traffic-classification
Explore at:
Dataset updated
Oct 1, 2021
Authors
Danil Shamsimukhametov
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
YouTube
Description
YouTube flows
Trace-Share Dataset for Evaluation of Statistical Characteristics...
data.niaid.nih.gov
zenodo.org
Updated Jan 24, 2020
Cite
Madeja, Tomas (2020). Trace-Share Dataset for Evaluation of Statistical Characteristics Preservation [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_3553062
Explore at:
Dataset updated
Jan 24, 2020
Dataset provided by
Cermak, Milan
Madeja, Tomas
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains all data used during the evaluation of statistical characteristics preservation. Archives are protected by password "trace-share" to avoid false detection by antivirus software.

For more information, see the project repository at https://github.com/Trace-Share.

Selected Attack Traces

We selected 72 different traces of network attacks obtained from various internet databases. File names refer to common names of contained vulnerabilities, malware, or attack tools.

Background Traffic Data

The publicly available CSE-CIC-IDS-2018 dataset was used as background traffic data. The evaluation uses data from the day Thursday-01-03-2018, which contains a sufficient proportion of regular traffic without any statistically significant attacks. Only traffic aimed at victim machines (range 172.31.69.0/24) is used, to reduce less significant traffic.
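The restriction to the victim range can be expressed as a simple membership check on the destination address. The sketch below uses Python's standard `ipaddress` module; the function name is hypothetical, and how the dataset authors actually applied the filter is not stated here.

```python
import ipaddress

# Victim range quoted in the description above
VICTIM_NET = ipaddress.ip_network("172.31.69.0/24")

def is_aimed_at_victim(dst_ip: str) -> bool:
    """Return True if the destination address lies inside the victim range."""
    return ipaddress.ip_address(dst_ip) in VICTIM_NET

print(is_aimed_at_victim("172.31.69.25"))  # inside the /24
print(is_aimed_at_victim("172.31.70.25"))  # neighbouring /24, outside
```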

Evaluation Results and Dataset Structure

Traces variants (traces-normalized.zip, traces-adjusted.zip)

./traces-normalized/ — normalized PCAP files and details in YAML format;

./traces-adjusted/ — configuration files for traces combination in YAML format.

Computed statistics (statistics.zip)

./statistics-background/ — background traffic statistics computed by ID2T;

./statistics-combination/ — combined traces statistics computed by ID2T for all adjust options (selected only combinations where ID2T provided all statistics files);

./statistics-difference/ — computed mean and median differences of background and combined traffic traces.

Evaluation results

statistics-difference.ipynb — file containing visualization of statistics differences.
UCSD Real-time Network Telescope
catalog.caida.org
Updated May 17, 2018
Cite
CAIDA (2018). UCSD Real-time Network Telescope [Dataset]. https://catalog.caida.org/dataset/telescope_live
Explore at:
Dataset updated
May 17, 2018
Dataset authored and provided by
CAIDA
License
https://www.caida.org/about/legal/aua/
Description
The UCSD Network Telescope consists of a globally routed, but lightly utilized /9 and /10 network prefix, that is, 1/256th of the whole IPv4 address space. It contains few legitimate hosts; inbound traffic to non-existent machines, so-called Internet Background Radiation (IBR), is unsolicited and results from a wide range of events, including misconfiguration (e.g. mistyping an IP address), scanning of address space by attackers or malware looking for vulnerable targets, backscatter from randomly spoofed denial-of-service attacks, and the automated spread of malware. CAIDA continuously captures this anomalous traffic, discarding the legitimate traffic packets destined to the few reachable IP addresses in this prefix. We archive and aggregate these data, and provide this valuable resource to network security researchers. This dataset represents raw traffic traces captured by the Telescope instrumentation and made available in near-real time as one-hour-long compressed pcap files. We collect more than 3 TB of uncompressed IBR traffic trace data per day. The most recent 14 days of data are stored locally at CAIDA. Once data slides out of this near-real-time window, the pcap files are off-loaded to tape storage. Historical Telescope data starting from 2008 are available by additional request.
Amazon S3 cloud storage service data set
data.mendeley.com
narcis.nl
Updated Jan 21, 2017
Cite
Antonio Pescape' (2017). Amazon S3 cloud storage service data set [Dataset]. http://doi.org/10.17632/99kv5x8xhr.1
Explore at:
Unique identifier
https://doi.org/10.17632/99kv5x8xhr.1
Dataset updated
Jan 21, 2017
Authors
Antonio Pescape'
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains cloud network performance data related to the Amazon S3 storage service. The dataset refers to experimental campaigns conducted in May 2016. The dataset was collected leveraging 77 Bismark VPs, instructed as detailed in the following. Each VP performed repeated download cycles over 7 days. Each cycle is composed of 40 sequential download requests spaced out by 10 seconds and uniquely identified by a combination of factors, i.e. cloud region, file size, and storage class. Downloads within cycles are randomly scheduled and repeated from each VP every 2 hours. After every download, VPs run TCP-traceroute towards the IP address that served the request in order to trace the information related to the path and estimate the RTT to the S3 cloud datacenter (note that this information is not always available, due to the version of the firmware of the Bismark nodes and to the measurement tools available on them).
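The schedule described above implies a nominal upper bound on the number of download records. The arithmetic below is a sketch under the assumption that every VP completed every scheduled cycle for the full 7 days; the actual dataset may contain fewer records, for example because of node outages.

```python
VPS = 77                    # Bismark vantage points
DAYS = 7                    # campaign length per VP
CYCLE_PERIOD_HOURS = 2      # a cycle is repeated every 2 hours
DOWNLOADS_PER_CYCLE = 40    # sequential requests per cycle
SPACING_SECONDS = 10        # gap between consecutive requests

cycles_per_vp = DAYS * 24 // CYCLE_PERIOD_HOURS
downloads_per_vp = cycles_per_vp * DOWNLOADS_PER_CYCLE
nominal_total = downloads_per_vp * VPS
# Spacing alone accounts for this much of each cycle (download time excluded)
min_cycle_span_s = (DOWNLOADS_PER_CYCLE - 1) * SPACING_SECONDS

print(cycles_per_vp, downloads_per_vp, nominal_total, min_cycle_span_s)
```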

When referring to our Traffic Traces, please cite the following reference: Valerio Persico, Antonio Montieri, Antonio Pescapè: On the Network Performance of Amazon S3 Cloud-Storage Service. CloudNet 2016: 113-118
Anomaly Detection
data.mendeley.com
narcis.nl
Updated Jan 19, 2017
Cite
Antonio Pescape' (2017). Anomaly Detection [Dataset]. http://doi.org/10.17632/dkg3b6vz65.1
Explore at:
Unique identifier
https://doi.org/10.17632/dkg3b6vz65.1
Dataset updated
Jan 19, 2017
Authors
Antonio Pescape'
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Time Series for Anomaly Detection

The file is a Matlab data file. It contains 3 time series, representing the packet rate of 3 different traffic traces, related to inbound traffic of the UNINA Network. The traces were collected in year 2004. The packet rate was sampled with a period of 2 seconds and each trace lasts 2 hours. These data have been used for studies on volume-based anomaly detection and are related to time intervals during which no anomalies were observed on the UNINA network by the NOC operators. In other words, they can be considered anomaly-free.

When referring to our Anomaly Detection Dataset, please cite the following reference:

A. Dainotti, A. Pescapè, G. Ventre, "A cascade architecture for DoS attacks detection based on the wavelet transform", Journal of Computer Security, Volume 17, Number 6/2009, Pages 945-968.
Traces captured by visiting the top 1500 website
kaggle.com
Updated Aug 25, 2021
Cite
DNS_dataset (2021). Traces captured by visiting the top 1500 website [Dataset]. https://www.kaggle.com/datasets/jacksontang16/traces-captured-by-visiting-the-top-1500-website
Explore at:
Dataset updated
Aug 25, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
DNS_dataset
Description
Dataset

This dataset was created by DNS_dataset

Contents
UC Berkeley Home IP Web Traces
zenodo.org
application/gzip, bin
Updated Sep 9, 2020
Cite
Steven D. Gribble (2020). UC Berkeley Home IP Web Traces [Dataset]. http://doi.org/10.5281/zenodo.4020425
Explore at:
Available download formats: application/gzip, bin
Unique identifier
https://doi.org/10.5281/zenodo.4020425
Dataset updated
Sep 9, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Steven D. Gribble
Area covered
Berkeley
Description

This dataset consists of 18 days' worth of HTTP traces gathered from the Home IP service offered by UC Berkeley to its students, faculty, and staff. Home IP provides dial-up PPP/SLIP IP connectivity using 2.4 kb/s, 9.6 kb/s, 14.4 kb/s, or 28.8 kb/s wireline modems, or Metricom Ricochet (approximately 20-30 kb/s) wireless modems. These client traces were unobtrusively gathered through the use of a packet sniffing machine placed at the head-end of the Home IP modem bank; the tracing program used was a custom module written on top of the Internet Protocol Scanning Engine (IPSE) created by Ian Goldberg. Only traffic destined for port 80 was traced; all non-HTTP protocols and HTTP connections for other ports were excluded from these traces.

The traces contain the following information:

a total of 9,244,728 references spanning the period from Friday, November 1st, 1996 at 15:18:59 PST through Tuesday, November 19th, 1996 at 05:52:03 PST. 8,377 unique clients were seen in the traces.

the time at which the client made the request

the time at which the first byte of the server response was seen

the time at which the last byte of the server response was seen

the client IP address (suitably anonymized)

the client port

the server IP address (suitably anonymized)

the server port (always 80 for these traces)

the presence of the no-cache, keep-alive, cache-control, if-modified-since, and unless client headers.

the presence of the no-cache, cache-control, expires, and last-modified server headers.

the values of the client if-modified-since, the server expires, and the server last-modified headers, if present.

the length of the response HTTP header

the length of the response data

the request URL (suitably anonymized)

Format

For the sake of storage efficiency, the (gzipped) traces are stored in a binary representation. This archive of tools includes the following code to parse and manipulate the archives:

showtrace: this program will print out a human readable ASCII representation of what is in the traces. To use, type:
gzcat
Take a look at the source file showtrace.c to see how you can use logparse.[ch] to write code that parses and manipulates the traces. All times displayed are as reported by the gettimeofday() system call.

anon_clients: this is the program that we used to anonymize the traces. I include this program under the principle that the anonymization used is strong enough that distributing the anonymization code cannot help anybody break the anonymization.

timeconvert: a program that accepts a calendar time (i.e. time in seconds since the Epoch, as reported by showtrace and as used in the trace filenames) and outputs the local time corresponding to that calendar time.
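For readers without the original tools, the conversion that timeconvert performs can be approximated in a few lines. This is a sketch, not the tool itself: it converts to UTC rather than local time so that the result does not depend on the machine's time zone, and the function name is hypothetical.

```python
from datetime import datetime, timezone

def epoch_to_utc(seconds: int) -> datetime:
    """Convert seconds since the Epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# 848278028 is the seconds part of a timestamp in the showtrace sample output
print(epoch_to_utc(848278028).isoformat())  # 1996-11-18T00:47:08+00:00
```

The result falls inside the November 1996 collection window stated above, which is a quick sanity check that the timestamps are plain Unix epoch seconds.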

The showtrace tool will display lines in the following format:

848278028:829593 848278028:893670 848278028:895350 23.240.8.98:1462 207.36.205.194:80 2 8 4294967295 4294967295 835418853 170 844 37 GET 9168504434183313441..gif HTTP/1.0

848278028:829593 is the time at which the client made the request

848278028:893670 is the time at which the first byte of the server response was seen

848278028:895350 is the time at which the last byte of the server response was seen

23.240.8.98:1462 is the anonymized client IP address and the client port number

207.36.205.194:80 is the anonymized server IP address and the server port number

2 is the decimal representation of the client headers bitfield

8 is the decimal representation of the server headers bitfield

the first 4294967295 is the if-modified-since client header value (note that 4294967295 is 0xFFFFFFFF, which means this header value was not present for this entry)

the second 4294967295 is the expires server header value (again not present)

835418853 is the last-modified server header value

170 is the length of the HTTP response header

844 is the length of the response data

37 is the length of the anonymized request URL

"GET 9168504434183313441..gif HTTP/1.0" is the anonymized request URL.

The interpretation of the client and server header bitfields are as defined in the logparse.h header in the tools code.
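The field layout described above can be parsed with a short script. The sketch below is an illustration only, not part of the original tools: the class and function names are hypothetical, timestamps are kept as integer microseconds, and it assumes that an unknown timestamp is printed with 4294967295 in its seconds part.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

UNKNOWN = 0xFFFFFFFF  # showtrace prints 4294967295 for values that are absent

def _timestamp_us(field: str) -> Optional[int]:
    """Parse 'seconds:microseconds' into integer microseconds."""
    sec, usec = (int(part) for part in field.split(":"))
    return None if sec == UNKNOWN else sec * 1_000_000 + usec

def _optional(field: str) -> Optional[int]:
    value = int(field)
    return None if value == UNKNOWN else value

def _endpoint(field: str) -> Tuple[str, int]:
    ip, port = field.rsplit(":", 1)
    return ip, int(port)

@dataclass
class TraceEntry:
    request_us: Optional[int]
    first_byte_us: Optional[int]
    last_byte_us: Optional[int]
    client: Tuple[str, int]
    server: Tuple[str, int]
    client_flags: int   # bitfield, meaning defined in logparse.h
    server_flags: int   # bitfield, meaning defined in logparse.h
    if_modified_since: Optional[int]
    expires: Optional[int]
    last_modified: Optional[int]
    header_len: int
    data_len: int
    url_len: int
    url: str

def parse_showtrace_line(line: str) -> TraceEntry:
    # 13 whitespace-separated fields, then the anonymized request URL
    f = line.strip().split(None, 13)
    return TraceEntry(
        _timestamp_us(f[0]), _timestamp_us(f[1]), _timestamp_us(f[2]),
        _endpoint(f[3]), _endpoint(f[4]),
        int(f[5]), int(f[6]),
        _optional(f[7]), _optional(f[8]), _optional(f[9]),
        int(f[10]), int(f[11]), int(f[12]), f[13],
    )

EXAMPLE = ("848278028:829593 848278028:893670 848278028:895350 "
           "23.240.8.98:1462 207.36.205.194:80 2 8 4294967295 4294967295 "
           "835418853 170 844 37 GET 9168504434183313441..gif HTTP/1.0")
entry = parse_showtrace_line(EXAMPLE)
print(entry.first_byte_us - entry.request_us, "us to first byte")
```

Keeping timestamps as integers avoids floating-point rounding when subtracting two values that differ only in their microsecond part.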

The tools code has been tested on both Linux and Solaris. The provided Makefile assumes Solaris - you may have to play with the LIBS definition for other platforms. HPUX is a mess; I didn't even try, but it should be possible to get these tools to work with little effort. If you do, please let me know what you did so that I can make your changes available to the world.

Measurement

The Home IP population gains IP connectivity using PPP or SLIP across their 2.4 kb/s, 9.6 kb/s, 14.4kb/s or 28.8kb/s wireline modem, or their (approximately) 20-30kb/s wireless Metricom Ricochet modem. There are a total of roughly 600 modems available via the Home IP bank. All traffic from these modems ends up feeding over a single 10Mb/s shared Ethernet segment, on which we placed a network monitoring computer (a Pentium Pro 200Mhz running Linux 2.0.27). The monitor was running the IPSE user-level packet scanning engine and a custom-written HTTP module that reconstructed HTTP connections from the gathered IP packets on-the-fly and emitted an unanonymized trace file. Each trace file was then anonymized and transmitted to our research workstations for further postprocessing and analysis.

The trace gathering engine was brought down and restarted approximately every 4 hours (for administrative and address-space-growth reasons). This implies that there are two weaknesses in these traces that you should be aware of:

any connection active when the engine was brought down will have a possibly incorrect timestamp for the last byte seen from the server, and a possibly incorrect reported size. We estimate that no more than 150 such entries (out of roughly 90000-100000) are misreported for each 4 hour period.

any connection that was forged in the very small time window (about 300 milliseconds) between when the engine was shut down and restarted will not appear in the logs. We estimate that no more than 30 such drops occur for each 4 hour period.

The packet capture tool reported no packet drops. Considering that a Pentium Pro 200MHz was used to capture the traces on a 10 Mb/s Ethernet segment, it is virtually certain that no trace drops besides those mentioned above occurred. There may be periods of uncharacteristically low activity in the traces - these correspond to network outages from Berkeley's ISP, rather than trace failures.

The traces do contain entries for requests issued by the client but that weren't completed (because, for instance, the user pressed the STOP button and the TCP connection was shut down before the request completed). Unknown timestamps in the traces contain the value 0xFFFFFFFF (reported by showtrace as 4294967295), and incomplete requests contain header and data length values that report as much header/data as was seen.

The trace data is sorted by completion time (i.e. the time at which the last byte of the server response was seen, or the time at which the connection was dropped). However, because of inaccuracies and apparent time travel in the Linux system clock, some trace entries appear slightly out of order.

All timestamps within the traces are as reported by the gettimeofday() system call, so these timestamps ostensibly have microsecond resolution.

Privacy

To maintain the privacy of each individual Home IP user, we have stripped identity information out of the traces through a post-processing phase. Because it is trivial to identify a user based solely on the pages that the user has visited, we were forced to anonymize the URL and destination IP address of each web request as well as the source IP address. All anonymization was done using a keyed MD5 hash of the data (32 bits for client and server IP addresses, 64 bits for URLs). We ourselves do not know the key used to salt the MD5 hash, so don't bother asking us for it. Similarly, don't bother asking us for unanonymized traces.
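The general idea of a keyed, truncated hash can be illustrated as follows. This is a sketch under stated assumptions and not the anon_clients code: how the key is combined with the data and which bytes of the digest are kept are not described here, so prepending the key and keeping the leading bytes are choices made purely for illustration.

```python
import hashlib

def keyed_md5_truncated(key: bytes, data: bytes, bits: int) -> int:
    """Hash key+data with MD5 and keep only the leading `bits` bits."""
    digest = hashlib.md5(key + data).digest()  # assumption: key is prepended
    return int.from_bytes(digest[: bits // 8], "big")

secret = b"hypothetical-key"  # the real key is unknown even to the authors
ip_token = keyed_md5_truncated(secret, b"192.0.2.7", 32)
url_token = keyed_md5_truncated(secret, b"http://example.org/a.gif", 64)
print(ip_token, url_token)
```

The same input always maps to the same token under a given key, which is what lets repeated clients and URLs be recognized in the anonymized traces without revealing them.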

In order to preserve some information about the URLs, the post-processed URLs have the following format:

COMMAND URLHASH.[flags][.suffix] [HTTPVERS]

where:

COMMAND is one of GET, HEAD, POST, or PUT,

URLHASH is the string representation of the 64-bit MD5 hash of the URL,

flags contains the character q to indicate that a question mark was seen in the URL, and the character c to indicate that the string CGI or cgi was seen in the URL,

suffix is the filename suffix, if present, and

HTTPVERS is the HTTP version field of the HTTP command issued by the client,
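The post-processed URL format above lends itself to a regular expression. The following sketch splits an anonymized request into its parts; the pattern and function name are hypothetical, and it assumes that flags consist only of the characters q and c and that the hash itself contains no dot.

```python
import re
from typing import Optional

# COMMAND URLHASH.[flags][.suffix] [HTTPVERS]
ANON_URL_RE = re.compile(
    r"^(?P<command>GET|HEAD|POST|PUT) "
    r"(?P<urlhash>[^.\s]+)\."
    r"(?P<flags>[qc]*)"
    r"(?:\.(?P<suffix>\S+))?"
    r"(?: (?P<httpvers>\S+))?$"
)

def parse_anonymized_url(text: str) -> Optional[dict]:
    """Return the named parts of an anonymized request, or None if no match."""
    match = ANON_URL_RE.match(text)
    return match.groupdict() if match else None

print(parse_anonymized_url("GET 9168504434183313441..gif HTTP/1.0"))
```

Optional parts that are absent come back as None, so a request without a suffix or HTTP version can be told apart from one with an empty flags field.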
All algorithm throughput values (in Gbps) using Defcon and web traffic...
plos.figshare.com
xls
Updated Jun 5, 2023
Cite
Chun-Liang Lee; Yi-Shan Lin; Yaw-Chung Chen (2023). All algorithm throughput values (in Gbps) using Defcon and web traffic traces. [Dataset]. http://doi.org/10.1371/journal.pone.0139301.t009
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0139301.t009
Dataset updated
Jun 5, 2023
Dataset provided by
PLOS ONE
Authors
Chun-Liang Lee; Yi-Shan Lin; Yaw-Chung Chen
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
All algorithm throughput values (in Gbps) using Defcon and web traffic traces.
Backscatter-2004-2005
impactcybertrust.org
Updated May 26, 2004
Cite
UCSD - Center for Applied Internet Data Analysis (2004). Backscatter-2004-2005 [Dataset]. http://doi.org/10.23721/107/1353898
Explore at:
Unique identifier
https://doi.org/10.23721/107/1353898
Dataset updated
May 26, 2004
Authors
UCSD - Center for Applied Internet Data Analysis
Time period covered
May 26, 2004 - Dec 1, 2005
Description
This backscatter from victims was collected by the UCSD Network Telescope.
Quarterly data collection took place for one week in May, August and
November in 2004, and February, May, August and November in 2005. Possible
uses of this data include modeling DoS attacks, understanding victim
populations, and using real packet traces to validate algorithms for
detecting or classifying malicious traffic. This last use is particularly
valuable because it is extremely challenging to artificially generate the
kind of real-world noise present on the Internet.
Two-Days-in-2008 Telescope Dataset
impactcybertrust.org
Updated Nov 12, 2008
Cite
UCSD - Center for Applied Internet Data Analysis (2008). Two-Days-in-2008 Telescope Dataset [Dataset]. http://doi.org/10.23721/107/1353895
Explore at:
Unique identifier
https://doi.org/10.23721/107/1353895
Dataset updated
Nov 12, 2008
Authors
UCSD - Center for Applied Internet Data Analysis
Time period covered
Nov 12, 2008 - Nov 19, 2008
Description
This dataset contains two full days of trace data from the UCSD Network Telescope:
2008-11-12 and 2008-11-19. These dates precede our detection of the Conficker A Worm
on 2008-11-21. The dataset consists of 48 compressed pcap files each containing one
hour of traffic observed by the Network Telescope.