100+ datasets found
  1. Leading social networks used for news in the U.S. 2019-2025

    • statista.com
    Updated Jul 4, 2025
    Cite
    Statista (2025). Leading social networks used for news in the U.S. 2019-2025 [Dataset]. https://www.statista.com/statistics/444708/social-networks-used-for-news-usa/
    Explore at:
    Dataset updated
    Jul 4, 2025
    Dataset authored and provided by
    Statista http://statista.com/
    Area covered
    United States
    Description

    In 2025, Facebook remained the most-used social platform for news in the United States, with ** percent of respondents reporting they accessed news on it. YouTube followed closely at ** percent, recording a slight increase from the previous year. X (formerly Twitter) saw the most notable growth, rising by ***** percent to ** percent.

  2. Network Traffic Dataset

    • kaggle.com
    zip
    Updated Oct 31, 2023
    Cite
    Ravikumar Gattu (2023). Network Traffic Dataset [Dataset]. https://www.kaggle.com/datasets/ravikumargattu/network-traffic-dataset/code
    Explore at:
    Available download formats: zip (6783827 bytes)
    Dataset updated
    Oct 31, 2023
    Authors
    Ravikumar Gattu
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    The data presented here was obtained on a Kali machine at the University of Cincinnati, Cincinnati, Ohio, by carrying out packet captures for 1 hour during the evening of Oct 9th, 2023 using Wireshark. The dataset consists of 394137 instances stored in a CSV (Comma Separated Values) file. This large dataset could be used for different machine learning applications, for instance classification of network traffic, network performance monitoring, network security management, network traffic management, network intrusion detection, and anomaly detection.

    The dataset can be used for a variety of machine learning tasks, such as network intrusion detection, traffic classification, and anomaly detection.

    Content :

    This network traffic dataset consists of 7 features. Each instance contains the source and destination IP addresses. The majority of the properties are numeric in nature; however, there are also nominal and date types due to the Timestamp.

    The network traffic flow statistics (No. Time Source Destination Protocol Length Info) were obtained using Wireshark (https://www.wireshark.org/).

    Dataset Columns:

    • No: Number of the instance
    • Timestamp: Timestamp of the instance of network traffic
    • Source IP: IP address of the source
    • Destination IP: IP address of the destination
    • Protocol: Protocol used by the instance
    • Length: Length of the instance
    • Info: Information about the traffic instance
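    As a minimal sketch of how such a file might be summarised, the snippet below counts packets and bytes per protocol with Python's standard csv module. The header row follows the Wireshark column names quoted in the description (No., Time, Source, Destination, Protocol, Length, Info); whether the published CSV uses exactly this header is an assumption, and the three sample rows are invented for illustration only.

```python
import csv
import io
from collections import Counter

# Illustrative rows only (NOT taken from the dataset). The header is an
# assumption based on the Wireshark column names quoted in the description.
SAMPLE_CSV = '''"No.","Time","Source","Destination","Protocol","Length","Info"
"1","0.000000","10.0.0.5","10.0.0.1","DNS","74","Standard query A example.org"
"2","0.000412","10.0.0.1","10.0.0.5","DNS","90","Standard query response"
"3","0.010233","10.0.0.5","93.184.216.34","TCP","66","49152 > 443 [SYN]"
'''


def summarize(csv_text):
    """Return (packets per protocol, total bytes per protocol)."""
    packets = Counter()
    total_bytes = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        packets[row["Protocol"]] += 1
        total_bytes[row["Protocol"]] += int(row["Length"])
    return packets, total_bytes


packets, total_bytes = summarize(SAMPLE_CSV)
for protocol, count in packets.most_common():
    print(protocol, count, total_bytes[protocol])
```

    For the real file, the in-memory string would be replaced by an open file handle; the per-protocol counts are a typical first check before any classification or anomaly-detection work.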

    Acknowledgements :

    I would like to thank the University of Cincinnati for providing the infrastructure used to generate this network traffic dataset.

    Ravikumar Gattu , Susmitha Choppadandi

    Inspiration: This dataset goes beyond the majority of network traffic classification datasets, which only identify the type of application (WWW, DNS, ICMP, ARP, RARP) that an IP flow contains. Instead, it is intended for building machine learning models that can identify specific applications (like TikTok, Wikipedia, Instagram, YouTube, websites, blogs, etc.) from IP flow statistics (there are currently 25 applications in total).

    Dataset License: CC0: Public Domain

    Dataset Usages: This dataset can be used for different machine learning applications in the field of cybersecurity, such as classification of network traffic, network performance monitoring, network security management, network traffic management, network intrusion detection, and anomaly detection.

    ML techniques that benefit from this dataset:

    This dataset is highly useful because it consists of 394137 instances of network traffic data obtained by using the 25 applications on public, private, and enterprise networks. The dataset also contains important features that can be used for most applications of machine learning in cybersecurity. A few of the potential machine learning applications that could benefit from this dataset are:

    1. Network Performance Monitoring: This large network traffic dataset can be used to analyse network traffic and identify patterns in the network. This helps in designing network security algorithms that minimise network problems.

    2. Anomaly Detection: This large network traffic dataset can be used to train machine learning models to find irregularities in the traffic, which could help identify cyber attacks.

    3. Network Intrusion Detection: This large dataset could be used to train machine learning algorithms and design models for detecting traffic issues, malicious traffic, network attacks, and DoS attacks.

  3. Number of global social network users 2017-2028

    • statista.com
    • de.statista.com
    + more versions
    Cite
    Stacy Jo Dixon, Number of global social network users 2017-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista http://statista.com/
    Authors
    Stacy Jo Dixon
    Description

    How many people use social media?

    Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.

    Who uses social media?

    Social networking is one of the most popular digital activities worldwide and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as lesser developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.

    How much time do people spend on social media?

    Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. On average, internet users in Latin America had the highest average time spent per day on social media.

    What are the most popular social media platforms?

    Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.
    
  4. Network Digital Twin-Generated Dataset for Machine Learning-Based Detection...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Nov 13, 2024
    Cite
    Amit Karamchandani Batra; Javier Nuñez Fuente; Luis de la Cal García; Yenny Moreno Meneses; Alberto Mozo Velasco; Antonio Pastor Perales; Diego R. López (2024). Network Digital Twin-Generated Dataset for Machine Learning-Based Detection of Benign and Malicious Heavy Hitter Flows [Dataset]. http://doi.org/10.5281/zenodo.14134646
    Explore at:
    Available download formats: bin
    Dataset updated
    Nov 13, 2024
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Amit Karamchandani Batra; Javier Nuñez Fuente; Luis de la Cal García; Yenny Moreno Meneses; Alberto Mozo Velasco; Antonio Pastor Perales; Diego R. López
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jul 11, 2024
    Description

    The dataset used in this study is publicly available for research purposes. If you are using this dataset, please cite the following paper, which outlines the complete details of the dataset and the methodology used for its generation:

    Amit Karamchandani, Javier Núñez, Luis de-la-Cal, Yenny Moreno, Alberto Mozo, Antonio Pastor, "On the Applicability of Network Digital Twins in Generating Synthetic Data for Heavy Hitter Discrimination," under submission.

    This is a synthetic dataset generated to differentiate between benign and malicious heavy hitter (HH) flows within complex network environments. Heavy hitter flows, which include high-volume data transfers, can significantly impact network performance, leading to congestion and degraded quality of service. Distinguishing legitimate heavy hitter activity from malicious Distributed Denial-of-Service traffic is critical for network management and security, yet existing datasets lack the granularity needed for training machine learning models to effectively make this distinction.

    To address this, a Network Digital Twin (NDT) approach was utilized to emulate realistic network conditions and traffic patterns, enabling automated generation of labeled data for both benign and malicious HH flows alongside regular traffic.

    The feature set includes flow statistics commonly used in network analysis, such as:

    • Traffic protocol type,
    • Flow duration (the time between the initial and final packet in both directions),
    • Total count of payload packets transmitted in both directions,
    • Cumulative bytes transmitted in both directions,
    • Time discrepancy between the first packet observations at the source and destination,
    • Packet and byte transmission rates per second within each interval, and
    • Total packet and byte counts within each interval in both directions.
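    To make these flow statistics concrete, here is a minimal sketch that derives a few of them from a list of packets. The Packet record, its field names, and the sample values are hypothetical and are not the dataset's schema; the rates are also computed over the whole flow, whereas the dataset reports them per interval.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical packet record; not the dataset's actual schema."""
    timestamp: float  # seconds since the start of the capture
    size: int         # bytes on the wire
    forward: bool     # True for source -> destination, False for the reverse


def flow_features(packets):
    """Compute duration, totals, and whole-flow mean rates for one flow."""
    if not packets:
        raise ValueError("flow has no packets")
    times = [p.timestamp for p in packets]
    duration = max(times) - min(times)
    total_packets = len(packets)
    total_bytes = sum(p.size for p in packets)
    # A single-packet flow has zero duration, so report zero rates for it.
    packets_per_s = total_packets / duration if duration > 0 else 0.0
    bytes_per_s = total_bytes / duration if duration > 0 else 0.0
    return {
        "duration_s": duration,
        "total_packets": total_packets,
        "total_bytes": total_bytes,
        "packets_per_s": packets_per_s,
        "bytes_per_s": bytes_per_s,
    }


sample_flow = [
    Packet(timestamp=0.0, size=100, forward=True),
    Packet(timestamp=0.5, size=200, forward=False),
    Packet(timestamp=2.0, size=300, forward=True),
]
features = flow_features(sample_flow)
print(features)
```

    A per-interval version would group packets into fixed time windows first and apply the same totals and rates to each window.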
  5. Data from: Network Cards: concise, readable summaries of network data

    • figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    James Bagrow; Yong-Yeol Ahn (2023). Network Cards: concise, readable summaries of network data [Dataset]. http://doi.org/10.6084/m9.figshare.20286648.v2
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    James Bagrow; Yong-Yeol Ahn
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Network datasets used as examples for network cards.

  6. Data from: Estimation of Global Network Statistics from Incomplete Data

    • figshare.com
    bin
    Updated May 31, 2023
    Cite
    Catherine Bliss (2023). Estimation of Global Network Statistics from Incomplete Data [Dataset]. http://doi.org/10.6084/m9.figshare.1152811.v2
    Explore at:
    Available download formats: bin
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    Catherine Bliss
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Twitter Reply Networks

    We apply our techniques to Twitter reply networks. These networks are constructed from tweets we collected via Twitter gardenhose API service between September 9, 1998 and November 17, 1998. Each network is weighted and directed, whereby entries in the (i,j) cell of the adjacency matrix represent the number of replies directed from node i to node j. Note that there is no correlation between node labels from week to week. For example, the individual represented by node 1 in Week 1 is not the same individual represented by node 1 in Weeks 2, 3 and so forth. Each network is presented as a Matlab (.mat) file.

    Please cite as: Bliss, C. A., Danforth, C. M. & P. S. Dodds. (2014). Estimation of Global Network Statistics from Incomplete Data. PLOSONE (Accepted).

    For additional data, see: http://www.uvm.edu/~storylab/share/papers/bliss2014a/data.html
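    The weighted, directed adjacency matrix described here can be illustrated with a short sketch. The function name and the reply list are invented for illustration; the actual networks ship as Matlab files and are not reproduced here.

```python
def reply_adjacency(replies, n_nodes):
    """Build a weighted, directed adjacency matrix from (source, target) pairs.

    Cell [i][j] holds the number of replies directed from node i to node j.
    """
    matrix = [[0] * n_nodes for _ in range(n_nodes)]
    for source, target in replies:
        matrix[source][target] += 1
    return matrix


# Hypothetical replies: node 0 replies to node 1 twice, node 1 replies to
# node 0 once, and node 2 replies to node 1 once.
replies = [(0, 1), (0, 1), (1, 0), (2, 1)]
adjacency = reply_adjacency(replies, n_nodes=3)

for row in adjacency:
    print(row)
```

    Because the network is directed, the matrix is generally not symmetric: in this example cell (0, 1) is 2 while cell (1, 0) is 1.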

  7. Most used social networks 2025, by number of users

    • statista.com
    • abripper.com
    • +2more
    Updated Oct 16, 2025
    + more versions
    Cite
    Statista (2025). Most used social networks 2025, by number of users [Dataset]. https://www.statista.com/statistics/272014/global-social-networks-ranked-by-number-of-users/
    Explore at:
    Dataset updated
    Oct 16, 2025
    Dataset authored and provided by
    Statista http://statista.com/
    Area covered
    Worldwide
    Description

    Market leader Facebook was the first social network to surpass one billion registered accounts and currently sits at more than three billion monthly active users. Meta Platforms owns four of the biggest social media platforms, all with more than one billion monthly active users each: Facebook (core platform), WhatsApp, Messenger, and Instagram. In the third quarter of 2023, Facebook reported around four billion monthly core Family product users.

    The United States and China account for the most high-profile social platforms

    Most top-ranked social networks with more than 100 million users originated in the United States, but services like Chinese social networks WeChat, QQ, or video-sharing app Douyin have also garnered mainstream appeal in their respective regions due to local context and content. Douyin’s popularity has led to the platform releasing an international version of its network, TikTok.

    How many people use social media?

    The leading social networks are usually available in multiple languages and enable users to connect with friends or people across geographical, political, or economic borders. In 2025, social networking sites are estimated to reach 5.44 billion users, and these figures are still expected to grow as mobile device usage and mobile social networks increasingly gain traction in previously underserved markets.

  8. Street Network Database SND

    • catalog.data.gov
    • data.seattle.gov
    • +2more
    Updated Oct 4, 2025
    + more versions
    Cite
    City of Seattle ArcGIS Online (2025). Street Network Database SND [Dataset]. https://catalog.data.gov/dataset/street-network-database-snd-1712b
    Explore at:
    Dataset updated
    Oct 4, 2025
    Dataset provided by
    City of Seattle ArcGIS Online
    Description

    The pathway representation consists of segments and intersection elements. A segment is a linear graphic element that represents a continuous physical travel path terminated by path end (dead end) or physical intersection with other travel paths. Segments have one street name, one address range and one set of segment characteristics. A segment may have none or multiple alias street names. Segment types included are Freeways, Highways, Streets, Alleys (named only), Railroads, Walkways, and Bike lanes.

    SNDSEG_PV is a linear feature class representing the SND Segment Feature, with attributes for Street name, Address Range, Alias Street name and segment Characteristics objects. Part of the Address Range and all of Street name objects are logically shared with the Discrete Address Point-Master Address File layer.

    Appropriate uses include:

    • Cartography - Used to depict the City's transportation network location and connections, typically on smaller scaled maps or images where a single line representation is appropriate. Used to depict specific classifications of roadway use, also typically at smaller scales. Used to label transportation network feature names typically on larger scaled maps. Used to label address ranges with associated transportation network features typically on larger scaled maps.
    • Geocode reference - Used as a source for derived reference data for address validation and theoretical address location.
    • Address Range data repository - This data store is the City's address range repository defining address ranges in association with transportation network features.
    • Polygon boundary reference - Used to define various area boundaries in other feature classes where coincident with the transportation network. Does not contain polygon features.
    • Address based extracts - Used to create flat-file extracts typically indexed by address with reference to business data typically associated with transportation network features.
    • Thematic linear location reference - By providing unique, stable identifiers for each linear feature, thematic data is associated to specific transportation network features via these identifiers.
    • Thematic intersection location reference - By providing unique, stable identifiers for each intersection feature, thematic data is associated to specific transportation network features via these identifiers.
    • Network route tracing - Used as source for derived reference data used to determine point to point travel paths or determine optimal stop allocation along a travel path.
    • Topological connections with segments - Used to provide a specific definition of location for each transportation network feature. Also provides a specific definition of connection between each transportation network feature (defines where the streets are and the relationship between them, i.e. 4th Ave is west of 5th Ave and 4th Ave does intersect with Cherry St).
    • Event location reference - Used as source for derived reference data used to locate event and linear referencing.

    Data source is TRANSPO.SNDSEG_PV. Updated weekly.

  9. Training dataset used in the magazine paper entitled "A Flexible Machine...

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    • +1more
    Updated Jan 24, 2020
    Cite
    Francisco Wilhelmi (2020). Training dataset used in the magazine paper entitled "A Flexible Machine Learning-Aware Architecture for Future WLANs" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3626690
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Universitat Pompeu Fabra
    Authors
    Francisco Wilhelmi
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A Flexible Machine Learning-Aware Architecture for Future WLANs

    Authors: Francesc Wilhelmi, Sergio Barrachina-Muñoz, Boris Bellalta, Cristina Cano, Anders Jonsson & Vishnu Ram.

    Abstract: Lots of hopes have been placed in Machine Learning (ML) as a key enabler of future wireless networks. By taking advantage of the large volumes of data generated by networks, ML is expected to deal with the ever-increasing complexity of networking problems. Unfortunately, current networking systems are not yet prepared for supporting the ensuing requirements of ML-based applications, especially for enabling procedures related to data collection, processing, and output distribution. This article points out the architectural requirements that are needed to pervasively include ML as part of future wireless networks operation. To this aim, we propose to adopt the International Telecommunications Union (ITU) unified architecture for 5G and beyond. Specifically, we look into Wireless Local Area Networks (WLANs), which, due to their nature, can be found in multiple forms, ranging from cloud-based to edge-computing-like deployments. Based on ITU's architecture, we provide insights on the main requirements and the major challenges of introducing ML to the multiple modalities of WLANs.

    Dataset description: This is the dataset generated for training a Neural Network (NN) in the Access Point (AP) (re)association problem in IEEE 802.11 Wireless Local Area Networks (WLANs).

    In particular, the NN is meant to output a prediction function of the throughput that a given station (STA) can obtain from a given Access Point (AP) after association. The features included in the dataset are:

    Identifier of the AP to which the STA has been associated.

    RSSI obtained from the AP to which the STA has been associated.

    Data rate in bits per second (bps) that the STA is allowed to use for the selected AP.

    Load in packets per second (pkt/s) that the STA generates.

    Percentage of data that the AP is able to serve before the user association is done.

    Amount of traffic load in pkt/s handled by the AP before the user association is done.

    Airtime in % that the AP enjoys before the user association is done.

    Throughput in pkt/s that the STA receives after the user association is done.

    The dataset has been generated through random simulations, based on the model provided in https://github.com/toniadame/WiFi_AP_Selection_Framework. More details regarding the dataset generation have been provided in https://github.com/fwilhelmi/machine_learning_aware_architecture_wlans.
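    As a rough sketch of how one record might be organised for training, the snippet below separates the seven input features from the throughput target. The class name, field names, and sample values are hypothetical; the actual file layout is documented in the repositories linked above.

```python
from dataclasses import dataclass, astuple


@dataclass
class AssociationSample:
    """Hypothetical record mirroring the eight fields listed above."""
    ap_id: int                 # identifier of the associated AP
    rssi: float                # RSSI obtained from that AP
    data_rate_bps: float       # data rate the STA may use, in bps
    sta_load_pkts: float       # load generated by the STA, in pkt/s
    ap_delivery_pct: float     # share of data the AP served before association
    ap_load_pkts: float        # AP traffic load before association, in pkt/s
    ap_airtime_pct: float      # AP airtime before association, in %
    throughput_pkts: float     # target: STA throughput after association


def split_features_target(sample):
    """Return (input features, target) for one sample."""
    values = astuple(sample)
    return list(values[:-1]), values[-1]


# Made-up values, for illustration only.
sample = AssociationSample(3, -62.0, 54e6, 120.0, 95.0, 800.0, 40.0, 118.5)
features, target = split_features_target(sample)
print(len(features), target)
```

    Keeping the target as the last field makes the split trivial; a neural network would then be fit on the feature vectors to predict the throughput value.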

  10. NDN Attack Traffic: Tree & DFN

    • kaggle.com
    zip
    Updated Oct 16, 2025
    Cite
    Muhammad Raga Titipan (2025). NDN Attack Traffic: Tree & DFN [Dataset]. https://www.kaggle.com/datasets/ragatitipan/ndn-attack-traffic-tree-and-dfn
    Explore at:
    Available download formats: zip (32733333 bytes)
    Dataset updated
    Oct 16, 2025
    Authors
    Muhammad Raga Titipan
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Description

    This dataset contains comprehensive network traffic data captured during simulated attacks on Named Data Networking (NDN) environments across two distinct network topologies: Tree and DFN (Deutsches ForschungsNetz). All data was generated through controlled experiments using miniNDN simulation on Ubuntu.

    Dataset Overview

    Named Data Networking (NDN) represents a future internet architecture that focuses on content retrieval rather than host-to-host communication. As this architecture gains traction, understanding its security vulnerabilities becomes increasingly important. This dataset provides researchers with real traffic patterns observed during various attack scenarios on NDN networks.

    The dataset captures traffic parameters across:

    1. Tree Topology: A hierarchical network structure commonly used in organizational networks
    2. DFN Topology: Based on the German Research Network topology, representing a more complex, real-world network configuration

    Data Collection Methodology

    All data was systematically collected through:

    1. Setting up miniNDN environments on Ubuntu
    2. Configuring both Tree and DFN network topologies
    3. Executing controlled attack scenarios
    4. Capturing comprehensive network traffic parameters
    5. Labeling data with attack types and relevant metadata

    Features

    The dataset includes essential NDN traffic parameters:

    1. Packet type (Interest, Data, Nack)
    2. Node and interface identifiers
    3. Packet size and hop count metrics
    4. Interest lifetime values
    5. Content Store, PIT, and FIB entries
    6. Attack classification labels
    7. Topology identifiers

    Applications

    This dataset is valuable for:

    • Developing NDN-specific intrusion detection systems
    • Comparing attack propagation across different network architectures
    • Training machine learning models for attack detection
    • Benchmarking security solutions for content-centric networks
    • Understanding how topology affects security vulnerability
  11. Network traffic for machine learning classification

    • data.mendeley.com
    Updated Feb 12, 2020
    + more versions
    Cite
    Víctor Labayen Guembe (2020). Network traffic for machine learning classification [Dataset]. http://doi.org/10.17632/5pmnkshffm.1
    Explore at:
    Dataset updated
    Feb 12, 2020
    Authors
    Víctor Labayen Guembe
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset is a set of network traffic traces in pcap/csv format captured from a single user. The traffic is classified in 5 different activities (Video, Bulk, Idle, Web, and Interactive) and the label is shown in the filename. There is also a file (mapping.csv) with the mapping of the host's IP address, the csv/pcap filename and the activity label.

    Activities:

    • Interactive: applications that perform real-time interactions in order to provide a suitable user experience, such as editing a file in Google Docs and remote CLI sessions over SSH.
    • Bulk data transfer: applications that transfer large files over the network. Some examples are SCP/FTP applications and direct downloads of large files from web servers like Mediafire, Dropbox or the university repository, among others.
    • Web browsing: contains all the traffic generated while searching and consuming different web pages. Examples of those pages are several blogs and news sites and the Moodle of the university.
    • Video playback: contains traffic from applications that consume video in streaming or pseudo-streaming. The best-known servers used are Twitch and YouTube, but the university online classroom has also been used.
    • Idle behaviour: is composed of the background traffic generated by the user's computer when the user is idle. This traffic has been captured with every application closed and with some pages open, like Google Docs, YouTube and several web pages, but always without user interaction.

    The capture is performed in a network probe, attached to the router that forwards the user network traffic, using a SPAN port. The traffic is stored in pcap format with all the packet payload. In the csv file, every non TCP/UDP packet is filtered out, as well as every packet with no payload. The fields in the csv files are the following (one line per packet): Timestamp, protocol, payload size, IP address source and destination, UDP/TCP port source and destination. The fields are also included as a header in every csv file.
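    The filtering rule applied to the csv files (keep only TCP/UDP packets that carry payload) can be sketched as a small predicate. The function name, dictionary keys, and sample packets are hypothetical and only illustrate the rule; they are not the tooling used to build the dataset.

```python
TRANSPORT_PROTOCOLS = {"TCP", "UDP"}


def keep_packet(protocol, payload_size):
    """Return True if a packet would survive the csv filter described above."""
    return protocol.upper() in TRANSPORT_PROTOCOLS and payload_size > 0


# Made-up packets, for illustration only.
packets = [
    {"protocol": "TCP", "payload_size": 1460},
    {"protocol": "TCP", "payload_size": 0},    # e.g. a bare ACK: no payload
    {"protocol": "ICMP", "payload_size": 56},  # not TCP/UDP
    {"protocol": "UDP", "payload_size": 48},
]
kept = [p for p in packets if keep_packet(p["protocol"], p["payload_size"])]
print(kept)
```

    Only the first and last packets pass; the pcap files, by contrast, retain every packet with its full payload.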

    The amount of data is stated as follows:

    • Bulk: 19 traces, 3599 s of total duration, 8704 MBytes of pcap files
    • Video: 23 traces, 4496 s, 1405 MBytes
    • Web: 23 traces, 4203 s, 148 MBytes
    • Interactive: 42 traces, 8934 s, 30.5 MBytes
    • Idle: 52 traces, 6341 s, 0.69 MBytes
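    These figures also give a feel for how different the classes are in volume. The short calculation below, using only the numbers listed above, derives the mean pcap volume per second of capture for each activity; the variable and function names are illustrative.

```python
# (traces, total duration in seconds, pcap size in MBytes), as listed above.
CAPTURES = {
    "Bulk": (19, 3599, 8704.0),
    "Video": (23, 4496, 1405.0),
    "Web": (23, 4203, 148.0),
    "Interactive": (42, 8934, 30.5),
    "Idle": (52, 6341, 0.69),
}


def mean_rate_mbytes_per_s(activity):
    """Mean pcap volume per second of capture for one activity."""
    _, seconds, mbytes = CAPTURES[activity]
    return mbytes / seconds


total_traces = sum(traces for traces, _, _ in CAPTURES.values())
total_seconds = sum(seconds for _, seconds, _ in CAPTURES.values())

for name in CAPTURES:
    print(f"{name}: {mean_rate_mbytes_per_s(name):.4f} MBytes/s")
print("traces:", total_traces, "seconds:", total_seconds)
```

    In total this is 159 traces and 27573 s of capture, with Bulk averaging roughly 2.42 MBytes/s, several orders of magnitude above Idle.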

  12. U.S. adults who use selected social networks 2021

    • statista.com
    Updated Aug 29, 2023
    Cite
    Statista (2023). U.S. adults who use selected social networks 2021 [Dataset]. https://www.statista.com/statistics/246230/share-of-us-internet-users-who-use-selected-social-networks/
    Explore at:
    Dataset updated
    Aug 29, 2023
    Dataset authored and provided by
    Statista http://statista.com/
    Time period covered
    Jan 25, 2021 - Feb 8, 2021
    Area covered
    United States
    Description

    A telephone survey conducted in the United States in 2021 found that 81 percent of internet users used YouTube and 69 percent said that they used Facebook (whose parent company has since rebranded as Meta), followed by 40 percent stating that they used Instagram. Additionally, 21 percent of respondents reported using TikTok, whilst 18 percent used Reddit.

  13. Statistical learning, anomaly detection, and optimization in self-organizing...

    • resodate.org
    Updated Dec 28, 2016
    Cite
    Qi Liao (2016). Statistical learning, anomaly detection, and optimization in self-organizing networks [Dataset]. http://doi.org/10.14279/depositonce-5654
    Explore at:
    Dataset updated
    Dec 28, 2016
    Dataset provided by
    Technische Universität Berlin
    DepositOnce
    Authors
    Qi Liao
    Description

    Self-organizing network, considered as a starting point toward self-aware cognitive network, is an automation technology designed for automated configuring, monitoring, troubleshooting and optimizing for the next generation mobile networks. Its main functionalities include: self-configuration, self-optimization and self-healing. With the emergence of new wireless devices and applications, the increasing demand for mixed types of services motivates extremely dense and heterogeneous deployments. As a result it is expected that a large amount of measurements and signaling overhead will be generated in future networks. Partial and inaccurate network knowledge, together with the increasing complexity of envisioned wireless networks, pose one of the biggest challenges for self-organizing network (SON) -- maintaining perfect global network information at the level of autonomous network elements is simply elusive in large-scale, highly dynamic wireless networks. Another big challenge is the network-wide optimization of interacting or conflicting SON functionalities, with the goal of improving the efficiency of total algorithmic machinery on the network level.

    This thesis studies SON in the context of erroneous and incomplete local information on network state, as well as possibly conflicting and abstractly defined objectives of different SON functions. We design novel mathematical models and statistical methods for enhancing network awareness at the locality of network elements through statistical learning, intelligent monitoring, and dynamic network feedback collection amidst network uncertainties. The extracted knowledge is used to optimize the network performance by adjusting to internal and exogenous network variations, critical network conditions, and different network anomalies. Context-aware frameworks are proposed for automatic configuration and tuning of network elements with minimal operator intervention to achieve timely detection of network abnormal states such as coverage holes, and to carry out a network-wide optimization of different SON functions.

    The results prove the benefits of the developed self-healing and self-optimization functions, including cell outage detection, network state classification and anomaly detection, random access channel optimization, mobility robustness optimization, mobility load balancing, interference reduction, and coverage and capacity optimization. We achieve timely detection and identification of network abnormal states based on the analysis of data extracted from the network. The anomaly detection algorithm automatically activates the corresponding self-healing and self-optimization algorithms for single or multiple SON use cases, which frees up operational resources and improves user-centric quality of service.

  14. Corporate network dataset

    • kaggle.com
    zip
    Updated Apr 25, 2025
    Cite
    Luis Fhelipe Ribeiro (2025). Corporate network dataset [Dataset]. https://www.kaggle.com/datasets/luisfheliperibeiro/corporate-network-dataset
    Explore at:
    zip (116752218 bytes)
    Dataset updated
    Apr 25, 2025
    Authors
    Luis Fhelipe Ribeiro
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    General Description

    This dataset was developed from real data on the usage of the corporate data network at the Universidade Federal do Rio Grande do Norte (UFRN). The main objective is to enable detailed observation of the university's network infrastructure and make this data available to the academic community. Data collection started on August 30, 2023, with the last query conducted on February 7, 2025, covering a total of approximately 19 months of continuous observations. During this period, about 1.5 months of data were lost due to failures in the data collection process or maintenance of the system responsible for capturing the data.

    The data collections cover administrative, academic, and classroom sectors, spanning a total of 13 buildings within the university, providing a broad view of the network across different environments.

    The dataset contains a total of 1,675,843 entries, each with 49 attributes.

    Dataset Attributes, by Category

    1. Connected Machines and ARP (8 attributes)

    • Number of Access, Wi-Fi, Security, and VoIP Machines: Indicates the number of machines connected to each type of network, providing insight into the network size and device load.
    • ARP Value for Access, Wi-Fi, Security, and VoIP: Refers to the number of entries in the Address Resolution Protocol (ARP) table associated with each type of network. ARP is used to map IP addresses to MAC addresses and can indicate potential connectivity issues.

    2. Traffic Metrics (18 attributes)

    • Packet and Byte: Indicate whether the queried information is counted in packets or in transmitted bytes, using positive (1) or negative (-1) values.
    • Downlink and Uplink Bandwidth by Packets (Access, Wi-Fi, Security, VoIP): Refers to the number of packets received or sent by devices connected to each network type.
    • Downlink and Uplink Bandwidth by Bytes (Access, Wi-Fi, Security, VoIP): Refers to the number of bytes received or sent by devices connected to each network type.

    3. Collection Context (5 attributes)

    • Sector: The sector from which the data was collected (academic, administrative, or classroom).
    • Date: The date of the data collection.
    • Time of Day: The time period of the collection (morning, afternoon, or evening).
    • Day of the Week: The day of the week when the collection occurred.
    • Hour: The hour of the collection.

    4. Asset Identification (4 attributes)

    • Asset IP: The IP address of the monitored device.
    • Asset Model: The model of the network device.
    • Asset Part Number: The part number of the device.
    • Asset Firmware: The firmware version in use on the device.

    5. Asset Performance (6 attributes)

    • CPU Usage (% - 1 min and 5 min): The percentage of CPU usage on the device in the last minute and the last five minutes.
    • Memory Used (%): The percentage of memory used by the device.
    • Total and Used Memory (Kb): The total amount and the used amount of memory on the device, measured in Kb.
    • Temperature (°C): The temperature of the device in degrees Celsius.

    6. Port Packet Metrics (8 attributes)

    • Packet In and Out Counter: The number of packets of data that have entered and exited all the device's ports.
    • Broadcast Packet In and Out Counter: The number of broadcast packets that have entered and exited all the device's ports.
    • Multicast Packet In and Out Counter: The number of multicast packets that have entered and exited all the device's ports.
    • Packet Error In and Out Counter: The number of error packets that have entered and exited all the device's ports.

    Size and Format

    The dataset contains 1,675,843 entries, with 49 attributes per entry. It is available in CSV format.
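
The attribute breakdown above can be cross-checked with a short sketch. The category names and counts come from the description; the constant and function names, and the idea of validating a CSV header against the expected column count, are illustrative assumptions and not part of the dataset itself.

```python
import csv
import io

# Attribute counts per category, as listed in the dataset description.
CATEGORY_COUNTS = {
    "Connected Machines and ARP": 8,
    "Traffic Metrics": 18,
    "Collection Context": 5,
    "Asset Identification": 4,
    "Asset Performance": 6,
    "Port Packet Metrics": 8,
}

# Total number of attributes stated in the description.
EXPECTED_ATTRIBUTES = 49


def total_attributes(counts):
    """Sum the per-category attribute counts."""
    return sum(counts.values())


def header_matches(csv_text, expected=EXPECTED_ATTRIBUTES):
    """Return True if the first CSV row has the expected number of columns.

    csv_text is the file content as a string (hypothetical input; the real
    column names are not given in the description).
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return len(header) == expected
```

Summing the six categories reproduces the stated 49 attributes, so a downloaded file whose header has a different column count would be worth inspecting before analysis.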

  15. Data Center Networking Market: By Type (Application Delivery Controller...

    • zionmarketresearch.com
    pdf
    Updated Nov 23, 2025
    Cite
    Zion Market Research (2025). Data Center Networking Market: By Type (Application Delivery Controller (ADC), By Storage Area Network (SAN), Ethernet Switches, Wan Optimization Appliances, Routers, And System Security Equipment), By Enterprise Type (Cloud Service Providers, and Telecommunication Service Providers), By End-Use Industry (IT And Telecom, Banking Financial Services And Insurance, Healthcare, Government And Defence, Media And Entertainment, Retail, Education, And High Technology Industries), And by Region: Global Industry Perspective, Comprehensive Analysis, and Forecast, 2024 - 2032 [Dataset]. https://www.zionmarketresearch.com/report/data-center-networking-market
    Explore at:
    pdf
    Dataset updated
    Nov 23, 2025
    Dataset authored and provided by
    Zion Market Research
    License

    https://www.zionmarketresearch.com/privacy-policy

    Time period covered
    2022 - 2030
    Area covered
    Global
    Description

    The global data center networking market was valued at US$ 23.03 billion in 2023 and is set to reach US$ 44.09 billion by 2032, at a CAGR of about 7.48% from 2024 to 2032.
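
The growth figures above can be checked with the standard compound annual growth rate formula. This is a minimal sketch that assumes nine compounding years (the 2023 base value growing through 2032); the function names are illustrative.

```python
def cagr(start_value, end_value, periods):
    """Compound annual growth rate: (end / start) ** (1 / periods) - 1."""
    return (end_value / start_value) ** (1.0 / periods) - 1.0


def project(start_value, rate, periods):
    """Value after compounding start_value at rate for the given periods."""
    return start_value * (1.0 + rate) ** periods


# Figures from the description, in US$ billion.
# Assumption: 9 compounding years, from the 2023 base value to 2032.
implied_rate = cagr(23.03, 44.09, 9)
projected_2032 = project(23.03, 0.0748, 9)

print(f"implied CAGR: {implied_rate:.2%}")
print(f"projected 2032 value: {projected_2032:.2f}")
```

Under that nine-year assumption the implied rate comes out at roughly 7.48%, consistent with the quoted figure.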

  16. Trained network files

    • zivahub.uct.ac.za
    • datasetcatalog.nlm.nih.gov
    application/gzip
    Updated Feb 6, 2021
    Cite
    Duncan Saffy; Tim Gebbie (2021). Trained network files [Dataset]. http://doi.org/10.25375/uct.13679485.v1
    Explore at:
    application/gzip
    Dataset updated
    Feb 6, 2021
    Dataset provided by
    University of Cape Town
    Authors
    Duncan Saffy; Tim Gebbie
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    .R files providing weights for networks trained on the simulated BSM, Black and Heston model data, as well as the market data provided in this collection. There are four networks for each dataset: the standard network (MLP), the soft-constrained network (SC), and the two hard-constrained networks, HC1 and HC2.

  17. Social networks used for news in Brazil 2024

    • statista.com
    Updated Jul 10, 2025
    Cite
    Statista (2025). Social networks used for news in Brazil 2024 [Dataset]. https://www.statista.com/statistics/981938/social-media-platforms-used-weekly-news-brazil/
    Explore at:
    Dataset updated
    Jul 10, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2024 - Feb 2024
    Area covered
    Brazil
    Description

    As of February 2024, ** percent of internet users surveyed in Brazil said they had accessed news on WhatsApp. The same share of respondents used YouTube for that purpose.

  18. Spatial Networks

    • figshare.com
    bin
    Updated Jan 18, 2016
    Cite
    Nick Larusso (2016). Spatial Networks [Dataset]. http://doi.org/10.6084/m9.figshare.153828.v1
    Explore at:
    bin
    Dataset updated
    Jan 18, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Nick Larusso
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Network datasets and metadata used in work that develops a new model of spatial network structure: http://arxiv.org/abs/1210.4246

  19. Companies' data security deployment status worldwide 2024, by use type

    • statista.com
    Cite
    Statista, Companies' data security deployment status worldwide 2024, by use type [Dataset]. https://www.statista.com/statistics/1319767/app-and-data-security-deployment-status-worldwide-by-use-type/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Nov 2024
    Area covered
    Worldwide
    Description

    As of November 2024, the application and data-centric security technology most used by companies worldwide was the database firewall. At the same time, over ** percent of respondents stated that their company already used a web application firewall (WAF). Moreover, the data security technology that most companies planned to acquire in the next 12 months was bot management.

  20. The data and code of the article ''SNSAlib: a python library for analyzing...

    • scidb.cn
    Updated Jan 24, 2025
    Cite
    aiwenli; Jun-Lin Lu; Ying Fan; Xiao-Ke Xu (2025). The data and code of the article ''SNSAlib: a python library for analyzing signed network'' [Dataset]. http://doi.org/10.57760/sciencedb.j00113.00178
    Explore at:
    Croissant
    Dataset updated
    Jan 24, 2025
    Dataset provided by
    Science Data Bank
    Authors
    aiwenli; Jun-Lin Lu; Ying Fan; Xiao-Ke Xu
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The data and code related to the article ''SNSAlib: a python library for analyzing signed network'', published in the journal Chinese Physics B. This project contains null model construction for signed networks and their statistical features. The project is divided into three parts, as follows:

    Part 1: signed network datasets. This part involves ten empirical signed network datasets: SPP, GGS, Wiring, Sampson, Teams, Alpha, OTC, Wiki, Slashdot, and Epinions. The first five datasets are sourced from offline real-world social networks, and the latter five are obtained from online internet platforms. The processed data is stored as triplets in a text file (.txt).

    Part 2: null model construction of signed networks. This part covers null model construction for undirected signed networks. It has seven methods: positive-edge randomized null model, negative-edge randomized null model, positive-edge and negative-edge randomized null model, full-edge randomized null model, signed randomized null model, diminish community structure null model, and enhance community structure null model.

    Part 3: statistical features of signed networks. This part covers statistical features of signed networks, which can describe the difference between the null model and the real networks and reveal the extraordinary characteristics of real networks. These features are common neighbors, matching coefficient, excess average degree, clustering coefficient, embeddedness, FMF, FECS, and DECDS.
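
To illustrate the general idea of a null model for signed networks, the sketch below shuffles edge signs while keeping the topology fixed. This is one common sign-randomization approach, written from scratch; it is not SNSAlib's API, and whether any of the seven methods listed above works exactly this way is not stated in the description. The whitespace-separated "node node sign" triplet layout is also an assumption.

```python
import random


def parse_triplets(text):
    """Parse lines of the form 'u v sign' into (u, v, sign) tuples.

    Assumed layout: three whitespace-separated fields per line, with sign
    given as an integer such as 1 or -1. Malformed lines are skipped.
    """
    edges = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        u, v, sign = parts
        edges.append((u, v, int(sign)))
    return edges


def shuffle_signs(edges, seed=None):
    """Return a copy of edges with the signs randomly permuted.

    The node pairs (topology) are unchanged, and so are the numbers of
    positive and negative edges; only which edge carries which sign varies.
    """
    rng = random.Random(seed)
    signs = [sign for _, _, sign in edges]
    rng.shuffle(signs)
    return [(u, v, s) for (u, v, _), s in zip(edges, signs)]


# Tiny hypothetical example network.
example = parse_triplets("a b 1\nb c -1\nc a 1\na d -1\n")
null_sample = shuffle_signs(example, seed=0)
```

Comparing a statistic computed on the real network against its distribution over many such shuffled samples indicates whether the observed value is unusual given the same topology and sign counts.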
