100+ datasets found
  1. Urban Road Network Data

    • figshare.com
    • resodate.org
    zip
    Updated May 30, 2023
    Cite
    Urban Road Networks (2023). Urban Road Network Data [Dataset]. http://doi.org/10.6084/m9.figshare.2061897.v1
    Available download formats: zip
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Urban Road Networks
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Tool and data set of road networks for 80 of the most populated urban areas in the world. The data consist of a graph edge list for each city and two corresponding GIS shapefiles (i.e., links and nodes). Make your own data with our ArcGIS, QGIS, and Python tools available at: http://csun.uic.edu/codes/GISF2E.html

    Please cite: Karduni, A., Kermanshah, A., and Derrible, S., 2016, "A protocol to convert spatial polyline data to network formats and applications to world urban road networks", Scientific Data, 3:160046. Available at http://www.nature.com/articles/sdata201646
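
    Since each city ships as a graph edge list, a minimal sketch of loading such a list into an adjacency map may help. The layout assumed here (one whitespace-separated "source target" pair per line) is an assumption; the real files may carry extra columns such as link lengths, and `read_edge_list` is a hypothetical helper name.

```python
from collections import defaultdict
from io import StringIO


def read_edge_list(lines):
    """Build an undirected adjacency map from 'u v' lines (assumed format)."""
    adjacency = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        u, v = parts[0], parts[1]
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


# Hypothetical sample standing in for one city's edge list file.
sample = StringIO("1 2\n2 3\n3 1\n3 4\n")
graph = read_edge_list(sample)

# Node degree = number of distinct neighbouring intersections.
degrees = {node: len(neighbors) for node, neighbors in graph.items()}
print(degrees)
```

    For a real file, the `StringIO` sample would be replaced by an opened file handle.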

  2. Albero study: a longitudinal database of the social network and personal...

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    bin, csv
    Updated Mar 26, 2021
    Cite
    Isidro Maya Jariego; Daniel Holgado Ramos; Deniza Alieva (2021). Albero study: a longitudinal database of the social network and personal networks of a cohort of students at the end of high school [Dataset]. http://doi.org/10.5281/zenodo.3532048
    Available download formats: bin, csv
    Dataset updated
    Mar 26, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Isidro Maya Jariego; Daniel Holgado Ramos; Deniza Alieva
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT

    The Albero study analyzes the personal transitions of a cohort of high school students at the end of their studies. The data consist of (a) the longitudinal social network of the students, before (n = 69) and after (n = 57) finishing their studies; and (b) the longitudinal study of the personal networks of each of the participants in the research. The two observations of the complete social network are presented in two matrices in Excel format. For each respondent, two square matrices of 45 alters of their personal networks are provided, also in Excel format. For each respondent, both psychological sense of community and frequency of commuting are provided in a SAV file (SPSS). The database allows the combined analysis of social networks and personal networks of the same set of individuals.

    INTRODUCTION

    Ecological transitions are key moments in the life of an individual that occur as a result of a change of role or context. This is the case, for example, of the completion of high school studies, when young people start their university studies or try to enter the labor market. These transitions are turning points that carry a risk or an opportunity (Seidman & French, 2004). That is why they have received special attention in research and psychological practice, both from a developmental point of view and in the situational analysis of stress or in the implementation of preventive strategies.

    The data we present in this article describe the ecological transition of a group of young people from Alcala de Guadaira, a town located about 16 kilometers from Seville. Specifically, in the “Albero” study we monitored the transition of a cohort of secondary school students at the end of the last pre-university academic year. It is a turning point in which most of them began a metropolitan lifestyle, with more displacements to the capital and a slight decrease in identification with the place of residence (Maya-Jariego, Holgado & Lubbers, 2018).

    Normative transitions, such as the completion of studies, affect a group of individuals simultaneously, so they can be analyzed both individually and collectively. From an individual point of view, each student stops attending the institute, which is replaced by new interaction contexts. Consequently, the structure and composition of their personal networks are transformed. From a collective point of view, the network of friendships of the cohort of high school students enters into a gradual process of disintegration and fragmentation into subgroups (Maya-Jariego, Lubbers & Molina, 2019).

    These two levels, individual and collective, were evaluated in the “Albero” study. One of the peculiarities of this database is that we combine the analysis of a complete social network with a survey of personal networks in the same set of individuals, with a longitudinal design before and after finishing high school. This allows us to combine the study of the multiple contexts in which each individual participates, assessed through the analysis of a sample of personal networks (Maya-Jariego, 2018), with the in-depth analysis of a specific context (the relationships within one graduating class of students at the school), through the analysis of the complete network of interactions. This potentially allows us to examine the covariation of the social network with the individual differences in the structure of personal networks.

    PARTICIPANTS

    The social network and personal networks of the students of the last two years of high school of an institute of Alcala de Guadaira (Seville) were analyzed. The longitudinal follow-up covered approximately a year and a half. The first wave was composed of 31 men (44.9%) and 38 women (55.1%) who live in Alcala de Guadaira, and who mostly expect to live in Alcala (36.2%) or in Seville (37.7%) in the future. In the second wave, information was obtained from 27 men (47.4%) and 30 women (52.6%).

    DATA STRUCTURE AND ARCHIVE FORMAT

    The data is organized in two longitudinal observations, with information on the complete social network of the cohort of students of the last year, the personal networks of each individual and complementary information on the sense of community and frequency of metropolitan movements, among other variables.

    Social network

    The file “Red_Social_t1.xlsx” is a valued matrix of 69 actors that gathers the relations of knowledge and friendship between the cohort of students of the last year of high school in the first observation. The file “Red_Social_t2.xlsx” is a valued matrix of 57 actors obtained 17 months after the first observation.

    To generate each complete social network, the list of 77 students enrolled in the last year of high school was given to the respondents, who were asked to indicate the type of relationship in each case, according to the following values: 1, "his/her name sounds familiar"; 2, "I know him/her"; 3, "we talk from time to time"; 4, "we have a good relationship"; and 5, "we are friends." The two resulting complete networks are represented in Figure 2. The network in the second observation is comparatively less dense, reflecting the gradual disintegration process that the student group has begun.
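
    The comparison of density between the two observations can be illustrated with a small sketch. It assumes the valued matrix has been loaded as a list of rows, that 0 marks "no relation reported" (the description only lists values 1 to 5), and that ties are directed; the toy matrix and the `density` helper are hypothetical and not part of the deposited files.

```python
def density(matrix, threshold=1):
    """Share of ordered pairs (i != j) whose tie value is at least `threshold`."""
    n = len(matrix)
    if n < 2:
        return 0.0
    ties = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and matrix[i][j] >= threshold
    )
    return ties / (n * (n - 1))


# Hypothetical 4-actor valued matrix using the 1-5 coding described above,
# with 0 assumed to mean that no relation was reported.
toy = [
    [0, 5, 3, 0],
    [5, 0, 1, 0],
    [2, 1, 0, 4],
    [0, 0, 4, 0],
]

any_tie = density(toy)                    # any reported relation
strong_tie = density(toy, threshold=4)    # "good relationship" or "friends" only
print(round(any_tie, 3), round(strong_tie, 3))
```

    Running the same function on the t1 and t2 matrices, at the same threshold, would give one way to quantify the loss of density between waves.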

    Personal networks

    Also in this case the information is organized in two observations. The compressed file “Redes_Personales_t1.csv” includes 69 folders, corresponding to personal networks. Each folder includes a valued matrix of 45 alters in CSV format. Likewise, in each case a graphic representation of the network obtained with Visone (Brandes and Wagner, 2004) is included. Relationship values range from 0 (do not know each other) to 2 (know each other very well).

    Second, the compressed file “Redes_Personales_t2.csv” includes 57 folders, with the information equivalent to each respondent referred to the second observation, that is, 17 months after the first interview. The structure of the data is the same as in the first observation.

    Sense of community and metropolitan displacements

    The SPSS file “Albero.sav” collects the survey data, together with some information-summary of the network data related to each respondent. The 69 rows correspond to the 69 individuals interviewed, and the 118 columns to the variables related to each of them in T1 and T2, according to the following list:

    • Socio-economic data.

    • Data on habitual residence.

    • Information on intercity journeys.

    • Identity and sense of community.

    • Personal network indicators.

    • Social network indicators.

    DATA ACCESS

    Social networks and personal networks are available in CSV format. This allows them to be used directly with UCINET, Visone, Pajek or Gephi, among others, and they can be exported as Excel or text files for use with other programs.

    The visual representation of the personal networks of the respondents in both waves is available in the following album of the Graphic Gallery of Personal Networks on Flickr: <https://www.flickr.com/photos/25906481@N07/albums/72157667029974755>.

    In previous work we analyzed the effects of personal networks on the longitudinal evolution of the socio-centric network. That work also includes additional details about the instruments applied. If you use the data, please cite the following reference:

    • Maya-Jariego, I., Holgado, D. & Lubbers, M. J. (2018). Efectos de la estructura de las redes personales en la red sociocéntrica de una cohorte de estudiantes en transición de la enseñanza secundaria a la universidad. Universitas Psychologica, 17(1), 86-98. https://doi.org/10.11144/Javeriana.upsy17-1.eerp

    The English version of this article can be downloaded from: https://tinyurl.com/yy9s2byl

    CONCLUSION

    The database of the “Albero” study allows us to explore the co-evolution of social networks and personal networks. In this way, we can examine the mutual dependence of individual trajectories and the structure of the relationships of the cohort of students as a whole. The complete social network corresponds to the same context of interaction: the secondary school. However, personal networks collect information from the different contexts in which the individual participates. The structural properties of personal networks may partly explain individual differences in the position of each student in the entire social network. In turn, the properties of the entire social network partly determine the structure of opportunities in which individual trajectories are displayed.

    The longitudinal design and the combination of the personal networks of individuals with a common complete social network give this database unique characteristics. It may be of interest both for multi-level analysis and for the study of individual differences.

    ACKNOWLEDGEMENTS

    The fieldwork for this study was supported by the Complementary Actions of the Ministry of Education and Science (SEJ2005-25683), and was part of the project “Dynamics of actors and networks across levels: individuals,

  3. Network traffic for machine learning classification

    • data.mendeley.com
    Updated Feb 12, 2020
    Cite
    Víctor Labayen Guembe (2020). Network traffic for machine learning classification [Dataset]. http://doi.org/10.17632/5pmnkshffm.1
    Dataset updated
    Feb 12, 2020
    Authors
    Víctor Labayen Guembe
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset is a set of network traffic traces in pcap/csv format captured from a single user. The traffic is classified in 5 different activities (Video, Bulk, Idle, Web, and Interactive) and the label is shown in the filename. There is also a file (mapping.csv) with the mapping of the host's IP address, the csv/pcap filename and the activity label.

    Activities:

    • Interactive: applications that perform real-time interactions in order to provide a suitable user experience, such as editing a file in Google Docs and remote CLI sessions over SSH.

    • Bulk data transfer: applications that transfer large files over the network. Examples are SCP/FTP applications and direct downloads of large files from web servers such as Mediafire, Dropbox or the university repository, among others.

    • Web browsing: all the traffic generated while searching and consuming different web pages. Examples of those pages are several blogs and news sites and the university's Moodle.

    • Video playback: traffic from applications that consume video via streaming or pseudo-streaming. The best-known servers used are Twitch and YouTube, but the university online classroom has also been used.

    • Idle behaviour: the background traffic generated by the user's computer when the user is idle. This traffic was captured with every application closed and with some pages open, such as Google Docs, YouTube and several web pages, but always without user interaction.

    The capture is performed in a network probe, attached to the router that forwards the user network traffic, using a SPAN port. The traffic is stored in pcap format with all the packet payload. In the csv file, every non TCP/UDP packet is filtered out, as well as every packet with no payload. The fields in the csv files are the following (one line per packet): Timestamp, protocol, payload size, IP address source and destination, UDP/TCP port source and destination. The fields are also included as a header in every csv file.
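
    Because the CSV files hold one line per packet with the fields listed above, a common first step is to aggregate packets into flows. The sketch below sums payload bytes per directional 5-tuple. The header names (`timestamp`, `protocol`, `payload_size`, `ip_src`, `ip_dst`, `port_src`, `port_dst`) are assumptions, since the description names the fields but not their exact spelling, and the sample rows are invented.

```python
import csv
from collections import Counter
from io import StringIO

# Hypothetical header names and rows; the real CSV headers may be spelled differently.
SAMPLE = """timestamp,protocol,payload_size,ip_src,ip_dst,port_src,port_dst
0.00,TCP,1200,10.0.0.5,93.184.216.34,50000,443
0.01,TCP,800,93.184.216.34,10.0.0.5,443,50000
0.05,UDP,60,10.0.0.5,8.8.8.8,53000,53
0.07,TCP,400,10.0.0.5,93.184.216.34,50000,443
"""


def bytes_per_flow(csv_file):
    """Sum payload bytes per directional 5-tuple flow."""
    totals = Counter()
    for row in csv.DictReader(csv_file):
        key = (
            row["protocol"],
            row["ip_src"],
            row["port_src"],
            row["ip_dst"],
            row["port_dst"],
        )
        totals[key] += int(row["payload_size"])
    return totals


flows = bytes_per_flow(StringIO(SAMPLE))
for flow, total in flows.most_common():
    print(flow, total)
```

    Keeping the two directions of a connection as separate keys is a design choice here; merging them would require sorting the endpoint pairs before building the key.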

    The amount of data is stated as follows:

    • Bulk: 19 traces, 3599 s of total duration, 8704 MBytes of pcap files

    • Video: 23 traces, 4496 s, 1405 MBytes

    • Web: 23 traces, 4203 s, 148 MBytes

    • Interactive: 42 traces, 8934 s, 30.5 MBytes

    • Idle: 52 traces, 6341 s, 0.69 MBytes
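
    The stated amounts imply very different average data rates per activity, which matters when balancing classes for training. The short calculation below uses only the figures quoted above; the variable names are illustrative.

```python
# (traces, total seconds, total MBytes) per activity, as stated in the description.
amounts = {
    "Bulk": (19, 3599, 8704),
    "Video": (23, 4496, 1405),
    "Web": (23, 4203, 148),
    "Interactive": (42, 8934, 30.5),
    "Idle": (52, 6341, 0.69),
}

# Average pcap volume per second of capture for each activity.
rates = {name: mbytes / seconds for name, (_, seconds, mbytes) in amounts.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.4f} MBytes/s")

total_traces = sum(traces for traces, _, _ in amounts.values())
print("total traces:", total_traces)
```

    The classes with the most traces (Idle, Interactive) contribute the least data by volume, so sampling per trace and sampling per byte would give quite different class balances.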

  4. Data from: Board of Directors’ Interlocks: A Social Network Analysis...

    • scielo.figshare.com
    tiff
    Updated Jun 17, 2023
    Cite
    Claudine Pereira Salgado; Vivian Sebben Adami; Jorge R. de Souza Verschoore Filho; Cristiano Machado Costa (2023). Board of Directors’ Interlocks: A Social Network Analysis Tutorial [Dataset]. http://doi.org/10.6084/m9.figshare.21556978.v1
    Available download formats: tiff
    Dataset updated
    Jun 17, 2023
    Dataset provided by
    SciELO (http://www.scielo.org/)
    Authors
    Claudine Pereira Salgado; Vivian Sebben Adami; Jorge R. de Souza Verschoore Filho; Cristiano Machado Costa
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT

    Objective: the literature on board interlocks has increased in recent years, focusing on understanding board composition and its relationships with other companies’ boards. Such studies usually require multiple procedures of data extraction, handling, and analysis to create and analyze social networks. However, these procedures are not standardized, and there is a lack of methodological instructions available to make this process easier for researchers. This tutorial intends to describe the logical steps taken to collect data, treat them, and map and measure the network properties to provide researchers with the sources to replicate it in their own research. We contribute to the literature in the management field by proposing an empirical methodological approach to conduct board interlocks’ research.

    Proposal: our tutorial describes and provides examples of data collection, directors’ data treatment, and the use of these data to map and measure network structural properties using an open-source tool - R statistical software.

    Conclusions: our main contribution is a tutorial detailing the steps required to map and analyze board interlocks, making this process easier, standardized, and more accessible for all researchers who wish to develop social network analysis studies.

  5. Data from: Network Cards: concise, readable summaries of network data

    • figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    James Bagrow; Yong-Yeol Ahn (2023). Network Cards: concise, readable summaries of network data [Dataset]. http://doi.org/10.6084/m9.figshare.20286648.v2
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    James Bagrow; Yong-Yeol Ahn
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Network datasets used as examples for network cards.

  6. Network Vulnerability(Sample)

    • kaggle.com
    zip
    Updated Oct 30, 2024
    Cite
    SkywardAI Labs (2024). Network Vulnerability(Sample) [Dataset]. https://www.kaggle.com/datasets/skywardai/network-vulnerability
    Available download formats: zip (90893 bytes)
    Dataset updated
    Oct 30, 2024
    Dataset authored and provided by
    SkywardAI Labs
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Bowen

    Released under MIT

    Contents

  7. Orphan Drugs - Dataset 1: Twitter issue-networks as excluded publics

    • orda.shef.ac.uk
    txt
    Updated Oct 22, 2021
    Cite
    Matthew Hanchard (2021). Orphan Drugs - Dataset 1: Twitter issue-networks as excluded publics [Dataset]. http://doi.org/10.15131/shef.data.16447326.v1
    Available download formats: txt
    Dataset updated
    Oct 22, 2021
    Dataset provided by
    The University of Sheffield
    Authors
    Matthew Hanchard
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset comprises two .csv format files used within workstream 2 of the Wellcome Trust funded ‘Orphan drugs: High prices, access to medicines and the transformation of biopharmaceutical innovation’ project (219875/Z/19/Z). They appear in various outputs, e.g. publications and presentations.

    The deposited data were gathered using the University of Amsterdam Digital Methods Institute’s ‘Twitter Capture and Analysis Toolset’ (DMI-TCAT) before being processed and extracted from Gephi. DMI-TCAT queries Twitter’s STREAM Application Programming Interface (API) using SQL and retrieves data on a pre-set text query. It then sends the returned data for storage on a MySQL database. The tool allows for output of that data in various formats. This process aligns fully with Twitter’s service user terms and conditions. The query for the deposited dataset gathered a 1% random sample of all public tweets posted between 10-Feb-2021 and 10-Mar-2021 containing the text ‘Rare Diseases’ and/or ‘Rare Disease Day’, storing it on a local MySQL database managed by the University of Sheffield School of Sociological Studies (http://dmi-tcat.shef.ac.uk/analysis/index.php), accessible only via a valid VPN such as FortiClient and through a permitted active directory user profile.

    The dataset was output raw from the MySQL database as a .gexf format file, suitable for social network analysis (SNA). It was then opened using Gephi (0.9.2) data visualisation software and anonymised/pseudonymised in Gephi as per the ethical approval granted by the University of Sheffield School of Sociological Studies Research Ethics Committee on 02-Jun-201 (reference: 039187).

    The deposited dataset comprises two anonymised/pseudonymised social network analysis .csv files extracted from Gephi, one containing node data (Issue-networks as excluded publics – Nodes.csv) and another containing edge data (Issue-networks as excluded publics – Edges.csv). Where participants explicitly provided consent, their original username has been provided. Where they have provided consent on the basis that they not be identifiable, their username has been replaced with an appropriate pseudonym. All other usernames have been anonymised with a randomly generated 16-digit key. The level of anonymity for each Twitter user is provided in column C of deposited file ‘Issue-networks as excluded publics – Nodes.csv’.
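
    As a rough illustration of how the edge file could be used, the sketch below counts how often each account appears as the target of an edge. It assumes the edge CSV has `Source` and `Target` columns, which is how Gephi usually labels exported edge tables, but the deposited file has not been checked; the sample rows and the `in_degree` helper are hypothetical.

```python
import csv
from collections import Counter
from io import StringIO

# Hypothetical rows; "Source" and "Target" are assumed column names.
EDGES = """Source,Target
a1,b2
c3,b2
b2,a1
"""


def in_degree(edge_file):
    """Count incoming edges per node from a Source/Target edge table."""
    counts = Counter()
    for row in csv.DictReader(edge_file):
        counts[row["Target"]] += 1
    return counts


incoming = in_degree(StringIO(EDGES))
print(incoming.most_common(1))
```

    Joining these counts with the node file would make it possible to break the result down by the anonymity level recorded for each account.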

    This dataset was created and deposited onto the University of Sheffield Online Research Data repository (ORDA) on 26-Aug-2021 by Dr. Matthew S. Hanchard, Research Associate at the University of Sheffield iHuman institute/School of Sociological Studies. ORDA has full permission to store this dataset and to make it open access for public re-use without restriction under a CC BY license, in line with the Wellcome Trust commitment to making all research data Open Access.

    The University of Sheffield are the designated data controller for this dataset.

  8. Malware Detection in Network Traffic Data

    • kaggle.com
    zip
    Updated Dec 26, 2023
    Cite
    Agung Pambudi (2023). Malware Detection in Network Traffic Data [Dataset]. https://www.kaggle.com/datasets/agungpambudi/network-malware-detection-connection-analysis
    Available download formats: zip (755409206 bytes)
    Dataset updated
    Dec 26, 2023
    Authors
    Agung Pambudi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    To cite the dataset, please reference it as: “Stratosphere Laboratory. A labeled dataset with malicious and benign IoT network traffic. January 22th. Agustin Parmisano, Sebastian Garcia, Maria Jose Erquiaga.” https://www.stratosphereips.org/datasets-iot23

    This dataset includes labels that explain the linkages between flows connected with harmful or possibly malicious activity to provide network malware researchers and analysts with more thorough information. These labels were painstakingly created at the Stratosphere labs using malware capture analysis.

    We present a concise explanation of the labels used for the identification of malicious flows, based on manual network analysis, below:

    Attack: This label signifies the occurrence of an attack originating from an infected device directed towards another host. Any flow that endeavors to exploit a vulnerable service, discerned through payload and behavioral analysis, falls under this classification. Examples include brute force attempts on telnet logins or header-based command injections in GET requests.

    Benign: The "Benign" label denotes connections where no suspicious or malicious activities have been detected.

    C&C (Command and Control): This label indicates that the infected device has established a connection with a Command and Control server. This observation is rooted in the periodic nature of connections or activities such as binary downloads or the exchange of IRC-like or decoded commands.

    DDoS (Distributed Denial of Service): "DDoS" is assigned when the infected device is actively involved in a Distributed Denial of Service attack, identifiable by the volume of flows directed towards a single IP address.

    FileDownload: This label signifies that a file is being downloaded to the infected device. It is determined by examining connections with response bytes exceeding a specified threshold (typically 3KB or 5KB), often in conjunction with known suspicious destination ports or IPs associated with Command and Control servers.

    HeartBeat: "HeartBeat" designates connections where packets serve the purpose of tracking the infected host by the Command and Control server. Such connections are identified through response bytes below a certain threshold (typically 1B) and exhibit periodic similarities. This is often associated with known suspicious destination ports or IPs linked to Command and Control servers.

    Mirai: This label is applied when connections exhibit characteristics resembling those of the Mirai botnet, based on patterns consistent with common Mirai attack profiles.

    Okiru: Similar to "Mirai," the "Okiru" label is assigned to connections displaying characteristics of the Okiru botnet. The parameters for this label are the same as for Mirai, but Okiru is a less prevalent botnet family.

    PartOfAHorizontalPortScan: This label is employed when connections are involved in a horizontal port scan aimed at gathering information for potential subsequent attacks. The labeling decision hinges on patterns such as shared ports, similar transmitted byte counts, and multiple distinct destination IPs among the connections.

    Torii: The "Torii" label is used when connections exhibit traits indicative of the Torii botnet, with labeling criteria similar to those used for Mirai, albeit in the context of a less common botnet family.
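
    The PartOfAHorizontalPortScan criterion (one source contacting many distinct destination IPs on a shared port) can be approximated with a simple grouping. This is only a sketch of the idea, not the procedure Stratosphere used: the `min_targets` threshold and the sample records are invented, while the keys `id.orig_h`, `id.resp_h` and `id.resp_p` follow the dataset's field list.

```python
from collections import defaultdict


def horizontal_scan_candidates(records, min_targets=3):
    """Return (source IP, port) pairs that reach at least `min_targets` distinct hosts."""
    targets = defaultdict(set)
    for rec in records:
        targets[(rec["id.orig_h"], rec["id.resp_p"])].add(rec["id.resp_h"])
    return {
        key: len(hosts)
        for key, hosts in targets.items()
        if len(hosts) >= min_targets
    }


# Hypothetical connection records using the dataset's field names.
records = [
    {"id.orig_h": "192.168.1.10", "id.resp_h": "10.0.0.1", "id.resp_p": 23},
    {"id.orig_h": "192.168.1.10", "id.resp_h": "10.0.0.2", "id.resp_p": 23},
    {"id.orig_h": "192.168.1.10", "id.resp_h": "10.0.0.3", "id.resp_p": 23},
    {"id.orig_h": "192.168.1.10", "id.resp_h": "10.0.0.1", "id.resp_p": 80},
    {"id.orig_h": "192.168.1.11", "id.resp_h": "10.0.0.1", "id.resp_p": 23},
]

candidates = horizontal_scan_candidates(records)
print(candidates)
```

    A fuller heuristic would also compare transmitted byte counts across the grouped connections, as the label description mentions.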

    Field Name | Description | Type
    ts | The timestamp of the connection event. | time
    uid | A unique identifier for the connection. | string
    id.orig_h | The source IP address. | addr
    id.orig_p | The source port. | port
    id.resp_h | The destination IP address. | addr
    id.resp_p | The destination port. | port
    proto | The network protocol used (e.g., 'tcp'). | enum
    service | The service associated with the connection. | string
    duration | The duration of the connection. | interval
    orig_bytes | The number of bytes sent from the source to the destination. | count
    resp_bytes | The number of bytes sent from the destination to the source. | count
    conn_state | The state of the connection. | string
    local_orig | Indicates whether the connection is considered local or not. | bool
    local_resp | Indicates whether the connection is considered...

  9. Replication Data for: Navigating the Range of Statistical Tools for...

    • dataverse.harvard.edu
    Updated Apr 23, 2017
    Cite
    Skyler Cranmer; Philip Leifeld; Scott McClurg; Meredith Rolfe (2017). Replication Data for: Navigating the Range of Statistical Tools for Inferential Network Analysis [Dataset]. http://doi.org/10.7910/DVN/2XP8YF
    Dataset updated
    Apr 23, 2017
    Dataset provided by
    Harvard Dataverse
    Authors
    Skyler Cranmer; Philip Leifeld; Scott McClurg; Meredith Rolfe
    License

    https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.1/customlicense?persistentId=doi:10.7910/DVN/2XP8YF

    Description

    The last decade has seen substantial advances in statistical techniques for the analysis of network data, and a major increase in the frequency with which these tools are used. These techniques are designed to accomplish the same broad goal, statistically valid inference in the presence of highly interdependent relationships, but important differences remain between them. We review three approaches commonly used for inferential network analysis (the Quadratic Assignment Procedure, Exponential Random Graph Model, and Latent Space Network Model), highlighting the strengths and weaknesses of the techniques relative to one another. An illustrative example using climate change policy network data shows that all three network models outperform standard logit estimates on multiple criteria. This paper introduces political scientists to a class of network techniques beyond simple descriptive measures of network structure, and helps researchers choose which model to use in their own research.
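
    Of the three approaches, the Quadratic Assignment Procedure is the simplest to sketch: the node labels of one matrix are permuted (rows and columns together) to build a null distribution for the correlation between two networks. The code below is a minimal illustration under that general definition, not the replication code from this dataset; the toy matrices, function names and permutation count are all hypothetical.

```python
import random


def off_diagonal(matrix, order):
    """Flatten off-diagonal cells after relabelling nodes by `order`."""
    n = len(matrix)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(n) if i != j]


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def qap_test(a, b, permutations=500, seed=0):
    """Return the observed correlation and a one-sided permutation p-value."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    b_flat = off_diagonal(b, identity)
    observed = pearson(off_diagonal(a, identity), b_flat)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # same permutation applied to rows and columns
        if pearson(off_diagonal(a, order), b_flat) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical 5-node networks; b is a with one extra tie between nodes 0 and 3.
a = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
b = [row[:] for row in a]
b[0][3] = b[3][0] = 1

observed, p_value = qap_test(a, b)
print(observed, p_value)
```

    Permuting rows and columns together keeps each network's structure intact while breaking the alignment between the two, which is what lets the test respect the dependence among ties.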

  10. Network traffic datasets created by Single Flow Time Series Analysis

    • zenodo.org
    • data.niaid.nih.gov
    csv, pdf
    Updated Jul 11, 2024
    Cite
    Josef Koumar; Karel Hynek; Tomáš Čejka (2024). Network traffic datasets created by Single Flow Time Series Analysis [Dataset]. http://doi.org/10.5281/zenodo.8035724
    Available download formats: csv, pdf
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Josef Koumar; Karel Hynek; Tomáš Čejka
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Network traffic datasets created by Single Flow Time Series Analysis

    Datasets were created for the paper: Network Traffic Classification based on Single Flow Time Series Analysis -- Josef Koumar, Karel Hynek, Tomáš Čejka -- which was published at The 19th International Conference on Network and Service Management (CNSM) 2023. Please cite usage of our datasets as:

    J. Koumar, K. Hynek and T. Čejka, "Network Traffic Classification Based on Single Flow Time Series Analysis," 2023 19th International Conference on Network and Service Management (CNSM), Niagara Falls, ON, Canada, 2023, pp. 1-7, doi: 10.23919/CNSM59352.2023.10327876.

    This Zenodo repository contains 23 datasets created from 15 well-known published datasets which are cited in the table below. Each dataset contains 69 features created by Time Series Analysis of Single Flow Time Series. The detailed description of features from datasets is in the file: feature_description.pdf

    In the following table is a description of each dataset file:

    File name | Detection problem | Citation of original raw dataset
    botnet_binary.csv | Binary detection of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
    botnet_multiclass.csv | Multi-class classification of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
    cryptomining_design.csv | Binary detection of cryptomining; the design part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
    cryptomining_evaluation.csv | Binary detection of cryptomining; the evaluation part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
    dns_malware.csv | Binary detection of malware DNS | Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021.
    doh_cic.csv | Binary detection of DoH | Mohammadreza MontazeriShatoori et al. Detection of doh tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020
    doh_real_world.csv | Binary detection of DoH | Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022
    dos.csv | Binary detection of DoS | Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019.
    edge_iiot_binary.csv | Binary detection of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
    edge_iiot_multiclass.csv | Multi-class classification of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
    https_brute_force.csv | Binary detection of HTTPS Brute Force | Jan Luxemburk et al. HTTPS Brute-force dataset with extended network flows, November 2020
    ids_cic_binary.csv | Binary detection of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp, 1:108–116, 2018.
    ids_cic_multiclass.csv | Multi-class classification of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp, 1:108–116, 2018.
    ids_unsw_nb_15_binary.csv | Binary detection of intrusion in IDS | Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
    ids_unsw_nb_15_multiclass.csv | Multi-class classification of intrusion in IDS | Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
    iot_23.csv | Binary detection of IoT malware | Sebastian Garcia et al. IoT-23: A labeled dataset with malicious and benign IoT network traffic, January 2020. More details here https://www.stratosphereips.org/datasets-iot23
    ton_iot_binary.csv | Binary detection of IoT malware | Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021
    ton_iot_multiclass.csv | Multi-class classification of IoT malware | Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021
    tor_binary.csv | Binary detection of TOR | Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017.
    tor_multiclass.csv | Multi-class classification of TOR | Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017.
    vpn_iscx_binary.csv | Binary detection of VPN | Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related. In ICISSP, pages 407–414, 2016.
    vpn_iscx_multiclass.csv | Multi-class classification of VPN | Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related. In ICISSP, pages 407–414, 2016.
    vpn_vnat_binary.csv | Binary detection of VPN | Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022
    vpn_vnat_multiclass.csv | Multi-class classification of VPN | Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022

  11. Dataset of directed signed networks from social domain

    • figshare.com
    zip
    Updated Sep 4, 2020
    Cite
    Samin Aref; Ly Dinh; Rezvaneh Rezapour (2020). Dataset of directed signed networks from social domain [Dataset]. http://doi.org/10.6084/m9.figshare.12152628.v3
    Available download formats: zip
    Dataset updated: Sep 4, 2020
    Dataset provided by: Figshare (http://figshare.com/)
    Authors: Samin Aref; Ly Dinh; Rezvaneh Rezapour
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    This dataset contains a range of directed signed networks (signed digraphs) from the social domain. The data come from 9 different sources and in total there are 29 network files. There are two temporal networks and one multilayer network in this dataset. Each network is provided in two formats: edgelist (.csv) and .gml format. This dataset is provided under a CC BY-NC-SA Creative Commons v 4.0 license (Attribution-NonCommercial-ShareAlike). This means that other individuals may remix, tweak, and build upon these data non-commercially, as long as they provide citations to this data repository (https://doi.org/10.6084/m9.figshare.12152628) and the reference article listed below (https://doi.org/10.1038/s41598-020-71838-6), and license the new creations under the identical terms. For more information about the data, one may refer to the article below: Samin Aref, Ly Dinh, Rezvaneh Rezapour, and Jana Diesner. "Multilevel Structural Evaluation of Signed Directed Social Networks based on Balance Theory" Scientific Reports (2020) https://doi.org/10.1038/s41598-020-71838-6

  12. Institutional Provider Network Data: 2021 Quarter 1

    • healthdata.gov
    • health.data.ny.gov
    csv, xlsx, xml
    Updated Apr 8, 2025
    + more versions
    Cite
    health.data.ny.gov (2025). Institutional Provider Network Data: 2021 Quarter 1 [Dataset]. https://healthdata.gov/State/Institutional-Provider-Network-Data-2021-Quarter-1/7brk-wvvi
    Available download formats: xml, csv, xlsx
    Dataset updated: Apr 8, 2025
    Dataset provided by: health.data.ny.gov
    Description

    The Institutional Provider Network Data displays information on health facilities and ancillary service providers (for example: hospitals, labs, home care agencies) participating in health plan networks from January through March 2021. Plan network data is collected from Medicaid, Commercial, and Exchange plans on a quarterly basis by NYSoH, including managed care plans, as well as PPO/EPO plans. For more information, please visit: https://pndslookup.health.ny.gov.

  13. Location Graphs (SNAP)

    • kaggle.com
    zip
    Updated Dec 16, 2021
    Cite
    Subhajit Sahu (2021). Location Graphs (SNAP) [Dataset]. https://www.kaggle.com/wolfram77/graphs-snap-loc
    Available download formats: zip (163822208 bytes)
    Dataset updated: Dec 16, 2021
    Authors: Subhajit Sahu
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    loc-Brightkite

    https://snap.stanford.edu/data/loc-Brightkite.html

    Dataset information

    Brightkite (http://www.brightkite.com/) was once a location-based social
    networking service provider where users shared their locations by
    checking-in. The friendship network was collected using their public API, and consists of 58,228 nodes and 214,078 edges. The network is originally
    directed but we have constructed a network with undirected edges when there is a friendship in both ways. We have also collected a total of 4,491,143
    checkins of these users over the period of Apr. 2008 - Oct. 2010.
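
    The reciprocity rule described above (an undirected edge is kept only when the friendship exists in both directions) can be sketched as follows. This is a minimal illustration, not the SNAP tooling; the function name `reciprocal_edges` and the toy edge list are assumptions made for the example.

```python
def reciprocal_edges(directed_edges):
    """Return sorted undirected edges (u, v), u < v, where both u->v and v->u exist."""
    edge_set = set(directed_edges)
    undirected = set()
    for u, v in edge_set:
        # Skip self-loops and keep only pairs that appear in both directions.
        if u != v and (v, u) in edge_set:
            undirected.add((min(u, v), max(u, v)))
    return sorted(undirected)


# Hypothetical toy data: only 0<->1 and 2<->3 are mutual friendships.
edges = [(0, 1), (1, 0), (1, 2), (2, 3), (3, 2)]
print(reciprocal_edges(edges))
```

    Storing each pair with the smaller id first avoids counting a mutual friendship twice.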

    Dataset statistics
    Nodes 58,228
    Edges 214,078
    Nodes in largest WCC 56739 (0.974)
    Edges in largest WCC 212945 (0.995)
    Nodes in largest SCC 56739 (0.974)
    Edges in largest SCC 212945 (0.995)
    Average clustering coefficient 0.1723
    Number of triangles 494728
    Fraction of closed triangles 0.03979
    Diameter (longest shortest path) 16
    90-percentile effective diameter 6
    Checkins 4,491,143

    Source (citation)
    E. Cho, S. A. Myers, J. Leskovec. Friendship and Mobility: User Movement
    in Location-Based Social Networks. ACM SIGKDD International Conference on
    Knowledge Discovery and Data Mining (KDD), 2011.
    http://cs.stanford.edu/people/jure/pubs/mobile-kdd11.pdf

    Files
    File | Description
    loc-brightkite_edges.txt.gz | Friendship network of Brightkite users
    loc-brightkite_totalCheckins.txt.gz | Time and location information of check-ins made by users

    Example of check-in information

    [user] [check-in time]   [latitude] [longitude] [location id]
    58186 2008-12-03T21:09:14Z 39.633321 -105.317215 ee8b88dea22411    
    58186 2008-11-30T22:30:12Z 39.633321 -105.317215 ee8b88dea22411    
    58186 2008-11-28T17:55:04Z -13.158333 -72.531389 e6e86be2a22411    
    58186 2008-11-26T17:08:25Z 39.633321 -105.317215 ee8b88dea22411    
    58187 2008-08-14T21:23:55Z 41.257924 -95.938081 4c2af967eb5df8    
    58187 2008-08-14T07:09:38Z 41.257924 -95.938081 4c2af967eb5df8    
    58187 2008-08-14T07:08:59Z 41.295474 -95.999814 f3bb9560a2532e    
    58187 2008-08-14T06:54:21Z 41.295474 -95.999814 f3bb9560a2532e    
    58188 2010-04-06T06:45:19Z 46.521389  14.854444 ddaa40aaa22411    
    58188 2008-12-30T15:30:08Z 46.522621  14.849618 58e12bc0d67e11    
    58189 2009-04-08T07:36:46Z 46.554722  15.646667 ddaf9c4ea22411    
    58190 2009-04-08T07:01:28Z 46.421389  15.869722 dd793f96a22411    
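
    A line in the check-in format shown above could be parsed along these lines. This is a sketch under the assumption that fields are separated by whitespace and that timestamps always end in `Z` (UTC), as in the example rows; the `CheckIn` class and `parse_checkin` function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CheckIn:
    user: int
    time: datetime
    latitude: float
    longitude: float
    location_id: str


def parse_checkin(line: str) -> Optional[CheckIn]:
    """Parse one whitespace-separated check-in line; return None if fields are missing."""
    fields = line.split()
    if len(fields) != 5:
        return None
    user, stamp, lat, lon, loc = fields
    # The trailing 'Z' is matched literally and interpreted as UTC.
    when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return CheckIn(int(user), when, float(lat), float(lon), loc)


record = parse_checkin("58186 2008-12-03T21:09:14Z 39.633321 -105.317215 ee8b88dea22411")
print(record)
```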
    

    Notes on inclusion into the SuiteSparse Matrix Collection, July 2018:

    The SNAP data set is 0-based, with nodes numbered 0 to 58,227.

    In the SuiteSparse Matrix Collection the graph is converted to 1-based.
    The Problem.A matrix is the undirected friendship network, where
    A(i,j)=1 if person i-1 and person j-1 are friends in the SNAP data set.

    There are 4,747,287 checkins in the loc-brightkite_totalCheckins.txt
    file, but 6 lines are empty with a user id but no other data (those
    are discarded here). In the SuiteSparse Matrix Collection, the checkin
    data is held in 5 vectors of length 4,747,281. These are in the
    Problem.aux component of the MATLAB struct. The kth entry of each of
    these vectors holds the data in the kth line of the
    loc-brightkite_totalCheckins.txt file (after deleting the 6 empty lines).

    userid: the SNAP user id is an integer in the range 0 to 58,227. It  
      has been incremented by one, here, to reflect the corresponding  
      row and column of the Problem.A matrix. It contains 51,406    
      unique user id's.                         
    checkin_time: a string of length 20                  
    latitude: a double precision number                  
    longitude: a double precision number                  
    location_id: a string of length 61.
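
    The conversion described in these notes (discard lines that hold only a user id, shift user ids from 0-based to 1-based, and keep the data in five parallel vectors) can be sketched as below. This is not the SuiteSparse import code; `to_aux_vectors` is a hypothetical helper and the middle sample line is invented to stand in for an incomplete record.

```python
def to_aux_vectors(lines):
    """Build five parallel lists from check-in lines, skipping incomplete ones."""
    aux = {
        "userid": [],
        "checkin_time": [],
        "latitude": [],
        "longitude": [],
        "location_id": [],
    }
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # e.g. a line with a user id but no other data
        user, stamp, lat, lon, loc = fields[:5]
        aux["userid"].append(int(user) + 1)  # 0-based SNAP id -> 1-based index
        aux["checkin_time"].append(stamp)
        aux["latitude"].append(float(lat))
        aux["longitude"].append(float(lon))
        aux["location_id"].append(loc)
    return aux


sample = [
    "58186 2008-12-03T21:09:14Z 39.633321 -105.317215 ee8b88dea22411",
    "58187",  # hypothetical incomplete line, dropped
    "58187 2008-08-14T21:23:55Z 41.257924 -95.938081 4c2af967eb5df8",
]
aux = to_aux_vectors(sample)
print(aux["userid"])
```

    Because all five lists are appended together, the kth entry of each list always refers to the same retained line.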
    

    loc-Gowalla

    https://snap.stanford.edu/data/loc-Gowalla.html

    Dataset information

    Gowalla (http://www.gowalla.com/) is a location-based social networking
    website where users share their locations by checking-in. The friendship
    network is undirected and was collected using their public API, and
    consists of 196,591 nodes and 950,327 edges. We have collected a total of
    6,442,890 check-ins of these users over the period of Feb. 2009 - Oct.
    2010.

    Dataset statistics
    Nodes 196,591
    Edges 950,327
    Nodes in largest WCC 196591 (1.000)
    Edges in largest WCC 950327 (1.000)
    Nodes in largest SCC 196591 (1.000)
    Edges in largest SCC 950327 (1.000)
    Average clustering coefficient 0.2367
    Number of triangles 2273138
    Fraction of closed triangles 0.007952
    Diameter (longest shortest path) 14
    90-percentile effective diameter 5.7
    Check-ins 6,442,890

    Source (citation)
    E. Cho, S. A. Myers, J. Leskovec. Friendship and Mobility: User Movement
    in Location-Based Social Networks. ACM SIGKDD International Conference on
    Knowledge Discovery and Data Mining (KDD), 2011.
    http://cs.stanford.edu/people/jure/pubs/mobile-kdd11.pdf

    Files
    File | Description
    loc-gowalla_edges.txt.gz | Friendship network of Gowalla users
    loc-gowalla_totalCheckins.txt.gz | Time and location information of check-ins made by users

    Example of check-in information

    [user] [check-in time]   [latitude]  [longitude] [location id]  
    196514 2010-07-24T13:45:06Z 53.3648119  -2.2723465833  145064   
    196514 2010-07-24T13:44:58Z 53.360511233 -2.276369017  1275991   
    196514 2010-07-24T13:44:46Z 53.3653895945 -2.2754087046  376497   
    196514 2010-07-24T13:44:38Z 53.3663709833 -2.2700764333  98503    
    196514 2010-07-24T13:44:26Z 53.3674087524 -2.2783813477  1043431   
    196514 2010-07-24T13:44:08Z 53.3675663377 -2.278631763  881734   
    196514 2010-07-24T13:43:18Z 53.3679640626 -2.2792943689  207763   
    196514 2010-07-24T13:41:10Z 53.364905   -2.270824    1042822
    
  14. Large-Scale Dynamic Random Graph - Example

    • figshare.com
    txt
    Updated Jun 4, 2023
    Cite
    Osnat Mokryn; Alex Abbey (2023). Large-Scale Dynamic Random Graph - Example [Dataset]. http://doi.org/10.6084/m9.figshare.20462871.v1
    Available download formats: txt
    Dataset updated: Jun 4, 2023
    Dataset provided by: Figshare (http://figshare.com/)
    Authors: Osnat Mokryn; Alex Abbey
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    Zhang et al. (https://link.springer.com/article/10.1140/epjb/e2017-80122-8) suggest a temporal random network with changing dynamics that follow a Markov process, allowing for a continuous-time network history moving from a static definition of a random graph with a fixed number of nodes n and edge probability p to a temporal one. Defining lambda = probability per time granule of a new edge to appear and mu = probability per time granule of an existing edge to disappear, Zhang et al. show that the equilibrium probability of an edge is p = lambda/(lambda+mu). Our implementation, a Python package that we refer to as RandomDynamicGraph (https://github.com/ScanLab-ossi/DynamicRandomGraphs), generates large-scale dynamic random graphs according to the defined density. The package focuses on massive data generation; it uses efficient math calculations, writes to file instead of holding data in memory when datasets are too large, and supports multi-processing. Please note that the datetime is arbitrary.
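
    The equilibrium value p = lambda/(lambda+mu) can be checked numerically. The sketch below assumes each potential edge evolves independently as a two-state chain (absent or present) with the per-granule probabilities lambda and mu defined above. It does not use the RandomDynamicGraph package; all function names and parameter values are illustrative assumptions.

```python
import random


def equilibrium_probability(lam, mu):
    """Closed-form equilibrium edge probability p = lam / (lam + mu)."""
    return lam / (lam + mu)


def iterate_edge_probability(lam, mu, p0=0.0, steps=500):
    """Iterate p_next = p * (1 - mu) + (1 - p) * lam for a number of time granules."""
    p = p0
    for _ in range(steps):
        p = p * (1.0 - mu) + (1.0 - p) * lam
    return p


def simulate_edge_fraction(lam, mu, n_edges=2000, steps=200, seed=0):
    """Simulate independent edges and return the fraction present at the end."""
    rng = random.Random(seed)
    present = [False] * n_edges
    for _ in range(steps):
        for i in range(n_edges):
            if present[i]:
                if rng.random() < mu:
                    present[i] = False  # existing edge disappears
            elif rng.random() < lam:
                present[i] = True  # new edge appears
    return sum(present) / n_edges


lam, mu = 0.02, 0.06  # illustrative values, giving p = 0.25
print(equilibrium_probability(lam, mu))
print(iterate_edge_probability(lam, mu))
print(simulate_edge_fraction(lam, mu))
```

    The iteration converges because the distance to equilibrium shrinks by a factor of (1 - lambda - mu) at every granule; the simulated fraction only approximates p, with sampling noise that decreases as the number of edges grows.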

  15. Institutional Provider Network Data: 2018 Quarters 1 and 2

    • health.data.ny.gov
    • healthdata.gov
    csv, xlsx, xml
    Updated Sep 13, 2018
    + more versions
    Cite
    New York State Department of Health (2018). Institutional Provider Network Data: 2018 Quarters 1 and 2 [Dataset]. https://health.data.ny.gov/Health/Institutional-Provider-Network-Data-2018-Quarters-/ctrr-z499
    Available download formats: csv, xlsx, xml
    Dataset updated: Sep 13, 2018
    Dataset authored and provided by: New York State Department of Health
    Description

    The Institutional Provider Network Data displays information on health facilities and ancillary service providers (for example: hospitals, labs, home care agencies) participating in health plan networks from January through June 2018. Plan network data is collected from Medicaid, Commercial, and Exchange plans on a quarterly basis by the Department of Health, including managed care plans, as well as PPO/EPO plans.

  16. Network data schema in the Netflow V9 format

    • kaggle.com
    zip
    Updated Feb 1, 2023
    Cite
    Ashutosh Sharma (2023). Network data schema in the Netflow V9 format [Dataset]. https://www.kaggle.com/datasets/ashtcoder/network-data-schema-in-the-netflow-v9-format
    Available download formats: zip (275705142 bytes)
    Dataset updated: Feb 1, 2023
    Authors: Ashutosh Sharma
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    The network data schema is in the Netflow V9 format. Two files are provided, 'train_net.csv' and 'test_net.csv'; train_net.csv indicates when a particular ALERT occurs. The dataset contains 4 classes: 'None', 'Port Scanning', 'Denial of Service', and 'Malware'.

    Acknowledgements

    SIMARGL Project – Secure Intelligent Methods for Advanced RecoGnition of malware and stegomalware, with the support of the European Commission and the Horizon 2020 Program, under Grant Agreement No. 833042.

    Cite

    Maria-Elena Mihailescu, Darius Mihai, Mihai Carabas, Mikolaj Komisarek, Marek Pawlicki, Witold Holubowicz, Rafal Kozik: The Proposition and Evaluation of the RoEduNet-SIMARGL2021 Network Intrusion Detection Dataset. Sensors 21(13): 4319 (2021)

  17. Weekly change in data usage on Verizon networks in the US by type in March...

    • statista.com
    Updated Mar 24, 2020
    Cite
    Statista (2020). Weekly change in data usage on Verizon networks in the US by type in March 2020 [Dataset]. https://www.statista.com/statistics/1106893/covid-19-verizon-network-usage-increase-2020/
    Dataset updated: Mar 24, 2020
    Dataset authored and provided by: Statista (http://statista.com/)
    Time period covered: Mar 2020
    Area covered: United States
    Description

    The U.S.-based telecommunications company Verizon has registered a significant increase in the usage of data on its networks in the United States on March 19 compared to March 12 due to restrictions in place triggered by the coronavirus (COVID-19) pandemic. VPN traffic for example was up by 25 percent during this small sample time period.


  18. Synthetic network traffic

    • kaggle.com
    zip
    Updated Sep 8, 2023
    Cite
    Vidhi Waghela (2023). Synthetic network traffic [Dataset]. https://www.kaggle.com/datasets/vidhikishorwaghela/synthetic-network-traffic
    Available download formats: zip (92188700 bytes)
    Dataset updated: Sep 8, 2023
    Authors: Vidhi Waghela
    Description

    OVERVIEW:

    This dataset contains synthetic network traffic data generated for the purpose of anomaly detection. It is designed to aid in developing and testing machine learning models for network security and anomaly detection.

    DATASET SIZE:

    Number of samples: 50,000
    Number of features: 10

    ABOUT DATASET:

    The dataset consists of the following columns:

    1. SourceIP: Source IP address of network traffic.
    2. Destination IP: Destination IP address of network traffic.
    3. SourcePort: Source port number.
    4. DestinationPort: Destination port number.
    5. Protocol: Network protocol used.
    6. BytesSent: Number of bytes sent in the network communication.
    7. BytesReceived: Number of bytes received in the network communication.
    8. PacketsSent: Number of packets sent.
    9. PacketsReceived: Number of packets received.
    10. Duration: Duration of the network communication.

    Additionally, there is a binary target variable:

    IsAnomaly: 0 for regular network traffic, 1 for anomalous network traffic (introduced for demonstration purposes).
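
    Rows with the columns listed above could be read and summarised as in the sketch below. The inline sample rows are invented for illustration, and the exact header spelling (for example `DestinationIP` without a space) is an assumption that should be checked against the actual CSV file; `anomaly_rate` is a hypothetical helper.

```python
import csv
import io

# Hypothetical sample rows; header names are assumed from the column list.
SAMPLE_CSV = """SourceIP,DestinationIP,SourcePort,DestinationPort,Protocol,BytesSent,BytesReceived,PacketsSent,PacketsReceived,Duration,IsAnomaly
10.0.0.1,10.0.0.2,51514,443,TCP,1200,5400,10,12,0.8,0
10.0.0.3,10.0.0.4,40000,22,TCP,900000,100,800,2,45.0,1
10.0.0.5,10.0.0.6,53000,53,UDP,80,160,1,1,0.01,0
"""


def anomaly_rate(rows):
    """Return the fraction of rows whose IsAnomaly field equals 1."""
    rows = list(rows)
    if not rows:
        return 0.0
    return sum(int(row["IsAnomaly"]) for row in rows) / len(rows)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(len(rows), anomaly_rate(rows))
```

    Checking the class balance this way before training is useful because anomaly detection data is often heavily skewed toward the regular class.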

    USAGE:

    This dataset is intended for research and experimentation in the field of network security and anomaly detection. It can be used to train and evaluate machine learning models for identifying network anomalies.

    Acknowledgments:

    The dataset was created for educational and illustrative purposes and does not represent real-world network traffic data. It was generated using synthetic data generation techniques.

    Notes:

    1. This dataset is synthetic and is not representative of real-world network traffic.
    2. Anomalies have been introduced for demonstration purposes.
    3. Use this dataset responsibly and ensure that any use aligns with ethical considerations and legal regulations.
  19. Drought Machine Learning Data Example

    • beta.hydroshare.org
    • hydroshare.org
    • +1more
    zip
    Updated Aug 22, 2023
    Cite
    Bryce Pulver (2023). Drought Machine Learning Data Example [Dataset]. https://beta.hydroshare.org/resource/9024db8a67fd4afdab2358d1b75e7e85/
    Available download formats: zip (518.5 MB)
    Dataset updated: Aug 22, 2023
    Dataset provided by: HydroShare
    Authors: Bryce Pulver
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)
    Time period covered: Jan 1, 1980 - Dec 1, 2020
    Description

    This repository showcases some examples of data wrangling and visualization using the output of a USGS drought prediction model for the Colorado River Basin and example ecology site data.

  20. Data from: Supplemental Materials for: "De-emphasise, Aggregate, and Hide: A...

    • darus.uni-stuttgart.de
    Updated Feb 6, 2024
    Cite
    Michael Aichem; Karsten Klein; Stephen Kobourov; Falk Schreiber (2024). Supplemental Materials for: "De-emphasise, Aggregate, and Hide: A Study on Interactive Visual Transformations for Group Structures in Network Visualisations" [Dataset]. http://doi.org/10.18419/DARUS-3706
    Dataset updated: Feb 6, 2024
    Dataset provided by: DaRUS
    Authors: Michael Aichem; Karsten Klein; Stephen Kobourov; Falk Schreiber
    License: https://darus.uni-stuttgart.de/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.18419/DARUS-3706
    Dataset funded by: DFG
    Description

    This dataset contains the supplemental materials for our publication "De-emphasise, Aggregate, and Hide: A Study on Interactive Visual Transformations for Group Structures in Network Visualisations". The publication reports on an experiment that we conducted to explore the effects of different interactive visual transformations in network drawings on the user performance. We evaluated five specific visual transformations and one control condition in five different tasks and collected data on user performance (time, accuracy), usefulness, mental effort, subjective preference, as well as some metrics of user interaction, such as usage of zoom and pan operations and the application of the visual transformations. Within these supplemental materials, we share the following:

    • network and task data
    • results data
    • analysis code
    • example images of the study
    • demonstration videos of the interface
    • participants demographic overview

    The experiment was preregistered on OSF before it was conducted. The preregistration can be found at https://doi.org/10.17605/OSF.IO/TRBWD.

