9 datasets found
  1. CADeSH Dataset: Collaborative Anomaly Detection for Smart Homes

    • zenodo.org
    Updated Jul 29, 2022
    Cite
    Yair Meidan; Dan Avraham; Hanan Libhaber; Asaf Shabtai (2022). CADeSH Dataset: Collaborative Anomaly Detection for Smart Homes [Dataset]. http://doi.org/10.5281/zenodo.6406052
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Yair Meidan; Dan Avraham; Hanan Libhaber; Asaf Shabtai
    Description

    Dataset used for quantitative evaluation in the paper:

    Y. Meidan, D. Avraham, H. Libhaber and A. Shabtai, "CADeSH: Collaborative Anomaly Detection for Smart Homes," in IEEE Internet of Things Journal, 2022, doi: 10.1109/JIOT.2022.3194813.

    This is a table of flow-level traffic data, captured continuously over 21 days from five real home networks subscribed to a smart home security service and from our lab at Ben-Gurion University of the Negev. The security service provider shared these network traffic flows with us, along with the related DNS requests and responses and reputation intelligence on the destination IP addresses. Each instance in this dataset represents an outbound network traffic flow (in the form of an IPFIX record) that emanated from an instance of the IoT model streamer.Amazon.Fire_TV_Gen_3.

    In our lab, we infected our streamer.Amazon.Fire_TV_Gen_3 with a cryptominer and executed cryptomining from this device. To imitate the scanning activity typically performed by some botnets, we also scanned the network using Nmap. Accordingly, we labeled these malicious activities as (1) 'is executing cryptomining' or (2) 'being scanned by Nmap'. All of the remaining IPFIX records captured in our lab or on the home networks were labeled as 'assumed benign'.

    The multitude of real home networks, and the multitude of identical source devices, enable using this dataset for quantitative evaluation of (collaborative) anomaly/attack detection methods, especially for the IoT.
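As a rough illustration of the labeling scheme described above, the sketch below counts labels and tallies malicious flows per network. The record layout and the field names `network` and `label` are assumptions made for this example; the actual column names in the dataset may differ.

```python
from collections import Counter

# Hypothetical flow records; field names are illustrative, not taken from the dataset.
flows = [
    {"network": "home_1", "label": "assumed benign"},
    {"network": "home_2", "label": "assumed benign"},
    {"network": "lab", "label": "is executing cryptomining"},
    {"network": "lab", "label": "being scanned by Nmap"},
    {"network": "lab", "label": "assumed benign"},
]

# The two malicious labels named in the dataset description.
MALICIOUS = {"is executing cryptomining", "being scanned by Nmap"}

# Count how many flows carry each label.
label_counts = Counter(flow["label"] for flow in flows)

# Count malicious flows per network.
malicious_by_network = Counter(
    flow["network"] for flow in flows if flow["label"] in MALICIOUS
)
```

In this toy data only the lab network contains malicious flows, which mirrors the description: the attacks were executed in the lab, while home-network traffic is assumed benign.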

  2. TCP FIN Flood and Zbassocflood Dataset

    • zenodo.org
    • ieee-dataport.org
    • +1more
    Updated Jan 14, 2021
    Cite
    Deris Stiawan; Dimas Wahyudi; Ahmad Heryanto; Tri Wanda Septian; Johan Wahyudi; Riki Andika; Meilinda Eka Suryani (2021). TCP FIN Flood and Zbassocflood Dataset [Dataset]. http://doi.org/10.5281/zenodo.4431541
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Deris Stiawan; Dimas Wahyudi; Ahmad Heryanto; Tri Wanda Septian; Johan Wahyudi; Riki Andika; Meilinda Eka Suryani
    Description

    The Development of an Internet of Things (IoT) Network Traffic Dataset with Simulated Attack Data.

    Abstract— This research focuses on the requirements for and the creation of an intrusion detection system (IDS) dataset for an Internet of Things (IoT) network domain.

    A minimal-requirements Internet of Things (IoT) network system was built to produce a dataset according to IDS testing needs for IoT security. Testing was performed with 12 scenarios and resulted in 24 datasets consisting of normal, attack, and combined normal-attack traffic data. Testing focused on three denial of service (DoS) and distributed denial of service (DDoS) attacks, namely “finish” (FIN) flood, User Datagram Protocol (UDP) flood, and Zbassocflood/association flood, using two communication protocols, IEEE 802.11 (WiFi) and IEEE 802.15.4 (ZigBee). Preprocessing yielded 95 attributes for the WiFi datasets and 64 attributes for the Xbee datasets.

    TCP FIN Flood Attack Pattern Recognition on Internet of Things with Rule Based Signature Analysis

    Abstract: This research focuses on TCP FIN flood attack pattern recognition in an Internet of Things (IoT) network using a rule-based signature analysis method. The dataset was collected under three scenarios: normal, attack, and normal-attack. The TCP FIN flood attack pattern is identified and recognized by observing and analyzing packet attributes from the raw data (pcap) using feature extraction and feature selection methods. Further testing was conducted using Snort as an IDS. The results of the confusion matrix detection rate evaluation against Snort as an IDS show the average percentage of the precision level.

    Citing
    Citation data : "TCP FIN Flood Attack Pattern Recognition on Internet of Things with Rule Based Signature Analysis" - https://online-journals.org/index.php/i-joe/article/view/9848

    @article{article,
      author = {Stiawan, Deris and Wahyudi, Dimas and Heryanto, Ahmad and Sahmin, Samsuryadi and Idris, Yazid and Muchtar, Farkhana and Alzahrani, Mohammed and Budiarto, Rahmat},
      year = {2019},
      month = {04},
      pages = {124},
      title = {TCP FIN Flood Attack Pattern Recognition on Internet of Things with Rule Based Signature Analysis},
      volume = {15},
      journal = {International Journal of Online and Biomedical Engineering (iJOE)},
      doi = {10.3991/ijoe.v15i07.9848}
    }

    Features Extraction on IoT Intrusion Detection System Using Principal Components Analysis (PCA)

    Feature extraction addresses the problem of finding the most efficient and comprehensive set of features. A Principal Component Analysis (PCA) feature extraction algorithm is applied to build an effective intrusion detection method. This paper uses PCA for feature extraction in an intrusion detection system, with the aim of improving the accuracy and precision of detection. The impact of feature extraction on attack detection was examined. Experiments were conducted on a network traffic dataset created from an Internet of Things (IoT) testbed network topology, and the results show that the detection accuracy reaches 100 percent.
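The abstract above does not describe how the paper implements PCA, so the following is only a minimal sketch of the general idea: centre the data, build the covariance matrix, and find the first principal direction. It uses power iteration rather than a full eigendecomposition, and the function name and the sample points are made up for illustration.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    Uses power iteration on the sample covariance matrix; this is an
    illustrative choice, not the method used in the cited paper.
    """
    n, d = len(rows), len(rows[0])
    # Centre each feature on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeatedly multiply by cov and renormalise.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project the centred rows onto the principal direction.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

# Made-up 2-feature samples lying close to the line y = 2x.
samples = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
direction, scores = first_principal_component(samples)
```

For these points the recovered direction has a slope close to 2, so a single score per sample retains most of the variance of the two original features.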

    Citing
    Citation data : "Features Extraction on IoT Intrusion Detection System Using Principal Components Analysis (PCA)" - https://ieeexplore.ieee.org/document/9251292

    @inproceedings{inproceedings,
      author = {Sharipuddin, and Purnama, Benni and Kurniabudi, Kurniabudi and Winanto, Eko and Stiawan, Deris and Hanapi, Darmawiiovo and Idris, Mohd and Budiarto, Rahmat},
      year = {2020},
      month = {10},
      pages = {114-118},
      title = {Features Extraction on IoT Intrusion Detection System Using Principal Components Analysis (PCA)},
      doi = {10.23919/EECSI50503.2020.9251292}
    }

  3. Bibliometric Analysis IoT Lighighweight Cryptography

    • figshare.com
    txt
    Updated Oct 25, 2023
    Cite
    Zenith Dewamuni; Bharanidharan Shanmugam; Sami Azam; Suresh N. Thennadil (2023). Bibliometric Analysis IoT Lighighweight Cryptography [Dataset]. http://doi.org/10.6084/m9.figshare.24434035.v1
    Dataset provided by
    figshare
    Authors
    Zenith Dewamuni; Bharanidharan Shanmugam; Sami Azam; Suresh N. Thennadil
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    With the rapid development of Internet of Things (IoT) technology, security has become increasingly important, since IoT devices collect a lot of personal information. IoT devices have resource constraints, which make traditional cryptographic algorithms ineffective for IoT security. Lightweight cryptographic algorithms are needed to overcome these limitations. Because the Raspberry Pi is popular and widely used in IoT applications, its security plays an important role in those applications. Analyzing existing works and understanding the leading countries, keywords, authors, journals, and citations is crucial to identifying research trends and patterns in Raspberry Pi security. To find this information, a bibliometric analysis was conducted using performance mapping, science mapping, and enrichment techniques. Our analysis included 979 Scopus articles, 214 WOS articles, and 144 IEEE Xplore articles published during 2015-2023, all of which were integrated and cleansed using the methods described in the methods section. We analyzed and visualized the bibliometric data using R, VOSviewer, and the bibliometrix library. We found that India is the leading research country; Archarya, B. and Bansod, G. are the most relevant authors; Internet of Things, light-weight cryptography, and cryptography are the most relevant keywords; and IEEE Access is the most significant journal. Developing a lightweight cryptographic algorithm for Raspberry Pi boards was identified as a significant future research focus.

  4. Online Retail II

    • kaggle.com
    Updated Apr 12, 2021
    Cite
    Bojan Tunguz (2021). Online Retail II [Dataset]. https://www.kaggle.com/tunguz/online-retail-ii/code
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Bojan Tunguz
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Source:

    Dr. Daqing Chen, Course Director: MSc Data Science. chend '@' lsbu.ac.uk, School of Engineering, London South Bank University, London SE1 0AA, UK.

    Data Set Information:

    This Online Retail II data set contains all the transactions occurring for a UK-based and registered non-store online retailer between 01/12/2009 and 09/12/2011. The company mainly sells unique all-occasion giftware. Many customers of the company are wholesalers.

    Attribute Information:

    • InvoiceNo: Invoice number. Nominal. A 6-digit integral number uniquely assigned to each transaction. If this code starts with the letter 'c', it indicates a cancellation.
    • StockCode: Product (item) code. Nominal. A 5-digit integral number uniquely assigned to each distinct product.
    • Description: Product (item) name. Nominal.
    • Quantity: The quantities of each product (item) per transaction. Numeric.
    • InvoiceDate: Invoice date and time. Numeric. The day and time when a transaction was generated.
    • UnitPrice: Unit price. Numeric. Product price per unit in sterling (£).
    • CustomerID: Customer number. Nominal. A 5-digit integral number uniquely assigned to each customer.
    • Country: Country name. Nominal. The name of the country where a customer resides.
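The attribute descriptions above suggest a simple way to separate cancellations from completed transactions. The sketch below assumes rows are already loaded as dictionaries keyed by the attribute names listed above; the sample values are invented, and matching the leading 'c' case-insensitively is an assumption.

```python
# Invented sample rows keyed by the attribute names from the description.
rows = [
    {"InvoiceNo": "536365", "StockCode": "85123", "Quantity": 6, "UnitPrice": 2.55},
    {"InvoiceNo": "C536379", "StockCode": "22633", "Quantity": -1, "UnitPrice": 27.50},
    {"InvoiceNo": "536366", "StockCode": "22633", "Quantity": 2, "UnitPrice": 1.85},
]

def is_cancellation(invoice_no):
    """An invoice code starting with 'c' marks a cancellation (case-insensitive here)."""
    return str(invoice_no).strip().lower().startswith("c")

# Keep only completed transactions.
completed = [row for row in rows if not is_cancellation(row["InvoiceNo"])]

# Revenue in sterling: quantity times unit price, summed over completed rows.
revenue = sum(row["Quantity"] * row["UnitPrice"] for row in completed)
```

Filtering cancellations before aggregating avoids counting reversed transactions as sales; whether cancellations should instead be netted against their original invoices depends on the analysis.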

    Relevant Papers:

    • Chen, D., Sain, S.L., and Guo, K. (2012), Data mining for the online retail industry: A case study of RFM model-based customer segmentation using data mining, Journal of Database Marketing and Customer Strategy Management, Vol. 19, No. 3, pp. 197-208.
    • Chen, D., Guo, K. and Ubakanma, G. (2015), Predicting customer profitability over time based on RFM time series, International Journal of Business Forecasting and Marketing Intelligence, Vol. 2, No. 1, pp. 1-18.
    • Chen, D., Guo, K., and Li, Bo (2019), Predicting Customer Profitability Dynamically over Time: An Experimental Comparative Study, 24th Iberoamerican Congress on Pattern Recognition (CIARP 2019), Havana, Cuba, 28-31 Oct, 2019.
    • Laha Ale, Ning Zhang, Huici Wu, Dajiang Chen, and Tao Han, Online Proactive Caching in Mobile Edge Computing Using Bidirectional Deep Recurrent Neural Network, IEEE Internet of Things Journal, Vol. 6, Issue 3, pp. 5520-5530, 2019.
    • Rina Singh, Jeffrey A. Graves, Douglas A. Talbert, William Eberle, Prefix and Suffix Sequential Pattern Mining, Industrial Conference on Data Mining 2018: Advances in Data Mining. Applications and Theoretical Aspects, pp. 309-324, 2018.


  5. Measured Radio Map Dataset with Multi-radiation Sources of Urban Scenarios (80m×105m)

    • data.mendeley.com
    Updated Apr 29, 2025
    Cite
    Qiuming Zhu (2025). Measured Radio Map Dataset with Multi-radiation Sources of Urban Scenarios (80m×105m) [Dataset]. http://doi.org/10.17632/9wfy8pcxdb.1
    Authors
    Qiuming Zhu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The radio map, or spectrum environment map (SEM), visualizes the otherwise invisible electromagnetic spectrum and is vital for the monitoring, management, and security of spectrum resources in cognitive radio (CR) networks. It is useful for abnormal spectral activity detection, radiation source localization, spectrum resource management, etc. This project presents a measured radio map dataset for an urban scenario with multiple radiation sources, aiming to address the shortage of open radio map datasets for realistic multi-source dynamic scenarios. We used a spectral signal receiving system to measure the signal intensity of multiple radiation sources in the urban scene. This project includes two datasets: 1) raw radio map measurement data (30 MHz, 115 MHz, and 2 GHz) in .csv format, with entries such as longitude, latitude, altitude, start and end frequencies, frequency interval, number of acquisition points, and signal strength; 2) raw spectrum tensor data (30 MHz, 115 MHz, and 2 GHz) in .mat format.

    More details about the construction of the spectrum map and dataset can be found in the following references.
    [1] Q. Zhu et al., "DEMO Abstract: An UAV-based 3D Spectrum Real-time Mapping System," 2022 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), New York, NY, USA, 2022, pp. 1-2.
    [2] J. Wang et al., "Sparse Bayesian Learning-Based Hierarchical Construction for 3D Radio Environment Maps Incorporating Channel Shadowing," in IEEE Transactions on Wireless Communications, vol. 23, no. 10, pp. 14560-14574, Oct. 2024.
    [3] Q. Gao et al., "Time-Variant Radio Map Reconstruction with Optimized Distributed Sensors in Dynamic Spectrum Environments," IEEE Internet of Things Journal, early access, Feb. 2025, doi: 10.1109/JIOT.2025.3545542.

  6. Channel state information data of animal crossings on rural roads

    • ieee-dataport.org
    Updated Jan 10, 2025
    Cite
    Samuel Ducca (2025). Channel state information data of animal crossings on rural roads [Dataset]. https://ieee-dataport.org/documents/channel-state-information-data-animal-crossings-rural-roads
    Authors
    Samuel Ducca
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    persons and vehicles in rural environments.

  7. Network traffic datasets with novel extended IP flow called NetTiSA flow

    • data.niaid.nih.gov
    Updated Apr 18, 2024
    Cite
    Karel Hynek (2024). Network traffic datasets with novel extended IP flow called NetTiSA flow [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8301042
    Dataset provided by
    Karel Hynek
    Josef Koumar
    Jaroslav Pešek
    Tomáš Čejka
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Network traffic datasets with novel extended IP flow called NetTiSA flow

    Datasets were created for the paper: NetTiSA: Extended IP Flow with Time-series Features for Universal Bandwidth-constrained High-speed Network Traffic Classification -- Josef Koumar, Karel Hynek, Jaroslav Pešek, Tomáš Čejka -- which is published in The International Journal of Computer and Telecommunications Networking https://doi.org/10.1016/j.comnet.2023.110147

    Please cite the usage of our datasets as:

    Josef Koumar, Karel Hynek, Jaroslav Pešek, Tomáš Čejka, "NetTiSA: Extended IP flow with time-series features for universal bandwidth-constrained high-speed network traffic classification", Computer Networks, Volume 240, 2024, 110147, ISSN 1389-1286

    @article{KOUMAR2024110147,
      title = {NetTiSA: Extended IP flow with time-series features for universal bandwidth-constrained high-speed network traffic classification},
      journal = {Computer Networks},
      volume = {240},
      pages = {110147},
      year = {2024},
      issn = {1389-1286},
      doi = {https://doi.org/10.1016/j.comnet.2023.110147},
      url = {https://www.sciencedirect.com/science/article/pii/S1389128623005923},
      author = {Josef Koumar and Karel Hynek and Jaroslav Pešek and Tomáš Čejka}
    }

    This Zenodo repository contains 23 datasets created from 15 well-known published datasets, which are cited in the table below. Each dataset contains the NetTiSA flow feature vector.

    NetTiSA flow feature vector

    The novel extended IP flow called NetTiSA (Network Time Series Analysed) flow contains a universal bandwidth-constrained feature vector consisting of 20 features. We divide the NetTiSA flow classification features into three groups by computation. The first group is based on classical bidirectional flow information: the numbers of transferred bytes and packets. The second group contains statistical and time-based features calculated using time-series analysis of the packet sequences. The third group can be computed from the previous groups (i.e., on the flow collector) and improves the classification performance without any impact on the telemetry bandwidth.

    Flow features

    The flow features are:

    Packets is the number of packets in the direction from the source to the destination IP address.

    Packets in reverse order is the number of packets in the direction from the destination to the source IP address.

    Bytes is the size of the payload in bytes transferred in the direction from the source to the destination IP address.

    Bytes in reverse order is the size of the payload in bytes transferred in the direction from the destination to the source IP address.

    Statistical and Time-based features

    These features are exported in the extended part of the flow. All of them can be computed (exactly or approximately) by stream-wise computation, which is necessary for keeping memory requirements low. This second feature set contains the following features:

    Mean represents mean of the payload lengths of packets

    Min is the minimal value from payload lengths of all packets in a flow

    Max is the maximum value from payload lengths of all packets in a flow

    Standard deviation is a measure of the variation of payload lengths from the mean payload length

    Root mean square is the measure of the magnitude of payload lengths of packets

    Average dispersion is the average absolute difference between each payload length of the packet and the mean value

    Kurtosis is the measure describing the extent to which the tails of a distribution differ from the tails of a normal distribution

    Mean of relative times is the mean of the relative times, a sequence defined as \(st = \{t_1 - t_1, t_2 - t_1, \dots, t_n - t_1\}\).

    Mean of time differences is the mean of the time differences, a sequence defined as \(dt = \{ t_j - t_i \mid j = i + 1,\ i \in \{1, 2, \dots, n - 1\} \}\).

    Min from time differences is the minimal value from all time differences, i.e., min space between packets.

    Max from time differences is the maximum value from all time differences, i.e., max space between packets.

    Time distribution describes the deviation of time differences between individual packets within the time series. The feature is computed by the following equation: \(tdist = \frac{ \frac{1}{n-1} \sum_{i=1}^{n-1} \left| \mu_{dt_{n-1}} - dt_i \right| }{ \frac{1}{2} \left( \max\left(dt_{n-1}\right) - \min\left(dt_{n-1}\right) \right) }\)

    Switching ratio represents a value change ratio (switching) between payload lengths. The switching ratio is computed by the equation \(sr = \frac{s_n}{\frac{1}{2} (n - 1)}\), where \(s_n\) is the number of switches.
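The time-based definitions above can be sketched in a few lines of Python. This is an illustrative batch computation, not the stream-wise implementation in ipfixprobe. Two points are assumptions of this sketch: a "switch" is counted whenever two consecutive payload lengths differ, and `tdist` is set to 0 when all time differences are equal (to avoid dividing by zero).

```python
def time_features(timestamps):
    """Compute the time-based features for one flow from its packet timestamps."""
    if len(timestamps) < 2:
        raise ValueError("at least two packets are needed")
    t1 = timestamps[0]
    st = [t - t1 for t in timestamps]                          # relative times
    dt = [b - a for a, b in zip(timestamps, timestamps[1:])]   # n - 1 time differences
    mean_dt = sum(dt) / len(dt)
    # Numerator of tdist: average absolute deviation of the time differences.
    avg_abs_dev = sum(abs(mean_dt - d) for d in dt) / len(dt)
    # Denominator of tdist: half the range of the time differences.
    half_range = 0.5 * (max(dt) - min(dt))
    tdist = avg_abs_dev / half_range if half_range else 0.0    # zero-range guard is an assumption
    return {
        "mean_relative_time": sum(st) / len(st),
        "mean_time_diff": mean_dt,
        "min_time_diff": min(dt),
        "max_time_diff": max(dt),
        "time_distribution": tdist,
    }

def switching_ratio(payload_lengths):
    """sr = s_n / (0.5 * (n - 1)); a switch is assumed to be a change between neighbours."""
    n = len(payload_lengths)
    if n < 2:
        raise ValueError("at least two packets are needed")
    switches = sum(1 for a, b in zip(payload_lengths, payload_lengths[1:]) if a != b)
    return switches / (0.5 * (n - 1))

features = time_features([0.0, 1.0, 3.0, 6.0])
sr = switching_ratio([100, 100, 200, 100])
```

For the timestamps `[0, 1, 3, 6]` the time differences are `[1, 2, 3]`, so the mean difference is 2 and the half-range is 1.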
    

    Features computed at the collector

    The third set contains features that are computed from the previous two groups prior to classification. Therefore, they do not influence the network telemetry size and their computation does not put additional load on resource-constrained flow monitoring probes. The NetTiSA flow combined with this feature set is called the Enhanced NetTiSA flow and contains the following features:

    Max minus min is the difference between minimum and maximum payload lengths

    Percent deviation is the dispersion of the average absolute difference to the mean value

    Variance is the spread measure of the data from its mean

    Burstiness is the degree of peakedness in the central part of the distribution

    Coefficient of variation is a dimensionless quantity that compares the dispersion of a time series to its mean value and is often used to compare the variability of different time series that have different units of measurement

    Directions describe a percentage ratio of packet direction computed as \(\frac{d_1}{d_1 + d_0}\), where \(d_1\) is the number of packets in the direction from source to destination IP address and \(d_0\) is the number in the opposite direction. Both \(d_1\) and \(d_0\) are part of the classical bidirectional flow.

    Duration is the duration of the flow
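Because the collector-side features depend only on values already exported by the probe, they can be derived with simple arithmetic. The sketch below covers a subset of them; the function and parameter names are hypothetical, and reading "percent deviation" as average dispersion divided by the mean is an assumption based on the wording above. Burstiness and duration are left out because the description does not give enough detail to compute them.

```python
def enhanced_features(mean, std, avg_dispersion, min_len, max_len, packets, packets_rev):
    """Derive collector-side features from already exported flow statistics.

    Parameter names are illustrative. `packets` is d_1 (source to destination)
    and `packets_rev` is d_0 (destination to source).
    """
    return {
        "max_minus_min": max_len - min_len,
        # Assumed reading: average absolute difference relative to the mean.
        "percent_deviation": avg_dispersion / mean if mean else 0.0,
        "variance": std ** 2,
        "coefficient_of_variation": std / mean if mean else 0.0,
        # d_1 / (d_1 + d_0), as given in the description.
        "directions": packets / (packets + packets_rev),
    }

derived = enhanced_features(
    mean=100.0, std=20.0, avg_dispersion=15.0,
    min_len=40, max_len=160, packets=30, packets_rev=10,
)
```

None of these values requires access to individual packets, which is why they add no telemetry bandwidth.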

    The NetTiSA flow is implemented in the IP flow exporter ipfixprobe.

    Description of dataset files

    The following table describes each dataset file:

    | File name | Detection problem | Citation of the original raw dataset |
    |---|---|---|
    | botnet_binary.csv | Binary detection of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014. |
    | botnet_multiclass.csv | Multi-class classification of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014. |
    | cryptomining_design.csv | Binary detection of cryptomining; the design part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022 |
    | cryptomining_evaluation.csv | Binary detection of cryptomining; the evaluation part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022 |
    | dns_malware.csv | Binary detection of malware DNS | Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021. |
    | doh_cic.csv | Binary detection of DoH | Mohammadreza MontazeriShatoori et al. Detection of doh tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020 |
    | doh_real_world.csv | Binary detection of DoH | Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022 |
    | dos.csv | Binary detection of DoS | Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019. |
    | edge_iiot_binary.csv | Binary detection of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022. |
    | edge_iiot_multiclass.csv | Multi-class classification of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022. |
    | https_brute_force.csv | Binary detection of HTTPS Brute Force | Jan Luxemburk et al. HTTPS Brute-force dataset with extended network flows, November 2020 |
    | ids_cic_binary.csv | Binary detection of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp, 1:108–116, 2018. |
    | ids_cic_multiclass.csv | Multi-class classification of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp, 1:108–116, 2018. |
    | unsw_binary.csv | Binary detection of intrusion in IDS | Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015. |
    | unsw_multiclass.csv | Multi-class classification of intrusion in IDS | Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015. |
    | iot_23.csv | Binary detection of IoT malware | Sebastian Garcia et al. IoT-23: A labeled dataset with malicious and benign IoT network traffic, January 2020. More details here https://www.stratosphereips.org/datasets-iot23 |
    | ton_iot_binary.csv | Binary detection of IoT malware | Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021 |
    | ton_iot_multiclass.csv | Multi-class classification of IoT malware | Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. |

  8. BED: Biometric EEG dataset

    • zenodo.org
    • producciocientifica.uv.es
    Updated Apr 20, 2022
    Cite
    Pablo Arnau-González; Stamos Katsigiannis; Miguel Arevalillo-Herráez; Naeem Ramzan (2022). BED: Biometric EEG dataset [Dataset]. http://doi.org/10.5281/zenodo.4309472
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Pablo Arnau-González; Stamos Katsigiannis; Miguel Arevalillo-Herráez; Naeem Ramzan
    Description

    The BED dataset

    Version 1.0.0

    Please cite as: Arnau-González, P., Katsigiannis, S., Arevalillo-Herráez, M., Ramzan, N., "BED: A new dataset for EEG-based biometrics", IEEE Internet of Things Journal, vol. 8, no. 15, pp. 12219 - 12230, 2021.

    Disclaimer

    While every care has been taken to ensure the accuracy of the data included in the BED dataset, the authors and the University of the West of Scotland, Durham University, and Universitat de València do not provide any guaranties and disclaim all responsibility and all liability (including without limitation, liability in negligence) for all expenses, losses, damages (including indirect or consequential damage) and costs which you might incur as a result of the provided data being inaccurate or incomplete in any way and for any reason. 2020, University of the West of Scotland, Scotland, United Kingdom.

    Contact

    For inquiries regarding the BED dataset, please contact:

    1. Dr Pablo Arnau-González, arnau.pablo [*AT*] gmail.com
    2. Dr Stamos Katsigiannis, stamos.katsigiannis [*AT*] durham.ac.uk
    3. Prof. Miguel Arevalillo-Herráez, miguel.arevalillo [*AT*] uv.es
    4. Prof. Naeem Ramzan, Naeem.Ramzan [*AT*] uws.ac.uk

    Dataset summary

    BED (Biometric EEG Dataset) is a dataset specifically designed to test EEG-based biometric approaches that use relatively inexpensive consumer-grade devices, more specifically the Emotiv EPOC+ in this case. This dataset includes EEG responses from 21 subjects to 12 different stimuli, across 3 different chronologically disjointed sessions. We have also considered stimuli aimed to elicit different affective states, so as to facilitate future research on the influence of emotions on EEG-based biometric tasks. In addition, we provide a baseline performance analysis to outline the potential of consumer-grade EEG devices for subject identification and verification. It must be noted that, in this work, EEG data were acquired in a controlled environment in order to reduce the variability in the acquired data stemming from external conditions.

    The stimuli include:

    • Images selected to elicit specific emotions
    • Mathematical computations (2-digit additions)
    • Resting-state with eyes closed
    • Resting-state with eyes open
    • Visual Evoked Potentials at 2, 5, 7, 10 Hz - Standard checker-board pattern with pattern reversal
    • Visual Evoked Potentials at 2, 5, 7, 10 Hz - Flashing with a plain colour, set as black

    For more details regarding the experimental protocol and the design of the dataset, please refer to the associated publication: Arnau-González, P., Katsigiannis, S., Arevalillo-Herráez, M., Ramzan, N., "BED: A new dataset for EEG-based biometrics", IEEE Internet of Things Journal, 2021. (Under review)

    Dataset structure and contents

    The BED dataset contains EEG recordings from 21 subjects, acquired during 3 similar sessions for each subject. The sessions were spaced one week apart from each other.

    The BED dataset includes:

    • The raw EEG recordings with no pre-processing and the log files of the experimental procedure, in text format
    • The EEG recordings with no pre-processing, segmented, structured and annotated according to the presented stimuli, in Matlab format
    • The features extracted from each EEG segment, as described in the associated publication

    The dataset is organised in 3 folders:

    • RAW
    • RAW_PARSED
    • Features

    RAW/ Contains the RAW files
    RAW/sN/ Contains the RAW files associated with subject N
    Each folder sN is composed of the following files:
    - sN_s1.csv, sN_s2.csv, sN_s3.csv -- Files containing the EEG recordings for subject N and session 1, 2, and 3, respectively. These files contain 39 columns:
    COUNTER INTERPOLATED F3 FC5 AF3 F7 T7 P7 O1 O2 P8 T8 F8 AF4 FC6 F4 ...UNUSED DATA... UNIX_TIMESTAMP
    - subject_N_session_1_time_X.log, subject_N_session_2_time_X.log, subject_N_session_3_time_X.log -- Log files containing the sequence of events for the subject N and the session 1,2, and 3 respectively.

    RAW_PARSED/
    Contains Matlab files named sN_sM.mat. Each file contains the recordings for subject N in session M and is composed of two variables:
    - recording: size (time@256Hz x 17), Columns: COUNTER INTERPOLATED F3 FC5 AF3 F7 T7 P7 O1 O2 P8 T8 F8 AF4 FC6 F4 UNIX_TIMESTAMP
    - events: cell array with size (events x 3) START_UNIX END_UNIX ADDITIONAL_INFO
    START_UNIX is the UNIX timestamp at which the event starts
    END_UNIX is the UNIX timestamp at which the event ends
    ADDITIONAL_INFO contains a struct with additional information about the specific event: for the images, the expected score and the voted score; for the cognitive task, the input; for the VEP, the pattern and the frequency; etc.
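
As a rough sketch of how the two variables relate, the Python snippet below cuts the recording into one segment per event using the timestamps. It assumes both variables have already been loaded into plain Python sequences (loading the .mat files, for instance with scipy.io.loadmat, is left out), that the timestamps are in seconds, and that event boundaries are inclusive. The function names are hypothetical.

```python
SAMPLING_RATE_HZ = 256   # stated in the description of the recording variable
TIMESTAMP_COLUMN = 16    # UNIX_TIMESTAMP is the last of the 17 columns


def segment_by_events(recording, events):
    """Return one (additional_info, samples) pair per event.

    recording: sequence of 17-value rows, ordered in time.
    events: sequence of (start_unix, end_unix, additional_info) triples.
    Boundaries are treated as inclusive, which is an assumption.
    """
    segments = []
    for start_unix, end_unix, info in events:
        samples = [
            row for row in recording
            if start_unix <= row[TIMESTAMP_COLUMN] <= end_unix
        ]
        segments.append((info, samples))
    return segments


def expected_sample_count(start_unix, end_unix):
    """Approximate number of samples an event spans at 256 Hz (seconds assumed)."""
    return round((end_unix - start_unix) * SAMPLING_RATE_HZ)
```

Comparing the length of each segment with `expected_sample_count` is one simple way to spot events with missing samples.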

    Features/
    Features/Identification
    Features/Identification/[ARRC|MFCC|SPEC]/: Each of these folders contains the extracted features, ready for classification, for each of the stimuli. Each file is composed of two variables: "feat", the feature matrix, and "Y", the label matrix.
    - feat: N x number of features
    - Y: N x 2 (the #subject and the #session)
    - INFO: Contains details about the event, the same as ADDITIONAL_INFO
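
Because Y carries both the subject and the session, the rows of feat can be partitioned by session. The Python sketch below shows one possible split; it is an illustration only, not the evaluation protocol of the associated publication, and it assumes feat and Y have already been converted to plain Python lists. The function name is hypothetical.

```python
def split_by_session(feat, labels, train_sessions):
    """Split feature rows into train/test parts using the session label.

    feat: list of N feature rows.
    labels: list of N (subject, session) pairs, i.e. the rows of Y.
    train_sessions: set of session numbers kept for training.
    Returns two lists of (feature_row, subject) pairs.
    """
    if len(feat) != len(labels):
        raise ValueError("feat and labels must have the same number of rows")
    train, test = [], []
    for row, (subject, session) in zip(feat, labels):
        target = train if session in train_sessions else test
        target.append((row, subject))
    return train, test
```

For example, `split_by_session(feat, Y, {1})` keeps session 1 for training and leaves the other sessions for testing.
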
    Features/Verification: This folder contains 3 files, each holding a different set of extracted features. Each file contains one struct array with the following fields:
    - data: the time-series features, as described in the paper
    - y: the #subject
    - stimuli: the stimuli by name
    - session: the #session
    - INFO: Contains details about the event

    The features are provided in sequential order, so consecutive indices (1, 2, etc.) are consecutive in time if they belong to the same stimulus.
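
The ordering property can be used to rebuild per-stimulus sequences. The Python sketch below groups records while preserving their original order; it assumes the struct array has been converted to a list of dicts with the field names listed above, and that a sequence should not span different subjects or sessions (an assumption that goes beyond the description). The function name is hypothetical.

```python
from collections import defaultdict


def group_sequences(records):
    """Group feature records by (subject, session, stimulus), keeping order.

    records: list of dicts with the keys "data", "y", "stimuli" and
    "session", mirroring the fields of the verification struct array.
    Returns a dict mapping each key to its time-ordered list of "data".
    """
    groups = defaultdict(list)
    for record in records:
        key = (record["y"], record["session"], record["stimuli"])
        groups[key].append(record["data"])
    return dict(groups)
```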

    Additional information

    For additional information regarding the creation of the BED dataset, please refer to the associated publication: Arnau-González, P., Katsigiannis, S., Arevalillo-Herráez, M., Ramzan, N., "BED: A new dataset for EEG-based biometrics", IEEE Internet of Things Journal, vol. 8, no. 15, pp. 12219 - 12230, 2021.

  9. ESC Dataset

    • zenodo.org
    zip
    Updated Sep 16, 2021
    Schahram Dustdar; Pablo Fernandez; Jose Maria Garcia; Antonio Ruiz-Cortes (2021). ESC Dataset [Dataset]. http://doi.org/10.5281/zenodo.4267996
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Schahram Dustdar; Pablo Fernandez; Jose Maria Garcia; Antonio Ruiz-Cortes
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ESC Dataset presented in the paper "Elastic Smart Contracts in Blockchains" (doi: 10.1109/JAS.2021.1004222)

    IEEE/CAA Journal of Automatica Sinica (Volume: 8, Issue: 12, December 2021)

    In this paper, we deal with questions related to blockchains in complex Internet of Things (IoT)-based ecosystems. Such ecosystems are typically composed of IoT devices, edge devices, cloud computing software services, as well as people, who are decision makers in scenarios such as smart cities. Many decisions related to analytics can be based on data coming from IoT sensors, software services, and people. However, they are typically based on different levels of abstraction and granularity. This poses a number of challenges when multiple blockchains are used together with smart contracts. This work proposes to apply our concept of elasticity to smart contracts and thereby enable analytics in and between multiple blockchains in the context of IoT. We propose a reference architecture for Elastic Smart Contracts and evaluate the approach in a smart city scenario, discussing the benefits in terms of performance and self-adaptability of our solution.


Yair Meidan; Dan Avraham; Hanan Libhaber; Asaf Shabtai (2022). CADeSH Dataset: Collaborative Anomaly Detection for Smart Homes [Dataset]. http://doi.org/10.5281/zenodo.6406052

CADeSH Dataset: Collaborative Anomaly Detection for Smart Homes

Dataset updated
Jul 29, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Yair Meidan; Dan Avraham; Hanan Libhaber; Asaf Shabtai
Description

Dataset used for quantitative evaluation in the paper:

Y. Meidan, D. Avraham, H. Libhaber and A. Shabtai, "CADeSH: Collaborative Anomaly Detection for Smart Homes," in IEEE Internet of Things Journal, 2022, doi: 10.1109/JIOT.2022.3194813.

This is a table of flow-level traffic data, captured continuously over a period of 21 days from five real home networks subscribed to a smart home security service, and from our lab at Ben-Gurion University of the Negev. The security service provider shared with us these network traffic flows, plus the related DNS requests and responses, and reputation intelligence on the destination IP addresses. Each instance in this dataset represents an outbound network traffic flow (in the form of an IPFIX) that emanated from an instance of the IoT model streamer.Amazon.Fire_TV_Gen_3.

In our lab, we infected our streamer.Amazon.Fire_TV_Gen_3 with a cryptominer and executed cryptomining from this device. To imitate the scanning activity typically performed by some botnets, we also scanned the network using Nmap. Accordingly, we labeled these malicious activities as (1) 'is executing cryptomining' or (2) 'being scanned by Nmap'. All of the remaining IPFIXs captured in our lab or on the home networks were labeled as 'assumed benign'.

The multitude of real home networks, and the multitude of identical source devices, enable using this dataset for quantitative evaluation of (collaborative) anomaly/attack detection methods, especially for the IoT.
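
For evaluating a detector against these labels, the three classes are often collapsed into benign versus malicious. The Python sketch below shows one way to do that; the label strings are copied from the description above, while their exact spelling in the published table and the way the labels are read from it are assumptions. The function names are hypothetical.

```python
from collections import Counter

# Label strings as quoted in the description; the exact spelling and the
# name of the label column in the published table are assumptions.
MALICIOUS_LABELS = {"is executing cryptomining", "being scanned by Nmap"}
BENIGN_LABEL = "assumed benign"


def to_binary(label):
    """Return 1 for a malicious flow label, 0 for 'assumed benign'."""
    if label in MALICIOUS_LABELS:
        return 1
    if label == BENIGN_LABEL:
        return 0
    raise ValueError(f"unknown label: {label!r}")


def label_summary(labels):
    """Count flows per label and return the fraction that is malicious."""
    counts = Counter(labels)
    malicious = sum(counts[name] for name in MALICIOUS_LABELS)
    total = sum(counts.values())
    return counts, (malicious / total if total else 0.0)
```

Raising an error on an unknown label, rather than silently treating it as benign, keeps spelling mismatches with the real file visible.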
