65 datasets found
  1. Vietnamese Spam Post in Social Network

    • kaggle.com
    zip
    Updated Dec 24, 2024
    Cite
    Victor Howard 2 (2024). Vietnamese Spam Post in Social Network [Dataset]. https://www.kaggle.com/datasets/victorhoward2/vietnamese-spam-post-in-social-network/code
    Available download formats: zip (121396 bytes)
    Dataset updated
    Dec 24, 2024
    Authors
    Victor Howard 2
    Description

    The "Vietnamese Spam Post in Social Network" dataset contains textual data collected from social media platforms. This dataset is specifically designed for spam detection tasks and includes labeled posts categorized as either spam or non-spam. Each post is written in Vietnamese, making it a valuable resource for natural language processing (NLP) research focused on the Vietnamese language. The dataset is ideal for training and evaluating machine learning models in tasks such as spam classification and text filtering in social networking environments.

  2. Training CNNs with Low-Rank Filters for Efficient Image Classification:...

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    • +1 more
    Updated Jan 24, 2020
    Cite
    Ioannou, Yani (2020). Training CNNs with Low-Rank Filters for Efficient Image Classification: Trained Models [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_635057
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    University of Cambridge
    Authors
    Ioannou, Yani
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Models from experiments referenced in the paper "Training CNNs with Low-Rank Filters for Efficient Image Classification", https://arxiv.org/abs/1511.06744

    Model names differ from those in the paper, but the CSV file for each set of experiments maps the paper's name for each model to the actual model name used here:

    cifarma.csv: Network-in-Network CIFAR10 Models

    mitma.csv: MIT Places Models

    googlenetma.csv: GoogLeNet ILSVRC2012 Models

    vggma.csv: VGG-11 ILSVRC2012 Models

  3. Automatic plankton image classification - can capsules and filters help...

    • zenodo.org
    • data.niaid.nih.gov
    text/x-python, zip
    Updated Jan 14, 2021
    Cite
    Rene-Marcel Plonus; Jan Conradt; André Harmer; Silke Janßen; Jens Floeter (2021). Automatic plankton image classification - can capsules and filters help coping with data set shift? [Dataset]. http://doi.org/10.5281/zenodo.4431509
    Available download formats: text/x-python, zip
    Dataset updated
    Jan 14, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Rene-Marcel Plonus; Jan Conradt; André Harmer; Silke Janßen; Jens Floeter
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set is related to the article 'Automatic plankton image classification - can capsules and filters help coping with data set shift?' published in 'Limnology and Oceanography: Methods' by Plonus et al. (2021).

    The images belong to the training set used to train the models in the aforementioned paper (training_) and to three additional data sets that were used to evaluate the performance of the trained models in application mode (fs446_; fs466_; fs534_). The Python script 'separate_files.py' can be used to move the images into separate folders for each data set and class.
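
    The prefix convention described above can be illustrated with a small sketch. This is not the authors' 'separate_files.py'; it only assumes that each file name starts with one of the four prefixes named in the description. The function name, the "unsorted" bucket, and the example file names are hypothetical, and how the class is encoded in a file name is not stated in the description, so it is not handled here.

```python
from collections import defaultdict

# Prefixes named in the dataset description; everything else is hypothetical.
DATASET_PREFIXES = ("training_", "fs446_", "fs466_", "fs534_")


def group_by_dataset(filenames):
    """Map each known prefix (without its trailing underscore) to its files."""
    groups = defaultdict(list)
    for name in filenames:
        for prefix in DATASET_PREFIXES:
            if name.startswith(prefix):
                groups[prefix.rstrip("_")].append(name)
                break
        else:
            # No known prefix matched; keep the file in a catch-all bucket.
            groups["unsorted"].append(name)
    return dict(groups)


# Made-up file names for illustration only.
example = ["training_0001.png", "fs446_0001.png", "notes.txt"]
print(group_by_dataset(example))
```

    Grouping names first, before moving anything on disk, makes it easy to check the result before files are relocated.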

  4. A Neural Network-Based Optimal Spatial Filter Design Method for Motor...

    • plos.figshare.com
    mp4
    Updated May 30, 2023
    Cite
    Ayhan Yuksel; Tamer Olmez (2023). A Neural Network-Based Optimal Spatial Filter Design Method for Motor Imagery Classification [Dataset]. http://doi.org/10.1371/journal.pone.0125039
    Available download formats: mp4
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Ayhan Yuksel; Tamer Olmez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this study, a novel spatial filter design method is introduced. Spatial filtering is an important processing step for feature extraction in motor imagery-based brain-computer interfaces. This paper introduces a new motor imagery signal classification method combined with spatial filter optimization. We simultaneously train the spatial filter and the classifier using a neural network approach. The proposed spatial filter network (SFN) is composed of two layers: a spatial filtering layer and a classifier layer. These two layers are linked to each other with non-linear mapping functions. The proposed method addresses two shortcomings of the common spatial patterns (CSP) algorithm. First, CSP aims to maximize the between-classes variance while ignoring the minimization of within-classes variances. Consequently, the features obtained using the CSP method may have large within-classes variances. Second, the maximizing optimization function of CSP increases the classification accuracy indirectly because an independent classifier is used after the CSP method. With SFN, we aimed to maximize the between-classes variance while minimizing within-classes variances and simultaneously optimizing the spatial filter and the classifier. To classify motor imagery EEG signals, we modified the well-known feed-forward structure and derived forward and backward equations that correspond to the proposed structure. We tested our algorithm on simple toy data. Then, we compared the SFN with conventional CSP and its multi-class version, called one-versus-rest CSP, on two data sets from BCI competition III. The evaluation results demonstrate that SFN is a good alternative for classifying motor imagery EEG signals with increased classification accuracy.

  5. Spam_ham_dataset

    • kaggle.com
    zip
    Updated Apr 14, 2024
    Cite
    Bilal Ahmad (2024). Spam_ham_dataset [Dataset]. https://www.kaggle.com/datasets/bilalahmad9593492/spam-ham-dataset
    Available download formats: zip (1954828 bytes)
    Dataset updated
    Apr 14, 2024
    Authors
    Bilal Ahmad
    Description

    Dataset

    This dataset was created by Bilal Ahmad

    Released under Other (specified in description)

    Contents

  6. Architecture of our convolutional neural network classification model.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Feb 11, 2019
    Cite
    Keenan, Jeremy D.; Gaynor, Bruce D.; Lietman, Thomas M.; Tadesse, Zerihun; Ryner, Alexander M.; Cotter, Sun Y.; Okada, Kazunori; Amza, Abdou; Kim, Matthew C.; Porco, Travis C. (2019). Architecture of our convolutional neural network classification model. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000157436
    Dataset updated
    Feb 11, 2019
    Authors
    Keenan, Jeremy D.; Gaynor, Bruce D.; Lietman, Thomas M.; Tadesse, Zerihun; Ryner, Alexander M.; Cotter, Sun Y.; Okada, Kazunori; Amza, Abdou; Kim, Matthew C.; Porco, Travis C.
    Description

    K denotes the number of filters in the first stage of the convolutional layers.

  7. Convolutional Neural Networks for Classifying Combinatorial Metamaterials

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Nov 8, 2022
    Cite
    Ryan van Mastrigt; Marjolein Dijkstra; Martin van Hecke; Corentin Coulais (2022). Convolutional Neural Networks for Classifying Combinatorial Metamaterials [Dataset]. http://doi.org/10.5281/zenodo.7071282
    Available download formats: zip
    Dataset updated
    Nov 8, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ryan van Mastrigt; Marjolein Dijkstra; Martin van Hecke; Corentin Coulais
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the training and test data, as well as the trained neural networks as used for the paper 'Machine Learning of Implicit Combinatorial Rules in Mechanical Metamaterials', as published in Physical Review Letters.

    In this paper, a neural network is used to classify each \(k \times k\) unit cell design of metamaterial M1 and M2 into one of two classes (C or I). Additionally, the performance of the trained networks is analysed in detail. A more detailed description of the contents of the dataset follows below.

    NeuralNetwork_train_and_test_data.zip

    This file contains the train and test data used to train the Convolutional Neural Networks (CNNs) of the paper. Each unit cell size has its own file, and is saved in a zipped numpy file type (.npz). It contains data for metamaterial M1 ("smiley_cube"), and metamaterial M2 classification (i) ("prek_xy") and (ii) ("unimodal_vs_oligomodal_inc_stripmodes").
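
    A .npz file is an ordinary zip archive whose members are .npy arrays, so the array names can be listed without loading any data. The sketch below uses only the standard library and builds a stand-in archive in memory; the member names "x_train" and "y_train" are hypothetical and are not taken from this dataset. With NumPy installed, `numpy.load(path).files` gives the same listing.

```python
import io
import zipfile


def list_npz_arrays(npz_file):
    """Return the array names stored in a .npz archive (a zip of .npy members)."""
    with zipfile.ZipFile(npz_file) as archive:
        # Each array is stored as "<name>.npy"; strip the 4-character suffix.
        return [name[:-4] for name in archive.namelist() if name.endswith(".npy")]


# Self-contained demo: build a tiny stand-in archive in memory.
# The member names are hypothetical and the members are empty placeholders.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("x_train.npy", b"")
    archive.writestr("y_train.npy", b"")
buffer.seek(0)
print(list_npz_arrays(buffer))
```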

    CNN_saves_kxk.zip

    This file contains the parameter configurations of the CNNs trained on \(k \times k\) unit cells for metamaterial M2 classification (ii). Classification (i) is denoted by an additional M2ii in the file name. Metamaterial M1 is denoted by an extra M1 in the file name. Every hyperparameter (number of filters nf, number of hidden neurons nh, learning rate lr) combination is saved separately. The neural networks can be loaded using Google's TensorFlow package in Python, specifically using the 'tf.keras.models.load_model' function.

  8. S1 File -

    • plos.figshare.com
    zip
    Updated Oct 11, 2024
    Cite
    Jun Tan; Jiamin Yuan; Xiaoyong Fu; Yilin Bai (2024). S1 File - [Dataset]. http://doi.org/10.1371/journal.pone.0302800.s001
    Available download formats: zip
    Dataset updated
    Oct 11, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jun Tan; Jiamin Yuan; Xiaoyong Fu; Yilin Bai
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Among the most common cancers, colorectal cancer (CRC) has a high death rate. The best way to screen for CRC is with a colonoscopy, which has been shown to lower the risk of the disease. As a result, computer-aided polyp classification techniques are applied to identify colorectal cancer. However, visually categorizing polyps is difficult since different polyps are captured under different lighting conditions. Unlike previous works, this article presents the Enhanced Scattering Wavelet Convolutional Neural Network (ESWCNN), a polyp classification technique that combines a Convolutional Neural Network (CNN) and the Scattering Wavelet Transform (SWT) to improve polyp classification performance. This method concatenates learnable image filters and wavelet filters on each input channel. The scattering wavelet filters can extract common spectral features with various scales and orientations, while the learnable filters can capture image spatial features that wavelet filters may miss. A network architecture for ESWCNN is designed based on these principles and trained and tested using colonoscopy datasets (two public datasets and one private dataset). An n-fold cross-validation experiment was conducted for three classes (adenoma, hyperplastic, serrated), achieving a classification accuracy of 96.4%, and 94.8% accuracy in two-class polyp classification (positive and negative). In the three-class classification, correct classification rates of 96.2% for adenomas, 98.71% for hyperplastic polyps, and 97.9% for serrated polyps were achieved. In the two-class experiment, the proposed method reached an average sensitivity of 96.7% with 93.1% specificity. Furthermore, we compare the performance of our model with state-of-the-art general classification models and commonly used CNNs. Six end-to-end models based on CNNs were trained using two datasets of video sequences. The experimental results demonstrate that the proposed ESWCNN method can effectively classify polyps with higher accuracy and efficacy compared to the state-of-the-art CNN models. These findings can provide guidance for future research in polyp classification.

  9. ITC-Net-Blend-60: A Comprehensive Dataset for Robust Network Traffic...

    • data.mendeley.com
    Updated May 23, 2024
    Cite
    Marziyeh Bayat (2024). ITC-Net-Blend-60: A Comprehensive Dataset for Robust Network Traffic Classification in Diverse Environments - Supplementary Materials [Dataset]. http://doi.org/10.17632/4sgt9tjs4w.6
    Dataset updated
    May 23, 2024
    Authors
    Marziyeh Bayat
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains supplementary material for the ITC-Net-Blend-60 dataset. It includes the full methodology document and Python scripts to filter background traffic and extract PCAP file properties.

  10. Spam Emails

    • kaggle.com
    zip
    Updated Oct 9, 2023
    Cite
    Abdallah Wagih Ibrahim (2023). Spam Emails [Dataset]. https://www.kaggle.com/datasets/abdallahwagih/spam-emails
    Available download formats: zip (212432 bytes)
    Dataset updated
    Oct 9, 2023
    Authors
    Abdallah Wagih Ibrahim
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Overview: This dataset contains a collection of emails, categorized into two classes: "Spam" and "Non-Spam" (often referred to as "Ham"). These emails have been carefully curated and labeled to aid in the development of spam email detection models. Whether you are interested in email filtering, natural language processing, or machine learning, this dataset can serve as a valuable resource for training and evaluation.

    Context: Spam emails continue to be a significant issue, with malicious actors attempting to deceive users with unsolicited, fraudulent, or harmful messages. This dataset is designed to facilitate research, development, and testing of algorithms and models aimed at accurately identifying and filtering spam emails, helping protect users from various threats.

    Content: The dataset includes the following features:
    - Message: The content of the email, including the subject line and message body.
    - Category: Categorizes each email as either "Spam" or "Ham" (Non-Spam).
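
    As a rough illustration of working with the two features listed above, the sketch below counts rows per Category using only the standard library. The column order, the header spelling, the lowercase label values, and the sample rows are assumptions made for the example; they are not taken from the dataset file.

```python
import csv
import io
from collections import Counter


def count_categories(csv_text):
    """Count rows per value of the Category column (case-insensitive)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["Category"].strip().lower() for row in reader)


# Made-up rows for illustration; they are not taken from the dataset.
sample = (
    "Category,Message\n"
    "ham,See you at lunch tomorrow\n"
    "spam,You have won a prize - reply now\n"
    "ham,Meeting moved to 3pm\n"
)
print(count_categories(sample))
```

    Checking the class balance like this is a common first step before training a spam classifier, since a skewed split affects how accuracy should be read.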

    Potential Use Cases:
    - Email Filtering: Develop and evaluate email filtering systems that automatically classify incoming emails as spam or non-spam.
    - Natural Language Processing (NLP): Use the email text for text classification, topic modeling, and sentiment analysis.
    - Machine Learning: Create machine learning models for spam detection, potentially employing various algorithms and techniques.
    - Feature Engineering: Explore email content features that contribute to spam classification accuracy.
    - Data Analysis: Investigate patterns and trends in spam email content and characteristics.

    License: Please note that this dataset is for research and analysis purposes only and may be subject to copyright and data use restrictions. Ensure compliance with relevant policies when using this data.

  11. Fused Image dataset for convolutional neural Network-based crack Detection...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Apr 20, 2023
    Cite
    Shanglian Zhou; Carlos Canchila; Wei Song (2023). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Dataset]. http://doi.org/10.5281/zenodo.6383044
    Available download formats: zip
    Dataset updated
    Apr 20, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Shanglian Zhou; Carlos Canchila; Wei Song
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The “Fused Image dataset for convolutional neural Network-based crack Detection” (FIND) is a large-scale image dataset with pixel-level ground truth crack data for deep learning-based crack segmentation analysis. It features four types of image data including raw intensity image, raw range (i.e., elevation) image, filtered range image, and fused raw image. The FIND dataset consists of 2500 image patches (dimension: 256x256 pixels) and their ground truth crack maps for each of the four data types.

    The images contained in this dataset were collected from multiple bridge decks and roadways under real-world conditions. A laser scanning device was adopted for data acquisition such that the captured raw intensity and raw range images have pixel-to-pixel location correspondence (i.e., spatial co-registration feature). The filtered range data were generated by applying frequency domain filtering to eliminate image disturbances (e.g., surface variations, and grooved patterns) from the raw range data [1]. The fused image data were obtained by combining the raw range and raw intensity data to achieve cross-domain feature correlation [2,3]. Please refer to [4] for a comprehensive benchmark study performed using the FIND dataset to investigate the impact from different types of image data on deep convolutional neural network (DCNN) performance.

    If you share or use this dataset, please cite [4] and [5] in any relevant documentation.

    In addition, an image dataset for crack classification has also been published at [6].

    References:

    [1] Shanglian Zhou, & Wei Song. (2020). Robust Image-Based Surface Crack Detection Using Range Data. Journal of Computing in Civil Engineering, 34(2), 04019054. https://doi.org/10.1061/(asce)cp.1943-5487.0000873

    [2] Shanglian Zhou, & Wei Song. (2021). Crack segmentation through deep convolutional neural networks and heterogeneous image fusion. Automation in Construction, 125. https://doi.org/10.1016/j.autcon.2021.103605

    [3] Shanglian Zhou, & Wei Song. (2020). Deep learning–based roadway crack classification with heterogeneous image data fusion. Structural Health Monitoring, 20(3), 1274-1293. https://doi.org/10.1177/1475921720948434

    [4] Shanglian Zhou, Carlos Canchila, & Wei Song. (2023). Deep learning-based crack segmentation for civil infrastructure: data types, architectures, and benchmarked performance. Automation in Construction, 146. https://doi.org/10.1016/j.autcon.2022.104678

    [5] (This dataset) Shanglian Zhou, Carlos Canchila, & Wei Song. (2022). Fused Image dataset for convolutional neural Network-based crack Detection (FIND) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6383044

    [6] Wei Song, & Shanglian Zhou. (2020). Laser-scanned roadway range image dataset (LRRD). Laser-scanned Range Image Dataset from Asphalt and Concrete Roadways for DCNN-based Crack Classification, DesignSafe-CI. https://doi.org/10.17603/ds2-bzv3-nc78

  12. Citation Network Graph

    • shibatadb.com
    Updated May 17, 2022
    Cite
    Yubetsu (2022). Citation Network Graph [Dataset]. https://www.shibatadb.com/article/VGaDSsxZ
    Dataset updated
    May 17, 2022
    Dataset authored and provided by
    Yubetsu
    License

    https://www.shibatadb.com/license/data/proprietary/v1.0/license.txt

    Description

    Network of 43 papers and 85 citation links related to "Novel Approach to Evaluate Classification Algorithms and Feature Selection Filter Algorithms Using Medical Data".

  13. Data from: Choosing wavelet methods, filters, and lengths for functional...

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    bin, txt
    Updated May 31, 2022
    Cite
    Zitong Zhang; Qawi K. Telesford; Chad Giusti; Kelvin O. Lim; Danielle S. Bassett (2022). Data from: Choosing wavelet methods, filters, and lengths for functional brain network construction [Dataset]. http://doi.org/10.5061/dryad.86n40
    Available download formats: bin, txt
    Dataset updated
    May 31, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Zitong Zhang; Qawi K. Telesford; Chad Giusti; Kelvin O. Lim; Danielle S. Bassett
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Wavelet methods are widely used to decompose fMRI, EEG, or MEG signals into time series representing neurophysiological activity in fixed frequency bands. Using these time series, one can estimate frequency-band specific functional connectivity between sensors or regions of interest, and thereby construct functional brain networks that can be examined from a graph theoretic perspective. Despite their common use, however, practical guidelines for the choice of wavelet method, filter, and length have remained largely undelineated. Here, we explicitly explore the effects of wavelet method (MODWT vs. DWT), wavelet filter (Daubechies Extremal Phase, Daubechies Least Asymmetric, and Coiflet families), and wavelet length (2 to 24)—each essential parameters in wavelet-based methods—on the estimated values of graph metrics and in their sensitivity to alterations in psychiatric disease. We observe that the MODWT method produces less variable estimates than the DWT method. We also observe that the length of the wavelet filter chosen has a greater impact on the estimated values of graph metrics than the type of wavelet chosen. Furthermore, wavelet length impacts the sensitivity of the method to detect differences between health and disease and tunes classification accuracy. Collectively, our results suggest that the choice of wavelet method and length significantly alters the reliability and sensitivity of these methods in estimating values of metrics drawn from graph theory. They furthermore demonstrate the importance of reporting the choices utilized in neuroimaging studies and support the utility of exploring wavelet parameters to maximize classification accuracy in the development of biomarkers of psychiatric disease and neurological disorders.

  14. Deakin IoT Traffic Dataset

    • dro.deakin.edu.au
    • researchdata.edu.au
    pcap
    Updated Jun 21, 2025
    Cite
    Aleksandar Pasquini; Rajesh Vasa; Hassan Habibi Gharakheili; Irini Logothetis; Alexander Chambers; Minh Tran (2025). Deakin IoT Traffic Dataset [Dataset]. http://doi.org/10.26187/deakin.28013234.v2
    Available download formats: pcap
    Dataset updated
    Jun 21, 2025
    Dataset provided by
    Deakin University (http://www.deakin.edu.au/)
    Authors
    Aleksandar Pasquini; Rajesh Vasa; Hassan Habibi Gharakheili; Irini Logothetis; Alexander Chambers; Minh Tran
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset comprises network traffic collected from 24 Internet of Things (IoT) devices over a span of 119 days, capturing a total of over 110 million packets. The devices represent 19 distinct types and were monitored in a controlled environment under normal operating conditions, reflecting a variety of functions and behaviors typical of consumer IoT products (pcapIoT). The packet capture (pcap) files preserve complete packet information across all protocol layers, including ARP, TCP, HTTP, and various application-layer protocols. Raw pcap files (pcapFull) are also provided, which contain traffic from 36 non-IoT devices present in the network. To facilitate device-specific analysis, a CSV file is included that maps each IoT device to its unique MAC address. This mapping simplifies the identification and filtering of packets belonging to each device within the pcap files. Three extra CSV files (CSVs) provide metadata about the states that the devices were in at different times. Additionally, Python scripts (Scripts) are provided to assist in extracting and processing packets. These scripts include functionalities such as packet filtering based on MAC addresses and protocol-specific data extraction, serving as practical examples for data manipulation and analysis techniques. This dataset is valuable for researchers interested in network behavior analysis, anomaly detection, and the development of IoT-specific network policies. It enables the study and differentiation of network behaviors based on device functions and supports behavior-based profiling to identify irregular activities or potential security threats.
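
    The MAC-based filtering described above can be sketched as follows. This is not one of the dataset's provided scripts. It assumes a mapping CSV with hypothetical column names "device" and "mac", and it represents packets as plain dictionaries with "src" and "dst" MAC fields rather than parsing real pcap files; the addresses shown are invented.

```python
import csv
import io


def load_mac_map(csv_text):
    """Return {mac_address: device_name} from a device-to-MAC CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Normalize MAC addresses to lowercase so comparisons are case-insensitive.
    return {row["mac"].lower(): row["device"] for row in reader}


def packets_for_device(packets, mac_map, device):
    """Keep packets whose source or destination MAC belongs to the device."""
    macs = {mac for mac, name in mac_map.items() if name == device}
    return [p for p in packets if p["src"].lower() in macs or p["dst"].lower() in macs]


# Hypothetical mapping and packet summaries, for illustration only.
mapping = "device,mac\ncamera,AA:BB:CC:00:00:01\nplug,AA:BB:CC:00:00:02\n"
packets = [
    {"src": "aa:bb:cc:00:00:01", "dst": "ff:ff:ff:ff:ff:ff"},
    {"src": "aa:bb:cc:00:00:02", "dst": "aa:bb:cc:00:00:09"},
]
mac_map = load_mac_map(mapping)
print(packets_for_device(packets, mac_map, "camera"))
```

    With real captures, the same lookup would be applied to the Ethernet source and destination fields produced by a pcap-reading library.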

  15. Firms in datasets after filtering steps.

    • plos.figshare.com
    xls
    Updated Jun 5, 2023
    Cite
    Jan Kinne; David Lenz (2023). Firms in datasets after filtering steps. [Dataset]. http://doi.org/10.1371/journal.pone.0249071.t001
    Available download formats: xls
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jan Kinne; David Lenz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Firms in datasets after filtering steps.

  16. Summary the information about compared network architectures.

    • plos.figshare.com
    xls
    Updated Oct 11, 2024
    Cite
    Jun Tan; Jiamin Yuan; Xiaoyong Fu; Yilin Bai (2024). Summary the information about compared network architectures. [Dataset]. http://doi.org/10.1371/journal.pone.0302800.t002
    Available download formats: xls
    Dataset updated
    Oct 11, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jun Tan; Jiamin Yuan; Xiaoyong Fu; Yilin Bai
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary the information about compared network architectures.

  17. Convolutional Neural Networks for Classifying Combinatorial Metamaterials

    • data.europa.eu
    unknown
    Cite
    Zenodo, Convolutional Neural Networks for Classifying Combinatorial Metamaterials [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-5992648?locale=lv
    Available download formats: unknown (1276985474)
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the training and test data, as well as the trained neural networks as used for the paper 'Machine Learning of Combinatorial Rules in Mechanical Metamaterials', as published in XXX.

    In this paper, a neural network is used to classify each \(k \times k\) unit cell design into one of two classes (C or I). Additionally, the performance of the trained networks is analysed in detail. A more detailed description of the contents of the dataset follows below.

    NeuralNetwork_train_and_test_data.zip

    This file contains the train and test data used to train the Convolutional Neural Networks (CNNs) of the paper. Each unit cell size has its own file, and is saved in a zipped numpy file type (.npz).

    CNN_saves_kxk.zip

    This file contains the parameter configurations of the CNNs trained on \(k \times k\) unit cells. Every hyperparameter (number of filters nf, number of hidden neurons nh, learning rate lr) combination is saved separately. The neural networks can be loaded using Google's TensorFlow package in Python, specifically using the 'tf.keras.models.load_model' function.

  18. Balance Scale

    • kaggle.com
    zip
    Updated Apr 28, 2019
    Cite
    Shushrut (2019). Balance Scale [Dataset]. https://www.kaggle.com/mysticvalley/balance-scale
    Available download formats: zip (1372 bytes)
    Dataset updated
    Apr 28, 2019
    Authors
    Shushrut
    Description

    Source:

    Generated to model psychological experiments reported by Siegler, R. S. (1976). Three Aspects of Cognitive Development. Cognitive Psychology, 8, 481-520.

    Donor:

    Tim Hume (hume '@' ics.uci.edu)

    Data Set Information:

    This data set was generated to model psychological experimental results. Each example is classified as having the balance scale tip to the right, tip to the left, or be balanced. The attributes are the left weight, the left distance, the right weight, and the right distance. The class is found by comparing (left-distance * left-weight) with (right-distance * right-weight): the scale tips toward the side with the greater product. If the products are equal, it is balanced.

    Attribute Information:

    1. Class Name: 3 (L, B, R)
    2. Left-Weight: 5 (1, 2, 3, 4, 5)
    3. Left-Distance: 5 (1, 2, 3, 4, 5)
    4. Right-Weight: 5 (1, 2, 3, 4, 5)
    5. Right-Distance: 5 (1, 2, 3, 4, 5)
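
    The labelling rule described above is simple enough to write out directly. The sketch below is a minimal illustration, not code shipped with the dataset; the function and variable names are chosen for the example. It labels every combination of the four attributes, each ranging over 1 to 5, and tallies the classes.

```python
from collections import Counter
from itertools import product


def balance_class(left_weight, left_distance, right_weight, right_distance):
    """Return 'L', 'R', or 'B' by comparing the two weight * distance products."""
    left = left_weight * left_distance
    right = right_weight * right_distance
    if left > right:
        return "L"
    if right > left:
        return "R"
    return "B"


# Enumerate every combination of the four attributes, each ranging over 1..5.
values = range(1, 6)
counts = Counter(balance_class(*combo) for combo in product(values, repeat=4))
print(counts)
```

    Enumerating all 5^4 = 625 combinations shows that the balanced class is much rarer than the other two, which matters when evaluating a classifier on this data.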

    Relevant Papers:

    Klahr, D., & Siegler, R.S. (1978). The Representation of Children's Knowledge. In H. W. Reese & L. P. Lipsitt (Eds.), Advances in Child Development and Behavior, pp. 61-116. New York: Academic Press

    Langley, P. (1987). A General Theory of Discrimination Learning. In D. Klahr, P. Langley, & R. Neches (Eds.), Production System Models of Learning and Development, pp. 99-161. Cambridge, MA: MIT Press

    Newell, A. (1990). Unified Theories of Cognition. Cambridge, MA: Harvard University Press

    McClelland, J.L. (1988). Parallel Distributed Processing: Implications for Cognition and Development. Technical Report AIP-47, Department of Psychology, Carnegie-Mellon University

    Shultz, T., Mareschal, D., & Schmidt, W. (1994). Modeling Cognitive Development on Balance Scale Phenomena. Machine Learning, Vol. 16, pp. 59-88.

    Papers That Cite This Data Set1:

    Zhi-Hua Zhou and Yuan Jiang and Shifu Chen. Extracting symbolic rules from trained neural network ensembles. AI Commun, 16. 2003.

    Jianbin Tan and David L. Dowe. MML Inference of Decision Graphs with Multi-way Joins and Dynamic Attributes. Australian Conference on Artificial Intelligence. 2003.

    Peter Sykacek and Stephen J. Roberts. Adaptive Classification by Variational Kalman Filtering. NIPS. 2002.

    Remco R. Bouckaert. Accuracy bounds for ensembles under 0-1 loss. Xtal Mountain Information Technology & Computer Science Department, University of Waikato. 2002.

    Nir Friedman and Moisés Goldszmidt and Thomas J. Lee. Bayesian Network Classification with Continuous Attributes: Getting the Best of Both Discretization and Parametric Fitting. ICML. 1998.

    Hirotaka Inoue and Hiroyuki Narihisa. Experiments with an Ensemble Self-Generating Neural Network. Okayama University of Science.

    Alexander K. Seewald. Meta-Learning for Stacked Classification. Austrian Research Institute for Artificial Intelligence.

    Alexander K. Seewald. Dissertation Towards Understanding Stacking Studies of a General Ensemble Learning Scheme ausgefuhrt zum Zwecke der Erlangung des akademischen Grades eines Doktors der technischen Naturwissenschaften.

    Original Source: https://archive.ics.uci.edu/ml/datasets/Balance+Scale

  19. Citation Network Graph

    • shibatadb.com
    Updated Sep 28, 2025
    Cite
    Yubetsu (2025). Citation Network Graph [Dataset]. https://www.shibatadb.com/article/Cc776miG
    Dataset updated
    Sep 28, 2025
    Dataset authored and provided by
    Yubetsu
    License

    https://www.shibatadb.com/license/data/proprietary/v1.0/license.txt

    Description

    Network of 27 papers and 58 citation links related to "Classification of Body Movements in Ambulatory ECG Using Wavelet Transform, Adaptive Filter and Artificial Neural Networks".

  20. Malware Detection in Network Traffic Data

    • kaggle.com
    zip
    Updated Dec 26, 2023
    Cite
    Agung Pambudi (2023). Malware Detection in Network Traffic Data [Dataset]. https://www.kaggle.com/datasets/agungpambudi/network-malware-detection-connection-analysis
    Available download formats: zip (755409206 bytes)
    Dataset updated
    Dec 26, 2023
    Authors
    Agung Pambudi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    To cite the dataset, please reference it as: “Stratosphere Laboratory. A labeled dataset with malicious and benign IoT network traffic. January 22th. Agustin Parmisano, Sebastian Garcia, Maria Jose Erquiaga. https://www.stratosphereips.org/datasets-iot23”

    This dataset includes labels that explain the linkages between flows connected with harmful or possibly malicious activity to provide network malware researchers and analysts with more thorough information. These labels were painstakingly created at the Stratosphere labs using malware capture analysis.

    We present a concise explanation of the labels used for the identification of malicious flows, based on manual network analysis, below:

    Attack: This label signifies the occurrence of an attack originating from an infected device directed towards another host. Any flow that endeavors to exploit a vulnerable service, discerned through payload and behavioral analysis, falls under this classification. Examples include brute force attempts on telnet logins or header-based command injections in GET requests.

    Benign: The "Benign" label denotes connections where no suspicious or malicious activities have been detected.

    C&C (Command and Control): This label indicates that the infected device has established a connection with a Command and Control server. This observation is rooted in the periodic nature of connections or activities such as binary downloads or the exchange of IRC-like or decoded commands.

    DDoS (Distributed Denial of Service): "DDoS" is assigned when the infected device is actively involved in a Distributed Denial of Service attack, identifiable by the volume of flows directed towards a single IP address.

    FileDownload: This label signifies that a file is being downloaded to the infected device. It is determined by examining connections with response bytes exceeding a specified threshold (typically 3KB or 5KB), often in conjunction with known suspicious destination ports or IPs associated with Command and Control servers.

    HeartBeat: "HeartBeat" designates connections where packets serve the purpose of tracking the infected host by the Command and Control server. Such connections are identified through response bytes below a certain threshold (typically 1B) and exhibit periodic similarities. This is often associated with known suspicious destination ports or IPs linked to Command and Control servers.

    Mirai: This label is applied when connections exhibit characteristics resembling those of the Mirai botnet, based on patterns consistent with common Mirai attack profiles.

    Okiru: Similar to "Mirai," the "Okiru" label is assigned to connections displaying characteristics of the Okiru botnet. The parameters for this label are the same as for Mirai, but Okiru is a less prevalent botnet family.

    PartOfAHorizontalPortScan: This label is employed when connections are involved in a horizontal port scan aimed at gathering information for potential subsequent attacks. The labeling decision hinges on patterns such as shared ports, similar transmitted byte counts, and multiple distinct destination IPs among the connections.

    Torii: The "Torii" label is used when connections exhibit traits indicative of the Torii botnet, with labeling criteria similar to those used for Mirai, albeit in the context of a less common botnet family.
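
    Some of the criteria in the label descriptions above are simple enough to sketch in code. The following Python sketch is not the Stratosphere labeling procedure, which the description says was manual; it only illustrates the size thresholds mentioned for FileDownload and HeartBeat and the "multiple distinct destination IPs on a shared port" pattern mentioned for PartOfAHorizontalPortScan. The function names, the default threshold values, and the min_targets cutoff are assumptions; the dictionary keys follow the field names listed for this dataset.

```python
from collections import defaultdict


def size_hint(flow, download_min_bytes=3 * 1024, heartbeat_max_bytes=1):
    """Return a coarse, non-authoritative hint based only on resp_bytes.

    The defaults mirror the "typically 3KB" and "typically 1B" values in the
    label descriptions; real labeling also considers ports, IPs, and periodicity.
    """
    resp_bytes = flow.get("resp_bytes")
    if resp_bytes is None:
        return None
    if resp_bytes > download_min_bytes:
        return "FileDownload?"
    if resp_bytes < heartbeat_max_bytes:
        return "HeartBeat?"
    return None


def horizontal_scan_candidates(flows, min_targets=3):
    """Group flows by (source IP, destination port) and count distinct targets.

    Returns the groups that reach at least `min_targets` distinct destination
    IPs; `min_targets` is an illustrative cutoff, not a value from the dataset.
    """
    targets = defaultdict(set)
    for flow in flows:
        key = (flow["id.orig_h"], flow["id.resp_p"])
        targets[key].add(flow["id.resp_h"])
    return {
        key: len(hosts)
        for key, hosts in targets.items()
        if len(hosts) >= min_targets
    }
```

    Hints like these are best treated as features to compare against the provided labels rather than as a replacement for them.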

    Field Name | Description | Type
    ts | The timestamp of the connection event. | time
    uid | A unique identifier for the connection. | string
    id.orig_h | The source IP address. | addr
    id.orig_p | The source port. | port
    id.resp_h | The destination IP address. | addr
    id.resp_p | The destination port. | port
    proto | The network protocol used (e.g., 'tcp'). | enum
    service | The service associated with the connection. | string
    duration | The duration of the connection. | interval
    orig_bytes | The number of bytes sent from the source to the destination. | count
    resp_bytes | The number of bytes sent from the destination to the source. | count
    conn_state | The state of the connection. | string
    local_orig | Indicates whether the connection is considered local or not. | bool
    local_resp | Indicates whether the connection is considered...