6 datasets found
1. Data from: A design methodology for approximate multipliers in convolutional neural networks: A case of MNIST

    • osf.io
    Updated Aug 31, 2023
    Cite
    Kenta Shirane; Takahiro Yamamoto; Hiroyuki Tomiyama (2023). A design methodology for approximate multipliers in convolutional neural networks: A case of MNIST [Dataset]. https://osf.io/dp8mc
    Dataset updated
    Aug 31, 2023
    Dataset provided by
    Center For Open Science
    Authors
    Kenta Shirane; Takahiro Yamamoto; Hiroyuki Tomiyama
    Description

In this paper, we present a case study on approximate multipliers for an MNIST Convolutional Neural Network (CNN). We apply approximate multipliers with different bit-widths to the convolution layer of the MNIST CNN, evaluate the accuracy of MNIST classification, and analyze the trade-off among the approximate multiplier's area, its critical path delay, and the classification accuracy. Based on the results of this evaluation and analysis, we propose a design methodology for approximate multipliers. The approximate multipliers consist of a subset of partial products, which are carefully selected according to the CNN input. With this methodology, we further reduce the area and delay of the multipliers while keeping the accuracy of MNIST classification high.
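As a rough illustration of the idea, the sketch below approximates an unsigned product by summing only a subset of the shifted partial-product rows. The selection rule used here (dropping the lowest-order rows) is an assumption for illustration only; the paper selects partial products according to the CNN input.

```python
# Minimal sketch of an approximate unsigned multiplier that keeps only a
# subset of partial-product rows. Dropping the low-order rows is an
# illustrative choice, not the paper's selection criterion.
def approx_mul(a: int, b: int, bits: int = 8, dropped: int = 2) -> int:
    """Approximate a*b by summing only the retained shifted partial products."""
    result = 0
    for i in range(dropped, bits):   # skip the 'dropped' least-significant rows
        if (b >> i) & 1:             # row i contributes a << i when bit i of b is set
            result += a << i
    return result

exact = 37 * 115
approx = approx_mul(37, 115)
print(f"exact={exact} approx={approx} error={exact - approx}")
```

Dropping low-order rows always underestimates the exact product, which is why area/delay savings trade off against a bounded accuracy loss.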

2. Robustness assessment of a C++ implementation of a quantized (int8) version of the LeNet-5 convolutional neural network

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Nov 22, 2023
    Cite
David de Andrés; Juan Carlos Ruiz (2023). Robustness assessment of a C++ implementation of a quantized (int8) version of the LeNet-5 convolutional neural network [Dataset]. http://doi.org/10.5281/zenodo.10196616
    Dataset updated
    Nov 22, 2023
    Dataset provided by
Zenodo (http://zenodo.org/)
    Authors
David de Andrés; Juan Carlos Ruiz
    License

Attribution 4.0 International (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 24, 2023 - Jun 26, 2023
    Description

The architecture of the LeNet-5 convolutional neural network (CNN) was defined by LeCun et al. in the paper "Gradient-based learning applied to document recognition" (https://ieeexplore.ieee.org/document/726791) to classify images of handwritten digits (the MNIST dataset).

This architecture has been customized to use Rectified Linear Units (ReLU) as activation functions instead of Sigmoid, and 8-bit integers for weights and activations instead of floating-point values.

It consists of the following layers (a sketch of the architecture follows the list):

    • conv1: Convolution 2D, 1 input channel (28x28), 3 output channels (28x28), kernel size 5, stride 1, padding 2.
    • relu1: Rectified Linear Unit (3@28x28).
• max1: Subsampling by max pooling (3@14x14).
    • conv2: Convolution 2D, 3 input channels (14x14), 6 output channels (14x14), kernel size 5, stride 1, padding 2.
    • relu2: Rectified Linear Unit (6@14x14).
• max2: Subsampling by max pooling (6@7x7).
• fc1: Fully connected (294, 147).
• fc2: Fully connected (147, 10).
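A minimal PyTorch sketch of this layer stack, written with floating-point weights for readability (the dataset's network quantizes weights and activations to int8, which this sketch does not reproduce):

```python
import torch.nn as nn

# Float sketch of the customized LeNet-5 described above; the actual
# network under test uses int8 weights and activations.
class LeNet5Variant(nn.Sequential):
    def __init__(self):
        super().__init__(
            nn.Conv2d(1, 3, kernel_size=5, stride=1, padding=2),  # conv1: 3@28x28
            nn.ReLU(),                                            # relu1
            nn.MaxPool2d(2),                                      # max1: 3@14x14
            nn.Conv2d(3, 6, kernel_size=5, stride=1, padding=2),  # conv2: 6@14x14
            nn.ReLU(),                                            # relu2
            nn.MaxPool2d(2),                                      # max2: 6@7x7
            nn.Flatten(),                                         # 6 * 7 * 7 = 294
            nn.Linear(294, 147),                                  # fc1
            nn.Linear(147, 10),                                   # fc2
        )
```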

    The fault hypotheses for this work include the occurrence of:

    • BF: single, double-adjacent and triple-adjacent bit-flip faults
    • S0: single, double-adjacent and triple-adjacent stuck-at-0 faults
    • S1: single, double-adjacent and triple-adjacent stuck-at-1 faults

In the memory cells containing all the parameters of the CNN (a sketch of how such parameters typically enter the computation follows the list):

    • w: weights (int8)
    • zw: zero point of the weights (int8)
    • b: biases (int32)
    • z: zero point (int8)
    • m: m (int32)
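The dataset does not spell out the arithmetic, but parameters named w, zw, b, z and m suggest the common zero-point/fixed-point-multiplier scheme for int8 inference. A hedged sketch under that assumption (the dataset's C++ code may differ, and the shift amount here is illustrative):

```python
# Hedged sketch of int8 inference arithmetic using w, zw, b, z and m.
# The zero-point/multiplier scheme is a standard assumption, not taken
# from the dataset's code; the shift amount is illustrative.
def quantized_dot(x, zx, w, zw, b, m, z, shift=31):
    """int8 dot product with int32 accumulation and requantization to int8."""
    acc = b                                # int32 accumulator preloaded with bias
    for xi, wi in zip(x, w):
        acc += (xi - zx) * (wi - zw)       # subtract zero points, multiply
    acc = (acc * m) >> shift               # rescale by fixed-point multiplier m
    return max(-128, min(127, acc + z))    # add output zero point, saturate to int8
```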

Images 200 to 249 from the MNIST dataset have been used as the workload.

    This dataset contains the raw data obtained from running exhaustive fault injection campaigns for all considered fault models, targeting all considered locations and for all the images in the workload.

In addition, the raw data have been lightly processed to obtain aggregate data on the particular bits and parameters affected by the faults and on the resulting failure modes.

    Files information

• golden_run.csv: Prediction obtained for all the images considered in the workload in the absence of faults (golden run). This is intended to act as an oracle to determine the impact of injected faults.
• single_faults/bit_flip folder: Prediction obtained for all the images considered in the workload in the presence of single bit-flip faults. There is one file for each parameter of each layer.
• single_faults/stuck_at_0 folder: Prediction obtained for all the images considered in the workload in the presence of single stuck-at-0 faults. There is one file for each parameter of each layer.
• single_faults/stuck_at_1 folder: Prediction obtained for all the images considered in the workload in the presence of single stuck-at-1 faults. There is one file for each parameter of each layer.
• double_adjacent_faults/bit_flip folder: Prediction obtained for all the images considered in the workload in the presence of double-adjacent bit-flip faults. There is one file for each parameter of each layer.
• double_adjacent_faults/stuck_at_0 folder: Prediction obtained for all the images considered in the workload in the presence of double-adjacent stuck-at-0 faults. There is one file for each parameter of each layer.
• double_adjacent_faults/stuck_at_1 folder: Prediction obtained for all the images considered in the workload in the presence of double-adjacent stuck-at-1 faults. There is one file for each parameter of each layer.
• triple_adjacent_faults/bit_flip folder: Prediction obtained for all the images considered in the workload in the presence of triple-adjacent bit-flip faults. There is one file for each parameter of each layer.
• triple_adjacent_faults/stuck_at_0 folder: Prediction obtained for all the images considered in the workload in the presence of triple-adjacent stuck-at-0 faults. There is one file for each parameter of each layer.
• triple_adjacent_faults/stuck_at_1 folder: Prediction obtained for all the images considered in the workload in the presence of triple-adjacent stuck-at-1 faults. There is one file for each parameter of each layer.

    Methodology information

First, the CNN was used to classify all the images of the workload in the absence of faults, to obtain a reference for determining the impact of faults. These results are stored in the golden_run.csv file.

    After that, one fault injection experiment was executed for each bit of each element of each parameter of the CNN.

Each experiment consisted of the following steps (a sketch follows the list):

• Affecting the bits identified by the mask (inverting them for bit-flip faults, or setting them to 0 or 1 for stuck-at-0 and stuck-at-1 faults, respectively).
• Classifying all the images of the workload in the presence of this fault. The obtained output was stored in a given .csv file.
• Removing the fault from the CNN by restoring the affected bits to their previous values.
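A hedged sketch of this inject/classify/restore loop and of adjacent-mask construction. The function names, the mask helper, and the flat parameter representation are illustrative; the actual campaigns were run against the C++ implementation:

```python
# Illustrative fault-injection loop; not the dataset's C++ harness.
def adjacent_masks(width: int, run: int):
    """All masks with `run` adjacent faulty bits in a `width`-bit word."""
    base = (1 << run) - 1
    return [base << i for i in range(width - run + 1)]

def inject(value: int, mask: int, fault: str) -> int:
    """Apply a bit-flip or stuck-at fault to the bits selected by mask."""
    if fault == "BF":
        return value ^ mask    # invert the masked bits
    if fault == "S0":
        return value & ~mask   # force the masked bits to 0
    if fault == "S1":
        return value | mask    # force the masked bits to 1
    return value               # "NF": no fault

def run_experiment(params, elem, mask, fault, classify, workload):
    saved = params[elem]
    params[elem] = inject(saved, mask, fault)    # 1. affect the masked bits
    preds = [classify(img) for img in workload]  # 2. classify under the fault
    params[elem] = saved                         # 3. restore the original bits
    return preds

# e.g. the 31 double-adjacent masks over a 32-bit parameter word:
masks = adjacent_masks(32, 2)
```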

List of variables, as Name : Description (Possible values); an analysis example follows the list.

    • IMGID: Integer number identifying the considered image (200-249).
• TENSORID: Integer number identifying the parameter affected by the fault (0 - No fault, 1 - conv1.w, 2 - conv1.zw, 3 - conv1.m, 4 - conv1.b, 5 - conv1.z, 6 - conv2.w, 7 - conv2.zw, 8 - conv2.m, 9 - conv2.b, 10 - conv2.z, 11 - fc1.w, 12 - fc1.zw, 13 - fc1.m, 14 - fc1.b, 15 - fc1.z, 16 - fc2.w, 17 - fc2.zw, 18 - fc2.m, 19 - fc2.b, 20 - fc2.z)
• ELEMID: Integer number identifying the element of the parameter affected by the fault (-1 - No fault, [0-2] - {conv1.b, conv1.m, conv1.zw}, [0-74] - conv1.w, [0-5] - {conv2.b, conv2.m, conv2.zw}, [0-149] - conv2.w, 0 - {conv1.z, conv2.z, fc1.z, fc2.z}, [0-146] - {fc1.b, fc1.m, fc1.zw}, [0-43217] - fc1.w, [0-9] - {fc2.b, fc2.m, fc2.zw}, [0-1469] - fc2.w)
• MASK: 8-digit hexadecimal number identifying the bits affected by the fault (00000000 - No fault, FFFFFFFF - all 32 bits faulty)
• FAULT: String identifying the type of fault (NF - No fault, BF - Bit-flip, S0 - Stuck-at-0, S1 - Stuck-at-1)
    • OUTPUT: 10 integer numbers provided by the CNN as output after processing the image. The highest value identifies the selected category for classification.
    • SOFTMAX: 10 decimal numbers obtained after applying the softmax function to the provided output. They represent the probability of the image of belonging to the corresponding category for classification.
    • PRED: Integer number representing the category predicted for the processed image.
• LABEL: Integer number representing the actual category for the processed image.
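A hedged sketch of how one might join an experiment file against the golden run to derive failure rates. Column names follow the variable list above; the fault-file name is hypothetical:

```python
import pandas as pd

# Compare one fault-injection campaign against the golden run (the oracle).
golden = pd.read_csv("golden_run.csv")                       # fault-free predictions
faulty = pd.read_csv("single_faults/bit_flip/conv1_w.csv")   # hypothetical file name

merged = faulty.merge(golden[["IMGID", "PRED"]], on="IMGID",
                      suffixes=("", "_GOLDEN"))
merged["FAILURE"] = merged["PRED"] != merged["PRED_GOLDEN"]  # deviation from oracle

# Failure rate per injected mask across the 50-image workload.
print(merged.groupby("MASK")["FAILURE"].mean())
```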

3. Supporting Data for: UltraMNIST Classification: A Benchmark to Train CNNs for Very Large Images

    • search.dataone.org
    • dataverse.no
    Updated Sep 25, 2024
    Cite
    Gupta, Deepak K.; Bhamba, Udbhav; Thakur, Abhishek; Gupta, Akash; Sharan, Suraj; Demir, Ertugrul; Prasad, Dilip K. (2024). Supporting Data for: UltraMNIST Classification: A Benchmark to Train CNNs for Very Large Images [Dataset]. http://doi.org/10.18710/4F4KJS
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    DataverseNO
    Authors
    Gupta, Deepak K.; Bhamba, Udbhav; Thakur, Abhishek; Gupta, Akash; Sharan, Suraj; Demir, Ertugrul; Prasad, Dilip K.
    Description

Convolutional neural network (CNN) approaches available in the current literature are designed to work primarily with low-resolution images. When they are applied to very large images, challenges arise related to GPU memory, receptive fields smaller than needed for semantic correspondence, and the need to incorporate multi-scale features. The resolution of the input images can be reduced, but only with significant loss of critical information. Based on these issues, we introduce a novel research problem of training CNN models for very large images, and present the UltraMNIST dataset, a simple yet representative benchmark for this task. UltraMNIST has been designed using the popular MNIST digits, with additional levels of complexity added to replicate the challenges of real-world problems. We present two variants of the problem: 'UltraMNIST classification' and 'budget-aware UltraMNIST classification'. The standard UltraMNIST classification benchmark is intended to facilitate the development of novel CNN training methods that make effective use of the best available GPU resources. The budget-aware variant is intended to promote the development of methods that work under constrained GPU memory. For the development of competitive solutions, we present several baseline models for the standard benchmark and its budget-aware variant. We study the effect of reducing resolution on performance, and present results for baseline models built on pretrained backbones from among the popular state-of-the-art models. Finally, with the presented benchmark dataset and baselines, we hope to pave the way for a new generation of CNN methods suitable for handling large images in an efficient and resource-light manner. The UltraMNIST dataset comprises very large images, each of 4000x4000 pixels, with 3-5 digits per image. Each of these digits has been extracted from the original MNIST dataset. The task is to predict the sum of the digits in each image, which can be anything from 0 to 27.
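A minimal sketch of one common strategy for images of this size: split the input into tiles, encode each tile with a shared small CNN, and pool tile features before the sum-prediction head. This is not the benchmark's baseline; the tile size and encoder are illustrative assumptions:

```python
import torch
import torch.nn as nn

class TiledSumPredictor(nn.Module):
    """Illustrative tile-and-pool model for 4000x4000 UltraMNIST images."""
    def __init__(self, tile: int = 500):
        super().__init__()
        self.tile = tile
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(8, 28)  # classes 0..27: the possible digit sums

    def forward(self, x):
        # x: (B, 1, 4000, 4000) -> non-overlapping tile x tile patches
        t = self.tile
        tiles = x.unfold(2, t, t).unfold(3, t, t)      # (B, 1, 8, 8, t, t)
        tiles = tiles.reshape(-1, 1, t, t)             # fold tiles into the batch
        feats = self.encoder(tiles)                    # (B*64, 8)
        feats = feats.reshape(x.size(0), -1, 8)        # (B, 64, 8)
        return self.head(feats.mean(dim=1))            # pool tiles, classify sum

logits = TiledSumPredictor()(torch.zeros(1, 1, 4000, 4000))  # -> (1, 28)
```

Because the encoder only ever sees one tile at a time, peak GPU memory scales with the tile size rather than the full 4000x4000 input, which is the point of the budget-aware variant.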

4. Model comparison results using MNIST-C and MNIST-C-shape datasets.

    • plos.figshare.com
    xls
    Updated Jun 13, 2024
    Cite
    Model comparison results using MNIST-C and MNIST-C-shape datasets. [Dataset]. https://plos.figshare.com/articles/dataset/Model_comparison_results_using_MNIST-C_and_MNIST-C-shape_datasets_/26032302
    Dataset updated
    Jun 13, 2024
    Dataset provided by
    PLOS Computational Biology
    Authors
    Seoyoung Ahn; Hossein Adeli; Gregory J. Zelinsky
    License

Attribution 4.0 International (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

Recognition accuracy (means and standard deviations from 5 trained models, hereafter referred to as model "runs") from ORA and two CNN baselines, both of which were trained using identical CNN encoders (one a 2-layer CNN and the other a ResNet-18), and a CapsNet model following the implementation in [51].

5. Replication Data for: Exploring Neural Network Weaknesses: Insights from Quantum Principles

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Dec 16, 2023
    Cite
    Zhang, Jun-Jie; Deyu Meng (2023). Replication Data for: Exploring Neural Network Weaknesses: Insights from Quantum Principles [Dataset]. http://doi.org/10.7910/DVN/SWDL1S
    Dataset updated
    Dec 16, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Zhang, Jun-Jie; Deyu Meng
    Description

The dataset contains the code and raw data for exploring the accuracy-robustness trade-off derived from the uncertainty principle in quantum physics.

The folder contains two sub-folders: "data upload" and "figure&plot".

In "data upload", three network structures are used for CIFAR-10 and MNIST. Take the sub-sub-folder "cifar conv" as an example. One starts with the two notebooks named "selected_train_netwrok1_test2.ipynb" and "selected_train_netwrok2_test2.ipynb": the former trains the complete convolutional network, while the latter divides the convolutional layers into two parts, a feature extractor and a classifier. After running these two notebooks, the weights of the networks at each training epoch are saved in the folder "model". One then runs the other two notebooks, "scanner-x.ipynb" and "scanner-feature-crt.ipynb": the former performs Monte-Carlo integrations on multiple GPUs with respect to the normalized loss function of the complete convolutional network, while the latter integrates only the classifier (the second part of the complete convolutional network). Finally, one opens the notebook "plotter.ipynb" to see the results.

In "figure&plot" we mainly plot the figures in the paper. The txt files are simply copied from the "data upload" folder. To see the figures, one needs to open the file "plot.nb" with Mathematica.

6. CNN2 architecture.

    • plos.figshare.com
    xls
    Updated Jul 22, 2024
    Cite
    Bernardo Pulido-Gaytan; Andrei Tchernykh (2024). CNN2 architecture. [Dataset]. http://doi.org/10.1371/journal.pone.0306420.t004
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Bernardo Pulido-Gaytan; Andrei Tchernykh
    License

Attribution 4.0 International (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

The widespread adoption of cloud computing necessitates privacy-preserving techniques that allow information to be processed without disclosure. This paper proposes a method to increase the accuracy and performance of privacy-preserving Convolutional Neural Networks with Homomorphic Encryption (CNN-HE) by means of Self-Learning Activation Functions (SLAF). SLAFs are polynomials whose trainable coefficients are updated during training, together with the synaptic weights, independently for each polynomial, to learn task-specific and CNN-specific features. We theoretically prove the feasibility of approximating any continuous activation function to a desired error as a function of the SLAF degree. Two CNN-HE models are proposed: CNN-HE-SLAF and CNN-HE-SLAF-R. In the first model, all activation functions are replaced by SLAFs, and the CNN is trained to find both weights and coefficients. In the second, the CNN is trained with the original activation, then the weights are fixed, the activation is substituted by SLAF, and the CNN is briefly re-trained to adapt the SLAF coefficients. We show that such self-learning can achieve the same accuracy (99.38%) as a non-polynomial ReLU over non-homomorphic CNNs, and leads to higher accuracy (99.21%) and higher performance (6.26 times faster) than the state-of-the-art CNN-HE CryptoNets on the MNIST optical character recognition benchmark dataset.
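A hedged PyTorch sketch of a trainable polynomial activation in this spirit; the degree, initialization, and Horner evaluation are illustrative choices, not the paper's exact formulation:

```python
import torch
import torch.nn as nn

class SLAF(nn.Module):
    """Polynomial activation with trainable coefficients (illustrative)."""
    def __init__(self, degree: int = 2):
        super().__init__()
        init = torch.zeros(degree + 1)
        init[1] = 1.0                      # start near the identity f(x) = x
        self.coeffs = nn.Parameter(init)   # c_0..c_degree, trained with the weights

    def forward(self, x):
        # f(x) = sum_k c_k * x^k, evaluated with Horner's rule
        out = torch.zeros_like(x)
        for c in reversed(self.coeffs):
            out = out * x + c
        return out
```

Swapping such a module in place of nn.ReLU() leaves only additions and multiplications in the network, which is what makes the activation compatible with homomorphic encryption.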

