Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Convolutional Neural Networks (CNNs) stand as indispensable tools in deep learning, capable of autonomously extracting crucial features from diverse data types. However, the intricacies of CNN architectures can present challenges such as overfitting and underfitting, necessitating thoughtful strategies to optimize their performance. In this work, these issues have been addressed by introducing L1 regularization into the basic CNN architecture when applied to image classification. The proposed model has been applied to three different datasets. It has been observed that incorporating L1 regularization with different coefficient values has distinct effects on the working mechanism of the CNN architecture, resulting in improved performance. In MNIST digit classification, L1 regularization (coefficient: 0.01) simplifies feature representation and prevents overfitting, leading to enhanced accuracy. In the Mango Tree Leaves dataset, dual L1 regularization (coefficient: 0.001 for convolutional and 0.01 for dense layers) improves model interpretability and generalization, facilitating effective leaf classification. Additionally, for hand-drawn sketches like those in the Quick, Draw! dataset, L1 regularization (coefficient: 0.001) refines feature representation, resulting in improved recognition accuracy and generalization across diverse sketch categories. These findings underscore the significance of regularization techniques like L1 regularization in fine-tuning CNNs, optimizing their performance, and ensuring their adaptability to new data while maintaining high accuracy. Such strategies play a pivotal role in advancing the utility of CNNs across various domains, further solidifying their position as a cornerstone of deep learning.
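To make the setup concrete, the following is a minimal Keras sketch (an illustration, not the authors' exact model) showing how L1 penalties with the coefficients reported above can be attached to convolutional and dense layers; the layer widths are placeholders.

    import tensorflow as tf
    from tensorflow.keras import layers, regularizers

    # Minimal sketch: L1 penalties as in the Mango Tree Leaves setting
    # (0.001 on the convolutional layer, 0.01 on the dense layer).
    # Layer widths are placeholders, not the authors' architecture.
    model = tf.keras.Sequential([
        layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1),
                      kernel_regularizer=regularizers.l1(0.001)),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu",
                     kernel_regularizer=regularizers.l1(0.01)),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])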
CC0 1.0 (Public Domain): https://creativecommons.org/publicdomain/zero/1.0/
Image: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F769452%2Ff6e2d0f05093e42a67119bde723b24d5%2Fdata-original.png?generation=1600931282565624&alt=media
The Chinese MNIST dataset uses data collected as part of a project at Newcastle University.
One hundred Chinese nationals took part in data collection. Each participant wrote all 15 numbers with a standard black ink pen in a table with 15 designated regions drawn on white A4 paper. This process was repeated 10 times with each participant. Each sheet was scanned at a resolution of 300x300 pixels. The result is a dataset of 15,000 images, each representing one character from a set of 15 characters (grouped into samples and suites, with 10 samples per volunteer and 100 volunteers).
I downloaded the raw images from the original project page. Based on the image names, I created an index for each image, as follows:
original name (example): Locate{1,3,4}.jpg
index extracted: suite_id: 1, sample_id: 3, code: 4
resulted file name: input_1_3_4.jpg
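A minimal Python sketch of this renaming step, assuming every raw file name follows the {suite,sample,code} pattern shown above:

    import re

    def reindex(original_name):
        # Extract suite_id, sample_id and code from e.g. "Locate{1,3,4}.jpg"
        suite_id, sample_id, code = re.search(
            r"\{(\d+),(\d+),(\d+)\}", original_name).groups()
        return f"input_{suite_id}_{sample_id}_{code}.jpg"

    print(reindex("Locate{1,3,4}.jpg"))  # -> input_1_3_4.jpg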
I also added the mapping of each image code to the actual numeric value of the Chinese number and the actual Chinese character. The mapping is shown below:
Image (character mapping): https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F769452%2F61c54df3540346d4b56cd611ba41143d%2Fchanracter_mapping.png?generation=1596618751340901&alt=media
The dataset contains the following:
chinese_mnist.csv

I want to express my gratitude to the following people: Dr. K Nazarpour and Dr. M Chen from Newcastle University, who collected the data.
You can use this data the same way you used MNIST, KMNIST, or Fashion MNIST: refine your image classification skills, and use GPUs & TPUs to implement CNN architectures for such multiclass classification tasks.
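As a starting point, a short pandas sketch for resolving each index row to its image file (the column names suite_id, sample_id, and code are assumptions based on the naming scheme above; verify them against your copy of chinese_mnist.csv):

    import pandas as pd

    df = pd.read_csv("chinese_mnist.csv")
    # Column names are assumed from the naming scheme described above.
    df["file"] = df.apply(
        lambda r: f"input_{r['suite_id']}_{r['sample_id']}_{r['code']}.jpg",
        axis=1)
    print(df.head())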
Here, we disseminate a new handwritten digits dataset, termed Kannada-MNIST, for the Kannada script, which can potentially serve as a direct drop-in replacement for the original MNIST dataset. In addition to this dataset, we disseminate a real-world handwritten dataset (with images), which we term the Dig-MNIST dataset, that can serve as an out-of-domain test dataset. We also duly open-source all the code as well as the raw scanned images along with the scanner settings, so that researchers who want to try out different signal processing pipelines can perform end-to-end comparisons. We provide high-level morphological comparisons with the MNIST dataset and provide baseline accuracies for the disseminated dataset. The initial baselines obtained using an oft-used CNN architecture (for the main test-set and for the Dig-MNIST test-set) indicate that these datasets do provide a sterner challenge with regard to generalizability than MNIST or the KMNIST datasets. We also hope this dissemination will spur the creation of similar datasets for all the languages that use different symbols for the numeral digits.
All details of the dataset curation have been captured in the paper: Prabhu, Vinay Uday. "Kannada-MNIST: A new handwritten digits dataset for the Kannada language." arXiv preprint arXiv:1908.01242 (2019). Link: https://arxiv.org/abs/1908.01242
https://github.com/vinayprabhu/Kannada_MNIST
We propose the following open challenges to the machine learning community at large:
1. Achieve MNIST-level accuracy by training on the Kannada-MNIST dataset and testing on the Dig-MNIST dataset, without resorting to image pre-processing.
2. Characterize the nature of catastrophic forgetting when a CNN pre-trained on MNIST is retrained with Kannada-MNIST. This is particularly interesting given that the typographical glyphs for 3 and 7 in Kannada-MNIST bear an uncanny resemblance to the glyph for 2 in MNIST.
3. Train a model on purely synthetic data generated using the fonts (as in [22]), augmented using frameworks such as [20] and [23], to achieve high accuracy on the Kannada-MNIST and Dig-MNIST datasets.
4. Replicate the procedure described in the paper across different languages/scripts, especially the Indic scripts.
With regard to the Dig-MNIST dataset, we saw that some of the volunteers transgressed the borders of the grid, so some of the images either contain only a partial slice of the glyph/stroke or could arguably belong to either of two different classes. For these images, it would be worthwhile to see whether we can design a classifier that allocates proportionate softmax mass to the candidate classes.
The main reason behind sharing the raw scan images was to foster research into auto-segmentation algorithms that parse the individual digit images from the grid, which might in turn lead to higher-quality images in upgraded versions of the dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The architecture of the LeNet-5 convolutional neural network (CNN) was defined by LeCun et al. in the paper "Gradient-based learning applied to document recognition" (https://ieeexplore.ieee.org/document/726791) to classify images of handwritten digits (the MNIST dataset). This architecture has been customized to use Rectified Linear Units (ReLU) as activation functions instead of Sigmoid, and 8-bit integers for weights and activations instead of floating point. It consists of the following layers:
conv1: Convolution 2D, 1 input channel (28x28), 3 output channels (28x28), kernel size 5, stride 1, padding 2.
relu1: Rectified Linear Unit (3@28x28).
max1: Subsampling by max pooling (3@14x14).
conv2: Convolution 2D, 3 input channels (14x14), 6 output channels (14x14), kernel size 5, stride 1, padding 2.
relu2: Rectified Linear Unit (6@14x14).
max2: Subsampling by max pooling (6@7x7).
fc1: Fully connected (294, 147).
fc2: Fully connected (147, 10).
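For reference, a float-valued PyTorch sketch of this layer stack (the actual network is quantized to 8-bit integers, which this sketch does not capture):

    import torch.nn as nn

    # Float sketch of the customized LeNet-5; the real model uses int8.
    model = nn.Sequential(
        nn.Conv2d(1, 3, kernel_size=5, stride=1, padding=2),  # conv1: 3@28x28
        nn.ReLU(),                                            # relu1
        nn.MaxPool2d(2),                                      # max1: 3@14x14
        nn.Conv2d(3, 6, kernel_size=5, stride=1, padding=2),  # conv2: 6@14x14
        nn.ReLU(),                                            # relu2
        nn.MaxPool2d(2),                                      # max2: 6@7x7
        nn.Flatten(),                                         # 6*7*7 = 294
        nn.Linear(294, 147),                                  # fc1
        nn.Linear(147, 10),                                   # fc2
    )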
The fault hypotheses for this work include the occurrence of:

BF: single, double-adjacent, and triple-adjacent bit-flip faults
S0: single, double-adjacent, and triple-adjacent stuck-at-0 faults
S1: single, double-adjacent, and triple-adjacent stuck-at-1 faults

in the memory cells containing all the parameters of the CNN:
w: weights (int8)
zw: zero point of the weights (int8)
b: biases (int32)
z: zero point (int8)
m: m (int32)

Images 200 to 249 from the MNIST dataset have been used as the workload. This dataset contains the raw data obtained from running exhaustive fault injection campaigns for all considered fault models, targeting all considered locations, for all the images in the workload. In addition, the raw data have been lightly processed to obtain global data related to the particular bits and parameters affected by the faults and the resulting failure modes.

Files information
golden_run.csv: Predictions obtained for all the images considered in the workload in the absence of faults (Golden Run). This is intended to act as an oracle to determine the impact of injected faults.
single_faults/bit_flip folder: Predictions obtained for all the images considered in the workload in the presence of single bit-flip faults. There is one file for each parameter of each layer.
single_faults/stuck_at_0 folder: Predictions obtained for all the images considered in the workload in the presence of single stuck-at-0 faults. There is one file for each parameter of each layer.
single_faults/stuck_at_1 folder: Predictions obtained for all the images considered in the workload in the presence of single stuck-at-1 faults. There is one file for each parameter of each layer.
double_adjacent_faults/bit_flip folder: Predictions obtained for all the images considered in the workload in the presence of double-adjacent bit-flip faults. There is one file for each parameter of each layer.
double_adjacent_faults/stuck_at_0 folder: Predictions obtained for all the images considered in the workload in the presence of double-adjacent stuck-at-0 faults. There is one file for each parameter of each layer.
double_adjacent_faults/stuck_at_1 folder: Predictions obtained for all the images considered in the workload in the presence of double-adjacent stuck-at-1 faults. There is one file for each parameter of each layer.
triple_adjacent_faults/bit_flip folder: Predictions obtained for all the images considered in the workload in the presence of triple-adjacent bit-flip faults. There is one file for each parameter of each layer.
triple_adjacent_faults/stuck_at_0 folder: Predictions obtained for all the images considered in the workload in the presence of triple-adjacent stuck-at-0 faults. There is one file for each parameter of each layer.
triple_adjacent_faults/stuck_at_1 folder: Predictions obtained for all the images considered in the workload in the presence of triple-adjacent stuck-at-1 faults. There is one file for each parameter of each layer.
Methodology information
First, the CNN was used to classify all the images of the workload in the absence of faults to get a reference for determining the impact of faults. This is the golden_run.csv file.
After that, one fault injection experiment was executed for each bit of each element of each parameter of the CNN.
Each experiment consisted of:

Affecting the bits identified by the mask (inverting them in the case of bit-flip faults, or setting them to 0 or 1 in the case of stuck-at-0 or stuck-at-1 faults).
Classifying all the images of the workload in the presence of this fault. The obtained output was stored in a given .csv file.
Removing the fault from the CNN by restoring the affected bits to their previous values.
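A minimal Python sketch of the bit manipulation behind each experiment (the function and its framing are illustrative, not the campaign's actual code):

    def inject(word, mask, fault):
        # word: parameter bits as an unsigned integer; mask: faulty bits.
        if fault == "BF":              # bit-flip: invert the masked bits
            return word ^ mask
        if fault == "S0":              # stuck-at-0: force masked bits to 0
            return word & ~mask & 0xFFFFFFFF
        if fault == "S1":              # stuck-at-1: force masked bits to 1
            return word | mask
        return word                    # NF: no fault

    print(hex(inject(0x000000F3, 0x00000003, "BF")))  # 0xf0: two LSBs flipped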
List of variables (Name: Description (Possible values))

IMGID: Integer number identifying the considered image (200-249).
TENSORID: Integer number identifying the parameter affected by the fault (0 - No fault, 1 - conv1.w, 2 - conv1.zw, 3 - conv1.m, 4 - conv1.b, 5 - conv1.z, 6 - conv2.w, 7 - conv2.zw, 8 - conv2.m, 9 - conv2.b, 10 - conv2.z, 11 - fc1.w, 12 - fc1.zw, 13 - fc1.m, 14 - fc1.b, 15 - fc1.z, 16 - fc2.w, 17 - fc2.zw, 18 - fc2.m, 19 - fc2.b, 20 - fc2.z).
ELEMID: Integer number identifying the element of the parameter affected by the fault (-1 - No fault, [0-2] - {conv1.b, conv1.m, conv1.zw}, [0-74] - conv1.w, [0-5] - {conv2.b, conv2.m, conv2.zw}, [0-149] - conv2.w, 0 - {conv1.z, conv2.z, fc1.z, fc2.z}, [0-146] - {fc1.b, fc1.m, fc1.zw}, [0-43217] - fc1.w, [0-9] - {fc2.b, fc2.m, fc2.zw}, [0-1469] - fc2.w).
MASK: 8-digit hexadecimal number identifying the bits affected by the fault (00000000 - No fault, FFFFFFFF - all 32 bits faulty).
FAULT: String identifying the type of fault (NF - No fault, BF - Bit-flip, S0 - Stuck-at-0, S1 - Stuck-at-1).
OUTPUT: 10 integer numbers provided by the CNN as output after processing the image. The highest value identifies the selected category for classification.
SOFTMAX: 10 decimal numbers obtained after applying the softmax function to the provided output. They represent the probability of the image belonging to each category.
PRED: Integer number representing the category predicted for the processed image.
LABEL: Integer number representing the actual category of the processed image.
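To illustrate how these variables combine, a hedged pandas sketch that compares one fault file with the golden run (the per-parameter file name inside the folder is an assumption; check the actual contents):

    import pandas as pd

    golden = pd.read_csv("golden_run.csv")
    # File name inside the folder is an assumption; adjust to the real one.
    faulty = pd.read_csv("single_faults/bit_flip/conv1_w.csv")

    merged = faulty.merge(golden[["IMGID", "PRED"]], on="IMGID",
                          suffixes=("", "_GOLDEN"))
    merged["MISPREDICTION"] = merged["PRED"] != merged["PRED_GOLDEN"]
    print(merged["MISPREDICTION"].mean())  # fraction of injections changing the prediction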
Sichkar V. N. Effect of various dimension convolutional layer filters on traffic sign classification accuracy. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2019, vol. 19, no. 3, pp. 546-552. DOI: 10.17586/2226-1494-2019-19-3-546-552 (full text available at ResearchGate.net/profile/Valentyn_Sichkar)
Test online with a custom Traffic Sign here: https://valentynsichkar.name/mnist.html
Design, Train & Test deep CNN for Image Classification. Join the course & enjoy new opportunities to get deep learning skills: https://www.udemy.com/course/convolutional-neural-networks-for-image-classification/
Image (CNN Course): https://github.com/sichkar-valentyn/1-million-images-for-Traffic-Signs-Classification-tasks/blob/main/images/slideshow_classification.gif?raw=true
Image (Concept map): https://github.com/sichkar-valentyn/1-million-images-for-Traffic-Signs-Classification-tasks/blob/main/images/concept_map.png?raw=true
This is ready-to-use preprocessed data saved into a pickle file.
Preprocessing stages are as follows:
- Normalizing the whole dataset by dividing it by 255.0.
- Dividing the whole dataset into three subsets: train, validation, and test.
- Standardizing the whole dataset by subtracting the mean image and dividing by the standard deviation.
- Transposing every subset so that channels come first.
The mean image and standard deviation were calculated from the train subset and applied to all subsets.
When using a user's image for classification, it has to be preprocessed first in the same way: normalized, mean-image subtracted, and divided by the standard deviation.
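A minimal numpy sketch of that preprocessing for a single 28x28 grayscale image (function and variable names are illustrative; the mean image and standard deviation must come from this dataset's train subset):

    import numpy as np

    def preprocess(image, mean_image, std):
        x = image.astype(np.float32) / 255.0  # scale to [0, 1]
        x = (x - mean_image) / std            # subtract mean image, divide by std
        return x.reshape(1, 1, 28, 28)        # channels first, batch of one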
The data is written as a dictionary with the following keys:
x_train: (59000, 1, 28, 28)
y_train: (59000,)
x_validation: (1000, 1, 28, 28)
y_validation: (1000,)
x_test: (1000, 1, 28, 28)
y_test: (1000,)
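Loading the dictionary might look like this (the pickle file name is an assumption; substitute the actual path from this dataset):

    import pickle

    with open("data.pickle", "rb") as f:  # path is an assumption
        data = pickle.load(f)

    for key in ("x_train", "y_train", "x_validation",
                "y_validation", "x_test", "y_test"):
        print(key, data[key].shape)  # e.g. x_train (59000, 1, 28, 28)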
Contains the pretrained weights model_params_ConvNet1.pickle for a model with the following architecture:
Input --> Conv --> ReLU --> Pool --> Affine --> ReLU --> Affine --> Softmax
Parameters:
Pool stride is 2, and the pooling window height = width = 2.
The architecture can also be understood as follows:
Image (Model 1 architecture, MNIST): https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F3400968%2Fc23041248e82134b7d43ed94307b720e%2FModel_1_Architecture_MNIST.png?generation=1563654250901965&alt=media
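A PyTorch sketch of this layer order (the convolution filter count and hidden width are placeholders, since they are not stated above; the pickle file holds the actual trained parameters):

    import torch.nn as nn

    # Layer order from above; sizes marked "assumed" are placeholders.
    model = nn.Sequential(
        nn.Conv2d(1, 32, kernel_size=3, padding=1),  # Conv (32 filters assumed)
        nn.ReLU(),
        nn.MaxPool2d(2),                             # Pool: height = width = 2
        nn.Flatten(),
        nn.Linear(32 * 14 * 14, 100),                # Affine (width 100 assumed)
        nn.ReLU(),
        nn.Linear(100, 10),                          # Affine
        nn.Softmax(dim=1),                           # Softmax
    )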
The initial data is MNIST, which was collected by Yann LeCun, Corinna Cortes, and Christopher J.C. Burges.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Various deep learning techniques, including blockchain-based approaches, have been explored to unlock the potential of edge data processing and the resulting intelligence. However, existing studies often overlook the resource requirements of blockchain consensus processing in typical Internet of Things (IoT) edge network settings. This paper presents our FLCoin approach. Specifically, we propose a novel committee-based method for consensus processing in which committee members are elected via the federated learning (FL) process. Additionally, we employ a two-layer blockchain architecture for FL processing to facilitate the seamless integration of blockchain and FL techniques. Our analysis reveals that the communication overhead remains stable as the network size increases, ensuring the scalability of our blockchain-based FL system. To assess the performance of the proposed method, experiments were conducted using the MNIST dataset to train a standard five-layer CNN model. Our evaluation demonstrated the efficiency of FLCoin. With an increasing number of nodes participating in model training, the consensus latency remained below 3 s, resulting in a low total training time. Notably, compared with a blockchain-based FL system using PBFT as the consensus protocol, our approach achieved a 90% improvement in communication overhead and a 35% reduction in training time. Our approach ensures an efficient and scalable solution, enabling the integration of blockchain and FL into IoT edge networks. The proposed architecture provides a solid foundation for building intelligent IoT services.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time cost comparison of training with PBFT and FLCoin.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The widespread adoption of cloud computing necessitates privacy-preserving techniques that allow information to be processed without disclosure. This paper proposes a method to increase the accuracy and performance of privacy-preserving Convolutional Neural Networks with Homomorphic Encryption (CNN-HE) by Self-Learning Activation Functions (SLAF). SLAFs are polynomials with trainable coefficients updated during training, together with synaptic weights, for each polynomial independently, to learn task-specific and CNN-specific features. We theoretically prove that SLAF can approximate any continuous activation function to a desired error as a function of the SLAF degree. Two CNN-HE models are proposed: CNN-HE-SLAF and CNN-HE-SLAF-R. In the first model, all activation functions are replaced by SLAFs, and the CNN is trained to find weights and coefficients. In the second, the CNN is trained with the original activation, then the weights are fixed, the activation is substituted by SLAF, and the CNN is briefly re-trained to adapt the SLAF coefficients. We show that such self-learning can achieve the same accuracy (99.38%) as a non-polynomial ReLU over non-homomorphic CNNs, and leads to higher accuracy (99.21%) and higher performance (6.26 times faster) than the state-of-the-art CNN-HE CryptoNets on the MNIST optical character recognition benchmark dataset.
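To make the idea concrete, a hedged PyTorch sketch of a trainable polynomial activation in the spirit of SLAF (the degree and initialization are assumptions, not the paper's settings):

    import torch
    import torch.nn as nn

    class SLAF(nn.Module):
        """Polynomial activation with trainable coefficients (sketch)."""
        def __init__(self, degree=2):
            super().__init__()
            # One trainable coefficient per power of x, updated alongside
            # the synaptic weights during training.
            self.coeffs = nn.Parameter(torch.randn(degree + 1) * 0.1)

        def forward(self, x):
            return sum(c * x**i for i, c in enumerate(self.coeffs))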