100+ datasets found
  1. Training and Validation Datasets for Neural Network to Fill in Missing Data in EBSD Maps

    • catalog.data.gov
    • gimi9.com
    Updated Jul 9, 2025
    Cite
    National Institute of Standards and Technology (2025). Training and Validation Datasets for Neural Network to Fill in Missing Data in EBSD Maps [Dataset]. https://catalog.data.gov/dataset/training-and-validation-datasets-for-neural-network-to-fill-in-missing-data-in-ebsd-maps
    Explore at:
    Dataset updated
    Jul 9, 2025
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    This dataset consists of the synthetic electron backscatter diffraction (EBSD) maps generated for the paper "Hybrid Algorithm for Filling in Missing Data in Electron Backscatter Diffraction Maps" by Emmanuel Atindama, Conor Miller-Lynch, Huston Wilhite, Cody Mattice, Günay Doğan, and Prashant Athavale. The EBSD maps were used to train, test, and validate a neural network algorithm that fills in missing data points in a given EBSD map. The dataset includes 8000 maps for training, 1000 for testing, and 2000 for validation. It also includes noise-added versions of the maps: one noise-added map for each clean map.

  2. Neural Network - Observation dataset

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    txt
    Updated Aug 4, 2019
    Cite
    Jesus Rogel-Salazar (2019). Neural Network - Observation dataset [Dataset]. http://doi.org/10.6084/m9.figshare.9249074.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Aug 4, 2019
    Dataset provided by
    figshare (http://figshare.com/)
    Authors
    Jesus Rogel-Salazar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A dataset with observations to train a neural network.

  3. Training and testing XRD dataset for crystallite size and microstrain determination using deep neural networks

    • entrepot.recherche.data.gouv.fr
    image/x-silx-numpy +1
    Updated Nov 20, 2025
    Cite
    Alexandre BOULLE; Arthur SOUESME (2025). Training and testing XRD dataset for crystallite size and microstrain determination using deep neural networks [Dataset]. http://doi.org/10.57745/SVQART
    Explore at:
    Available download formats: text/markdown (1068), image/x-silx-numpy (6059958836), image/x-silx-numpy (673347924)
    Dataset updated
    Nov 20, 2025
    Dataset provided by
    Recherche Data Gouv
    Authors
    Alexandre BOULLE; Arthur SOUESME
    License

    Etalab Open License 2.0: https://spdx.org/licenses/etalab-2.0.html

    Time period covered
    Oct 1, 2023 - Oct 1, 2026
    Dataset funded by
    Région Nouvelle Aquitaine
    Description

    Numpy tensors to train and test a convolutional neural network dedicated to determining crystallite size and/or microstrain from X-ray diffraction (XRD) data:

    train_size.npz: training dataset with only crystallite size
    test_size.npz: testing dataset with only crystallite size
    train_size_strain.npz: training dataset with crystallite size and microstrain
    test_size_strain.npz: testing dataset with crystallite size and microstrain

    Each dataset contains the XRD data and the labels ("ground truth") as 2D tensors, with 10501 data-point columns for the XRD data and 24 label columns. Training data contain 71971 rows; testing data contain 7997 rows. Example Python script to read the data:

      import numpy as np

      train = np.load("train_size.npz")
      train_data, train_label = train["train_data"], train["train_label"]
      print(f"Train data shape: {train_data.shape}, Train labels shape: {train_label.shape}")

    Jupyter notebooks to train and test a neural network can be found at https://github.com/aboulle/LPA-NN

  4. Code supporting the paper: 1D neural network

    • data.4tu.nl
    zip
    Updated Oct 28, 2022
    + more versions
    Cite
    Ginger Egberts; Fred Vermolen; Paul van Zuijlen (2022). Code supporting the paper: 1D neural network [Dataset]. http://doi.org/10.4121/21407604.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 28, 2022
    Dataset provided by
    4TU.ResearchData
    Authors
    Ginger Egberts; Fred Vermolen; Paul van Zuijlen
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This online resource contains two archived folders, Matlab and Python, with the code supporting the article: A Bayesian finite-element trained machine learning approach for predicting post-burn contraction.


    The Matlab folder holds the code used to generate the large dataset. The file Main.m is the main file, from which one can run the Monte Carlo simulation. A README file is included.


    The Python folder holds the code used for training the neural networks and creating the online application. The file Data.mat contains the data generated by the Matlab Monte Carlo simulation. The files run_bound.py, run_rsa.py, and run_tse.py train the neural networks, and the best-scoring ones are saved in the folder Training. The DashApp folder contains the code for the creation of the Application.
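
    A minimal sketch of inspecting Data.mat from Python before training, assuming a MATLAB v7-or-earlier file (a v7.3 file would need an HDF5 reader such as h5py); the variable names inside the file are not documented here:

      from scipy.io import loadmat

      # Load the Monte Carlo output; loadmat returns a dict of variable name -> array
      mat = loadmat("Data.mat")
      print([k for k in mat if not k.startswith("__")])  # list the stored variables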

  5. Data from: Neural Network Matrix Product Operator: A Multi-Dimensionally Integrable Machine Learning Potential

    • zenodo.org
    Updated May 14, 2025
    Cite
    Kentaro Hino (2025). Neural Network Matrix Product Operator: A Multi-Dimensionally Integrable Machine Learning Potential [Dataset]. http://doi.org/10.48550/arxiv.2410.23858
    Explore at:
    Dataset updated
    May 14, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Kentaro Hino
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    For Pompon-main.zip:

    1. See README.md first for installation.
    2. Manuscript data for train/validate/test are in docs/data/*npy; they can be loaded with numpy.load (see the sketch below the list).
    3. The reference geometry of the H2CO molecule is in docs/data/bagel_h2co_dft.s0.harmonic.json.
    4. The training script is docs/notebook/_h2co_opt.py; it can be run with uv run _h2co_opt.py if uv is installed.
      1. The trained weights are in docs/data/nnmpo_final_rmse_8.365e-04.h5 (HDF5 format).
    5. NN-MPO to MPO conversion is in docs/notebook/create-random-mpo.ipynb and docs/notebook/nnmpo_to_itensor_mpo.ipynb (needs ITensors.jl version 0.6.x).
    6. The DMRG calculation with ITensors.jl is in docs/notebook/itensor_vDMRG.ipynb.
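
    A minimal sketch of step 2, assuming standard .npy files; the file name below is a placeholder for the actual names under docs/data/:

      import numpy as np

      # placeholder file name; substitute one of the actual docs/data/*.npy files
      train = np.load("docs/data/train.npy")
      print(train.shape, train.dtype)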

    If you have any questions, please post an issue on GitHub.

    Discvar-main.zip is an implementation of discrete variable representation (DVR).

  6. Convolutional Neural Networks for Classifying Combinatorial Metamaterials

    • data.europa.eu
    unknown
    Cite
    Zenodo, Convolutional Neural Networks for Classifying Combinatorial Metamaterials [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-5992648?locale=lv
    Explore at:
    Available download formats: unknown (1276985474)
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the training and test data, as well as the trained neural networks as used for the paper 'Machine Learning of Combinatorial Rules in Mechanical Metamaterials', as published in XXX. In this paper, a neural network is used to classify each \(k \times k\) unit cell design into one of two classes (C or I). Additionally, the performance of the trained networks is analysed in detail. A more detailed description of the contents of the dataset follows below.

    NeuralNetwork_train_and_test_data.zip

    This file contains the train and test data used to train the Convolutional Neural Networks (CNNs) of the paper. Each unit cell size has its own file, and is saved in a zipped numpy file type (.npz).

    CNN_saves_kxk.zip

    This file contains the parameter configurations of the CNNs trained on \(k \times k\) unit cells. Every hyperparameter (number of filters nf, number of hidden neurons nh, learning rate lr) combination is saved separately. The neural networks can be loaded using Google's TensorFlow package in Python, specifically using the 'tf.keras.models.load_model' function.

  7. Z

    Training dataset used in the magazine paper entitled "A Flexible Machine...

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    • +1more
    Updated Jan 24, 2020
    Cite
    Francisco Wilhelmi (2020). Training dataset used in the magazine paper entitled "A Flexible Machine Learning-Aware Architecture for Future WLANs" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3626690
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Universitat Pompeu Fabra
    Authors
    Francisco Wilhelmi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A Flexible Machine Learning-Aware Architecture for Future WLANs

    Authors: Francesc Wilhelmi, Sergio Barrachina-Muñoz, Boris Bellalta, Cristina Cano, Anders Jonsson & Vishnu Ram.

    Abstract: Lots of hopes have been placed in Machine Learning (ML) as a key enabler of future wireless networks. By taking advantage of the large volumes of data generated by networks, ML is expected to deal with the ever-increasing complexity of networking problems. Unfortunately, current networking systems are not yet prepared for supporting the ensuing requirements of ML-based applications, especially for enabling procedures related to data collection, processing, and output distribution. This article points out the architectural requirements that are needed to pervasively include ML as part of future wireless networks operation. To this aim, we propose to adopt the International Telecommunications Union (ITU) unified architecture for 5G and beyond. Specifically, we look into Wireless Local Area Networks (WLANs), which, due to their nature, can be found in multiple forms, ranging from cloud-based to edge-computing-like deployments. Based on ITU's architecture, we provide insights on the main requirements and the major challenges of introducing ML to the multiple modalities of WLANs.

    Dataset description: This is the dataset generated for training a Neural Network (NN) in the Access Point (AP) (re)association problem in IEEE 802.11 Wireless Local Area Networks (WLANs).

    In particular, the NN is meant to output a prediction function of the throughput that a given station (STA) can obtain from a given Access Point (AP) after association. The features included in the dataset are:

    Identifier of the AP to which the STA has been associated.

    RSSI obtained from the AP to which the STA has been associated.

    Data rate in bits per second (bps) that the STA is allowed to use for the selected AP.

    Load in packets per second (pkt/s) that the STA generates.

    Percentage of data that the AP is able to serve before the user association is done.

    Amount of traffic load in pkt/s handled by the AP before the user association is done.

    Airtime in % that the AP enjoys before the user association is done.

    Throughput in pkt/s that the STA receives after the user association is done.

    The dataset has been generated through random simulations, based on the model provided in https://github.com/toniadame/WiFi_AP_Selection_Framework. More details regarding the dataset generation have been provided in https://github.com/fwilhelmi/machine_learning_aware_architecture_wlans.
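
    A minimal sketch of separating the listed features from the throughput target, assuming a CSV export with hypothetical column names (the actual file layout is documented in the linked repositories):

      import pandas as pd

      # file and column names are assumptions, for illustration only
      df = pd.read_csv("wlan_association_dataset.csv")
      X = df.drop(columns=["throughput"])  # the input features listed above
      y = df["throughput"]                 # throughput after user association
      print(X.shape, y.shape)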

  8. Convolutional Neural Networks for Classifying Combinatorial Metamaterials

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Nov 8, 2022
    Cite
    Ryan van Mastrigt; Marjolein Dijkstra; Martin van Hecke; Corentin Coulais (2022). Convolutional Neural Networks for Classifying Combinatorial Metamaterials [Dataset]. http://doi.org/10.5281/zenodo.7071282
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 8, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ryan van Mastrigt; Marjolein Dijkstra; Martin van Hecke; Corentin Coulais
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the training and test data, as well as the trained neural networks as used for the paper 'Machine Learning of Implicit Combinatorial Rules in Mechanical Metamaterials', as published in Physical Review Letters.

    In this paper, a neural network is used to classify each \(k \times k\) unit cell design of metamaterial M1 and M2 into one of two classes (C or I). Additionally, the performance of the trained networks is analysed in detail. A more detailed description of the contents of the dataset follows below.

    NeuralNetwork_train_and_test_data.zip

    This file contains the train and test data used to train the Convolutional Neural Networks (CNNs) of the paper. Each unit cell size has its own file, and is saved in a zipped numpy file type (.npz). It contains data for metamaterial M1 ("smiley_cube"), and metamaterial M2 classification (i) ("prek_xy") and (ii) ("unimodal_vs_oligomodal_inc_stripmodes").

    CNN_saves_kxk.zip

    This file contains the parameter configurations of the CNNs trained on \(k \times k\) unit cells for metamaterial M2 classification (ii). Classification (i) is denoted by an additional M2ii in the file name. Metamaterial M1 is denoted by an extra M1 in the file name. Every hyperparameter (number of filters nf, number of hidden neurons nh, learning rate lr) combination is saved separately. The neural networks can be loaded using Google's TensorFlow package in Python, specifically using the 'tf.keras.models.load_model' function.
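
    A minimal sketch of loading one of the .npz data files and a saved network, assuming hypothetical file names inside the archives; only the tf.keras.models.load_model entry point is named by the authors:

      import numpy as np
      import tensorflow as tf

      # archive-member names below are assumptions
      data = np.load("NeuralNetwork_train_and_test_data/k4.npz")
      print(data.files)  # inspect which arrays are stored

      model = tf.keras.models.load_model("CNN_saves_4x4/nf16_nh64_lr0.001")
      model.summary()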

  9. Data from: Domain-specific neural networks improve automated bird sound recognition already with small amount of local data

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Sep 28, 2022
    Cite
    Patrik Lauha; Panu Somervuo; Petteri Lehikoinen; Lisa Geres; Tobias Richter; Sebastian Seibold; Otso Ovaskainen (2022). Domain-specific neural networks improve automated bird sound recognition already with small amount of local data [Dataset]. http://doi.org/10.5061/dryad.2bvq83btd
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 28, 2022
    Dataset provided by
    Goethe University Frankfurt
    University of Jyväskylä
    University of Helsinki
    Technical University of Munich
    Authors
    Patrik Lauha; Panu Somervuo; Petteri Lehikoinen; Lisa Geres; Tobias Richter; Sebastian Seibold; Otso Ovaskainen
    License

    CC0 1.0: https://spdx.org/licenses/CC0-1.0.html

    Description

    An automatic bird sound recognition system is a useful tool for collecting data of different bird species for ecological analysis. Together with autonomous recording units (ARUs), such a system provides a possibility to collect bird observations on a scale that no human observer could ever match. During the last decades progress has been made in the field of automatic bird sound recognition, but recognizing bird species from untargeted soundscape recordings remains a challenge. In this article we demonstrate the workflow for building a global identification model and adjusting it to perform well on the data of autonomous recorders from a specific region. We show how data augmentation and a combination of global and local data can be used to train a convolutional neural network to classify vocalizations of 101 bird species. We construct a model and train it with a global data set to obtain a base model. The base model is then fine-tuned with local data from Southern Finland in order to adapt it to the sound environment of a specific location, and tested with two data sets: one originating from the same Southern Finnish region and another originating from a different region in the German Alps. Our results suggest that fine-tuning with local data significantly improves the network performance. Classification accuracy was improved for test recordings from the same area as the local training data (Southern Finland) but not for recordings from a different region (German Alps). Data augmentation enables training with a limited amount of training data, and even with few local data samples a significant improvement over the base model can be achieved. Our model outperforms the current state-of-the-art tool for automatic bird sound classification. Using local data to adjust the recognition model for the target domain leads to improvement over general non-tailored solutions. The process introduced in this article can be applied to build a fine-tuned bird sound classification model for a specific environment.

    Methods: This repository contains data and recognition models described in the paper Domain-specific neural networks improve automated bird sound recognition already with small amount of local data (Lauha et al., 2022).
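
    The fine-tuning step described above can be sketched generically in Keras; this illustrates the transfer workflow only and is not the authors' actual model or data pipeline:

      import tensorflow as tf

      # assume a CNN already trained on the global data set (path is hypothetical)
      base_model = tf.keras.models.load_model("global_base_model.keras")

      # freeze the feature extractor; retrain only the final classification layer
      for layer in base_model.layers[:-1]:
          layer.trainable = False

      base_model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                         loss="categorical_crossentropy", metrics=["accuracy"])
      # local_train would be a tf.data.Dataset of (spectrogram, label) pairs:
      # base_model.fit(local_train, epochs=10)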

  10. Data from: Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic Prediction - Datasets

    • data.niaid.nih.gov
    Updated Nov 13, 2021
    Cite
    Schürholt, Konstantin; Kostadinov, Dimche; Borth, Damian (2021). Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic Prediction - Datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5645137
    Explore at:
    Dataset updated
    Nov 13, 2021
    Dataset provided by
    University of St.Gallen
    Authors
    Schürholt, Konstantin; Kostadinov, Dimche; Borth, Damian
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets for the NeurIPS 2021 accepted paper "Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic Prediction".

    Datasets are pytorch files containing a dictionary with training, validation and test sets. Train, validation and test sets are custom dataset classes which inherit from the standard torch dataset class. Corresponding code can be found at https://github.com/HSG-AIML/NeurIPS_2021-Weight_Space_Learning.
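
    A minimal sketch of opening one of the dataset files, assuming the custom dataset classes from the linked repository are importable and that the dictionary keys follow the train/validation/test naming described above (file name and key are assumptions):

      import torch

      # the custom dataset classes must be importable for unpickling to succeed
      data = torch.load("dataset.pt", map_location="cpu")  # hypothetical file name
      train_set = data["trainset"]                         # hypothetical key
      print(len(train_set))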

    Datasets 41, 42, 43 and 44 are our dataset format wrapped around the zoos from Unterthiner et al., 2020 (https://github.com/google-research/google-research/tree/master/dnn_predict_accuracy).

    Abstract: Self-Supervised Learning (SSL) has been shown to learn useful and information-preserving representations. Neural Networks (NNs) are widely applied, yet their weight space is still not fully understood. Therefore, we propose to use SSL to learn neural representations of the weights of populations of NNs. To that end, we introduce domain specific data augmentations and an adapted attention architecture. Our empirical evaluation demonstrates that self-supervised representation learning in this domain is able to recover diverse NN model characteristics. Further, we show that the proposed learned representations outperform prior work for predicting hyper-parameters, test accuracy, and generalization gap as well as transfer to out-of-distribution settings.

  11. Data Repository for: On Reducing the Amount of Samples Required for Training of QNNs

    • darus.uni-stuttgart.de
    Updated Sep 27, 2023
    Cite
    Alexander Mandl; Johanna Barzen; Frank Leymann; Victoria Mangold; Benedikt Riegel; Daniel Vietz; Felix Winterhalter (2023). Data Repository for: On Reducing the Amount of Samples Required for Training of QNNs [Dataset]. http://doi.org/10.18419/DARUS-3442
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 27, 2023
    Dataset provided by
    DaRUS
    Authors
    Alexander Mandl; Johanna Barzen; Frank Leymann; Victoria Mangold; Benedikt Riegel; Daniel Vietz; Felix Winterhalter
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Dataset funded by
    BMWK
    Description

    Simulation experiment data for training Quantum Neural Networks (QNNs) using entangled datasets. The experiments investigate the validity of the lower bounds for the expected risk after training QNNs given by the extensions to the Quantum No-Free-Lunch theorem presented in the related publication. The QNNs are trained with (i) samples of varying Schmidt rank, (ii) orthogonal samples of fixed Schmidt rank and (iii) linearly dependent samples of fixed Schmidt rank. The dataset contains raw experiment data (directory "raw_data"), analyzed mean risks and errors (directory "plot_data") and the resulting plots (directory "plots").

    Experiments: The experiments train QNNs using various compositions of training samples on a simulator and extract the risk after training to compute average risks.

    Experiment 1: Trains QNNs using entangled training samples of varying Schmidt rank. The average Schmidt rank and the number of training samples are controlled. Raw data: average_rank_results.zip; computed average risks: avg_rank_risks.npy; computed average losses: avg_rank_losses.npy; plotted average risks: avg_rank_experiments.pdf; plotted average losses: avg_rank_losses.pdf.

    Experiment 2: Trains QNNs using entangled orthogonal training samples. The number of training samples is controlled and the Schmidt rank is fixed such that d = r*t for the dimension d of the Hilbert space. Raw data: orthogonal_results.zip; computed average risks: orthogonal_exp_points.npy; plotted average risks: orthogonal_experiments.pdf.

    Experiment 3: Trains QNNs using entangled linearly dependent training samples. The number of training samples is controlled and the Schmidt rank is fixed such that d = r*t for the dimension d of the Hilbert space. Raw data: not_linearly_independent_results.zip; computed average risks: nlihx_exp_points.npy; plotted average risks: nlihx_experiments.pdf.

    Additionally, this repository contains the reproduction data for Figure 1 (phases_in_orthogonal_training.zip). This file contains the training data, the target unitary and the resulting hypothesis unitary for orthogonal training samples of (i) high risk and (ii) low risk. For the code to reproduce and analyze the experiments see the Code repository.
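
    A minimal sketch of inspecting the analyzed results, assuming the .npy files listed above load as plain NumPy arrays and follow the described directory layout:

      import numpy as np

      avg_risks = np.load("plot_data/avg_rank_risks.npy")
      print(avg_risks.shape)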

  12. Training datasets for AIMNet2 machine-learned neural network potential

    • kilthub.cmu.edu
    txt
    Updated Jan 27, 2025
    Cite
    Roman Zubatiuk; Olexandr Isayev; Dylan Anstine (2025). Training datasets for AIMNet2 machine-learned neural network potential [Dataset]. http://doi.org/10.1184/R1/27629937.v2
    Explore at:
    Available download formats: txt
    Dataset updated
    Jan 27, 2025
    Dataset provided by
    Carnegie Mellon University
    Authors
    Roman Zubatiuk; Olexandr Isayev; Dylan Anstine
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    The datasets contain molecular structures and properties computed with the B97-3c (GGA DFT) or wB97M-def2-TZVPP (range-separated hybrid DFT) methods. Each data file contains about 20M structures. DFT calculations were performed with the ORCA 5.0.3 software. Properties include energy, forces, atomic charges, and molecular dipole and quadrupole moments.

  13. Data from: Processed Lab Data for Neural Network-Based Shear Stress Level Prediction

    • catalog.data.gov
    • data.openei.org
    • +3more
    Updated Jan 20, 2025
    + more versions
    Cite
    Pennsylvania State University (2025). Processed Lab Data for Neural Network-Based Shear Stress Level Prediction [Dataset]. https://catalog.data.gov/dataset/processed-lab-data-for-neural-network-based-shear-stress-level-prediction-309d2
    Explore at:
    Dataset updated
    Jan 20, 2025
    Dataset provided by
    Pennsylvania State University
    Description

    Machine learning can be used to predict fault properties such as shear stress, friction, and time to failure using continuous records of fault zone acoustic emissions. The files are extracted features and labels from lab data (experiment p4679). The features are extracted with a non-overlapping window from the original acoustic data. The first column is the time of the window. The second and third columns are the mean and the variance of the acoustic data in this window, respectively. Columns 4-11 are the power spectral density, ordered from low to high frequency. The last column is the corresponding label (shear stress level). The file name indicates which driving velocity the sequence was generated from. Data were generated from laboratory friction experiments conducted with a biaxial shear apparatus. Experiments were conducted in the double direct shear configuration, in which two fault zones are sheared between three rigid forcing blocks. Our samples consisted of two 5-mm-thick layers of simulated fault gouge with a nominal contact area of 10 by 10 cm^2. Gouge material consisted of soda-lime glass beads with initial particle size between 105 and 149 micrometers. Prior to shearing, we impose a constant fault normal stress of 2 MPa using a servo-controlled load-feedback mechanism and allow the sample to compact. Once the sample has reached a constant layer thickness, the central block is driven down at a constant rate of 10 micrometers per second. In tandem, we collect an AE signal continuously at 4 MHz from a piezoceramic sensor embedded in a steel forcing block about 22 mm from the gouge layer. The data from this experiment can be used to train a deep learning algorithm for future fault property prediction.
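
    A minimal sketch of splitting one of these files into features and labels, assuming a plain-text table whose columns follow the order described above; the file name is hypothetical:

      import numpy as np

      data = np.loadtxt("p4679_10umps.txt")  # hypothetical file name
      t = data[:, 0]                # time of each window
      features = data[:, 1:11]      # mean, variance, and the 8 PSD bands
      labels = data[:, -1]          # shear stress level
      print(features.shape, labels.shape)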

  14. Data from: Modeling the Spread of a Livestock Disease With Semi-Supervised Spatiotemporal Deep Neural Networks

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Data from: Modeling the Spread of a Livestock Disease With Semi-Supervised Spatiotemporal Deep Neural Networks [Dataset]. https://catalog.data.gov/dataset/data-from-modeling-the-spread-of-a-livestock-disease-with-semi-supervised-spatiotemporal-d-bdd33
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    This dataset contains the spatiotemporal data used to train the spatiotemporal deep neural networks described in "Modeling the Spread of a Livestock Disease With Semi-Supervised Spatiotemporal Deep Neural Networks". The dataset consists of two sets of NumPy arrays. The first set, X_grid.npy and Y_grid.npy, was used to train the convolutional LSTM, while the second set, X_graph.npy, Y_graph.npy, and edge_index.npy, was used to train the graph convolutional LSTM. The data consists of spatiotemporally varying environmental and anthropogenic variables along with case reports of vesicular stomatitis.

    Resources in this dataset:
    Resource Title: NumPy Arrays of Spatiotemporal Features and VS Cases. File Name: vs_data.zip
    Resource Description: This is a ZIP archive containing five NumPy arrays of spatiotemporal features and geotagged VS cases.
    Resource Software Recommended: NumPy, url: https://numpy.org/
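
    A minimal sketch of loading the five arrays after extracting vs_data.zip, assuming they sit in the working directory:

      import numpy as np

      X_grid = np.load("X_grid.npy")          # inputs for the convolutional LSTM
      Y_grid = np.load("Y_grid.npy")
      X_graph = np.load("X_graph.npy")        # inputs for the graph convolutional LSTM
      Y_graph = np.load("Y_graph.npy")
      edge_index = np.load("edge_index.npy")  # graph connectivity
      print(X_grid.shape, X_graph.shape, edge_index.shape)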

  15. Cats vs Dogs Redux Transfer Features

    • kaggle.com
    zip
    Updated Aug 22, 2018
    Cite
    Kanwalinder Singh (2018). Cats vs Dogs Redux Transfer Features [Dataset]. https://www.kaggle.com/kanwalinder/cats-vs-dogs-redux-transfer-features
    Explore at:
    Available download formats: zip (1345261572 bytes)
    Dataset updated
    Aug 22, 2018
    Authors
    Kanwalinder Singh
    License

    CC0 1.0 Universal: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Most machine learning courses start by implementing a fully-connected Deep Neural Network (DNN) and proceed towards Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), teaching skills on how to manage training, inference, and deployment along the way. For most beginners, the problem with building DNNs from scratch is that either the input data has to be grossly simplified (working with 64x64x3 images, for example) or the network has so many parameters that it is very hard to train. Meanwhile, Transfer Learning has made building even CNNs and RNNs from scratch unnecessary, and one can reuse and/or fine-tune publicly available CNNs like Inception V3 with very little data for a new problem.

    The purpose of this dataset is to make a large dataset of 25000 training examples and 12500 test examples from the ever-popular Dogs vs Cats Redux competition available, suitable for students just starting on machine learning. The base dataset, which consists of fairly large images, has been passed through publicly available CNNs like Inception V3, Inception Resnet V2, Resnet 50, Xception, and MobileNet, creating features from which it is very easy to build a pretty good DNN classifier. This should make learning to build DNNs from scratch easy to do, while learning a bit of transfer learning and even "competing" in Dogs vs Cats Redux for kicks!

    Content

    As mentioned, the input data for this dataset are images from the Dogs vs Cats Redux competition. All transfer learning CNN models were obtained from keras.applications. The features derived by processing the input images through the transfer models are flat (25000x2048 training examples and 12500x2048 test examples when using Inception V3) and ready for ingestion into a DNN. In addition, the dataset provides ids from the original training and test examples so classification results can be reviewed against the base data.

    Note that while the classic goal of transfer learning is to apply a network on a smaller dataset and/or fine tune the transferred network on said dataset, the purpose of this dataset is subtly different: make a large dataset available for beginners to build DNNs with. Of course, a subset of the dataset can be used for classification and the base transfer models can be fine tuned.
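
    A minimal sketch of the kind of classifier this dataset targets, assuming the Inception V3 features are already loaded as X_train (25000x2048) and y_train (25000,) arrays; the architecture is an illustration, not a recommended solution:

      import tensorflow as tf

      model = tf.keras.Sequential([
          tf.keras.layers.Dense(256, activation="relu", input_shape=(2048,)),
          tf.keras.layers.Dropout(0.5),
          tf.keras.layers.Dense(1, activation="sigmoid"),  # cat vs dog
      ])
      model.compile(optimizer="adam", loss="binary_crossentropy",
                    metrics=["accuracy"])
      # model.fit(X_train, y_train, validation_split=0.1, epochs=10)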

    Acknowledgements

    Francois Chollet's Keras framework, specifically keras.applications.

    Dr. Andrew Ng's deeplearning.ai specialization on Coursera. In my spare time, I mentor students in Coursera's Neural Networks and Deep Learning and Convolutional Neural Networks courses.

    Inspiration

    Initially I am posting just the dataset, and will later post the kernel that produced the dataset and a kernel that will use the dataset to classify for Dogs vs Cats Redux. Can you duplicate the log loss score of 0.21 currently possible with reusing the transfer models with no fine-tuning? Can you get into the top 50 by fine tuning the base models and/or augmenting the input data?

  16. IITM_CS6910_Assignment_dataset

    • kaggle.com
    zip
    Updated Mar 6, 2024
    Cite
    Anik Bhowmick ae20b102 (2024). IITM_CS6910_Assignment_dataset [Dataset]. https://www.kaggle.com/datasets/anikbhowmickae20b102/iitm-cs6910-assignment-dataset
    Explore at:
    Available download formats: zip (3093048 bytes)
    Dataset updated
    Mar 6, 2024
    Authors
    Anik Bhowmick ae20b102
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset is part of a course assignment at IIT Madras. It is ideal for people who are new to neural networks. The data essentially contains features extracted from images. The goal is to train a model for multiclass classification.

  17. Manga Vs Classic Art Style Comic Images Dataset

    • kaggle.com
    zip
    Updated Apr 15, 2024
    Cite
    Saral Agrawal (2024). Manga Vs Classic Art Style Comic Images Dataset [Dataset]. https://www.kaggle.com/datasets/saralagrawal/manga-vs-classic-art-style-comic-images/code
    Explore at:
    Available download formats: zip (52208257 bytes)
    Dataset updated
    Apr 15, 2024
    Authors
    Saral Agrawal
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This data was collected to train a Convolutional Neural Network classifier for Manga vs Classic art style comic images using transfer learning (VGG16); the classifier was later deployed using Flask.
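
    A generic sketch of the VGG16 transfer-learning setup described, assuming Keras; the input size and the classification head are assumptions, not the author's exact code:

      import tensorflow as tf

      base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                         input_shape=(224, 224, 3))
      base.trainable = False  # use VGG16 as a frozen feature extractor

      model = tf.keras.Sequential([
          base,
          tf.keras.layers.GlobalAveragePooling2D(),
          tf.keras.layers.Dense(1, activation="sigmoid"),  # manga vs classic art
      ])
      model.compile(optimizer="adam", loss="binary_crossentropy",
                    metrics=["accuracy"])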

  18. Deep learning based Missing Data Imputation

    • scidb.cn
    Updated Mar 4, 2024
    Cite
    Mahjabeen Tahir (2024). Deep learning based Missing Data Imputation [Dataset]. http://doi.org/10.57760/sciencedb.16599
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 4, 2024
    Dataset provided by
    Science Data Bank
    Authors
    Mahjabeen Tahir
    Description

    The code provided is related to training an autoencoder, evaluating its performance, and using it for imputing missing values in a dataset. Each part is described below; a sketch of the imputation step follows.

    Training the autoencoder (train_autoencoder function): This function takes an autoencoder model and the input features as input. It trains the autoencoder using the input features as both input and target output (hence features, features). The autoencoder is trained for a specified number of epochs (epochs) with a given batch size (batch_size). The shuffle=True argument ensures that the data is shuffled before each epoch to prevent the model from memorizing the input order. After training, it returns the trained autoencoder model and the training history.

    Evaluating the autoencoder (evaluate_autoencoder function): This function takes a trained autoencoder model and the input features as input. It uses the trained autoencoder to predict the reconstructed features from the input features. It calculates Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared (R2) scores between the original and reconstructed features. These metrics provide insight into how well the autoencoder reconstructs the input features.

    Imputing with the autoencoder (impute_with_autoencoder function): This function takes a trained autoencoder model and the input features as input. It identifies missing values (e.g., -9999) in the input features. For each row with missing values, it predicts the missing values using the trained autoencoder and replaces them with the predicted values. The imputed features are returned as output.

    To reuse this code: load your dataset and preprocess it as necessary; build an autoencoder model using the build_autoencoder function; train it using the train_autoencoder function with your input features; evaluate its performance using the evaluate_autoencoder function; if your dataset contains missing values, impute them using the impute_with_autoencoder function; and use the trained autoencoder for any other relevant tasks, such as feature extraction or anomaly detection.
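
    A minimal sketch of the imputation step as described, assuming a Keras-style autoencoder and a -9999 sentinel; this is an illustration, not the repository's actual impute_with_autoencoder:

      import numpy as np

      def impute_with_sentinel(autoencoder, features, sentinel=-9999.0):
          """Replace sentinel-marked entries with autoencoder reconstructions."""
          imputed = np.asarray(features, dtype=float).copy()
          mask = imputed == sentinel
          # crude first guess (column means) so the network sees finite inputs
          col_means = np.nanmean(np.where(mask, np.nan, imputed), axis=0)
          imputed[mask] = np.take(col_means, np.where(mask)[1])
          # reconstruct, then overwrite only the originally missing positions
          recon = autoencoder.predict(imputed, verbose=0)
          imputed[mask] = recon[mask]
          return imputed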

  19. Load times during file-format analysis.

    • plos.figshare.com
    xlsx
    Updated Aug 6, 2024
    Cite
    Jesper Strøm; Andreas Larsen Engholm; Kristian Peter Lorenzen; Kaare B. Mikkelsen (2024). Load times during file-format analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0307202.s002
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Aug 6, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jesper Strøm; Andreas Larsen Engholm; Kristian Peter Lorenzen; Kaare B. Mikkelsen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The individual values, mean and standard deviation for file-format load times during the analysis. (XLSX)

  20. Supporting Data for: UltraMNIST Classification: A Benchmark to Train CNNs for Very Large Images

    • dataverse.no
    • dataverse.azure.uit.no
    • +1more
    csv, txt, zip
    Updated Sep 28, 2023
    Cite
    Deepak K. Gupta; Udbhav Bhamba; Abhishek Thakur; Akash Gupta; Suraj Sharan; Ertugrul Demir; Dilip K. Prasad (2023). Supporting Data for: UltraMNIST Classification: A Benchmark to Train CNNs for Very Large Images [Dataset]. http://doi.org/10.18710/4F4KJS
    Explore at:
    Available download formats: zip (9346307754), zip (9428095020), csv (382013), txt (6374)
    Dataset updated
    Sep 28, 2023
    Dataset provided by
    DataverseNO
    Authors
    Deepak K. Gupta; Udbhav Bhamba; Abhishek Thakur; Akash Gupta; Suraj Sharan; Ertugrul Demir; Dilip K. Prasad
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Dataset funded by
    The Research Council of Norway
    UiT The Arctic University of Norway
    Description

    Convolutional neural network (CNN) approaches available in the current literature are designed to work primarily with low-resolution images. When applied to very large images, challenges arise related to GPU memory, a receptive field smaller than needed for semantic correspondence, and the need to incorporate multi-scale features. The resolution of input images can be reduced, but with significant loss of critical information. Based on the outlined issues, we introduce a novel research problem of training CNN models for very large images, and present 'UltraMNIST dataset', a simple yet representative benchmark dataset for this task. UltraMNIST has been designed using the popular MNIST digits with additional levels of complexity added to replicate well the challenges of real-world problems. We present two variants of the problem: 'UltraMNIST classification' and 'Budget-aware UltraMNIST classification'. The standard UltraMNIST classification benchmark is intended to facilitate the development of novel CNN training methods that make effective use of the best available GPU resources. The budget-aware variant is intended to promote development of methods that work under constrained GPU memory. For the development of competitive solutions, we present several baseline models for the standard benchmark and its budget-aware variant. We study the effect of reducing resolution on performance and present results for baseline models involving pretrained backbones from among the popular state-of-the-art models. Finally, with the presented benchmark dataset and the baselines, we hope to pave the way for a new generation of CNN methods suitable for handling large images in an efficient and resource-light manner.

    The UltraMNIST dataset comprises very large-scale images, each of 4000x4000 pixels with 3-5 digits per image. Each of these digits has been extracted from the original MNIST dataset. Your task is to predict the sum of the digits per image; this number can be anything from 0 to 27.
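
    A minimal sketch of the resolution-reduction baseline mentioned above: aggressively downscale a 4000x4000 UltraMNIST image before feeding it to a standard CNN (file name and target size are assumptions):

      import tensorflow as tf

      img = tf.io.decode_png(tf.io.read_file("ultramnist_sample.png"), channels=1)  # hypothetical file
      small = tf.image.resize(img, (512, 512))   # heavy downscaling blurs fine digit detail
      batch = tf.expand_dims(small / 255.0, 0)   # batch of one, scaled to [0, 1]
      print(batch.shape)                         # (1, 512, 512, 1)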
