100+ datasets found

Training a Neural Network Model on Encrypted MNIST Data
resodate.org
service.tib.eu
Updated Dec 16, 2024
Cite
Dowlin et al. (2024). Training a Neural Network Model on Encrypted MNIST Data [Dataset]. https://resodate.org/resources/aHR0cHM6Ly9zZXJ2aWNlLnRpYi5ldS9sZG1zZXJ2aWNlL2RhdGFzZXQvdHJhaW5pbmctYS1uZXVyYWwtbmV0d29yay1tb2RlbC1vbi1lbmNyeXB0ZWQtbW5pc3QtZGF0YQ==
Dataset updated
Dec 16, 2024
Dataset provided by
Leibniz Data Manager
Authors
Dowlin et al.
Description
The dataset used in this paper is not explicitly mentioned, but it is implied to be a large-scale dataset for machine learning.
Data from: Deep learning neural network derivation and testing to...
tandf.figshare.com
datasetcatalog.nlm.nih.gov
png
Updated Aug 8, 2023
Cite
Omid Mehrpour; Christopher Hoyte; Abdullah Al Masud; Ashis Biswas; Jonathan Schimmel; Samaneh Nakhaee; Mohammad Sadegh Nasr; Heather Delva-Clark; Foster Goss (2023). Deep learning neural network derivation and testing to distinguish acute poisonings [Dataset]. http://doi.org/10.6084/m9.figshare.23694504.v1
Available download formats: png
Unique identifier
https://doi.org/10.6084/m9.figshare.23694504.v1
Dataset updated
Aug 8, 2023
Dataset provided by
Taylor & Francis
Authors
Omid Mehrpour; Christopher Hoyte; Abdullah Al Masud; Ashis Biswas; Jonathan Schimmel; Samaneh Nakhaee; Mohammad Sadegh Nasr; Heather Delva-Clark; Foster Goss
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Acute poisoning is a significant global health burden, and the causative agent is often unclear. The primary aim of this pilot study was to develop a deep learning algorithm that predicts the most probable agent a poisoned patient was exposed to from a pre-specified list of drugs. Data were queried from the National Poison Data System (NPDS) from 2014 through 2018 for eight single-agent poisonings (acetaminophen, diphenhydramine, aspirin, calcium channel blockers, sulfonylureas, benzodiazepines, bupropion, and lithium). Two Deep Neural Networks (PyTorch and Keras) designed for multi-class classification tasks were applied. There were 201,031 single-agent poisonings included in the analysis. For distinguishing among selected poisonings, the PyTorch model had a specificity of 97%, accuracy of 83%, precision of 83%, recall of 83%, and an F1-score of 82%. The Keras model had a specificity of 98%, accuracy of 83%, precision of 84%, recall of 83%, and an F1-score of 83%. The best performance was achieved in diagnosing poisoning by lithium, sulfonylureas, diphenhydramine, calcium channel blockers, then acetaminophen, in PyTorch (F1-score = 99%, 94%, 85%, 83%, and 82%, respectively) and Keras (F1-score = 99%, 94%, 86%, 82%, and 82%, respectively). Deep neural networks can potentially help in distinguishing the causative agent of acute poisoning. This study used a small list of drugs, with polysubstance ingestions excluded. Reproducible source code and results can be obtained at https://github.com/ashiskb/npds-workspace.git.
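For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how per-class precision, recall, specificity, and F1-score can be derived from a multi-class confusion matrix. The function name `per_class_metrics` and the toy 3-class matrix are illustrative assumptions, not taken from the study or its repository, and the sketch assumes every class is predicted at least once (no division by zero).

```python
import numpy as np

def per_class_metrics(confusion: np.ndarray) -> dict:
    """Per-class metrics from a confusion matrix.

    confusion[i, j] counts samples of true class i predicted as class j.
    Hypothetical helper for illustration only.
    """
    confusion = confusion.astype(float)
    tp = np.diag(confusion)
    fp = confusion.sum(axis=0) - tp   # predicted as the class, but wrong
    fn = confusion.sum(axis=1) - tp   # belonged to the class, but missed
    tn = confusion.sum() - tp - fp - fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }

# Toy confusion matrix (made-up numbers, three classes)
toy = np.array([[8, 1, 1],
                [2, 7, 1],
                [0, 1, 9]])
metrics = per_class_metrics(toy)
accuracy = np.trace(toy) / toy.sum()
print(metrics["f1"], accuracy)
```

Averaging the per-class values (for example with `metrics["f1"].mean()`) gives a macro-averaged score; whether the study reports macro or weighted averages is not stated in the description.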
dataset for neural network training (gas network)
kaggle.com
zip
Updated Apr 27, 2024
Cite
mohammad meysam mollai (2024). dataset for neural network training (gas network) [Dataset]. https://www.kaggle.com/datasets/mohammadmeysammollai/dataset-for-neural-network-training-gas-network
Available download formats: zip (9282 bytes)
Dataset updated
Apr 27, 2024
Authors
mohammad meysam mollai
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Dataset for neural network training (gas network). This dataset was compiled from data available on the internet: a large amount of data was cleaned, extraneous items were removed, and the important items were kept.
Training and Validation Datasets for Neural Network to Fill in Missing Data...
catalog.data.gov
gimi9.com
Updated Jul 9, 2025
Cite
National Institute of Standards and Technology (2025). Training and Validation Datasets for Neural Network to Fill in Missing Data in EBSD Maps [Dataset]. https://catalog.data.gov/dataset/training-and-validation-datasets-for-neural-network-to-fill-in-missing-data-in-ebsd-maps
Dataset updated
Jul 9, 2025
Dataset provided by
National Institute of Standards and Technology (http://www.nist.gov/)
Description
This dataset consists of the synthetic electron backscatter diffraction (EBSD) maps generated for the paper titled "Hybrid Algorithm for Filling in Missing Data in Electron Backscatter Diffraction Maps" by Emmanuel Atindama, Conor Miller-Lynch, Huston Wilhite, Cody Mattice, Günay Doğan, and Prashant Athavale. The EBSD maps were used to train, test, and validate a neural network algorithm to fill in missing data points in a given EBSD map. The dataset includes 8000 maps for training, 1000 maps for testing, and 2000 maps for validation. The dataset also includes noise-added versions of the maps, namely one noise-added map for each clean map.
Training dataset used in the magazine paper entitled "A Flexible Machine...
data.niaid.nih.gov
data-staging.niaid.nih.gov
Updated Jan 24, 2020
Cite
Francisco Wilhelmi (2020). Training dataset used in the magazine paper entitled "A Flexible Machine Learning-Aware Architecture for Future WLANs" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3626690
Dataset updated
Jan 24, 2020
Dataset provided by
Universitat Pompeu Fabra
Authors
Francisco Wilhelmi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A Flexible Machine Learning-Aware Architecture for Future WLANs

Authors: Francesc Wilhelmi, Sergio Barrachina-Muñoz, Boris Bellalta, Cristina Cano, Anders Jonsson & Vishnu Ram.

Abstract: Lots of hopes have been placed in Machine Learning (ML) as a key enabler of future wireless networks. By taking advantage of the large volumes of data generated by networks, ML is expected to deal with the ever-increasing complexity of networking problems. Unfortunately, current networking systems are not yet prepared for supporting the ensuing requirements of ML-based applications, especially for enabling procedures related to data collection, processing, and output distribution. This article points out the architectural requirements that are needed to pervasively include ML as part of future wireless networks operation. To this aim, we propose to adopt the International Telecommunications Union (ITU) unified architecture for 5G and beyond. Specifically, we look into Wireless Local Area Networks (WLANs), which, due to their nature, can be found in multiple forms, ranging from cloud-based to edge-computing-like deployments. Based on ITU's architecture, we provide insights on the main requirements and the major challenges of introducing ML to the multiple modalities of WLANs.

Dataset description: This is the dataset generated for training a Neural Network (NN) in the Access Point (AP) (re)association problem in IEEE 802.11 Wireless Local Area Networks (WLANs).

In particular, the NN is meant to output a prediction function of the throughput that a given station (STA) can obtain from a given Access Point (AP) after association. The features included in the dataset are:

Identifier of the AP to which the STA has been associated.

RSSI obtained from the AP to which the STA has been associated.

Data rate in bits per second (bps) that the STA is allowed to use for the selected AP.

Load in packets per second (pkt/s) that the STA generates.

Percentage of data that the AP is able to serve before the user association is done.

Amount of traffic load in pkt/s handled by the AP before the user association is done.

Airtime in % that the AP enjoys before the user association is done.

Throughput in pkt/s that the STA receives after the user association is done.

The dataset has been generated through random simulations, based on the model provided in https://github.com/toniadame/WiFi_AP_Selection_Framework. More details regarding the dataset generation have been provided in https://github.com/fwilhelmi/machine_learning_aware_architecture_wlans.
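As an illustration of the feature list above, one row of the dataset could be modelled as a small record type. This is only a sketch: the class name, field names, and sample values are hypothetical, and the actual column names, order, and file format are defined by the linked repositories, not by this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AssociationSample:
    """One hypothetical row: seven input features plus the target throughput."""
    ap_id: int               # identifier of the AP the STA is associated to
    rssi: float              # RSSI obtained from that AP
    data_rate_bps: float     # data rate the STA may use, in bits per second
    sta_load_pkts: float     # load generated by the STA, in pkt/s
    ap_served_pct: float     # % of data the AP serves before association
    ap_load_pkts: float      # traffic load handled by the AP before association, pkt/s
    ap_airtime_pct: float    # airtime of the AP before association, in %
    throughput_pkts: float   # target: STA throughput after association, pkt/s

    def features(self) -> list:
        """Return the seven model inputs, excluding the target."""
        return [
            float(self.ap_id), self.rssi, self.data_rate_bps,
            self.sta_load_pkts, self.ap_served_pct,
            self.ap_load_pkts, self.ap_airtime_pct,
        ]

# Made-up values, for illustration only
sample = AssociationSample(3, -62.0, 54e6, 120.0, 98.5, 900.0, 41.0, 118.0)
print(sample.features(), sample.throughput_pkts)
```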
Data from: Prediction models in the design of neural network based ECG...
healthdata.gov
data.virginia.gov
csv, xlsx, xml
Updated Jul 14, 2025
Cite
(2025). Prediction models in the design of neural network based ECG classifiers: A neural network and genetic programming approach [Dataset]. https://healthdata.gov/d/6w28-3tdb
Available download formats: csv, xml, xlsx
Dataset updated
Jul 14, 2025
Description
Background: Classification of the electrocardiogram using Neural Networks has become a widely used method in recent years. The efficiency of these classifiers depends upon a number of factors including network training. Unfortunately, there is a shortage of evidence available to enable specific design choices to be made and, as a consequence, many designs are made on the basis of trial and error. In this study we develop prediction models to indicate the point at which training should stop for Neural Network based Electrocardiogram classifiers in order to ensure maximum generalisation.

Methods: Two prediction models have been presented, one based on Neural Networks and the other on Genetic Programming. The inputs to the models were 5 variable training parameters and the output indicated the point at which training should stop. Training and testing of the models was based on the results from 44 previously developed bi-group Neural Network classifiers, discriminating between Anterior Myocardial Infarction and normal patients.

Results: Our results show that both approaches provide close fits to the training data; p = 0.627 and p = 0.304 for the Neural Network and Genetic Programming methods respectively. For unseen data, the Neural Network exhibited no significant differences between actual and predicted outputs (p = 0.306) while the Genetic Programming method showed a marginally significant difference (p = 0.047).

Conclusions: The approaches provide reverse engineering solutions to the development of Neural Network based Electrocardiogram classifiers. That is, given the network design and architecture, an indication can be given as to when training should stop to obtain maximum network generalisation.
Model Zoo: A Dataset of Diverse Populations of Neural Network Models - MNIST...
data.niaid.nih.gov
Updated Jun 13, 2022
Cite
Schürholt, Konstantin; Taskiran, Diyar; Knyazev, Boris; Giró-i-Nieto, Xavier; Borth, Damian (2022). Model Zoo: A Dataset of Diverse Populations of Neural Network Models - MNIST [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6632086
Dataset updated
Jun 13, 2022
Dataset provided by
AI Lab Montreal, Samsung Advanced Institute of Technology
AIML Lab, University of St.Gallen
Image Processing Group, Universitat Politècnica de Catalunya
Authors
Schürholt, Konstantin; Taskiran, Diyar; Knyazev, Boris; Giró-i-Nieto, Xavier; Borth, Damian
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract

In the last years, neural networks have evolved from laboratory environments to the state-of-the-art for many real-world problems. Our hypothesis is that neural network models (i.e., their weights and biases) evolve on unique, smooth trajectories in weight space during training. It follows that a population of such neural network models (referred to as a “model zoo”) would form topological structures in weight space. We think that the geometry, curvature and smoothness of these structures contain information about the state of training and can reveal latent properties of individual models. With such zoos, one could investigate novel approaches for (i) model analysis, (ii) discovering unknown learning dynamics, (iii) learning rich representations of such populations, or (iv) exploiting the model zoos for generative modelling of neural network weights and biases. Unfortunately, the lack of standardized model zoos and available benchmarks significantly increases the friction for further research about populations of neural networks. With this work, we publish a novel dataset of model zoos containing systematically generated and diverse populations of neural network models for further research. In total, the proposed model zoo dataset is based on six image datasets, consists of 24 model zoos generated with varying hyperparameter combinations, and includes 47’360 unique neural network models, resulting in over 2’415’360 collected model states. In addition to the model zoo data, we provide an in-depth analysis of the zoos and benchmarks for multiple downstream tasks as mentioned before.

Dataset

This dataset is part of a larger collection of model zoos and contains the zoos trained on the labelled samples from MNIST. All zoos with extensive information and code can be found at www.modelzoos.cc.

This repository contains two types of files: the raw model zoos as collections of models (file names beginning with "mnist_"), as well as preprocessed model zoos wrapped in a custom pytorch dataset class (filenames beginning with "dataset"). Zoos are trained in three configurations varying the seed only (seed), varying hyperparameters with fixed seeds (hyp_fix) or varying hyperparameters with random seeds (hyp_rand). The index_dict.json files contain information on how to read the vectorized models.

For more information on the zoos and code to access and use the zoos, please see www.modelzoos.cc.
Training and testing XRD dataset for crystallite size and microstrain...
entrepot.recherche.data.gouv.fr
image/x-silx-numpy +1
Updated Nov 20, 2025
Cite
Alexandre BOULLE; Arthur SOUESME (2025). Training and testing XRD dataset for crystallite size and microstrain determination using deep neural networks [Dataset]. http://doi.org/10.57745/SVQART
Available download formats: text/markdown (1068), image/x-silx-numpy (6059958836), image/x-silx-numpy (673347924)
Unique identifier
https://doi.org/10.57745/SVQART
Dataset updated
Nov 20, 2025
Dataset provided by
Recherche Data Gouv
Authors
Alexandre BOULLE; Arthur SOUESME
License
https://spdx.org/licenses/etalab-2.0.html
Time period covered
Oct 1, 2023 - Oct 1, 2026
Dataset funded by
Région Nouvelle Aquitaine
Description
Numpy tensors to train and test a convolutional neural network dedicated to determining crystallite size and/or microstrain from X-ray diffraction (XRD) data:

train_size.npz: training dataset with only crystallite size
test_size.npz: testing dataset with only crystallite size
train_size_strain.npz: training dataset with crystallite size and microstrain
test_size_strain.npz: testing dataset with crystallite size and microstrain

Each dataset contains the XRD data and the labels ("ground truth") in the form of 2D tensors with 10501 data points (columns) for the XRD data, and 24 labels (columns) for the labels. Training data contain 71971 rows; testing data contain 7997 rows.

Example Python script to read the data:

    import numpy as np
    train = np.load("train_size.npz")
    train_data, train_label = train["train_data"], train["train_label"]
    print(f"Train data shape: {train_data.shape}, Train labels shape: {train_label.shape}")

Jupyter notebooks to train and test a neural network can be found here: https://github.com/aboulle/LPA-NN
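The shapes stated in the description (10501 columns of XRD data, 24 label columns) can be checked after loading. The sketch below is an assumption-laden illustration: `check_shapes` is a hypothetical helper, and it is exercised on a tiny synthetic archive written to a temporary directory rather than on the real files. Only the keys `train_data` and `train_label` come from the description; the key names inside the test archives are not stated there.

```python
import os
import tempfile
import numpy as np

N_POINTS = 10501   # XRD data points per row, as stated in the description
N_LABELS = 24      # label columns per row, as stated in the description

def check_shapes(path, data_key="train_data", label_key="train_label"):
    """Load an .npz archive and verify the documented 2D layout (hypothetical helper)."""
    with np.load(path) as archive:
        data, labels = archive[data_key], archive[label_key]
    if data.ndim != 2 or data.shape[1] != N_POINTS:
        raise ValueError(f"unexpected data shape {data.shape}")
    if labels.shape != (data.shape[0], N_LABELS):
        raise ValueError(f"unexpected label shape {labels.shape}")
    return data.shape, labels.shape

# Synthetic stand-in with 4 rows, so the check runs without the real download
with tempfile.TemporaryDirectory() as tmp:
    toy_path = os.path.join(tmp, "toy_train.npz")
    np.savez(toy_path,
             train_data=np.zeros((4, N_POINTS)),
             train_label=np.zeros((4, N_LABELS)))
    shapes = check_shapes(toy_path)
print(shapes)
```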
Data from: BEND: Bagging Deep Learning Training Based on Efficient Neural...
service.tib.eu
Updated Dec 16, 2024
Cite
(2024). BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion [Dataset]. https://service.tib.eu/ldmservice/dataset/bend--bagging-deep-learning-training-based-on-efficient-neural-network-diffusion
Dataset updated
Dec 16, 2024
Description
The paper proposes a Bagging Deep Learning Training Framework (BEND) based on efficient neural network diffusion.
Prediction of biological wastewater treatment performance using artificial...
esango.cput.ac.za
xlsx
Updated Jun 2, 2023
Cite
Winile Sindane (2023). Prediction of biological wastewater treatment performance using artificial neural networks [Dataset]. http://doi.org/10.25381/cput.22261720.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.25381/cput.22261720.v1
Dataset updated
Jun 2, 2023
Dataset provided by
Cape Peninsula University of Technology
Authors
Winile Sindane
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Ethical Clearance Reference Number: 2021FEBEREC-STD- 065

Pre-processing data from published and unpublished previous studies treating biodiesel-, textile-, polymer-, and pulp and paper wastewater using an ABR and EGSB for artificial neural network (ANN) model simulation and development.

For ANN problems to be solved, the selection of a suitable learning rate, momentum, the number of neurons from each of the hidden layers and the activation function is crucial. Therefore, the collected data must be prepared in a Microsoft Excel spreadsheet format with input and output columns. A training file is then created with samples of the whole problem domain to select the required parameters. Three data sets are used: a training data set, test data set and validation data set. When the training process takes place, the neural network will be tested against the testing data to determine accuracy, and training will be stopped when the mean average error remains the same for a period of time.
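The stopping rule described above (stop once the error stays the same for a period of time) can be sketched as a plateau check on the error history. The function name `should_stop`, the `patience` and `tolerance` parameters, and the error values are all illustrative assumptions; the description does not say how long the "period of time" is or what counts as "the same".

```python
def should_stop(errors, patience=5, tolerance=1e-4):
    """Return True when the last `patience` steps changed the error by at most `tolerance`.

    Hypothetical helper: compares the spread of the most recent patience + 1 values.
    """
    if len(errors) < patience + 1:
        return False
    recent = errors[-(patience + 1):]
    return max(recent) - min(recent) <= tolerance

# Made-up error curve that flattens out at 0.2
simulated_errors = [0.9, 0.5, 0.3, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]

history = []
stopped_at = None
for epoch, err in enumerate(simulated_errors):
    history.append(err)
    if should_stop(history, patience=3):
        stopped_at = epoch
        break
print("stopped at epoch", stopped_at)
```

A larger `patience` makes the rule less likely to stop on a temporary plateau, at the cost of extra training epochs.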
Data from: Evaluation of the preprocessing and training stages in text...
scielo.figshare.com
jpeg
Updated May 30, 2023
Cite
Lucas Marques Sathler Guimarães; Magali Rezende Gouvêa Meireles; Paulo Eduardo Maciel de Almeida (2023). Evaluation of the preprocessing and training stages in text classification algorithms in the context of information retrieval [Dataset]. http://doi.org/10.6084/m9.figshare.8162216.v1
Available download formats: jpeg
Unique identifier
https://doi.org/10.6084/m9.figshare.8162216.v1
Dataset updated
May 30, 2023
Dataset provided by
SciELO (http://www.scielo.org/)
Authors
Lucas Marques Sathler Guimarães; Magali Rezende Gouvêa Meireles; Paulo Eduardo Maciel de Almeida
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract The amount of unstructured data grows with the popularization of the Internet. Texts in natural language represent a relevant and significant set for the analysis and production of knowledge. This work proposes a quantitative analysis of the preprocessing and training stages of a text classifier, which uses as an attribute the feelings expressed by the users. Artificial Neural Network, as a classifier algorithm, and texts from Amazon, IMDB and Yelp sites were used for the experiments. The database allows the analysis of the expression of positive and negative feelings of the users in evaluations of products and services in unstructured texts. Two distinct processes of preprocessing and different training of the Artificial Neural Networks were carried out to classify the textual set. The results quantitatively confirm the importance of the preprocessing and training stages of the classifier, highlighting the importance of the vocabulary selected for the text representation and classification. The available classification techniques achieve satisfactory results. However, even by using two distinct processes of preprocessing and identifying the best training process, it was not possible to totally eliminate the learning difficulties and understanding of the model for the classifications of feelings that involved subjective characteristics of the expression of human feeling.
Training datasets for AIMNet2 machine-learned neural network potential
kilthub.cmu.edu
txt
Updated Jan 27, 2025
Cite
Roman Zubatiuk; Olexandr Isayev; Dylan Anstine (2025). Training datasets for AIMNet2 machine-learned neural network potential [Dataset]. http://doi.org/10.1184/R1/27629937.v2
Available download formats: txt
Unique identifier
https://doi.org/10.1184/R1/27629937.v2
Dataset updated
Jan 27, 2025
Dataset provided by
Carnegie Mellon University
Authors
Roman Zubatiuk; Olexandr Isayev; Dylan Anstine
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
The datasets contain molecular structures and the properties computed with B97-3c (GGA DFT) or wB97M-def2-TZVPP (range-separated hybrid DFT) methods. Each data file contains about 20M structures. DFT calculations were performed with ORCA 5.0.3 software. Properties include energy, forces, atomic charges, and molecular dipole and quadrupole moments.
Dataset: Handwritten Digits and Operators
kaggle.com
zip
Updated Jul 13, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Michel Heusser (2020). Dataset: Handwritten Digits and Operators [Dataset]. https://www.kaggle.com/datasets/michelheusser/handwritten-digits-and-operators/code
Available download formats: zip (214990104 bytes)
Dataset updated
Jul 13, 2020
Authors
Michel Heusser
Description
Context

I created this dataset as part of a larger, rather educational project, which aims to create a simple web-interface to write simple numerical mathematical operations on a drawing grid (either on a PC, tablet, or smartphone), which is then recognized and evaluated. It also aims, however, to contribute and help other projects that may involve or require large datasets of handwritten digits.

Dataset Characteristics

What makes this dataset different from existing ones, is that it lays emphasis on the actual strokes of handwritten signs, instead of just being a compilation of scanned images of existing records (the way, for example, the MNIST dataset was created). Each pixel is strictly full black, or full white (no greytones), and the strokes are rather thin. It is meant to have the information of the stroke itself, and not just a scanned image.

In the Context of Neural Networks and Deep Learning

The dataset is meant to stretch and challenge the understanding of a neural network about what makes a specific symbol mean what it does. For this, I initially created a starting dataset of the different ways people write symbols, playing with their internal proportions and adding "ill"-written signs that are still barely recognizable from the rest. After that, I artificially augmented the dataset by performing stretches and small rotations on all images, making sure that a human being would still recognize them. A neural network would then be forced to understand better what gives a symbol its characteristics. I made sure not to delete the many rather ambiguous cases (e.g. a clockwise-rotated '1' and '/'), so that the neural network has to deal with and understand nuances during training and classification.

Using 60% of the dataset (randomly selected) to train a neural network with two inner layers, and 20% to validate it, I achieved a +96% validation accuracy using my own implementation of the stochastic gradient descent and backpropagation algorithm.
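A random 60/20/20 partition like the one described can be sketched by shuffling indices once and slicing. This is not the author's code (that lives in the linked repository); the function name `split_indices`, the seed, and the sample count of 1000 are assumptions chosen for illustration.

```python
import numpy as np

def split_indices(n, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle 0..n-1 once and cut it into train / validation / test index arrays."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)
    n_train = int(round(train_frac * n))
    n_val = int(round(val_frac * n))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]      # whatever remains (about 20%)
    return train, val, test

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because the three slices come from a single permutation, no sample can land in more than one subset.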

Data Augmentation

The resizing, rotating, and scaling algorithms I wrote do not work with images by modifying each pixel, but rather the stroke. This means that each line of one pixel width is scaled/resized/rotated to a line (longer or shorter) of the same pixel width. This is advantageous for the following reasons:
- When I created the dataset, the original images containing many symbols had a pen-stroke that was very thin compared to the symbol's proportions. Using the stroke scaling methods mentioned, resizing to smaller images made the strokes thinner, which is desirable
- Transformations on the symbols that change their proportions would not increase stroke width or make the stroke disappear
- It is easy to create new datasets out of this one by easily thickening strokes, or adding noise and artificial irregularities

Tools for the Dataset Creation

In the following GitHub repository, one can access the code, as well as the modules I created to perform the following tasks:
- Import large images containing multiple symbols to be extracted
- Agglomerate each independent symbol, perform image processes on them (resizing, scaling, and rotating) and save them to individual images
- Create custom datasets out of the individual images for training, validation, and testing of machine learning models (e.g. neural networks)

https://github.com/michheusser/neural-network-training
- main_dataset_creation.py - Creation of datasets
- main_neural_network_training.py - Training of neural network
- datatools (Folder) - Package for dataset and image manipulation
- nntools (Folder) - Package with neural network tools (incl. training)

Content

CompleteImages - ca. 300'000 symbol images as .png containing transformation information in their name with syntax: [symbol]_[papersheet_index]_[rotation]_[index in untransformed dataset]_scaled_x[scaling in x]y[scaling in y].png (e.g. +_1_8ccw_26_scaled_x1_2y1_2.png)
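The file-name syntax above can be decoded with a regular expression. The sketch below is an illustration under stated assumptions: `parse_symbol_filename` is a hypothetical helper, the reading of `1_2` as the scaling factor 1.2 is inferred from the stretch factors listed under "Creation", and the rotation token (such as `8ccw`) is kept as a raw string because its full vocabulary is not documented here.

```python
import re

# [symbol]_[papersheet_index]_[rotation]_[index]_scaled_x[sx]y[sy].png
FILENAME_RE = re.compile(
    r"^(?P<symbol>.+?)_(?P<sheet>\d+)_(?P<rotation>[^_]+)_(?P<index>\d+)"
    r"_scaled_x(?P<sx>\d+(?:_\d+)?)y(?P<sy>\d+(?:_\d+)?)\.png$"
)

def parse_symbol_filename(name):
    """Split a CompleteImages file name into its documented parts (hypothetical helper)."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised file name: {name!r}")
    return {
        "symbol": match["symbol"],
        "sheet": int(match["sheet"]),
        "rotation": match["rotation"],                 # raw token, e.g. "8ccw"
        "index": int(match["index"]),
        # Assumption: "1_2" encodes the factor 1.2
        "scale_x": float(match["sx"].replace("_", ".")),
        "scale_y": float(match["sy"].replace("_", ".")),
    }

parsed = parse_symbol_filename("+_1_8ccw_26_scaled_x1_2y1_2.png")
print(parsed)
```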

CompleteDataSet_tuples.npy - List of tuples with all datapoints. (ca. 300'000 Datapoints)

CompleteDataSet_training_tuples.npy - Training dataset (60% of CompleteDataSet.npy randomly selected)

CompleteDataSet_validation_tuples.npy - Validation dataset (20% of CompleteDataSet.npy randomly selected)

CompleteDataSet_testing_tuples.npy - Testing dataset (20% of CompleteDataSet.npy randomly selected)

Creation

The datasets were created in the following way:
- I drew each symbol in ~500 ways on pieces of white paper using a thin pen, and scanned them to .pdf images.
- All images were run through my 'datatool' module for each symbol to be isolated and fit into a 28x28 greyscale .png with each pixel being either black or white
- Each image was transformed with all possible combinations of the following: rotation (-15°, -8°, 0°, 8°, 15°), stretching in each axis (1, 1.2, 1.3). Each transformed image was saved as a .png ...
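To make "all possible combinations" concrete, the listed rotations and per-axis stretch factors can be enumerated with a Cartesian product. This sketch assumes the x and y stretches are chosen independently, which is how "stretching in each axis" is read here; the constant names are hypothetical.

```python
from itertools import product

ROTATIONS_DEG = (-15, -8, 0, 8, 15)     # rotation angles listed in the description
STRETCH_FACTORS = (1.0, 1.2, 1.3)       # stretch factors, applied per axis

# One (rotation, stretch_x, stretch_y) tuple per transformed variant
combos = list(product(ROTATIONS_DEG, STRETCH_FACTORS, STRETCH_FACTORS))
print(len(combos))   # 5 rotations * 3 x-stretches * 3 y-stretches
```

Under that assumption, each drawn symbol yields 45 variants, one of which, (0, 1.0, 1.0), is the untransformed original.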
Data from: Using convolutional neural networks to efficiently extract...
datadryad.org
data.niaid.nih.gov
zip
Updated Jan 4, 2022
Cite
Rachel Reeb; Naeem Aziz; Samuel Lapp; Justin Kitzes; J. Mason Heberling; Sara Kuebbing (2022). Using convolutional neural networks to efficiently extract immense phenological data from community science images [Dataset]. http://doi.org/10.5061/dryad.mkkwh7123
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.mkkwh7123
Dataset updated
Jan 4, 2022
Dataset provided by
Dryad
Authors
Rachel Reeb; Naeem Aziz; Samuel Lapp; Justin Kitzes; J. Mason Heberling; Sara Kuebbing
Time period covered
Dec 15, 2021
Description
Community science image libraries offer a massive, but largely untapped, source of observational data for phenological research. The iNaturalist platform offers a particularly rich archive, containing more than 49 million verifiable, georeferenced, open access images, encompassing seven continents and over 278,000 species. A critical limitation preventing scientists from taking full advantage of this rich data source is labor. Each image must be manually inspected and categorized by phenophase, which is both time-intensive and costly. Consequently, researchers may only be able to use a subset of the total number of images available in the database. While iNaturalist has the potential to yield enough data for high-resolution and spatially extensive studies, it requires more efficient tools for phenological data extraction. A promising solution is automation of the image annotation process using deep learning. Recent innovations in deep learning have made these open-source tools accessibl...
Training Over-Parameterized Deep Neural Networks - Dataset - LDM
service.tib.eu
resodate.org
Updated Dec 16, 2024
Cite
(2024). Training Over-Parameterized Deep Neural Networks - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/training-over-parameterized-deep-neural-networks
Dataset updated
Dec 16, 2024
Description
The dataset used in this paper is a collection of training data for over-parameterized deep neural networks.
Dataset for neural network (water network)
kaggle.com
zip
Updated Apr 27, 2024
Cite
mohammad meysam mollai (2024). Dataset for neural network (water network) [Dataset]. https://www.kaggle.com/datasets/mohammadmeysammollai/dataset-for-neural-network-water-network
Available download formats: zip (24308 bytes)
Dataset updated
Apr 27, 2024
Authors
mohammad meysam mollai
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Dataset for neural network training (water network). This dataset was compiled from data available on the internet: a large amount of data was cleaned, extraneous items were removed, and the important items were kept.
Subset of Quick, Draw! dataset for neural network pre-training / Subconjunto...
resodate.org
portalcientifico.universidadeuropea.com
Updated Sep 27, 2024
Cite
Juan Guerrero Martín; Alba Gómez-Valadés Batanero; Estela Díaz López; Margarita Bachiller Mayoral; José Manuel Cuadra Troncoso; Rafael Martínez Tomás; Sara García Herranz; María del Carmen Díaz Mardomingo; Herminia Peraita; Mariano Rincón Zamorano (2024). Subset of Quick, Draw! dataset for neural network pre-training / Subconjunto del conjunto de datos Quick, Draw! para pre-entrenamiento de redes neuronales [Dataset]. http://doi.org/10.21950/GWO9RA
Unique identifier
https://doi.org/10.21950/GWO9RA
Dataset updated
Sep 27, 2024
Dataset provided by
Universidad Nacional de Educación a Distancia
Eciencia Data
Rey-Osterrieth Complex Figure (ROCF) Test Assessment
Authors
Juan Guerrero Martín; Alba Gómez-Valadés Batanero; Estela Díaz López; Margarita Bachiller Mayoral; José Manuel Cuadra Troncoso; Rafael Martínez Tomás; Sara García Herranz; María del Carmen Díaz Mardomingo; Herminia Peraita; Mariano Rincón Zamorano
Description
Description of the project: This dataset is the result of the research carried out in the project "A Benchmark for Rey-Osterrieth Complex Figure (ROCF) Test Automatic Scoring", whose main goal was to establish a baseline for the scoring task consisting of a dataset with 528 ROCF and the results obtained by several deep learning models as well as by a group of psychology experts.
Dataset for Vehicle Detection by Neural Network
kaggle.com
zip
Updated Dec 25, 2023
Cite
Niloy Kanti Paul (2023). Dataset for Vehicle Detection by Neural Network [Dataset]. https://www.kaggle.com/datasets/niloykantipaul/dataset-for-vehicle-detection-by-neural-network
Available download formats: zip (1222125894 bytes)
Dataset updated
Dec 25, 2023
Authors
Niloy Kanti Paul
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
About Dataset

This dataset was used by the SKYHAWK project with the team REDHAWK. To download the full dataset or to submit a request for new data collection, please send an email to: redhawk.team.info@gmail.com

A collection of 11,000+ images of multiple vehicle types, captured and crowdsourced from 100+ urban and rural areas. Each image is reviewed and verified by the researcher.

Versatile Training Data: Explore a Spectrum of Resolutions and Weather Conditions in Our Dataset for Comprehensive Vehicle Detection Model Training.

Dataset Features:

Dataset size: 11,000+ images
Location: Bangladesh
Diversity: Various lighting conditions like day and night, various weather conditions, varied distances, view points, etc.
Device used: Captured using mobile phones.
Usage: Vehicle detection, Traffic automation, Traffic surveillance, etc.

Vehicle Classes:

Bike

Auto

Car

Truck

Bus

Other Vehicles (Rickshaw, Van, Cycle, etc.)

Available Annotation formats: YOLO, PYTORCH
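Since YOLO is one of the listed annotation formats, the following minimal sketch shows how a single YOLO label line (class id followed by a normalized center x, center y, width, and height) could be converted to a pixel-space bounding box. The class order in `CLASS_NAMES` and the helper name `parse_yolo_line` are assumptions made for illustration; the dataset's actual class index mapping is not stated in the listing and should be checked against its label files.

```python
from typing import NamedTuple, Sequence

# ASSUMPTION: class ids follow the order in which the listing names the classes.
# The real id-to-name mapping must be taken from the dataset itself.
CLASS_NAMES = ["Bike", "Auto", "Car", "Truck", "Bus", "Other Vehicles"]


class Box(NamedTuple):
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_line(
    line: str,
    img_w: int,
    img_h: int,
    class_names: Sequence[str] = CLASS_NAMES,
) -> Box:
    """Convert one YOLO label line to a pixel-space box (hypothetical helper)."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    # YOLO stores the box center and size as fractions of the image size.
    xc, yc, w, h = (float(v) for v in parts[1:])
    return Box(
        label=class_names[class_id],
        x_min=(xc - w / 2) * img_w,
        y_min=(yc - h / 2) * img_h,
        x_max=(xc + w / 2) * img_w,
        y_max=(yc + h / 2) * img_h,
    )


if __name__ == "__main__":
    print(parse_yolo_line("2 0.5 0.5 0.25 0.5", img_w=640, img_h=480))
```

Keeping the class list as a parameter makes it easy to swap in the real mapping once it is known.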
Data repository for "Minimial-Risk Training Samples for QNN Training from...
darus.uni-stuttgart.de
Updated Oct 8, 2024
Cite
Alexander Mandl; Marvin Bechtold; Johanna Barzen; Frank Leymann (2024). Data repository for "Minimial-Risk Training Samples for QNN Training from Measurements" [Dataset]. http://doi.org/10.18419/DARUS-4113
Available metadata format: Croissant, a format for machine-learning datasets (see mlcommons.org/croissant).
Unique identifier
https://doi.org/10.18419/DARUS-4113
Dataset updated
Oct 8, 2024
Dataset provided by
DaRUS
Authors
Alexander Mandl; Marvin Bechtold; Johanna Barzen; Frank Leymann
License
https://darus.uni-stuttgart.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.18419/DARUS-4113
Dataset funded by
BMWK
Description
Replication code and experiment result data for training Quantum Neural Networks with entangled data using one-dimensional projectors as observables. This is the version of the code that was used to generate the experiment results in the related publication.

Experiments:
- exp_inf_coeffvariation.py: Trains QNNs using training samples of varying Schmidt rank with fixed vector as Schmidt basis state. Varies the associated Schmidt coefficient.
- exp_inf_random.py: Trains QNNs using random training data.

Experiment results:
- exp_inf_coeffvariation.zip and exp_inf_random.zip contain the raw experiment results for both experiments.
- For each combination of controlled variables there is one directory containing the result of all 20 runs of the training process.
- The results for each run are comprised of 3 files:
  - [id]_losses.npy: The loss during the training process
  - [id]_params.npy: The parameters of the QNN after the training process.
  - [id]_V.npy: The trained QNN exported as a 2^4 * 2^4 unitary matrix.

Analysis of data (data_extraction.py):
- Computes means and standard deviation of various risk measures and saves the results

Plots (plot_obs_risk.py):
- Plots the risk w.r.t. the observable for both experiments based on the analysed data obtained from data_extraction.py.
- Generates plot_coeffvariation.pdf and plot_random.pdf.
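The per-run file layout described above could be read back with NumPy roughly as follows. This is a sketch under stated assumptions, not the repository's own `data_extraction.py`: the helper names (`load_run`, `is_unitary`, `summarize_final_losses`, `write_synthetic_run`), the use of integer run ids, and the choice of "final loss" as the summarized quantity are all hypothetical. The synthetic files exist only so the example runs without the real archives.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_run(directory, run_id):
    """Load the three result files of one run (file naming as in the description)."""
    d = Path(directory)
    return {
        "losses": np.load(d / f"{run_id}_losses.npy"),
        "params": np.load(d / f"{run_id}_params.npy"),
        "V": np.load(d / f"{run_id}_V.npy"),
    }


def is_unitary(matrix, atol=1e-8):
    """Check V^dagger V = I for a square matrix."""
    n = matrix.shape[0]
    if matrix.shape != (n, n):
        return False
    return bool(np.allclose(matrix.conj().T @ matrix, np.eye(n), atol=atol))


def summarize_final_losses(directory, run_ids):
    """Mean and standard deviation of the last recorded loss across runs."""
    finals = np.array([load_run(directory, r)["losses"][-1] for r in run_ids])
    return float(finals.mean()), float(finals.std())


def write_synthetic_run(directory, run_id, rng):
    """Write placeholder files with the described shapes (ASSUMPTION: demo data only)."""
    d = Path(directory)
    # QR decomposition of a complex Gaussian matrix yields a unitary Q (16 = 2^4).
    z = rng.normal(size=(16, 16)) + 1j * rng.normal(size=(16, 16))
    q, _ = np.linalg.qr(z)
    np.save(d / f"{run_id}_V.npy", q)
    np.save(d / f"{run_id}_losses.npy", np.linspace(1.0, 0.1, 50))
    np.save(d / f"{run_id}_params.npy", rng.normal(size=8))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        rng = np.random.default_rng(0)
        for run_id in range(20):  # the description mentions 20 runs per directory
            write_synthetic_run(tmp, run_id, rng)
        print("unitary:", is_unitary(load_run(tmp, 0)["V"]))
        print("final loss mean/std:", summarize_final_losses(tmp, range(20)))
```

The unitarity check is a cheap sanity test that an exported `[id]_V.npy` really is a 16 x 16 unitary before it is used in any risk computation.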
Data from: Self-Supervised Representation Learning on Neural Network Weights...
data.niaid.nih.gov
Updated Nov 13, 2021
Cite
Schürholt, Kontantin; Kostadinov, Dimche; Borth, Damian (2021). Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic Prediction - Datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5645137
Dataset updated
Nov 13, 2021
Dataset provided by
University of St.Gallen
Authors
Schürholt, Kontantin; Kostadinov, Dimche; Borth, Damian
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Datasets for the NeurIPS 2021 accepted paper "Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic Prediction".

Datasets are PyTorch files containing a dictionary with training, validation and test sets. Train, validation and test sets are custom dataset classes which inherit from the standard torch dataset class. Corresponding code can be found at https://github.com/HSG-AIML/NeurIPS_2021-Weight_Space_Learning.

Datasets 41, 42, 43 and 44 are our dataset format wrapped around the zoos from Unterthiner et al, 2020 (https://github.com/google-research/google-research/tree/master/dnn_predict_accuracy)

Abstract: Self-Supervised Learning (SSL) has been shown to learn useful and information-preserving representations. Neural Networks (NNs) are widely applied, yet their weight space is still not fully understood. Therefore, we propose to use SSL to learn neural representations of the weights of populations of NNs. To that end, we introduce domain specific data augmentations and an adapted attention architecture. Our empirical evaluation demonstrates that self-supervised representation learning in this domain is able to recover diverse NN model characteristics. Further, we show that the proposed learned representations outperform prior work for predicting hyper-parameters, test accuracy, and generalization gap as well as transfer to out-of-distribution settings.

