100+ datasets found

Neural network error metrics for training and testing data sets. The neural...
plos.figshare.com
xls
Updated Feb 10, 2025
Cite
Xiaowen Chen; Anne E. Martin (2025). Neural network error metrics for training and testing data sets. The neural network performs similarly between training and testing trial sets and performs slightly better for training subjects compared to testing subjects. [Dataset]. http://doi.org/10.1371/journal.pone.0315186.t002
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0315186.t002
Dataset updated
Feb 10, 2025
Dataset provided by
PLOS ONE
Authors
Xiaowen Chen; Anne E. Martin
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Neural network error metrics for training and testing data sets. The neural network performs similarly between training and testing trial sets and performs slightly better for training subjects compared to testing subjects.
Challenge Round 0 (Dry Run) Test Dataset
catalog.data.gov
data.nist.gov
+2more
Updated Jul 29, 2022
Cite
National Institute of Standards and Technology (2022). Challenge Round 0 (Dry Run) Test Dataset [Dataset]. https://catalog.data.gov/dataset/challenge-round-0-dry-run-test-dataset-ff885
Explore at:
Dataset updated
Jul 29, 2022
Dataset provided by
National Institute of Standards and Technology (http://www.nist.gov/)
Description
This dataset was an initial test harness infrastructure test for the TrojAI program. It should not be used for research. Please use the more refined datasets generated for the other rounds. The data being generated and disseminated is training, validation, and test data used to construct trojan detection software solutions. This data, generated at NIST, consists of human level AIs trained to perform a variety of tasks (image classification, natural language processing, etc.). A known percentage of these trained AI models have been poisoned with a known trigger which induces incorrect behavior. This data will be used to develop software solutions for detecting which trained AI models have been poisoned via embedded triggers. This dataset consists of 200 trained, human level, image classification AI models using the following architectures (Inception-v3, DenseNet-121, and ResNet50). The models were trained on synthetically created image data of non-real traffic signs superimposed on road background scenes. Half (50%) of the models have been poisoned with an embedded trigger which causes misclassification of the images when the trigger is present.
U-T training data and test data for Sigsbee2A model
data.niaid.nih.gov
zenodo.org
Updated May 25, 2023
Cite
Guoxin Chen (2023). U-T training data and test data for Sigsbee2A model [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7967049
Explore at:
Dataset updated
May 25, 2023
Dataset authored and provided by
Guoxin Chen
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Here are the training and testing data sets involved in the numerical experiments in the article that has been submitted to the journal “Journal of Geophysical Research: Solid Earth”, named “Joint Model and Data-Driven Simultaneous Inversion of Velocity and Density”: SigsbeeA model. Each dataset consists of two parts: a training dataset and a testing dataset. Both training and testing data sets contain three parts: seismic data, velocity model and density model.
Dataset for Training and Testing Data-driven Security Assessment of the IEEE...
ieee-dataport.org
Updated Oct 24, 2024
Cite
Juan Cuenca Silva (2024). Dataset for Training and Testing Data-driven Security Assessment of the IEEE ELVTN [Dataset]. https://ieee-dataport.org/documents/dataset-training-and-testing-data-driven-security-assessment-ieee-elvtn
Explore at:
Dataset updated
Oct 24, 2024
Authors
Juan Cuenca Silva
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Croatia
Predictive modeling of treatment resistant depression using data from STAR*D...
plos.figshare.com
docx
Updated Jun 1, 2023
Cite
Zhi Nie; Srinivasan Vairavan; Vaibhav A. Narayan; Jieping Ye; Qingqin S. Li (2023). Predictive modeling of treatment resistant depression using data from STAR*D and an independent clinical study [Dataset]. http://doi.org/10.1371/journal.pone.0197268
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.1371/journal.pone.0197268
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Zhi Nie; Srinivasan Vairavan; Vaibhav A. Narayan; Jieping Ye; Qingqin S. Li
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Identification of risk factors of treatment resistance may be useful to guide treatment selection, avoid inefficient trial-and-error, and improve major depressive disorder (MDD) care. We extended the work in predictive modeling of treatment resistant depression (TRD) via partition of the data from the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) cohort into a training and a testing dataset. We also included data from a small yet completely independent cohort RIS-INT-93 as an external test dataset. We used features from enrollment and level 1 treatment (up to week 2 response only) of STAR*D to explore the feature space comprehensively and applied machine learning methods to model TRD outcome at level 2. For TRD defined using QIDS-C16 remission criteria, multiple machine learning models were internally cross-validated in the STAR*D training dataset and externally validated in both the STAR*D testing dataset and RIS-INT-93 independent dataset with an area under the receiver operating characteristic curve (AUC) of 0.70–0.78 and 0.72–0.77, respectively. The upper bound for the AUC achievable with the full set of features could be as high as 0.78 in the STAR*D testing dataset. Model developed using top 30 features identified using feature selection technique (k-means clustering followed by χ2 test) achieved an AUC of 0.77 in the STAR*D testing dataset. In addition, the model developed using overlapping features between STAR*D and RIS-INT-93, achieved an AUC of > 0.70 in both the STAR*D testing and RIS-INT-93 datasets. Among all the features explored in STAR*D and RIS-INT-93 datasets, the most important feature was early or initial treatment response or symptom severity at week 2. These results indicate that prediction of TRD prior to undergoing a second round of antidepressant treatment could be feasible even in the absence of biomarker data.
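The AUC figures quoted in this description can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The following is a minimal sketch of that computation in plain Python. The labels and scores are made-up toy values, not data from STAR*D or RIS-INT-93, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = treatment resistant, 0 = not resistant.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.4, 0.6, 0.7, 0.8, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a rank-based computation would be the usual choice for larger data.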
Training data and test data sets for simultaneous inversion of velocity...
zenodo.org
data.niaid.nih.gov
zip
Updated May 25, 2023
Cite
Chen Guoxin (2023). Training data and test data sets for simultaneous inversion of velocity density based on U-T [Dataset]. http://doi.org/10.5281/zenodo.7965402
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.7965402
Dataset updated
May 25, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Chen Guoxin
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Here are the training and testing data sets involved in the numerical experiments in the article that has been submitted to the journal “Journal of Geophysical Research: Solid Earth”, named “Joint Model and Data-Driven Simultaneous Inversion of Velocity and Density”: Marmousi model. Each dataset consists of two parts: a training dataset and a testing dataset. Both training and testing data sets contain three parts: seismic data, velocity model and density model.
Data and code for training and testing a ResMLP model with experience replay...
zenodo.org
zip
Updated Feb 20, 2025
Cite
Jianda Chen; Minghua Zhang; Wuyin Lin; Tao Zhang; Wei Xue (2025). Data and code for training and testing a ResMLP model with experience replay for machine-learning physics parameterization [Dataset]. http://doi.org/10.5281/zenodo.13690812
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.13690812
Dataset updated
Feb 20, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Jianda Chen; Minghua Zhang; Wuyin Lin; Tao Zhang; Wei Xue
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This directory contains the training data and code for training and testing a ResMLP with experience replay for creating a machine-learning physics parameterization for the Community Atmospheric Model.

The directory is structured as follows:

1. Download training and testing data: https://portal.nersc.gov/archive/home/z/zhangtao/www/hybird_GCM_ML

2. Unzip nncam_training.zip

nncam_training

- models

model definition of ResMLP and other models for comparison purposes

- dataloader

utility scripts to load data into pytorch dataset

- training_scripts

scripts to train ResMLP model with/without experience replay

- offline_test

scripts to perform offline test (Table 2, Figure 2)

3. Unzip nncam_coupling.zip

nncam_srcmods

- SourceMods

SourceMods to be used with CAM modules for coupling with neural network

- otherfiles

additional configuration files to setup and run SPCAM with neural network

- pythonfiles

python scripts to run neural network and couple with CAM

- ClimAnalysis

- paper_plots.ipynb

scripts to produce online evaluation figures (Figure 1, Figure 3-10)
The Test-Case Dataset
kaggle.com
Updated Nov 29, 2020
Cite
sapal6 (2020). The Test-Case Dataset [Dataset]. https://www.kaggle.com/datasets/sapal6/the-testcase-dataset/code
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Nov 29, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
sapal6
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

There are lots of datasets available for different machine learning tasks like NLP, computer vision, etc. However, I couldn't find any dataset that catered to the domain of software testing. This is one area which has lots of potential for the application of machine learning techniques, especially deep learning.

This was the reason I wanted such a dataset to exist. So, I made one.

Content

New version [28th Nov'20]- Uploaded testing related questions and related details from stack-overflow. These are query results which were collected from stack-overflow by using stack-overflow's query viewer. The result set of this query contained posts which had the words "testing web pages".

New version[27th Nov'20] - Created a csv file containing pairs of test case titles and test case description.

This dataset is very tiny (approximately 200 rows of data). I have collected sample test cases from around the web and created a text file which contains all the test cases that I have collected. This text file has sections and under each section there are numbered rows of test cases.

Acknowledgements

I would like to thank websites like guru99.com, softwaretestinghelp.com and many other such websites which host a great many sample test cases. These were the source for the test cases in this dataset.

Inspiration

My inspiration to create this dataset was the scarcity of examples showcasing the implementation of machine learning in the domain of software testing. I would like to see if this dataset can be used to answer questions similar to the following:

* Finding semantic similarity between different test cases ranging across products and applications.
* Automating the elimination of duplicate test cases in a test case repository.
* Can a recommendation system be built for suggesting domain-specific test cases to software testers?
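As a rough illustration of the duplicate-elimination question, near-duplicate test case titles can be flagged with a simple token-overlap score. This sketch uses Jaccard similarity on lowercased words, which is a lexical measure and not the semantic similarity the author has in mind. The example titles and the 0.8 threshold are assumptions and are not taken from the dataset.

```python
def jaccard(a, b):
    """Jaccard similarity between the word sets of two strings."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Hypothetical test case titles.
cases = [
    "Verify login with valid username and password",
    "Verify login with valid password and username",
    "Check that the cart total updates after removing an item",
]

THRESHOLD = 0.8  # assumed cut-off for calling two titles near-duplicates

# Collect index pairs whose similarity reaches the threshold.
pairs = [
    (i, j)
    for i in range(len(cases))
    for j in range(i + 1, len(cases))
    if jaccard(cases[i], cases[j]) >= THRESHOLD
]
print(pairs)
```

A sentence-embedding model would be the more natural tool for true semantic similarity; the token-based score is only meant to show the shape of the comparison.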
TREC 2022 Deep Learning test collection
catalog.data.gov
s.cnmilf.com
+1more
Updated May 9, 2023
Cite
National Institute of Standards and Technology (2023). TREC 2022 Deep Learning test collection [Dataset]. https://catalog.data.gov/dataset/trec-2022-deep-learning-test-collection
Explore at:
Dataset updated
May 9, 2023
Dataset provided by
National Institute of Standards and Technology (http://www.nist.gov/)
Description
This is a test collection for passage and document retrieval, produced in the TREC 2023 Deep Learning track. The Deep Learning Track studies information retrieval in a large training data regime. This is the case where the number of training queries with at least one positive label is at least in the tens of thousands, if not hundreds of thousands or more. This corresponds to real-world scenarios such as training based on click logs and training based on labels from shallow pools (such as the pooling in the TREC Million Query Track or the evaluation of search engines based on early precision).

Certain machine learning based methods, such as methods based on deep learning are known to require very large datasets for training. Lack of such large scale datasets has been a limitation for developing such methods for common information retrieval tasks, such as document ranking. The Deep Learning Track organized in the previous years aimed at providing large scale datasets to TREC, and create a focused research effort with a rigorous blind evaluation of ranker for the passage ranking and document ranking tasks.

Similar to the previous years, one of the main goals of the track in 2022 is to study what methods work best when a large amount of training data is available. For example, do the same methods that work on small data also work on large data? How much do methods improve when given more training data? What external data and models can be brought in to bear in this scenario, and how useful is it to combine full supervision with other forms of supervision?

The collection contains 12 million web pages, 138 million passages from those web pages, search queries, and relevance judgments for the queries.
Data for training, validation and testing of methods in the thesis:...
data.niaid.nih.gov
zenodo.org
Updated May 1, 2021
Cite
Lucia Hajduková (2021). Data for training, validation and testing of methods in the thesis: Camera-based Accuracy Improvement of Indoor Localization [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4730337
Explore at:
Dataset updated
May 1, 2021
Dataset authored and provided by
Lucia Hajduková
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The package contains files for two modules designed to improve the accuracy of the indoor positioning system, namely the following:

door detection

videos_test - videos used to demonstrate the application of door detector

videos_res - videos from videos_test directory with detected doors marked

parts detection

frames_train_val - images generated from videos used for training and validation of VGG16 neural network model

frames_test - images generated from videos used for testing of the trained model

videos_test - videos used to demonstrate the application of parts detector

videos_res - videos from videos_test directory with detected parts marked
Dataset, splits, models, and scripts for the QM descriptors prediction
zenodo.org
explore.openaire.eu
application/gzip
Updated Apr 4, 2024
Cite
Shih-Cheng Li; Haoyang Wu; Angiras Menon; Kevin A. Spiekermann; Yi-Pei Li; William H. Green (2024). Dataset, splits, models, and scripts for the QM descriptors prediction [Dataset]. http://doi.org/10.5281/zenodo.10668491
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.10668491
Dataset updated
Apr 4, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Shih-Cheng Li; Haoyang Wu; Angiras Menon; Kevin A. Spiekermann; Yi-Pei Li; William H. Green
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset, splits, models, and scripts from the manuscript "When Do Quantum Mechanical Descriptors Help Graph Neural Networks Predict Chemical Properties?" are provided. The curated dataset includes 37 QM descriptors for 64,921 unique molecules across six levels of theory: wB97XD, B3LYP, M06-2X, PBE0, TPSS, and BP86. This dataset is stored in the data.tar.gz file, which also contains a file for multitask constraints applied to various atomic and bond properties. The data splits (training, validation, and test splits) for both random and scaffold-based divisions are saved as separate index files in splits.tar.gz. The trained D-MPNN models for predicting QM descriptors are saved in the models.tar.gz file. The scripts.tar.gz file contains ready-to-use scripts for training machine learning models to predict QM descriptors, as well as scripts for predicting QM descriptors using our trained models on unseen molecules and for applying radial basis function (RBF) expansion to QM atom and bond features.

Below are descriptions of the available scripts:

atom_bond_descriptors.sh: Trains atom/bond targets.

atom_bond_descriptors_predict.sh: Predicts atom/bond targets from pre-trained model.

dipole_quadrupole_moments.sh: Trains dipole and quadrupole moments.

dipole_quadrupole_moments_predict.sh: Predicts dipole and quadrupole moments from pre-trained model.

energy_gaps_IP_EA.sh: Trains energy gaps, ionization potential (IP), and electron affinity (EA).

energy_gaps_IP_EA_predict.sh: Predicts energy gaps, IP, and EA from pre-trained model.

get_constraints.py: Generates constraints file for testing dataset. This generated file needs to be provided before using our trained models to predict the atom/bond QM descriptors of your testing data.

csv2pkl.py: Converts QM atom and bond features to .pkl files using RBF expansion for use with Chemprop software.

Below is the procedure for running the ml-QM-GNN on your own dataset:

Use get_constraints.py to generate a constraint file required for predicting atom/bond QM descriptors with the trained ML models.

Execute atom_bond_descriptors_predict.sh to predict atom and bond properties. Run dipole_quadrupole_moments_predict.sh and energy_gaps_IP_EA_predict.sh to calculate molecular QM descriptors.

Utilize csv2pkl.py to convert the data from predicted atom/bond descriptors .csv file into separate atom and bond feature files (which are saved as .pkl files here).

Run Chemprop to train your models using the additional predicted features supported here.
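To give a feel for the radial basis function (RBF) expansion mentioned for csv2pkl.py, the sketch below expands one scalar descriptor into a vector of Gaussian features. It is a generic illustration only: the function name `rbf_expand`, the evenly spaced centers, and the `gamma` width are assumptions and do not reflect the parameters or code used in the authors' scripts.

```python
import math


def rbf_expand(value, low, high, n_centers, gamma):
    """Expand a scalar into n_centers Gaussian radial basis features.

    Centers are spaced evenly on [low, high]; each feature is
    exp(-gamma * (value - center)^2).
    """
    if n_centers < 2:
        raise ValueError("n_centers must be at least 2")
    step = (high - low) / (n_centers - 1)
    centers = [low + i * step for i in range(n_centers)]
    return [math.exp(-gamma * (value - c) ** 2) for c in centers]


# Hypothetical descriptor value (e.g. a partial charge scaled to [0, 1]).
features = rbf_expand(0.5, low=0.0, high=1.0, n_centers=5, gamma=10.0)
print([round(f, 4) for f in features])
```

The expansion turns a single number into a smooth, localized encoding, which is a common way to feed continuous atom or bond properties into a neural network.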
deepvl-training-data
huggingface.co
Updated Apr 27, 2025
Cite
NTNU Autonomous Robots Lab (2025). deepvl-training-data [Dataset]. https://huggingface.co/datasets/ntnu-arl/deepvl-training-data
Explore at:
Dataset updated
Apr 27, 2025
Dataset authored and provided by
NTNU Autonomous Robots Lab
License
https://choosealicense.com/licenses/bsd-3-clause/
Description
DeepVL training dataset

Introduction

This dataset repository contains the training and testing datasets used in the paper: "DeepVL: Dynamics and Inertial Measurements-based Deep Velocity Learning for Underwater Odometry". The dataset was collected by manually piloting an underwater robot in a pool and in the Trondheim fjord.

Dataset details

The training data is located in the train_full directory and the test data in the test directory. The training… See the full description on the dataset page: https://huggingface.co/datasets/ntnu-arl/deepvl-training-data.
Data from: Leveraging Supervised Machine Learning Algorithms for System...
acs.figshare.com
zip
Updated Sep 3, 2024
Cite
Russell R. Kibbe; Alexandria L. Sohn; David C. Muddiman (2024). Leveraging Supervised Machine Learning Algorithms for System Suitability Testing of Mass Spectrometry Imaging Platforms [Dataset]. http://doi.org/10.1021/acs.jproteome.4c00360.s001
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.1021/acs.jproteome.4c00360.s001
Dataset updated
Sep 3, 2024
Dataset provided by
ACS Publications
Authors
Russell R. Kibbe; Alexandria L. Sohn; David C. Muddiman
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Quality control and system suitability testing are vital protocols implemented to ensure the repeatability and reproducibility of data in mass spectrometry investigations. However, mass spectrometry imaging (MSI) analyses present added complexity since both chemical and spatial information are measured. Herein, we employ various machine learning algorithms and a novel quality control mixture to classify the working conditions of an MSI platform. Each algorithm was evaluated in terms of its performance on unseen data, validated with negative control data sets to rule out confounding variables or chance agreement, and utilized to determine the necessary sample size to achieve a high level of accurate classifications. In this work, a robust machine learning workflow was established where models could accurately classify the instrument condition as clean or compromised based on data metrics extracted from the analyzed quality control sample. This work highlights the power of machine learning to recognize complex patterns in MSI data and use those relationships to perform a system suitability test for MSI platforms.
Dataset of article: Synthetic Datasets Generator for Testing Information...
ieee-dataport.org
Updated Mar 13, 2020
+ more versions
Cite
Carlos Santos (2020). Dataset of article: Synthetic Datasets Generator for Testing Information Visualization and Machine Learning Techniques and Tools [Dataset]. https://ieee-dataport.org/open-access/dataset-article-synthetic-datasets-generator-testing-information-visualization-and
Explore at:
Dataset updated
Mar 13, 2020
Authors
Carlos Santos
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset used in the article entitled 'Synthetic Datasets Generator for Testing Information Visualization and Machine Learning Techniques and Tools'. These datasets can be used to test several characteristics in machine learning and data processing algorithms.
U-T training and test data for Saltblock model
zenodo.org
data.niaid.nih.gov
bin
Updated May 25, 2023
Cite
Guoxin Chen (2023). U-T training and test data for Saltblock model [Dataset]. http://doi.org/10.5281/zenodo.7968683
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.7968683
Dataset updated
May 25, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Guoxin Chen
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Here are the training and testing data sets involved in the numerical experiments in the article that has been submitted to the journal “Journal of Geophysical Research: Solid Earth”, named “Joint Model and Data-Driven Simultaneous Inversion of Velocity and Density”: Saltblock model. Each dataset consists of two parts: a training dataset and a testing dataset. Both training and testing data sets contain three parts: seismic data, velocity model and density model.
Training and testing dataset for Machine Learning models from experimental...
zenodo.org
explore.openaire.eu
Updated Sep 27, 2024
+ more versions
Cite
Rowida Meligy; Alaric Montenon (2024). Training and testing dataset for Machine Learning models from experimental data from a Linear Fresnel Reflector in Cyprus [Dataset]. http://doi.org/10.5281/zenodo.11235652
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.11235652
Dataset updated
Sep 27, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Rowida Meligy; Alaric Montenon
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
May 3, 2018 - Sep 23, 2019
Description
**Experimental data set**

The ML_dataset.csv file contains the experimental data for 50+ days of operation. This is a cleaned version of 10.5281/zenodo.11195748. Only full days of experiments have been kept, and the tracking mode state has been added, indicating whether the primary field was tracking or not. When reflectometry measurements were done, the tracking was stopped. These are the two changes compared to 10.5281/zenodo.11195748.

**Machine learning**

The X_train.csv, Y_train.csv, X_test.csv and Y_test.csv files are used to train and test the models. The X files contain DNI, mass flow, inlet temperature, IAM and humidity. The Y files contain the output powers. Data are normalised between 0 and 1, and zero values have been removed.
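Normalising a variable to the range 0 to 1 is commonly done with min-max scaling. The sketch below shows that transformation on one column, assuming this is the kind of scaling meant here; the description does not state the exact method, and the DNI values are invented for illustration.

```python
def min_max_normalise(values):
    """Scale a list of numbers linearly so min maps to 0 and max maps to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalise a constant column")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical DNI readings in W/m^2 (not taken from ML_dataset.csv).
dni = [250.0, 500.0, 750.0, 1000.0]
scaled = min_max_normalise(dni)
print(scaled)
```

In practice the minimum and maximum would be taken from the training split only and then reused for the test split, so that no information from the test data leaks into the scaling.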
PEC dataset
kaggle.com
Updated Apr 20, 2023
Cite
rusuanjun (2023). PEC dataset [Dataset]. https://www.kaggle.com/datasets/rusuanjun/pec-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Apr 20, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
rusuanjun
Description
This dataset was collected using a custom-designed pulsed eddy current (PEC) instrument. Aluminium and S355 mild steel are tested. Data are in 1D time series format. The dataset is summarised below. For more detail, please refer to the paper "Real-time Automatic Metal Thickness Recognition Using Pulse Eddy Current with Deep Learning". The source code for training on the PEC dataset is also available on GitHub: https://github.com/rusuanjun007/PEC-Thickness-Recognition

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F5741725%2F50f099f9e8abcec9c1f0d9d42e79efca%2F2022-12-08%20004552.png?generation=1670460415470318&alt=media

The overall thickness is determined by the number of plates, with the aluminium thickness ranging from a minimum of 20 mm to a maximum of 60 mm, in increments of 5 mm, and the steel thickness ranging from 5 mm to 20 mm, also in increments of 5 mm. Furthermore, our PEC dataset takes into consideration both lift-off and edge effects. The details of the PEC dataset are summarized in Table IV, where 50 data points are repeatedly tested for each thickness, lift-off, and position.

Various testing conditions can significantly impact PEC measurements, including lift-off, edge effects, insulation, and weather jacket. As a result, even when the metal thickness remains constant, the measured waveform can exhibit substantial variation. A comparison is shown below.

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F5741725%2F5d8747cb0c53de1c1f6eacb9dfd4ec4d%2Fcompare_for_paper.png?generation=1682003150777828&alt=media
Training and Testing Datasets for Machine Learning of Shortwave Radiative...
data.niaid.nih.gov
Updated Mar 28, 2025
Cite
Schneiderman, Henry (2025). Training and Testing Datasets for Machine Learning of Shortwave Radiative Transfer [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_15089912
Explore at:
Dataset updated
Mar 28, 2025
Dataset authored and provided by
Schneiderman, Henry
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Datasets for Machine Learning Shortwave Radiative Transfer

Author - Henry Schneiderman, henry@pittdata.com

Please contact me for any questions or feedback

Input reanalysis data downloaded from ECMWF's Copernicus Atmospheric Monitoring Service. Each atmospheric column contains the following input variables:

mu - Cosine of solar zenith angle

albedo - Surface albedo

is_valid_zenith_angle - Indicates if daylight is present

Vertical profiles (60 layers): Temperature, Pressure, Change in Pressure, H2O (vapor, liquid, solid), O3, CO2, O2, N2O, CH4

The ecRad emulator (Hogan and Bozzo, 2018) generated the following output profiles at the layer interfaces for each input atmospheric column:

flux_down_direct, flux_down_diffuse, flux_down_direct_clear_sky, flux_down_diffuse_clear_sky, flux_up_diffuse, flux_up_clear_sky

All data is sampled at 5,120 global locations

The training dataset uses input from 2008 sampled at three-hour intervals within every fourth day

The validation dataset uses input from 2008 sampled at three-hour intervals within every 28th day offset two days from the training set to avoid duplication

Testing datasets use input from 2009, 2015, and 2020. Each of these samples data at three-hour intervals within every 28th day.

For more information see: Henry Schneiderman. "An Open Box Physics-Based Neural Network for Shortwave Radiative Transfer." Submitted to Artificial Intelligence for the Earth Systems.
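The sampling scheme described above (every fourth day for training, every 28th day offset by two days for validation, three-hour intervals within each day) can be sketched as follows. The assumption that training sampling starts on 1 January is mine and is not stated in the description, and `sampled_days` is a hypothetical helper.

```python
from datetime import date, timedelta


def sampled_days(year, step, offset=0):
    """Return every `step`-th day of `year`, starting `offset` days after 1 January."""
    start = date(year, 1, 1)
    n_days = (date(year + 1, 1, 1) - start).days
    return [start + timedelta(days=d) for d in range(offset, n_days, step)]


# Assumed start day: 1 January 2008 for training, two days later for validation.
train_days = sampled_days(2008, step=4)
val_days = sampled_days(2008, step=28, offset=2)

# Three-hour intervals within each sampled day.
hours = list(range(0, 24, 3))

# Because 28 is a multiple of 4 and the offset is 2, the two sets never share a day.
overlap = set(train_days) & set(val_days)
print(len(train_days), len(val_days), len(hours), len(overlap))
```

Under these assumptions the two-day offset is what keeps the validation days from coinciding with training days, which matches the stated goal of avoiding duplication.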
CARLA Simulation Datasets for Training, Validation, and Test Data of the...
data.niaid.nih.gov
Updated Jan 15, 2024
Cite
Shaikh, Hamdaan Asif (2024). CARLA Simulation Datasets for Training, Validation, and Test Data of the project "Out-Of-Domain Data Detection using Uncertainty Quantification in End-to-End Driving Algorithms" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10511420
Explore at:
Dataset updated
Jan 15, 2024
Dataset authored and provided by
Shaikh, Hamdaan Asif
Description
These are CARLA Simulation Datasets of the project "Out-Of-Domain Data Detection using Uncertainty Quantification in End-to-End Driving Algorithms". The simulations are generated in CARLA Town 02 for different sun angles (in degrees). You will find image frames, command labels, and steering control values in the respective 'xxxx_files_data' folder. You will find videos of each simulation run in the 'xxxx_files_visualizations' folder.

The 8 simulation runs for Training Data, are with the Sun Angles : 90, 80, 70, 60, 50, 40, 30, 20

The 8 simulation runs for Training Data were seeded at 0000, 1000, 2000, 3000, 4000, 5000, 6000, 7000 respectively

The 4 simulation runs for Validation Data, are with the Sun Angles : 87, 67, 47, 23

The 4 simulation runs for Validation Data were seeded at 0000, 2000, 4000, 7000 respectively

The 29 simulation runs for Testing Data, are with the Sun Angles : 85, 75, 65, 55, 45, 35, 25, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 09, 08, 07, 06, 05, 04, 03, 02, 01, 00, -1, -10

The 29 simulation runs for Testing Data were all seeded at 5000
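A quick consistency check on the sun angles listed above: the sketch below confirms the run counts, verifies that no angle is shared between the training, validation, and testing splits, and counts how many test angles fall below the lowest training angle. Reading those low angles as the out-of-domain cases is an assumption based on the project title, not something the description states.

```python
# Sun angles in degrees, copied from the description above.
train_angles = [90, 80, 70, 60, 50, 40, 30, 20]
val_angles = [87, 67, 47, 23]
test_angles = [
    85, 75, 65, 55, 45, 35, 25,
    19, 18, 17, 16, 15, 14, 13, 12, 11, 10,
    9, 8, 7, 6, 5, 4, 3, 2, 1, 0, -1, -10,
]

# No angle should appear in more than one split.
splits_disjoint = (
    not set(train_angles) & set(val_angles)
    and not set(train_angles) & set(test_angles)
    and not set(val_angles) & set(test_angles)
)

# Test angles lower than anything seen in training (assumed out-of-domain).
below_training_range = [a for a in test_angles if a < min(train_angles)]

print(len(train_angles), len(val_angles), len(test_angles))
print(splits_disjoint, len(below_training_range))
```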
Data extracted from GitHub repositories (training and test data-sets)
data.mendeley.com
Updated Aug 1, 2019
+ more versions
Cite
Youcef Bouziane (2019). Data extracted from GitHub repositories (training and test data-sets) [Dataset]. http://doi.org/10.17632/gt3f4jnbvn.3
Explore at:
Unique identifier
https://doi.org/10.17632/gt3f4jnbvn.3
Dataset updated
Aug 1, 2019
Authors
Youcef Bouziane
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains the SQL tables of the training and test datasets used in our experimentation. These tables contain the preprocessed textual data (in a form of tokens) extracted from each training and test project. Besides the preprocessed textual data, this dataset also contains meta-data about the projects, GitHub topics, and GitHub collections. The GitHub projects are identified by the tuple “Owner” and “Name”. The descriptions of the table fields are attached to their respective data descriptions.
