100+ datasets found
  1. Neural network error metrics for training and testing data sets. The neural...

    • plos.figshare.com
    xls
    Updated Feb 10, 2025
    Cite
    Xiaowen Chen; Anne E. Martin (2025). Neural network error metrics for training and testing data sets. The neural network performs similarly between training and testing trial sets and performs slightly better for training subjects compared to testing subjects. [Dataset]. http://doi.org/10.1371/journal.pone.0315186.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Feb 10, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Xiaowen Chen; Anne E. Martin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Neural network error metrics for training and testing data sets. The neural network performs similarly between training and testing trial sets and performs slightly better for training subjects compared to testing subjects.

  2. Challenge Round 0 (Dry Run) Test Dataset

    • catalog.data.gov
    • data.nist.gov
    • +2 more
    Updated Jul 29, 2022
    Cite
    National Institute of Standards and Technology (2022). Challenge Round 0 (Dry Run) Test Dataset [Dataset]. https://catalog.data.gov/dataset/challenge-round-0-dry-run-test-dataset-ff885
    Explore at:
    Dataset updated
    Jul 29, 2022
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    This dataset was an initial test harness infrastructure test for the TrojAI program. It should not be used for research. Please use the more refined datasets generated for the other rounds. The data being generated and disseminated is training, validation, and test data used to construct trojan detection software solutions. This data, generated at NIST, consists of human level AIs trained to perform a variety of tasks (image classification, natural language processing, etc.). A known percentage of these trained AI models have been poisoned with a known trigger which induces incorrect behavior. This data will be used to develop software solutions for detecting which trained AI models have been poisoned via embedded triggers. This dataset consists of 200 trained, human level, image classification AI models using the following architectures (Inception-v3, DenseNet-121, and ResNet50). The models were trained on synthetically created image data of non-real traffic signs superimposed on road background scenes. Half (50%) of the models have been poisoned with an embedded trigger which causes misclassification of the images when the trigger is present.

  3. U-T training data and test data for Sigsbee2A model

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 25, 2023
    Cite
    Guoxin Chen (2023). U-T training data and test data for Sigsbee2A model [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7967049
    Explore at:
    Dataset updated
    May 25, 2023
    Dataset authored and provided by
    Guoxin Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These are the training and testing data sets (Sigsbee2A model) used in the numerical experiments of the article "Joint Model and Data-Driven Simultaneous Inversion of Velocity and Density", submitted to the Journal of Geophysical Research: Solid Earth. Each dataset consists of two parts: a training dataset and a testing dataset. Both the training and testing data sets contain three parts: seismic data, a velocity model, and a density model.

  4. Dataset for Training and Testing Data-driven Security Assessment of the IEEE...

    • ieee-dataport.org
    Updated Oct 24, 2024
    Cite
    Juan Cuenca Silva (2024). Dataset for Training and Testing Data-driven Security Assessment of the IEEE ELVTN [Dataset]. https://ieee-dataport.org/documents/dataset-training-and-testing-data-driven-security-assessment-ieee-elvtn
    Explore at:
    Dataset updated
    Oct 24, 2024
    Authors
    Juan Cuenca Silva
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Croatia

  5. Predictive modeling of treatment resistant depression using data from STAR*D...

    • plos.figshare.com
    docx
    Updated Jun 1, 2023
    Cite
    Zhi Nie; Srinivasan Vairavan; Vaibhav A. Narayan; Jieping Ye; Qingqin S. Li (2023). Predictive modeling of treatment resistant depression using data from STAR*D and an independent clinical study [Dataset]. http://doi.org/10.1371/journal.pone.0197268
    Explore at:
    Available download formats: docx
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Zhi Nie; Srinivasan Vairavan; Vaibhav A. Narayan; Jieping Ye; Qingqin S. Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Identification of risk factors of treatment resistance may be useful to guide treatment selection, avoid inefficient trial-and-error, and improve major depressive disorder (MDD) care. We extended the work in predictive modeling of treatment resistant depression (TRD) via partition of the data from the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) cohort into a training and a testing dataset. We also included data from a small yet completely independent cohort, RIS-INT-93, as an external test dataset. We used features from enrollment and level 1 treatment (up to week 2 response only) of STAR*D to explore the feature space comprehensively and applied machine learning methods to model TRD outcome at level 2. For TRD defined using QIDS-C16 remission criteria, multiple machine learning models were internally cross-validated in the STAR*D training dataset and externally validated in both the STAR*D testing dataset and the RIS-INT-93 independent dataset, with an area under the receiver operating characteristic curve (AUC) of 0.70–0.78 and 0.72–0.77, respectively. The upper bound for the AUC achievable with the full set of features could be as high as 0.78 in the STAR*D testing dataset. The model developed using the top 30 features identified with a feature selection technique (k-means clustering followed by a χ2 test) achieved an AUC of 0.77 in the STAR*D testing dataset. In addition, the model developed using overlapping features between STAR*D and RIS-INT-93 achieved an AUC of > 0.70 in both the STAR*D testing and RIS-INT-93 datasets. Among all the features explored in the STAR*D and RIS-INT-93 datasets, the most important feature was early or initial treatment response or symptom severity at week 2. These results indicate that prediction of TRD prior to undergoing a second round of antidepressant treatment could be feasible even in the absence of biomarker data.
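
    For readers who want a feel for this kind of workflow, a minimal sketch of χ2-based feature selection followed by AUC evaluation on a held-out test set, using synthetic data rather than the authors' STAR*D pipeline:

        from sklearn.datasets import make_classification
        from sklearn.feature_selection import SelectKBest, chi2
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score
        from sklearn.model_selection import train_test_split
        from sklearn.preprocessing import MinMaxScaler

        # Synthetic stand-in for the clinical feature matrix and TRD labels.
        X, y = make_classification(n_samples=1000, n_features=100, random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

        scaler = MinMaxScaler().fit(X_tr)                    # chi2 requires non-negative inputs
        X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

        selector = SelectKBest(chi2, k=30).fit(X_tr, y_tr)   # keep the top 30 features
        clf = LogisticRegression(max_iter=1000).fit(selector.transform(X_tr), y_tr)

        auc = roc_auc_score(y_te, clf.predict_proba(selector.transform(X_te))[:, 1])
        print(f"Held-out AUC: {auc:.2f}")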

  6. Training data and test data sets for simultaneous inversion of velocity...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated May 25, 2023
    Cite
    Chen Guoxin (2023). Training data and test data sets for simultaneous inversion of velocity density based on U-T [Dataset]. http://doi.org/10.5281/zenodo.7965402
    Explore at:
    Available download formats: zip
    Dataset updated
    May 25, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Chen Guoxin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These are the training and testing data sets (Marmousi model) used in the numerical experiments of the article "Joint Model and Data-Driven Simultaneous Inversion of Velocity and Density", submitted to the Journal of Geophysical Research: Solid Earth. Each dataset consists of two parts: a training dataset and a testing dataset. Both the training and testing data sets contain three parts: seismic data, a velocity model, and a density model.

  7. Data and code for training and testing a ResMLP model with experience replay...

    • zenodo.org
    zip
    Updated Feb 20, 2025
    Cite
    Jianda Chen; Minghua Zhang; Wuyin Lin; Tao Zhang; Wei Xue (2025). Data and code for training and testing a ResMLP model with experience replay for machine-learning physics parameterization [Dataset]. http://doi.org/10.5281/zenodo.13690812
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 20, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jianda Chen; Minghua Zhang; Wuyin Lin; Tao Zhang; Wei Xue
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This directory contains the training data and code for training and testing a ResMLP with experience replay for creating a machine-learning physics parameterization for the Community Atmospheric Model.

    The directory is structured as follows:

    1. Download training and testing data: https://portal.nersc.gov/archive/home/z/zhangtao/www/hybird_GCM_ML

    2. Unzip nncam_training.zip

    nncam_training

    - models: model definition of ResMLP and other models for comparison purposes
    - dataloader: utility scripts to load data into a PyTorch dataset
    - training_scripts: scripts to train the ResMLP model with/without experience replay
    - offline_test: scripts to perform the offline test (Table 2, Figure 2)

    3. Unzip nncam_coupling.zip

    nncam_srcmods

    - SourceMods: SourceMods to be used with CAM modules for coupling with the neural network
    - otherfiles: additional configuration files to set up and run SPCAM with the neural network
    - pythonfiles: Python scripts to run the neural network and couple it with CAM
    - ClimAnalysis
      - paper_plots.ipynb: scripts to produce the online evaluation figures (Figure 1, Figures 3-10)
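
    As a rough illustration of the architecture named here, a minimal PyTorch sketch of a residual MLP (ResMLP); the layer widths, depth, and activation are illustrative assumptions, not taken from nncam_training:

        import torch
        import torch.nn as nn

        class ResMLPBlock(nn.Module):
            """One residual block: the MLP learns a correction added to its input."""
            def __init__(self, width: int = 256):
                super().__init__()
                self.net = nn.Sequential(nn.Linear(width, width), nn.ReLU(), nn.Linear(width, width))

            def forward(self, x):
                return x + self.net(x)

        class ResMLP(nn.Module):
            """Stack of residual blocks between an input embedding and an output head."""
            def __init__(self, in_dim, out_dim, width=256, depth=4):
                super().__init__()
                self.embed = nn.Linear(in_dim, width)
                self.blocks = nn.Sequential(*[ResMLPBlock(width) for _ in range(depth)])
                self.head = nn.Linear(width, out_dim)

            def forward(self, x):
                return self.head(self.blocks(self.embed(x)))

        # Example: map a column's input state vector to parameterized tendencies.
        model = ResMLP(in_dim=128, out_dim=64)
        out = model(torch.randn(32, 128))   # batch of 32 columns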

  8. The Test-Case Dataset

    • kaggle.com
    Updated Nov 29, 2020
    Cite
    sapal6 (2020). The Test-Case Dataset [Dataset]. https://www.kaggle.com/datasets/sapal6/the-testcase-dataset/code
    Explore at:
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 29, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    sapal6
    License

    CC0 1.0 (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    There are lots of datasets available for different machine learning tasks like NLP, computer vision, etc. However, I couldn't find any dataset which catered to the domain of software testing. This is one area which has lots of potential for the application of machine learning techniques, especially deep learning.

    This was the reason I wanted such a dataset to exist. So, I made one.

    Content

    New version [28th Nov '20]: Uploaded testing-related questions and related details from Stack Overflow. These are query results which were collected from Stack Overflow by using Stack Overflow's query viewer. The result set of this query contained posts which had the words "testing web pages".

    New version [27th Nov '20]: Created a CSV file containing pairs of test case titles and test case descriptions.

    This dataset is very tiny (approximately 200 rows of data). I have collected sample test cases from around the web and created a text file which contains all the test cases that I have collected. This text file has sections and under each section there are numbered rows of test cases.

    Acknowledgements

    I would like to thank websites like guru99.com, softwaretestinghelp.com and many other such websites which host a great many sample test cases. These were the source for the test cases in this dataset.

    Inspiration

    My inspiration to create this dataset was the scarcity of examples showcasing the implementation of machine learning in the domain of software testing. I would like to see if this dataset can be used to answer questions similar to the following:

    * Finding semantic similarity between different test cases ranging across products and applications.
    * Automating the elimination of duplicate test cases in a test case repository.
    * Can a recommendation system be built for suggesting domain-specific test cases to software testers?
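
    As an illustration of the first question, a minimal sketch of measuring semantic similarity between test case titles with TF-IDF and cosine similarity (the example titles are made up, not taken from the dataset):

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        # A few made-up test case titles standing in for rows of the dataset.
        titles = [
            "Verify login with valid username and password",
            "Verify login fails with an invalid password",
            "Check that the shopping cart total updates after adding an item",
        ]

        tfidf = TfidfVectorizer(stop_words="english").fit_transform(titles)

        # Pairwise similarity matrix; high off-diagonal values flag duplicate candidates.
        print(cosine_similarity(tfidf))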

  9. TREC 2022 Deep Learning test collection

    • catalog.data.gov
    • s.cnmilf.com
    • +1 more
    Updated May 9, 2023
    Cite
    National Institute of Standards and Technology (2023). TREC 2022 Deep Learning test collection [Dataset]. https://catalog.data.gov/dataset/trec-2022-deep-learning-test-collection
    Explore at:
    Dataset updated
    May 9, 2023
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    This is a test collection for passage and document retrieval, produced in the TREC 2022 Deep Learning track. The Deep Learning Track studies information retrieval in a large-training-data regime. This is the case where the number of training queries with at least one positive label is at least in the tens of thousands, if not hundreds of thousands or more. This corresponds to real-world scenarios such as training based on click logs and training based on labels from shallow pools (such as the pooling in the TREC Million Query Track or the evaluation of search engines based on early precision). Certain machine learning based methods, such as methods based on deep learning, are known to require very large datasets for training. Lack of such large-scale datasets has been a limitation for developing such methods for common information retrieval tasks, such as document ranking. The Deep Learning Track organized in the previous years aimed at providing large-scale datasets to TREC, and at creating a focused research effort with a rigorous blind evaluation of rankers for the passage ranking and document ranking tasks. Similar to the previous years, one of the main goals of the track in 2022 is to study what methods work best when a large amount of training data is available. For example, do the same methods that work on small data also work on large data? How much do methods improve when given more training data? What external data and models can be brought to bear in this scenario, and how useful is it to combine full supervision with other forms of supervision? The collection contains 12 million web pages, 138 million passages from those web pages, search queries, and relevance judgments for the queries.

  10. Data for training, validation and testing of methods in the thesis:...

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 1, 2021
    Cite
    Lucia Hajduková (2021). Data for training, validation and testing of methods in the thesis: Camera-based Accuracy Improvement of Indoor Localization [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4730337
    Explore at:
    Dataset updated
    May 1, 2021
    Dataset authored and provided by
    Lucia Hajduková
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The package contains files for two modules designed to improve the accuracy of the indoor positioning system, namely the following:

    door detection

    - videos_test: videos used to demonstrate the application of door detector
    - videos_res: videos from videos_test directory with detected doors marked

    parts detection

    - frames_train_val: images generated from videos used for training and validation of VGG16 neural network model
    - frames_test: images generated from videos used for testing of the trained model
    - videos_test: videos used to demonstrate the application of parts detector
    - videos_res: videos from videos_test directory with detected parts marked

  11. Dataset, splits, models, and scripts for the QM descriptors prediction

    • zenodo.org
    • explore.openaire.eu
    application/gzip
    Updated Apr 4, 2024
    Cite
    Shih-Cheng Li; Haoyang Wu; Angiras Menon; Kevin A. Spiekermann; Yi-Pei Li; William H. Green (2024). Dataset, splits, models, and scripts for the QM descriptors prediction [Dataset]. http://doi.org/10.5281/zenodo.10668491
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Apr 4, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Shih-Cheng Li; Haoyang Wu; Angiras Menon; Kevin A. Spiekermann; Yi-Pei Li; William H. Green
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset, splits, models, and scripts from the manuscript "When Do Quantum Mechanical Descriptors Help Graph Neural Networks Predict Chemical Properties?" are provided. The curated dataset includes 37 QM descriptors for 64,921 unique molecules across six levels of theory: wB97XD, B3LYP, M06-2X, PBE0, TPSS, and BP86. This dataset is stored in the data.tar.gz file, which also contains a file for multitask constraints applied to various atomic and bond properties. The data splits (training, validation, and test splits) for both random and scaffold-based divisions are saved as separate index files in splits.tar.gz. The trained D-MPNN models for predicting QM descriptors are saved in the models.tar.gz file. The scripts.tar.gz file contains ready-to-use scripts for training machine learning models to predict QM descriptors, as well as scripts for predicting QM descriptors using our trained models on unseen molecules and for applying radial basis function (RBF) expansion to QM atom and bond features.

    Below are descriptions of the available scripts:

    1. atom_bond_descriptors.sh: Trains atom/bond targets.
    2. atom_bond_descriptors_predict.sh: Predicts atom/bond targets from pre-trained model.
    3. dipole_quadrupole_moments.sh: Trains dipole and quadrupole moments.
    4. dipole_quadrupole_moments_predict.sh: Predicts dipole and quadrupole moments from pre-trained model.
    5. energy_gaps_IP_EA.sh: Trains energy gaps, ionization potential (IP), and electron affinity (EA).
    6. energy_gaps_IP_EA_predict.sh: Predicts energy gaps, IP, and EA from pre-trained model.
    7. get_constraints.py: Generates constraints file for testing dataset. This generated file needs to be provided before using our trained models to predict the atom/bond QM descriptors of your testing data.
    8. csv2pkl.py: Converts QM atom and bond features to .pkl files using RBF expansion for use with Chemprop software.

    Below is the procedure for running the ml-QM-GNN on your own dataset:

    1. Use get_constraints.py to generate a constraint file required for predicting atom/bond QM descriptors with the trained ML models.
    2. Execute atom_bond_descriptors_predict.sh to predict atom and bond properties. Run dipole_quadrupole_moments_predict.sh and energy_gaps_IP_EA_predict.sh to calculate molecular QM descriptors.
    3. Utilize csv2pkl.py to convert the data from the predicted atom/bond descriptors .csv file into separate atom and bond feature files (which are saved as .pkl files here); a sketch of the RBF expansion idea follows this list.
    4. Run Chemprop to train your models using the additional predicted features supported here.
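
    A minimal sketch of radial basis function (RBF) expansion of a scalar descriptor, in the spirit of csv2pkl.py; the number of centers, their range, and the width are illustrative assumptions rather than the script's actual settings:

        import numpy as np

        def rbf_expand(values, low, high, n_centers=16, gamma=10.0):
            """Map each scalar descriptor to n_centers Gaussian RBF activations."""
            centers = np.linspace(low, high, n_centers)            # evenly spaced centers
            diffs = np.asarray(values)[:, None] - centers[None, :]
            return np.exp(-gamma * diffs ** 2)

        # Example: expand a column of partial charges into 16 smooth features.
        charges = np.array([-0.41, 0.12, 0.29])
        print(rbf_expand(charges, low=-1.0, high=1.0).shape)       # (3, 16)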

  12. deepvl-training-data

    • huggingface.co
    Updated Apr 27, 2025
    Cite
    NTNU Autonomous Robots Lab (2025). deepvl-training-data [Dataset]. https://huggingface.co/datasets/ntnu-arl/deepvl-training-data
    Explore at:
    Dataset updated
    Apr 27, 2025
    Dataset authored and provided by
    NTNU Autonomous Robots Lab
    License

    BSD 3-Clause: https://choosealicense.com/licenses/bsd-3-clause/

    Description

    DeepVL training dataset

      Introduction
    

    This dataset repository contains the training and testing datasets used in the paper: "DeepVL: Dynamics and Inertial Measurements-based Deep Velocity Learning for Underwater Odometry". The dataset was collected by manually piloting an underwater robot in a pool and in the Trondheim fjord.

      Dataset details
    

    The training data is located in the train_full directory and the test data in the test directory, respectively. The training… See the full description on the dataset page: https://huggingface.co/datasets/ntnu-arl/deepvl-training-data.

  13. Data from: Leveraging Supervised Machine Learning Algorithms for System...

    • acs.figshare.com
    zip
    Updated Sep 3, 2024
    Cite
    Russell R. Kibbe; Alexandria L. Sohn; David C. Muddiman (2024). Leveraging Supervised Machine Learning Algorithms for System Suitability Testing of Mass Spectrometry Imaging Platforms [Dataset]. http://doi.org/10.1021/acs.jproteome.4c00360.s001
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 3, 2024
    Dataset provided by
    ACS Publications
    Authors
    Russell R. Kibbe; Alexandria L. Sohn; David C. Muddiman
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Quality control and system suitability testing are vital protocols implemented to ensure the repeatability and reproducibility of data in mass spectrometry investigations. However, mass spectrometry imaging (MSI) analyses present added complexity since both chemical and spatial information are measured. Herein, we employ various machine learning algorithms and a novel quality control mixture to classify the working conditions of an MSI platform. Each algorithm was evaluated in terms of its performance on unseen data, validated with negative control data sets to rule out confounding variables or chance agreement, and utilized to determine the necessary sample size to achieve a high level of accurate classifications. In this work, a robust machine learning workflow was established where models could accurately classify the instrument condition as clean or compromised based on data metrics extracted from the analyzed quality control sample. This work highlights the power of machine learning to recognize complex patterns in MSI data and use those relationships to perform a system suitability test for MSI platforms.

  14. Dataset of article: Synthetic Datasets Generator for Testing Information...

    • ieee-dataport.org
    Updated Mar 13, 2020
    + more versions
    Cite
    Carlos Santos (2020). Dataset of article: Synthetic Datasets Generator for Testing Information Visualization and Machine Learning Techniques and Tools [Dataset]. https://ieee-dataport.org/open-access/dataset-article-synthetic-datasets-generator-testing-information-visualization-and
    Explore at:
    Dataset updated
    Mar 13, 2020
    Authors
    Carlos Santos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset used in the article entitled 'Synthetic Datasets Generator for Testing Information Visualization and Machine Learning Techniques and Tools'. These datasets can be used to test several characteristics in machine learning and data processing algorithms.

  15. U-T training and test data for Saltblock model

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated May 25, 2023
    Cite
    Guoxin Chen (2023). U-T training and test data for Saltblock model [Dataset]. http://doi.org/10.5281/zenodo.7968683
    Explore at:
    Available download formats: bin
    Dataset updated
    May 25, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Guoxin Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These are the training and testing data sets (Saltblock model) used in the numerical experiments of the article "Joint Model and Data-Driven Simultaneous Inversion of Velocity and Density", submitted to the Journal of Geophysical Research: Solid Earth. Each dataset consists of two parts: a training dataset and a testing dataset. Both the training and testing data sets contain three parts: seismic data, a velocity model, and a density model.

  16. Training and testing dataset for Machine Learning models from experimental...

    • zenodo.org
    • explore.openaire.eu
    Updated Sep 27, 2024
    + more versions
    Cite
    Rowida Meligy; Alaric Montenon (2024). Training and testing dataset for Machine Learning models from experimental data from a Linear Fresnel Reflector in Cyprus [Dataset]. http://doi.org/10.5281/zenodo.11235652
    Explore at:
    Dataset updated
    Sep 27, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Rowida Meligy; Alaric Montenon
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    May 3, 2018 - Sep 23, 2019
    Description

    **Experimental data set**

    The ML_dataset.csv file contains the experimental data for 50+ days of operation. This is a cleaned version of 10.5281/zenodo.11195748. Only full days of experiments have been kept, and the tracking mode state has been added, indicating whether the primary field was tracking or not. When reflectometry measurements were done, the tracking was stopped. These are the two changes compared to 10.5281/zenodo.11195748.

    **Machine learning**

    The X_train.csv, Y_train.csv, X_test.csv and Y_test.csv files are used to train the models and to test them. The X files contain DNI, mass flow, inlet temperature, IAM and humidity. The Y files contain the output powers. Data are normalised between 0 and 1, while zero values have been removed.
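
    A minimal sketch of using these splits, assuming the four CSV files sit in the working directory and contain only the numeric, already-normalised columns described above; the regressor choice is illustrative, not one of the models trained in the study:

        import pandas as pd
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.metrics import r2_score

        X_train = pd.read_csv("X_train.csv")   # DNI, mass flow, inlet temperature, IAM, humidity
        Y_train = pd.read_csv("Y_train.csv")   # output powers
        X_test = pd.read_csv("X_test.csv")
        Y_test = pd.read_csv("Y_test.csv")

        model = RandomForestRegressor(n_estimators=200, random_state=0)
        model.fit(X_train, Y_train)            # multi-output regression if Y has several columns

        print("R^2 on the test split:", r2_score(Y_test, model.predict(X_test)))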

  17. PEC dataset

    • kaggle.com
    Updated Apr 20, 2023
    Cite
    rusuanjun (2023). PEC dataset [Dataset]. https://www.kaggle.com/datasets/rusuanjun/pec-dataset
    Explore at:
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 20, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    rusuanjun
    Description

    This dataset is collected using a custom-designed pulsed eddy current (PEC) instrument. Aluminium and S355 mild steel are tested. Data are in 1D time series format. The dataset is summarised below. For more detail, please refer to the paper Real-time Automatic Metal Thickness Recognition Using Pulse Eddy Current with Deep Learning. The source code for training on the PEC dataset is also available on GitHub: https://github.com/rusuanjun007/PEC-Thickness-Recognition

    [Image: PEC dataset summary] https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F5741725%2F50f099f9e8abcec9c1f0d9d42e79efca%2F2022-12-08%20004552.png?generation=1670460415470318&alt=media

    The overall thickness is determined by the number of plates, with the aluminium thickness ranging from a minimum of 20 mm to a maximum of 60 mm, in increments of 5 mm, and the steel thickness ranging from 5 mm to 20 mm, also in increments of 5 mm. Furthermore, our PEC dataset takes into consideration both lift-off and edge effects. The details of the PEC dataset are summarized in Table IV, where 50 data points are repeatedly tested for each thickness, lift-off, and position.

    Various testing conditions can significantly impact PEC measurements, including lift-off, edge effects, insulation, and weather jacket. As a result, even when the metal thickness remains constant, the measured waveform can exhibit substantial variation. A comparison is shown below.

    [Image: waveform comparison under different testing conditions] https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F5741725%2F5d8747cb0c53de1c1f6eacb9dfd4ec4d%2Fcompare_for_paper.png?generation=1682003150777828&alt=media

  18. Training and Testing Datasets for Machine Learning of Shortwave Radiative...

    • data.niaid.nih.gov
    Updated Mar 28, 2025
    Cite
    Schneiderman, Henry (2025). Training and Testing Datasets for Machine Learning of Shortwave Radiative Transfer [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_15089912
    Explore at:
    Dataset updated
    Mar 28, 2025
    Dataset authored and provided by
    Schneiderman, Henry
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets for Machine Learning Shortwave Radiative Transfer

    Author - Henry Schneiderman, henry@pittdata.com. Please contact me for any questions or feedback.

    Input reanalysis data downloaded from ECMWF's Copernicus Atmospheric Monitoring Service. Each atmospheric column contains the following input variables:

    mu - Cosine of solar zenith angle
    albedo - Surface albedo
    is_valid_zenith_angle - Indicates if daylight is present
    Vertical profiles (60 layers): Temperature, Pressure, Change in Pressure, H2O (vapor, liquid, solid), O3, CO2, O2, N2O, CH4

    The ecRad emulator (Hogan and Bozzo, 2018) generated the following output profiles at the layer interfaces for each input atmospheric column:

    flux_down_direct, flux_down_diffuse, flux_down_direct_clear_sky, flux_down_diffuse_clear_sky, flux_up_diffuse, flux_up_clear_sky

    All data is sampled at 5,120 global locations

    The training dataset uses input from 2008 sampled at three-hour intervals within every fourth day

    The validation dataset uses input from 2008 sampled at three-hour intervals within every 28th day offset two days from the training set to avoid duplication

    Testing datasets use input from 2009, 2015, and 2020. Each of these samples data at three-hour intervals within every 28th day.

    For more information see: Henry Schneiderman. "An Open Box Physics-Based Neural Network for Shortwave Radiative Transfer." Submitted to Artificial Intelligence for the Earth Systems.
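
    A small worked sketch of the day selection described above, showing why offsetting the validation days by two keeps them disjoint from the training days (the exact start days and the 3-hourly time stamps are illustrative assumptions):

        # Day-of-year selection for 2008: every 4th day for training, every 28th day
        # offset by two days for validation, so the two sets never overlap.
        train_days = set(range(1, 366, 4))     # days 1, 5, 9, ...
        val_days = set(range(3, 366, 28))      # days 3, 31, 59, ...
        assert train_days.isdisjoint(val_days)

        hours = list(range(0, 24, 3))          # three-hour sampling intervals
        print(len(train_days), len(val_days), hours)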

  19. CARLA Simulation Datasets for Training, Validation, and Test Data of the...

    • data.niaid.nih.gov
    Updated Jan 15, 2024
    Cite
    Shaikh, Hamdaan Asif (2024). CARLA Simulation Datasets for Training, Validation, and Test Data of the project "Out-Of-Domain Data Detection using Uncertainty Quantification in End-to-End Driving Algorithms" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10511420
    Explore at:
    Dataset updated
    Jan 15, 2024
    Dataset authored and provided by
    Shaikh, Hamdaan Asif
    Description

    These are CARLA Simulation Datasets of the project "Out-Of-Domain Data Detection using Uncertainty Quantification in End-to-End Driving Algorithms". The simulations are generated in CARLA Town 02 for different sun angles (in degrees). You will find image frames, command labels, and steering control values in the respective 'xxxx_files_data' folder. You will find videos of each simulation run in the 'xxxx_files_visualizations' folder.

    The 8 simulation runs for Training Data use the Sun Angles: 90, 80, 70, 60, 50, 40, 30, 20

    The 8 simulation runs for Training Data were seeded at 0000, 1000, 2000, 3000, 4000, 5000, 6000, 7000 respectively

    The 4 simulation runs for Validation Data use the Sun Angles: 87, 67, 47, 23

    The 4 simulation runs for Validation Data were seeded at 0000, 2000, 4000, 7000 respectively

    The 29 simulation runs for Testing Data use the Sun Angles: 85, 75, 65, 55, 45, 35, 25, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 09, 08, 07, 06, 05, 04, 03, 02, 01, 00, -1, -10

    The 29 simulation runs for Testing Data were all seeded at 5000

  20. Data extracted from GitHub repositories (training and test data-sets)

    • data.mendeley.com
    Updated Aug 1, 2019
    + more versions
    Cite
    Youcef Bouziane (2019). Data extracted from GitHub repositories (training and test data-sets) [Dataset]. http://doi.org/10.17632/gt3f4jnbvn.3
    Explore at:
    Dataset updated
    Aug 1, 2019
    Authors
    Youcef Bouziane
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the SQL tables of the training and test datasets used in our experimentation. These tables contain the preprocessed textual data (in the form of tokens) extracted from each training and test project. Besides the preprocessed textual data, this dataset also contains metadata about the projects, GitHub topics, and GitHub collections. The GitHub projects are identified by the tuple ("Owner", "Name"). The descriptions of the table fields are attached to their respective data descriptions.
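
    A minimal sketch of inspecting such a dump after importing it into SQLite; the database file name and the table/column names are hypothetical, not taken from the actual dataset:

        import sqlite3
        import pandas as pd

        # Hypothetical file and table/column names, purely for illustration.
        conn = sqlite3.connect("github_projects.db")
        projects = pd.read_sql("SELECT Owner, Name, Tokens FROM training_projects", conn)

        # Projects are identified by the (Owner, Name) tuple, so that pair should be unique.
        assert not projects.duplicated(subset=["Owner", "Name"]).any()
        print(projects.head())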
