Dataset Card for example-preference-dataset
This dataset has been created with distilabel.
Dataset Summary
This dataset contains a pipeline.yaml which can be used to reproduce the pipeline that generated it in distilabel using the distilabel CLI: distilabel pipeline run --config "https://huggingface.co/datasets/sdiazlor/example-preference-dataset/raw/main/pipeline.yaml"
or explore the configuration: distilabel pipeline info --config… See the full description on the dataset page: https://huggingface.co/datasets/distilabel-internal-testing/example-generate-preference-dataset.
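As a quick sanity check, the dataset can also be loaded directly with the Hugging Face datasets library. This is a minimal sketch, assuming a standard installation of datasets and the usual "train" split; the split and column names may differ for this repository.

```python
# Minimal sketch: load the preference dataset from the Hugging Face Hub.
# Assumes the `datasets` library is installed (pip install datasets) and
# that a "train" split exists; this may differ for this repository.
from datasets import load_dataset

dataset = load_dataset(
    "distilabel-internal-testing/example-generate-preference-dataset",
    split="train",
)
print(dataset)     # column names and number of rows
print(dataset[0])  # first generated preference record
```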
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Testing web APIs automatically requires generating input data values such as addresses, coordinates or country codes. Generating meaningful values for these types of parameters randomly is rarely feasible, which represents a major obstacle for current test case generation approaches. In this paper, we present ARTE, the first semantic-based approach for the Automated generation of Realistic TEst inputs for web APIs. Specifically, ARTE leverages the specification of the API under test to extract semantically related values for every parameter by applying knowledge extraction techniques. Our approach has been integrated into RESTest, a state-of-the-art tool for API testing, achieving an unprecedented level of automation which allows generating up to 100% more valid API calls than existing fuzzing techniques (30% on average). Evaluation results on a set of 26 real-world APIs show that ARTE can generate realistic inputs for 7 out of every 10 parameters, outperforming the results obtained by related approaches.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset used in the article entitled 'Synthetic Datasets Generator for Testing Information Visualization and Machine Learning Techniques and Tools'. These datasets can be used to test several characteristics in machine learning and data processing algorithms.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview of Data
Defects4J: A Database of Existing Faults to Enable Controlled Testing Studies for Java Programs
Paper Abstract
Rather than tediously writing unit tests manually, tools can be used to generate them automatically – sometimes even resulting in higher code coverage than manual testing. But how good are these tests at actually finding faults? To answer this question, we applied three state-of-the-art unit test generation tools for Java (Randoop, EvoSuite, and Agitar) to the 357 real faults in the Defects4J dataset and investigated how well the generated test suites perform at detecting these faults. Although the automatically generated test suites detected 55.7% of the faults overall, only 19.9% of all the individual test suites detected a fault. By studying the effectiveness and problems of the individual tools and the tests they generate, we derive insights to support the development of automated unit test generators that achieve a higher fault detection rate. These insights include 1) improving the obtained code coverage so that faulty statements are executed in the first instance, 2) improving the propagation of faulty program states to an observable output, coupled with the generation of more sensitive assertions, and 3) improving the simulation of the execution environment to detect faults that are dependent on external factors such as date and time.
Organizations can license synthetic, structured data generated by Syntegra from electronic health record systems of community hospitals across the United States, reaching beyond just claims and Rx data.
The synthetic data provides a detailed picture of the patient's journey throughout their hospital stay, including patient demographic information and payer type, as well as rich data not found in any other sources. Examples of this data include: drugs given (timing and dosing), patient location (e.g., ICU, floor, ER), lab results (timing by day and hour), physician roles (e.g., surgeon, attending), medications given, and vital signs. The participating community hospitals, with bed sizes ranging from 25 to 532, provide unique visibility into and assessment of variation in care outside of large academic medical centers and healthcare networks.
Our synthetic data engine is trained on a broadly representative dataset made up of deep clinical information of approximately 6 million unique patient records and 18 million encounters over 5 years of history. Notably, synthetic data generation allows for the creation of any number of records needed to power your project.
EHR data is available in the following formats:
— Cleaned, analytics-ready (a layer of clean and normalized concepts in Tuva Health's standard relational data model format)
— FHIR USCDI (labs, medications, vitals, encounters, patients, etc.)
The synthetic data maintains full statistical accuracy, yet does not contain any actual patients, thus removing any patient privacy liability risk. Privacy is preserved in a way that goes beyond HIPAA or GDPR compliance. Our industry-leading metrics prove that both privacy and fidelity are fully maintained.
— Generate the data needed for product development, testing, demo, or other needs
— Access data at a scalable price point
— Build your desired population, both in size and demographics
— Scale up and down to fit specific needs, increasing efficiency and affordability
Syntegra's synthetic data engine also has the ability to augment the original data:
— Expand population sizes, rare cohorts, or outcomes of interest
— Address algorithmic fairness by correcting bias or introducing intentional bias
— Conditionally generate data to inform scenario planning
— Impute missing values to minimize gaps in the data
Assessment of V2X technologies for safety, system efficiency, and mobility, and assessment of the safest and most efficient use of allocated spectrum to accommodate transportation needs, requires transparent, comprehensive, and repeatable test results that assure that the technologies work under normal as well as varying traffic conditions that create “edge-use” cases. Below are data collected by the ITS JPO as part of V2X Testing; the blue button links to the ITS V2X Testing website and the Data Dictionary button links to a briefing on the LTE-V2X Testing Data.
https://spdx.org/licenses/CC0-1.0.html
Social networks are tied to population dynamics; interactions are driven by population density and demographic structure, while social relationships can be key determinants of survival and reproductive success. However, difficulties integrating models used in demography and network analysis have limited research at this interface. We introduce the R package genNetDem for simulating integrated network-demographic datasets. It can be used to create longitudinal social networks and/or capture-recapture datasets with known properties. It incorporates the ability to generate populations and their social networks, generate grouping events using these networks, simulate social network effects on individual survival, and flexibly sample these longitudinal datasets of social associations. By generating co-capture data with known statistical relationships it provides functionality for methodological research. We demonstrate its use with case studies testing how imputation and sampling design influence the success of adding network traits to conventional Cormack-Jolly-Seber (CJS) models. We show that incorporating social network effects in CJS models generates qualitatively accurate results, but with downward-biased parameter estimates when network position influences survival. Biases are greater when fewer interactions are sampled or fewer individuals are observed in each interaction. While our results indicate the potential of incorporating social effects within demographic models, they show that imputing missing network measures alone is insufficient to accurately estimate social effects on survival, pointing to the importance of incorporating network imputation approaches. genNetDem provides a flexible tool to aid these methodological advancements and help researchers test other sampling considerations in social network studies. Methods: The dataset and code stored here are for Case Studies 1 and 2 in the paper. Datasets were generated using simulations in R. Here we provide 1) the R code used for the simulations; 2) the simulation outputs (as .RDS files); and 3) the R code to analyse simulation outputs and generate the tables and figures in the paper.
Experiments were designed to reproduce the loosening phenomenon observed in aeronautics, automotive or civil engineering structures where parts are assembled together by means of bolted joints. The bolts can indeed be subject to self-loosening under vibrations. Therefore, it is of paramount importance to develop sensing strategies and algorithms for early loosening estimation. The test rig was specifically designed to make the vibration tests as repeatable as possible. The dataset ORION-AE is made of a set of time-series measurements obtained by untightening a bolt to seven different levels. The data have been sampled at 5 MHz on four different sensors, including three permanently attached acoustic emission sensors in contact with the structure, and one laser (contactless) measurement apparatus. This dataset can thus be used for performance benchmarking of supervised, semi-supervised or unsupervised learning algorithms, including deep and transfer learning for time-series data, with possibly seven classes. This dataset may also be useful to challenge denoising methods or wave-picking algorithms, for which the vibrometer measurements can be used for validation. ORION is a jointed structure made of two plates manufactured in a 2024 aluminium alloy, linked together by three bolts. The contact between the plates is done through machined overlays. The contact patches have an area of 12x12 mm^2 and are 1 mm thick. The structure was submitted to a 100 Hz harmonic excitation force for about 10 seconds. The load was applied using a Tyra electromagnetic shaker, which can deliver a 200 N force. The force was measured using a PCB piezoelectric load cell and the vibration level was determined next to the end of the specimen using a Polytec laser vibrometer. The ORION-AE dataset is composed of five directories collected in five campaigns denoted as B, C, D, E and F in the sequel. Seven tightening levels were applied on the upper bolt. The tightening was first set to 60 cNm with a torque screwdriver. After a 10-second vibration test, the shaker was stopped and this vibration test was repeated after a torque modification at 50 cNm. Then torque modifications at 40, 30, 20, 10 and 5 cNm were applied. Note that, for campaign C, the level 40 cNm is missing. During each cycle of the vibration test for a given tightening level, different AE sources can generate signals and those sources may be activated or not, depending on the tribological conditions within the contact between the beams, which are not controlled. The tightening levels can be used as a reference against which clustering or classification results can be compared. In that case, the main assumption is that the torque remained close to the level which was set at the beginning of every period of 10 s. This assumption cannot be checked in the current configuration of the tests. For each campaign, four sensors were used: a laser vibrometer and three AE sensors with various frequency bands (micro-200-HF, micro-80 and the F50A from Euro-Physical Acoustics), which were attached onto the lower plate (5 cm above the end of the plate). All data were sampled at 5 MHz using a Picoscope 4824 and a preamplifier (from Euro-Physical Acoustics) set to 60 dB. The velocimeter is used for different purposes, in particular to control the amplitude of the displacement of the top of the upper beam so that it remains constant whatever the tightening level.
The sensors are expected to detect the stick-slip transitions or shocks in the interface that are known to generate small AE events during vibrations. The acoustic waves generated by these events are highly dependent on bolt tightening. These sources of AE signals have to be detected and identified from the data stream, which constitutes the challenge. Details of the folders and files: There is 1 folder per campaign, each composed of 7 subfolders corresponding to 7 tightening levels: 5 cNm, 10 cNm, 20 cNm, 30 cNm, 40 cNm, 50 cNm, 60 cNm. So, 7 levels are available per campaign, except for campaign C for which 40 cNm is missing. There are about 10 seconds of continuous recording of data per level (the exact value can be found according to the number of files in each subfolder). The sampling frequency was set to 5 MHz on all channels of a Picoscope 4824, with a preamplifier set to 60 dB (model 2/4/6 preamplifier made by Euro-Physical Acoustics). The characteristics of both the Picoscope and the preamplifier are provided in the enclosed documentation. Each subfolder is made of .mat files. There is about 1 file per second (depending on the buffering, it can vary a little). The files in a subfolder are named according to the timestamps (time of recording). Each file is composed of vectors of data named: A = micro80 sensor. B = F50A sensor. C = micro200HF sensor. D = velocimeter. Note ... Visit https://dataone.org/datasets/sha256%3A1448d7e6ddf29be42ecf7a171aae8a54a9d9ee5fd29055dfbe282f0cd5519f1e for complete metadata about this dataset.
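For readers who want to get started with the recordings, the sketch below shows one plausible way to load a single .mat file with SciPy and map the stored vectors to the sensors described above. The file path is a placeholder (actual files are named after recording timestamps inside each campaign/tightening-level subfolder); only the A/B/C/D vector names and the 5 MHz sampling rate are taken from the dataset description.

```python
# Illustrative only: read one ORION-AE measurement file with SciPy.
# The path below is a placeholder; actual files are named after the
# recording timestamp inside each campaign/tightening-level subfolder.
from scipy.io import loadmat

data = loadmat("B/60cNm/example_timestamp.mat")  # placeholder path
micro80 = data["A"].squeeze()      # micro-80 AE sensor
f50a = data["B"].squeeze()         # F50A AE sensor
micro200hf = data["C"].squeeze()   # micro-200-HF AE sensor
velocimeter = data["D"].squeeze()  # laser vibrometer channel

fs = 5_000_000  # all channels sampled at 5 MHz
print(micro80.shape, f"~{micro80.size / fs:.2f} s of signal")
```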
Economic data are often generated by stochastic processes that take place in continuous time, though observations may occur only at discrete times. Such data are called functional data. This paper is concerned with comparing two or more stochastic processes that generate functional data. The data may be produced by a randomized experiment in which there are multiple treatments. The paper presents a method for testing the hypothesis that the same stochastic process generates all the functional data. The results of Monte Carlo experiments and an application to an experiment on pricing of natural gas illustrate the usefulness of the test.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SDC-Scissor tool for Cost-effective Simulation-based Test Selection in Self-driving Cars Software
This dataset provides test cases for self-driving cars with the BeamNG simulator. Check out the repository and demo video to get started.
GitHub: github.com/ChristianBirchler/sdc-scissor
This project extends the tool competition platform from the Cyber-Physical Systems Testing Competition, which was part of the SBST Workshop in 2021.
Usage
Demo
YouTube Link
Installation
The tool can either be run with Docker or locally using Poetry.
When running the simulations, a working installation of BeamNG.research is required. Additionally, the simulations cannot be run in a Docker container but must run locally.
To install the application use one of the following approaches:
Docker: docker build --tag sdc-scissor .
Poetry: poetry install
Using the Tool
The tool can be used with the following two commands:
Docker: docker run --volume "$(pwd)/results:/out" --rm sdc-scissor [COMMAND] OPTIONS
Poetry: poetry run python sdc-scissor.py [COMMAND] [OPTIONS]
There are multiple commands available. To keep the documentation simple, only the commands and their options are described.
Generation of tests:
generate-tests --out-path /path/to/store/tests
Automated labeling of Tests:
label-tests --road-scenarios /path/to/tests --result-folder /path/to/store/labeled/tests
Note: This only works locally with BeamNG.research installed
Model evaluation:
evaluate-models --dataset /path/to/train/set --save
Split train and test data:
split-train-test-data --scenarios /path/to/scenarios --train-dir /path/for/train/data --test-dir /path/for/test/data --train-ratio 0.8
Test outcome prediction:
predict-tests --scenarios /path/to/scenarios --classifier /path/to/model.joblib
Evaluation based on random strategy:
evaluate --scenarios /path/to/test/scenarios --classifier /path/to/model.joblib
The possible parameters are always documented with --help.
Linting
The tool is verified with the linters flake8 and pylint. These are automatically enabled in Visual Studio Code and can be run manually with the following commands:
poetry run flake8 .
poetry run pylint **/*.py
License
The software we developed is distributed under the GNU GPL license. See the LICENSE.md file.
Contacts
Christian Birchler - Zurich University of Applied Science (ZHAW), Switzerland - birc@zhaw.ch
Nicolas Ganz - Zurich University of Applied Science (ZHAW), Switzerland - gann@zhaw.ch
Sajad Khatiri - Zurich University of Applied Science (ZHAW), Switzerland - mazr@zhaw.ch
Dr. Alessio Gambi - Passau University, Germany - alessio.gambi@uni-passau.de
Dr. Sebastiano Panichella - Zurich University of Applied Science (ZHAW), Switzerland - panc@zhaw.ch
References
Christian Birchler, Nicolas Ganz, Sajad Khatiri, Alessio Gambi, and Sebastiano Panichella. 2022. Cost-effective Simulation-based Test Selection in Self-driving Cars Software with SDC-Scissor. In 2022 IEEE 29th International Conference on Software Analysis, Evolution and Reengineering (SANER), IEEE.
If you use this tool in your research, please cite the following papers:
@INPROCEEDINGS{Birchler2022, author={Birchler, Christian and Ganz, Nicolas and Khatiri, Sajad and Gambi, Alessio and Panichella, Sebastiano}, booktitle={2022 IEEE 29th International Conference on Software Analysis, Evolution and Reengineering (SANER)}, title={Cost-effective Simulation-based Test Selection in Self-driving Cars Software with SDC-Scissor}, year={2022}, }
This is a test collection for passage and document retrieval, produced in the TREC 2023 Deep Learning track. The Deep Learning Track studies information retrieval in a large training data regime. This is the case where the number of training queries with at least one positive label is at least in the tens of thousands, if not hundreds of thousands or more. This corresponds to real-world scenarios such as training based on click logs and training based on labels from shallow pools (such as the pooling in the TREC Million Query Track or the evaluation of search engines based on early precision).
Certain machine learning based methods, such as methods based on deep learning, are known to require very large datasets for training. Lack of such large-scale datasets has been a limitation for developing such methods for common information retrieval tasks, such as document ranking. The Deep Learning Track organized in the previous years aimed at providing large-scale datasets to TREC, and at creating a focused research effort with a rigorous blind evaluation of rankers for the passage ranking and document ranking tasks.
Similar to the previous years, one of the main goals of the track in 2022 is to study what methods work best when a large amount of training data is available. For example, do the same methods that work on small data also work on large data? How much do methods improve when given more training data? What external data and models can be brought to bear in this scenario, and how useful is it to combine full supervision with other forms of supervision?
The collection contains 12 million web pages, 138 million passages from those web pages, search queries, and relevance judgments for the queries.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
LTM: Scalable and Black-box Similarity-based Test Suite Minimization based on Language Models
This is the replication package associated with the paper "LTM: Scalable and Black-box Similarity-based Test Suite Minimization based on Language Models".
Replication Package Contents:
This replication package contains all the necessary data and code required to reproduce the results reported in the paper. We provide the results of the Fault Detection Rate (FDR), Total Minimization Time (MT), Time Saving Rate (TSR), and statistical tests for all the minimization budgets (i.e., 25%, 50%, and 75%), as well as results for the preliminary study and results for UniXcoder/Cosine with preprocessed code on 16 projects.
Data:
We provide in the Data directory the data used in our experiments, which is the source code of test cases (Java test methods) of 17 projects collected from Defects4J.
Code:
We provide in the Code directory the code (Python) and bash files required to run the experiments and reproduce the results.
Results:
We provide in the Results directory the detailed results for our approach (called LTM). We also provide the summarized results of LTM and a baseline (ATM) for comparison purposes. Additional technical details about ATM can be found at https://zenodo.org/record/7455766.
_
LTM's Similarity Measurement:
The source code of this step is in the Code/LTM/Similarity directory.
Requirements:
To run this step, Python 3 is required (we used Python 3.10). Also, the required libraries in the Code/LTM/Similarity/requirements.txt file should be installed, as follows:
cd Code/LTM/Similarity
pip install -r requirements.txt
Input:
Data/LTM/TestMethods
Output:
Data/LTM/similarity_measurements
Running the experiment:
To measure the similarity between all pairs of test cases, the following bash script should be executed:
bash measure_similarity.sh
The source code of the test methods of each project in Data/LTM/TestMethods is parsed to generate pairs of test cases. This step includes test method tokenization, test method embedding extraction, and similarity calculation. Then, all similarity scores are stored in the Data/LTM/similarity_measurements folder. Due to the large size of the calculated similarity scores (60 GB), they were not uploaded to Zenodo, but they are available upon request.
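For illustration only, the following Python sketch shows the general idea behind this step: embedding two test methods with UniXcoder and computing their cosine similarity. It is not the code from this replication package; the microsoft/unixcoder-base checkpoint, the mean pooling, and the truncation length are assumptions that may differ from the actual LTM implementation.

```python
# Sketch of the general idea (not the exact LTM implementation): embed two
# Java test methods with UniXcoder and compute their cosine similarity.
# Assumptions: the microsoft/unixcoder-base checkpoint and mean pooling over
# the last hidden state; the paper's tokenization/pooling may differ.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/unixcoder-base")
model = AutoModel.from_pretrained("microsoft/unixcoder-base")
model.eval()

def embed(test_method_source: str) -> torch.Tensor:
    inputs = tokenizer(test_method_source, return_tensors="pt",
                       truncation=True, max_length=512)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)            # mean-pooled embedding

t1 = "public void testAdd() { assertEquals(3, calc.add(1, 2)); }"
t2 = "public void testSub() { assertEquals(1, calc.sub(3, 2)); }"
score = torch.nn.functional.cosine_similarity(embed(t1), embed(t2), dim=0)
print(f"cosine similarity: {score.item():.3f}")
```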
LTM's Test Suite Minimization:
The source code of this step is in the Code/LTM/Search directory.
Requirements:
To run this step, Python 3 is required (we used Python 3.10). Also, the required libraries in the Code/LTM/Search/requirements.txt file should be installed, as follows:
cd Code/LTM/Search
pip install -r requirements.txt
Input:
Data/LTM/similarity_measurements
Output:
Results/LTM/minimization_results
Running the experiments:
To minimize the test suite for each project version, the following bash script should be executed:
bash minimize.sh
The similarity scores of all test case pairs per project version are parsed by the search algorithm (Genetic Algorithm). Each experiment runs ten times using three minimization budgets (25%, 50%, and 75%). The results are stored in the Results/LTM/minimization_results directory.
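To make the role of the minimization budget and the similarity scores concrete, here is a deliberately simplified greedy stand-in written in Python. The replication package uses a Genetic Algorithm, not this greedy heuristic; the sketch only illustrates how a budget fraction and pairwise similarity scores determine which test cases are kept.

```python
# Deliberately simplified stand-in for the GA used in LTM: greedily keep the
# test case least similar to those already selected until the budget is met.
# Illustrative only; the replication package runs a Genetic Algorithm.
import math

def greedy_minimize(test_ids, similarity, budget):
    """similarity: dict[(a, b)] -> score in [0, 1]; budget: fraction to keep."""
    target = max(1, math.ceil(budget * len(test_ids)))

    def sim(a, b):
        return similarity.get((a, b), similarity.get((b, a), 0.0))

    selected = [test_ids[0]]
    remaining = set(test_ids[1:])
    while len(selected) < target and remaining:
        # pick the candidate with the lowest maximum similarity to the selection
        best = min(remaining, key=lambda c: max(sim(c, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

tests = ["t1", "t2", "t3", "t4"]
scores = {("t1", "t2"): 0.9, ("t1", "t3"): 0.2, ("t1", "t4"): 0.4,
          ("t2", "t3"): 0.3, ("t2", "t4"): 0.5, ("t3", "t4"): 0.8}
print(greedy_minimize(tests, scores, budget=0.5))  # -> ['t1', 't3']
```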
LTM's Evaluation:
To evaluate the minimization results for each version and each project, the following bash script should be executed:
cd Code/LTM/Evaluation
bash evaluate_per_version.sh
cd Code/LTM/Evaluation
bash evaluate_per_project.sh
This will evaluate the FDR, MT and TSR results for each version and each project for each minimization budget. These results are stored in the Results/LTM directory.
Note that for each version, the FDR is either 1 or 0. For each project, the FDR ranges from 0 to 1.
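As a small illustration of the note above, the following Python snippet aggregates binary per-version FDR values into a per-project FDR; the variable names and example numbers are ours, not taken from the replication scripts.

```python
# Sketch of the FDR aggregation described above: each project version gets a
# binary FDR (fault detected by the minimized suite or not) and the project
# FDR is the proportion of versions whose fault is still detected.
# Variable names and numbers are illustrative, not from the replication scripts.
version_fdr = {          # project -> list of per-version binary FDRs
    "Lang":  [1, 1, 0, 1],
    "Chart": [0, 1, 1],
}

project_fdr = {p: sum(v) / len(v) for p, v in version_fdr.items()}
print(project_fdr)  # {'Lang': 0.75, 'Chart': 0.666...}
```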
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This package contains data and code to replicate the findings presented in our paper titled "An Empirical Study on Exploratory Crowdtesting of Android Applications".
Abstract
Crowdtesting is an emerging paradigm in which a "crowd" of people is recruited to perform testing tasks on demand. It proved to be especially promising in the mobile apps domain and in combination with exploratory testing strategies, in which individual testers pursue a creative, experience-based approach to design tests.
Managing the crowdtesting process, however, is still a challenging task that can easily result either in wasteful spending or in inadequate software quality, due to the unpredictability of remote testing activities.
A number of works in the literature have investigated the application of crowdtesting in the mobile apps domain. These works, however, investigated crowdtesting effectiveness in finding bugs, not in scenarios in which the goal is also to generate a re-executable test suite. Moreover, less work has been conducted on the impact of different exploratory testing strategies on the crowdtesting process.
As a first step towards filling this gap in the literature, in this work we conduct an empirical evaluation involving four open-source Android apps and twenty master's students, who we believe can be representative of practitioners partaking in crowdtesting activities. The students were asked to generate test suites for the apps using a Capture and Replay tool and different exploratory testing strategies. We then compare the effectiveness, in terms of aggregate code coverage, that different-sized crowds of students using different exploratory testing strategies may achieve.
Results suggest that exploratory crowdtesting can be a valuable approach for generating GUI test suites for mobile apps. They also provide project managers interested in using crowdtesting to test simple apps with deeper insight into code coverage dynamics, on which they can base more informed decisions.
Contents and Instructions
This package contains:
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
## Overview
Testing Model is a dataset for object detection tasks - it contains Character annotations for 729 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the Public Domain license.
## Overview
Testing is a dataset for object detection tasks - it contains Severe annotations for 4,551 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
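If you prefer to pull the dataset programmatically, the roboflow Python package can be used along the lines of the sketch below. The API key, workspace, project, version number, and export format are all placeholders; the actual values for this dataset are shown on its Roboflow page.

```python
# Hedged sketch of downloading a Roboflow dataset with the roboflow package
# (pip install roboflow). The API key, workspace, project and version below
# are placeholders; the real values are shown on the dataset's Roboflow page.
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("your-workspace").project("your-project")
dataset = project.version(1).download("coco")  # export format is an example
print(dataset.location)  # local folder containing images and annotations
```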
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the replication package associated with the paper "ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolutionary Search" accepted at the 45th IEEE/ACM International Conference on Software Engineering (ICSE 2023) – Technical Track. Cite this paper using the following:
@inproceedings{pan2023atm,
title={ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolutionary Search},
author={Pan, Rongqi and Ghaleb, Taher A. and Briand, Lionel},
booktitle={Proceedings of the 45th IEEE/ACM International Conference on Software Engineering},
year={2023},
pages={1--12}
}
Replication Package Contents:
The replication package contains all the necessary data and code required to reproduce the results reported in the paper. We also provide the results for other minimization budgets, as well as detailed FDR, execution time, and statistical test results. In addition, we provide the data and code required to reproduce the results of the baseline techniques: FAST-R and random minimization.
Data:
We provide in the Data directory the data used in our experiments, which is based on 16 projects from Defects4J, whose characteristics can be found in Data/subject_projects.csv.
Code:
We provide in the Code directory the code and scripts (Java, Python, and Bash) required to run the experiments and reproduce the results.
Results:
We provide in the Results directory the results for each technique independently, and also a summary of all results together for comparison purposes.
_
ATM - Code to AST transformation:
The source code for this step is in the Code/ATM/CodeToAST directory.
Requirements:
* Eclipse IDE (we used 2021-12)
* The libraries (the .jar files in the Code/ATM/CodeToAST/lib directory)
Input:
All zipped data files should be unzipped before running each step.
* Data/test_suites/all_test_cases.zip → Data/test_suites/all_test_cases
* Data/test_suites/changed_test_cases.zip → Data/test_suites/changed_test_cases
* Data/test_suites/relevant_test_cases.zip → Data/test_suites/relevant_test_cases
Output:
* Data/ATM/ASTs/all_test_cases
* Data/ATM/ASTs/changed_test_cases
Running the experiment:
To generate ASTs for all test cases in the project test suites, the Code/ATM/CodeToAST/src/CodeToAST.java file should be compiled and run using the Eclipse IDE by including all the required .jar files in the Code/ATM/CodeToAST/lib directory as part of the classpath. A bash script is provided along with a pre-generated .jar file in the Code/ATM/CodeToAST/bin directory to run this step, as follows:
cd Code/ATM/CodeToAST
bash transform_code_to_ast.sh
Each test file in the Data/test_suites/all_test_cases and Data/test_suites/changed_test_cases directories is parsed to generate a corresponding AST for each test case method (saved in XML format in Data/ATM/ASTs/all_test_cases and Data/ATM/ASTs/changed_test_cases for each project version).
_
ATM - Similarity Measurement:
The source code for this step is in the Code/ATM/Similarity directory.
Requirements:
* Eclipse IDE (we used 2021-12)
* The libraries (the .jar files in the Code/ATM/Similarity/lib directory)
Input:
* Data/ATM/ASTs/all_test_cases
* Data/ATM/ASTs/changed_test_cases
Output:
* Data/ATM/similarity_measurements
Running the experiment:
To measure the similarity between each pair of test cases, the Code/ATM/Similarity/src/SimilarityMeasurement.java file should be compiled and run using the Eclipse IDE by including all the required .jar files in the Code/ATM/Similarity/lib directory as part of the classpath. A bash script is provided along with a pre-generated .jar file in the Code/ATM/Similarity/bin directory to run this step, as follows:
cd Code/ATM/Similarity
bash measure_similarity.sh
ASTs of each project in the Data/ATM/ASTs/all_test_cases and Data/ATM/ASTs/changed_test_cases directories are parsed to create pairs of ASTs containing one test case from the Data/ATM/ASTs/all_test_cases directory with another test case from the Data/ATM/ASTs/changed_test_cases directory (redundant pairs are discarded). Then, all similarity measurements are saved in the Data/ATM/similarity_measurements.zip file.
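As a toy illustration of the tree-edit-distance idea used by one of ATM's similarity measures, the Python sketch below compares two tiny trees standing in for test-case ASTs using the zss (Zhang-Shasha) package. This is not the package's own code, which computes similarities in Java on the XML ASTs produced in the previous step.

```python
# Toy illustration of tree edit distance between two small trees that stand
# in for test-case ASTs. Uses the zss (Zhang-Shasha) package; the replication
# package itself computes similarities in Java on the XML ASTs generated above.
from zss import Node, simple_distance

# Two AST-like sketches of test methods differing in one call
t1 = Node("method").addkid(Node("assertEquals")).addkid(Node("add"))
t2 = Node("method").addkid(Node("assertEquals")).addkid(Node("sub"))

print(simple_distance(t1, t2))  # number of node edits to turn t1 into t2
```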
_
Search-based Minimization Algorithms:
The source code for this step is in the Code/ATM/Search directory.
Requirements:
To run this step, Python 3 is required (we used Python 3.10). Also, the libraries in the Code/ATM/Search/requirements.txt file should be installed, as follows:
cd Code/ATM/Search
pip install -r requirements.txt
Input:
* Data/ATM/similarity_measurements
Output:
* Results/ATM/minimization_results
Running the experiment:
To minimize the test suites in our dataset, the following bash script should be executed:
bash minimize.sh
All similarity measurements are parsed for each version of the projects, independently. Each version is run 10 times using three minimization budgets (25%, 50%, and 75%). The Genetic Algorithm (GA) is run using four similarity measures, namely top-down, bottom-up, combined, and tree edit distance. NSGA-II is run using two combinations of similarity measures: top-down & bottom-up, and combined & tree edit distance. The minimization results are generated in the Results/ATM/minimization_results directory.
_
Evaluate results:
To evaluate and summarize the minimization results, run the following:
cd Code/ATM/Evaluation
bash evaluate.sh
This will generate summarized FDR and execution time results (per project and per version) for each minimization budget, all of which can be found in Results/ATM. In this replication package, we provide the final, merged FDR and execution time results.
_
Running FAST-R experiments
ATM was compared to FAST-R, a state-of-the-art baseline comprising a set of test case minimization techniques, namely FAST++, FAST-CS, FAST-pw, and FAST-all, which we adapted to our data and experimental setup.
Requirements:
To run this step, Python 3.7 is required. Also, the libraries in the Code/FAST-R/requirements.txt file should be installed, as follows:
cd Code/FAST-R
pip install -r requirements.txt
Input:
* Data/FAST-R/test_methods
* Data/FAST-R/test_classes
Output:
* Results/FAST-R/test_methods/FDR_and_Exec_Time_Results_[budget]%_budget.csv
* Results/FAST-R/test_classes/FDR_and_Exec_Time_Results_[budget]%_budget.csv
To run FAST-R experiments, the following bash script should be executed:
bash fast_r.sh test_methods #method level
bash fast_r.sh test_classes #class level
Results are generated in .csv files for each budget. For example, for the 50% budget, results are saved in FDR_and_Exec_Time_Results_50%_budget.csv in the Results/FAST-R/test_methods and Results/FAST-R/test_classes directories.
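Below is a small Python sketch for collecting these result files, assuming only the file-naming pattern shown above; the CSV column names are not documented here, so the code just loads each file and prints its header.

```python
# Small sketch: collect the FAST-R result files following the naming pattern
# above and preview them. Column names are not documented here, so the code
# only loads and prints each file's header and row count.
import pandas as pd

for level in ("test_methods", "test_classes"):
    for budget in (25, 50, 75):
        path = f"Results/FAST-R/{level}/FDR_and_Exec_Time_Results_{budget}%_budget.csv"
        df = pd.read_csv(path)
        print(path, list(df.columns), len(df), sep="\n  ")
```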
_
Running the random minimization experiments
ATM was also compared to random minimization as a standard baseline.
Requirements: To run this step, Python 3 is required (we used Python 3.10). Also, the libraries in the Code/RandomMinimization/requirements.txt file should be installed, as follows:
cd Code/RandomMinimization
pip install -r requirements.txt
Input:
N/A
Output:
* Results/RandomMinimization/FDR_and_Exec_Time_Results_[budget]%_budget.csv
To run the random selection experiments, the corresponding bash script in the Code/RandomMinimization directory should be executed.
U.S. Government Workshttps://www.usa.gov/government-works
License information was derived automatically
We propose a novel computational method for generating data needed to create decision strategies for condition-based monitoring algorithms that can effectively differentiate between a healthy system and different types of defects in a damaged system. Currently, the only means available to generate these data are physical testing, which is time-consuming and expensive, and simplified computer models, either lumped-parameter models or 2D models. The most advanced current computational model of drive systems with surface and crack damage can only be deployed on stand-alone computers. The existing contact algorithm relies on shared memory between CPUs and quickly saturates memory bandwidth. We propose innovative modifications to the algorithm so that models may be efficiently deployed on very large clusters of computers connected by high-speed networks. These changes will make realistic time-domain 3D modeling of drive systems with surface and crack damage possible.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The New York State Department of Health Radon Program contracts with a radon testing laboratory to provide short-term charcoal radon test kits, radon test kit analysis, and results to residents. The contract laboratory provides the radon test results to the individual home owner and the Radon Program. All testing data is entered into our database. From this database, we are able to create radon prevalence maps, design special outreach activities and campaigns, and track the location in the home where the detector was placed.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Nick's Test Dataset 2 Description.
Version 4: Added some image files. Version 5: Created new zip; upload of 2 GB file failed; uploaded 1 GB file. Version 6: Embargoed dataset; removed big file. Version 7: Testing more things. Version 8: Cleared embargo date.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This is a hosted feature layer created only for testing. We are checking if we can use the new Metadata Editor to create complete metadata that passes the INSPIRE Reference Validator without any errors. The hosted feature layer includes two empty layers, which are just for testing.