Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset used in the article entitled 'Synthetic Datasets Generator for Testing Information Visualization and Machine Learning Techniques and Tools'. These datasets can be used to test several characteristics in machine learning and data processing algorithms.
Dataset Card for test-data-generator
This dataset has been created with distilabel.
Dataset Summary
This dataset contains a pipeline.yaml which can be used to reproduce the pipeline that generated it in distilabel using the distilabel CLI:
distilabel pipeline run --config "https://huggingface.co/datasets/franciscoflorencio/test-data-generator/raw/main/pipeline.yaml"
or explore the configuration: distilabel pipeline info --config… See the full description on the dataset page: https://huggingface.co/datasets/franciscoflorencio/test-data-generator.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Testing web APIs automatically requires generating input data values such as addresses, coordinates or country codes. Generating meaningful values for these types of parameters randomly is rarely feasible, which poses a major obstacle for current test case generation approaches. In this paper, we present ARTE, the first semantic-based approach for the Automated generation of Realistic TEst inputs for web APIs. Specifically, ARTE leverages the specification of the API under test to extract semantically related values for every parameter by applying knowledge extraction techniques. Our approach has been integrated into RESTest, a state-of-the-art tool for API testing, achieving an unprecedented level of automation that makes it possible to generate up to 100% more valid API calls than existing fuzzing techniques (30% on average). Evaluation results on a set of 26 real-world APIs show that ARTE can generate realistic inputs for 7 out of every 10 parameters, outperforming the results obtained by related approaches.
https://www.nist.gov/open/license
This is a program that takes in a description of a cryptographic algorithm implementation's capabilities, and generates test vectors to ensure the implementation conforms to the standard. After generating the test vectors, the program also validates the correctness of the responses from the user.
https://dataintelo.com/privacy-and-policy
The global generator load bank testing market size was valued at approximately $1.3 billion in 2023 and is projected to reach $2.4 billion by 2032, exhibiting a Compound Annual Growth Rate (CAGR) of 7.1% during the forecast period. This impressive growth is driven by increasing demand for reliable power backup systems in critical applications such as data centers, healthcare facilities, and industrial operations.
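The quoted growth rate can be checked against the two market values with the standard CAGR formula. The sketch below assumes a nine-year span (2023 to 2032); the function name is illustrative and not part of any report.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.07 means 7%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted above: $1.3 billion in 2023 and $2.4 billion in 2032.
# Assumption: the growth is compounded over 2032 - 2023 = 9 years.
implied = cagr(1.3, 2.4, 2032 - 2023)
print(f"implied CAGR: {implied:.2%}")
```

Under that assumption the implied rate comes out at roughly 7% per year, in line with the quoted 7.1%.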
The growth of the generator load bank testing market is primarily fueled by the rising need for robust and uninterrupted power supply across various sectors. With the proliferation of data centers due to the surge in digitalization and cloud computing, the demand for reliable backup power systems has skyrocketed. Load bank testing ensures that these systems are capable of handling full power loads, thus preventing costly downtime and ensuring operational continuity. Additionally, increased awareness about the importance of preventive maintenance in power generation systems further drives the adoption of load bank testing.
Another significant growth factor is the expansion of the oil and gas industry, which relies heavily on generator systems for uninterrupted operations, especially in remote locations. Load bank testing is crucial in ensuring that generator systems can handle the varying power demands typical in oil and gas operations. Furthermore, the marine industry, which requires highly reliable power systems for safety and operational efficiency, also contributes to the market's growth. Continuous advancements in load bank technologies, such as more efficient resistive and reactive load banks, are further propelling the market.
The increasing adoption of renewable energy sources is also influencing the generator load bank testing market. As more renewable energy projects come online, there is a growing need to integrate these systems with conventional power grids. Load bank testing plays a critical role in ensuring that backup generators are capable of supporting the grid during fluctuations in renewable energy supply. This integration is vital for maintaining grid stability and reliability, thereby driving the demand for load bank testing solutions.
From a regional perspective, North America and Europe are currently leading the market, owing to the high concentration of data centers, advanced industrial operations, and stringent regulatory standards for power system maintenance. Asia Pacific is expected to witness significant growth due to rapid industrialization, urbanization, and increased investments in infrastructure development. Emerging economies in Latin America and the Middle East & Africa are also anticipated to contribute to market growth as they continue to enhance their power infrastructure and adopt advanced maintenance practices.
The generator load bank testing market is segmented into resistive load banks, reactive load banks, and resistive/reactive load banks. Resistive load banks, which simulate resistive loads such as heating elements, are widely used for their simplicity and effectiveness in testing generator systems. These load banks are essential for verifying the capacity and performance of generators, especially in applications where constant power delivery is crucial. The market for resistive load banks is substantial and continues to grow due to their broad applicability and ease of use.
Reactive load banks, on the other hand, simulate inductive and capacitive loads, which are essential for testing generators that will power equipment with varying power factors. These load banks are particularly important in industrial and commercial applications where the power demand fluctuates. The market for reactive load banks is expanding as industries recognize the need for comprehensive testing that includes both resistive and reactive components. This ensures that generators can handle real-world power conditions, thus enhancing reliability and performance.
Resistive/reactive load banks combine the features of both resistive and reactive load banks, offering a more versatile solution for comprehensive testing. These load banks are increasingly being adopted in sophisticated applications such as data centers and marine systems, where both types of loads are present. The growing complexity of power systems and the need for more thorough testing protocols are driving the demand for resistive/reactive load banks. This segment is expected to see robust growth due to its ability to provide a more accurate simulation of actual operat
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
This repository hosts the Testing Roads for Autonomous VEhicLes (TRAVEL) dataset. TRAVEL is an extensive collection of virtual roads that have been used for testing lane assist/keeping systems (i.e., driving agents), together with data from their execution in a state-of-the-art, physically accurate driving simulator called BeamNG.tech. Virtual roads consist of sequences of road points interpolated using cubic splines.
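To make the idea of interpolating road points concrete, here is a minimal pure-Python sketch. It uses a uniform Catmull-Rom spline, which is one kind of cubic spline that passes through every control point; the dataset's own tooling may use a different spline formulation, and the function name and sampling density are assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def catmull_rom(points: List[Point], samples_per_segment: int = 10) -> List[Point]:
    """Interpolate a sequence of 2D points with a uniform Catmull-Rom cubic spline."""
    if len(points) < 2:
        return list(points)
    # Duplicate the end points so the curve starts and ends on the road points.
    padded = [points[0]] + list(points) + [points[-1]]
    result: List[Point] = []
    for i in range(1, len(padded) - 2):
        p0, p1, p2, p3 = padded[i - 1], padded[i], padded[i + 1], padded[i + 2]
        for step in range(samples_per_segment):
            t = step / samples_per_segment
            t2, t3 = t * t, t * t * t
            # Evaluate the cubic polynomial separately for x and y.
            x, y = (
                0.5 * (2 * b + (c - a) * t
                       + (2 * a - 5 * b + 4 * c - d) * t2
                       + (-a + 3 * b - 3 * c + d) * t3)
                for a, b, c, d in zip(p0, p1, p2, p3)
            )
            result.append((x, y))
    result.append((float(points[-1][0]), float(points[-1][1])))
    return result


# Hypothetical road points (in meters) on a small map.
road_points = [(0.0, 0.0), (10.0, 0.0), (20.0, 10.0), (30.0, 10.0)]
interpolated = catmull_rom(road_points, samples_per_segment=10)
print(len(interpolated), "interpolated points")
```

Each pair of consecutive road points becomes one cubic segment, so the interpolated road passes through every original point.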
Along with the data, this repository contains instructions on how to install the tooling necessary to generate new data (i.e., test cases) and analyze them in the context of regression testing. We focus on test selection and test prioritization, given their importance for developing high-quality software following the DevOps paradigm.
This dataset builds on our previous work in this area, including work on:
test generation (e.g., AsFault, DeepJanus, and DeepHyperion) and the SBST CPS tool competition (SBST2021);
test selection: SDC-Scissor and the related tool;
test prioritization: automated test case prioritization work for SDCs.
Dataset Overview
The TRAVEL dataset is available under the data folder and is organized as a set of experiment folders. Each of these folders is generated by running the test-generator (see below) and contains the configuration used for generating the data (experiment_description.csv), various statistics on generated tests (generation_stats.csv), and statistics on found faults (oob_stats.csv). Additionally, the folders contain the raw test cases generated and executed during each experiment (test..json).
The following sections describe what each of those files contains.
Experiment Description
The experiment_description.csv contains the settings used to generate the data, including:
Time budget. The overall generation budget in hours. This budget includes both the time to generate and execute the tests as driving simulations.
The size of the map. The size of the squared map defines the boundaries inside which the virtual roads develop in meters.
The test subject. The driving agent that implements the lane-keeping system under test. The TRAVEL dataset contains data generated testing the BeamNG.AI and the end-to-end Dave2 systems.
The test generator. The algorithm that generated the test cases. The TRAVEL dataset contains data obtained using various algorithms, ranging from naive and advanced random generators to complex evolutionary algorithms, for generating tests.
The speed limit. The maximum speed at which the driving agent under test can travel.
Out of Bound (OOB) tolerance. The test cases' oracle that defines the tolerable amount of the ego-car that can lie outside the lane boundaries. This parameter ranges between 0.0 and 1.0. In the former case, a test failure triggers as soon as any part of the ego-vehicle goes out of the lane boundary; in the latter case, a test failure triggers only if the entire body of the ego-car falls outside the lane.
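The OOB tolerance oracle can be sketched as a small predicate. This is an illustrative reading of the description above, not the dataset's actual implementation: it assumes the OOB percentage is a fraction between 0.0 and 1.0, and the function name is hypothetical.

```python
def is_oob_failure(oob_percentage: float, tolerance: float) -> bool:
    """Return True if the ego-car is far enough outside the lane to fail the test.

    Assumption: both values are fractions in [0.0, 1.0], where oob_percentage is
    the share of the ego-car lying outside the lane boundaries.
    """
    if not 0.0 <= tolerance <= 1.0:
        raise ValueError("tolerance must be between 0.0 and 1.0")
    if not 0.0 <= oob_percentage <= 1.0:
        raise ValueError("oob_percentage must be between 0.0 and 1.0")
    # Tolerance 0.0: any part outside the lane fails the test.
    # Tolerance 1.0: only a car that is entirely outside the lane fails the test.
    return oob_percentage > 0.0 and oob_percentage >= tolerance


print(is_oob_failure(0.01, 0.0))  # any part outside the lane, strictest setting
print(is_oob_failure(0.99, 1.0))  # almost entirely outside, most lenient setting
```

How the boundary case (OOB percentage exactly equal to the tolerance) is treated is a design choice made here for illustration; the source does not specify it.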
Experiment Statistics
The generation_stats.csv contains statistics about the test generation, including:
Total number of generated tests. The number of tests generated during an experiment. This number is broken down into the number of valid tests and invalid tests. Valid tests contain virtual roads that do not self-intersect and contain turns that are not too sharp.
Test outcome. The test outcome contains the number of passed tests, failed tests, and tests in error. Passed and failed tests are defined by the OOB Tolerance and an additional (implicit) oracle that checks whether the ego-car is moving or standing. Tests that did not pass because of other errors (e.g., the simulator crashed) are reported in a separate category.
The TRAVEL dataset also contains statistics about the failed tests, including the overall number of failed tests (total oob) and its breakdown into OOB that happened while driving left or right. Further statistics about the diversity (i.e., sparseness) of the failures are also reported.
Test Cases and Executions
Each test..json contains information about a test case and, if the test case is valid, the data observed during its execution as a driving simulation.
The data about the test case definition include:
The road points. The list of points in a 2D space that identifies the center of the virtual road, and their interpolation using cubic splines (interpolated_points)
The test ID. The unique identifier of the test in the experiment.
Validity flag and explanation. A flag that indicates whether the test is valid or not, and a brief message describing why the test is not considered valid (e.g., the road contains sharp turns or the road self-intersects).
The test data are organized according to the following JSON Schema and can be interpreted as RoadTest objects provided by the tests_generation.py module.
{
  "type": "object",
  "properties": {
    "id": { "type": "integer" },
    "is_valid": { "type": "boolean" },
    "validation_message": { "type": "string" },
    "road_points": {
      "type": "array",
      "items": { "$ref": "schemas/pair" }
    },
    "interpolated_points": {
      "type": "array",
      "items": { "$ref": "schemas/pair" }
    },
    "test_outcome": { "type": "string" },
    "description": { "type": "string" },
    "execution_data": {
      "type": "array",
      "items": { "$ref": "schemas/simulationdata" }
    }
  },
  "required": [ "id", "is_valid", "validation_message", "road_points", "interpolated_points" ]
}
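As a rough illustration of how a test file following this schema might be read, the sketch below parses a JSON document and checks only the required fields. It does not use the tests_generation.py module, the example values are made up, and it assumes each point is stored as a two-element list.

```python
import json

# Required fields as listed in the schema above.
REQUIRED_FIELDS = ("id", "is_valid", "validation_message",
                   "road_points", "interpolated_points")


def load_test_case(text: str) -> dict:
    """Parse a test case from JSON text and check that required fields exist."""
    test = json.loads(text)
    missing = [name for name in REQUIRED_FIELDS if name not in test]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return test


# Hypothetical, hand-written test case (not taken from the dataset).
example_text = json.dumps({
    "id": 1,
    "is_valid": True,
    "validation_message": "",
    "road_points": [[10.0, 20.0], [30.0, 40.0]],
    "interpolated_points": [[10.0, 20.0], [20.0, 30.0], [30.0, 40.0]],
})

test_case = load_test_case(example_text)
print(test_case["id"], test_case["is_valid"], len(test_case["road_points"]))
```

Optional fields such as test_outcome and execution_data are only present for executed tests, so code reading them should use dict.get or check for their presence first.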
Finally, the execution data contain a list of timestamped state information recorded by the driving simulation. State information is collected at constant frequency and includes absolute position, rotation, and velocity of the ego-car, its speed in Km/h, and control inputs from the driving agent (steering, throttle, and braking). Additionally, execution data contain OOB-related data, such as the lateral distance between the car and the lane center and the OOB percentage (i.e., how much the car is outside the lane).
The simulation data adhere to the following (simplified) JSON Schema and can be interpreted as Python objects using the simulation_data.py module.
{
  "$id": "schemas/simulationdata",
  "type": "object",
  "properties": {
    "timer": { "type": "number" },
    "pos": {
      "type": "array",
      "items": { "$ref": "schemas/triple" }
    },
    "vel": {
      "type": "array",
      "items": { "$ref": "schemas/triple" }
    },
    "vel_kmh": { "type": "number" },
    "steering": { "type": "number" },
    "brake": { "type": "number" },
    "throttle": { "type": "number" },
    "is_oob": { "type": "number" },
    "oob_percentage": { "type": "number" }
  },
  "required": [ "timer", "pos", "vel", "vel_kmh", "steering", "brake", "throttle", "is_oob", "oob_percentage" ]
}
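A list of such state records can be summarized with a few lines of Python. The sketch below is independent of the simulation_data.py module; the field names come from the schema above, while the function name and the sample values are hypothetical.

```python
from typing import Dict, List


def summarize_execution(execution_data: List[Dict]) -> Dict[str, float]:
    """Compute simple aggregates over the timestamped states of one execution."""
    if not execution_data:
        raise ValueError("execution_data is empty")
    return {
        # Difference between the last and first timer values (units as recorded).
        "duration": execution_data[-1]["timer"] - execution_data[0]["timer"],
        "max_speed_kmh": max(state["vel_kmh"] for state in execution_data),
        "max_oob_percentage": max(state["oob_percentage"] for state in execution_data),
    }


# Hypothetical states containing only the fields used by the summary.
states = [
    {"timer": 0.0, "vel_kmh": 0.0, "oob_percentage": 0.0},
    {"timer": 0.5, "vel_kmh": 35.0, "oob_percentage": 0.1},
    {"timer": 1.0, "vel_kmh": 62.5, "oob_percentage": 0.4},
]
summary = summarize_execution(states)
print(summary)
```

The maximum OOB percentage is the value one would compare against the configured OOB tolerance when re-checking a test outcome.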
Dataset Content
The TRAVEL dataset is a lively initiative so the content of the dataset is subject to change. Currently, the dataset contains the data collected during the SBST CPS tool competition, and data collected in the context of our recent work on test selection (SDC-Scissor work and tool) and test prioritization (automated test cases prioritization work for SDCs).
SBST CPS Tool Competition Data
The data collected during the SBST CPS tool competition are stored inside data/competition.tar.gz. The file contains the test cases generated by Deeper, Frenetic, AdaFrenetic, and Swat, the open-source test generators submitted to the competition and executed against BeamNG.AI with an aggression factor of 0.7 (i.e., conservative driver).
Name | Map Size (m x m) | Max Speed (Km/h) | Budget (h) | OOB Tolerance (%) | Test Subject
DEFAULT | 200 × 200 | 120 | 5 (real time) | 0.95 | BeamNG.AI - 0.7
SBST | 200 × 200 | 70 | 2 (real time) | 0.5 | BeamNG.AI - 0.7
Specifically, the TRAVEL dataset contains 8 repetitions for each of the above configurations for each test generator totaling 64 experiments.
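The experiment count follows directly from the Cartesian product of generators, configurations, and repetitions, as this short sketch shows (the generator and configuration names are those listed above; the tuple layout is only illustrative).

```python
from itertools import product

generators = ["Deeper", "Frenetic", "AdaFrenetic", "Swat"]
configurations = ["DEFAULT", "SBST"]
repetitions = range(1, 9)  # 8 repetitions per generator and configuration

# One experiment per (generator, configuration, repetition) combination.
experiments = list(product(generators, configurations, repetitions))
print(len(experiments))  # 4 generators x 2 configurations x 8 repetitions = 64
```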
SDC Scissor
With SDC-Scissor we collected data based on the Frenetic test generator. The data is stored inside data/sdc-scissor.tar.gz. The following table summarizes the used parameters.
Name | Map Size (m x m) | Max Speed (Km/h) | Budget (h) | OOB Tolerance (%) | Test Subject
SDC-SCISSOR | 200 × 200 | 120 | 16 (real time) | 0.5 | BeamNG.AI - 1.5
The dataset contains 9 experiments with the above configuration. To generate your own data with SDC-Scissor, follow the instructions in its repository.
Dataset Statistics
Here is an overview of the TRAVEL dataset: generated tests, executed tests, and faults found by all the test generators grouped by experiment configuration. Some 25,845 test cases are generated by running 4 test generators 8 times in 2 configurations using the SBST CPS Tool Competition code pipeline (SBST in the table). We ran the test generators for 5 hours, allowing the ego-car a generous speed limit (120 Km/h) and defining a high OOB tolerance (i.e., 0.95), and we also ran the test generators using a smaller generation budget (i.e., 2 hours) and speed limit (i.e., 70 Km/h) while setting the OOB tolerance to a lower value (i.e., 0.85). We also collected some 5,971 additional tests with SDC-Scissor (SDC-Scissor in the table) by running it 9 times for 16 hours using Frenetic as a test generator and defining a more realistic OOB tolerance (i.e., 0.50).
Generating new Data
Generating new data, i.e., test cases, can be done using the SBST CPS Tool Competition pipeline and the driving simulator BeamNG.tech.
Extensive instructions on how to install both tools are reported in the SBST CPS Tool Competition pipeline documentation.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set contains the result of applying the NIST Statistical Test Suite on accelerometer data processed for random number generator seeding. The NIST Statistical Test Suite can be downloaded from: http://csrc.nist.gov/groups/ST/toolkit/rng/documentation_software.html. The format of the output is explained in http://csrc.nist.gov/publications/nistpubs/800-22-rev1a/SP800-22rev1a.pdf.
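For readers unfamiliar with the suite, the first test it runs is the frequency (monobit) test, which checks whether ones and zeros are roughly balanced. The sketch below reimplements only that single test for illustration; it is not the NIST software, and the function name is hypothetical.

```python
import math


def monobit_p_value(bits: str) -> float:
    """P-value of the frequency (monobit) test for a string of '0'/'1' characters."""
    n = len(bits)
    if n == 0 or set(bits) - {"0", "1"}:
        raise ValueError("bits must be a non-empty string of '0' and '1'")
    # Map each 1 to +1 and each 0 to -1, then sum.
    s_n = sum(1 if bit == "1" else -1 for bit in bits)
    s_obs = abs(s_n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


# Short illustrative sequence with six ones and four zeros.
p_value = monobit_p_value("1011010101")
print(f"p-value: {p_value:.6f}")
```

A p-value below the chosen significance level (0.01 is commonly used with the suite) would indicate that the sequence is unlikely to be random with respect to this test; real inputs should be far longer than ten bits.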
Static torque, no load, constant speed, and sinusoidal oscillation test data for a 10 hp, 300 rpm magnetically-geared generator prototype using either an adjustable load bank for a fixed resistance or an output power converter.
https://www.archivemarketresearch.com/privacy-policy
The global database testing tool market is anticipated to experience substantial growth in the coming years, driven by factors such as the increasing adoption of cloud-based technologies, the rising demand for data quality and accuracy, and the growing complexity of database systems. The market is expected to reach a value of USD 1,542.4 million by 2033, expanding at a CAGR of 7.5% during the forecast period of 2023-2033. Key players in the market include Apache JMeter, DbFit, SQLMap, Mockup Data, SQL Test, NoSQLUnit, Orion, ApexSQL, QuerySurge, DBUnit, DataFactory, DTM Data Generator, Oracle, SeLite, SLOB, and others. The North American region is anticipated to hold a significant share of the database testing tool market, followed by Europe and Asia Pacific. The increasing adoption of cloud-based database testing services, the presence of key market players, and the growing demand for data testing and validation are driving the market growth in North America. Asia Pacific, on the other hand, is expected to experience the highest growth rate due to the rapidly increasing IT spending, the emergence of new technologies, and the growing number of businesses investing in data quality management solutions.
https://www.datainsightsmarket.com/privacy-policy
The market for generator load testing is expected to grow from USD XXX million in 2023 to USD XXX million in 2033, at a CAGR of XX% during the forecast period. The market is driven by the increasing demand for reliable and efficient power supply, particularly in critical applications such as hospitals, data centers, and manufacturing facilities. The rising adoption of renewable energy sources, such as solar and wind power, is also driving the demand for generator load testing, as these systems require backup generators to ensure a continuous power supply. The market is highly competitive, with a large number of global and regional players offering a wide range of generator load testing products and services. The key players in the market include JS Power, Alpine Power Systems, GenCare, TriStar Power Solutions, Vital Power, Gen-Tech, Weld Power Generator, ASNE, Realpower, Curtis Power Solutions, Wolter Inc, Duthie Power Services, Generator Services, Yip Shing Diesel Engineering Co., Ltd., Micro-Air, Inc., Cummins Inc., Carelabz, Reactive Generators, CD & Power, and Northern Generator. These players are investing heavily in research and development to introduce innovative products and technologies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The datasets were used to validate and test the data pipeline deployment following the RADON approach. The dataset contains temperature and humidity sensor readings of a particular day, which are synthetically generated using a data generator and are stored as JSON files to validate and test (performance/load testing) the data pipeline components.
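A generator of this kind can be approximated in a few lines. The sketch below is not the generator used for the dataset: the field names, value ranges, and sampling interval are assumptions chosen only to show the general shape of synthetic sensor readings serialized as JSON.

```python
import json
import random
from datetime import datetime, timedelta
from typing import Dict, List


def generate_readings(day: str, count: int, seed: int = 0) -> List[Dict]:
    """Generate evenly spaced synthetic temperature/humidity readings for one day."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    start = datetime.fromisoformat(day)
    step = timedelta(days=1) / count
    return [
        {
            "timestamp": (start + i * step).isoformat(),
            "temperature_c": round(rng.uniform(15.0, 30.0), 2),  # assumed range
            "humidity_pct": round(rng.uniform(30.0, 70.0), 2),   # assumed range
        }
        for i in range(count)
    ]


readings = generate_readings("2021-01-01", count=24)
payload = json.dumps(readings, indent=2)
print(payload[:200])
```

Writing one such payload per file, at varying sizes, gives a simple input set for load-testing pipeline components.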
https://www.archivemarketresearch.com/privacy-policy
The global generator test load bank market is experiencing robust growth, driven by increasing demand for reliable power generation and stringent testing regulations across various industries. The market size in 2025 is estimated at $1.5 billion, exhibiting a Compound Annual Growth Rate (CAGR) of 7% from 2025 to 2033. This growth is fueled by several key factors. The expansion of data centers, the rising adoption of renewable energy sources requiring rigorous testing, and the growing need for efficient power generation in industrial sectors like shipping and power plants are major contributors. Furthermore, advancements in load bank technology, including the development of more compact, efficient, and digitally controlled units, are enhancing market appeal. The adoption of resistive-reactive load banks, offering greater flexibility and accuracy in testing, is also driving market expansion. Regional growth is expected to be diverse, with North America and Asia-Pacific leading the charge due to strong economic growth and substantial investments in infrastructure. However, certain restraints exist. High initial investment costs associated with advanced load bank systems might hinder adoption, particularly among smaller enterprises. Additionally, fluctuations in raw material prices and the complexity of integrating these systems into existing infrastructure pose challenges. Nevertheless, ongoing technological improvements and increasing awareness of the crucial role of generator testing in ensuring power reliability are projected to mitigate these obstacles. The market segmentation reveals significant opportunities in various applications, notably data center generator testing and the growing renewable energy sector. Key players are focusing on product innovation, strategic partnerships, and expansion into new geographical markets to strengthen their market position and capitalize on this growth trajectory. 
The market is poised for continued expansion, with significant potential for growth across diverse geographical regions and application segments.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Insider Threat Test Dataset is a collection of synthetic insider threat test datasets that provide both background and malicious actor synthetic data.
The CERT Division, in partnership with ExactData, LLC, and under sponsorship from DARPA I2O, generated a collection of synthetic insider threat test datasets. These datasets provide both synthetic background data and data from synthetic malicious actors.
For more background on this data, please see the paper, Bridging the Gap: A Pragmatic Approach to Generating Insider Threat Data.
Datasets are organized according to the data generator release that created them. Most releases include multiple datasets (e.g., r3.1 and r3.2). Generally, later releases include a superset of the data generation functionality of earlier releases. Each dataset file contains a readme file that provides detailed notes about the features of that release.
The answer key file answers.tar.bz2 contains the details of the malicious activity included in each dataset, including descriptions of the scenarios enacted and the identifiers of the synthetic users involved.
https://www.datainsightsmarket.com/privacy-policy
The high-speed digital signal generator market is experiencing robust growth, driven by the increasing demand for high-bandwidth communication systems in sectors like 5G, data centers, and automotive. The market's expansion is fueled by the need for accurate and reliable signal generation for testing high-speed digital designs and components. Advancements in technology, such as the development of higher frequency generators and improved signal fidelity, are further propelling market growth. Furthermore, the rising adoption of advanced testing techniques and the growing complexity of electronic devices necessitates the use of sophisticated high-speed digital signal generators, thereby increasing market demand. We estimate the market size in 2025 to be approximately $1.5 billion, based on observed growth trends in related sectors and expert analysis. A compound annual growth rate (CAGR) of around 8% is projected from 2025 to 2033, indicating significant market expansion in the coming years. Major players such as Keysight Technologies, Rohde & Schwarz, and Tektronix dominate the market, leveraging their strong brand reputation and technological expertise. However, the market is also witnessing the emergence of smaller companies specializing in niche applications and offering innovative solutions. The competitive landscape is marked by ongoing product development, strategic partnerships, and mergers and acquisitions. While the high cost of these advanced generators can be a restraining factor for some users, the long-term benefits in terms of improved testing accuracy and efficiency outweigh this consideration, ultimately driving market adoption. Regional growth is expected to vary, with North America and Asia-Pacific likely leading due to the concentration of technological advancements and strong demand from various industries in these regions.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
The code, strainenergy_v4_1.m, was used for generating and processing the dataset for load-displacement and stress-strain. Matlab version 6.1 was used for running the code. The specific variables of the parameters used to generate the current dataset are as follows:
• ip1: input file containing the load-displacement data
• diameter: fascicle diameter
• laststrainpt: an estimate of the strain at rupture, r
• orderpoly: an integral value from 2-7 which represents the order of the polynomial for fitting to the data from O to q
• loadat1percent: y/n; to determine the value of the load (set at 1% of the maximum load) at which the specimen became taut. ‘y’ denotes yes; ‘n’ denotes no.
The logfile.txt contains the parameters used for deriving the values of the respective mechanical properties.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Developing software test code can be as expensive as, or more expensive than, developing software production code. Commonly, developers use automated unit test generators to speed up software testing. The purpose of such tools is to shorten production time without decreasing code quality. Nonetheless, unit tests usually do not have a quality check layer above the testing code, which can make it hard to guarantee the quality of the generated tests. An emerging strategy to verify test quality is to analyze the presence of test smells in software test code. Test smells are characteristics in the test code that possibly indicate weaknesses in test design and implementation. The presence of test smells in unit test code could be used as an indicator of unit test quality. In this paper, we present an empirical study that analyzes the quality of unit test code generated by automated test tools. We compare the tests generated by two tools (Randoop and EvoSuite) with the existing unit test suites of open-source software projects. We analyzed the unit test code of twenty-one open-source Java projects and detected the presence of nineteen types of test smells. The results indicated significant differences in unit test quality when comparing data from both automated unit test generators and existing unit test suites.
https://www.marketreportanalytics.com/privacy-policy
The global generator testing and certification services market is experiencing robust growth, driven by the increasing demand for reliable power generation across diverse sectors. Stringent regulatory requirements for safety and performance, particularly in renewable energy (wind turbines) and backup power (diesel generators), are key catalysts. The market is segmented by application (diesel generators, wind turbines, and others) and service type (testing and certification). Diesel generators continue to dominate the application segment due to their widespread use in industrial settings and emergency power systems. However, the burgeoning renewable energy sector, particularly wind power, is fueling significant demand for testing and certification services related to wind turbine generators. This shift reflects a broader industry trend toward cleaner energy sources and a greater focus on grid stability and reliability. The competitive landscape comprises established players like Intertek, UL Solutions, TÜV Rheinland, and TÜV SÜD, alongside specialized service providers focusing on specific generator types or regions. These companies are investing in advanced testing technologies and expanding their geographical reach to meet the growing global demand. Future growth will likely be shaped by technological advancements in testing methodologies, increasing adoption of smart grids, and the continued expansion of renewable energy infrastructure. The market is expected to see sustained growth over the forecast period, driven by the factors outlined above. A projected CAGR (assuming a missing CAGR of 7% based on industry average growth for similar service markets) suggests a substantial increase in market value over the next decade. The North American and European markets currently hold significant market share, owing to well-established regulatory frameworks and a strong presence of key players. 
However, emerging economies in Asia-Pacific, particularly China and India, are witnessing rapid growth due to increasing infrastructure development and industrialization. This regional expansion presents lucrative opportunities for existing players and new entrants alike. The market’s growth trajectory is expected to be influenced by factors such as government policies promoting renewable energy integration, advancements in testing technologies that enhance efficiency and accuracy, and the growing adoption of digital solutions for remote testing and data management. Challenges include maintaining rigorous quality standards across diverse geographical locations and adapting to evolving technological advancements within the power generation industry.
Generating Realistic Test Datasets for Duplicate Detection at Scale Using Historical Voter Data
Attribution-NonCommercial-NoDerivs 2.5 (CC BY-NC-ND 2.5) https://creativecommons.org/licenses/by-nc-nd/2.5/
License information was derived automatically
NADA (Not-A-Database) is an easy-to-use geometric shape data generator that allows users to define non-uniform multivariate parameter distributions to test novel methodologies. The full open-source package is provided at GIT:NA_DAtabase. See Technical Report for details on how to use the provided package.
This database includes 3 repositories:
Each image can be used for classification (shape/color) or regression (radius/area) tasks.
All datasets can be modified and adapted to the user's research question using the included open source data generator.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains 350 features engineered from the phasor measurements (PMU-type) signals from the IEEE New England 39-bus power system test case network, which are generated from the 9360 systematic MATLAB®/Simulink electro-mechanical transients simulations. It was prepared to serve as a convenient and open database for experimenting with different types of machine learning techniques for transient stability assessment (TSA) of electrical power systems.
Different load and generation levels of the New England 39-bus benchmark power system were systematically covered, as well as all three major types of short-circuit events (three-phase, two-phase and single-phase faults) in all parts of the network. The consumed power of the network was set to 80%, 90%, 100%, 110% and 120% of the basic system load levels. The short-circuits were located on the busbar or on the transmission line (TL). When they were located on a TL, it was assumed that they can occur at 20%, 40%, 60%, and 80% of the line length. Features were obtained directly from the time-domain signals at the pickup time (pre-fault value) and at the trip time (post-fault value) of the associated distance protection relays.
This is a stochastic dataset of 3120 cases, created from the population of 9360 systematic simulations, which features a statistical distribution of different fault types, as follows: single-phase (70%), double-phase (20%) and three-phase faults (10%). It also features a class imbalance, with less than 20% of cases belonging to the unstable class. The dataset is provided as a compressed CSV file.
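One way to draw such a sample is stratified sampling by fault type. The sketch below is not the authors' procedure: it assumes the 9360 simulations are split evenly across the three fault types, and the record layout and function name are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List

# Target shares of fault types stated above.
TARGET_SHARES = {"single-phase": 0.70, "double-phase": 0.20, "three-phase": 0.10}


def stratified_sample(cases: List[Dict], total: int,
                      shares: Dict[str, float], seed: int = 0) -> List[Dict]:
    """Sample `total` cases without replacement so fault types match `shares`."""
    rng = random.Random(seed)
    sample: List[Dict] = []
    for fault_type, share in shares.items():
        pool = [case for case in cases if case["fault_type"] == fault_type]
        sample.extend(rng.sample(pool, round(total * share)))
    rng.shuffle(sample)
    return sample


# Hypothetical population: 9360 simulations, assumed evenly split by fault type.
population = [
    {"id": i, "fault_type": fault_type}
    for i, fault_type in enumerate(list(TARGET_SHARES) * 3120)
]
sample = stratified_sample(population, 3120, TARGET_SHARES)
counts = Counter(case["fault_type"] for case in sample)
print(counts)
```

With these shares, 3120 cases break down into 2184 single-phase, 624 double-phase, and 312 three-phase faults.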
List of feature names in the dataset:
WmGx - rotor speed for each generator Gx, from G1 to G10,
DThetaGx - rotor angle deviation for each generator Gx, from G1 to G10,
ThetaGx - rotor mechanical angle for each generator Gx, from G1 to G10,
VtGx - stator voltage for each generator Gx, from G1 to G10,
IdGx - stator d-component current for each generator Gx, from G1 to G10,
IqGx - stator q-component current for each generator Gx, from G1 to G10,
LAfvGx - pre-fault power load angle for each generator Gx, from G1 to G10,
LAlvGx - post-fault power load angle for each generator Gx, from G1 to G10,
PfvGx - pre-fault value of the generator active power for each generator Gx, from G1 to G10,
PlvGx - post-fault value of the generator active power for each generator Gx, from G1 to G10,
QfvGx - pre-fault value of the generator reactive power for each generator Gx, from G1 to G10,
QlvGx - post-fault value of the generator reactive power for each generator Gx, from G1 to G10,
VAfvBx - pre-fault bus voltage magnitude in phase A for each bus Bx, from B1 to B39,
VBfvBx - pre-fault bus voltage magnitude in phase B for each bus Bx, from B1 to B39,
VCfvBx - pre-fault bus voltage magnitude in phase C for each bus Bx, from B1 to B39,
VAlvBx - post-fault bus voltage magnitude in phase A for each bus Bx, from B1 to B39,
VBlvBx - post-fault bus voltage magnitude in phase B for each bus Bx, from B1 to B39,
VClvBx - post-fault bus voltage magnitude in phase C for each bus Bx, from B1 to B39,
Stability - binary indicator (0/1) that determines if the power system was stable or unstable (0 - stable, 1 - unstable); this is the label variable.
License: Creative Commons CC-BY.
Disclaimer: This dataset is provided "as is", without any warranties of any kind.