65 datasets found

Data from: test-data-generator
huggingface.co
Updated Mar 26, 2025
Cite
Francisco Theodoro Arantes Florencio (2025). test-data-generator [Dataset]. https://huggingface.co/datasets/franciscoflorencio/test-data-generator
Dataset updated
Mar 26, 2025
Authors
Francisco Theodoro Arantes Florencio
Description
Dataset Card for test-data-generator

This dataset has been created with distilabel.

Dataset Summary

This dataset contains a pipeline.yaml file that can be used to reproduce the pipeline that generated it, using the distilabel CLI:

    distilabel pipeline run --config "https://huggingface.co/datasets/franciscoflorencio/test-data-generator/raw/main/pipeline.yaml"

or explore the configuration: distilabel pipeline info --config… See the full description on the dataset page: https://huggingface.co/datasets/franciscoflorencio/test-data-generator.
Dataset of article: Synthetic Datasets Generator for Testing Information...
ieee-dataport.org
Updated Mar 13, 2020
Cite
Sandro Mendonça (2020). Dataset of article: Synthetic Datasets Generator for Testing Information Visualization and Machine Learning Techniques and Tools [Dataset]. http://doi.org/10.21227/5aeq-rr34
Unique identifier
https://doi.org/10.21227/5aeq-rr34
Dataset updated
Mar 13, 2020
Dataset provided by
IEEE Dataport
Authors
Sandro Mendonça
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset used in the article entitled 'Synthetic Datasets Generator for Testing Information Visualization and Machine Learning Techniques and Tools'. These datasets can be used to test several characteristics in machine learning and data processing algorithms.
Data from: Reliability Analysis of Random Telegraph Noisebased True Random...
data.niaid.nih.gov
zenodo.org
Updated Sep 30, 2024
Cite
Ranjan, Alok (2024). Reliability Analysis of Random Telegraph Noisebased True Random Number Generators [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13169457
Dataset updated
Sep 30, 2024
Dataset provided by
Thamankar, Dr. Ramesh
Pey, Kin Leong
Raghavan, Nagarajan
PUGLISI, Francesco Maria
O'Shea, Sean J.
Zanotti, Tommaso
Ranjan, Alok
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Repository author: Tommaso Zanotti
Email: tommaso.zanotti@unimore.it or francescomaria.puglisi@unimore.it
Version: v1.0

This repository includes MATLAB files and datasets related to the IEEE IIRW 2023 conference proceeding: T. Zanotti et al., "Reliability Analysis of Random Telegraph Noise-based True Random Number Generators," 2023 IEEE International Integrated Reliability Workshop (IIRW), South Lake Tahoe, CA, USA, 2023, pp. 1-6, doi: 10.1109/IIRW59383.2023.10477697

The repository includes:

The data of the bitmaps reported in Fig. 4, i.e., the results of the simulation of the ideal RTN-based TRNG circuit for different reseeding strategies. To load and plot the data use the "plot_bitmaps.mat" file.

The results of the circuit simulations considering the evolving RTN from the HfO2 device shown in Fig. 7, for two Rgain values. Specifically, the data is contained in the following csv files:

"Sim_TRNG_Circuit_HfO2_3_20s_Vth_210m_no_Noise_Ibias_11n.csv" (lower Rgain)

"Sim_TRNG_Circuit_HfO2_3_20s_Vth_210m_no_Noise_Ibias_4_8n.csv" (higher Rgain)

The result of the circuit simulations considering the temporary RTN from the SiO2 device shown in Fig. 8. Specifically, the data is contained in the following csv files:

"Sim_TRNG_Circuit_SiO2_1c_300s_Vth_180m_Noise_Ibias_1.5n.csv" (ref. Rgain)

"Sim_TRNG_Circuit_SiO2_1c_100s_200s_Vth_180m_Noise_Ibias_1.575n.csv" (lower Rgain)

"Sim_TRNG_Circuit_SiO2_1c_100s_200s_Vth_180m_Noise_Ibias_1.425n.csv" (higher Rgain)
Microsoft excel database containing all the simulated (10 sets) and...
figshare.com
plos.figshare.com
xlsx
Updated Jun 3, 2023
Cite
Hamed Ahmadi (2023). Microsoft excel database containing all the simulated (10 sets) and experimental data used in this study. [Dataset]. http://doi.org/10.1371/journal.pone.0187292.s001
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0187292.s001
Dataset updated
Jun 3, 2023
Dataset provided by
PLOS ONE
Authors
Hamed Ahmadi
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Excel sheets in order: The sheet entitled “Hens Original Data” contains the results of an experiment conducted to study the response of laying hens, during the initial phase of egg production, to different intakes of dietary threonine. The sheet entitled “Simulated data & fitting values” contains the 10 simulated data sets, which were generated using a standard random number generator procedure. The predicted values obtained by the new three-parameter and conventional four-parameter logistic models also appear in this sheet. (XLSX)
Automated Generation of Realistic Test Inputs for Web APIs
data.niaid.nih.gov
zenodo.org
Updated May 5, 2021
Cite
Alonso Valenzuela, Juan Carlos (2021). Automated Generation of Realistic Test Inputs for Web APIs [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4736859
Dataset updated
May 5, 2021
Dataset authored and provided by
Alonso Valenzuela, Juan Carlos
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Testing web APIs automatically requires generating input data values such as addresses, coordinates or country codes. Generating meaningful values for these types of parameters randomly is rarely feasible, which is a major obstacle for current test case generation approaches. In this paper, we present ARTE, the first semantic-based approach for the Automated generation of Realistic TEst inputs for web APIs. Specifically, ARTE leverages the specification of the API under test to extract semantically related values for every parameter by applying knowledge extraction techniques. Our approach has been integrated into RESTest, a state-of-the-art tool for API testing, achieving an unprecedented level of automation that allows it to generate up to 100% more valid API calls than existing fuzzing techniques (30% on average). Evaluation results on a set of 26 real-world APIs show that ARTE can generate realistic inputs for 7 out of every 10 parameters, outperforming the results obtained by related approaches.
Data from: Advanced Direct-Drive Generator for Improved Availability of...
catalog.data.gov
mhkdr.openei.org
Updated Jan 20, 2025
Cite
ABB Inc. (2025). Advanced Direct-Drive Generator for Improved Availability of Oscillating Wave Surge Converter Power Generation Systems: 10hp 30rpm Radial-Flux Magnetically Geared Generator Test Data [Dataset]. https://catalog.data.gov/dataset/advanced-direct-drive-generator-for-improved-availability-of-oscillating-wave-surge-conver-8124a
Dataset updated
Jan 20, 2025
Dataset provided by
ABB Inc.
Description
Static torque, no load, constant speed, and sinusoidal oscillation test data for a 10hp, 300rpm magnetically-geared generator prototype using either an adjustable load bank for a fixed resistance or an output power converter.
Data pipeline Validation And Load Testing using Multiple JSON Files
data.niaid.nih.gov
Updated Mar 26, 2021
Cite
Afsana Khan (2021). Data pipeline Validation And Load Testing using Multiple JSON Files [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4636789
Dataset updated
Mar 26, 2021
Dataset provided by
Pelle Jakovits
Mainak Adhikari
Afsana Khan
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The datasets were used to validate and test the data pipeline deployment following the RADON approach. The dataset contains temperature and humidity sensor readings of a particular day, which are synthetically generated using a data generator and are stored as JSON files to validate and test (performance/load testing) the data pipeline components.
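As a rough illustration of how such synthetic sensor data could be produced, the following Python sketch generates temperature and humidity readings for one day and splits them across several JSON files. It is not the generator used for this dataset: the field names, value ranges, and file naming scheme are all hypothetical assumptions.

```python
import json
import random
from datetime import datetime, timedelta
from pathlib import Path


def generate_readings(day, n_readings, seed=0):
    """Return synthetic temperature/humidity readings spread evenly over one day."""
    rng = random.Random(seed)  # fixed seed keeps the output reproducible
    step = timedelta(days=1) / n_readings
    readings = []
    for i in range(n_readings):
        readings.append({
            # All field names and value ranges below are hypothetical.
            "timestamp": (day + i * step).isoformat(),
            "temperature_c": round(rng.uniform(15.0, 30.0), 2),
            "humidity_pct": round(rng.uniform(30.0, 90.0), 2),
        })
    return readings


def write_json_files(readings, out_dir, per_file):
    """Split the readings into several JSON files, one list of readings per file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(readings), per_file):
        path = out_dir / f"readings_{start // per_file:04d}.json"
        path.write_text(json.dumps(readings[start:start + per_file], indent=2))
        paths.append(path)
    return paths
```

Raising `n_readings` or lowering `per_file` produces more files, which is one simple way to scale the input volume for a load test.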
Data from: Advanced Direct-Drive Generator for Improved Availability of...
catalog.data.gov
mhkdr.openei.org
Updated Jan 20, 2025
Cite
ABB Inc. (2025). Advanced Direct-Drive Generator for Improved Availability of Oscillating Wave Surge Converter Power Generation Systems: 1hp 300rpm Axial-Flux Magnetically Geared Generator Test Data [Dataset]. https://catalog.data.gov/dataset/advanced-direct-drive-generator-for-improved-availability-of-oscillating-wave-surge-conver-b502b
Dataset updated
Jan 20, 2025
Dataset provided by
ABB Inc.
Description
Static torque and no load test data for a 1hp, 300rpm axial-flux magnetically geared generator prototype developed by Texas A&M EMPE Lab.
Data and code for: Generation and applications of simulated datasets to...
data.niaid.nih.gov
datadryad.org
zip
Updated Mar 10, 2023
Cite
Matthew Silk; Olivier Gimenez (2023). Data and code for: Generation and applications of simulated datasets to integrate social network and demographic analyses [Dataset]. http://doi.org/10.5061/dryad.m0cfxpp7s
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.m0cfxpp7s
Dataset updated
Mar 10, 2023
Dataset provided by
Centre d'Ecologie Fonctionnelle et Evolutive
Authors
Matthew Silk; Olivier Gimenez
License
https://spdx.org/licenses/CC0-1.0.html
Description
Social networks are tied to population dynamics; interactions are driven by population density and demographic structure, while social relationships can be key determinants of survival and reproductive success. However, difficulties integrating models used in demography and network analysis have limited research at this interface. We introduce the R package genNetDem for simulating integrated network-demographic datasets. It can be used to create longitudinal social networks and/or capture-recapture datasets with known properties. It incorporates the ability to generate populations and their social networks, generate grouping events using these networks, simulate social network effects on individual survival, and flexibly sample these longitudinal datasets of social associations. By generating co-capture data with known statistical relationships it provides functionality for methodological research. We demonstrate its use with case studies testing how imputation and sampling design influence the success of adding network traits to conventional Cormack-Jolly-Seber (CJS) models. We show that incorporating social network effects in CJS models generates qualitatively accurate results, but with downward-biased parameter estimates when network position influences survival. Biases are greater when fewer interactions are sampled or fewer individuals are observed in each interaction. While our results indicate the potential of incorporating social effects within demographic models, they show that imputing missing network measures alone is insufficient to accurately estimate social effects on survival, pointing to the importance of incorporating network imputation approaches. genNetDem provides a flexible tool to aid these methodological advancements and help researchers test other sampling considerations in social network studies.

Methods

The dataset and code stored here are for Case Studies 1 and 2 in the paper. Datasets were generated using simulations in R. Here we provide 1) the R code used for the simulations; 2) the simulation outputs (as .RDS files); and 3) the R code to analyse simulation outputs and generate the tables and figures in the paper.
Data from: RANEXP: experimental random number generator package
elsevier.digitalcommonsdata.com
search.datacite.org
Updated Jan 1, 1994
Cite
Michael Hennecke (1994). RANEXP: experimental random number generator package [Dataset]. http://doi.org/10.17632/pty366sbwg.1
Unique identifier
https://doi.org/10.17632/pty366sbwg.1
Dataset updated
Jan 1, 1994
Authors
Michael Hennecke
License
https://www.elsevier.com/about/policies/open-access-licenses/elsevier-user-license/cpc-license/
Description
Abstract

A library containing highly portable implementations of most algorithms for (pseudo)random number generation has been developed, which may be used in any area of simulation that requires random number generators. Each generator is freely configurable by the user, so the RANEXP library is particularly well-suited for applications requiring different random number generators. The algorithms are implemented in C, but are also callable from Fortran application programs.

Title of program: RANEXP Catalogue Id: ACTB_v1_0

Nature of problem Any Monte Carlo simulation or statistical test requiring uniform pseudorandom numbers.
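To make the idea of a user-configurable uniform pseudorandom generator concrete, here is a minimal Python sketch of one classic algorithm, the Lehmer (Park-Miller "minimal standard") multiplicative congruential generator. It is not taken from RANEXP and does not reflect its API; the class and method names are hypothetical, and the default multiplier and modulus are the well-known minimal-standard constants.

```python
class MinimalStandardLCG:
    """Lehmer generator: x_{n+1} = (a * x_n) mod m.

    Hypothetical illustration only; not the RANEXP interface.
    """

    def __init__(self, seed=1, multiplier=16807, modulus=2**31 - 1):
        if not 0 < seed < modulus:
            raise ValueError("seed must satisfy 0 < seed < modulus")
        self.state = seed
        self.multiplier = multiplier  # configurable, like the modulus
        self.modulus = modulus

    def next_int(self):
        """Advance the state and return it as an integer in [1, modulus - 1]."""
        self.state = (self.multiplier * self.state) % self.modulus
        return self.state

    def next_uniform(self):
        """Return a float in the open interval (0, 1)."""
        return self.next_int() / self.modulus
```

Passing a different `multiplier` or `modulus` yields a different generator from the same family, which is the kind of configurability the abstract describes in general terms.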

Versions of this program held in the CPC repository in Mendeley Data ACTB_v1_0; RANEXP; 10.1016/0010-4655(94)90072-8

This program has been imported from the CPC Program Library held at Queen's University Belfast (1969-2019)
Fake Email Address Generator Report
datainsightsmarket.com
doc, pdf, ppt
Updated Feb 12, 2025
Cite
Data Insights Market (2025). Fake Email Address Generator Report [Dataset]. https://www.datainsightsmarket.com/reports/fake-email-address-generator-1405019
Available download formats: doc, pdf, ppt
Dataset updated
Feb 12, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
Fake Email Address Generator Market Analysis

The global market for Fake Email Address Generators is expected to reach a value of XXX million by 2033, growing at a CAGR of XX% from 2025 to 2033. Key drivers of this growth include the increasing demand for privacy and anonymity online, the growing prevalence of spam and phishing attacks, and the proliferation of digital marketing campaigns. Additionally, the adoption of cloud-based solutions and the emergence of new technologies, such as artificial intelligence (AI), are further fueling market expansion. Key trends in the Fake Email Address Generator market include the growing popularity of enterprise-grade solutions, the emergence of disposable email services, and the increasing integration with other online tools. Restraints to market growth include concerns over security and data protection, as well as the availability of free or low-cost alternatives. The market is dominated by a few major players, including Burnermail, TrashMail, and Guerrilla Mail, but a growing number of smaller vendors are emerging with innovative solutions. Geographically, North America and Europe are the largest markets, followed by the Asia Pacific region.
Random Outfit Generator Report
datainsightsmarket.com
doc, pdf, ppt
Updated Jan 22, 2025
Cite
Data Insights Market (2025). Random Outfit Generator Report [Dataset]. https://www.datainsightsmarket.com/reports/random-outfit-generator-1406627
Available download formats: doc, ppt, pdf
Dataset updated
Jan 22, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The global Random Outfit Generator market was valued at USD 1.2 billion in 2025 and is projected to grow at a CAGR of 10.5% during the forecast period, reaching USD 2.5 billion by 2033. This growth is attributed to increasing demand for personalized and time-saving fashion solutions, rising disposable income, and growing awareness of fashion trends. The market is predominantly driven by fashion enthusiasts and professionals who seek efficient ways to generate unique and stylish outfits.

Among the key segments of the Random Outfit Generator market are:

Application: Fashion Designer, Fashion Enthusiasts, Photography Stylist, Others
Types: Cloud-based, On-premises
Geography: North America, South America, Europe, Middle East & Africa, Asia Pacific

Key players in the market include Fashmates, Stylicious, Your Closet, Combyne, My Dressing, Acloset, My Wardrobe, Smart Closet, Pureple, Twelve70, Roll For Fantasy, Randommer, The Fashion Robot, and Picrew. These companies offer innovative solutions to cater to the growing demand for random outfit generators, engaging users with interactive features and personalized experiences.
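The three headline figures (start value, CAGR, end value) can be cross-checked against each other. The sketch below assumes annual compounding over the eight years from 2025 to 2033; the report does not state its compounding convention or base year, so both are assumptions, and the function names are illustrative.

```python
def project_value(start_value, cagr, years):
    """Value after compounding start_value at rate cagr for the given years."""
    return start_value * (1.0 + cagr) ** years


def implied_cagr(start_value, end_value, years):
    """Constant annual growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


years = 2033 - 2025  # assumed number of compounding periods

# USD billions, using the figures quoted in the description.
projected_2033 = project_value(1.2, 0.105, years)
rate_from_endpoints = implied_cagr(1.2, 2.5, years)

print(f"Projected 2033 value at 10.5%: {projected_2033:.2f} billion")
print(f"CAGR implied by 1.2 -> 2.5:    {rate_from_endpoints:.1%}")
```

Under these assumptions the projection lands somewhat above USD 2.5 billion and the implied rate somewhat below 10.5%, which may simply reflect rounding or a different base year in the report.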
Global Quantum Random Number Generator RNG market size is USD 555.9 million...
cognitivemarketresearch.com
pdf,excel,csv,ppt
Updated Jan 15, 2025
Cite
Cognitive Market Research (2025). Global Quantum Random Number Generator RNG market size is USD 555.9 million in 2024. [Dataset]. https://www.cognitivemarketresearch.com/quantum-random-number-generator-rng-market-report
Available download formats: pdf, excel, csv, ppt
Dataset updated
Jan 15, 2025
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Global
Description
According to Cognitive Market Research, the global Quantum Random Number Generator RNG market size is USD 555.9 million in 2024. It will expand at a compound annual growth rate (CAGR) of 72.60% from 2024 to 2031.

North America held the largest market share, more than 40% of global revenue, with a market size of USD 222.36 million in 2024, and will grow at a compound annual growth rate (CAGR) of 70.8% from 2024 to 2031. Europe accounted for a market share of over 30% of global revenue, with a market size of USD 166.77 million. Asia Pacific held a market share of around 23% of global revenue, with a market size of USD 127.86 million in 2024, and will grow at a CAGR of 74.6% from 2024 to 2031. Latin America had a market share of more than 5% of global revenue, with a market size of USD 27.80 million in 2024, and will grow at a CAGR of 72.0% from 2024 to 2031. Middle East and Africa had a market share of around 2% of global revenue, with an estimated market size of USD 11.12 million in 2024, and will grow at a CAGR of 72.3% from 2024 to 2031. Cloud was the dominant segment in the Quantum Random Number Generator RNG market in 2024.
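The regional figures above can be checked for internal consistency with a few lines of Python. This sketch only uses the 2024 market sizes quoted in the description; the variable and function names are illustrative.

```python
GLOBAL_2024 = 555.9  # USD million, global market size quoted for 2024

# Regional 2024 market sizes in USD million, as quoted above.
REGIONAL_2024 = {
    "North America": 222.36,
    "Europe": 166.77,
    "Asia Pacific": 127.86,
    "Latin America": 27.80,
    "Middle East and Africa": 11.12,
}


def regional_shares(regional, total):
    """Return each region's fraction of the global total."""
    return {name: value / total for name, value in regional.items()}


shares = regional_shares(REGIONAL_2024, GLOBAL_2024)
for name, share in shares.items():
    print(f"{name:<24} {share:6.1%}")

# The regional sizes should add up to the global figure within rounding.
print(f"Sum of regions: {sum(REGIONAL_2024.values()):.2f} vs global {GLOBAL_2024}")
```

A check like this is a quick way to spot transcription errors when copying figures out of a report.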

Market Dynamics of Quantum Random Number Generator RNG Market

Key Drivers for Quantum Random Number Generator RNG Market

Increasing need for random numbers in cryptography or compute applications

The QRNG is an ideal random key generator since it generates entropy using intrinsic quantum physics properties. Nowadays, applications demand a huge number of keys and randomization to achieve total security. It could include key vaults, games, IoT devices, AI/ML, blockchains, simulations, and vital infrastructure. QRNG is the source of these applications in which trust in randomness is prevalent. Furthermore, it is utilized in encryption for a wide range of applications, including cryptography, numerical simulation, gambling, and game design.

Growing adoption of quantum computing

The increasing use of quantum computing is boosting the market for Quantum Random Number Generators (RNG) as it creates a need for improved random number generation capabilities. The accurate abilities of quantum computing enable RNGs to produce truly random numbers, essential for secure communication and encryption. Advancements in quantum computing will lead to a higher demand for dependable RNGs, driving market expansion to meet the changing requirements of cybersecurity and data encryption.

Restraint Factor for the Quantum Random Number Generator RNG Market

High initial investment

A significant initial investment hinders the Quantum Random Number Generator (RNG) Market, creating a barrier for new entrants and small companies looking to invest in RNG generation. The significant initial costs involved in the research, development, and deployment of quantum RNG solutions may discourage potential entrants from joining the market. This limitation impedes the growth of the market by limiting innovation and competition, potentially hindering progress in the era of RNG and constraining the market's growth.

Impact of Covid-19 on the Quantum Random Number Generator RNG Market

The effect of COVID-19 on the Quantum Random Number Generator RNG Market was mixed. Although the pandemic initially caused disruptions in supply chains and slowed down certain trends, the increased focus on cybersecurity and data protection during remote work and digital interactions enhanced the need for secure communication solutions such as quantum RNGs. With a focus on safeguarding information, both organizations and governments fueled growth in the Quantum RNG market despite pandemic-related obstacles.

Introduction of the Quantum Random Number Generator RNG Market

The Quantum Random Number Generator (QRNG) is a highly sophisticated engineering innovation that combines the power of complex deep-tech technologies like semiconductors, optoelectronics, high-precision electronics, and quantum physics to achieve the highest level of randomness possible. QRNG has shown to be a critical enabling technology for quantum-level security in mobile devices, data centres, and medical implants. They provide consumers with a significant enhancement over ordinary random number generators (RNGs), which have been utilized for years in a variety of business applications. Several factors, including th...
SISTER: Experimental Workflows, Product Generation Environment, and Sample...
data.staging.idas-ds1.appdat.jsc.nasa.gov
data.nasa.gov
Updated Feb 19, 2025
Cite
data.staging.idas-ds1.appdat.jsc.nasa.gov (2025). SISTER: Experimental Workflows, Product Generation Environment, and Sample Data, V004 [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/sister-experimental-workflows-product-generation-environment-and-sample-data-v004
Dataset updated
Feb 19, 2025
Dataset provided by
NASA http://nasa.gov/
Description
The Space-based Imaging Spectroscopy and Thermal pathfindER (SISTER) activity originated in support of the NASA Earth System Observatory's Surface Biology and Geology (SBG) mission to develop prototype workflows with community algorithms and generate prototype data products envisioned for SBG. SISTER focused on developing a data system that is open, portable, scalable, standards-compliant, and reproducible. This collection contains EXPERIMENTAL workflows and sample data products, including (a) the Common Workflow Language (CWL) process file and a Jupyter Notebook that run the entire SISTER workflow capable of generating experimental sample data products spanning terrestrial ecosystems, inland and coastal aquatic ecosystems, and snow, (b) the archived algorithm steps (as OGC Application Packages) used to generate products at each step of the workflow, (c) a small number of experimental sample data products produced by the workflow which are based on the Airborne Visible/Infrared Imaging Spectrometer-Classic (AVIRIS or AVIRIS-CL) instrument, and (d) instructions for reproducing the sample products included in this dataset. DISCLAIMER: This collection contains experimental workflows, experimental community algorithms, and experimental sample data products to demonstrate the capabilities of an end-to-end processing system. The experimental sample data products provided have not been fully validated and are not intended for scientific use. The community algorithms provided are placeholders which can be replaced by any user's algorithms for their own science and application interests. These algorithms should not in any capacity be considered the algorithms that will be implemented in the upcoming Surface Biology and Geology mission.
SQL Injection Attack Netflow
data.niaid.nih.gov
zenodo.org
Updated Sep 28, 2022
Cite
Adrián Campazas (2022). SQL Injection Attack Netflow [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6907251
Dataset updated
Sep 28, 2022
Dataset provided by
Adrián Campazas
Ignacio Crespo
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction

These datasets contain SQL injection attacks (SQLIA) as malicious NetFlow data. The attacks carried out are Union-query SQL injection and blind SQL injection. The SQLMAP tool was used to perform the attacks.

NetFlow traffic was generated using DOROTHEA (DOcker-based fRamework fOr gaTHering nEtflow trAffic). NetFlow is a network protocol developed by Cisco for collecting and monitoring network traffic flow data. A flow is defined as a unidirectional sequence of packets with some common properties that pass through a network device.

Datasets

The first dataset (D1) was collected to train the detection models. The second (D2) was collected using different attacks than those used in training, in order to test the models and ensure their generalization.

The datasets contain both benign and malicious traffic. All collected datasets are balanced.

The version of NetFlow used to build the datasets is 5.

Dataset  Aim       Samples  Benign-malicious traffic ratio
D1       Training  400,003  50%
D2       Test      57,239   50%

Infrastructure and implementation

Two sets of flow data were collected with DOROTHEA. DOROTHEA is a Docker-based framework for NetFlow data collection. It allows you to build interconnected virtual networks to generate and collect flow data using the NetFlow protocol. In DOROTHEA, network traffic packets are sent to a NetFlow generator that has a sensor ipt_netflow installed. The sensor consists of a module for the Linux kernel using Iptables, which processes the packets and converts them to NetFlow flows.

DOROTHEA is configured to use NetFlow V5 and to export a flow after it has been inactive for 15 seconds or active for 1800 seconds (30 minutes).
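The two export rules can be illustrated with a small flow cache. The following Python sketch is not DOROTHEA's or ipt_netflow's implementation: the class names are hypothetical, and the flow key is a simplified 5-tuple chosen for illustration. Only the 15-second inactive timeout and the 1800-second active timeout come from the description.

```python
from dataclasses import dataclass

INACTIVE_TIMEOUT = 15    # seconds without packets before a flow is exported
ACTIVE_TIMEOUT = 1800    # maximum lifetime of a flow before it is exported


@dataclass
class Flow:
    first_seen: float
    last_seen: float
    packets: int = 0
    octets: int = 0


class FlowCache:
    """Aggregates packets into unidirectional flows and exports expired ones."""

    def __init__(self):
        self.flows = {}      # key -> Flow; key is a hypothetical 5-tuple
        self.exported = []   # (key, Flow) pairs, in export order

    def expire(self, now):
        """Export every flow that is idle too long or has lived too long."""
        for key, flow in list(self.flows.items()):
            idle = now - flow.last_seen >= INACTIVE_TIMEOUT
            too_old = now - flow.first_seen >= ACTIVE_TIMEOUT
            if idle or too_old:
                self.exported.append((key, self.flows.pop(key)))

    def observe(self, key, timestamp, size):
        """Account for one packet, after expiring flows as of its timestamp."""
        self.expire(timestamp)
        flow = self.flows.get(key)
        if flow is None:
            flow = self.flows[key] = Flow(timestamp, timestamp)
        flow.last_seen = timestamp
        flow.packets += 1
        flow.octets += size
```

Because the key is directional (source first, destination second), the request and response sides of one connection end up as two separate flows, matching the definition of a flow as a unidirectional packet sequence.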

Benign traffic generation nodes simulate network traffic generated by real users, performing tasks such as searching in web browsers, sending emails, or establishing Secure Shell (SSH) connections. Such tasks run as Python scripts. Users may customize them or even incorporate their own. The network traffic is managed by a gateway that performs two main tasks. On the one hand, it routes packets to the Internet. On the other hand, it sends it to a NetFlow data generation node (this process is carried out similarly to packets received from the Internet).

The malicious traffic collected (SQLI attacks) was performed using SQLMAP. SQLMAP is a penetration tool used to automate the process of detecting and exploiting SQL injection vulnerabilities.

The attacks were executed on 16 nodes, which launched SQLMAP with the parameters in the following table.

Parameters and their descriptions:

'--banner', '--current-user', '--current-db', '--hostname', '--is-dba', '--users', '--passwords', '--privileges', '--roles', '--dbs', '--tables', '--columns', '--schema', '--count', '--dump', '--comments', '--schema': Enumerate users, password hashes, privileges, roles, databases, tables and columns
--level=5: Increase the probability of a false positive identification
--risk=3: Increase the probability of extracting data
--random-agent: Select the User-Agent randomly
--batch: Never ask for user input, use the default behavior
--answers="follow=Y": Predefined answers to yes

Every node executed SQLIA on 200 victim nodes. The victim nodes had deployed a web form vulnerable to Union-type injection attacks, which was connected to either the MySQL or the SQL Server database engine (50% of the victim nodes deployed MySQL and the other 50% deployed SQL Server).

The web service was accessible from ports 443 and 80, which are the ports typically used to deploy web services. The IP address space was 182.168.1.1/24 for the benign and malicious traffic-generating nodes. For victim nodes, the address space was 126.52.30.0/24. The malicious traffic in the test sets was collected under different conditions. For D1, SQLIA was performed using Union attacks on the MySQL and SQLServer databases.

However, for D2, BlindSQL SQLIAs were performed against the web form connected to a PostgreSQL database. The IP address spaces of the networks were also different from those of D1. In D2, the IP address space was 152.148.48.1/24 for benign and malicious traffic generating nodes and 140.30.20.1/24 for victim nodes.
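When working with the flow records, the address spaces listed above can be used to label each endpoint by role. The sketch below uses Python's standard `ipaddress` module; the dictionary layout and function names are illustrative, and the CIDR strings are copied as written in the description. Several of them have host bits set (for example 182.168.1.1/24), so they are parsed with `strict=False`, which normalizes them to the enclosing /24 network.

```python
import ipaddress

# Address spaces as written in the dataset description.
ADDRESS_SPACES = {
    "D1": {"generator": "182.168.1.1/24", "victim": "126.52.30.0/24"},
    "D2": {"generator": "152.148.48.1/24", "victim": "140.30.20.1/24"},
}


def parse_network(cidr):
    """Parse a CIDR string, tolerating host bits set in the address part."""
    return ipaddress.ip_network(cidr, strict=False)


def role_of(address, dataset):
    """Return 'generator', 'victim', or 'other' for an address in D1 or D2."""
    addr = ipaddress.ip_address(address)
    for role, cidr in ADDRESS_SPACES[dataset].items():
        if addr in parse_network(cidr):
            return role
    return "other"
```

For example, `role_of("126.52.30.5", "D1")` returns `"victim"`, while an address outside both networks returns `"other"`.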

The MySQL server was run using MariaDB version 10.4.12. Microsoft SQL Server 2017 Express and PostgreSQL version 13 were also used.
Data from: A Turing Test for Molecular Generators
acs.figshare.com
figshare.com
txt
Updated Jun 5, 2023
Cite
Jacob T. Bush; Peter Pogany; Stephen D. Pickett; Mike Barker; Andrew Baxter; Sebastien Campos; Anthony W. J. Cooper; David Hirst; Graham Inglis; Alan Nadin; Vipulkumar K. Patel; Darren Poole; John Pritchard; Yoshiaki Washio; Gemma White; Darren V. S. Green (2023). A Turing Test for Molecular Generators [Dataset]. http://doi.org/10.1021/acs.jmedchem.0c01148.s003
Available download formats: txt
Unique identifier
https://doi.org/10.1021/acs.jmedchem.0c01148.s003
Dataset updated
Jun 5, 2023
Dataset provided by
ACS Publications
Authors
Jacob T. Bush; Peter Pogany; Stephen D. Pickett; Mike Barker; Andrew Baxter; Sebastien Campos; Anthony W. J. Cooper; David Hirst; Graham Inglis; Alan Nadin; Vipulkumar K. Patel; Darren Poole; John Pritchard; Yoshiaki Washio; Gemma White; Darren V. S. Green
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Machine learning approaches promise to accelerate and improve success rates in medicinal chemistry programs by more effectively leveraging available data to guide a molecular design. A key step of an automated computational design algorithm is molecule generation, where the machine is required to design high-quality, drug-like molecules within the appropriate chemical space. Many algorithms have been proposed for molecular generation; however, a challenge is how to assess the validity of the resulting molecules. Here, we report three Turing-inspired tests designed to evaluate the performance of molecular generators. Profound differences were observed between the performance of molecule generators in these tests, highlighting the importance of selection of the appropriate design algorithms for specific circumstances. One molecule generator, based on match molecular pairs, performed excellently against all tests and thus provides a valuable component for machine-driven medicinal chemistry design workflows.
Data from: Simulated Radar Waveform and RF Dataset Generator for Incumbent...
datasets.ai
gimi9.com
+2 more
Updated Aug 8, 2024
Cite
National Institute of Standards and Technology (2024). Simulated Radar Waveform and RF Dataset Generator for Incumbent Signals in the 3.5 GHz CBRS Band [Dataset]. https://datasets.ai/datasets/simulated-radar-waveform-and-rf-dataset-generator-for-incumbent-signals-in-the-3-5-ghz-cbr-a6a00
Dataset updated
Aug 8, 2024
Dataset authored and provided by
National Institute of Standards and Technology
Description
This software tool generates simulated radar signals and creates RF datasets. The datasets can be used to develop and test detection algorithms by utilizing machine learning/deep learning techniques for the 3.5 GHz Citizens Broadband Radio Service (CBRS) or similar bands. In these bands, the primary users of the band are federal incumbent radar systems. The software tool generates radar waveforms and randomizes the radar waveform parameters. The pulse modulation types for the radar signals and their parameters are selected based on NTIA testing procedures for ESC certification, available at http://www.its.bldrdoc.gov/publications/3184.aspx. Furthermore, the tool mixes the waveforms with interference and packages them into one RF dataset file. The tool utilizes a graphical user interface (GUI) to simplify the selection of parameters and the mixing process. A reference RF dataset was generated using this software. The RF dataset is published at https://doi.org/10.18434/M32116.
Data from: Dark Generator Tool Test
esdcdoi.esac.esa.int
Updated Aug 5, 2002
Cite
European Space Agency (2002). Dark Generator Tool Test [Dataset]. http://doi.org/10.5270/esa-zkljnt1
Explore at:
Available download formats: application/fits (https://www.iana.org/assignments/media-types/application/fits)
Unique identifier
https://doi.org/10.5270/esa-zkljnt1
Dataset updated
Aug 5, 2002
Dataset authored and provided by
European Space Agency http://www.esa.int/
Time period covered
Jul 15, 2002 - Aug 5, 2002
Description
This is a scientific proposal for the HST mission. For specific information, please visit https://archive.stsci.edu/proposal_search.php?id=9641&mission=hst
Insider Threat Test Dataset
kilthub.cmu.edu
application/bzip2 +3
Updated May 30, 2023
+ more versions
Cite
Brian Lindauer (2023). Insider Threat Test Dataset [Dataset]. http://doi.org/10.1184/R1/12841247.v1
Explore at:
Available download formats: bz2, bin, application/bzip2, txt
Unique identifier
https://doi.org/10.1184/R1/12841247.v1
Dataset updated
May 30, 2023
Dataset provided by
Carnegie Mellon University
Authors
Brian Lindauer
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Insider Threat Test Dataset is a collection of synthetic insider threat test datasets that provide both background and malicious actor synthetic data.

The CERT Division, in partnership with ExactData, LLC, and under sponsorship from DARPA I2O, generated a collection of synthetic insider threat test datasets. These datasets provide both synthetic background data and data from synthetic malicious actors. For more background on this data, please see the paper, Bridging the Gap: A Pragmatic Approach to Generating Insider Threat Data. Datasets are organized according to the data generator release that created them. Most releases include multiple datasets (e.g., r3.1 and r3.2). Generally, later releases include a superset of the data generation functionality of earlier releases. Each dataset file contains a readme file that provides detailed notes about the features of that release. The answer key file answers.tar.bz2 contains the details of the malicious activity included in each dataset, including descriptions of the scenarios enacted and the identifiers of the synthetic users involved.
Random Self-Generated Twitter Data Analysis
kaggle.com
zip
Updated Sep 8, 2023
Cite
Owen Tamuno Gilbert (2023). Random Self-Generated Twitter Data Analysis [Dataset]. https://www.kaggle.com/owentamunogilbert/random-self-generated-twitter-data-analysis
Explore at:
Available download formats: zip (51190 bytes)
Dataset updated
Sep 8, 2023
Authors
Owen Tamuno Gilbert
Description
Dataset

This dataset was created by Owen Tamuno Gilbert
