65 datasets found
  1. Data from: test-data-generator

    • huggingface.co
    Updated Mar 26, 2025
    Cite
    Francisco Theodoro Arantes Florencio (2025). test-data-generator [Dataset]. https://huggingface.co/datasets/franciscoflorencio/test-data-generator
    Dataset updated
    Mar 26, 2025
    Authors
    Francisco Theodoro Arantes Florencio
    Description

    Dataset Card for test-data-generator

    This dataset has been created with distilabel.

      Dataset Summary
    

    This dataset contains a pipeline.yaml which can be used to reproduce the pipeline that generated it in distilabel using the distilabel CLI: distilabel pipeline run --config "https://huggingface.co/datasets/franciscoflorencio/test-data-generator/raw/main/pipeline.yaml"

    or explore the configuration: distilabel pipeline info --config… See the full description on the dataset page: https://huggingface.co/datasets/franciscoflorencio/test-data-generator.

  2. Dataset of article: Synthetic Datasets Generator for Testing Information...

    • ieee-dataport.org
    Updated Mar 13, 2020
    + more versions
    Cite
    Sandro Mendonça (2020). Dataset of article: Synthetic Datasets Generator for Testing Information Visualization and Machine Learning Techniques and Tools [Dataset]. http://doi.org/10.21227/5aeq-rr34
    Dataset updated
    Mar 13, 2020
    Dataset provided by
    IEEE Dataport
    Authors
    Sandro Mendonça
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset used in the article entitled 'Synthetic Datasets Generator for Testing Information Visualization and Machine Learning Techniques and Tools'. These datasets can be used to test several characteristics in machine learning and data processing algorithms.

  3. Data from: Reliability Analysis of Random Telegraph Noise-based True Random...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Sep 30, 2024
    Cite
    Ranjan, Alok (2024). Reliability Analysis of Random Telegraph Noise-based True Random Number Generators [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13169457
    Dataset updated
    Sep 30, 2024
    Dataset provided by
    Thamankar, Dr. Ramesh
    Pey, Kin Leong
    Raghavan, Nagarajan
    PUGLISI, Francesco Maria
    O'Shea, Sean J.
    Zanotti, Tommaso
    Ranjan, Alok
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
    • Repository author: Tommaso Zanotti
    • Email: tommaso.zanotti@unimore.it or francescomaria.puglisi@unimore.it
    • Version: v1.0

    This repository includes MATLAB files and datasets related to the IEEE IIRW 2023 conference proceeding: T. Zanotti et al., "Reliability Analysis of Random Telegraph Noise-based True Random Number Generators," 2023 IEEE International Integrated Reliability Workshop (IIRW), South Lake Tahoe, CA, USA, 2023, pp. 1-6, doi: 10.1109/IIRW59383.2023.10477697

    The repository includes:

    The data of the bitmaps reported in Fig. 4, i.e., the results of the simulation of the ideal RTN-based TRNG circuit for different reseeding strategies. To load and plot the data use the "plot_bitmaps.mat" file.

    The result of the circuit simulations considering the EvolvingRTN from the HfO2 device shown in Fig. 7, for two Rgain values. Specifically, the data is contained in the following csv files:

    "Sim_TRNG_Circuit_HfO2_3_20s_Vth_210m_no_Noise_Ibias_11n.csv" (lower Rgain)

    "Sim_TRNG_Circuit_HfO2_3_20s_Vth_210m_no_Noise_Ibias_4_8n.csv" (higher Rgain)

    The result of the circuit simulations considering the temporary RTN from the SiO2 device shown in Fig. 8. Specifically, the data is contained in the following csv files:

    "Sim_TRNG_Circuit_SiO2_1c_300s_Vth_180m_Noise_Ibias_1.5n.csv" (ref. Rgain)

    "Sim_TRNG_Circuit_SiO2_1c_100s_200s_Vth_180m_Noise_Ibias_1.575n.csv" (lower Rgain)

    "Sim_TRNG_Circuit_SiO2_1c_100s_200s_Vth_180m_Noise_Ibias_1.425n.csv" (higher Rgain)

  4. Microsoft excel database containing all the simulated (10 sets) and...

    • figshare.com
    • plos.figshare.com
    xlsx
    Updated Jun 3, 2023
    Cite
    Hamed Ahmadi (2023). Microsoft excel database containing all the simulated (10 sets) and experimental data used in this study. [Dataset]. http://doi.org/10.1371/journal.pone.0187292.s001
    Available download formats: xlsx
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Hamed Ahmadi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Excel sheets in order: The sheet entitled “Hens Original Data” contains the results of an experiment conducted to study the response of laying hens, during the initial phase of egg production, to different intakes of dietary threonine. The sheet entitled “Simulated data & fitting values” contains the 10 simulated data sets, which were generated using a standard random number generation procedure. The values predicted by the new three-parameter and the conventional four-parameter logistic models also appear in this sheet. (XLSX)

  5. Automated Generation of Realistic Test Inputs for Web APIs

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 5, 2021
    Cite
    Alonso Valenzuela, Juan Carlos (2021). Automated Generation of Realistic Test Inputs for Web APIs [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4736859
    Dataset updated
    May 5, 2021
    Dataset authored and provided by
    Alonso Valenzuela, Juan Carlos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Testing web APIs automatically requires generating input data values such as addresses, coordinates or country codes. Generating meaningful values for these types of parameters randomly is rarely feasible, which poses a major obstacle for current test case generation approaches. In this paper, we present ARTE, the first semantic-based approach for the Automated generation of Realistic TEst inputs for web APIs. Specifically, ARTE leverages the specification of the API under test to extract semantically related values for every parameter by applying knowledge extraction techniques. Our approach has been integrated into RESTest, a state-of-the-art tool for API testing, achieving an unprecedented level of automation that makes it possible to generate up to 100% more valid API calls than existing fuzzing techniques (30% on average). Evaluation results on a set of 26 real-world APIs show that ARTE can generate realistic inputs for 7 out of every 10 parameters, outperforming the results obtained by related approaches.

  6. Data from: Advanced Direct-Drive Generator for Improved Availability of...

    • catalog.data.gov
    • mhkdr.openei.org
    • +3more
    Updated Jan 20, 2025
    + more versions
    Cite
    ABB Inc. (2025). Advanced Direct-Drive Generator for Improved Availability of Oscillating Wave Surge Converter Power Generation Systems: 10hp 30rpm Radial-Flux Magnetically Geared Generator Test Data [Dataset]. https://catalog.data.gov/dataset/advanced-direct-drive-generator-for-improved-availability-of-oscillating-wave-surge-conver-8124a
    Dataset updated
    Jan 20, 2025
    Dataset provided by
    ABB Inc.
    Description

    Static torque, no load, constant speed, and sinusoidal oscillation test data for a 10hp, 300rpm magnetically-geared generator prototype using either an adjustable load bank for a fixed resistance or an output power converter.

  7. Data pipeline Validation And Load Testing using Multiple JSON Files

    • data.niaid.nih.gov
    Updated Mar 26, 2021
    Cite
    Afsana Khan (2021). Data pipeline Validation And Load Testing using Multiple JSON Files [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4636789
    Dataset updated
    Mar 26, 2021
    Dataset provided by
    Pelle Jakovits
    Mainak Adhikari
    Afsana Khan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The datasets were used to validate and test the data pipeline deployment following the RADON approach. The dataset contains temperature and humidity sensor readings of a particular day, which are synthetically generated using a data generator and are stored as JSON files to validate and test (performance/load testing) the data pipeline components.
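
    As a rough illustration of the kind of input described above, the sketch below generates one day of synthetic temperature and humidity readings and serialises them as JSON. The field names, value ranges, sampling interval, example date, and the function name generate_readings are all assumptions made for this example; the actual schema of the dataset is not described here.

```python
import json
import random
from datetime import datetime, timedelta


def generate_readings(day, interval_minutes=10, seed=0):
    """Return one day of synthetic sensor readings (hypothetical schema)."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    start = datetime.strptime(day, "%Y-%m-%d")
    readings = []
    for step in range(24 * 60 // interval_minutes):
        moment = start + timedelta(minutes=step * interval_minutes)
        readings.append({
            "timestamp": moment.isoformat(),
            "temperature_c": round(rng.uniform(15.0, 30.0), 2),  # assumed range
            "humidity_pct": round(rng.uniform(30.0, 70.0), 2),   # assumed range
        })
    return readings


# The date is arbitrary and only serves the example.
readings = generate_readings("2021-03-26")
payload = json.dumps(readings, indent=2)  # text that could be written to a .json file
```

    Splitting such a payload across several files of different sizes is one way to exercise a pipeline under varying load.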

  8. Data from: Advanced Direct-Drive Generator for Improved Availability of...

    • catalog.data.gov
    • mhkdr.openei.org
    • +2more
    Updated Jan 20, 2025
    Cite
    ABB Inc. (2025). Advanced Direct-Drive Generator for Improved Availability of Oscillating Wave Surge Converter Power Generation Systems: 1hp 300rpm Axial-Flux Magnetically Geared Generator Test Data [Dataset]. https://catalog.data.gov/dataset/advanced-direct-drive-generator-for-improved-availability-of-oscillating-wave-surge-conver-b502b
    Dataset updated
    Jan 20, 2025
    Dataset provided by
    ABB Inc.
    Description

    Static torque and no load test data for a 1hp, 300rpm axial-flux magnetically geared generator prototype developed by Texas A&M EMPE Lab.

  9. Data and code for: Generation and applications of simulated datasets to...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Mar 10, 2023
    Cite
    Matthew Silk; Olivier Gimenez (2023). Data and code for: Generation and applications of simulated datasets to integrate social network and demographic analyses [Dataset]. http://doi.org/10.5061/dryad.m0cfxpp7s
    Available download formats: zip
    Dataset updated
    Mar 10, 2023
    Dataset provided by
    Centre d'Ecologie Fonctionnelle et Evolutive
    Authors
    Matthew Silk; Olivier Gimenez
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Social networks are tied to population dynamics; interactions are driven by population density and demographic structure, while social relationships can be key determinants of survival and reproductive success. However, difficulties integrating models used in demography and network analysis have limited research at this interface. We introduce the R package genNetDem for simulating integrated network-demographic datasets. It can be used to create longitudinal social networks and/or capture-recapture datasets with known properties. It incorporates the ability to generate populations and their social networks, generate grouping events using these networks, simulate social network effects on individual survival, and flexibly sample these longitudinal datasets of social associations. By generating co-capture data with known statistical relationships it provides functionality for methodological research. We demonstrate its use with case studies testing how imputation and sampling design influence the success of adding network traits to conventional Cormack-Jolly-Seber (CJS) models. We show that incorporating social network effects in CJS models generates qualitatively accurate results, but with downward-biased parameter estimates when network position influences survival. Biases are greater when fewer interactions are sampled or fewer individuals are observed in each interaction. While our results indicate the potential of incorporating social effects within demographic models, they show that imputing missing network measures alone is insufficient to accurately estimate social effects on survival, pointing to the importance of incorporating network imputation approaches. genNetDem provides a flexible tool to aid these methodological advancements and help researchers test other sampling considerations in social network studies.

    Methods

    The dataset and code stored here are for Case Studies 1 and 2 in the paper. Datasets were generated using simulations in R. Here we provide 1) the R code used for the simulations; 2) the simulation outputs (as .RDS files); and 3) the R code to analyse simulation outputs and generate the tables and figures in the paper.

  10. Data from: RANEXP: experimental random number generator package

    • elsevier.digitalcommonsdata.com
    • search.datacite.org
    Updated Jan 1, 1994
    Cite
    Michael Hennecke (1994). RANEXP: experimental random number generator package [Dataset]. http://doi.org/10.17632/pty366sbwg.1
    Dataset updated
    Jan 1, 1994
    Authors
    Michael Hennecke
    License

    https://www.elsevier.com/about/policies/open-access-licenses/elsevier-user-license/cpc-license/

    Description

    Abstract

    A library containing highly portable implementations of most algorithms for (pseudo)random number generation has been developed, which may be used in any area of simulation that requires random number generators. Each generator is freely configurable by the user, so the RANEXP library is particularly well suited to applications requiring different random number generators. The algorithms are implemented in C but are also callable from Fortran application programs.

    Title of program: RANEXP
    Catalogue Id: ACTB_v1_0

    Nature of problem

    Any Monte Carlo simulation or statistical test requiring uniform pseudorandom numbers.

    Versions of this program held in the CPC repository in Mendeley Data

    ACTB_v1_0; RANEXP; 10.1016/0010-4655(94)90072-8

    This program has been imported from the CPC Program Library held at Queen's University Belfast (1969-2019)

  11. Fake Email Address Generator Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Feb 12, 2025
    Cite
    Data Insights Market (2025). Fake Email Address Generator Report [Dataset]. https://www.datainsightsmarket.com/reports/fake-email-address-generator-1405019
    Available download formats: doc, pdf, ppt
    Dataset updated
    Feb 12, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Fake Email Address Generator Market Analysis

    The global market for Fake Email Address Generators is expected to reach a value of XXX million by 2033, growing at a CAGR of XX% from 2025 to 2033. Key drivers of this growth include the increasing demand for privacy and anonymity online, the growing prevalence of spam and phishing attacks, and the proliferation of digital marketing campaigns. Additionally, the adoption of cloud-based solutions and the emergence of new technologies, such as artificial intelligence (AI), are further fueling market expansion.

    Key trends in the Fake Email Address Generator market include the growing popularity of enterprise-grade solutions, the emergence of disposable email services, and the increasing integration with other online tools. Restraints to market growth include concerns over security and data protection, as well as the availability of free or low-cost alternatives. The market is dominated by a few major players, including Burnermail, TrashMail, and Guerrilla Mail, but a growing number of smaller vendors are emerging with innovative solutions. Geographically, North America and Europe are the largest markets, followed by the Asia Pacific region.

  12. Random Outfit Generator Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jan 22, 2025
    Cite
    Data Insights Market (2025). Random Outfit Generator Report [Dataset]. https://www.datainsightsmarket.com/reports/random-outfit-generator-1406627
    Available download formats: doc, ppt, pdf
    Dataset updated
    Jan 22, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global Random Outfit Generator market was valued at USD 1.2 billion in 2025 and is projected to grow at a CAGR of 10.5% during the forecast period, reaching USD 2.5 billion by 2033. This growth is attributed to increasing demand for personalized and time-saving fashion solutions, rising disposable income, and growing awareness of fashion trends. The market is predominantly driven by fashion enthusiasts and professionals who seek efficient ways to generate unique and stylish outfits.

    Key segments of the Random Outfit Generator market:

    • Application: Fashion Designer, Fashion Enthusiasts, Photography Stylist, Others
    • Types: Cloud-based, On-premises
    • Geography: North America, South America, Europe, Middle East & Africa, Asia Pacific

    Key players in the market include Fashmates, Stylicious, Your Closet, Combyne, My Dressing, Acloset, My Wardrobe, Smart Closet, Pureple, Twelve70, Roll For Fantasy, Randommer, The Fashion Robot, and Picrew. These companies offer innovative solutions to cater to the growing demand for random outfit generators, engaging users with interactive features and personalized experiences.
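
    The growth figures quoted in this description can be related to each other with the standard compound-growth formula. The sketch below is only an arithmetic check, assuming eight compounding periods (2025 to 2033); the function names project and implied_cagr are made up for this example. Under that assumption the quoted 10.5% rate projects to roughly USD 2.67 billion, while the quoted endpoints correspond to a rate of roughly 9.6%, so the report may be using a different base year or rounding.

```python
def project(value, rate, years):
    """Compound `value` at annual growth `rate` for `years` years."""
    return value * (1 + rate) ** years


def implied_cagr(start, end, years):
    """Annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


YEARS = 2033 - 2025  # assumption: eight compounding periods

# Values are in USD billion, as quoted in the description.
projected_2033 = project(1.2, 0.105, YEARS)          # from the quoted CAGR
cagr_from_endpoints = implied_cagr(1.2, 2.5, YEARS)  # from the quoted endpoints

print(f"Projected 2033 value: {projected_2033:.2f}")
print(f"CAGR implied by endpoints: {cagr_from_endpoints:.1%}")
```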

  13. Global Quantum Random Number Generator RNG market size is USD 555.9 million...

    • cognitivemarketresearch.com
    pdf,excel,csv,ppt
    Updated Jan 15, 2025
    Cite
    Cognitive Market Research (2025). Global Quantum Random Number Generator RNG market size is USD 555.9 million in 2024. [Dataset]. https://www.cognitivemarketresearch.com/quantum-random-number-generator-rng-market-report
    Available download formats: pdf, excel, csv, ppt
    Dataset updated
    Jan 15, 2025
    Dataset authored and provided by
    Cognitive Market Research
    License

    https://www.cognitivemarketresearch.com/privacy-policy

    Time period covered
    2021 - 2033
    Area covered
    Global
    Description

    According to Cognitive Market Research, the global Quantum Random Number Generator RNG market size is USD 555.9 million in 2024. It will expand at a compound annual growth rate (CAGR) of 72.60% from 2024 to 2031.

    • North America held the largest share, more than 40% of global revenue, with a market size of USD 222.36 million in 2024, and will grow at a compound annual growth rate (CAGR) of 70.8% from 2024 to 2031.
    • Europe accounted for over 30% of global revenue, with a market size of USD 166.77 million.
    • Asia Pacific held around 23% of global revenue, with a market size of USD 127.86 million in 2024, and will grow at a CAGR of 74.6% from 2024 to 2031.
    • Latin America held more than 5% of global revenue, with a market size of USD 27.80 million in 2024, and will grow at a CAGR of 72.0% from 2024 to 2031.
    • Middle East and Africa held around 2% of global revenue, with an estimated market size of USD 11.12 million in 2024, and will grow at a CAGR of 72.3% from 2024 to 2031.
    • Cloud was the dominant segment of the Quantum Random Number Generator RNG market in 2024.
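
    The regional figures in the list above can be cross-checked against the global figure with a few lines of arithmetic. This is a minimal consistency check using only the numbers quoted in the description; the variable names are made up for the example.

```python
GLOBAL_2024 = 555.9  # USD million, as quoted

# Regional 2024 market sizes in USD million, as quoted in the description.
regions = {
    "North America": 222.36,
    "Europe": 166.77,
    "Asia Pacific": 127.86,
    "Latin America": 27.80,
    "Middle East and Africa": 11.12,
}

# Each region's share of the quoted global market size.
shares = {name: size / GLOBAL_2024 for name, size in regions.items()}
total = sum(regions.values())

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"Sum of regions: {total:.2f} (global figure: {GLOBAL_2024})")
```

    The regional sizes add up to the global figure to within rounding, and each share matches the percentage quoted for that region.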
    

    Market Dynamics of Quantum Random Number Generator RNG Market

    Key Drivers for Quantum Random Number Generator RNG Market

    Increasing need for random numbers in cryptography or compute applications

    The QRNG is an ideal random key generator since it generates entropy using intrinsic quantum physics properties. Nowadays, applications demand a huge number of keys and a high degree of randomization to achieve total security. These applications include key vaults, games, IoT devices, AI/ML, blockchains, simulations, and vital infrastructure. QRNG serves as the randomness source for applications in which trust in randomness is essential. Furthermore, it is utilized in encryption for a wide range of applications, including cryptography, numerical simulation, gambling, and game design.

    Growing adoption of quantum computing

    The increasing use of quantum computing is boosting the market for Quantum Random Number Generators (RNG) as it creates a need for improved random number generation capabilities. The accurate abilities of quantum computing enable RNGs to produce truly random numbers, essential for secure communication and encryption. Advancements in quantum computing will lead to a higher demand for dependable RNGs, driving market expansion to meet the changing requirements of cybersecurity and data encryption.

    Restraint Factor for the Quantum Random Number Generator RNG Market

    High initial investment

    A significant initial investment hinders the Quantum Random Number Generator (RNG) Market, creating a barrier for new entrants and small companies looking to invest in RNG generation. The significant initial costs involved in the research, development, and deployment of quantum RNG solutions may discourage potential entrants from joining the market. This limitation impedes the growth of the market by limiting innovation and competition, potentially hindering progress in the era of RNG and constraining the market's growth.

    Impact of Covid-19 on the Quantum Random Number Generator RNG Market

    The effect of COVID-19 on the Quantum Random Number Generator RNG Market was mixed. Although the pandemic initially caused disruptions in supply chains and slowed down certain trends, the increased focus on cybersecurity and data protection during remote work and digital interactions enhanced the need for secure communication solutions such as quantum RNGs. With a focus on safeguarding information, both organizations and governments fueled growth in the Quantum RNG market despite pandemic-related obstacles.

    Introduction of the Quantum Random Number Generator RNG Market

    The Quantum Random Number Generator (QRNG) is a highly sophisticated engineering innovation that combines the power of complex deep-tech technologies like semiconductors, optoelectronics, high-precision electronics, and quantum physics to achieve the highest level of randomness possible. QRNG has shown to be a critical enabling technology for quantum-level security in mobile devices, data centres, and medical implants. They provide consumers with a significant enhancement over ordinary random number generators (RNGs), which have been utilized for years in a variety of business applications. Several factors, including th...

  14. SISTER: Experimental Workflows, Product Generation Environment, and Sample...

    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    • data.nasa.gov
    • +4more
    Updated Feb 19, 2025
    Cite
    data.staging.idas-ds1.appdat.jsc.nasa.gov (2025). SISTER: Experimental Workflows, Product Generation Environment, and Sample Data, V004 [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/sister-experimental-workflows-product-generation-environment-and-sample-data-v004
    Dataset updated
    Feb 19, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The Space-based Imaging Spectroscopy and Thermal pathfindER (SISTER) activity originated in support of the NASA Earth System Observatory's Surface Biology and Geology (SBG) mission to develop prototype workflows with community algorithms and generate prototype data products envisioned for SBG. SISTER focused on developing a data system that is open, portable, scalable, standards-compliant, and reproducible. This collection contains EXPERIMENTAL workflows and sample data products, including (a) the Common Workflow Language (CWL) process file and a Jupyter Notebook that run the entire SISTER workflow capable of generating experimental sample data products spanning terrestrial ecosystems, inland and coastal aquatic ecosystems, and snow, (b) the archived algorithm steps (as OGC Application Packages) used to generate products at each step of the workflow, (c) a small number of experimental sample data products produced by the workflow which are based on the Airborne Visible/Infrared Imaging Spectrometer-Classic (AVIRIS or AVIRIS-CL) instrument, and (d) instructions for reproducing the sample products included in this dataset. DISCLAIMER: This collection contains experimental workflows, experimental community algorithms, and experimental sample data products to demonstrate the capabilities of an end-to-end processing system. The experimental sample data products provided have not been fully validated and are not intended for scientific use. The community algorithms provided are placeholders which can be replaced by any user's algorithms for their own science and application interests. These algorithms should not in any capacity be considered the algorithms that will be implemented in the upcoming Surface Biology and Geology mission.

  15. SQL Injection Attack Netflow

    • data.niaid.nih.gov
    • zenodo.org
    Updated Sep 28, 2022
    + more versions
    Cite
    Adrián Campazas (2022). SQL Injection Attack Netflow [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6907251
    Dataset updated
    Sep 28, 2022
    Dataset provided by
    Adrián Campazas
    Ignacio Crespo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    These datasets contain SQL injection attacks (SQLIA) as malicious NetFlow data. The attacks carried out are Union query SQL injection and Blind SQL injection. The SQLMAP tool was used to perform the attacks.

    The NetFlow traffic was generated using DOROTHEA (DOcker-based fRamework fOr gaTHering nEtflow trAffic). NetFlow is a network protocol developed by Cisco for collecting and monitoring network traffic flow data. A flow is defined as a unidirectional sequence of packets with some common properties that pass through a network device.

    Datasets

    The first dataset (D1) was collected to train the detection models; the other (D2) was collected using different attacks from those used in training, to test the models and ensure their generalization.

    The datasets contain both benign and malicious traffic. All collected datasets are balanced.

    The version of NetFlow used to build the datasets is 5.

        Dataset   Aim        Samples   Benign-malicious traffic ratio
        D1        Training   400,003   50%
        D2        Test       57,239    50%

    Infrastructure and implementation

    Two sets of flow data were collected with DOROTHEA. DOROTHEA is a Docker-based framework for NetFlow data collection. It allows you to build interconnected virtual networks to generate and collect flow data using the NetFlow protocol. In DOROTHEA, network traffic packets are sent to a NetFlow generator that has a sensor ipt_netflow installed. The sensor consists of a module for the Linux kernel using Iptables, which processes the packets and converts them to NetFlow flows.

    DOROTHEA is configured to use NetFlow v5 and to export a flow after it has been inactive for 15 seconds or active for 1800 seconds (30 minutes).
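
    The two export conditions can be written as a single predicate. The sketch below is a simplified model of how inactive and active timeouts are commonly applied to a flow; it is not taken from DOROTHEA or ipt_netflow, and the function name should_export and its timestamp arguments are hypothetical.

```python
INACTIVE_TIMEOUT_S = 15   # export once no packet has been seen for this long
ACTIVE_TIMEOUT_S = 1800   # export a long-lived flow after 30 minutes


def should_export(first_packet_ts, last_packet_ts, now):
    """Return True if a flow should be exported at time `now` (all in seconds)."""
    idle_for = now - last_packet_ts     # time since the flow's latest packet
    active_for = now - first_packet_ts  # time since the flow's first packet
    return idle_for >= INACTIVE_TIMEOUT_S or active_for >= ACTIVE_TIMEOUT_S


# A flow that started at t=0 and last saw a packet at t=10:
print(should_export(0, 10, 20))  # idle for 10 s, active for 20 s
print(should_export(0, 10, 25))  # idle for 15 s, so the inactive timeout applies
```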

    Benign traffic generation nodes simulate network traffic generated by real users, performing tasks such as searching in web browsers, sending emails, or establishing Secure Shell (SSH) connections. Such tasks run as Python scripts. Users may customize them or even incorporate their own. The network traffic is managed by a gateway that performs two main tasks. On the one hand, it routes packets to the Internet. On the other hand, it sends them to a NetFlow data generation node (this process is carried out similarly for packets received from the Internet).

    The malicious traffic (SQLI attacks) was generated using SQLMAP. SQLMAP is a penetration testing tool used to automate the process of detecting and exploiting SQL injection vulnerabilities.

    The attacks were executed on 16 nodes, each launching SQLMAP with the parameters in the following table.

        Parameters              Description
        '--banner','--current-user','--current-db','--hostname','--is-dba','--users','--passwords','--privileges','--roles','--dbs','--tables','--columns','--schema','--count','--dump','--comments','--schema'
                                Enumerate users, password hashes, privileges, roles, databases, tables and columns
        --level=5               Increase the probability of a false positive identification
        --risk=3                Increase the probability of extracting data
        --random-agent          Select the User-Agent randomly
        --batch                 Never ask for user input, use the default behavior
        --answers="follow=Y"    Predefined answers to yes

    Every node executed SQLIA on 200 victim nodes. The victim nodes had deployed a web form vulnerable to Union-type injection attacks, which was connected to the MySQL or SQLServer database engines (50% of the victim nodes deployed MySQL and the other 50% deployed SQLServer).

    The web service was accessible from ports 443 and 80, which are the ports typically used to deploy web services. The IP address space was 182.168.1.1/24 for the benign and malicious traffic-generating nodes. For victim nodes, the address space was 126.52.30.0/24. The malicious traffic in the test sets was collected under different conditions. For D1, SQLIA was performed using Union attacks on the MySQL and SQLServer databases.

    However, for D2, Blind SQL injection attacks were performed against the web form connected to a PostgreSQL database. The IP address spaces of the networks were also different from those of D1. In D2, the IP address space was 152.148.48.1/24 for benign and malicious traffic-generating nodes and 140.30.20.1/24 for victim nodes.
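
    When labelling flows by role, the address spaces quoted above can be matched with Python's standard ipaddress module. This is a minimal sketch: the mapping keys and the classify helper are hypothetical, and the CIDR strings are copied from the description as written. Several of them have host bits set (for example 182.168.1.1/24), so strict=False is needed to parse them as networks.

```python
import ipaddress

# Address spaces as quoted in the description (keys are illustrative labels).
ADDRESS_SPACES = {
    ("D1", "traffic generators"): "182.168.1.1/24",
    ("D1", "victims"): "126.52.30.0/24",
    ("D2", "traffic generators"): "152.148.48.1/24",
    ("D2", "victims"): "140.30.20.1/24",
}

# strict=False masks off the host bits instead of raising ValueError.
NETWORKS = {
    key: ipaddress.ip_network(cidr, strict=False)
    for key, cidr in ADDRESS_SPACES.items()
}


def classify(address):
    """Return the (dataset, role) whose network contains `address`, or None."""
    ip = ipaddress.ip_address(address)
    for key, network in NETWORKS.items():
        if ip in network:
            return key
    return None


print(classify("126.52.30.17"))
```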

    The MySQL server was MariaDB version 10.4.12. The other database engines were Microsoft SQL Server 2017 Express and PostgreSQL version 13.

  16. Data from: A Turing Test for Molecular Generators

    • acs.figshare.com
    • figshare.com
    txt
    Updated Jun 5, 2023
    Cite
    Jacob T. Bush; Peter Pogany; Stephen D. Pickett; Mike Barker; Andrew Baxter; Sebastien Campos; Anthony W. J. Cooper; David Hirst; Graham Inglis; Alan Nadin; Vipulkumar K. Patel; Darren Poole; John Pritchard; Yoshiaki Washio; Gemma White; Darren V. S. Green (2023). A Turing Test for Molecular Generators [Dataset]. http://doi.org/10.1021/acs.jmedchem.0c01148.s003
    Explore at:
    txt Available download formats
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    ACS Publications
    Authors
    Jacob T. Bush; Peter Pogany; Stephen D. Pickett; Mike Barker; Andrew Baxter; Sebastien Campos; Anthony W. J. Cooper; David Hirst; Graham Inglis; Alan Nadin; Vipulkumar K. Patel; Darren Poole; John Pritchard; Yoshiaki Washio; Gemma White; Darren V. S. Green
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Machine learning approaches promise to accelerate and improve success rates in medicinal chemistry programs by more effectively leveraging available data to guide molecular design. A key step of an automated computational design algorithm is molecule generation, where the machine is required to design high-quality, drug-like molecules within the appropriate chemical space. Many algorithms have been proposed for molecular generation; however, a challenge is how to assess the validity of the resulting molecules. Here, we report three Turing-inspired tests designed to evaluate the performance of molecular generators. Profound differences were observed between the performance of molecule generators in these tests, highlighting the importance of selecting the appropriate design algorithms for specific circumstances. One molecule generator, based on matched molecular pairs, performed excellently against all tests and thus provides a valuable component for machine-driven medicinal chemistry design workflows.

  17. Data from: Simulated Radar Waveform and RF Dataset Generator for Incumbent...

    • datasets.ai
    • gimi9.com
    • +2 more
    0
    Updated Aug 8, 2024
    Cite
    National Institute of Standards and Technology (2024). Simulated Radar Waveform and RF Dataset Generator for Incumbent Signals in the 3.5 GHz CBRS Band [Dataset]. https://datasets.ai/datasets/simulated-radar-waveform-and-rf-dataset-generator-for-incumbent-signals-in-the-3-5-ghz-cbr-a6a00
    Explore at:
    0 Available download formats
    Dataset updated
    Aug 8, 2024
    Dataset authored and provided by
    National Institute of Standards and Technology
    Description

    This software tool generates simulated radar signals and creates RF datasets. The datasets can be used to develop and test detection algorithms by utilizing machine learning/deep learning techniques for the 3.5 GHz Citizens Broadband Radio Service (CBRS) or similar bands. In these bands, the primary users of the band are federal incumbent radar systems. The software tool generates radar waveforms and randomizes the radar waveform parameters. The pulse modulation types for the radar signals and their parameters are selected based on NTIA testing procedures for ESC certification, available at http://www.its.bldrdoc.gov/publications/3184.aspx. Furthermore, the tool mixes the waveforms with interference and packages them into one RF dataset file. The tool utilizes a graphical user interface (GUI) to simplify the selection of parameters and the mixing process. A reference RF dataset was generated using this software. The RF dataset is published at https://doi.org/10.18434/M32116.
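    To illustrate what generating a radar waveform with randomized parameters can look like, here is a minimal standard-library sketch. It is not the NIST tool's code. A linear FM chirp is used as one common pulse modulation type; the function names, the sample rate, and the parameter ranges are hypothetical and are not the NTIA test values.

```python
import cmath
import math
import random


def lfm_pulse(pulse_width_s, chirp_bw_hz, sample_rate_hz):
    """Complex baseband samples of one linear FM (chirp) pulse with unit amplitude."""
    n_samples = int(round(pulse_width_s * sample_rate_hz))
    chirp_rate = chirp_bw_hz / pulse_width_s  # frequency sweep rate in Hz/s
    samples = []
    for i in range(n_samples):
        # Centre time on zero so the sweep runs from -bw/2 to +bw/2.
        t = i / sample_rate_hz - pulse_width_s / 2.0
        samples.append(cmath.exp(1j * math.pi * chirp_rate * t * t))
    return samples


def random_lfm_pulse(rng, sample_rate_hz=10e6):
    """Draw pulse parameters at random (hypothetical ranges) and build the pulse."""
    params = {
        "pulse_width_s": rng.uniform(10e-6, 100e-6),
        "chirp_bw_hz": rng.uniform(0.5e6, 2e6),
    }
    return params, lfm_pulse(sample_rate_hz=sample_rate_hz, **params)


if __name__ == "__main__":
    params, pulse = random_lfm_pulse(random.Random(0))
    print(params, len(pulse))
```

    Passing a seeded random.Random instance keeps each draw reproducible, which helps when a dataset has to be regenerated.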

  18. Data from: Dark Generator Tool Test

    • esdcdoi.esac.esa.int
    Updated Aug 5, 2002
    Cite
    European Space Agency (2002). Dark Generator Tool Test [Dataset]. http://doi.org/10.5270/esa-zkljnt1
    Explore at:
    https://www.iana.org/assignments/media-types/application/fits Available download formats
    Dataset updated
    Aug 5, 2002
    Dataset authored and provided by
    European Space Agency http://www.esa.int/
    Time period covered
    Jul 15, 2002 - Aug 5, 2002
    Description
  19. Insider Threat Test Dataset

    • kilthub.cmu.edu
    application/bzip2 +3
    Updated May 30, 2023
    + more versions
    Cite
    Brian Lindauer (2023). Insider Threat Test Dataset [Dataset]. http://doi.org/10.1184/R1/12841247.v1
    Explore at:
    bz2, bin, application/bzip2, txtAvailable download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    Carnegie Mellon University
    Authors
    Brian Lindauer
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Insider Threat Test Dataset is a collection of synthetic insider threat test datasets that provide both background and malicious actor synthetic data.

    The CERT Division, in partnership with ExactData, LLC, and under sponsorship from DARPA I2O, generated a collection of synthetic insider threat test datasets. These datasets provide both synthetic background data and data from synthetic malicious actors. For more background on this data, please see the paper, Bridging the Gap: A Pragmatic Approach to Generating Insider Threat Data. Datasets are organized according to the data generator release that created them. Most releases include multiple datasets (e.g., r3.1 and r3.2). Generally, later releases include a superset of the data generation functionality of earlier releases. Each dataset file contains a readme file that provides detailed notes about the features of that release. The answer key file answers.tar.bz2 contains the details of the malicious activity included in each dataset, including descriptions of the scenarios enacted and the identifiers of the synthetic users involved.
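    Since the answer key is distributed as a bzip2-compressed tar archive, its contents can be inspected without extracting it to disk. The sketch below uses Python's standard tarfile module; the function name list_archive_members is hypothetical, and the internal layout of answers.tar.bz2 is not described here, so the code only lists whatever regular files the archive holds.

```python
import tarfile


def list_archive_members(fileobj):
    """Return the names of regular files in a bzip2-compressed tar archive."""
    with tarfile.open(fileobj=fileobj, mode="r:bz2") as archive:
        return [member.name for member in archive.getmembers() if member.isfile()]


if __name__ == "__main__":
    # Assumes answers.tar.bz2 has been downloaded to the working directory.
    with open("answers.tar.bz2", "rb") as handle:
        for name in list_archive_members(handle):
            print(name)
```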

  20. Random Self-Generated Twitter Data Analysis

    • kaggle.com
    zip
    Updated Sep 8, 2023
    Cite
    Owen Tamuno Gilbert (2023). Random Self-Generated Twitter Data Analysis [Dataset]. https://www.kaggle.com/owentamunogilbert/random-self-generated-twitter-data-analysis
    Explore at:
    zip (51190 bytes) Available download formats
    Dataset updated
    Sep 8, 2023
    Authors
    Owen Tamuno Gilbert
    Description

    Dataset

    This dataset was created by Owen Tamuno Gilbert

    Contents
