Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
This dataset contains SQL injection attacks (SQLIA) as malicious NetFlow data. The attacks carried out are Union-query SQL injection and Blind SQL injection, performed with the SQLMAP tool.
NetFlow traffic was generated using DOROTHEA (DOcker-based fRamework fOr gaTHering nEtflow trAffic). NetFlow is a network protocol developed by Cisco for collecting and monitoring network traffic flow data. A flow is defined as a unidirectional sequence of packets sharing common properties that pass through a network device.
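To make the flow abstraction concrete, here is a minimal Python sketch of the core fields of a NetFlow v5 record; the field selection follows the standard v5 export format, while the class name and the example values are purely illustrative.

from dataclasses import dataclass

@dataclass
class NetFlowV5Record:
    # Core NetFlow v5 flow fields (one record per unidirectional flow)
    src_addr: str    # source IPv4 address
    dst_addr: str    # destination IPv4 address
    src_port: int    # TCP/UDP source port
    dst_port: int    # TCP/UDP destination port
    protocol: int    # IP protocol number (6 = TCP, 17 = UDP)
    packets: int     # packets in the flow (dPkts)
    octets: int      # bytes in the flow (dOctets)
    first: int       # device uptime at the first packet, in ms
    last: int        # device uptime at the last packet, in ms
    tcp_flags: int   # cumulative OR of the TCP flags seen
    tos: int         # IP type of service

# A hypothetical flow produced by an HTTP request from a traffic-generating node:
flow = NetFlowV5Record("192.168.1.10", "126.52.30.5", 49152, 80, 6,
                       12, 4096, 100000, 100850, 0x1B, 0)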
Datasets
The first dataset was collected to train the detection models (D1); the other was collected using attacks different from those used in training, to test the models and ensure their generalization (D2).
The datasets contain both benign and malicious traffic. All collected datasets are balanced.
The version of NetFlow used to build the datasets is 5.
| Dataset | Aim | Samples | Benign-malicious traffic ratio |
|:--------|:----|:--------|:-------------------------------|
| D1 | Training | 400,003 | 50% |
| D2 | Test | 57,239 | 50% |
Infrastructure and implementation
Two sets of flow data were collected with DOROTHEA. DOROTHEA is a Docker-based framework for NetFlow data collection that allows you to build interconnected virtual networks to generate and collect flow data using the NetFlow protocol. In DOROTHEA, network traffic packets are sent to a NetFlow generator with the ipt_netflow sensor installed. The sensor is a Linux kernel module hooked in via Iptables, which processes the packets and converts them into NetFlow flows.
DOROTHEA is configured to use NetFlow v5 and to export a flow after it has been inactive for 15 seconds or active for 1,800 seconds (30 minutes).
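The two timeout rules can be illustrated with a short sketch; only the 15-second inactive and 1,800-second active timeouts come from the DOROTHEA configuration above, while the function and variable names are ours.

INACTIVE_TIMEOUT = 15    # seconds without new packets before export
ACTIVE_TIMEOUT = 1800    # maximum flow lifetime in seconds before export

def should_export(flow_start: float, last_packet: float, now: float) -> bool:
    # A flow is expired and exported once it goes quiet or lives too long.
    return (now - last_packet >= INACTIVE_TIMEOUT) or (now - flow_start >= ACTIVE_TIMEOUT)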
Benign traffic generation nodes simulate network traffic generated by real users, performing tasks such as searching in web browsers, sending emails, or establishing Secure Shell (SSH) connections. Such tasks run as Python scripts; users may customize them or even incorporate their own. The network traffic is managed by a gateway that performs two main tasks: on the one hand, it routes packets to the Internet; on the other hand, it sends them to a NetFlow data generation node (this process is carried out similarly for packets received from the Internet).
The malicious traffic collected (SQLI attacks) was generated using SQLMAP. SQLMAP is a penetration testing tool that automates the process of detecting and exploiting SQL injection vulnerabilities.
The attacks were executed from 16 nodes, each launching SQLMAP with the parameters in the following table (a sketch of the resulting invocation appears after the table).
| Parameters | Description |
|:-----------|:------------|
| '--banner', '--current-user', '--current-db', '--hostname', '--is-dba', '--users', '--passwords', '--privileges', '--roles', '--dbs', '--tables', '--columns', '--schema', '--count', '--dump', '--comments' | Enumerate users, password hashes, privileges, roles, databases, tables and columns |
| --level=5 | Increase the probability of a false positive identification |
| --risk=3 | Increase the probability of extracting data |
| --random-agent | Select the User-Agent randomly |
| --batch | Never ask for user input; use the default behavior |
| --answers="follow=Y" | Predefined answers to yes |
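A minimal sketch of how one attacking node might assemble this SQLMAP invocation from Python; the victim URL is hypothetical, and the flags are exactly those listed in the table above.

import subprocess

TARGET = "http://126.52.30.5/form.php?id=1"  # hypothetical vulnerable web form

ENUMERATION_FLAGS = [
    "--banner", "--current-user", "--current-db", "--hostname", "--is-dba",
    "--users", "--passwords", "--privileges", "--roles", "--dbs",
    "--tables", "--columns", "--schema", "--count", "--dump", "--comments",
]

cmd = (["sqlmap", "-u", TARGET, "--level=5", "--risk=3",
        "--random-agent", "--batch", "--answers=follow=Y"] + ENUMERATION_FLAGS)
subprocess.run(cmd, check=False)  # one attack run against one victim node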
Every node executed SQLIAs against 200 victim nodes. The victim nodes deployed a web form vulnerable to Union-type injection attacks, connected to either the MySQL or SQL Server database engine (50% of the victim nodes deployed MySQL and the other 50% SQL Server).
The web service was accessible on ports 443 and 80, the ports typically used to deploy web services. The IP address space was 182.168.1.1/24 for the benign and malicious traffic-generating nodes; for victim nodes, the address space was 126.52.30.0/24. The malicious traffic in the two datasets was collected under different conditions: for D1, SQLIAs were performed using Union attacks on the MySQL and SQL Server databases.
For D2, however, Blind SQL injection attacks were performed against the web form connected to a PostgreSQL database. The IP address spaces of the networks also differed from those of D1: in D2, the address space was 152.148.48.1/24 for the benign and malicious traffic-generating nodes and 140.30.20.1/24 for the victim nodes.
To run the MySQL server, MariaDB version 10.4.12 was used; Microsoft SQL Server 2017 Express and PostgreSQL version 13 were used for the other engines.
License: https://choosealicense.com/licenses/other/
View code: https://colab.research.google.com/drive/1rLk-mdsWsdxwQdYYJS24rAP9KABtbiqu?usp=sharing
Example:
{"messages": [
{"role": "system", "content": "You are a SQL expert assistant. Generate clear, efficient SQL queries based on user requests. Provide only the SQL query without any additional text or explanation."}
{"role": "user", "content": "What are the top 5 most popular genres of music in the database, based on the number of tracks… See the full description on the dataset page: https://huggingface.co/datasets/fknguedia/SQL-GENERATOR-DATASETS.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This project involves analyzing and transforming data from a bike warehouse database using SQL. The goal is to clean, transform, and query the data to generate insights about products, employees, customers, sales, and trends.
The SAP Bikes Sales database contains various tables that represent business data for a bike warehouse, such as information on products, sales, employees, business partners, and more. This project focuses on cleaning and transforming data, optimizing database schema, and generating SQL queries to gain business insights.
1. **Data Cleaning & Transformation**:
   - Remove duplicate records from key tables.
   - Drop unnecessary columns and handle null values.
   - Populate new columns based on existing data.
   - Merge related tables to create new insights.
2. **Business Insights Queries** (an example query appears after this list):
   - Top-selling Products: Identify products with the highest sales quantities and total revenue.
   - Sales Performance by Product Category: Analyze revenue and order counts by product category.
   - Employee Sales Performance: Track employees' contribution to sales volumes and revenue.
   - Customer Segmentation: Examine the number of orders placed by business partners and their total sales value.
   - Sales Trends: Analyze sales trends over time and calculate average order values.
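To make the first insights query concrete, here is a minimal sketch of a top-selling-products query run from Python; the table and column names (SalesOrderItems, PRODUCTID, QUANTITY, TOTALGROSSAMOUNT) are assumptions pieced together from the column names mentioned below, not the project's exact schema.

import sqlite3

# Hypothetical schema: SalesOrderItems(PRODUCTID, QUANTITY, TOTALGROSSAMOUNT)
TOP_PRODUCTS_SQL = """
SELECT PRODUCTID,
       SUM(QUANTITY)         AS total_units,
       SUM(TOTALGROSSAMOUNT) AS total_revenue
FROM SalesOrderItems
GROUP BY PRODUCTID
ORDER BY total_revenue DESC
LIMIT 5;
"""

conn = sqlite3.connect("sap_bikes.db")  # hypothetical database file
for row in conn.execute(TOP_PRODUCTS_SQL):
    print(row)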
- **Addresses Table**:
  - Checked for duplicate ADDRESSID values.
- **BusinessPartners Table**:
  - Handled duplicates and missing or incorrect data.
  - Dropped the unnecessary FAXNUMBER column because it was empty.
- **Employee Table**:
  - Dropped unnecessary columns.
  - Populated NAME_INITIALS based on the employee's first, middle, and last name initials.
  - Fixed column type issues.
- **Product Categories and Product Texts**:
  - Merged the ProductCategories and ProductCategoryText tables into a new CombinedProductCategories table for easy analysis.
- **Products Table**:
  - Dropped irrelevant columns such as WIDTH, DEPTH, HEIGHT, etc.
- **Sales Order Items Table**:
  - Fixed null values in GROSSAMOUNT and created a TOTALGROSSAMOUNT column to track sales volume.
### 2. Database Diagram and Relationships
In addition to the data cleaning and analysis, a database diagram has been created...
This dataset was created to be the base of the data.world SQL tutorial exercises. Data was generated using Synthea, a synthetic patient generator that models the medical history of synthetic patients. Their mission is to output high-quality synthetic, realistic but not real, patient data and associated health records covering every aspect of healthcare. The resulting data is free from cost, privacy, and security restrictions, enabling research with Health IT data that is otherwise legally or practically unavailable. De-identified real data still presents a challenge in the medical field because there are people who excel at re-identification of these data. For that reason the average medical center, etc. will not share their patient data. Most governmental data is at the hospital level. NHANES data is an exception.
You can read Synthea's first academic paper here.
Photo by Rubaitul Azad on Unsplash
Apache License, v2.0: http://www.apache.org/licenses/LICENSE-2.0
This dataset contains 56,964 conversations between an AI assistant and a (fake, generated) "Human" in Dutch, specifically in the domain of programming (Stack Overflow). They are translations of Baize's machine-generated answers to the Stack Overflow dataset.
Data Instances
{'id': 9, 'input': 'Het gesprek tussen de mens en de AI-assistent. ' '[|Human|] Ik heb wat hulp nodig met Spark SQL union en het ordenen ' 'van kolommen. ' '[|AI|] Natuurlijk, ik help je graag. Waar heb je specifiek hulp ' 'bij nodig? ' '[|Human|] Ik wil de output van een UNION-query ordenen op een ' 'kolom die niet aanwezig is in de SELECT-instructie. Is dat ' 'mogelijk? ' '[|AI|] Ja, het is mogelijk met Spark SQL. Je kunt de ordinale ' 'positie van de kolom in de ORDER BY-clausule gebruiken in plaats ' 'van de kolomnaam. Als je bijvoorbeeld wilt ordenen op de derde ' 'kolom in de SELECT-instructie, kun je ORDER BY 3 gebruiken. ' '[|Human|] Dat is nuttig. Kan ik ook een alias gebruiken in de ' 'ORDER BY-clausule? ' '[|AI|] Ja, je kunt een alias gebruiken in de ORDER BY-clausule. De ' 'alias moet worden gedefinieerd in de SELECT-instructie. Als je ' 'bijvoorbeeld de derde kolom als "col3" alias geeft, kun je ordenen ' 'met ORDER BY col3. ' '[|Human|] Oké, ik denk dat ik het begrijp. Kan ik dezelfde alias ' 'gebruiken voor kolommen uit beide tabellen in de UNION? ' '[|AI|] Nee, je kunt dezelfde alias niet gebruiken voor kolommen ' 'uit beide tabellen in de UNION. Elke tabel in de UNION moet een ' 'unieke alias hebben voor de kolommen. ' '[|Human|] ', 'topic': 'Spark SQL UNION - ORDER BY kolom niet in SELECT'},
Data Fields
id: the ID of the item. The following 82 IDs are not included because they could not be translated: [1713, 1937, 1960, 4326, 4356, 8357, 8542, 8827, 9137, 9782, 11560, 11961, 12244, 12362, 12488, 13259, 13621, 14445, 14835, 15006, 17746, 18808, 19285, 19426, 19491, 21270, 21661, 22098, 23352, 23840, 23869, 25148, 25928, 27102, 27856, 28387, 29942, 30041, 30251, 32396, 32742, 32941, 33628, 34116, 34648, 34859, 35977, 35987, 36035, 36456, 37028, 37238, 37640, 38107, 38735, 39015, 40984, 41115, 41567, 42397, 43219, 43783, 44599, 44980, 45239, 47676, 48922, 49534, 50282, 50683, 50804, 50919, 51076, 51211, 52000, 52183, 52489, 52595, 53884, 54726, 55795, 56992]
input: the machine-generated conversation between AI and "Human". Always starts with Het gesprek tussen de mens en de AI-assistent. and has at least one occurrence of both [|AI|] and [|Human|].
topic: the topic description
Dataset Creation
Both the conversations and the topics were translated with OpenAI's API for gpt-3.5-turbo, using max_tokens=1024 and temperature=0 as parameters.
The prompt template to translate the input is (where src_lang was English and tgt_lang Dutch):
CONVERSATION_TRANSLATION_PROMPT = """You are asked to translate a conversation between an AI assistant and a human from {src_lang} into {tgt_lang}.
Here are the requirements that you should adhere to:
1. maintain the format: the conversation consists of the AI (marked as [|AI|]) and the human ([|Human|]) talking in turns and responding to each other;
2. do not translate the speaker identifiers [|AI|] and [|Human|] but always copy them into the translation in appropriate places;
3. ensure accurate translation and keep the correctness of the conversation;
4. make sure that text is fluent to read and does not contain grammatical errors. Use standard {tgt_lang} without regional bias;
5. translate the human's text using informal, but standard, language;
6. make sure to avoid biases (such as gender bias, grammatical bias, social bias);
7. if the human asks to correct grammar mistakes or spelling mistakes then you have to generate a similar mistake in {tgt_lang}, and then also generate a corrected output version for the AI in {tgt_lang};
8. if the human asks to translate text from one to another language, then you only translate the human's question to {tgt_lang} but you keep the translation that the AI provides in the language that the human requested;
9. do not translate code fragments but copy them as they are. If there are English examples, variable names or definitions in code fragments, keep them in English.
Now translate the following conversation with the requirements set out above. Do not provide an explanation and do not add anything else.
"""
The prompt to translate the topic is:
TOPIC_TRANSLATION_PROMPT = (
    "Translate the following title of a conversation from {src_lang} to {tgt_lang} in a succinct,"
    " summarizing manner. Translate accurately and formally. Do not provide any explanation"
    " about the translation and do not include the original title."
)
The system message was:
You are a helpful assistant that translates English to Dutch to the requirements that are given to you.
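A minimal sketch of how one such translation call might have looked with the legacy (pre-1.0) openai Python client; the wrapper function is our assumption, while the model name, max_tokens, temperature, system message, and prompt template come from the description above.

import openai  # legacy (pre-1.0) client interface

SYSTEM_MESSAGE = ("You are a helpful assistant that translates English to Dutch "
                  "to the requirements that are given to you.")

def translate_conversation(conversation: str, src_lang: str = "English",
                           tgt_lang: str = "Dutch") -> str:
    prompt = CONVERSATION_TRANSLATION_PROMPT.format(src_lang=src_lang, tgt_lang=tgt_lang)
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        max_tokens=1024,
        temperature=0,
        messages=[
            {"role": "system", "content": SYSTEM_MESSAGE},
            {"role": "user", "content": prompt + conversation},
        ],
    )
    return response["choices"][0]["message"]["content"]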
Note that 82 items (0.1%) were not successfully translated: the translation was missing the AI identifier [|AI|] and/or the human one [|Human|]. The IDs of these missing items are the 82 IDs listed under id in the Data Fields section above.
The translation quality has not been verified. Use at your own risk!
Licensing Information
Licensing info for Stack Overflow Questions is listed as Apache 2.0. If you use the current dataset, you should also adhere to the original license.
This text was generated (either in part or in full) with GPT-3 (gpt-3.5-turbo), OpenAI’s large-scale language-generation model. Upon generating draft language, the author reviewed, edited, and revised the language to their own liking and takes ultimate responsibility for the content of this publication.
If you use this dataset, you must also follow the Sharing and Usage policies.
As clearly stated in their Terms of Use, specifically 2c.iii, "[you may not] use output from the Services to develop models that compete with OpenAI". That means that you cannot use this dataset to build models that are intended to commercially compete with OpenAI. As far as I am aware, that is a specific restriction that should serve as an addendum to the current license.
This dataset is also available on the Hugging Face hub with the same DOI and license. See that README for more info.
Privacy policy: https://www.archivemarketresearch.com/privacy-policy
The global database testing tool market is anticipated to experience substantial growth in the coming years, driven by factors such as the increasing adoption of cloud-based technologies, the rising demand for data quality and accuracy, and the growing complexity of database systems. The market is expected to reach a value of USD 1,542.4 million by 2033, expanding at a CAGR of 7.5% during the forecast period of 2023-2033.

Key players in the market include Apache JMeter, DbFit, SQLMap, Mockup Data, SQL Test, NoSQLUnit, Orion, ApexSQL, QuerySurge, DBUnit, DataFactory, DTM Data Generator, Oracle, SeLite, SLOB, and others.

The North American region is anticipated to hold a significant share of the database testing tool market, followed by Europe and Asia Pacific. The increasing adoption of cloud-based database testing services, the presence of key market players, and the growing demand for data testing and validation are driving the market growth in North America. Asia Pacific, on the other hand, is expected to experience the highest growth rate due to the rapidly increasing IT spending, the emergence of new technologies, and the growing number of businesses investing in data quality management solutions.
CC0 1.0 Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
By Huggingface Hub [source]
A large crowd-sourced dataset for developing natural language interfaces for relational databases. WikiSQL is a dataset of 80,654 hand-annotated examples of questions and SQL queries distributed across 24,241 tables from Wikipedia.
For more datasets, click here.
This dataset can be used to develop natural language interfaces for relational databases. The data fields are the same among all splits, and each file contains information on the phase, question, table, and SQL query for each example.
- This dataset can be used to develop natural language interfaces for relational databases.
- This dataset can be used to develop a knowledge base of common SQL queries.
- This dataset can be used to generate a training set for a neural network that translates natural language into SQL queries.
If you use this dataset in your research, please credit the original authors.
License
License: CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. No copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
Files: train.csv, validation.csv, test.csv. All three files share the same columns:

| Column name | Description |
|:--------------|:---------------------------------------------------------|
| phase | The phase of the data collection. (String) |
| question | The question asked by the user. (String) |
| table | The table containing the data for the question. (String) |
| sql | The SQL query corresponding to the question. (String) |
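A minimal sketch of loading one split and inspecting these columns with pandas, assuming the files sit in the working directory under the names above:

import pandas as pd

df = pd.read_csv("validation.csv")  # likewise train.csv and test.csv
print(df.columns.tolist())          # ['phase', 'question', 'table', 'sql']
print(df.loc[0, "question"], "->", df.loc[0, "sql"])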
If you use this dataset in your research, please credit the original authors and Huggingface Hub.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Image generated by DALL-E.
synthetic_text_to_sql
gretelai/synthetic_text_to_sql is a rich dataset of high quality synthetic Text-to-SQL samples, designed and generated using Gretel Navigator, and released under Apache 2.0. Please see our release blogpost for more details. The dataset includes:
- 105,851 records partitioned into 100,000 train and 5,851 test records
- ~23M total tokens, including ~12M SQL tokens
- Coverage across 100 distinct… See the full description on the dataset page: https://huggingface.co/datasets/gretelai/synthetic_text_to_sql.
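A minimal sketch of pulling the dataset with the Hugging Face datasets library; the train/test split names are assumed from the partition described above.

from datasets import load_dataset

ds = load_dataset("gretelai/synthetic_text_to_sql")
print(ds["train"].num_rows, ds["test"].num_rows)  # expected: 100000, 5851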
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is synthetically generated fake data designed to simulate a realistic e-commerce environment.
To provide large-scale relational datasets for practicing database operations, analytics, and testing tools like DuckDB, Pandas, and SQL engines. Ideal for benchmarking, educational projects, and data engineering experiments.
Customers:
- (int): Unique identifier for each customer
- (string): Customer full name
- (string): Customer email address
- (string): Customer gender ('Male', 'Female', 'Other')
- (date): Date customer signed up
- (string): Customer country of residence

Products:
- (int): Unique identifier for each product
- (string): Name of the product
- (string): Product category (e.g., Electronics, Books)
- (float): Price per unit
- (int): Available stock count
- (string): Product brand name

Orders:
- (int): Unique identifier for each order
- (int): ID of the customer who placed the order (foreign key to Customers)
- (date): Date when order was placed
- (float): Total amount for the order
- (string): Payment method used (Credit Card, PayPal, etc.)
- (string): Country where the order is shipped

Order Items:
- (int): Unique identifier for each order item
- (int): ID of the order this item belongs to (foreign key to Orders)
- (int): ID of the product ordered (foreign key to Products)
- (int): Number of units ordered
- (float): Price per unit at order time

Reviews:
- (int): Unique identifier for each review
- (int): ID of the reviewed product (foreign key to Products)
- (int): ID of the customer who wrote the review (foreign key to Customers)
- (int): Rating score (1 to 5)
- (string): Text content of the review
- (date): Date the review was written

An entity-relationship diagram of these tables accompanies the dataset.
The script saves two folders inside the specified output path:
csv/ # CSV files
parquet/ # Parquet files
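A minimal sketch of querying the Parquet output with DuckDB; the folder layout comes from above, but the file and column names are assumptions, since the schema listing does not give the exact field names.

import duckdb

# Hypothetical file and column names for the Orders and Order Items tables
top_countries = duckdb.sql("""
    SELECT o.ship_country, SUM(oi.quantity * oi.unit_price) AS revenue
    FROM 'parquet/orders.parquet' AS o
    JOIN 'parquet/order_items.parquet' AS oi ON oi.order_id = o.order_id
    GROUP BY o.ship_country
    ORDER BY revenue DESC
    LIMIT 5
""")
print(top_countries)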
MIT License
According to our latest research, the global Data Access Auditing market size in 2024 is valued at USD 2.18 billion, reflecting robust demand for advanced data governance and compliance solutions across diverse industries. The market is expected to grow at a CAGR of 13.7% from 2025 to 2033, reaching a forecasted value of USD 6.47 billion by 2033. This impressive growth is propelled by increasing regulatory requirements, heightened cybersecurity concerns, and the exponential rise in data generation and consumption worldwide. As per our latest research, organizations are prioritizing data transparency and security, further accelerating the adoption of data access auditing solutions globally.
The primary growth factor for the Data Access Auditing market is the escalating regulatory landscape that mandates stringent data monitoring and access control. With the proliferation of data privacy laws such as the General Data Protection Regulation (GDPR), California Consumer Privacy Act (CCPA), and the Health Insurance Portability and Accountability Act (HIPAA), organizations across sectors are compelled to implement comprehensive auditing mechanisms. These regulations require enterprises to not only safeguard sensitive data but also maintain detailed logs of data access events for compliance and forensic analysis. Consequently, the need for automated, scalable, and real-time data access auditing solutions has surged, enabling organizations to mitigate risks, avoid hefty penalties, and foster trust among stakeholders.
Another significant driver for the Data Access Auditing market is the dramatic increase in data breaches and cyberattacks targeting critical business information. As enterprises transition to hybrid and multi-cloud environments, the complexity of managing data access rights and monitoring user activities has intensified. Cybersecurity incidents, including insider threats and unauthorized data exfiltration, are pushing organizations to adopt sophisticated auditing tools capable of providing granular visibility into who accessed what data, when, and from where. These solutions not only help detect suspicious activities in real-time but also support incident response and post-breach investigations, thereby enhancing the overall security posture of organizations.
The rapid digitization of business processes and the integration of emerging technologies such as artificial intelligence, machine learning, and the Internet of Things (IoT) are further fueling the growth of the Data Access Auditing market. Enterprises are generating and processing massive volumes of structured and unstructured data, necessitating robust auditing frameworks to ensure data integrity and prevent unauthorized access. The convergence of digital transformation initiatives with data-centric business models is compelling organizations to invest in advanced auditing solutions that offer scalability, automation, and seamless integration with existing IT infrastructures. This trend is particularly pronounced in sectors such as financial services, healthcare, and government, where data sensitivity and compliance obligations are paramount.
In the context of increasing cybersecurity threats and regulatory demands, SQL Query Audit Tools have emerged as vital components in the arsenal of data access auditing solutions. These tools are designed to provide detailed insights into database activities by monitoring and logging SQL queries executed within an organization's database environment. By capturing comprehensive information about who executed a query, what data was accessed, and when the access occurred, SQL Query Audit Tools enable organizations to maintain stringent oversight over their data assets. This capability is crucial for detecting unauthorized access attempts, ensuring compliance with data protection regulations, and supporting forensic investigations in the event of a data breach. As organizations continue to grapple with complex data environments, the integration of SQL Query Audit Tools into broader data governance frameworks is becoming increasingly essential for maintaining robust security postures.
From a regional perspective, North America dominates the Data Access Auditing market, accounting for the largest revenue share in 2024, driven by the presence of major technology vendors…
CC0 1.0 Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
A small, fantasy-themed dataset created for learning and practice. It contains 100 adventurers with randomized names, races, classes, kingdoms, levels, and gold amounts for experimenting with SQL and Python.
SQL Table Creation
CREATE TABLE Adventurers (
    id INTEGER PRIMARY KEY,
    name TEXT,
    last_name TEXT,
    race TEXT,
    class TEXT,
    kingdom TEXT,
    level INTEGER,
    gold INTEGER
);
Python Names and Attributes
import random
names = [
'Aster', 'Antara',
'Boldak', 'Balta',
'Corpik', 'Carnda',
'Defham', 'Dinsa',
'Enemaliz', 'Enderna',
'Fansom', 'Falma',
'Garol', 'Gerna',
'Hant', 'Hisa',
'Isder', 'Isba',
'Jansos', 'Juna',
'Kornak', 'Kaira',
'Lomrik', 'Lysara',
'Mordak', 'Mina',
'Nerok', 'Nalia',
'Orim', 'Ossara',
'Pendar', 'Pella',
'Quarn', 'Quila',
'Rundak', 'Rysa',
'Sornik', 'Selna',
'Tarnak', 'Tira',
'Uldan', 'Ursa',
'Vornik', 'Vela',
'Wendak', 'Wira',
'Xalor', 'Xina',
'Yorik', 'Ysera',
'Zandor', 'Zyra'
]
last_names = [
'Ashford', 'Blackwood',
'Coldmoor', 'Duskbane',
'Eboncrest', 'Frosthelm',
'Gloomshade', 'Hawkridge',
'Ironfist', 'Jadeeye',
'Keenblade', 'Lightfoot',
'Moonstrike', 'Nightfall',
'Oakheart', 'Proudmore',
'Quickwater', 'Ravenhill',
'Stormborn', 'Thornfield',
'Umbermoor', 'Valewind',
'Windrider', 'Xiloscient',
'Yewbranch', 'Zephyrine'
]
races = ['Human', 'Elf', 'Orc', 'Dwarf']
classes = ['Fighter', 'Ranger', 'Thief', 'Assassin', 'Mage', 'Bard', 'Merchant']
kingdoms = [
'Ironhold',
'Silverwood',
'Bloodfang',
'Stormwind',
'Shadowmere',
'Frostveil',
'Duskmoor',
'Emberfall'
]
SQL Command Generator
for i in range(1, 101):
    name = random.choice(names)
    last_name = random.choice(last_names)
    race = random.choice(races)
    _class = random.choice(classes)
    kingdom = random.choice(kingdoms)
    level = random.randint(1, 100)
    gold = random.randint(0, 10000)
    print(f"INSERT INTO Adventurers (id, name, last_name, race, class, kingdom, level, gold) VALUES ({i}, '{name}', '{last_name}', '{race}', '{_class}', '{kingdom}', {level}, {gold});")
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Database model for storing output of Feature Generator API. Includes some sample data. (SQL)
Privacy policy: https://dataintelo.com/privacy-and-policy
According to our latest research, the global Distributed SQL Database market size reached USD 1.75 billion in 2024, marking a significant milestone in the evolution of enterprise data management. With a robust compound annual growth rate (CAGR) of 27.3% from 2025 to 2033, the market is projected to soar to USD 12.5 billion by 2033. This impressive growth trajectory is primarily fueled by the surging demand for scalable, resilient, and highly available database solutions across diverse sectors, driven by the exponential increase in data volumes and the necessity for real-time analytics in mission-critical applications.
The primary growth factor underpinning the expansion of the Distributed SQL Database market is the escalating requirement for high availability and fault tolerance in enterprise IT environments. Modern organizations are increasingly adopting distributed architectures to ensure uninterrupted business operations, even in the face of hardware failures or network outages. Distributed SQL databases, with their inherent capability to replicate data across multiple nodes and geographies, offer a compelling solution for enterprises seeking to minimize downtime and data loss. This demand is further amplified by the proliferation of cloud-native applications and microservices architectures, where traditional monolithic databases struggle to keep pace with the needs of dynamic, distributed workloads.
Another key driver for the Distributed SQL Database market is the rapid digital transformation initiatives being undertaken across industries such as BFSI, retail, healthcare, and manufacturing. Enterprises are leveraging distributed SQL databases to enable real-time analytics, support omnichannel customer experiences, and meet stringent regulatory requirements for data integrity and security. The increasing adoption of Internet of Things (IoT) devices and edge computing is also generating vast amounts of decentralized data, necessitating distributed database solutions that can seamlessly scale and process information at the edge while maintaining transactional consistency and global visibility.
Moreover, the growing preference for hybrid and multi-cloud strategies is accelerating the adoption of distributed SQL databases. As organizations seek to avoid vendor lock-in and optimize their IT infrastructure for cost, performance, and compliance, distributed SQL databases provide the flexibility to deploy workloads across on-premises, public cloud, and edge environments. This flexibility not only enhances operational agility but also empowers enterprises to respond swiftly to changing business requirements and regulatory landscapes. The ability of distributed SQL databases to offer strong consistency, horizontal scalability, and global data distribution is positioning them as a foundational technology in the era of digital business.
From a regional perspective, North America currently dominates the Distributed SQL Database market, accounting for the largest share in 2024, driven by the presence of leading technology vendors, early adoption of cloud-native solutions, and substantial investments in digital infrastructure. Asia Pacific, however, is emerging as the fastest-growing region, propelled by rapid economic development, expanding digital ecosystems, and increasing adoption of advanced data management solutions in countries such as China, India, and Japan. Europe and Latin America are also witnessing steady growth, supported by digital transformation initiatives and the rising demand for real-time data analytics across various sectors.
The Distributed SQL Database market is segmented by component into Software and Services, with each category playing a vital role in the overall ecosystem. The software segment, encompassing database engines, management tools, and integration platforms, accounted for the lion’s share of the market revenue in 2024. This dominance can be attributed to the continuous innovation in database architectures, improvements in query optimization, and the integration of advanced features such as automated failover, distributed transactions, and real-time analytics. Vendors are focusing on enhancing their software offerings to support a wide array of deployment scenarios, including hybrid cloud, multi-cloud, and edge environments, which is further boosting the demand for robust distributed…
NYSERDA deployed Emergency Generators and Transfer Switches at Retail Gas Stations as part of the Fuel-NY initiative through its Gas Station Back-up Power Program and Permanent Generator Program using FEMA Hazard Mitigation Grant Program funding. This dataset is a complete listing of all installations under the two programs, which ran from 6/1/2013 through 1/26/2019. This dataset contains business names, locations, and the type of installation for program participants.
These efforts improved the resiliency of the retail gasoline system in the downstate New York area by providing back up power options to gas stations.
These stations have either the capability to easily accept backup power or have a backup generator permanently installed on site. The equipment allows the stations to mitigate the effects of power outages, increase the availability of fuel, and support first responders during and after emergencies and severe weather events. Additional socioeconomic benefits are achieved by supporting residents and early recovery efforts with continued access to fuel as well as food, grocery, and necessary household items, increasing the resiliency capabilities of their communities.
The New York State Energy Research and Development Authority (NYSERDA) offers objective information and analysis, innovative programs, technical expertise, and support to help New Yorkers increase energy efficiency, save money, use renewable energy, accelerate economic growth, and reduce reliance on fossil fuels. To learn more about NYSERDA’s programs, visit nyserda.ny.gov or follow us on X, Facebook, YouTube, or Instagram.
Splitgraph serves as an HTTP API that lets you run SQL queries directly on this data to power Web applications. For example:
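A hypothetical sketch of such a query from Python; the endpoint URL, payload shape, and column names here are illustrative assumptions, not Splitgraph's documented API.

import requests

resp = requests.post(
    "https://data.splitgraph.com/sql/query",  # assumed endpoint, for illustration only
    json={"sql": "SELECT business_name, county FROM installations LIMIT 10"},
    timeout=30,
)
print(resp.json())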
See the Splitgraph documentation for more information.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset is a Portuguese-translated version of the b-mc2/sql-create-context dataset, which was constructed from the WikiSQL and Spider datasets. It contains examples of questions in Portuguese, SQL CREATE TABLE statements, and SQL queries that answer the questions using the CREATE TABLE statement as context.
The main goal of this dataset is to assist Portuguese natural language models in generating precise and contextualized SQL queries, preventing the hallucination of column and table names, a common issue in text-to-SQL datasets. By providing only the CREATE TABLE statement as context, the dataset aims to better ground the models without the need to provide actual data rows, limiting token use and exposure to private, sensitive, or proprietary data.
The questions were translated into Portuguese using the facebook/nllb-200-distilled-1.3B model, ensuring that the natural language queries maintain the same meaning and context as the original English questions.
This dataset is ideal for training natural language models for SQL query generation, especially in scenarios where accuracy in naming columns and tables is crucial. It can be used to enhance model performance in text-to-SQL tasks, providing clear context and avoiding common hallucination errors.
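A minimal sketch of assembling a training example from this dataset into a model prompt; the field names (question, context, answer) are assumptions carried over from the b-mc2/sql-create-context schema it was translated from, and the example row is invented for illustration.

def build_prompt(example: dict) -> str:
    # Assumed fields: 'question' (Portuguese), 'context' (CREATE TABLE ...), 'answer' (SQL)
    return (f"-- Contexto:\n{example['context']}\n"
            f"-- Pergunta: {example['question']}\n"
            f"-- SQL:\n")

example = {
    "question": "Quantos clientes existem?",
    "context": "CREATE TABLE clientes (id INTEGER, nome TEXT)",
    "answer": "SELECT COUNT(*) FROM clientes",
}
print(build_prompt(example) + example["answer"])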
@misc{b-mc2_2023_sql-create-context,
title = {sql-create-context Dataset},
author = {b-mc2},
year = {2023},
url = {https://huggingface.co/datasets/b-mc2/sql-create-context},
note = {This dataset was created by modifying data from the following sources: \cite{zhongSeq2SQL2017, yu2018spider}.},
}
@article{zhongSeq2SQL2017,
author = {Victor Zhong and Caiming Xiong and Richard Socher},
title = {Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning},
journal = {CoRR},
volume = {abs/1709.00103},
year = {2017}
}
@article{yu2018spider,
title = {Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-sql task},
author = {Yu, Tao and Zhang, Rui and Yang, Kai and Yasunaga, Michihiro and Wang, Dongxu and Li, Zifan and Ma, James and Li, Irene and Yao, Qingning and Roman, Shanelle and others},
journal = {arXiv preprint arXiv:1809.08887},
year = {2018}
}
Privacy policy: https://researchintelo.com/privacy-and-policy
According to our latest research, the global Cloud SQL market size in 2024 stands at USD 7.8 billion, reflecting robust adoption across industries. The market is poised for significant expansion, projected to reach USD 32.5 billion by 2033, growing at a compelling CAGR of 17.2% during the forecast period. This remarkable growth is primarily driven by the increasing demand for scalable, flexible, and cost-efficient database management solutions that support digital transformation initiatives worldwide.
A primary growth factor for the Cloud SQL market is the accelerating shift toward cloud-based infrastructure in organizations of all sizes. Enterprises are increasingly migrating their data workloads to the cloud to leverage benefits such as reduced operational costs, enhanced scalability, and improved data accessibility. Cloud SQL solutions, with their managed database services, eliminate the need for manual database maintenance and updates, thereby allowing IT teams to focus on core business activities. Furthermore, the proliferation of data from IoT devices, mobile applications, and digital services is generating an unprecedented amount of structured and unstructured data, necessitating robust database solutions that can seamlessly scale with demand. As organizations prioritize agility and innovation, the adoption of Cloud SQL platforms is becoming integral to their IT strategies.
Another significant driver is the growing emphasis on data security, compliance, and disaster recovery. Cloud SQL services offer advanced security features, including data encryption, automated backups, and multi-region replication, ensuring business continuity and regulatory compliance. The rise in cyber threats and stringent data protection regulations such as GDPR and HIPAA have made secure data management a top priority for enterprises. By leveraging Cloud SQL, organizations can mitigate the risks associated with data breaches and ensure that their critical business information is protected against potential threats. Additionally, the ability to automate backup and recovery processes reduces downtime and safeguards against data loss, further enhancing the value proposition of cloud-based SQL databases.
The integration of advanced analytics and artificial intelligence is also catalyzing the expansion of the Cloud SQL market. Organizations are increasingly harnessing the power of business intelligence and analytics tools to extract actionable insights from their data. Cloud SQL platforms facilitate seamless integration with analytics solutions, enabling real-time data processing and visualization. This capability is particularly valuable for industries such as retail, healthcare, and BFSI, where timely insights can drive better decision-making and competitive advantage. As digital transformation accelerates, the need for agile, intelligent, and data-driven operations will continue to fuel the adoption of Cloud SQL solutions across diverse sectors.
From a regional perspective, North America currently dominates the Cloud SQL market, accounting for the largest share in 2024, driven by the presence of leading cloud service providers, rapid technological advancements, and high digital adoption rates. Europe follows closely, propelled by stringent data privacy regulations and strong demand from sectors such as BFSI and healthcare. The Asia Pacific region is anticipated to witness the fastest growth, with a CAGR exceeding 19%, fueled by increasing cloud adoption among SMEs, government digitalization initiatives, and a burgeoning IT services sector. Meanwhile, Latin America and the Middle East & Africa are gradually catching up, with growing investments in cloud infrastructure and digital transformation projects.
The Cloud SQL market is broadly segmented by database type into relational and non-relational databases. Relational databases, such as MySQL, PostgreSQL, and Microsoft SQL Server, continue to dominate the market due to their widespread use in transactional applications and enterprise workloads. These databases are prized for their ability to maintain data integrity, support complex queries, and provide consistent performance. Businesses in industries like BFSI, healthcare, and retail rely heavily on relational databases for mission-critical applications where data accuracy and reliability are paramount. The demand for managed relational database services in the cloud is further boosted by the need for seamless migration from…
Privacy policy: https://dataintelo.com/privacy-and-policy
According to our latest research, the global SQL Performance Tuning Tools market size reached USD 1.42 billion in 2024, exhibiting robust expansion driven by the surging need for optimized database management and real-time analytics across enterprises. The market is poised to grow at a CAGR of 9.7% from 2025 to 2033, with the forecasted value expected to hit USD 3.27 billion by 2033. This growth is primarily attributed to the increasing complexity of database environments, the proliferation of data-driven applications, and the urgent demand for high availability and efficiency in mission-critical business operations. As organizations continue to digitize and scale their infrastructure, SQL performance tuning tools are becoming indispensable for ensuring seamless data processing and superior user experiences.
A significant growth factor for the SQL Performance Tuning Tools market is the exponential increase in data volumes generated by organizations worldwide. Enterprises are embracing digital transformation initiatives, leading to a surge in transactional and analytical workloads that demand high-performing databases. SQL performance tuning tools play a pivotal role in identifying, diagnosing, and resolving performance bottlenecks within SQL queries and database configurations. With the adoption of advanced analytics, artificial intelligence, and machine learning, organizations are generating and processing more data than ever before, necessitating robust tools to ensure optimal database performance. This trend is particularly pronounced in sectors such as BFSI, healthcare, and e-commerce, where data-driven decision-making and real-time insights are critical for competitive advantage.
Another key driver is the growing complexity of IT environments, particularly with the rise of hybrid and multi-cloud deployments. As enterprises migrate workloads to cloud platforms and integrate on-premises systems with cloud-based solutions, managing and tuning SQL databases becomes increasingly challenging. SQL performance tuning tools enable IT teams to monitor and optimize database performance across diverse and distributed environments, ensuring consistency, reliability, and scalability. These tools offer advanced features such as automated query optimization, real-time monitoring, and predictive analytics, which are essential for maintaining service-level agreements (SLAs) and minimizing downtime. The increasing reliance on cloud infrastructure, coupled with the need for agile and resilient database management, is expected to further propel market growth.
The expanding ecosystem of database technologies and the proliferation of open-source SQL databases are also fueling demand for performance tuning solutions. Organizations are adopting a wide range of relational and non-relational databases to support diverse workloads, leading to greater heterogeneity in database environments. This diversity introduces new challenges in performance management, as traditional tuning methods may not be effective across different platforms. SQL performance tuning tools are evolving to support a broad spectrum of database engines, providing unified visibility and optimization capabilities. As businesses strive to deliver high-quality digital experiences and minimize operational costs, the adoption of advanced tuning tools is becoming a strategic imperative.
From a regional perspective, North America continues to dominate the SQL Performance Tuning Tools market, accounting for the largest share in 2024. This leadership is driven by the presence of major technology vendors, a mature IT infrastructure, and early adoption of advanced database management solutions. Europe and Asia Pacific are also witnessing rapid growth, fueled by increasing investments in digital transformation, expanding IT services sectors, and the rise of cloud computing. The Asia Pacific region, in particular, is expected to exhibit the highest CAGR during the forecast period, supported by the proliferation of SMEs, growing e-commerce activities, and government initiatives to promote digital innovation. Meanwhile, Latin America and the Middle East & Africa are emerging as promising markets, albeit at a relatively nascent stage, as organizations in these regions modernize their IT landscapes and embrace data-driven strategies.
The SQL Performance Tuning Tools market by component is broadly segmented into software and services…
Privacy policy: https://www.archivemarketresearch.com/privacy-policy
The Online Transaction Processing (OLTP) market is experiencing robust growth, driven by the increasing adoption of cloud-based solutions, the proliferation of mobile and IoT devices generating massive transactional data, and the rising demand for real-time data processing across diverse industries. Let's assume, for illustrative purposes, a 2025 market size of $150 billion, with a Compound Annual Growth Rate (CAGR) of 12% projected for the forecast period of 2025-2033. This signifies a substantial expansion of the market, reaching an estimated value exceeding $400 billion by 2033. Key drivers include the need for enhanced operational efficiency, improved customer experience through faster transaction processing, and the ability to leverage real-time data for informed decision-making. The increasing adoption of advanced technologies like in-memory databases and distributed databases further fuels this growth.

Significant trends shaping the OLTP market include the shift towards cloud-based deployment models, owing to their scalability, cost-effectiveness, and ease of management. The growing demand for high-availability and fault-tolerant systems is also pushing innovation in database technologies. The integration of artificial intelligence (AI) and machine learning (ML) for predictive analytics and fraud detection within OLTP systems is another key trend gaining momentum. While the market faces certain restraints like data security concerns, integration complexities, and the need for skilled professionals, the overall growth trajectory remains positive, driven by strong market demand and technological advancements. The segment analysis shows a significant contribution from cloud-based OLTP solutions; the market is highly competitive, with key players constantly innovating to maintain their market share.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This is a set of SQL databases containing information about molecules and radicals, with the following database conventions and content.
Each database is named as “CnHm”, where n and m range from 1 to 5 and indicate the number of carbon (C) and hydrogen (H) atoms in the structures in the database.
Each database contains entries for a large number of “CnHm” geometries. Within each database are four tables.
Table “meta”: Contains the name of the database and the date it was created.

Table “xyz”: The columns of the “xyz” table are the following:
“id”: a numerical identification number, integer
“calc_params”: metadata describing the level of theory and other details of the quantum chemical calculations to generate the Hessian used for generating this structure, numpy array stored as a blob.
“calc”: software used for the calculation, string
“temp”: the temperature used during the normal mode sampling process to generate the structure in K, outliers are assigned negative temperature, float
“name”: a unique name describing the anchor point the structure was generated from, string
“dist”: the normalized unitless distance of the structure from its anchor point at the temperature given in “temp”, float
“geometry”: atomic coordinates in angstroms, (n+m) by 3 numpy array of floats stored as a blob. Note that the atomic positions are listed with carbons first followed by hydrogens.
“created_at”: date the structure was generated
Table “energy”: The columns of the “energy” table are the following:
“id”: a numerical identification number, not linked to the “id” in “xyz”, integer
“fidelity”: the fidelity level the energy was calculated at, integer
0 = B3LYP/6-31+G(d)
1 = wB97X-D/6-311++G(d,p)
2 = HF/6-31G
3 = B3LYP/6-31G
4 = B2PLYPD3/6-311++G(d,p)
“E”: molecular energy in eV, float
“xyz_id”: the “id” of the geometry in the “xyz” table this energy calculated for, integer
“hessian”: empty
“forces”: atomic forces in eV/angstrom, (n+m) by 3 numpy array of floats stored as a blob. Note that the atomic forces are listed in the same order as the atoms in the “geometry” in the “xyz” table
“calc_params”: metadata describing the level of theory and other details of the energy and force calculations of this entry, numpy array stored as a blob
“calc”: software used for the energy and force calculation
“created_at”: date the energy and forces were calculated
“sample_set_id”: empty
Table “aev”: currently empty
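A minimal sketch of reading one of these databases with Python's sqlite3, joining energies to their geometries and unpacking the blob-stored arrays; the C2H4.db filename and the float64 dtype are assumptions (n + m = 6 atoms for C2H4).

import sqlite3
import numpy as np

N_ATOMS = 2 + 4                    # n carbons + m hydrogens in "C2H4"
conn = sqlite3.connect("C2H4.db")  # hypothetical filename for a CnHm database

rows = conn.execute("""
    SELECT e.E, x.geometry
    FROM energy AS e
    JOIN xyz AS x ON x.id = e.xyz_id
    WHERE e.fidelity = 1           -- 1 = wB97X-D/6-311++G(d,p)
    LIMIT 5
""")
for energy_ev, geom_blob in rows:
    # "geometry" is an (n+m) x 3 float array stored as a blob; dtype assumed float64
    geometry = np.frombuffer(geom_blob, dtype=np.float64).reshape(N_ATOMS, 3)
    print(f"E = {energy_ev:.4f} eV, first atom at {geometry[0]}")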
Privacy policy: https://dataintelo.com/privacy-and-policy
According to our latest research, the SQL Query Audit Tools market size reached USD 1.26 billion in 2024, reflecting robust adoption across multiple industries. The market is projected to expand at a CAGR of 13.2% from 2025 to 2033, culminating in a forecasted market value of USD 3.69 billion by 2033. This substantial growth trajectory is primarily driven by the escalating demand for robust database security and compliance solutions in an era marked by increasingly stringent data privacy regulations and an upsurge in cyber threats targeting sensitive business information.
One of the most significant growth factors for the SQL Query Audit Tools market is the rising complexity and volume of enterprise data. Organizations across sectors are generating and handling massive amounts of structured and unstructured data, necessitating advanced auditing mechanisms to ensure data integrity, compliance, and security. The proliferation of digital transformation initiatives, cloud migration, and the adoption of big data analytics have further underscored the need for sophisticated tools capable of auditing SQL queries in real-time. These tools not only help organizations identify suspicious activities and unauthorized access but also play a pivotal role in maintaining regulatory compliance, especially in highly regulated industries such as banking, financial services, and healthcare.
The growing regulatory landscape is another key driver propelling the SQL Query Audit Tools market. Governments and regulatory bodies worldwide have introduced stringent data protection laws such as GDPR, HIPAA, and CCPA, compelling organizations to implement comprehensive audit trails for all database activities. SQL query audit tools offer granular visibility into database transactions, enabling companies to demonstrate compliance and avoid hefty fines associated with non-compliance. Furthermore, as cyberattacks become more sophisticated, organizations are increasingly recognizing the value of proactive monitoring and auditing solutions that can detect anomalies, prevent data breaches, and support forensic investigations in the event of security incidents.
Technological advancements and the integration of artificial intelligence and machine learning into SQL query audit tools are also fueling market expansion. Modern solutions are leveraging AI-driven analytics to automate anomaly detection, streamline compliance reporting, and enhance the accuracy of security alerts. Additionally, the shift towards cloud-based deployments is making these tools more accessible to small and medium enterprises (SMEs), which historically faced barriers due to high upfront costs and resource constraints. The combination of technological innovation, regulatory pressure, and the increasing importance of data governance is expected to sustain the strong growth momentum of the SQL Query Audit Tools market in the coming years.
Regionally, North America currently dominates the SQL Query Audit Tools market, accounting for the largest share in 2024, followed by Europe and the Asia Pacific. The United States, in particular, is witnessing significant adoption driven by the presence of large enterprises, advanced IT infrastructure, and a highly regulated business environment. Europe is also experiencing robust growth, fueled by stringent data protection regulations and increasing investments in cybersecurity solutions. Meanwhile, the Asia Pacific region is poised for the fastest growth over the forecast period, supported by rapid digitalization, expanding IT and telecommunications sectors, and rising awareness about data security among enterprises in emerging economies such as China and India.
The Component segment of the SQL Query Audit Tools market is bifurcated into software and services, each playing a critical role in the overall ecosystem. Software solutions form the backbone of the market, encompassing standalone audit tools, integrated database management platforms, and advanced analytics engines. These software offerings are designed to monitor, log, and analyze SQL queries in real-time, providing detailed audit trails and actionable insights for security, compliance, and performance optimization. The demand for feature-rich, scalable, and user-friendly software is on the rise as organizations seek to automate audit processes and minimize manual intervention.