The Sakila sample database is a fictitious database designed to represent a DVD rental store. The tables of the database include film, film_category, actor, customer, rental, payment and inventory among others. The Sakila sample database is intended to provide a standard schema that can be used for examples in books, tutorials, articles, samples, and so forth. Detailed information about the database can be found on the MySQL website: https://dev.mysql.com/doc/sakila/en/
Sakila for SQLite is part of the sakila-sample-database-ports project, which provides ported versions of the original MySQL database for other database systems.
Sakila for SQLite is a port of the Sakila example database available for MySQL, which was originally developed by Mike Hillyer of the MySQL AB documentation team. The project is designed to help database administrators decide which database to use when developing new products: the same SQL can be run against different kinds of databases and the performance compared.
License: BSD Copyright DB Software Laboratory http://www.etl-tools.com
Note: Part of the insert scripts were generated by Advanced ETL Processor http://www.etl-tools.com/etl-tools/advanced-etl-processor-enterprise/overview.html
Information about the project and the downloadable files can be found at: https://code.google.com/archive/p/sakila-sample-database-ports/
Other versions and developments of the project can be found at: https://github.com/ivanceras/sakila/tree/master/sqlite-sakila-db
https://github.com/jOOQ/jOOQ/tree/main/jOOQ-examples/Sakila
Direct access to the MySQL Sakila database, which does not require installation of MySQL (queries can be typed directly in the browser), is provided on the phpMyAdmin demo version website: https://demo.phpmyadmin.net/master-config/
The files in the sqlite-sakila-db folder are the script files which can be used to generate the SQLite version of the database. For convenience, the script files have already been run in cmd to generate the sqlite-sakila.db file, as follows:
sqlite> .open sqlite-sakila.db # creates the .db file
sqlite> .read sqlite-sakila-schema.sql # creates the database schema
sqlite> .read sqlite-sakila-insert-data.sql # inserts the data
Therefore, the sqlite-sakila.db file can be directly loaded into SQLite3 and queries can be directly executed. You can refer to my notebook for an overview of the database and a demonstration of SQL queries. Note: Data about the film_text table is not provided in the script files, thus the film_text table is empty. Instead the film_id, title and description fields are included in the film table. Moreover, the Sakila Sample Database has many versions, so an Entity Relationship Diagram (ERD) is provided to describe this specific version. You are advised to refer to the ERD to familiarise yourself with the structure of the database.
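As a rough illustration of querying the database from code, the sketch below uses Python's standard sqlite3 module. To stay runnable without the download, it builds a tiny in-memory stand-in for the film table containing only the three columns named above (film_id, title, description); the rows and the helper name find_films are made up for the example. With the real file, the connection would instead be opened with sqlite3.connect("sqlite-sakila.db").

```python
import sqlite3


def find_films(conn, keyword):
    """Return (film_id, title) rows whose description mentions keyword."""
    cur = conn.execute(
        "SELECT film_id, title FROM film "
        "WHERE description LIKE ? ORDER BY film_id",
        (f"%{keyword}%",),
    )
    return cur.fetchall()


# Stand-in for sqlite3.connect("sqlite-sakila.db"): an in-memory table with
# only the columns mentioned in the text. The rows are invented for the demo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO film VALUES (?, ?, ?)",
    [
        (1, "EXAMPLE ONE", "A drama about a robot"),
        (2, "EXAMPLE TWO", "A comedy about a chef"),
        (3, "EXAMPLE THREE", "A documentary about a robot and a chef"),
    ],
)

robot_films = find_films(conn, "robot")
print(robot_films)
```

The query uses a bound parameter rather than string formatting, which is the usual way to pass user-supplied search terms to SQLite.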
This is the sample database from sqlservertutorial.net. This is a great dataset for learning SQL and practicing querying relational databases.
Database Diagram:
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F4146319%2Fc5838eb006bab3938ad94de02f58c6c1%2FSQL-Server-Sample-Database.png?generation=1692609884383007&alt=media
The sample database is copyrighted and cannot be used for commercial purposes. Prohibited uses include, but are not limited to:
- Selling
- Including in paid courses
MySQL Classicmodels sample database
The MySQL sample database schema consists of the following tables:
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F8652778%2Fefc56365be54c0e2591a1aefa5041f36%2FMySQL-Sample-Database-Schema.png?generation=1670498341027618&alt=media
This dataset was created by Sudhir Singh
Released under Data files © Original Authors
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
FooDrugs database is a development done by the Computational Biology Group at IMDEA Food Institute (Madrid, Spain), in the context of the Food Nutrition Security Cloud (FNS-Cloud) project. Food Nutrition Security Cloud (FNS-Cloud) has received funding from the European Union's Horizon 2020 Research and Innovation programme (H2020-EU.3.2.2.3. – A sustainable and competitive agri-food industry) under Grant Agreement No. 863059 – www.fns-cloud.eu (See more details about FNS-Cloud below)
FooDrugs stores information extracted from transcriptomics and text documents for food-drug interactions and is part of a demonstrator to be developed in the FNS-Cloud project. The database was built using MySQL, an open source relational database management system. FooDrugs hosts information for a total of 161 transcriptomics GEO series with 585 conditions for food or bioactive compounds. Each condition is defined as a food/biocomponent per time point, per concentration, per cell line, primary culture or biopsy per study. FooDrugs includes information about a bipartite network with 510 nodes and their similarity scores (tau score; https://clue.io/connectopedia/connectivity_scores) related to possible interactions with drugs assayed in the Connectivity Map (https://www.broadinstitute.org/connectivity-map-cmap). The information is stored in eight tables:
Table “study” : This table contains basic information about study identifiers from GEO, pubmed or platform, study type, title and abstract
Table “sample”: This table contains basic information about the different experiments in a study, like the identifier of the sample, treatment, origin type, time point or concentration.
Table “misc_study”: This table contains additional information about different attributes of the study.
Table “misc_sample”: This table contains additional information about different attributes of the sample.
Table “cmap”: This table contains information about 70895 nodes, comprising drugs, foods or bioactives, overexpressed and knockdown genes (see section 3.4). The information includes cell line, compound and perturbation type.
Table “cmap_foodrugs”: This table contains information about the tau score (see section 3.4) that relates food with drugs or genes and the node identifier in the FooDrugs network.
Table “topTable”: This table contains information about 150 over and underexpressed genes from each GEO study condition, used to calculate the tau score (see section 3.4). The information stored is the logarithmic fold change, average expression, t-statistic, p-value, adjusted p-value and if the gene is up or downregulated.
Table “nodes”: This table stores the information about the identification of the sample and the node in the bipartite network connecting the tables “sample”, “cmap_foodrugs” and “topTable”.
In addition, FooDrugs database stores a total of 6422 food/drug interactions from 2849 text documents, obtained from three different sources: 2312 documents from PubMed, 285 from DrugBank, and 252 from drugs.com. These documents describe potential interactions between 1464 food/bioactive compounds and 3009 drugs. The information is stored in two tables:
Table “texts”: This table contains all the documents, with their identifiers, in which interactions have been identified using the strategy described in section 4.
Table “TM_interactions”: This table contains information about interaction identifiers, the food and drug entities, and the start and the end positions of the context for the interaction in the document.
FNS-Cloud will overcome fragmentation problems by integrating existing FNS data, which is essential for high-end, pan-European FNS research addressing FNS, diet, health, and consumer behaviours as well as sustainable agriculture and the bio-economy. Current fragmented FNS resources not only result in knowledge gaps that inhibit public health and agricultural policy, and the food industry, from developing effective solutions, making production sustainable and consumption healthier, but also do not enable exploitation of FNS knowledge for the benefit of European citizens. Through three Demonstrators (Agri-Food, Nutrition & Lifestyle, and NCDs & the Microbiome), FNS-Cloud will facilitate: (1) Analyses of regional and country-specific differences in diet including nutrition, (epi)genetics, microbiota, consumer behaviours, culture and lifestyle and their effects on health (obesity, NCDs, ethnic and traditional foods), which are essential for public health and agri-food and health policies; (2) Improved understanding of agricultural differences within Europe and what these mean in terms of creating sustainable, resilient food systems for healthy diets; and (3) Clear definitions of boundaries and how these affect the compositions of foods and consumer choices and, ultimately, personal and public health in the future. Long-term sustainability of the FNS-Cloud will be based on Services that have the capacity to link with new resources and enable cross-talk amongst them; access to FNS-Cloud data will be open access, underpinned by FAIR principles (findable, accessible, interoperable and re-useable). FNS-Cloud will work closely with the proposed Food, Nutrition and Health Research Infrastructure (FNHRI) as well as METROFOOD-RI and other existing ESFRI RIs (e.g. ELIXIR, ECRIN) in which several FNS-Cloud Beneficiaries are involved directly. (https://cordis.europa.eu/project/id/863059)
Changes between versions FooDrugs_v2 and FooDrugs_V3 (31st January 2023):
Increased the number of text documents by 85.675 from PubMed and ClinicalTrials.gov, and the number of Text Mining interactions by 168.826.
Increased the amount of transcriptomic studies by 32 GEO series.
Removed all rows in table cmap_foodrugs representing interactions with values of tau=0
Removed 43 GEO series that, after manual checking, did not correspond to food compounds.
Added a new column to the table texts: citation to hold the citation of the text.
Added these columns to the table study: contributor to contain the authors of the study, publication_date to store the date of publication of the study in GEO and pubmed_id to reference the publication associated with the study if any.
Added a new column to topTable to hold the top 150 up-regulated and 150 down-regulated genes.
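The clean-up step in the change list that drops tau=0 rows from cmap_foodrugs could look roughly like the sketch below. It is only an illustration under stated assumptions: the real database is MySQL, whereas this uses Python's built-in sqlite3 module so that it runs without a server, and the column names node_id, cmap_id and tau, as well as the sample values, are hypothetical rather than taken from the FooDrugs schema.

```python
import sqlite3

# In-memory stand-in for the cmap_foodrugs table. Column names and values
# are hypothetical; the actual FooDrugs schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cmap_foodrugs (node_id INTEGER, cmap_id INTEGER, tau REAL)")
conn.executemany(
    "INSERT INTO cmap_foodrugs VALUES (?, ?, ?)",
    [(1, 10, 95.2), (1, 11, 0.0), (2, 10, -88.4), (2, 12, 0.0)],
)

# Remove every interaction whose tau score is exactly zero.
deleted = conn.execute("DELETE FROM cmap_foodrugs WHERE tau = 0").rowcount
remaining = conn.execute(
    "SELECT node_id, cmap_id, tau FROM cmap_foodrugs ORDER BY node_id"
).fetchall()

print(deleted, remaining)
```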
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We present three defect rediscovery datasets mined from Bugzilla. The datasets capture data for three groups of open source software projects: Apache, Eclipse, and KDE. The datasets contain information about approximately 914 thousand defect reports over a period of 18 years (1999-2017) and capture the inter-relationships among duplicate defects.
File Descriptions
apache.csv - Apache Defect Rediscovery dataset
eclipse.csv - Eclipse Defect Rediscovery dataset
kde.csv - KDE Defect Rediscovery dataset
apache.relations.csv - Inter-relations of rediscovered defects of Apache
eclipse.relations.csv - Inter-relations of rediscovered defects of Eclipse
kde.relations.csv - Inter-relations of rediscovered defects of KDE
create_and_populate_neo4j_objects.cypher - Populates Neo4j graphDB by importing all the data from the CSV files. Note that you have to set dbms.import.csv.legacy_quote_escaping configuration setting to false to load the CSV files as per https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/#config_dbms.import.csv.legacy_quote_escaping
create_and_populate_mysql_objects.sql - Populates MySQL RDBMS by importing all the data from the CSV files
rediscovery_db_mysql.zip - For your convenience, we also provide full backup of the MySQL database
neo4j_examples.txt - Sample Neo4j queries
mysql_examples.txt - Sample MySQL queries
rediscovery_eclipse_6325.png - Output of Neo4j example #1
distinct_attrs.csv - Distinct values of bug_status, resolution, priority, severity for each project
As of June 2024, the most popular database management system (DBMS) worldwide was Oracle, with a ranking score of *******; MySQL and Microsoft SQL Server rounded out the top three. Although the database management industry contains some of the largest companies in the tech industry, such as Microsoft, Oracle and IBM, a number of free and open-source DBMSs such as PostgreSQL and MariaDB remain competitive.
Database Management Systems
As the name implies, DBMSs provide a platform through which developers can organize, update, and control large databases. Given the business world’s growing focus on big data and data analytics, knowledge of SQL programming languages has become an important asset for software developers around the world, and database management skills are seen as highly desirable. In addition to providing developers with the tools needed to operate databases, DBMSs are also integral to the way that consumers access information through applications, which further illustrates the importance of the software.
https://spdx.org/licenses/CC0-1.0.html
Malaria is the leading cause of death in the African region. Data mining can help extract valuable knowledge from available data in the healthcare sector. This makes it possible to train models to predict patient health faster than in clinical trials. Various machine learning algorithms, such as K-Nearest Neighbors, Bayes Theorem, Logistic Regression, Support Vector Machines, and Multinomial Naïve Bayes (MNB), have been applied to malaria datasets in public hospitals, but there are still limitations in modeling using the Multinomial Naïve Bayes algorithm. This study applies the MNB model to explore the relationship between 15 relevant attributes of public hospital data. The goal is to examine how the dependency between attributes affects the performance of the classifier. MNB creates a transparent and reliable graphical representation of the relationships between attributes, with the ability to predict new situations. The MNB model has 97% accuracy. It is concluded that this model outperforms the GNB classifier which has 100% accuracy and the RF which also has 100% accuracy.
Methods
Prior to data collection, the researcher was guided by ethical training certification on data collection and on the right to confidentiality and privacy, as required by the Institutional Review Board (IRB). Data were collected from the manual archives of hospitals purposively selected using a stratified sampling technique, transformed into electronic form and stored in a MySQL database called malaria. Each patient file was extracted and reviewed for signs and symptoms of malaria and then checked for the laboratory confirmation result from diagnosis. The data were divided into two tables: the first table, data1, contains data for use in phase 1 of the classification, while the second table, data2, contains data for use in phase 2 of the classification.
Data Source Collection
The malaria incidence data set was obtained from public hospitals and covers 2017 to 2021. These are the data used for modeling and analysis. The geographical location and socio-economic factors available for patients inhabiting those areas were also taken into account. Multinomial Naive Bayes is the model used to analyze the collected data for malaria disease prediction and grading.
Data Preprocessing:
Data preprocessing shall be done to remove noise and outliers.
Transformation:
The data shall be transformed from analog to electronic records.
Data Partitioning
The collected data will be divided into two portions: one portion shall be extracted as a training set, while the other portion will be used for testing. The training portion taken from one database table shall be called training set 1, while the training portion taken from the other table shall be called training set 2.
The dataset was split into two parts: a training sample containing 70% of the data and a test sample containing the remaining 30%. Then, using the MNB classification algorithm implemented in Python, the models were trained on the training sample. The resulting models were tested on the remaining 30% of the data, and the results were compared with those of the other machine learning models using standard metrics.
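The 70/30 split and the MNB training step described above can be sketched as follows. This is a minimal, self-contained illustration and not the study's code: the source does not say which Python library was used, so the classifier is written out by hand with Laplace smoothing, and the three count features and all data rows are synthetic placeholders rather than the 15 hospital attributes.

```python
import math
import random
from collections import Counter


def split_70_30(rows, seed=0):
    """Shuffle the rows and return (training 70%, test 30%)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = len(rows) * 7 // 10
    return rows[:cut], rows[cut:]


class MultinomialNB:
    """Multinomial Naive Bayes over count features with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        n_features = len(X[0])
        self.classes = sorted(set(y))
        class_counts = Counter(y)
        self.log_prior = {c: math.log(class_counts[c] / len(y)) for c in self.classes}
        # Sum the feature counts per class.
        totals = {c: [0] * n_features for c in self.classes}
        for features, label in zip(X, y):
            for j, count in enumerate(features):
                totals[label][j] += count
        # Smoothed log-probability of each feature given the class.
        self.log_likelihood = {}
        for c in self.classes:
            denom = sum(totals[c]) + self.alpha * n_features
            self.log_likelihood[c] = [math.log((t + self.alpha) / denom) for t in totals[c]]
        return self

    def predict(self, X):
        def score(features, c):
            weights = self.log_likelihood[c]
            return self.log_prior[c] + sum(n * w for n, w in zip(features, weights))

        return [max(self.classes, key=lambda c: score(f, c)) for f in X]


# Synthetic, clearly separable rows: (feature counts, label). Not real patient data.
positives = [((2 + i % 3, 2 + (i // 3) % 3, 0), "P") for i in range(10)]
negatives = [((0, 0, 2 + i % 3), "N") for i in range(10)]
train, test = split_70_30(positives + negatives)

model = MultinomialNB().fit([x for x, _ in train], [label for _, label in train])
predicted = model.predict([x for x, _ in test])
accuracy = sum(p == label for p, (_, label) in zip(predicted, test)) / len(test)
print(len(train), len(test), accuracy)
```

The split uses integer arithmetic for the cut point so that the 70% boundary does not depend on floating-point rounding.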
Classification and prediction:
Based on the nature of the variables in the dataset, this study uses the Multinomial Naïve Bayes classification technique in two stages: classification phase 1 and classification phase 2. The operation of the framework is as follows:
i. Data collection and preprocessing shall be done.
ii. Preprocessed data shall be stored in training set 1 and training set 2. These datasets shall be used during classification.
iii. The test data set shall be stored in the database as the test data set.
iv. Part of the test data set shall be classified using classifier 1 and the remaining part shall be classified with classifier 2, as follows:
Classifier phase 1: It classifies records into positive or negative classes. A patient who has malaria is classified as positive (P), while a patient who does not have malaria is classified as negative (N).
Classifier phase 2: It classifies only the records that classifier 1 has classified as positive, further assigning them to the complicated or uncomplicated class label. The classifier will also capture data on environmental factors, genetics, gender and age, and cultural and socio-economic variables. The system will be designed such that the core parameters, as determining factors, must supply their values.
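The two-phase arrangement can be pictured as a simple cascade in which phase 2 only sees the records that phase 1 labelled positive. The sketch below shows just that control flow; the two stub functions and the record fields lab_confirmed and severe_signs are hypothetical stand-ins for the trained classifiers and the real attributes.

```python
def classify_two_phase(record, phase1, phase2):
    """Run phase 1 on every record; run phase 2 only on phase-1 positives."""
    if phase1(record) == "N":
        return "N"
    return phase2(record)


# Hypothetical stand-ins for the two trained classifiers.
def phase1_stub(record):
    return "P" if record["lab_confirmed"] else "N"


def phase2_stub(record):
    return "complicated" if record["severe_signs"] > 0 else "uncomplicated"


records = [
    {"lab_confirmed": False, "severe_signs": 0},
    {"lab_confirmed": True, "severe_signs": 0},
    {"lab_confirmed": True, "severe_signs": 2},
]
labels = [classify_two_phase(r, phase1_stub, phase2_stub) for r in records]
print(labels)
```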
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
These datasets contain SQL injection attacks (SQLIA) as malicious NetFlow data. The attacks carried out are Union query SQL injection and Blind SQL injection. The SQLMAP tool was used to perform the attacks.
NetFlow traffic was generated using DOROTHEA (DOcker-based fRamework fOr gaTHering nEtflow trAffic). NetFlow is a network protocol developed by Cisco for the collection and monitoring of network traffic flow data. A flow is defined as a unidirectional sequence of packets with some common properties that pass through a network device.
Datasets
The first dataset (D1) was collected to train the detection models, and the other (D2) was collected using different attacks than those used in training, to test the models and ensure their generalization.
The datasets contain both benign and malicious traffic. All collected datasets are balanced.
The version of NetFlow used to build the datasets is 5.
Dataset | Aim | Samples | Benign-malicious traffic ratio
D1 | Training | 400,003 | 50%
D2 | Test | 57,239 | 50%
Infrastructure and implementation
Two sets of flow data were collected with DOROTHEA. DOROTHEA is a Docker-based framework for NetFlow data collection. It allows you to build interconnected virtual networks to generate and collect flow data using the NetFlow protocol. In DOROTHEA, network traffic packets are sent to a NetFlow generator that has a sensor ipt_netflow installed. The sensor consists of a module for the Linux kernel using Iptables, which processes the packets and converts them to NetFlow flows.
DOROTHEA is configured to use NetFlow V5 and to export a flow after it has been inactive for 15 seconds or after it has been active for 1800 seconds (30 minutes).
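The two export conditions can be expressed as a small predicate. This is only a sketch of the timeout rule as stated above, not the actual logic of the ipt_netflow sensor; the function name and the use of plain second counters are assumptions made for illustration.

```python
INACTIVE_TIMEOUT_S = 15    # export once no packet has been seen for this long
ACTIVE_TIMEOUT_S = 1800    # export once the flow has lasted this long (30 min)


def should_export(first_seen, last_seen, now):
    """Return True when a flow meets either export condition.

    All arguments are timestamps in seconds on the same clock.
    """
    inactive_for = now - last_seen
    active_for = now - first_seen
    return inactive_for >= INACTIVE_TIMEOUT_S or active_for >= ACTIVE_TIMEOUT_S


print(should_export(first_seen=0, last_seen=10, now=20))      # idle for 10 s only
print(should_export(first_seen=0, last_seen=10, now=25))      # idle for 15 s
print(should_export(first_seen=0, last_seen=1799, now=1800))  # active for 1800 s
```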
Benign traffic generation nodes simulate network traffic generated by real users, performing tasks such as searching in web browsers, sending emails, or establishing Secure Shell (SSH) connections. Such tasks run as Python scripts. Users may customize them or even incorporate their own. The network traffic is managed by a gateway that performs two main tasks. On the one hand, it routes packets to the Internet. On the other hand, it forwards them to a NetFlow data generation node (this process is carried out similarly for packets received from the Internet).
The malicious traffic collected (SQLI attacks) was generated using SQLMAP. SQLMAP is a penetration testing tool used to automate the process of detecting and exploiting SQL injection vulnerabilities.
The attacks were executed on 16 nodes, each launching SQLMAP with the parameters listed in the following table.
Parameters | Description
'--banner','--current-user','--current-db','--hostname','--is-dba','--users','--passwords','--privileges','--roles','--dbs','--tables','--columns','--schema','--count','--dump','--comments','--schema' | Enumerate users, password hashes, privileges, roles, databases, tables and columns
--level=5 | Increase the probability of a false positive identification
--risk=3 | Increase the probability of extracting data
--random-agent | Select the User-Agent randomly
--batch | Never ask for user input, use the default behavior
--answers="follow=Y" | Predefined answers to yes
Every node executed SQLIA on 200 victim nodes. The victim nodes had deployed a web form vulnerable to Union-type injection attacks, which was connected to the MySQL or SQL Server database engines (50% of the victim nodes deployed MySQL and the other 50% deployed SQL Server).
The web service was accessible from ports 443 and 80, which are the ports typically used to deploy web services. The IP address space was 182.168.1.1/24 for the benign and malicious traffic-generating nodes. For victim nodes, the address space was 126.52.30.0/24. The malicious traffic in the two datasets was collected under different conditions. For D1, SQLIA was performed using Union attacks on the MySQL and SQL Server databases.
However, for D2, Blind SQLIAs were performed against the web form connected to a PostgreSQL database. The IP address spaces of the networks were also different from those of D1. In D2, the IP address space was 152.148.48.1/24 for benign and malicious traffic-generating nodes and 140.30.20.1/24 for victim nodes.
The MySQL server was run using MariaDB version 10.4.12. Microsoft SQL Server 2017 Express and PostgreSQL version 13 were also used.
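When working with the flow records, the address spaces above can be used to tell which side of the experiment an IP address belongs to. The sketch below uses Python's standard ipaddress module; the role names, the ADDRESS_SPACES layout and the function role_of are illustrative assumptions, and strict=False is needed because several of the networks are written with host bits set (for example 182.168.1.1/24).

```python
import ipaddress

# Address spaces as given in the text; the dictionary layout is an assumption.
ADDRESS_SPACES = {
    "D1": {"generators": "182.168.1.1/24", "victims": "126.52.30.0/24"},
    "D2": {"generators": "152.148.48.1/24", "victims": "140.30.20.1/24"},
}


def role_of(ip, dataset):
    """Return 'generators', 'victims' or 'external' for an IP in a dataset."""
    addr = ipaddress.ip_address(ip)
    for role, cidr in ADDRESS_SPACES[dataset].items():
        # strict=False accepts CIDR strings whose host bits are not zero.
        if addr in ipaddress.ip_network(cidr, strict=False):
            return role
    return "external"


print(role_of("126.52.30.17", "D1"))
print(role_of("152.148.48.9", "D2"))
print(role_of("8.8.8.8", "D2"))
```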
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset comprises two .csv format files used within workstream 2 of the Wellcome Trust funded ‘Orphan drugs: High prices, access to medicines and the transformation of biopharmaceutical innovation’ project (219875/Z/19/Z). They appear in various outputs, e.g. publications and presentations.
The deposited data were gathered using the University of Amsterdam Digital Methods Institute’s ‘Twitter Capture and Analysis Toolset’ (DMI-TCAT) before being processed and extracted from Gephi. DMI-TCAT queries Twitter’s STREAM Application Programming Interface (API) using SQL and retrieves data on a pre-set text query. It then sends the returned data for storage on a MySQL database. The tool allows for output of that data in various formats. This process aligns fully with Twitter’s service user terms and conditions. The query for the deposited dataset gathered a 1% random sample of all public tweets posted between 10-Feb-2021 and 10-Mar-2021 containing the text ‘Rare Diseases’ and/or ‘Rare Disease Day’, storing it on a local MySQL database managed by the University of Sheffield School of Sociological Studies (http://dmi-tcat.shef.ac.uk/analysis/index.php), accessible only via a valid VPN such as FortiClient and through a permitted active directory user profile. The dataset was output from the MySQL database raw as a .gexf format file, suitable for social network analysis (SNA). It was then opened using Gephi (0.9.2) data visualisation software and anonymised/pseudonymised in Gephi as per the ethical approval granted by the University of Sheffield School of Sociological Studies Research Ethics Committee on 02-Jun-201 (reference: 039187). The deposited dataset comprises two anonymised/pseudonymised social network analysis .csv files extracted from Gephi, one containing node data (Issue-networks as excluded publics – Nodes.csv) and another containing edge data (Issue-networks as excluded publics – Edges.csv). Where participants explicitly provided consent, their original username has been provided. Where they have provided consent on the basis that they not be identifiable, their username has been replaced with an appropriate pseudonym. All other usernames have been anonymised with a randomly generated 16-digit key.
The level of anonymity for each Twitter user is provided in column C of deposited file ‘Issue-networks as excluded publics – Nodes.csv’.
This dataset was created and deposited onto the University of Sheffield Online Research Data repository (ORDA) on 26-Aug-2021 by Dr. Matthew S. Hanchard, Research Associate at the University of Sheffield iHuman institute/School of Sociological Studies. ORDA has full permission to store this dataset and to make it open access for public re-use without restriction under a CC BY license, in line with the Wellcome Trust commitment to making all research data Open Access.
The University of Sheffield are the designated data controller for this dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This MySQL database contains a list of submitted Java programs based on a series of online lab exercises from 2013 to 2015. The programs were submitted by first-year computer science students from the Faculty of Informatics and Computing, Universiti Sultan Zainal Abidin, Malaysia, who were taking the Introductory Computer Programming subject. There were 67, 18 and 47 participating students in 2013, 2014 and 2015 respectively. The submitted programs were all of their solution attempts at answering a computational programming question. The question was as follows:
Write a program that will read string. Then your program should show all the string character using * except for character 2, output its real character. sample input. Apology sample output. p****
As of June 2024, the most popular relational database management system (RDBMS) worldwide was Oracle, with a ranking score of *******. Oracle was also the most popular DBMS overall. MySQL and Microsoft SQL Server rounded out the top three.
README 2025-09-10
Introduction
The peatland mid infrared database (pmird) stores data from peat, vegetation, litter, and dissolved organic matter samples, in particular mid infrared spectra and other variables, from previously published and unpublished data sources. The majority of samples in the database are peat samples from northern bogs. Currently, the database contains entries from 26 studies, 11216 samples, and 3877 mid infrared spectra. The aim is to provide a harmonized data source that can be useful to re-analyse existing data, analyze peat chemistry, develop and test spectral prediction models, and provide data on various peat properties.
Usage notes
Download and Setup
The peatland mid infrared database can be downloaded from https://doi.org/10.5281/zenodo.17092587. The publication contains the following files and folders:
- pmird-backup-2025-09-10.sql: A mysqldump backup of the pmird database.
- pmird_prepared_data: A folder that contains:
  - Folders like c00001-2020-08-17-Hodgkins with the raw spectra for samples from each dataset in the pmird database (see below for how to import the spectra).
  - Files like pmird_prepare_data_c00001-2020-08-17-Hodgkins.Rmd that contain the R code used to process and import the data from each dataset into the database. Corresponding html files contain the compiled scripts.
  - pmird_prepare_data.Rmd: An Rmarkdown script that was used to run the scripts that created the database (the top level script).
- mysql_scripts: A folder that contains:
  - pmird_mysql_initialization.sql: MariaDB script to initialize the database.
  - 001-db-initialize.Rmd: Rmarkdown script that executes pmird_mysql_initialization.sql and populates dataset-independent tables.
  - add-citations.Rmd: Rmarkdown script that adds information on references to the database.
  - add-licenses.Rmd: Rmarkdown script that adds information on licenses to the database.
  - add-mir-metadata-quality.Rmd: Rmarkdown script that adds information on the quality of the infrared spectra to the database.
- Dockerfile: A Dockerfile that defines the computing environment used to create the database.
- renv.lock: A renv.lock file that lists the R packages used to create the database.
The database can be set up as follows: The downloaded database needs to be imported in a running MariaDB instance. In a Linux terminal, the downloaded sql file can be imported like so: mysql -u
The main aim of this study is to evaluate the impact and effectiveness of the scale up of the DREAMS HIV prevention package of biological, behavioural and social interventions in reducing HIV incidence in adolescent girls and young women residing in the uMkhanyakude district of KwaZulu-Natal. To achieve this aim, the changes in different outcomes will be assessed over time. The primary outcome will be HIV incidence and other key secondary outcomes will include knowledge of own HIV status, sexual debut, HSV-2, number of sexual partners, age-disparity with sexual partners, ever been pregnant, condom use, unmet need for contraception, transactional sex, education (remaining in school) and experiences of violence.
Demographic surveillance area of the Africa Health Research Institute; KwaZulu-Natal, uMkhanyakude district.
Individual
Closed cohorts of 800 AGYW will be followed prospectively at three time points over the two-year study period (baseline, 12 months and 24 months), at the points most closely aligned with the periods before, during and after DREAMS implementation. In ACDIS, cohorts of 400 girls aged 13-17 years and 400 young women aged 14-23 years will undergo informed consent, be recruited, complete a baseline questionnaire and provide dry blood spots for HSV-2 testing at the same time as they provide a sample for HIV testing in the surveillance, and will then be reviewed annually for the next two years.
Longitudinal survey data
Adolescent girls and young women aged 14-23 years who were residents in the demographic surveillance area of the Africa Health Research Institute. A total of 3013 participants were randomly selected to obtain a target sample size of 800 after 2 years of follow-up, allowing for 40% non-contact/loss-to-follow-up. Sampling was stratified by age group and area (week-blocks).
All data will be managed using electronic data management tools. The data management system for these will be based on REDCap (research electronic data capture) developed at Vanderbilt University. The REDCap database resides within a single MySQL database server within a secure server cluster at the AHRI. Survey data are synchronised by the REDCap application from the mobile device to a central MySQL server. Access control is managed through Microsoft Active Directory with minimum password complexity and compulsory password change policies.
The main aim of this study is to evaluate the impact and effectiveness of the scale up of the DREAMS HIV prevention package of biological, behavioural and social interventions in reducing HIV incidence in adolescent girls and young women residing in the uMkhanyakude district of KwaZulu-Natal. To achieve this aim, the changes in different outcomes will be assessed over time in relation to DREAMS roll-out. The primary outcome will be HIV incidence and other key secondary outcomes will include knowledge of own HIV status, sexual debut, HSV-2, number of sexual partners, age-disparity with sexual partners, ever been pregnant, condom use, unmet need for contraception, transactional sex, education (remaining in school) and experiences of violence.
Demographic surveillance area of the Africa Health Research Institute; KwaZulu-Natal, uMkhanyakude district.
Individual
A planned closed cohort of 800 AGYW will be followed prospectively at three time points over the two-year study period (baseline, 12 months and 24 months), at the points most closely aligned with the periods before, during and after DREAMS implementation. In ACDIS, cohorts of 400 girls aged 13-17 years and 400 young women aged 14-23 years will be recruited, give informed consent, complete a baseline questionnaire and provide dried blood spots for HSV-2 testing at the same time as they provide a sample for HIV testing in the surveillance; they will then be reviewed annually for the next two years.
Longitudinal survey data
This is a closed cohort of AGYW who were enrolled in 2017 (aged 13-22) and followed up in 2018 (aged 14-23), and who were residents in the demographic surveillance area of the Africa Health Research Institute. A total of 3013 participants were randomly selected to obtain a target sample size of 800 after 2 years of follow-up, allowing for 40% non-contact/loss-to-follow-up. Sampling was stratified by age group and area (week-blocks).
All data will be managed using electronic data management tools. The data management system for these will be based on REDCap (research electronic data capture) developed at Vanderbilt University. The REDCap database resides within a single MySQL database server within a secure server cluster at the AHRI. Survey data are synchronised by the REDCap application from the mobile device to a central MySQL server. Access control is managed through Microsoft Active Directory with minimum password complexity and compulsory password change policies.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Data files generated as part of a study into the influence of neighbouring bases on point mutation. The data are sampled from the Ensembl (http://www.ensembl.org) MySQL databases or COSMIC (http://cancer.sanger.ac.uk/cosmic) and processed using custom scripts that will be uploaded separately and associated with this submission via the related identifier.
https://creativecommons.org/publicdomain/zero/1.0/
The Chinook Database is a sample database designed for use with multiple database platforms, such as SQL Server, Oracle, MySQL, and others. It can be easily set up by running a single SQL script, making it a convenient alternative to the popular Northwind database. Chinook is widely used in demos and testing environments, particularly for Object-Relational Mapping (ORM) tools that target both single and multiple database servers.
Supported Database Servers
Chinook supports several database servers, including DB2, MySQL, Oracle, PostgreSQL, SQL Server, SQL Server Compact and SQLite.
Download Instructions
You can download the SQL scripts for each supported database server from the latest release assets. The appropriate SQL script file(s) for your database vendor are provided, which can be executed using your preferred database management tool.
Data Model
The Chinook Database represents a digital media store, containing tables that include artists, albums, media tracks, invoices and customers.
Sample Data
The media data in Chinook is derived from a real iTunes Library, providing a realistic dataset for users. Additionally, users can generate their own SQL scripts using their personal iTunes Library by following specific instructions. Customer and employee details in the database were manually crafted with fictitious names, addresses (mappable via Google Maps), and well-structured contact information such as phone numbers, faxes, and emails. Sales data is auto-generated and spans a four-year period, using random values.
Why is it Called Chinook?
The Chinook Database's name is a nod to its predecessor, the Northwind database. Chinooks are warm, dry winds found in the interior regions of North America, particularly over southern Alberta in Canada, where the Canadian Prairies meet mountain ranges. This natural phenomenon inspired the choice of name, reflecting the idea that Chinook serves as a refreshing alternative to the Northwind database.
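To make the Chinook data model (artists, albums, media tracks) more concrete, here is a minimal sketch in Python using an in-memory SQLite database. The table and column names, and all sample rows, are simplified assumptions made for illustration; they are not the DDL or data shipped in the Chinook SQL scripts.

```python
import sqlite3

# Hypothetical, simplified schema: NOT the real Chinook DDL.
SCHEMA = """
CREATE TABLE artist (
    artist_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE album (
    album_id  INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    artist_id INTEGER NOT NULL REFERENCES artist(artist_id)
);
CREATE TABLE track (
    track_id  INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    album_id  INTEGER NOT NULL REFERENCES album(album_id)
);
"""


def build_store():
    """Create an in-memory store and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO artist VALUES (?, ?)",
                     [(1, "Artist A"), (2, "Artist B")])
    conn.executemany("INSERT INTO album VALUES (?, ?, ?)",
                     [(1, "First", 1), (2, "Second", 1), (3, "Third", 2)])
    conn.executemany("INSERT INTO track VALUES (?, ?, ?)",
                     [(1, "t1", 1), (2, "t2", 1), (3, "t3", 2), (4, "t4", 3)])
    conn.commit()
    return conn


def tracks_per_artist(conn):
    """Count tracks per artist by joining artist -> album -> track."""
    query = """
        SELECT artist.name, COUNT(track.track_id)
        FROM artist
        JOIN album ON album.artist_id = artist.artist_id
        JOIN track ON track.album_id = album.album_id
        GROUP BY artist.artist_id, artist.name
        ORDER BY artist.name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(tracks_per_artist(build_store()))
```

The same kind of join should carry over to a real Chinook installation once the table and column names are adjusted to match the script for your database vendor.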
The Radiocarbon dating laboratory of IRPA/KIK was founded in the 1960s. Initially dates were reported at more or less regular intervals in the journal Radiocarbon (Schreurs 1968). Since the advent of radiocarbon dating in the 1950s it had been a common practice amongst radiocarbon laboratories to publish their dates in so-called ‘date-lists’ that were arranged per laboratory. This was first done in the Radiocarbon Supplement of the American Journal of Science and later in the specialised journal Radiocarbon. In the course of time the latter, with the added subtitle An International Journal of Cosmogenic Isotope Research, became a regular scientific journal shifting focus from date-lists to articles. Furthermore the world-wide exponential increase of radiocarbon dates made it almost impossible to publish them all in the same journal, even more so because of the broad range of applications that use radiocarbon analysis, ranging from archaeology and art history to geology and oceanography and recently also biomedical studies.
The IRPA/KIK database
From 1995 onwards IRPA/KIK’s Radiocarbon laboratory started to publish its dates in small publications, continuing the numbering of the preceding lists in Radiocarbon. The first booklet in this series was “Royal Institute for Cultural Heritage Radiocarbon dates XV” (Van Strydonck et al. 1995), followed by three more volumes (XVI, XVII, XVIII). The next list (XIX, 2005) was no longer printed but instead handed out as a PDF file on CD-ROM. The ever increasing number of dates and the difficulties in handling all the data, however, made us look for a more permanent and easier solution. In order to improve data management and consulting, it was thus decided to gather all our dates in a web-based database. List XIX was in fact already a Microsoft Access database that was converted into a reader-friendly style and could also be printed as a PDF file.
However, a Microsoft Access database is not the most practical solution for making information publicly available. Hence the structure of the database was recreated in MySQL and the existing content was transferred into the corresponding fields. To display the records, a web-based front-end was programmed in PHP/Apache. It features a full-text search function that allows for partial word-matching. In addition the records can be consulted in PDF format. Old records from the printed date-lists as well as new records are now added using the same Microsoft Access back-end, which is now connected directly to the MySQL database. The main problem with introducing the old data was that not all the current criteria were available in the past (e.g. stable isotope measurements). Furthermore, since all the sample information is given by the submitter, its quality largely depends on that person's willingness to contribute as well as on the accuracy and correctness of the information they provide. Sometimes problems arise from the fact that a certain investigation (like an excavation) is carried out over a relatively long period (sometimes even more than ten years) and is directed by different people or even institutions. This can lead to differences in the labeling procedure of the samples, but also in the interpretation of structures and artifacts and in the orthography of the site’s name. Finally, the submitter might change address, while the names of institutions or even regions and countries might change as well (e.g. Zaire - Congo).
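As an illustration of what partial word-matching over such records can look like, here is a minimal sketch. It is written in Python against an in-memory SQLite database rather than the PHP/MySQL stack described above, and it uses a plain SQL LIKE substring match; that choice, the table and column names, and the sample rows are all assumptions made for the example, not the laboratory's actual implementation or data.

```python
import sqlite3


def build_demo_db():
    """Create a tiny stand-in table of dated samples (all rows are made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dates (lab_code TEXT PRIMARY KEY, site TEXT, material TEXT)"
    )
    conn.executemany(
        "INSERT INTO dates VALUES (?, ?, ?)",
        [
            ("DEMO-001", "Example Abbey", "charcoal"),
            ("DEMO-002", "Sample Harbour", "bone collagen"),
            ("DEMO-003", "Example Hill", "wood"),
        ],
    )
    conn.commit()
    return conn


def search(conn, fragment):
    """Return lab codes whose site or material contains the fragment.

    LIKE with %...% matches the fragment anywhere inside a word, which is
    one simple way to get partial word-matching. In SQLite, LIKE is
    case-insensitive for ASCII letters by default.
    """
    pattern = f"%{fragment}%"
    rows = conn.execute(
        "SELECT lab_code FROM dates "
        "WHERE site LIKE ? OR material LIKE ? "
        "ORDER BY lab_code",
        (pattern, pattern),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(search(conn, "coal"))  # fragment of "charcoal"
    print(search(conn, "exam"))  # fragment of "Example"
```

Note that a fragment containing % or _ would be interpreted as a LIKE wildcard in this sketch; a production search would need to escape those characters.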
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 17, 2013. A database containing information on the heparin-binding proteins of E. coli K-12 MG1655 cells. Heparin affinity columns were applied to enrich and fractionate proteins. Proteins were identified in collaboration with David Russell's lab. Because heparin is a negatively charged sulfated glycosaminoglycan, polyanion-binding proteins, which include nucleic acid-binding proteins, are expected to bind to heparin columns. Studying the expression pattern of heparin-binding proteins will help in studying the nucleic acid-binding proteins, most of which are related to regulation. Moreover, heparin affinity columns will also enrich low-abundance proteins. The Heparome database is constructed using MySQL. The website interface is built using HTML and PHP. Queries between the MySQL database and the website interface are executed using PHP. Besides information on identified proteins, such as swiss accession number, gene name, molecular weight, isoelectric point, codon adaptation index (CAI), functional classification, etc., it also includes information on experiments, such as sample preparation, heparin-HPLC chromatography, SDS-PAGE gel separation and MALDI-MS.
ChEssBase is a dynamic relational database for all deep-water species from chemosynthetic ecosystems (hydrothermal vents, cold seeps and other reducing environments such as whale carcasses, sunken wood or OMZs) being constructed from the ChEss project (Biogeography of Deep-Water Chemosynthetic Ecosystems) within the Census of Marine Life initiative.
AccConID=21
AccConstrDescription=This license lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of licenses offered. Recommended for maximum dissemination and use of licensed materials.
AccConstrDisplay=This dataset is licensed under a Creative Commons Attribution 4.0 International License.
AccConstrEN=Attribution (CC BY)
AccessConstraint=Attribution (CC BY)
AccessConstraints=None
Acronym=None
added_date=2013-06-12 15:21:34.517000
BrackishFlag=0
CDate=2004-06-24
cdm_data_type=Other
CheckedFlag=0
Citation=Ramirez-Llodra, E., Blanco, 2005. ChEssBase: an online information system on biodiversity and biogeography of deep-sea fauna from chemosynthetic ecosystems. Version 2. World Wide Web electronic publications, http://www.noc.soton.ac.uk/chess/database/db_home.php
Comments=None
ContactEmail=None
Conventions=COARDS, CF-1.6, ACDD-1.3
CurrencyDate=None
DasID=212
DasOrigin=Literature research
DasType=Data
DasTypeID=1
DateLastModified={'date': '2025-08-12 01:34:46.196267', 'timezone_type': 1, 'timezone': '+02:00'}
DescrCompFlag=0
DescrTransFlag=0
Easternmost_Easting=179.8
EmbargoDate=None
EngAbstract=ChEssBase is a dynamic relational database for all deep-water species from chemosynthetic ecosystems (hydrothermal vents, cold seeps and other reducing environments such as whale carcasses, sunken wood or OMZs) being constructed from the ChEss project (Biogeography of Deep-Water Chemosynthetic Ecosystems) within the Census of Marine Life initiative.
EngDescr=The aim of ChEssBase is to provide taxonomical, biological, ecological and distributional data for all species described from deep-water chemosynthetic ecosystems, as well as information on available samples, images, bibliography and information on the habitats. These habitats include hydrothermal vents, cold seeps, whale falls, sunken wood and areas of minimum oxygen that intersect with the continental margin or seamounts. Since the discovery of hydrothermal vents in 1977 and of cold seep communities in 1984, over 590 species from vents and over 230 species from seeps have been described. Chemosynthetically fueled communities have now also been found on large organic falls to the deep-sea floor such as whale falls and sunken wood, as well as on benthic zones of oxygen minimum. The data gathered in the last 30 years have shown that some species are shared amongst these ecosystems and our knowledge of their phylogeography improves with every new discovery. New species are continuously being discovered and described from research programmes around the globe and therefore ChEssBase is in active development and new data are being entered regularly. At present, ChEssBase includes data on 1740 species from 193 chemosynthetic sites around the globe. These data contain information (when available) on the taxonomy, morphology, trophic level, reproduction, endemicity, habitat type and distribution. There are now 1880 papers in our reference database. The first version of ChEssBase was available online in December 2004. In summer 2005, ChEssBase and the InterRidge biological database (www.interridge.org) were fused into a single source of information for biological data from chemosynthetic ecosystems. This second version of ChEssBase has been available online since August 2005, with new records as well as new search and download options.
Since December 2005, ChEssBase has been integrated in the Ocean Biogeographic Information System (OBIS, www.iobis.org). ChEssBase is supported by a species-based relational database in MySQL. The database includes 3 major components: Taxonomy (from kingdom to subspecies), Distribution (from site to major geographic area) and Samples (including sample, cruise and institution information). ChEssBase is regularly updated with new information available in the literature. In order to quickly obtain accurate new data and help keep the database up to date, we would be very grateful if you could send us any new publications with data relevant to ChEssBase, which we would add to the database, together with the relevant references.
FreshFlag=0
geospatial_lat_max=72.0
geospatial_lat_min=-55.1
geospatial_lat_units=degrees_north
geospatial_lon_max=179.8
geospatial_lon_min=-158.1
geospatial_lon_units=degrees_east
infoUrl=None
InputNotes=None
institution=COML, SOTON-NOC, SOTON-SOES
License=https://creativecommons.org/licenses/by/4.0/
Lineage=Prior to publication data undergo quality control checks, which are described in https://github.com/EMODnet/EMODnetBiocheck?tab=readme-ov-file#understanding-the-output
MarineFlag=1
modified_sync=2021-02-05 00:00:00
Northernmost_Northing=72.0
OrigAbstract=None
OrigDescr=None
OrigDescrLang=None
OrigDescrLangNL=None
OrigLangCode=None
OrigLangCodeExtended=None
OrigLangID=None
OrigTitle=None
OrigTitleLang=None
OrigTitleLangCode=None
OrigTitleLangID=None
OrigTitleLangNL=None
Progress=In Progress
PublicFlag=1
ReleaseDate=Jun 12 2013 12:00AM
ReleaseDate0=2013-06-12
RevisionDate=None
SizeReference=1740 species from 193 sites
sourceUrl=(local files)
Southernmost_Northing=-55.1
standard_name_vocabulary=CF Standard Name Table v70
StandardTitle=ChEssBase
StatusID=1
subsetVariables=ScientificName,BasisOfRecord,aphia_id
TerrestrialFlag=0
UDate=2025-03-26
VersionDate=Jun 3 2004 12:00AM
VersionDay=23
VersionMonth=10
VersionName=2
VersionYear=2007
VlizCoreFlag=1
Westernmost_Easting=-158.1
The Sakila sample database is a fictitious database designed to represent a DVD rental store. The tables of the database include film, film_category, actor, customer, rental, payment and inventory among others. The Sakila sample database is intended to provide a standard schema that can be used for examples in books, tutorials, articles, samples, and so forth. Detailed information about the database can be found on the MySQL website: https://dev.mysql.com/doc/sakila/en/
Sakila for SQLite is a part of the sakila-sample-database-ports project intended to provide ported versions of the original MySQL database for other database systems, including:
Sakila for SQLite is a port of the Sakila example database available for MySQL, which was originally developed by Mike Hillyer of the MySQL AB documentation team. This project is designed to help database administrators decide which database to use for the development of new products. The user can run the same SQL against different kinds of databases and compare the performance.
License: BSD Copyright DB Software Laboratory http://www.etl-tools.com
Note: Part of the insert scripts were generated by Advanced ETL Processor http://www.etl-tools.com/etl-tools/advanced-etl-processor-enterprise/overview.html
Information about the project and the downloadable files can be found at: https://code.google.com/archive/p/sakila-sample-database-ports/
Other versions and developments of the project can be found at: https://github.com/ivanceras/sakila/tree/master/sqlite-sakila-db
https://github.com/jOOQ/jOOQ/tree/main/jOOQ-examples/Sakila
Direct access to the MySQL Sakila database, which does not require installation of MySQL (queries can be typed directly in the browser), is provided on the phpMyAdmin demo version website: https://demo.phpmyadmin.net/master-config/
The files in the sqlite-sakila-db folder are the script files which can be used to generate the SQLite version of the database. For convenience, the script files have already been run in cmd to generate the sqlite-sakila.db file, as follows:
sqlite> .open sqlite-sakila.db # creates the .db file
sqlite> .read sqlite-sakila-schema.sql # creates the database schema
sqlite> .read sqlite-sakila-insert-data.sql # inserts the data
Therefore, the sqlite-sakila.db file can be directly loaded into SQLite3 and queries can be directly executed. You can refer to my notebook for an overview of the database and a demonstration of SQL queries. Note: Data about the film_text table is not provided in the script files, thus the film_text table is empty. Instead the film_id, title and description fields are included in the film table. Moreover, the Sakila Sample Database has many versions, so an Entity Relationship Diagram (ERD) is provided to describe this specific version. You are advised to refer to the ERD to familiarise yourself with the structure of the database.
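As a quick way to check the points above (for example, that film_text is empty while film holds the data), the following is a minimal sketch using Python's built-in sqlite3 module. So that it runs without the download, it builds a tiny in-memory stand-in containing only film and film_text, with made-up rows; the helper names are hypothetical, and only the table names and the film_id, title and description columns come from the description above.

```python
import sqlite3


def table_row_counts(conn):
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    # Table names come from sqlite_master itself, so quoting them is enough.
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


def build_stand_in():
    """In-memory stand-in for sqlite-sakila.db (two tables, made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE film (
            film_id INTEGER PRIMARY KEY, title TEXT, description TEXT
        );
        CREATE TABLE film_text (
            film_id INTEGER PRIMARY KEY, title TEXT, description TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO film VALUES (?, ?, ?)",
        [(1, "SAMPLE ONE", "A made-up description"),
         (2, "SAMPLE TWO", "Another made-up description")],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    # For the real file, use: conn = sqlite3.connect("sqlite-sakila.db")
    conn = build_stand_in()
    for name, count in table_row_counts(conn).items():
        print(f"{name}: {count} rows")
```

Pointing the same function at the downloaded sqlite-sakila.db file should list every table in this version of the schema, which can then be compared against the ERD.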