31 datasets found
  1. SQLite Sakila Sample Database

    • kaggle.com
    zip
    Updated Mar 14, 2021
    Cite
    Atanas Kanev (2021). SQLite Sakila Sample Database [Dataset]. https://www.kaggle.com/datasets/atanaskanev/sqlite-sakila-sample-database/code
    Explore at:
    zip(4495190 bytes)Available download formats
    Dataset updated
    Mar 14, 2021
    Authors
    Atanas Kanev
    Description

    SQLite Sakila Sample Database

    Database Description

    The Sakila sample database is a fictitious database designed to represent a DVD rental store. The tables of the database include film, film_category, actor, customer, rental, payment and inventory among others. The Sakila sample database is intended to provide a standard schema that can be used for examples in books, tutorials, articles, samples, and so forth. Detailed information about the database can be found on the MySQL website: https://dev.mysql.com/doc/sakila/en/

    Sakila for SQLite is a part of the sakila-sample-database-ports project intended to provide ported versions of the original MySQL database for other database systems, including:

    • Oracle
    • SQL Server
    • SQLite
    • Interbase/Firebird
    • Microsoft Access

    Sakila for SQLite is a port of the Sakila example database available for MySQL, originally developed by Mike Hillyer of the MySQL AB documentation team. The project is designed to help database administrators decide which database to use for the development of new products: the user can run the same SQL against different kinds of databases and compare their performance.

    License: BSD Copyright DB Software Laboratory http://www.etl-tools.com

    Note: Part of the insert scripts were generated by Advanced ETL Processor http://www.etl-tools.com/etl-tools/advanced-etl-processor-enterprise/overview.html

    Information about the project and the downloadable files can be found at: https://code.google.com/archive/p/sakila-sample-database-ports/

    Other versions and developments of the project can be found at: https://github.com/ivanceras/sakila/tree/master/sqlite-sakila-db

    https://github.com/jOOQ/jOOQ/tree/main/jOOQ-examples/Sakila

    Direct access to the MySQL Sakila database, which does not require installation of MySQL (queries can be typed directly in the browser), is provided on the phpMyAdmin demo version website: https://demo.phpmyadmin.net/master-config/

    Files Description

    The files in the sqlite-sakila-db folder are the script files which can be used to generate the SQLite version of the database. For convenience, the script files have already been run in cmd to generate the sqlite-sakila.db file, as follows:

    sqlite> .open sqlite-sakila.db              # creates the .db file
    sqlite> .read sqlite-sakila-schema.sql      # creates the database schema
    sqlite> .read sqlite-sakila-insert-data.sql # inserts the data

    Therefore, the sqlite-sakila.db file can be loaded directly into SQLite3 and queries can be executed immediately. You can refer to my notebook for an overview of the database and a demonstration of SQL queries. Note: data for the film_text table is not provided in the script files, so the film_text table is empty; instead, the film_id, title and description fields are included in the film table. Moreover, the Sakila Sample Database has many versions, so an Entity Relationship Diagram (ERD) is provided to describe this specific version. You are advised to refer to the ERD to familiarise yourself with the structure of the database.
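
    For example, a minimal Python sketch for querying the pre-built file (assuming sqlite-sakila.db sits in the working directory; the film table and its rating column are part of the Sakila schema described above):

    import sqlite3

    # Open the pre-built SQLite Sakila database shipped with this dataset.
    conn = sqlite3.connect("sqlite-sakila.db")
    cur = conn.cursor()

    # Count films per rating; the film table and its rating column come from
    # the standard Sakila schema.
    cur.execute(
        "SELECT rating, COUNT(*) AS n_films "
        "FROM film GROUP BY rating ORDER BY n_films DESC"
    )
    for rating, n_films in cur.fetchall():
        print(rating, n_films)

    conn.close()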

  2. Bike Store Relational Database | SQL

    • kaggle.com
    zip
    Updated Aug 21, 2023
    Cite
    Dillon Myrick (2023). Bike Store Relational Database | SQL [Dataset]. https://www.kaggle.com/datasets/dillonmyrick/bike-store-sample-database
    Explore at:
    zip(94412 bytes)Available download formats
    Dataset updated
    Aug 21, 2023
    Authors
    Dillon Myrick
    Description

    This is the sample database from sqlservertutorial.net. This is a great dataset for learning SQL and practicing querying relational databases.

    Database Diagram:

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F4146319%2Fc5838eb006bab3938ad94de02f58c6c1%2FSQL-Server-Sample-Database.png?generation=1692609884383007&alt=media

    Terms of Use

    The sample database is copyrighted and cannot be used for commercial purposes, including but not limited to:

    • Selling it
    • Including it in paid courses

  3. classicmodels

    • kaggle.com
    zip
    Updated Dec 10, 2022
    + more versions
    Cite
    Marta Tavares (2022). classicmodels [Dataset]. https://www.kaggle.com/datasets/martatavares/classicmodels
    Explore at:
    zip(72431 bytes)Available download formats
    Dataset updated
    Dec 10, 2022
    Authors
    Marta Tavares
    Description

    MySQL Classicmodels sample database

    The MySQL sample database schema consists of the following tables:

    • Customers: stores customers' data.
    • Products: stores a list of scale model cars.
    • ProductLines: stores a list of product line categories.
    • Orders: stores sales orders placed by customers.
    • OrderDetails: stores sales order line items for each sales order.
    • Payments: stores payments made by customers based on their accounts.
    • Employees: stores all employee information as well as the organization structure such as who reports to whom.
    • Offices: stores sales office data.

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F8652778%2Fefc56365be54c0e2591a1aefa5041f36%2FMySQL-Sample-Database-Schema.png?generation=1670498341027618&alt=media
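
    As an illustration, a typical practice query against this schema might look as follows in Python. The connection parameters are placeholders, and the column names (customerName, customerNumber, orderNumber) are the ones conventionally used in this sample database, so verify them against the diagram above.

    import mysql.connector  # pip install mysql-connector-python

    # Placeholder credentials -- adjust to the MySQL server into which the
    # classicmodels dump was imported.
    conn = mysql.connector.connect(
        host="localhost", user="root", password="secret", database="classicmodels"
    )
    cur = conn.cursor()

    # Orders per customer, using the join key conventionally named
    # customerNumber in this sample schema.
    cur.execute(
        "SELECT c.customerName, COUNT(o.orderNumber) AS n_orders "
        "FROM customers c LEFT JOIN orders o ON o.customerNumber = c.customerNumber "
        "GROUP BY c.customerName ORDER BY n_orders DESC LIMIT 10"
    )
    for name, n_orders in cur.fetchall():
        print(name, n_orders)

    conn.close()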

  4. Design and Implementation of Computerized Student Information System

    • figshare.com
    docx
    Updated Nov 23, 2022
    Cite
    Micha Yohana (2022). Design and Implementation of Computerized Student Information System [Dataset]. http://doi.org/10.6084/m9.figshare.21608814.v1
    Explore at:
    docxAvailable download formats
    Dataset updated
    Nov 23, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Micha Yohana
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Over the years, the sorting of student information has been a matter of concern because it is done manually. Student files, which are scattered all over the place, have to be searched thoroughly for a particular piece of information to be retrieved, and this has caused some of the files to become tattered. This application focuses on the management of student information for the Computer Science and Information Technology department. Once a student is enrolled, his or her information is stored in the student database. This information includes the student’s name, registration number, state of origin, grade point and exam scores. The objective of the study is to design and implement a student information management system for the Computer Science and Information Technology department to automate selected processes in the Student Record System. In particular, the study aims to: 1. analyse the current student record system; 2. design and develop a system that meets the user requirements and is scalable and intelligent in nature; and 3. conduct testing, using a test edition of the developed system with the end users, to ascertain the efficiency, scalability and intelligence of the system.

  5. Data from: SQL Injection Attack Netflow

    • portalcienciaytecnologia.jcyl.es
    • portalcientifico.unileon.es
    • +3more
    Updated 2022
    + more versions
    Cite
    Crespo, Ignacio; Campazas, Adrián (2022). SQL Injection Attack Netflow [Dataset]. https://portalcienciaytecnologia.jcyl.es/documentos/668fc461b9e7c03b01bdba14
    Explore at:
    Dataset updated
    2022
    Authors
    Crespo, Ignacio; Campazas, Adrián
    Description

    Introduction

    These datasets contain SQL injection attacks (SQLIA) as malicious NetFlow data. The attacks carried out are SQL injection by Union query and Blind SQL injection. The SQLMAP tool was used to perform the attacks. The NetFlow traffic was generated using DOROTHEA (DOcker-based fRamework fOr gaTHering nEtflow trAffic). NetFlow is a network protocol developed by Cisco for the collection and monitoring of network traffic flow data. A flow is defined as a unidirectional sequence of packets with some common properties that pass through a network device.

    Datasets

    The first dataset was collected to train the detection models (D1); the other was collected using different attacks than those used in training, to test the models and ensure their generalization (D2). The datasets contain both benign and malicious traffic, and all collected datasets are balanced. The version of NetFlow used to build the datasets is 5.

    Dataset   Aim        Samples   Benign-malicious traffic ratio
    D1        Training   400,003   50%
    D2        Test       57,239    50%

    Infrastructure and implementation

    Two sets of flow data were collected with DOROTHEA. DOROTHEA is a Docker-based framework for NetFlow data collection. It allows you to build interconnected virtual networks to generate and collect flow data using the NetFlow protocol. In DOROTHEA, network traffic packets are sent to a NetFlow generator with the ipt_netflow sensor installed. The sensor consists of a module for the Linux kernel using Iptables, which processes the packets and converts them to NetFlow flows. DOROTHEA is configured to use NetFlow v5 and to export a flow after it has been inactive for 15 seconds or active for 1800 seconds (30 minutes).

    Benign traffic generation nodes simulate network traffic generated by real users, performing tasks such as searching in web browsers, sending emails, or establishing Secure Shell (SSH) connections. Such tasks run as Python scripts; users may customize them or even incorporate their own. The network traffic is managed by a gateway that performs two main tasks: on the one hand, it routes packets to the Internet; on the other hand, it sends them to a NetFlow data generation node (this process is carried out in the same way for packets received from the Internet).

    The malicious traffic collected (SQLIA attacks) was generated with SQLMAP, a penetration testing tool used to automate the process of detecting and exploiting SQL injection vulnerabilities. The attacks were executed on 16 nodes, each launching SQLMAP with the following parameters:

    • '--banner', '--current-user', '--current-db', '--hostname', '--is-dba', '--users', '--passwords', '--privileges', '--roles', '--dbs', '--tables', '--columns', '--schema', '--count', '--dump', '--comments': enumerate users, password hashes, privileges, roles, databases, tables and columns
    • --level=5: increase the probability of a false positive identification
    • --risk=3: increase the probability of extracting data
    • --random-agent: select the User-Agent randomly
    • --batch: never ask for user input, use the default behavior
    • --answers="follow=Y": predefined answers to yes

    Every node executed SQLIA on 200 victim nodes. The victim nodes had deployed a web form vulnerable to Union-type injection attacks, which was connected to the MySQL or SQLServer database engines (50% of the victim nodes deployed MySQL and the other 50% deployed SQLServer). The web service was accessible from ports 443 and 80, which are the ports typically used to deploy web services. The IP address space was 182.168.1.1/24 for the benign and malicious traffic-generating nodes; for victim nodes, the address space was 126.52.30.0/24.

    The malicious traffic in the test set was collected under different conditions. For D1, SQLIA was performed using Union attacks on the MySQL and SQLServer databases. For D2, Blind SQLIAs were performed against the web form connected to a PostgreSQL database. The IP address spaces of the networks were also different from those of D1: in D2, the IP address space was 152.148.48.1/24 for benign and malicious traffic-generating nodes and 140.30.20.1/24 for victim nodes. To run the MySQL server we used MariaDB version 10.4.12; Microsoft SQL Server 2017 Express and PostgreSQL version 13 were also used.
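
    For illustration, the SQLMAP invocation described above could be reproduced roughly as follows. This is only a sketch: the target URL is a placeholder, and the flags are copied from the parameter list above.

    import subprocess

    # Placeholder target: a web form vulnerable to SQL injection, like the ones
    # deployed on the victim nodes described above.
    target = "http://victim.example/form.php?id=1"

    cmd = [
        "sqlmap", "-u", target,
        # Enumeration flags from the parameter list above.
        "--banner", "--current-user", "--current-db", "--hostname", "--is-dba",
        "--users", "--passwords", "--privileges", "--roles",
        "--dbs", "--tables", "--columns", "--schema", "--count", "--dump", "--comments",
        # Remaining options from the parameter list above.
        "--level=5", "--risk=3", "--random-agent", "--batch", "--answers=follow=Y",
    ]
    subprocess.run(cmd, check=False)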

  6. FooDrugs database: A database with molecular and text information about food...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 28, 2023
    Cite
    Garranzo, Marco; Piette Gómez, Óscar; Lacruz Pleguezuelos, Blanca; Pérez, David; Laguna Lobo, Teresa; Carrillo de Santa Pau, Enrique (2023). FooDrugs database: A database with molecular and text information about food - drug interactions [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6638469
    Explore at:
    Dataset updated
    Jul 28, 2023
    Dataset provided by
    IMDEA Food Institute
    Authors
    Garranzo, Marco; Piette Gómez, Óscar; Lacruz Pleguezuelos, Blanca; Pérez, David; Laguna Lobo, Teresa; Carrillo de Santa Pau, Enrique
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    FooDrugs database is a development done by the Computational Biology Group at IMDEA Food Institute (Madrid, Spain), in the context of the Food Nutrition Security Cloud (FNS-Cloud) project. Food Nutrition Security Cloud (FNS-Cloud) has received funding from the European Union's Horizon 2020 Research and Innovation programme (H2020-EU.3.2.2.3. – A sustainable and competitive agri-food industry) under Grant Agreement No. 863059 – www.fns-cloud.eu (See more details about FNS-Cloud below)

    FooDrugs stores information extracted from transcriptomics and text documents for food-drug interactions, and it is part of a demonstrator to be developed in the FNS-Cloud project. The database was built using MySQL, an open source relational database management system. FooDrugs hosts information for a total of 161 transcriptomics GEO series with 585 conditions for food or bioactive compounds. Each condition is defined as a food/biocomponent per time point, per concentration, per cell line, primary culture or biopsy, per study. FooDrugs includes information about a bipartite network with 510 nodes and their similarity scores (tau score; https://clue.io/connectopedia/connectivity_scores) related to possible interactions with drugs assayed in the Connectivity Map (https://www.broadinstitute.org/connectivity-map-cmap). The information is stored in eight tables:

    Table “study” : This table contains basic information about study identifiers from GEO, pubmed or platform, study type, title and abstract

    Table “sample”: This table contains basic information about the different experiments in a study, like the identifier of the sample, treatment, origin type, time point or concentration.

    Table “misc_study”: This table contains additional information about different attributes of the study.

    Table “misc_sample”: This table contains additional information about different attributes of the sample.

    Table “cmap”: This table contains information about 70895 nodes, comprising drugs, foods or bioactives, and overexpressed and knockdown genes (see section 3.4). The information includes cell line, compound and perturbation type.

    Table “cmap_foodrugs”: This table contains information about the tau score (see section 3.4) that relates food with drugs or genes and the node identifier in the FooDrugs network.

    Table “topTable”: This table contains information about 150 over and underexpressed genes from each GEO study condition, used to calculate the tau score (see section 3.4). The information stored is the logarithmic fold change, average expression, t-statistic, p-value, adjusted p-value and if the gene is up or downregulated.

    Table “nodes”: This table stores the information about the identification of the sample and the node in the bipartite network connecting the tables “sample”, “cmap_foodrugs” and “topTable”.

    In addition, FooDrugs database stores a total of 6422 food/drug interactions from 2849 text documents, obtained from three different sources: 2312 documents from PubMed, 285 from DrugBank, and 252 from drugs.com. These documents describe potential interactions between 1464 food/bioactive compounds and 3009 drugs. The information is stored in two tables:

    Table “texts”: This table contains all the documents with their identifiers, where interactions have been identified with the strategy described in section 4.

    Table “TM_interactions”: This table contains information about interaction identifiers, the food and drug entities, and the start and the end positions of the context for the interaction in the document.
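
    Once the MySQL database has been restored, the tables listed above can be inspected with a few lines of Python. This is only a sketch: the connection parameters and the database name are assumptions.

    import mysql.connector  # pip install mysql-connector-python

    # Placeholder connection details; only the table names come from the
    # description above.
    conn = mysql.connector.connect(
        host="localhost", user="root", password="secret", database="foodrugs"
    )
    cur = conn.cursor()

    tables = ["study", "sample", "misc_study", "misc_sample", "cmap",
              "cmap_foodrugs", "topTable", "nodes", "texts", "TM_interactions"]
    for table in tables:
        cur.execute(f"SELECT COUNT(*) FROM `{table}`")
        print(table, cur.fetchone()[0])

    conn.close()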

    FNS-Cloud will overcome fragmentation problems by integrating existing FNS data, which is essential for high-end, pan-European FNS research, addressing FNS, diet, health, and consumer behaviours as well as on sustainable agriculture and the bio-economy. Current fragmented FNS resources not only result in knowledge gaps that inhibit public health and agricultural policy, and the food industry from developing effective solutions, making production sustainable and consumption healthier, but also do not enable exploitation of FNS knowledge for the benefit of European citizens. FNS-Cloud will, through three Demonstrators; Agri-Food, Nutrition & Lifestyle and NCDs & the Microbiome to facilitate: (1) Analyses of regional and country-specific differences in diet including nutrition, (epi)genetics, microbiota, consumer behaviours, culture and lifestyle and their effects on health (obesity, NCDs, ethnic and traditional foods), which are essential for public health and agri-food and health policies; (2) Improved understanding agricultural differences within Europe and what these means in terms of creating a sustainable, resilient food systems for healthy diets; and (3) Clear definitions of boundaries and how these affect the compositions of foods and consumer choices and, ultimately, personal and public health in the future. Long-term sustainability of the FNS-Cloud will be based on Services that have the capacity to link with new resources and enable cross-talk amongst them; access to FNS-Cloud data will be open access, underpinned by FAIR principles (findable, accessible, interoperable and re-useable). FNS-Cloud will work closely with the proposed Food, Nutrition and Health Research Infrastructure (FNHRI) as well as METROFOOD-RI and other existing ESFRI RIs (e.g. ELIXIR, ECRIN) in which several FNS-Cloud Beneficiaries are involved directly. (https://cordis.europa.eu/project/id/863059)

    Changes between versions FooDrugs_v2 and FooDrugs_V3 (31st January 2023) are:

    Increased the number of text documents by 85.675 from PubMed and ClinicalTrials.gov, and the number of Text Mining interactions by 168.826.

    Increased the amount of transcriptomic studies by 32 GEO series.

    Removed all rows in table cmap_foodrugs representing interactions with values of tau=0

    Removed 43 GEO series that after manually checking didn't correspond to food compounds.

    Added a new column to the table texts: citation to hold the citation of the text.

    Added these columns to the table study: contributor to contain the authors of the study, publication_date to store the date of publication of the study in GEO and pubmed_id to reference the publication associated with the study if any.

    Added a new column to topTable to hold the top 150 up-regulated and 150 down-regulated genes.

  7. Oracle 1Z0-908 Dumps with Authentic 1Z0-908 Exam Questions [2023]

    • dataverse.harvard.edu
    Updated Jan 5, 2023
    Cite
    Harvard Dataverse (2023). Oracle 1Z0-908 Dumps with Authentic 1Z0-908 Exam Questions [2023] [Dataset]. http://doi.org/10.7910/DVN/VCD2KK
    Explore at:
    Dataset updated
    Jan 5, 2023
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Prepare For Your Oracle 1Z0-908 Exam with CertsFire The Oracle Database 1Z0-908 certification will significantly increase the scope of your knowledge. You need to put in a lot of preparation work to pass the Oracle Database 1Z0-908 exam. MySQL 8.0 Database Administrator 1Z0-908 exam dumps are a great chance to assess your knowledge and skills. 1Z0-908 Questions can better understand your areas of strength and weakness by asking yourself questions. To be ready for the Oracle Database 1Z0-908 certification exam, you should take a Oracle Database 1Z0-908 practice exam. Utilizing 1Z0-908 exam preparation software, you can hone your abilities and perform better. To help you select the Oracle Database 1Z0-908 exam dumps that are best for you, they are divided into a number of categories. The Oracle Database 1Z0-908 practice test application employs the exact same testing strategy as the Oracle Database 1Z0-908 exam. You may get ready for your Oracle Database 1Z0-908 certification exam by using 1Z0-908 exam questions. Through MySQL 8.0 Database Administrator 1Z0-908 exam dumps that feature questions relating to a company's ideal employees, job proficiency can be assessed. These Oracle 1Z0-908 practice questions include settings that are typical of real life, making the scoring questions particularly useful for mid-level managers and entry-level recruits. Some Best Formats of CertsFire Oracle 1Z0-908 Exam Questions: By taking a Oracle Database 1Z0-908 practice exam, you can find out what you're good at. Oracle Database 1Z0-908 exam preparation software is the best way to prepare for your Oracle Database 1Z0-908 certification exam. With the Oracle Database 1Z0-908 list of questions, you can brush up on your skills and knowledge. With CertsFire.com, you'll access a lot of 1Z0-908 practice questions, detailed explanations, and personalized feedback. And because it's all online, you can study anywhere, anytime. The MySQL 8.0 Database Administrator 1Z0-908 practice exam consists of questions from a pool of questions. You can narrow down the Oracle 1Z0-908 dumps pool using filters to focus on specific topics, which will help you know what areas you need more work on before taking your official Oracle Database 1Z0-908 exam. To ensure effectiveness, we offer the Oracle Exam Questions in three versions i.e Web-Based Oracle 1Z0-908 Practice Exam Desktop Oracle 1Z0-908 Practice Test Software Oracle 1Z0-908 PDF Dumps [Contains Questions and Answers in a simple PDF document] Boost Your Career with the Latest Oracle 1Z0-908 Exam Questions: The Oracle Database 1Z0-908 online exam simulator is the best way to prepare for the Oracle Database 1Z0-908 exam. CertsFire has a huge selection of 1Z0-908 dumps and topics that you can choose from. The 1Z0-908 exam questions are categorized into specific areas, letting you focus on the MySQL 8.0 Database Administrator 1Z0-908 subject areas you need to work on. Additionally, Oracle 1Z0-908 exam dumps are constantly updated with new Oracle Database 1Z0-908 questions to ensure you're always prepared for Oracle Database 1Z0-908 exam. we offer a free demo of Oracle Database 1Z0-908 exam dumps before the purchase to test the features of the products. We also offer three months of free Oracle Database 1Z0-908 exam questions updates if the 1Z0-908 certification exam content changes after purchasing our 1Z0-908 exam dumps. It is possible to adjust the MySQL 8.0 Database Administrator 1Z0-908 practice test difficulty levels according to your needs. 
You can also choose the number of Oracle 1Z0-908 questions and topics. You may be given the Oracle Database 1Z0-908 practice exam results as soon as they have been saved in the software. They will be preserved for 1 month. we modified Oracle Database 1Z0-908 exam dumps to allow students to learn effectively about the real Oracle Database 1Z0-908 certification exam. Oracle Database 1Z0-908 practice exam software allows students to review and refine skills in a preceding test setting. Oracle 1Z0-908 Exam Questions - Successful Tool For Preparation: With so many online resources, knowing where to start when preparing for an Oracle Database 1Z0-908 exam can be tough. But with Oracle Database 1Z0-908 practice test, you can be confident you're getting the best possible Oracle Database 1Z0-908 exam dumps. Our exam simulator mirrors the Oracle Database 1Z0-908 exam-taking experience, so you know what to expect on MySQL 8.0 Database Administrator 1Z0-908 exam day. Plus, with our wide range of Oracle 1Z0-908 exam questions types and difficulty levels, you can tailor your Oracle Database 1Z0-908 exam practice to your needs. Your performance and exam skills will be improved with our Oracle Database 1Z0-908 practice test software. The software provides you with a range of MySQL 8.0 Database Administrator 1Z0-908 exam dumps, all of which are based on past Oracle 1Z0-908 certification. Either way, the Oracle Database 1Z0-908 practice exam software will provide you with feedback on your performance. The Oracle Database 1Z0-908 practice test software also includes a built-in timer and score tracker so students can monitor their progress. 1Z0-908 practice exam enables applicants to practice time management, answer strategies, and all other elements of the final Oracle Database 1Z0-908 certification exam and can check their scores. The exhaustive report enrollment database allows students to evaluate their performance and prepare for the Oracle Database 1Z0-908 certification exam without further difficulty. Use CertsFire Oracle 1Z0-908 Practice Questions to Pass Exam With Confidence: Our Oracle Database 1Z0-908 exam questions are the best because these are so realistic! It feels just like taking a real Oracle Database 1Z0-908 exam, but without the stress! Our MySQL 8.0 Database Administrator 1Z0-908 practice test software is the answer if you want to score higher on your real Oracle 1Z0-908 certification exam and achieve your academic goals. Don't let the 1Z0-908 exam stress you out! Prepare with CertsFire Oracle Database 1Z0-908 exam dumps and boost your confidence in the real Oracle Database 1Z0-908 exam. We ensure your road towards success without any mark of failure. Time is of the essence - don't wait to ace your MySQL 8.0 Database Administrator 1Z0-908 certification exam! Register yourself now.

  8. Test DB mysql Dataset By Martin 220714

    • moda-opendata-testing.blueplanet.com.tw
    • data.nat.gov.tw
    csv
    Updated Jan 4, 2023
    Cite
    新北市政府 (2023). Test DB mysql Dataset By Martin 220714 [Dataset]. https://moda-opendata-testing.blueplanet.com.tw/dataset/156529
    Explore at:
    csvAvailable download formats
    Dataset updated
    Jan 4, 2023
    Dataset authored and provided by
    新北市政府
    License

    https://moda-opensource-testing.blueplanet.com.tw/license

    Description

    test data.........................................

  9. How Many Methods Covered By Tests Are Pseudo-Tested?

    • figshare.com
    txt
    Updated Jul 31, 2017
    Cite
    Rainer Niedermayr (2017). How Many Methods Covered By Tests Are Pseudo-Tested? [Dataset]. http://doi.org/10.6084/m9.figshare.5259697.v1
    Explore at:
    txtAvailable download formats
    Dataset updated
    Jul 31, 2017
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Rainer Niedermayr
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Accompanying dataset for the paper "How Many Methods Covered By Tests Are Pseudo-Tested?" (IEEE Transactions on Reliability: Special Section on Software Testing and Program Analysis). We analyzed test cases of 19 open-source projects using mutation testing. We determined for each covered method whether at least one test case can detect if the method’s whole logic is removed. We studied method characteristics and relations to test cases to find indicators for covered methods that are pseudo-tested. The SQL dump requires a MySQL database (version >= 5.7).

  10. daily - sales - expenses database&tables

    • kaggle.com
    zip
    Updated Nov 11, 2025
    Cite
    OIE (2025). daily - sales - expenses database&tables [Dataset]. https://www.kaggle.com/datasets/emmyofh/daily-sales-expenses-database-and-tables
    Explore at:
    zip(666 bytes)Available download formats
    Dataset updated
    Nov 11, 2025
    Authors
    OIE
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This file contains the database and table creation scripts used in the Daily Sales & Expenses System project. It defines the structure of the MySQL database where all sales, expenses, inventory, and login records are stored before being accessed, analyzed, and visualized using Python.

    The file includes:

    • Commands to create the database
    • SQL statements to create the users, sales, expenses and inventory tables
    • (Optionally) Sample INSERT statements to populate the tables with test data

    This file is essential for anyone who wants to replicate or test the Python-MySQL integration in the project.
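
    A minimal sketch of the Python-MySQL integration mentioned above: the credentials, the database name and the use of SQLAlchemy/pandas are assumptions; only the four table names come from this description.

    import pandas as pd
    from sqlalchemy import create_engine  # pip install sqlalchemy mysql-connector-python

    # Placeholder connection string -- adjust to whatever the creation script
    # in this dataset sets up.
    engine = create_engine("mysql+mysqlconnector://root:secret@localhost/daily_sales")

    # The users, sales, expenses and inventory tables are named above; their
    # columns are not documented here, so we just preview each one.
    for table in ("users", "sales", "expenses", "inventory"):
        df = pd.read_sql(f"SELECT * FROM {table} LIMIT 5", engine)
        print(table, df.shape)
        print(df.head())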

  11. Data and scripts for the paper

    • zenodo.org
    zip
    Updated Jan 24, 2020
    Cite
    anonymous; anonymous (2020). Data and scripts for the paper [Dataset]. http://doi.org/10.5281/zenodo.2567662
    Explore at:
    zipAvailable download formats
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    anonymous; anonymous
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These files were used to calculate the model effect and select candidate committers for the paper.

    The folder `calculate-model-effect` contains the MySQL database and the Python script for each metric.
    In the MySQL database, there are the following tables:
    1. scmlog (the basic information of the commits)
    2. files
    3. hash_file (the hash of commits and their modified files)
    4. sign (signed-off-by of commits)
    5. review (reviewed-by of commits)
    6. test (tested-by of commits)
    7. ack (acked-by of commits)
    8. maintainers (created using the file MAINTAINERS in the Linux kernel repository)
    9. signer_maintainer
    10. i915-committer-no-maintainer

    The folder `select-candidate-committers` contains the data and the C++ script for selecting candidate committers for the subsystems.
    First, run `gen.cpp` to build the collaboration network of the contributors for each potential subsystem.

    Then, run `main.cpp` to get the list of the candidate committers.

  12. KGCW 2024 Challenge @ ESWC 2024

    • data-staging.niaid.nih.gov
    • investigacion.usc.gal
    • +3more
    Updated Jun 11, 2024
    Cite
    Van Assche, Dylan; Chaves-Fraga, David; Dimou, Anastasia; Serles, Umutcan; Iglesias, Ana (2024). KGCW 2024 Challenge @ ESWC 2024 [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_10721874
    Explore at:
    Dataset updated
    Jun 11, 2024
    Dataset provided by
    Universidad Politécnica de Madrid
    KU Leuven
    Universidade de Santiago de Compostela
    IDLab
    STI Innsbruck
    Authors
    Van Assche, Dylan; Chaves-Fraga, David; Dimou, Anastasia; Serles, Umutcan; Iglesias, Ana
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Knowledge Graph Construction Workshop 2024: challenge

    Knowledge graph construction of heterogeneous data has seen a lot of uptake in the last decade, from compliance to performance optimizations with respect to execution time. Besides execution time as a metric for comparing knowledge graph construction, other metrics, e.g. CPU or memory usage, are not considered. This challenge aims at benchmarking systems to find which RDF graph construction system optimizes for metrics such as execution time, CPU, memory usage, or a combination of these metrics.

    Task description

    The task is to reduce and report the execution time and computing resources (CPU and memory usage) for the parameters listed in this challenge, compared to the state of the art of the existing tools and the baseline results provided by this challenge. This challenge is not limited to execution time to create the fastest pipeline, but also covers computing resources to achieve the most efficient pipeline.

    We provide a tool which can execute such pipelines end-to-end. This tool also collects and aggregates the metrics necessary for this challenge, such as execution time, CPU and memory usage, as CSV files. Moreover, the information about the hardware used during the execution of the pipeline is available as well, to allow fairly comparing different pipelines. Your pipeline should consist of Docker images which can be executed on Linux to run the tool. The tool is already tested with existing systems, relational databases (e.g. MySQL and PostgreSQL), and triplestores (e.g. Apache Jena Fuseki and OpenLink Virtuoso), which can be combined in any configuration. It is strongly encouraged to use this tool for participating in this challenge. If you prefer to use a different tool or our tool imposes technical requirements you cannot solve, please contact us directly.

    Track 1: Conformance

    The set of new specifications for the RDF Mapping Language (RML), established by the W3C Community Group on Knowledge Graph Construction, provides a set of test-cases for each module:

    RML-Core

    RML-IO

    RML-CC

    RML-FNML

    RML-Star

    These test-cases are evaluated in this Track of the Challenge to determine their feasibility, correctness, etc. by applying them in implementations. This Track is in Beta status because these new specifications have not seen any implementation yet, thus it may contain bugs and issues. If you find problems with the mappings, output, etc. please report them to the corresponding repository of each module.

    Note: validating the output of the RML Star module automatically through the provided tooling is currently not possible, see https://github.com/kg-construct/challenge-tool/issues/1.

    Through this Track we aim to spark development of implementations for the new specifications and improve the test-cases. Let us know your problems with the test-cases and we will try to find a solution.

    Track 2: Performance

    Part 1: Knowledge Graph Construction Parameters

    These parameters are evaluated using synthetically generated data to gain more insight into their influence on the pipeline.

    Data

    Number of data records: scaling the data size vertically by the number of records with a fixed number of data properties (10K, 100K, 1M, 10M records).

    Number of data properties: scaling the data size horizontally by the number of data properties with a fixed number of data records (1, 10, 20, 30 columns).

    Number of duplicate values: scaling the number of duplicate values in the dataset (0%, 25%, 50%, 75%, 100%).

    Number of empty values: scaling the number of empty values in the dataset (0%, 25%, 50%, 75%, 100%).

    Number of input files: scaling the number of datasets (1, 5, 10, 15).

    Mappings

    Number of subjects: scaling the number of subjects with a fixed number of predicates and objects (1, 10, 20, 30 TMs).

    Number of predicates and objects: scaling the number of predicates and objects with a fixed number of subjects (1, 10, 20, 30 POMs).

    Number of and type of joins: scaling the number of joins and type of joins (1-1, N-1, 1-N, N-M)

    Part 2: GTFS-Madrid-Bench

    The GTFS-Madrid-Bench provides insights into the pipeline with real data from the public transport domain in Madrid.

    Scaling

    GTFS-1 SQL

    GTFS-10 SQL

    GTFS-100 SQL

    GTFS-1000 SQL

    Heterogeneity

    GTFS-100 XML + JSON

    GTFS-100 CSV + XML

    GTFS-100 CSV + JSON

    GTFS-100 SQL + XML + JSON + CSV

    Example pipeline

    The ground truth dataset and baseline results are generated in different steps for each parameter:

    The provided CSV files and SQL schema are loaded into a MySQL relational database.

    Mappings are executed by accessing the MySQL relational database to construct a knowledge graph in N-Triples as RDF format

    The pipeline is executed 5 times, from which the median execution time of each step is calculated and reported. Each step with the median execution time is then reported in the baseline results with all its measured metrics. The knowledge graph construction timeout is set to 24 hours. The execution is performed with the following tool: https://github.com/kg-construct/challenge-tool; you can adapt the execution plans for this example pipeline to your own needs.

    Each parameter has its own directory in the ground truth dataset with the following files:

    Input dataset as CSV.

    Mapping file as RML.

    Execution plan for the pipeline in metadata.json.

    Datasets

    Knowledge Graph Construction Parameters

    The dataset consists of:

    Input dataset as CSV for each parameter.

    Mapping file as RML for each parameter.

    Baseline results for each parameter with the example pipeline.

    Ground truth dataset for each parameter generated with the example pipeline.

    Format

    All input datasets are provided as CSV; depending on the parameter that is being evaluated, the number of rows and columns may differ. The first row is always the header of the CSV.

    GTFS-Madrid-Bench

    The dataset consists of:

    Input dataset as CSV with SQL schema for the scaling, and a combination of XML, CSV, and JSON for the heterogeneity.

    Mapping file as RML for both scaling and heterogeneity.

    SPARQL queries to retrieve the results.

    Baseline results with the example pipeline.

    Ground truth dataset generated with the example pipeline.

    Format

    CSV datasets always have a header as their first row. JSON and XML datasets have their own schema.

    Evaluation criteria

    Submissions must evaluate the following metrics:

    Execution time of all the steps in the pipeline. The execution time of a step is the difference between the begin and end time of a step.

    CPU time as the time spent in the CPU for all steps of the pipeline. The CPU time of a step is the difference between the begin and end CPU time of a step.

    Minimal and maximal memory consumption for each step of the pipeline. The minimal and maximal memory consumption of a step is the minimum and maximum calculated of the memory consumption during the execution of a step.
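
    As a rough illustration of these metrics, the sketch below times one pipeline step and reads its CPU time and peak memory from the operating system. The official measurements come from the challenge tool linked above; the command here is a placeholder, and minimal memory consumption would additionally require sampling during the run (e.g. with psutil).

    import resource
    import subprocess
    import time

    def measure_step(cmd):
        """Run one pipeline step and report wall-clock time, CPU time, and
        peak memory of the child processes (Linux; ru_maxrss is in KiB)."""
        before = resource.getrusage(resource.RUSAGE_CHILDREN)
        start = time.monotonic()
        subprocess.run(cmd, check=True)
        wall = time.monotonic() - start
        after = resource.getrusage(resource.RUSAGE_CHILDREN)
        cpu = (after.ru_utime + after.ru_stime) - (before.ru_utime + before.ru_stime)
        peak_mem_kib = after.ru_maxrss  # maximum over all children so far
        return wall, cpu, peak_mem_kib

    # Placeholder command standing in for one step of a materialization pipeline.
    print(measure_step(["echo", "run-mapping-step"]))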

    Expected output

    Duplicate values

    Scale Number of Triples

    0 percent 2000000 triples

    25 percent 1500020 triples

    50 percent 1000020 triples

    75 percent 500020 triples

    100 percent 20 triples

    Empty values

    Scale Number of Triples

    0 percent 2000000 triples

    25 percent 1500000 triples

    50 percent 1000000 triples

    75 percent 500000 triples

    100 percent 0 triples

    Mappings

    Scale Number of Triples

    1TM + 15POM 1500000 triples

    3TM + 5POM 1500000 triples

    5TM + 3POM 1500000 triples

    15TM + 1POM 1500000 triples

    Properties

    Scale Number of Triples

    1M rows 1 column 1000000 triples

    1M rows 10 columns 10000000 triples

    1M rows 20 columns 20000000 triples

    1M rows 30 columns 30000000 triples

    Records

    Scale Number of Triples

    10K rows 20 columns 200000 triples

    100K rows 20 columns 2000000 triples

    1M rows 20 columns 20000000 triples

    10M rows 20 columns 200000000 triples

    Joins

    1-1 joins

    Scale Number of Triples

    0 percent 0 triples

    25 percent 125000 triples

    50 percent 250000 triples

    75 percent 375000 triples

    100 percent 500000 triples

    1-N joins

    Scale Number of Triples

    1-10 0 percent 0 triples

    1-10 25 percent 125000 triples

    1-10 50 percent 250000 triples

    1-10 75 percent 375000 triples

    1-10 100 percent 500000 triples

    1-5 50 percent 250000 triples

    1-10 50 percent 250000 triples

    1-15 50 percent 250005 triples

    1-20 50 percent 250000 triples

    N-1 joins

    Scale Number of Triples

    10-1 0 percent 0 triples

    10-1 25 percent 125000 triples

    10-1 50 percent 250000 triples

    10-1 75 percent 375000 triples

    10-1 100 percent 500000 triples

    5-1 50 percent 250000 triples

    10-1 50 percent 250000 triples

    15-1 50 percent 250005 triples

    20-1 50 percent 250000 triples

    N-M joins

    Scale Number of Triples

    5-5 50 percent 1374085 triples

    10-5 50 percent 1375185 triples

    5-10 50 percent 1375290 triples

    5-5 25 percent 718785 triples

    5-5 50 percent 1374085 triples

    5-5 75 percent 1968100 triples

    5-5 100 percent 2500000 triples

    5-10 25 percent 719310 triples

    5-10 50 percent 1375290 triples

    5-10 75 percent 1967660 triples

    5-10 100 percent 2500000 triples

    10-5 25 percent 719370 triples

    10-5 50 percent 1375185 triples

    10-5 75 percent 1968235 triples

    10-5 100 percent 2500000 triples

    GTFS Madrid Bench

    Generated Knowledge Graph

    Scale Number of Triples

    1 395953 triples

    10 3959530 triples

    100 39595300 triples

    1000 395953000 triples

    Queries

    Query Scale 1 Scale 10 Scale 100 Scale 1000

    Q1 58540 results 585400 results No results available No results available

    Q2 636 results 11998 results 125565 results 1261368 results

    Q3 421 results 4207 results 42067 results 420667 results

    Q4 13 results 130 results 1300 results 13000 results

    Q5 35 results 350 results 3500 results 35000 results

    Q6 1 result 1 result 1 result 1 result

    Q7 68 results 67 results 67 results 53 results

    Q8 35460 results 354600 results No results available No results available

    Q9 130 results 1300

  13. How Many Methods Covered By Tests Are Pseudo-Tested?

    • figshare.com
    txt
    Updated Jun 8, 2017
    Cite
    Rainer Niedermayr (2017). How Many Methods Covered By Tests Are Pseudo-Tested? [Dataset]. http://doi.org/10.6084/m9.figshare.5092099.v1
    Explore at:
    txtAvailable download formats
    Dataset updated
    Jun 8, 2017
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Rainer Niedermayr
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset of the ICTSS 2017 paper by Niedermayr et al. We analyzed test cases of 19 open-source projects using mutation testing. We determined for each covered method whether at least one test case can detect if the method’s whole logic is removed. We studied method characteristics and relations to test cases to find indicators for covered methods that are pseudo-tested. The SQL dump requires a MySQL database (version >= 5.7).

  14. Employees

    • kaggle.com
    zip
    Updated Nov 12, 2021
    Cite
    Sudhir Singh (2021). Employees [Dataset]. https://www.kaggle.com/datasets/crepantherx/employees
    Explore at:
    zip(31992550 bytes)Available download formats
    Dataset updated
    Nov 12, 2021
    Authors
    Sudhir Singh
    Description

    Dataset

    This dataset was created by Sudhir Singh

    Released under Data files © Original Authors


  15. How Much Code Covered By Tests is Pseudo-Tested?

    • figshare.com
    7z
    Updated Feb 3, 2017
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Rainer Niedermayr (2017). How Much Code Covered By Tests is Pseudo-Tested? [Dataset]. http://doi.org/10.6084/m9.figshare.4609348.v1
    Explore at:
    7zAvailable download formats
    Dataset updated
    Feb 3, 2017
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Rainer Niedermayr
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset of the ISSTA 2017 paper by Niedermayr et al. We analyzed test cases of 19 open-source projects using mutation testing. We determined for each covered method whether at least one test case can detect if the method’s whole logic is removed. We studied method characteristics and relations to test cases to find indicators for covered methods that are pseudo-tested. The SQL dump requires a MySQL database (version >= 5.7).

  16. Data from: Peatland Mid-Infrared Database (1.0.0)

    • repository.soilwise-he.eu
    Updated Sep 10, 2025
    Cite
    (2025). Peatland Mid-Infrared Database (1.0.0) [Dataset]. http://doi.org/10.5281/zenodo.17092587
    Explore at:
    Dataset updated
    Sep 10, 2025
    Description

    README 2025-09-10

    Introduction

    The peatland mid infrared database (pmird) stores data from peat, vegetation, litter, and dissolved organic matter samples, in particular mid infrared spectra and other variables, from previously published and unpublished data sources. The majority of samples in the database are peat samples from northern bogs. Currently, the database contains entries from 26 studies, 11216 samples, and 3877 mid infrared spectra. The aim is to provide a harmonized data source that can be useful to re-analyse existing data, analyze peat chemistry, develop and test spectral prediction models, and provide data on various peat properties.

    Usage notes

    Download and Setup

    The peatland mid infrared database can be downloaded from https://doi.org/10.5281/zenodo.17092587. The publication contains the following files and folders:

    • pmird-backup-2025-09-10.sql: A mysqldump backup of the pmird database.
    • pmird_prepared_data: A folder that contains:
      • Folders like c00001-2020-08-17-Hodgkins with the raw spectra for samples from each dataset in the pmird database (see below for how to import the spectra).
      • Files like pmird_prepare_data_c00001-2020-08-17-Hodgkins.Rmd that contain the R code used to process and import the data from each dataset into the database. Corresponding html files contain the compiled scripts.
    • pmird_prepare_data.Rmd: An Rmarkdown script that was used to run the scripts that created the database (the top-level script).
    • mysql_scripts: A folder that contains:
      • pmird_mysql_initialization.sql: MariaDB script to initialize the database.
      • 001-db-initialize.Rmd: Rmarkdown script that executes pmird_mysql_initialization.sql and populates dataset-independent tables.
      • add-citations.Rmd: Rmarkdown script that adds information on references to the database.
      • add-licenses.Rmd: Rmarkdown script that adds information on licenses to the database.
      • add-mir-metadata-quality.Rmd: Rmarkdown script that adds information on the quality of the infrared spectra to the database.
    • Dockerfile: A Dockerfile that defines the computing environment used to create the database.
    • renv.lock: A renv.lock file that lists the R packages used to create the database.

    The database can be set up as follows: the downloaded database needs to be imported into a running MariaDB instance. In a Linux terminal, the downloaded sql file can be imported like so: mysql -u
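
    Once the dump has been imported, a quick sanity check from Python could look like the sketch below; the connection details and the database name "pmird" are assumptions.

    import mysql.connector  # pip install mysql-connector-python

    # Placeholders: point this at the MariaDB instance into which
    # pmird-backup-2025-09-10.sql was imported.
    conn = mysql.connector.connect(
        host="localhost", user="root", password="secret", database="pmird"
    )
    cur = conn.cursor()
    cur.execute("SHOW TABLES")
    for (table,) in cur.fetchall():
        print(table)
    conn.close()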

  17. Rediscovery Datasets: Connecting Duplicate Reports of Apache, Eclipse, and...

    • data.niaid.nih.gov
    Updated Aug 3, 2024
    Cite
    Sadat, Mefta; Bener, Ayse Basar; Miranskyy, Andriy V. (2024). Rediscovery Datasets: Connecting Duplicate Reports of Apache, Eclipse, and KDE [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_400614
    Explore at:
    Dataset updated
    Aug 3, 2024
    Dataset provided by
    Ryerson University
    Authors
    Sadat, Mefta; Bener, Ayse Basar; Miranskyy, Andriy V.
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We present three defect rediscovery datasets mined from Bugzilla. The datasets capture data for three groups of open source software projects: Apache, Eclipse, and KDE. The datasets contain information about approximately 914 thousand defect reports over a period of 18 years (1999-2017) to capture the inter-relationships among duplicate defects.

    File Descriptions

    apache.csv - Apache Defect Rediscovery dataset

    eclipse.csv - Eclipse Defect Rediscovery dataset

    kde.csv - KDE Defect Rediscovery dataset

    apache.relations.csv - Inter-relations of rediscovered defects of Apache

    eclipse.relations.csv - Inter-relations of rediscovered defects of Eclipse

    kde.relations.csv - Inter-relations of rediscovered defects of KDE

    create_and_populate_neo4j_objects.cypher - Populates Neo4j graphDB by importing all the data from the CSV files. Note that you have to set dbms.import.csv.legacy_quote_escaping configuration setting to false to load the CSV files as per https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/#config_dbms.import.csv.legacy_quote_escaping

    create_and_populate_mysql_objects.sql - Populates MySQL RDBMS by importing all the data from the CSV files

    rediscovery_db_mysql.zip - For your convenience, we also provide full backup of the MySQL database

    neo4j_examples.txt - Sample Neo4j queries

    mysql_examples.txt - Sample MySQL queries

    rediscovery_eclipse_6325.png - Output of Neo4j example #1

    distinct_attrs.csv - Distinct values of bug_status, resolution, priority, severity for each project
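
    For a first look at the CSV files from Python (a sketch; the column layout is not documented in this description, so the code only inspects it):

    import pandas as pd

    # File names come from the description above.
    defects = pd.read_csv("apache.csv")
    relations = pd.read_csv("apache.relations.csv")

    print(defects.shape, list(defects.columns))
    print(relations.shape, list(relations.columns))
    print(defects.head())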

  18. Orphan Drugs - Dataset 1: Twitter issue-networks as excluded publics

    • orda.shef.ac.uk
    txt
    Updated Oct 22, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Matthew Hanchard (2021). Orphan Drugs - Dataset 1: Twitter issue-networks as excluded publics [Dataset]. http://doi.org/10.15131/shef.data.16447326.v1
    Explore at:
    txtAvailable download formats
    Dataset updated
    Oct 22, 2021
    Dataset provided by
    The University of Sheffield
    Authors
    Matthew Hanchard
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset comprises two .csv files used within workstream 2 of the Wellcome Trust funded ‘Orphan drugs: High prices, access to medicines and the transformation of biopharmaceutical innovation’ project (219875/Z/19/Z). They appear in various outputs, e.g. publications and presentations.

    The deposited data were gathered using the University of Amsterdam Digital Methods Institute’s ‘Twitter Capture and Analysis Toolset’ (DMI-TCAT) before being processed and extracted from Gephi. DMI-TCAT queries Twitter’s STREAM Application Programming Interface (API) using SQL and retrieves data on a pre-set text query. It then sends the returned data for storage on a MySQL database. The tool allows for output of that data in various formats. This process aligns fully with Twitter’s service user terms and conditions. The query for the deposited dataset gathered a 1% random sample of all public tweets posted between 10-Feb-2021 and 10-Mar-2021 containing the text ‘Rare Diseases’ and/or ‘Rare Disease Day’, storing it on a local MySQL database managed by the University of Sheffield School of Sociological Studies (http://dmi-tcat.shef.ac.uk/analysis/index.php), accessible only via a valid VPN such as FortiClient and through a permitted active directory user profile. The dataset was output from the MySQL database raw as a .gexf format file, suitable for social network analysis (SNA). It was then opened using Gephi (0.9.2) data visualisation software and anonymised/pseudonymised in Gephi as per the ethical approval granted by the University of Sheffield School of Sociological Studies Research Ethics Committee on 02-Jun-201 (reference: 039187). The deposited dataset comprises of two anonymised/pseudonymised social network analysis .csv files extracted from Gephi, one containing node data (Issue-networks as excluded publics – Nodes.csv) and another containing edge data (Issue-networks as excluded publics – Edges.csv). Where participants explicitly provided consent, their original username has been provided. Where they have provided consent on the basis that they not be identifiable, their username has been replaced with an appropriate pseudonym. All other usernames have been anonymised with a randomly generated 16-digit key. The level of anonymity for each Twitter user is provided in column C of deposited file ‘Issue-networks as excluded publics – Nodes.csv’.
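
    A minimal sketch for loading the two deposited files into a graph with networkx follows. The file names are taken from the description above; the "Source" and "Target" column names are an assumption based on Gephi's usual CSV export layout, so verify them against the files.

    import networkx as nx
    import pandas as pd

    nodes = pd.read_csv("Issue-networks as excluded publics – Nodes.csv")
    edges = pd.read_csv("Issue-networks as excluded publics – Edges.csv")

    # Build the issue network from the edge list.
    g = nx.from_pandas_edgelist(edges, source="Source", target="Target")
    print(g.number_of_nodes(), "nodes,", g.number_of_edges(), "edges")

    # Example: the ten most-connected accounts in the network.
    top = sorted(g.degree, key=lambda kv: kv[1], reverse=True)[:10]
    print(top)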

    This dataset was created and deposited onto the University of Sheffield Online Research Data repository (ORDA) on 26-Aug-2021 by Dr. Matthew S. Hanchard, Research Associate at the University of Sheffield iHuman institute/School of Sociological Studies. ORDA has full permission to store this dataset and to make it open access for public re-use without restriction under a CC BY license, in line with the Wellcome Trust commitment to making all research data Open Access.

    The University of Sheffield are the designated data controller for this dataset.

  19. CHINOOK Music

    • kaggle.com
    zip
    Updated Sep 19, 2024
    Cite
    willian oliveira (2024). CHINOOK Music [Dataset]. https://www.kaggle.com/datasets/willianoliveiragibin/chinook-music
    Explore at:
    zip(9603 bytes)Available download formats
    Dataset updated
    Sep 19, 2024
    Authors
    willian oliveira
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The Chinook Database is a sample database designed for use with multiple database platforms, such as SQL Server, Oracle, MySQL, and others. It can be easily set up by running a single SQL script, making it a convenient alternative to the popular Northwind database. Chinook is widely used in demos and testing environments, particularly for Object-Relational Mapping (ORM) tools that target both single and multiple database servers.

    Supported Database Servers

    Chinook supports several database servers, including:

    • DB2
    • MySQL
    • Oracle
    • PostgreSQL
    • SQL Server
    • SQL Server Compact
    • SQLite

    Download Instructions

    You can download the SQL scripts for each supported database server from the latest release assets. The appropriate SQL script file(s) for your database vendor are provided, which can be executed using your preferred database management tool.
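
    For example, with the SQLite script the database can be built and inspected from Python as sketched below; the script file name is an assumption, so substitute whichever script you downloaded from the release assets.

    import sqlite3

    # Read the downloaded Chinook script (file name assumed).
    with open("Chinook_Sqlite.sql", encoding="utf-8") as f:
        script = f.read()

    conn = sqlite3.connect("chinook.db")
    conn.executescript(script)  # builds the schema and loads the sample data

    # List the tables the script created.
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    ):
        print(name)
    conn.close()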

    Data Model

    The Chinook Database represents a digital media store, containing tables that include:

    • Artists
    • Albums
    • Media tracks
    • Invoices
    • Customers

    Sample Data

    The media data in Chinook is derived from a real iTunes Library, providing a realistic dataset for users. Additionally, users can generate their own SQL scripts using their personal iTunes Library by following specific instructions. Customer and employee details in the database were manually crafted with fictitious names, addresses (mappable via Google Maps), and well-structured contact information such as phone numbers, faxes, and emails. Sales data is auto-generated and spans a four-year period, using random values.
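
    To make the data model concrete, here is a minimal query sketch against the SQLite build from the previous example. The Artist, Album and Track table and column names follow the commonly distributed Chinook schema, but verify them against your copy.

    import sqlite3

    conn = sqlite3.connect("chinook.db")

    # Join artists to their albums and count the tracks on each album.
    query = """
        SELECT ar.Name AS artist, al.Title AS album, COUNT(t.TrackId) AS n_tracks
        FROM Artist AS ar
        JOIN Album AS al ON al.ArtistId = ar.ArtistId
        JOIN Track AS t ON t.AlbumId = al.AlbumId
        GROUP BY ar.ArtistId, al.AlbumId
        ORDER BY n_tracks DESC
        LIMIT 5
    """
    for artist, album, n_tracks in conn.execute(query):
        print(f"{artist} - {album}: {n_tracks} tracks")
    conn.close()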

    Why is it Called Chinook?

    The Chinook Database's name is a nod to its predecessor, the Northwind database. Chinooks are warm, dry winds found in the interior regions of North America, particularly over southern Alberta in Canada, where the Canadian Prairies meet mountain ranges. This natural phenomenon inspired the choice of name, reflecting the idea that Chinook serves as a refreshing alternative to the Northwind database.

  20. Usage Statistics for University of Tasmania EPrints Repository

    • researchdata.edu.au
    Updated Apr 27, 2017
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sale, Arthur; Sale, Arthur (2017). Usage Statistics for University of Tasmania EPrints Repository [Dataset]. https://researchdata.edu.au/usage-statistics-university-eprints-repository/927350
    Explore at:
    Dataset updated
    Apr 27, 2017
    Dataset provided by
    University of Tasmania, Australia
    Authors
    Sale, Arthur; Sale, Arthur
    License

    Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    The dataset is an active collection of access data for information items in the University of Tasmania’s EPrints repository. A scheduled task runs each night and picks up reading the Apache access logs from where it left off the previous night. Each download of an open access full-text item generates a record in the MySQL database, together with a timestamp and an approximate location of the computer that made the download. The location is derived by looking up the IP address against the GeoIP database, with one significant difference: downloads originating from a University of Tasmania IP address are separately identified and removed from the ‘Australia’ category. This prevents vanity searches from achieving high significance. Countries are coded using the ISO 3166 two-letter code.
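
    A simplified sketch of that nightly step is shown below; the log format, paths, campus address range and the stubbed GeoIP lookup are all assumptions for illustration, and SQLite stands in for the MySQL store, so this is not the PHP/Perl implementation described further down.

    import ipaddress
    import re
    import sqlite3  # stand-in for the MySQL database used in production

    # Assumed campus address range; the real system identifies University of Tasmania IPs.
    CAMPUS_NET = ipaddress.ip_network("131.217.0.0/16")

    # Minimal pattern for an Apache access-log line: client IP, timestamp, request, status 200.
    LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "GET (\S+) HTTP/[^"]*" 200 ')

    def country_for(ip):
        # Placeholder: production looks the address up in a GeoIP database and
        # stores the ISO 3166 two-letter country code.
        return "??"

    def record_downloads(log_path, db_path="usage.db"):
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE IF NOT EXISTS downloads (ts TEXT, item TEXT, country TEXT)")
        with open(log_path, encoding="utf-8") as log:
            for line in log:
                m = LOG_RE.match(line)
                if not m or "/eprint/" not in m.group(3):
                    continue  # count only full-text item downloads (assumed URL pattern)
                ip, ts, path = m.groups()
                country = "UTAS" if ipaddress.ip_address(ip) in CAMPUS_NET else country_for(ip)
                conn.execute("INSERT INTO downloads VALUES (?, ?, ?)", (ts, path, country))
        conn.commit()
        conn.close()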

    The dataset has been used to analyse usage of the repository and to tune it for maximal visibility of the University of Tasmania. Researchers with items in the repository have used it to identify the types of use being made of their work and to find potential collaborators. The citation of a work in a journal or conference article, for example, typically causes a step increase in usage, and the citing article can be found through Google or Google Scholar to identify its authors. This enhances the dissemination experience and its value.

    The software was written at the University of Tasmania by Professor Arthur Sale (in PHP), based on earlier work by the University of Melbourne (used with permission). Mr Christian McGee wrote some critical sections of the code in Perl and set up the cron scheduling.

    The dataset is generated by a computer program written by Professor Arthur Sale. The software served as a test bed for ideas and subsequently resulted in an official software set included in the EPrints distribution, which expanded on the concepts significantly.
