28 datasets found
  1. Bike Store Relational Database | SQL

    • kaggle.com
    zip
    Updated Aug 21, 2023
    Cite
    Dillon Myrick (2023). Bike Store Relational Database | SQL [Dataset]. https://www.kaggle.com/datasets/dillonmyrick/bike-store-sample-database
    Available download formats: zip (94412 bytes)
    Dataset updated
    Aug 21, 2023
    Authors
    Dillon Myrick
    Description

    This is the sample database from sqlservertutorial.net. This is a great dataset for learning SQL and practicing querying relational databases.

    Database Diagram:

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F4146319%2Fc5838eb006bab3938ad94de02f58c6c1%2FSQL-Server-Sample-Database.png?generation=1692609884383007&alt=media

    Terms of Use

    The sample database is copyrighted and cannot be used for commercial purposes. Prohibited uses include, but are not limited to:

    • Selling
    • Including in paid courses

  2. SQLite Sakila Sample Database

    • kaggle.com
    zip
    Updated Mar 14, 2021
    Cite
    Atanas Kanev (2021). SQLite Sakila Sample Database [Dataset]. https://www.kaggle.com/datasets/atanaskanev/sqlite-sakila-sample-database/code
    Available download formats: zip (4495190 bytes)
    Dataset updated
    Mar 14, 2021
    Authors
    Atanas Kanev
    Description

    SQLite Sakila Sample Database

    Database Description

    The Sakila sample database is a fictitious database designed to represent a DVD rental store. The tables of the database include film, film_category, actor, customer, rental, payment and inventory among others. The Sakila sample database is intended to provide a standard schema that can be used for examples in books, tutorials, articles, samples, and so forth. Detailed information about the database can be found on the MySQL website: https://dev.mysql.com/doc/sakila/en/

    Sakila for SQLite is a part of the sakila-sample-database-ports project intended to provide ported versions of the original MySQL database for other database systems, including:

    • Oracle
    • SQL Server
    • SQLite
    • Interbase/Firebird
    • Microsoft Access

    Sakila for SQLite is a port of the Sakila example database available for MySQL, which was originally developed by Mike Hillyer of the MySQL AB documentation team. This project is designed to help database administrators decide which database to use for the development of new products. The user can run the same SQL against different kinds of databases and compare the performance.

    License: BSD Copyright DB Software Laboratory http://www.etl-tools.com

    Note: Part of the insert scripts were generated by Advanced ETL Processor http://www.etl-tools.com/etl-tools/advanced-etl-processor-enterprise/overview.html

    Information about the project and the downloadable files can be found at: https://code.google.com/archive/p/sakila-sample-database-ports/

    Other versions and developments of the project can be found at: https://github.com/ivanceras/sakila/tree/master/sqlite-sakila-db

    https://github.com/jOOQ/jOOQ/tree/main/jOOQ-examples/Sakila

    Direct access to the MySQL Sakila database, which does not require installation of MySQL (queries can be typed directly in the browser), is provided on the phpMyAdmin demo version website: https://demo.phpmyadmin.net/master-config/

    Files Description

    The files in the sqlite-sakila-db folder are the script files which can be used to generate the SQLite version of the database. For convenience, the script files have already been run in cmd to generate the sqlite-sakila.db file, as follows:

    sqlite> .open sqlite-sakila.db                # creates the .db file
    sqlite> .read sqlite-sakila-schema.sql        # creates the database schema
    sqlite> .read sqlite-sakila-insert-data.sql   # inserts the data

    Therefore, the sqlite-sakila.db file can be directly loaded into SQLite3 and queries can be directly executed. You can refer to my notebook for an overview of the database and a demonstration of SQL queries. Note: Data about the film_text table is not provided in the script files, thus the film_text table is empty. Instead the film_id, title and description fields are included in the film table. Moreover, the Sakila Sample Database has many versions, so an Entity Relationship Diagram (ERD) is provided to describe this specific version. You are advised to refer to the ERD to familiarise yourself with the structure of the database.
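
    As a rough illustration of loading the generated file and running queries directly, the sketch below uses Python's standard sqlite3 module to list every table with its row count. The helper names table_row_counts and demo are hypothetical, and the in-memory database is only a stand-in with two of the tables named above; it is not the real Sakila schema.

```python
import sqlite3


def table_row_counts(conn: sqlite3.Connection) -> dict[str, int]:
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    # Table names come from sqlite_master itself, so quoting them as
    # identifiers is sufficient for this sketch.
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


def demo() -> dict[str, int]:
    """Stand-in for the real file: a tiny in-memory database (assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT, description TEXT);
        CREATE TABLE film_text (film_id INTEGER PRIMARY KEY, title TEXT, description TEXT);
        INSERT INTO film (title, description) VALUES ('EXAMPLE FILM', 'Placeholder row');
        """
    )
    try:
        return table_row_counts(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    print(demo())
```

    To run this against the real database, pass sqlite3.connect("sqlite-sakila.db") to table_row_counts, assuming the file sits in the working directory. Given the note above, the film_text table would be expected to report zero rows.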

  3. classicmodels

    • kaggle.com
    zip
    Updated Apr 22, 2024
    + more versions
    Cite
    Ambreen (2024). classicmodels [Dataset]. https://www.kaggle.com/datasets/ambreenabdulraheem/classicmodels
    Available download formats: zip (879935 bytes)
    Dataset updated
    Apr 22, 2024
    Authors
    Ambreen
    Description

    MySQL Sample Database Schema. The MySQL sample database schema consists of the following tables:

    customers: stores customer’s data.

    products: stores a list of scale model cars.

    productlines: stores a list of product lines.

    orders: stores sales orders placed by customers.

    orderdetails: stores sales order line items for every sales order.

    payments: stores payments made by customers based on their accounts.

    employees: stores employee information and the organization structure such as who reports to whom.

    offices: stores sales office data.

  4. Employees

    • kaggle.com
    zip
    Updated Nov 12, 2021
    Cite
    Sudhir Singh (2021). Employees [Dataset]. https://www.kaggle.com/datasets/crepantherx/employees
    Available download formats: zip (31992550 bytes)
    Dataset updated
    Nov 12, 2021
    Authors
    Sudhir Singh
    Description

    Dataset

    This dataset was created by Sudhir Singh

    Released under Data files © Original Authors

    Contents

  5. Rediscovery Datasets: Connecting Duplicate Reports of Apache, Eclipse, and...

    • data.niaid.nih.gov
    Updated Aug 3, 2024
    Cite
    Sadat, Mefta; Bener, Ayse Basar; Miranskyy, Andriy V. (2024). Rediscovery Datasets: Connecting Duplicate Reports of Apache, Eclipse, and KDE [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_400614
    Dataset updated
    Aug 3, 2024
    Dataset provided by
    Ryerson University
    Authors
    Sadat, Mefta; Bener, Ayse Basar; Miranskyy, Andriy V.
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We present three defect rediscovery datasets mined from Bugzilla. The datasets capture data for three groups of open source software projects: Apache, Eclipse, and KDE. They contain information about approximately 914 thousand defect reports over a period of 18 years (1999-2017) and capture the inter-relationships among duplicate defects.

    File Descriptions

    apache.csv - Apache Defect Rediscovery dataset

    eclipse.csv - Eclipse Defect Rediscovery dataset

    kde.csv - KDE Defect Rediscovery dataset

    apache.relations.csv - Inter-relations of rediscovered defects of Apache

    eclipse.relations.csv - Inter-relations of rediscovered defects of Eclipse

    kde.relations.csv - Inter-relations of rediscovered defects of KDE

    create_and_populate_neo4j_objects.cypher - Populates Neo4j graphDB by importing all the data from the CSV files. Note that you have to set dbms.import.csv.legacy_quote_escaping configuration setting to false to load the CSV files as per https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/#config_dbms.import.csv.legacy_quote_escaping

    create_and_populate_mysql_objects.sql - Populates MySQL RDBMS by importing all the data from the CSV files

    rediscovery_db_mysql.zip - For your convenience, we also provide full backup of the MySQL database

    neo4j_examples.txt - Sample Neo4j queries

    mysql_examples.txt - Sample MySQL queries

    rediscovery_eclipse_6325.png - Output of Neo4j example #1

    distinct_attrs.csv - Distinct values of bug_status, resolution, priority, severity for each project

  6. FooDrugs database: A database with molecular and text information about food...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 28, 2023
    Cite
    Garranzo, Marco; Piette Gómez, Óscar; Lacruz Pleguezuelos, Blanca; Pérez, David; Laguna Lobo, Teresa; Carrillo de Santa Pau, Enrique (2023). FooDrugs database: A database with molecular and text information about food - drug interactions [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6638469
    Dataset updated
    Jul 28, 2023
    Dataset provided by
    IMDEA Food Institute
    Authors
    Garranzo, Marco; Piette Gómez, Óscar; Lacruz Pleguezuelos, Blanca; Pérez, David; Laguna Lobo, Teresa; Carrillo de Santa Pau, Enrique
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    FooDrugs database is a development done by the Computational Biology Group at IMDEA Food Institute (Madrid, Spain), in the context of the Food Nutrition Security Cloud (FNS-Cloud) project. Food Nutrition Security Cloud (FNS-Cloud) has received funding from the European Union's Horizon 2020 Research and Innovation programme (H2020-EU.3.2.2.3. – A sustainable and competitive agri-food industry) under Grant Agreement No. 863059 – www.fns-cloud.eu (See more details about FNS-Cloud below)

    FooDrugs stores information extracted from transcriptomics and text documents for food-drug interactions and is part of a demonstrator to be developed in the FNS-Cloud project. The database was built using MySQL, an open source relational database management system. FooDrugs hosts information for a total of 161 transcriptomics GEO series with 585 conditions for food or bioactive compounds. Each condition is defined as a food/biocomponent per time point, per concentration, per cell line, primary culture or biopsy per study. FooDrugs includes information about a bipartite network with 510 nodes and their similarity scores (tau score; https://clue.io/connectopedia/connectivity_scores) related to possible interactions with drugs assayed in the Connectivity Map (https://www.broadinstitute.org/connectivity-map-cmap). The information is stored in eight tables:

    Table “study” : This table contains basic information about study identifiers from GEO, pubmed or platform, study type, title and abstract

    Table “sample”: This table contains basic information about the different experiments in a study, like the identifier of the sample, treatment, origin type, time point or concentration.

    Table “misc_study”: This table contains additional information about different attributes of the study.

    Table “misc_sample”: This table contains additional information about different attributes of the sample.

    Table “cmap”: This table contains information about 70895 nodes, comprising drugs, foods or bioactives, overexpressed and knockdown genes (see section 3.4). The information includes cell line, compound and perturbation type.

    Table “cmap_foodrugs”: This table contains information about the tau score (see section 3.4) that relates food with drugs or genes and the node identifier in the FooDrugs network.

    Table “topTable”: This table contains information about 150 over and underexpressed genes from each GEO study condition, used to calculate the tau score (see section 3.4). The information stored is the logarithmic fold change, average expression, t-statistic, p-value, adjusted p-value and if the gene is up or downregulated.

    Table “nodes”: This table stores the information about the identification of the sample and the node in the bipartite network connecting the tables “sample”, “cmap_foodrugs” and “topTable”.

    In addition, FooDrugs database stores a total of 6422 food/drug interactions from 2849 text documents, obtained from three different sources: 2312 documents from PubMed, 285 from DrugBank, and 252 from drugs.com. These documents describe potential interactions between 1464 food/bioactive compounds and 3009 drugs. The information is stored in two tables:

    Table “texts”: This table contains all the documents, with their identifiers, in which interactions have been identified using the strategy described in section 4.

    Table “TM_interactions”: This table contains information about interaction identifiers, the food and drug entities, and the start and the end positions of the context for the interaction in the document.

    FNS-Cloud will overcome fragmentation problems by integrating existing FNS data, which is essential for high-end, pan-European FNS research addressing FNS, diet, health, and consumer behaviours as well as sustainable agriculture and the bio-economy. Current fragmented FNS resources not only result in knowledge gaps that inhibit public health and agricultural policy, and the food industry, from developing effective solutions that make production sustainable and consumption healthier, but also do not enable exploitation of FNS knowledge for the benefit of European citizens. Through three Demonstrators (Agri-Food, Nutrition & Lifestyle, and NCDs & the Microbiome), FNS-Cloud will facilitate: (1) analyses of regional and country-specific differences in diet including nutrition, (epi)genetics, microbiota, consumer behaviours, culture and lifestyle and their effects on health (obesity, NCDs, ethnic and traditional foods), which are essential for public health and agri-food and health policies; (2) improved understanding of agricultural differences within Europe and what these mean in terms of creating sustainable, resilient food systems for healthy diets; and (3) clear definitions of boundaries and how these affect the compositions of foods and consumer choices and, ultimately, personal and public health in the future. Long-term sustainability of the FNS-Cloud will be based on Services that have the capacity to link with new resources and enable cross-talk amongst them; access to FNS-Cloud data will be open access, underpinned by FAIR principles (findable, accessible, interoperable and re-useable). FNS-Cloud will work closely with the proposed Food, Nutrition and Health Research Infrastructure (FNHRI) as well as METROFOOD-RI and other existing ESFRI RIs (e.g. ELIXIR, ECRIN) in which several FNS-Cloud Beneficiaries are involved directly. (https://cordis.europa.eu/project/id/863059)

    Changes between versions FooDrugs_v2 and FooDrugs_V3 (31st January 2023):

    Increased the number of text documents by 85,675 from PubMed and ClinicalTrials.gov, and the number of text-mining interactions by 168,826.

    Increased the amount of transcriptomic studies by 32 GEO series.

    Removed all rows in table cmap_foodrugs representing interactions with values of tau=0

    Removed 43 GEO series that, after manual checking, did not correspond to food compounds.

    Added a new column to the table texts: citation to hold the citation of the text.

    Added these columns to the table study: contributor to contain the authors of the study, publication_date to store the date of publication of the study in GEO and pubmed_id to reference the publication associated with the study if any.

    Added a new column to topTable to hold the top 150 up-regulated and 150 down-regulated genes.

  7. CHINOOK Music

    • kaggle.com
    zip
    Updated Sep 19, 2024
    Cite
    willian oliveira (2024). CHINOOK Music [Dataset]. https://www.kaggle.com/datasets/willianoliveiragibin/chinook-music
    Available download formats: zip (9603 bytes)
    Dataset updated
    Sep 19, 2024
    Authors
    willian oliveira
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The Chinook Database is a sample database designed for use with multiple database platforms, such as SQL Server, Oracle, MySQL, and others. It can be easily set up by running a single SQL script, making it a convenient alternative to the popular Northwind database. Chinook is widely used in demos and testing environments, particularly for Object-Relational Mapping (ORM) tools that target both single and multiple database servers.

    Supported Database Servers

    Chinook supports several database servers, including:

    • DB2
    • MySQL
    • Oracle
    • PostgreSQL
    • SQL Server
    • SQL Server Compact
    • SQLite

    Download Instructions

    You can download the SQL scripts for each supported database server from the latest release assets. The appropriate SQL script file(s) for your database vendor are provided, which can be executed using your preferred database management tool.

    Data Model

    The Chinook Database represents a digital media store, containing tables that include:

    • Artists
    • Albums
    • Media tracks
    • Invoices
    • Customers

    Sample Data

    The media data in Chinook is derived from a real iTunes Library, providing a realistic dataset for users. Additionally, users can generate their own SQL scripts using their personal iTunes Library by following specific instructions. Customer and employee details in the database were manually crafted with fictitious names, addresses (mappable via Google Maps), and well-structured contact information such as phone numbers, faxes, and emails. Sales data is auto-generated and spans a four-year period, using random values.

    Why is it Called Chinook?

    The Chinook Database's name is a nod to its predecessor, the Northwind database. Chinooks are warm, dry winds found in the interior regions of North America, particularly over southern Alberta in Canada, where the Canadian Prairies meet mountain ranges. This natural phenomenon inspired the choice of name, reflecting the idea that Chinook serves as a refreshing alternative to the Northwind database.

  8. Orphan Drugs - Dataset 1: Twitter issue-networks as excluded publics

    • orda.shef.ac.uk
    txt
    Updated Oct 22, 2021
    Cite
    Matthew Hanchard (2021). Orphan Drugs - Dataset 1: Twitter issue-networks as excluded publics [Dataset]. http://doi.org/10.15131/shef.data.16447326.v1
    Available download formats: txt
    Dataset updated
    Oct 22, 2021
    Dataset provided by
    The University of Sheffield
    Authors
    Matthew Hanchard
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset comprises two .csv format files used within workstream 2 of the Wellcome Trust funded ‘Orphan drugs: High prices, access to medicines and the transformation of biopharmaceutical innovation’ project (219875/Z/19/Z). They appear in various outputs, e.g. publications and presentations.

    The deposited data were gathered using the University of Amsterdam Digital Methods Institute’s ‘Twitter Capture and Analysis Toolset’ (DMI-TCAT) before being processed and extracted from Gephi. DMI-TCAT queries Twitter’s STREAM Application Programming Interface (API) using SQL and retrieves data on a pre-set text query. It then sends the returned data for storage on a MySQL database. The tool allows for output of that data in various formats. This process aligns fully with Twitter’s service user terms and conditions. The query for the deposited dataset gathered a 1% random sample of all public tweets posted between 10-Feb-2021 and 10-Mar-2021 containing the text ‘Rare Diseases’ and/or ‘Rare Disease Day’, storing it on a local MySQL database managed by the University of Sheffield School of Sociological Studies (http://dmi-tcat.shef.ac.uk/analysis/index.php), accessible only via a valid VPN such as FortiClient and through a permitted active directory user profile.

    The dataset was output from the MySQL database raw as a .gexf format file, suitable for social network analysis (SNA). It was then opened using Gephi (0.9.2) data visualisation software and anonymised/pseudonymised in Gephi as per the ethical approval granted by the University of Sheffield School of Sociological Studies Research Ethics Committee on 02-Jun-201 (reference: 039187).

    The deposited dataset comprises two anonymised/pseudonymised social network analysis .csv files extracted from Gephi, one containing node data (Issue-networks as excluded publics – Nodes.csv) and another containing edge data (Issue-networks as excluded publics – Edges.csv). Where participants explicitly provided consent, their original username has been provided. Where they have provided consent on the basis that they not be identifiable, their username has been replaced with an appropriate pseudonym. All other usernames have been anonymised with a randomly generated 16-digit key. The level of anonymity for each Twitter user is provided in column C of deposited file ‘Issue-networks as excluded publics – Nodes.csv’.

    This dataset was created and deposited onto the University of Sheffield Online Research Data repository (ORDA) on 26-Aug-2021 by Dr. Matthew S. Hanchard, Research Associate at the University of Sheffield iHuman institute/School of Sociological Studies. ORDA has full permission to store this dataset and to make it open access for public re-use without restriction under a CC BY license, in line with the Wellcome Trust commitment to making all research data Open Access.

    The University of Sheffield are the designated data controller for this dataset.

  9. Data Base Management Systems market size was USD 50.5 billion in 2022 !

    • cognitivemarketresearch.com
    pdf,excel,csv,ppt
    Updated Oct 29, 2025
    Cite
    Cognitive Market Research (2025). Data Base Management Systems market size was USD 50.5 billion in 2022 ! [Dataset]. https://www.cognitivemarketresearch.com/data-base-management-systems-market-report
    Available download formats: pdf, excel, csv, ppt
    Dataset updated
    Oct 29, 2025
    Dataset authored and provided by
    Cognitive Market Research
    License

    https://www.cognitivemarketresearch.com/privacy-policy

    Time period covered
    2021 - 2033
    Area covered
    Global
    Description

    The global Data Base Management Systems market was valued at USD 50.5 billion in 2022 and is projected to reach USD 120.6 billion by 2030, registering a CAGR of 11.5% for the forecast period 2023-2030.

    Factors Affecting Data Base Management Systems Market Growth

    Growing inclination of organizations towards adoption of advanced technologies like cloud-based technology favours the growth of the global DBMS market

    Cloud-based data base management system solutions give organizations the ability to scale their database infrastructure up or down as required. In a business environment, data volume can vary over time, and the cloud allows organizations to allocate resources dynamically and systematically, ensuring optimal performance without underutilization. These cloud-based solutions are also cost-efficient: they eliminate the need for companies to invest in and maintain physical infrastructure and hardware, which reduces both upfront capital expenditure and ongoing operational costs. Organizations can choose pay-as-you-go pricing models, paying only for the resources they consume, which makes cloud-based DBMS a cost-efficient option for both smaller businesses and large enterprises. Moreover, cloud-based data base management system platforms usually come with management tools that streamline administrative tasks such as backup, provisioning, recovery, and monitoring. This allows IT teams to concentrate on strategic tasks rather than routine maintenance, enhancing operational efficiency. Cloud-based data base management systems also allow remote access and collaboration among teams, irrespective of their physical locations. This suits today's work environment, which relies on distributed and remote workforces: authorized personnel can access and update data in real time, enabling collaboration and better decision-making. Owing to all of these factors, the rising adoption of advanced technologies such as cloud-based DBMS is favouring market growth.

    Availability of open-source solutions is likely to restrain the global data base management systems market growth

    Open-source data base management system solutions such as PostgreSQL, MongoDB, and MySQL offer strong functionality at minimal or no licensing cost. This makes them an attractive option for companies, especially start-ups or smaller businesses with limited budgets. As these open-source solutions offer capabilities similar to various commercial DBMS offerings, many organizations may opt for them in order to save costs. Open-source solutions may also benefit from active developer communities that contribute to their development, enhancement, and maintenance. This collaborative environment supports continuous innovation and improvement, resulting in solutions that are slightly competitive with commercial offerings in terms of performance and features. Thus, open-source solutions create competition for the commercial DBMS market, while commercial offerings thrive by offering unique value propositions and addressing the needs of organizations that prioritize professional support, seamless integration into complex IT ecosystems, and advanced features.

    Introduction of Data Base Management Systems

    A Database Management System (DBMS) is software specifically designed to organize and manage data in a structured manner. It allows users to create, modify, and query a database, and to manage the security and access controls for that database. The DBMS offers tools for creating and modifying data models, which define the structure and relationships of data in a database. It is also responsible for storing and retrieving data from the database, and provides several methods for searching and querying the data. The data base management system also offers mechanisms to control concurrent access to the database, ensuring that multiple users can access the data. The DBMS provides tools to enforce security constraints and data integrity, such as constraints on data values and access controls that restrict who can access the data. The data base management system also provides mechanisms for backing up and recovering the data when a system failure occurs....
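
    The market figures quoted in this entry (USD 50.5 billion in 2022, USD 120.6 billion by 2030, CAGR of 11.5%) can be cross-checked with the standard compound annual growth rate formula. The sketch below is only an illustrative check: the function names are hypothetical, and it assumes eight compounding years from the 2022 base value to 2030, which the description does not state explicitly.

```python
def project_value(start_value: float, annual_rate: float, years: int) -> float:
    """Compound start_value at annual_rate for the given number of years."""
    return start_value * (1.0 + annual_rate) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the constant annual growth rate linking start_value to end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the description; the 8-year span (2022 to 2030) is an assumption.
START_USD_BN = 50.5
END_USD_BN = 120.6
QUOTED_CAGR = 0.115
YEARS = 2030 - 2022

if __name__ == "__main__":
    projected = project_value(START_USD_BN, QUOTED_CAGR, YEARS)
    rate = implied_cagr(START_USD_BN, END_USD_BN, YEARS)
    print(f"Projected 2030 value: {projected:.1f} USD bn")
    print(f"Implied CAGR: {rate:.2%}")
```

    Under that eight-year assumption the quoted start value, end value, and growth rate agree with one another to within rounding.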

  10. Malaria disease and grading system dataset from public hospitals reflecting...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Nov 10, 2023
    Cite
    Temitope Olufunmi Atoyebi; Rashidah Funke Olanrewaju; N. V. Blamah; Emmanuel Chinanu Uwazie (2023). Malaria disease and grading system dataset from public hospitals reflecting complicated and uncomplicated conditions [Dataset]. http://doi.org/10.5061/dryad.4xgxd25gn
    Available download formats: zip
    Dataset updated
    Nov 10, 2023
    Dataset provided by
    Nasarawa State University
    Authors
    Temitope Olufunmi Atoyebi; Rashidah Funke Olanrewaju; N. V. Blamah; Emmanuel Chinanu Uwazie
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Malaria is the leading cause of death in the African region. Data mining can help extract valuable knowledge from the data available in the healthcare sector, which makes it possible to train models that predict patient health faster than clinical trials can. Various machine learning algorithms, such as K-Nearest Neighbors, Bayes' theorem, Logistic Regression, Support Vector Machines and Multinomial Naïve Bayes (MNB), have been applied to malaria datasets from public hospitals, but modeling with the multinomial Naïve Bayes algorithm still has limitations. This study applies the MNB model to explore the relationship between 15 relevant attributes of public hospital data. The goal is to examine how the dependency between attributes affects the performance of the classifier. MNB creates a transparent and reliable graphical representation of the relationships between attributes and can predict new situations. The MNB model has 97% accuracy. The study concludes that this model outperforms the GNB classifier and the RF, both of which are reported with 100% accuracy.

    Methods: Prior to data collection, the researcher was guided by the ethical training certification on data collection and on the right to confidentiality and privacy required by the Institutional Review Board (IRB). Data were collected from the manual archives of hospitals purposively selected using a stratified sampling technique, then transformed to electronic form and stored in a MySQL database called malaria. Each patient file was extracted and reviewed for signs and symptoms of malaria, then checked for the laboratory confirmation result from diagnosis. The data were divided into two tables: the first, data1, contains data for use in phase 1 of the classification, while the second, data2, contains data for use in phase 2.

    Data Source Collection: The malaria incidence dataset was obtained from public hospitals and covers 2017 to 2021. These are the data used for modeling and analysis. The geographical location and the socio-economic factors available for patients inhabiting those areas were also taken into account. Multinomial Naïve Bayes is the model used to analyze the collected data for malaria prediction and grading.

    Data Preprocessing: The data are preprocessed to remove noise and outliers.

    Transformation: The data are transformed from analog to electronic records.

    Data Partitioning: The collected data are divided into two portions: one is extracted as a training set, while the other is used for testing. The training portion taken from one database table is called training set 1, and the training portion taken from another database table is called training set 2. The dataset was split into a 70% training sample and a 30% test sample. Using the MNB classification algorithm implemented in Python, the models were trained on the training sample. The resulting models were then tested on the remaining 30% of the data, and the results were compared with other machine learning models using standard metrics.

    Classification and prediction: Based on the nature of the variables in the dataset, this study uses Multinomial Naïve Bayes classification in two phases, classification phase 1 and classification phase 2. The framework operates as follows:
    i. Data collection and preprocessing are carried out.
    ii. The preprocessed data are stored in training set 1 and training set 2. These datasets are used during classification.
    iii. The test dataset is stored in the database as the test dataset.
    iv. Part of the test dataset is classified using classifier 1 and the remaining part is classified with classifier 2, as follows:

    Classifier phase 1: Classifies records into positive or negative classes. A patient who has malaria is classified as positive (P), while a patient who does not have malaria is classified as negative (N).

    Classifier phase 2: Classifies only the data that classifier 1 has labeled positive, and further assigns them to the complicated or uncomplicated class label. The classifier will also capture data on environmental factors, genetics, gender and age, and cultural and socio-economic variables. The system will be designed such that the core parameters, as determining factors, must supply their values.
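
    The 70/30 partition and the two-phase classification described above can be sketched roughly as follows. This is a minimal illustration, not the study's code: the tiny multinomial Naïve Bayes class, the helper names (`split_70_30`, `train_two_phase`, `classify_two_phase`), the three count features and every toy record are assumptions made up for the example.

```python
import math
import random
from collections import Counter


class TinyMultinomialNB:
    """Minimal multinomial Naive Bayes over count features, add-one smoothing."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        n_features = len(rows[0])
        class_sizes = Counter(labels)
        self.log_prior = {c: math.log(class_sizes[c] / len(labels)) for c in self.classes}
        # Sum the feature counts per class.
        totals = {c: [0] * n_features for c in self.classes}
        for row, label in zip(rows, labels):
            for j, count in enumerate(row):
                totals[label][j] += count
        # Smoothed log-likelihood of each feature within each class.
        self.log_like = {}
        for c in self.classes:
            denom = sum(totals[c]) + n_features
            self.log_like[c] = [math.log((t + 1) / denom) for t in totals[c]]
        return self

    def predict_one(self, row):
        def score(c):
            return self.log_prior[c] + sum(x * w for x, w in zip(row, self.log_like[c]))
        return max(self.classes, key=score)


def split_70_30(records, seed=0):
    """Shuffle a copy of the records and cut it into 70% train / 30% test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10
    return shuffled[:cut], shuffled[cut:]


def train_two_phase(records):
    """Phase 1 learns P vs N on all rows; phase 2 learns severity on positives only."""
    rows = [features for features, _ in records]
    phase1 = TinyMultinomialNB().fit(rows, ["N" if lab == "N" else "P" for _, lab in records])
    positives = [(f, lab) for f, lab in records if lab != "N"]
    phase2 = TinyMultinomialNB().fit([f for f, _ in positives], [lab for _, lab in positives])
    return phase1, phase2


def classify_two_phase(phase1, phase2, row):
    """Only rows that phase 1 marks positive are passed on to phase 2."""
    if phase1.predict_one(row) == "N":
        return "N"
    return phase2.predict_one(row)


# Made-up toy records: ([cough_days, fever_days, convulsions], label).
RECORDS = [
    ([3, 0, 0], "N"), ([4, 1, 0], "N"), ([2, 0, 0], "N"), ([3, 1, 0], "N"),
    ([0, 3, 0], "uncomplicated"), ([1, 4, 0], "uncomplicated"), ([0, 2, 0], "uncomplicated"),
    ([0, 2, 3], "complicated"), ([0, 1, 4], "complicated"), ([1, 2, 3], "complicated"),
]

train, test = split_70_30(RECORDS)
phase1, phase2 = train_two_phase(train)
predictions = [classify_two_phase(phase1, phase2, features) for features, _ in test]
```

    Chaining the classifiers this way means phase 2 never sees negative cases, which mirrors the description of classifier 2 operating only on records that classifier 1 labels positive.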

  11. Germline and malignant melanoma sampled sequences containing point mutations...

    • zenodo.org
    application/gzip, bin
    Updated Jan 24, 2020
    Cite
    Gavin Huttley; Yicheng Zhu; Teresa Neeman; Von Bing Yap (2020). Germline and malignant melanoma sampled sequences containing point mutations [Dataset]. http://doi.org/10.5281/zenodo.53158
    Available download formats: bin, application/gzip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Gavin Huttley; Yicheng Zhu; Teresa Neeman; Von Bing Yap
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Data files generated as part of a study into the influence of neighbouring bases on point mutation. The data are sampled from the Ensembl (http://www.ensembl.org) MySQL databases or COSMIC (http://cancer.sanger.ac.uk/cosmic) and processed using custom scripts that will be uploaded separately and associated with this submission via the related identifier.

  12. Royal Institute for Cultural Heritage Radiocarbon and stable isotope...

    • pandora.earth
    Updated Jul 12, 2011
    + more versions
    Cite
    (2011). Royal Institute for Cultural Heritage Radiocarbon and stable isotope measurements - Dataset - Pandora [Dataset]. https://pandora.earth/gl_ES/dataset/royal-institute-for-cultural-heritage-radiocarbon-and-stable-isotope-measurements
    Dataset updated
    Jul 12, 2011
    Description

    The radiocarbon dating laboratory of IRPA/KIK was founded in the 1960s. Initially, dates were reported at more or less regular intervals in the journal Radiocarbon (Schreurs 1968). Since the advent of radiocarbon dating in the 1950s, it had been common practice among radiocarbon laboratories to publish their dates in so-called ‘date-lists’ arranged per laboratory. This was first done in the Radiocarbon Supplement of the American Journal of Science and later in the specialised journal Radiocarbon. Over time the latter, with the added subtitle An International Journal of Cosmogenic Isotope Research, became a regular scientific journal, shifting its focus from date-lists to articles. Furthermore, the worldwide exponential increase in radiocarbon dates made it almost impossible to publish them all in the same journal, all the more so because of the broad range of applications that use radiocarbon analysis, from archaeology and art history to geology and oceanography and, recently, biomedical studies.

    The IRPA/KIK database

    From 1995 onwards, IRPA/KIK’s radiocarbon laboratory started to publish its dates in small publications, continuing the numbering of the preceding lists in Radiocarbon. The first booklet in this series was “Royal Institute for Cultural Heritage Radiocarbon dates XV” (Van Strydonck et al. 1995), followed by three more volumes (XVI, XVII, XVIII). The next list (XIX, 2005) was no longer printed but instead handed out as a PDF file on CD-ROM. The ever-increasing number of dates and the difficulties in handling all the data, however, made us look for a more permanent and easier solution. To improve data management and consultation, it was decided to gather all our dates in a web-based database. List XIX was in fact already a Microsoft Access database that was converted into a reader-friendly style and could also be printed as a PDF file. However, a Microsoft Access database is not the most practical way to make information publicly available. Hence the structure of the database was recreated in MySQL and the existing content was transferred into the corresponding fields. To display the records, a web-based front-end was programmed in PHP/Apache. It features a full-text search function that allows for partial word-matching. In addition, the records can be consulted in PDF format. Old records from the printed date-lists as well as new records are now added using the same Microsoft Access back-end, which is now connected directly to the MySQL database.

    The main problem with introducing the old data was that not all the current criteria were available in the past (e.g. stable isotope measurements). Furthermore, since all the sample information is given by the submitter, its quality largely depends on that person's willingness to contribute as well as on the accuracy and correctness of the information provided. Sometimes problems arise from the fact that a certain investigation (like an excavation) is carried out over a relatively long period (sometimes even more than ten years) and is directed by different people or even institutions. This can lead to differences in the labeling procedure of the samples, but also in the interpretation of structures and artifacts and in the orthography of the site’s name. Finally, the submitter might change address, while the names of institutions or even regions and countries might change as well (e.g. Zaire - Congo).
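
    The partial word-matching search mentioned above can be illustrated with a substring query. The sketch below is only an approximation under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in for the MySQL/PHP stack, and the table `dates`, its columns, the helper `search_records` and all sample rows are hypothetical.

```python
import sqlite3


def search_records(conn, term):
    """Return (lab_code, site) rows whose site or material contains the term."""
    pattern = f"%{term}%"  # '%' on both sides gives substring (partial word) matching
    cur = conn.execute(
        "SELECT lab_code, site FROM dates "
        "WHERE site LIKE ? OR material LIKE ? ORDER BY lab_code",
        (pattern, pattern),
    )
    return cur.fetchall()


# Hypothetical schema and made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dates (lab_code TEXT PRIMARY KEY, site TEXT, material TEXT)")
conn.executemany(
    "INSERT INTO dates VALUES (?, ?, ?)",
    [
        ("X-0001", "Example Abbey", "charcoal"),
        ("X-0002", "Sample Cave", "bone collagen"),
        ("X-0003", "Example Harbour", "wood"),
    ],
)

matches = search_records(conn, "xamp")  # matches both "Example ..." sites
```

    The search term is passed as a bound parameter rather than pasted into the SQL string. Any `%` or `_` typed by a user would still act as a wildcard in this sketch.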

  13. Heparome

    • neuinfo.org
    • dknet.org
    • +2 more
    Updated Oct 11, 2025
    Cite
    (2025). Heparome [Dataset]. http://identifiers.org/RRID:SCR_008615
    Dataset updated
    Oct 11, 2025
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 17, 2013. A database containing information on the heparin-binding proteins of E. coli K-12 MG1655 cells. Heparin affinity columns were applied to enrich and fractionate proteins. Proteins were identified in collaboration with David Russell's lab. Because heparin is a negatively charged sulfated glycosaminoglycan, polyanion-binding proteins, which include nucleic acid-binding proteins, are expected to bind to heparin columns. Studying the expression pattern of heparin-binding proteins will help in studying the nucleic acid-binding proteins, most of which are related to regulation. Moreover, heparin affinity columns also enrich low-abundance proteins. The Heparome database is constructed using MySQL. The website interface is built using HTML and PHP. Queries between the MySQL database and the website interface are executed using PHP. Besides information on the identified proteins, such as swiss accession number, gene name, molecular weight, isoelectric point, codon adaptation index (CAI) and functional classification, it also includes information on the experiments, such as sample preparation, heparin-HPLC chromatography, SDS-PAGE gel separation and MALDI-MS.

  14. Data from: ChEssBase

    • erddap.eurobis.org
    • obis.org
    • +2 more
    Updated Aug 12, 2025
    + more versions
    Cite
    Perry, Hall, Baker, Ramirez-Llodra (2025). ChEssBase [Dataset]. https://erddap.eurobis.org/erddap/info/chessbase/index.html
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    Perry, Hall, Baker, Ramirez-Llodra
    Variables measured
    aphia_id, latitude, longitude, MaximumDepth, MinimumDepth, BasisOfRecord, ScientificName, InstitutionCode
    Description

    ChEssBase is a dynamic relational database for all deep-water species from chemosynthetic ecosystems (hydrothermal vents, cold seeps and other reducing environments such as whale carcasses, sunken wood or OMZs) being constructed from the ChEss project (Biogeography of Deep-Water Chemosynthetic Ecosystems) within the Census of Marine Life initiative.

    AccConID=21
    AccConstrDescription=This license lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of licenses offered. Recommended for maximum dissemination and use of licensed materials.
    AccConstrDisplay=This dataset is licensed under a Creative Commons Attribution 4.0 International License.
    AccConstrEN=Attribution (CC BY)
    AccessConstraint=Attribution (CC BY)
    AccessConstraints=None
    Acronym=None
    added_date=2013-06-12 15:21:34.517000
    BrackishFlag=0
    CDate=2004-06-24
    cdm_data_type=Other
    CheckedFlag=0
    Citation=Ramirez-Llodra, E., Blanco, 2005. ChEssBase: an online information system on biodiversity and biogeography of deep-sea fauna from chemosynthetic ecosystems. Version 2. World Wide Web electronic publications, http://www.noc.soton.ac.uk/chess/database/db_home.php
    Comments=None
    ContactEmail=None
    Conventions=COARDS, CF-1.6, ACDD-1.3
    CurrencyDate=None
    DasID=212
    DasOrigin=Literature research
    DasType=Data
    DasTypeID=1
    DateLastModified={'date': '2025-08-12 01:34:46.196267', 'timezone_type': 1, 'timezone': '+02:00'}
    DescrCompFlag=0
    DescrTransFlag=0
    Easternmost_Easting=179.8
    EmbargoDate=None
    EngAbstract=ChEssBase is a dynamic relational database for all deep-water species from chemosynthetic ecosystems (hydrothermal vents, cold seeps and other reducing environments such as whale carcasses, sunken wood or OMZs) being constructed from the ChEss project (Biogeography of Deep-Water Chemosynthetic Ecosystems) within the Census of Marine Life initiative.
    EngDescr=The aim of ChEssBase is to provide taxonomical, biological, ecological and distributional data for all species described from deep-water chemosynthetic ecosystems, as well as information on available samples, images, bibliography and information on the habitats. These habitats include hydrothermal vents, cold seeps, whale falls, sunken wood and areas of minimum oxygen that intersect with the continental margin or seamounts. Since the discovery of hydrothermal vents in 1977 and of cold seep communities in 1984, over 590 species from vents and over 230 species from seeps have been described. Chemosynthetically fueled communities have now also been found on large organic falls to the deep-sea floor such as whale falls and sunken wood, as well as on benthic zones of oxygen minimum. The data gathered in the last 30 years have shown that some species are shared amongst these ecosystems, and our knowledge of their phylogeography improves with every new discovery. New species are continuously being discovered and described from research programmes around the globe, and therefore ChEssBase is in active development and new data are being entered regularly. At present, ChEssBase includes data on 1740 species from 193 chemosynthetic sites around the globe. These data contain information (when available) on the taxonomy, morphology, trophic level, reproduction, endemicity, habitat type and distribution. There are now 1880 papers in our reference database. The first version of ChEssBase was available online in December 2004. In summer 2005, ChEssBase and the InterRidge biological database (www.interridge.org) were fused into a single source of information for biological data from chemosynthetic ecosystems. This second version of ChEssBase has been available online since August 2005, with new records as well as new search and download options. Since December 2005, ChEssBase has been integrated in the Ocean Biogeographic Information System (OBIS, www.iobis.org). ChEssBase is supported by a species-based relational database in MySQL. The database includes 3 major components: Taxonomy (from kingdom to subspecies); Distribution (from site to major geographic area); Samples (including sample, cruise and institution information). ChEssBase is regularly updated with new information available in the literature. In order to quickly obtain accurate new data and help keep the database up to date, we would be very grateful if you could send us any new publications with data relevant to ChEssBase, which we would add to the database, together with the relevant references.
    FreshFlag=0
    geospatial_lat_max=72.0
    geospatial_lat_min=-55.1
    geospatial_lat_units=degrees_north
    geospatial_lon_max=179.8
    geospatial_lon_min=-158.1
    geospatial_lon_units=degrees_east
    infoUrl=None
    InputNotes=None
    institution=COML, SOTON-NOC, SOTON-SOES
    License=https://creativecommons.org/licenses/by/4.0/
    Lineage=Prior to publication, data undergo quality control checks, which are described in https://github.com/EMODnet/EMODnetBiocheck?tab=readme-ov-file#understanding-the-output
    MarineFlag=1
    modified_sync=2021-02-05 00:00:00
    Northernmost_Northing=72.0
    OrigAbstract=None
    OrigDescr=None
    OrigDescrLang=None
    OrigDescrLangNL=None
    OrigLangCode=None
    OrigLangCodeExtended=None
    OrigLangID=None
    OrigTitle=None
    OrigTitleLang=None
    OrigTitleLangCode=None
    OrigTitleLangID=None
    OrigTitleLangNL=None
    Progress=In Progress
    PublicFlag=1
    ReleaseDate=Jun 12 2013 12:00AM
    ReleaseDate0=2013-06-12
    RevisionDate=None
    SizeReference=1740 species from 193 sites
    sourceUrl=(local files)
    Southernmost_Northing=-55.1
    standard_name_vocabulary=CF Standard Name Table v70
    StandardTitle=ChEssBase
    StatusID=1
    subsetVariables=ScientificName,BasisOfRecord,aphia_id
    TerrestrialFlag=0
    UDate=2025-03-26
    VersionDate=Jun 3 2004 12:00AM
    VersionDay=23
    VersionMonth=10
    VersionName=2
    VersionYear=2007
    VlizCoreFlag=1
    Westernmost_Easting=-158.1

  15. Sample Superstore cleaned dataset

    • kaggle.com
    zip
    Updated Apr 4, 2025
    Cite
    Prixpam (2025). Sample Superstore cleaned dataset [Dataset]. https://www.kaggle.com/prixpam/eda-on-sample-superstore-data-set-using-mysql
    Available download formats: zip (563700 bytes)
    Dataset updated
    Apr 4, 2025
    Authors
    Prixpam
    Description

    A Cleaned Dataset and SQL Scripts for Business Insights from the Sample Superstore

    Dataset Description: This project features the Sample Superstore dataset, originally sourced from Kaggle, enhanced with MySQL-based data cleaning and analysis. The dataset includes 4,929 records of sales, profit, customer, and product data from a fictional retail superstore. It has been cleaned to remove duplicate orders and paired with a comprehensive set of SQL queries to uncover actionable business insights.
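
    As a rough illustration of what removing duplicate orders and then querying for insights might look like, here is a minimal sketch. It runs against an in-memory SQLite database through Python's `sqlite3` rather than MySQL, and the `orders` table, its columns and all rows are assumptions invented for the example, not the dataset's actual schema or scripts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id TEXT, product_id TEXT, category TEXT, sales REAL, profit REAL)"
)
# Made-up rows; the second row duplicates the first.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        ("O-1", "P-1", "Furniture", 100.0, 20.0),
        ("O-1", "P-1", "Furniture", 100.0, 20.0),
        ("O-2", "P-2", "Technology", 250.0, 50.0),
        ("O-3", "P-1", "Furniture", 80.0, -5.0),
    ],
)

# Keep the first physical row for each (order_id, product_id) pair, drop the rest.
conn.execute(
    "DELETE FROM orders WHERE rowid NOT IN ("
    "SELECT MIN(rowid) FROM orders GROUP BY order_id, product_id)"
)

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# One example of an aggregate query: total profit per category.
profit_by_category = conn.execute(
    "SELECT category, SUM(profit) FROM orders GROUP BY category ORDER BY category"
).fetchall()
```

    Which columns define a duplicate is a modelling decision. The pair used here is only one plausible choice.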

  16. Membership of the IMO Commission for Agricultural Meteorology (1913-1947)

    • data.mendeley.com
    Updated May 6, 2020
    Cite
    Giuditta Parolini (2020). Membership of the IMO Commission for Agricultural Meteorology (1913-1947) [Dataset]. http://doi.org/10.17632/pds6tz443t.1
    Dataset updated
    May 6, 2020
    Authors
    Giuditta Parolini
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset was built as part of a research project on the history of agricultural meteorology in the first half of the twentieth century. It provides information on the members of the technical commission on agriculture established by the International Meteorological Organization (IMO) in the years 1913-1947. The main sources used to build the dataset are the proceedings of the meetings held by the IMO, in particular the membership lists printed in these proceedings. Information provided by these primary sources has been checked throughout for consistency and correctness and, whenever possible, biographical resources on individual members have been consulted and are mentioned in the dataset. The sql files in the dataset make it possible to rebuild the MySQL database that was created to investigate the commission membership and its transformation over time. A copy of the data tables is also available in csv format for users who wish to access only the data.

    The dataset contains twelve tables. Eleven tables provide information (affiliation, nation, city, role in the commission) on the scientists listed as members of the commission for a specific time period. There is a table for each year (1913, 1919, 1921, 1923, 1926, 1929, 1932, 1935, 1937, 1946, 1947) in which a membership list is available. NationH and NationG stand for Nation(History) and Nation(Geography), and similarly for cityH and cityG. In this way, commission members can be placed within modern countries and cities as well as their historical counterparts, for instance to build a map of the members’ locations using current geodata. The role of each member within the commission has only three possible values: president, secretary/vice-president, member. The last table in the dataset, m_all, provides a comprehensive list of all of the more than one hundred members of the commission, with some biographical details (when available). The idmembers value is the unique identifier for each member within this dataset.

    This dataset has been used in an extensive investigation of the role that the IMO had in promoting international collaboration in agricultural meteorology during the first half of the twentieth century. The data gathered here, however, can be of interest beyond the history of agricultural meteorology. They also offer relevant materials to scholars more generally concerned with the work of the IMO, and the database structure provides a template for similar data collection work on other IMO technical commissions. These commissions were key places for sharing meteorological and climatological knowledge between the mid-nineteenth and mid-twentieth centuries, and they deserve more attention than they have so far received from scholars.
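
    The layout described above, with one table per membership year plus m_all keyed by idmembers, lends itself to queries that follow a member across years. The sketch below is only a guess at how that could look: it uses SQLite through Python's `sqlite3` instead of MySQL, the year-table names `m_1913` and `m_1919`, the `name` and `role` columns and all rows are hypothetical, and only `m_all`, `idmembers`, `nationH` and `nationG` come from the description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE m_all (idmembers INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE m_1913 (idmembers INTEGER, role TEXT, nationH TEXT, nationG TEXT);
    CREATE TABLE m_1919 (idmembers INTEGER, role TEXT, nationH TEXT, nationG TEXT);

    -- Made-up placeholder rows.
    INSERT INTO m_all VALUES (1, 'Member A'), (2, 'Member B'), (3, 'Member C');
    INSERT INTO m_1913 VALUES (1, 'president', 'Historical X', 'Modern X'),
                              (2, 'member', 'Historical Y', 'Modern Y');
    INSERT INTO m_1919 VALUES (2, 'president', 'Historical Y', 'Modern Y'),
                              (3, 'member', 'Historical Z', 'Modern Z');
    """
)

# Stack the yearly lists, then count how many lists each member appears in.
QUERY = """
SELECT a.idmembers, a.name, COUNT(*) AS n_lists, MIN(y.year), MAX(y.year)
FROM m_all AS a
JOIN (
    SELECT 1913 AS year, idmembers FROM m_1913
    UNION ALL
    SELECT 1919 AS year, idmembers FROM m_1919
) AS y ON y.idmembers = a.idmembers
GROUP BY a.idmembers, a.name
ORDER BY a.idmembers
"""
membership_spans = conn.execute(QUERY).fetchall()
```

    Extending the `UNION ALL` with the remaining year tables would give each member's first and last appearance across all eleven lists.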

    I gratefully acknowledge the financial support of the German Research Foundation (DFG) (Project No. 321660352) in the preparation of this dataset.

  17. Meta-Information des Samples der Media-Analyse Daten: IntermediaPlus...

    • search.gesis.org
    Cite
    Brentel, Inga; Kampes, Céline Fabienne; Jandura, Olaf, Meta-Information des Samples der Media-Analyse Daten: IntermediaPlus (2014-2016) [Dataset]. https://search.gesis.org/research_data/SDN-10.7802-2030
    Dataset provided by
    GESIS search
    GESIS, Köln
    Authors
    Brentel, Inga; Kampes, Céline Fabienne; Jandura, Olaf
    License

    https://www.gesis.org/en/institute/data-usage-terms

    Description

    The prepared longitudinal IntermediaPlus dataset for 2014 to 2016 is "big data", which is why the full dataset will only be available in the form of a database (MySQL). In this database, the information for the different variables of a respondent is organized in one column, one below the other. The present publication includes an SQL database with the metadata of a sample of the full dataset, which represents a selection of the available variables of the full dataset and is intended to show the structure of the prepared data, together with the data documentation (codebook) of the sample. For this purpose, the sample contains all variables on sociodemography, leisure activities, additional information on a respondent and his or her household, as well as the interview-specific variables and weights. Only the variables concerning the respondent's media use are a small selection: for online media use, the variables of all overall offerings as well as the individual offerings of the genres politics and digital were included. The media use of radio, print and TV was not included in the sample because its structure can be traced using the published longitudinal data of the Media-Analyse MA Radio, MA Pressemedien and MA Intermedia.
    Due to the size of the data, the database with the actual survey data would already be in the critical range of file size for normal upload and download. The actual survey results needed for analysis will be published in 2021 as the full dataset Media-Analyse-Daten: IntermediaPlus (2014-2016) in the DBK at GESIS.

    The data and their preparation are proposed as a best-practice case for big data management, that is, for handling big data in the social sciences and with social science data. The harmonization work is documented and made transparent using the GESIS software CharmStats, which was extended with big data features as part of this project. In addition, a Python script and an HTML template were used to automate the workflow around and with CharmStats more fully.

    The prepared longitudinal full dataset of MA IntermediaPlus for 2014 to 2016 will be published in 2021 in cooperation with GESIS and made available in accordance with the FAIR principles (Wilkinson et al. 2016). The aim is to make the Media-Analyse data source accessible for research on social and media change in the Federal Republic of Germany by harmonizing the individual cross-sections, work carried out by Inga Brentel and Céline Fabienne Kampes as part of the dissertation project „Angebots- und Publikumsfragmentierung online“ (supply and audience fragmentation online).

    Future study number of the full IntermediaPlus dataset in the DBK at GESIS: ZA5769 (Version 1-0-0), with the doi: https://dx.doi.org/10.4232/1.13530
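
    Storing a respondent's variables in one column, one below the other, is commonly called a long format. The sketch below shows one way such rows could be regrouped per respondent. It is an assumption-laden illustration: the tuple layout `(respondent_id, variable, value)`, the variable names and the values are invented and are not taken from the published database.

```python
from collections import defaultdict

# Hypothetical long-format rows: one row per respondent and variable.
LONG_ROWS = [
    (1, "age", 34),
    (1, "household_size", 2),
    (2, "age", 51),
    (2, "household_size", 4),
]


def long_to_wide(rows):
    """Regroup long-format rows into one dict of variables per respondent."""
    wide = defaultdict(dict)
    for respondent_id, variable, value in rows:
        wide[respondent_id][variable] = value
    return dict(wide)


WIDE = long_to_wide(LONG_ROWS)
```

    A long layout keeps the table narrow even when the number of variables is very large, at the cost of a regrouping step like this one before per-respondent analysis.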

  18. daily - sales - expenses database&tables

    • kaggle.com
    zip
    Updated Nov 11, 2025
    Cite
    OIE (2025). daily - sales - expenses database&tables [Dataset]. https://www.kaggle.com/datasets/emmyofh/daily-sales-expenses-database-and-tables
    Available download formats: zip (666 bytes)
    Dataset updated
    Nov 11, 2025
    Authors
    OIE
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This file contains the database and table creation scripts used in the Daily Sales & Expenses System project. It defines the structure of the MySQL database where all sales, expenses, inventory, and login records are stored before being accessed, analyzed, and visualized using Python.

    The file includes:
    - Commands to create the database
    - SQL statements to create the users, sales, expenses and inventory tables
    - (Optionally) sample INSERT statements to populate the tables with test data

    This file is essential for anyone who wants to replicate or test the Python-MySQL integration in the project.
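
    For readers who want a feel for such a setup without a MySQL server, the following sketch creates four similarly named tables in an in-memory SQLite database from Python. Only the table names users, sales, expenses and inventory come from the description above. Every column, the helper `daily_net` and the sample rows are assumptions, and the real scripts may differ substantially.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        username TEXT UNIQUE NOT NULL,
        password_hash TEXT NOT NULL
    );
    CREATE TABLE inventory (
        id INTEGER PRIMARY KEY,
        item TEXT NOT NULL,
        quantity INTEGER NOT NULL
    );
    CREATE TABLE sales (
        id INTEGER PRIMARY KEY,
        sale_date TEXT NOT NULL,
        item_id INTEGER REFERENCES inventory(id),
        amount REAL NOT NULL
    );
    CREATE TABLE expenses (
        id INTEGER PRIMARY KEY,
        expense_date TEXT NOT NULL,
        category TEXT,
        amount REAL NOT NULL
    );
    """
)

# Made-up test data.
conn.execute("INSERT INTO inventory (id, item, quantity) VALUES (1, 'widget', 10)")
conn.executemany(
    "INSERT INTO sales (sale_date, item_id, amount) VALUES (?, ?, ?)",
    [("2025-01-01", 1, 120.0), ("2025-01-01", 1, 30.0)],
)
conn.execute(
    "INSERT INTO expenses (expense_date, category, amount) VALUES ('2025-01-01', 'rent', 50.0)"
)


def daily_net(conn, day):
    """Total sales minus total expenses for one ISO date string."""
    sales = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE sale_date = ?", (day,)
    ).fetchone()[0]
    spent = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM expenses WHERE expense_date = ?", (day,)
    ).fetchone()[0]
    return sales - spent


net_first_day = daily_net(conn, "2025-01-01")
```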

  19. Dataset of raw experiments data of large triaxial-experiments

    • data.niaid.nih.gov
    Updated Jan 18, 2021
    Cite
    Biebricher, Sven F. (2021). Dataset of raw experiments data of large triaxial-experiments [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4382099
    Dataset updated
    Jan 18, 2021
    Dataset provided by
    RWTH Aachen - Lehrstuhl für Geotechnik
    Authors
    Biebricher, Sven F.
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set contains the measurement data of 20 large-scale triaxial tests (permeability included). Synthetic sand-lime bricks (KS XL) from the Silka company and the natural Nievelstein sandstone were sampled. The samples are about 25 cm in diameter and 45 cm in height. The experiments were carried out at the Chair of Geotechnical Engineering at RWTH Aachen University in Germany.

    The following data were recorded for the rock samples:

    geomechanical probe properties

    size and weight

    density (dry and wet)

    porosity

    permeability coefficient

    uniaxial strength

    During the tests the following data were recorded:

    axial strain

    confining and axial pressure

    fluid pressure

    fluid flow through specimen

    temperatures

    Data sets are stored in a MySQL database. Data sets can be interpreted with the developed Triaxial Test Evaluation Tool.

  20. Logistics analytics task

    • kaggle.com
    zip
    Updated Mar 28, 2023
    Cite
    Iron Monger t (2023). Logistics analytics task [Dataset]. https://www.kaggle.com/datasets/ironmongert/logistics-analytics-task
    Available download formats: zip (293839 bytes)
    Dataset updated
    Mar 28, 2023
    Authors
    Iron Monger t
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Track dissent records for a logistics company

    The entities involved in this process include EvidenceLog, DissentRecord, Vendor, DissentCategory, and StatusMaster.

    The EvidenceLog entity represents the evidence that is logged for each dissent record and has attributes such as ID, EvidenceCode, EvidenceRelatedTo, DissentCategoryID, StatusID, VendorID, and more. The DissentRecord entity represents the actual dissent record and has attributes such as ID, DissentCategoryID, VendorID, and EvidenceLogID. The Vendor entity contains vendor details, while the DissentCategory entity contains attributes related to the category of dissent, such as ID, CategoryCode, CategoryName, and more. The StatusMaster entity contains attributes related to the status of the dissent record, such as ID, Status, description, and StatusCode.

    The relationships between these entities are also defined in the ER diagram, such as the many-to-one relationships between DissentRecords and DissentCategory, EvidenceLog, VendorMaster, and StatusMaster. Additionally, EvidenceLog has a one-to-many relationship with EvidenceImages.

    To develop and visualize sample data from this application, you could create sample records for each entity and populate them with data that represents the typical use case of the software. For example, you could create a DissentCategory record with the CategoryName "Damaged Goods", a StatusMaster record with the Status "Resolved", a Vendor record with details of a specific vendor, and an EvidenceLog record with details of the evidence related to the dissent record. You could then link these records together using the appropriate relationships, such as linking the DissentRecord to the DissentCategory, Vendor, and EvidenceLog records.

    To visualize this data, you could create a graphical representation of the ER diagram using a tool such as Lucidchart or draw.io. This would allow you to see the relationships between the entities and how they are linked together. Additionally, you could use a database management tool such as MySQL Workbench to create a database schema based on the ER diagram and populate it with sample data. This would allow you to view the data in a tabular format and run queries to retrieve specific information as needed.
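
    The sample records suggested above can be tried out quickly in an in-memory database. The sketch below uses SQLite through Python's `sqlite3` in place of MySQL Workbench and covers only a subset of the attributes listed in the description. The vendor's `Name` column, the codes and every inserted value other than "Damaged Goods" and "Resolved" are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses below
conn.executescript(
    """
    CREATE TABLE DissentCategory (ID INTEGER PRIMARY KEY, CategoryCode TEXT, CategoryName TEXT);
    CREATE TABLE StatusMaster (ID INTEGER PRIMARY KEY, Status TEXT, StatusCode TEXT);
    CREATE TABLE Vendor (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE EvidenceLog (
        ID INTEGER PRIMARY KEY,
        EvidenceCode TEXT,
        DissentCategoryID INTEGER REFERENCES DissentCategory(ID),
        StatusID INTEGER REFERENCES StatusMaster(ID),
        VendorID INTEGER REFERENCES Vendor(ID)
    );
    CREATE TABLE DissentRecord (
        ID INTEGER PRIMARY KEY,
        DissentCategoryID INTEGER REFERENCES DissentCategory(ID),
        VendorID INTEGER REFERENCES Vendor(ID),
        EvidenceLogID INTEGER REFERENCES EvidenceLog(ID)
    );

    -- One sample record per entity, linked through the foreign keys.
    INSERT INTO DissentCategory VALUES (1, 'DMG', 'Damaged Goods');
    INSERT INTO StatusMaster VALUES (1, 'Resolved', 'RES');
    INSERT INTO Vendor VALUES (1, 'Example Vendor');
    INSERT INTO EvidenceLog VALUES (1, 'EV-001', 1, 1, 1);
    INSERT INTO DissentRecord VALUES (1, 1, 1, 1);
    """
)

# Follow the many-to-one links from a dissent record to its related entities.
dissent_overview = conn.execute(
    """
    SELECT d.ID, c.CategoryName, v.Name, s.Status
    FROM DissentRecord AS d
    JOIN DissentCategory AS c ON c.ID = d.DissentCategoryID
    JOIN Vendor AS v ON v.ID = d.VendorID
    JOIN EvidenceLog AS e ON e.ID = d.EvidenceLogID
    JOIN StatusMaster AS s ON s.ID = e.StatusID
    """
).fetchall()
```

    In this sketch the status is reached through EvidenceLog, because the description lists StatusID among the EvidenceLog attributes. A direct link from DissentRecord to StatusMaster would be an equally reasonable reading of the ER diagram.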
