MySQL Classicmodels sample database
The MySQL sample database schema consists of the following tables:
Schema diagram: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F8652778%2Fefc56365be54c0e2591a1aefa5041f36%2FMySQL-Sample-Database-Schema.png?generation=1670498341027618&alt=media
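Once the classicmodels database is loaded, a quick starter query can confirm the schema is in place. This is a minimal sketch assuming the standard classicmodels customers and orders tables; it lists the five customers with the most orders:

-- Rank classicmodels customers by number of orders.
SELECT c.customerName,
       COUNT(o.orderNumber) AS orderCount
FROM customers c
LEFT JOIN orders o ON o.customerNumber = c.customerNumber
GROUP BY c.customerName
ORDER BY orderCount DESC
LIMIT 5;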
The Sakila sample database is a fictitious database designed to represent a DVD rental store. The tables of the database include film, film_category, actor, customer, rental, payment and inventory, among others. The Sakila sample database is intended to provide a standard schema that can be used for examples in books, tutorials, articles, samples, and so forth. Detailed information about the database can be found on the MySQL website: https://dev.mysql.com/doc/sakila/en/
Sakila for SQLite is part of the sakila-sample-database-ports project, which is intended to provide ported versions of the original MySQL database for other database systems.
Sakila for SQLite is a port of the Sakila example database available for MySQL, originally developed by Mike Hillyer of the MySQL AB documentation team. This project is designed to help database administrators decide which database to use for the development of new products: the user can run the same SQL against different kinds of databases and compare the performance.
License: BSD. Copyright DB Software Laboratory, http://www.etl-tools.com
Note: Part of the insert scripts were generated by Advanced ETL Processor http://www.etl-tools.com/etl-tools/advanced-etl-processor-enterprise/overview.html
Information about the project and the downloadable files can be found at: https://code.google.com/archive/p/sakila-sample-database-ports/
Other versions and developments of the project can be found at: https://github.com/ivanceras/sakila/tree/master/sqlite-sakila-db
https://github.com/jOOQ/jOOQ/tree/main/jOOQ-examples/Sakila
Direct access to the MySQL Sakila database, which does not require installation of MySQL (queries can be typed directly in the browser), is provided on the phpMyAdmin demo version website: https://demo.phpmyadmin.net/master-config/
The files in the sqlite-sakila-db folder are the script files which can be used to generate the SQLite version of the database. For convenience, the script files have already been run at the command line to generate the sqlite-sakila.db file, as follows:
sqlite> .open sqlite-sakila.db # creates the .db file
sqlite> .read sqlite-sakila-schema.sql # creates the database schema
sqlite> .read sqlite-sakila-insert-data.sql # inserts the data
Therefore, the sqlite-sakila.db file can be loaded directly into SQLite3 and queries can be executed against it. You can refer to my notebook for an overview of the database and a demonstration of SQL queries. Note: data for the film_text table is not provided in the script files, so the film_text table is empty; instead, the film_id, title and description fields are included in the film table. Moreover, the Sakila sample database has many versions, so an Entity Relationship Diagram (ERD) is provided to describe this specific version. You are advised to refer to the ERD to familiarise yourself with the structure of the database.
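For example, once the database is loaded, a typical starter query joins the film, inventory and rental tables to rank the ten most-rented titles; this runs unchanged on the SQLite port and on the original MySQL version:

-- Ten most-rented films in the Sakila database.
SELECT f.title,
       COUNT(r.rental_id) AS rentals
FROM rental r
JOIN inventory i ON i.inventory_id = r.inventory_id
JOIN film f ON f.film_id = i.film_id
GROUP BY f.title
ORDER BY rentals DESC
LIMIT 10;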
This is the sample database from sqlservertutorial.net. It is a great dataset for learning SQL and practicing querying relational databases.
Database Diagram:
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F4146319%2Fc5838eb006bab3938ad94de02f58c6c1%2FSQL-Server-Sample-Database.png?generation=1692609884383007&alt=media
The sample database is copyrighted and cannot be used for commercial purposes. For example, it cannot be used for purposes including, but not limited to: selling it, or including it in paid courses.
This dataset was created by Sudhir Singh.
Released under: Data files © Original Authors
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The FooDrugs database was developed by the Computational Biology Group at IMDEA Food Institute (Madrid, Spain) in the context of the Food Nutrition Security Cloud (FNS-Cloud) project. FNS-Cloud has received funding from the European Union's Horizon 2020 Research and Innovation programme (H2020-EU.3.2.2.3. – A sustainable and competitive agri-food industry) under Grant Agreement No. 863059 – www.fns-cloud.eu (see more details about FNS-Cloud below).
FooDrugs stores information extracted from transcriptomics and text documents for food-drug interactions and is part of a demonstrator to be delivered in the FNS-Cloud project. The database was built using MySQL, an open-source relational database management system. FooDrugs holds information for a total of 161 transcriptomics GEO series with 585 conditions for food or bioactive compounds. Each condition is defined as a food/biocomponent per time point, per concentration, per cell line, primary culture or biopsy, per study. FooDrugs includes information about a bipartite network with 510 nodes and their similarity scores (tau score; https://clue.io/connectopedia/connectivity_scores) related to possible interactions with drugs assayed in the Connectivity Map (https://www.broadinstitute.org/connectivity-map-cmap). The information is stored in eight tables:
Table “study”: This table contains basic information about the study, such as identifiers from GEO, PubMed or the platform, study type, title and abstract.
Table “sample”: This table contains basic information about the different experiments in a study, like the identifier of the sample, treatment, origin type, time point or concentration.
Table “misc_study”: This table contains additional information about different attributes of the study.
Table “misc_sample”: This table contains additional information about different attributes of the sample.
Table “cmap”: This table contains information about 70895 nodes, comprising drugs, foods or bioactives, and overexpressed and knockdown genes (see section 3.4). The information includes cell line, compound and perturbation type.
Table “cmap_foodrugs”: This table contains information about the tau score (see section 3.4) that relates food with drugs or genes and the node identifier in the FooDrugs network.
Table “topTable”: This table contains information about 150 over- and under-expressed genes from each GEO study condition, used to calculate the tau score (see section 3.4). The information stored is the logarithmic fold change, average expression, t-statistic, p-value, adjusted p-value and whether the gene is up- or down-regulated.
Table “nodes”: This table stores the information about the identification of the sample and the node in the bipartite network connecting the tables “sample”, “cmap_foodrugs” and “topTable”.
In addition, FooDrugs database stores a total of 6422 food/drug interactions from 2849 text documents, obtained from three different sources: 2312 documents from PubMed, 285 from DrugBank, and 252 from drugs.com. These documents describe potential interactions between 1464 food/bioactive compounds and 3009 drugs. The information is stored in two tables:
Table “texts”: This table contains all the documents, with their identifiers, in which interactions have been identified with the strategy described in section 4.
Table “TM_interactions”: This table contains information about interaction identifiers, the food and drug entities, and the start and end positions of the context for the interaction in the document.
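To make the table layout concrete, here is a minimal query sketch; the join keys (study_id, sample_id, node_id) and the tau column name are assumptions inferred from the table descriptions above, not documented field names:

-- Sketch: rank food/biocomponent conditions by the strength of their
-- connectivity (tau) to drugs in the bipartite network.
SELECT st.title,
       sa.treatment,
       cf.tau
FROM study st
JOIN sample sa ON sa.study_id = st.study_id        -- assumed key
JOIN nodes n ON n.sample_id = sa.sample_id         -- assumed key
JOIN cmap_foodrugs cf ON cf.node_id = n.node_id    -- assumed key
ORDER BY ABS(cf.tau) DESC
LIMIT 20;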
FNS-Cloud will overcome fragmentation problems by integrating existing FNS data, which is essential for high-end, pan-European FNS research addressing FNS, diet, health, and consumer behaviours as well as sustainable agriculture and the bio-economy. Current fragmented FNS resources not only result in knowledge gaps that inhibit public health and agricultural policy, and the food industry, from developing effective solutions, making production sustainable and consumption healthier, but also do not enable exploitation of FNS knowledge for the benefit of European citizens. FNS-Cloud will, through three Demonstrators (Agri-Food, Nutrition & Lifestyle, and NCDs & the Microbiome), facilitate: (1) analyses of regional and country-specific differences in diet, including nutrition, (epi)genetics, microbiota, consumer behaviours, culture and lifestyle, and their effects on health (obesity, NCDs, ethnic and traditional foods), which are essential for public health and agri-food and health policies; (2) improved understanding of agricultural differences within Europe and what these mean for creating sustainable, resilient food systems for healthy diets; and (3) clear definitions of boundaries and how these affect the composition of foods and consumer choices and, ultimately, personal and public health in the future. Long-term sustainability of FNS-Cloud will be based on Services that have the capacity to link with new resources and enable cross-talk amongst them; access to FNS-Cloud data will be open access, underpinned by FAIR principles (findable, accessible, interoperable and re-usable). FNS-Cloud will work closely with the proposed Food, Nutrition and Health Research Infrastructure (FNHRI) as well as METROFOOD-RI and other existing ESFRI RIs (e.g. ELIXIR, ECRIN) in which several FNS-Cloud Beneficiaries are involved directly. (https://cordis.europa.eu/project/id/863059)
Changes between versions FooDrugs_v2 and FooDrugs_v3 (31 January 2023):
Increased the number of text documents by 85,675 from PubMed and ClinicalTrials.gov, and the number of text-mining interactions by 168,826.
Increased the number of transcriptomic studies by 32 GEO series.
Removed all rows in table cmap_foodrugs representing interactions with values of tau=0.
Removed 43 GEO series that, after manual checking, did not correspond to food compounds.
Added a new column to the table texts: citation, to hold the citation of the text.
Added these columns to the table study: contributor, to contain the authors of the study; publication_date, to store the date of publication of the study in GEO; and pubmed_id, to reference the publication associated with the study, if any.
Added a new column to topTable to hold the top 150 up-regulated and 150 down-regulated genes.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This MySQL database contains a list of submitted Java programs based on a series of online lab exercises from 2013 to 2015. The programs were submitted by first-year computer science students from the Faculty of Informatics and Computing, Universiti Sultan Zainal Abidin, Malaysia, who were undertaking an Introductory Computer Programming subject. There were 67, 18 and 47 participating students in 2013, 2014 and 2015, respectively. The submitted programs were all of their solution attempts in answering a computational programming question. The question was as follows:
Write a program that will read a string. Then your program should show all the string characters using *, except for character 2: output its real character. Sample input: Apology. Sample output: p****
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We present three defect rediscovery datasets mined from Bugzilla. The datasets capture data for three groups of open source software projects: Apache, Eclipse, and KDE. The datasets contain information about approximately 914 thousand defect reports over a period of 18 years (1999-2017), capturing the inter-relationships among duplicate defects.
File Descriptions
apache.csv - Apache Defect Rediscovery dataset
eclipse.csv - Eclipse Defect Rediscovery dataset
kde.csv - KDE Defect Rediscovery dataset
apache.relations.csv - Inter-relations of rediscovered defects of Apache
eclipse.relations.csv - Inter-relations of rediscovered defects of Eclipse
kde.relations.csv - Inter-relations of rediscovered defects of KDE
create_and_populate_neo4j_objects.cypher - Populates Neo4j graphDB by importing all the data from the CSV files. Note that you have to set dbms.import.csv.legacy_quote_escaping configuration setting to false to load the CSV files as per https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/#config_dbms.import.csv.legacy_quote_escaping
create_and_populate_mysql_objects.sql - Populates MySQL RDBMS by importing all the data from the CSV files
rediscovery_db_mysql.zip - For your convenience, we also provide full backup of the MySQL database
neo4j_examples.txt - Sample Neo4j queries
mysql_examples.txt - Sample MySQL queries
rediscovery_eclipse_6325.png - Output of Neo4j example #1
distinct_attrs.csv - Distinct values of bug_status, resolution, priority, severity for each project
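After importing the CSV files into MySQL (or restoring the provided backup), a first orientation query might group defects by one of the attributes enumerated in distinct_attrs.csv. A minimal sketch, assuming the Apache data is loaded into a table named apache:

-- Distribution of defect severities in the Apache project (sketch).
SELECT severity,
       COUNT(*) AS defect_count
FROM apache
GROUP BY severity
ORDER BY defect_count DESC;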
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Korean Text to MySQL Dataset
Dataset Summary
Korean Text to MySQL is a dataset comprising approximately 3,300 samples generated using OpenAI's gpt-4o model. This dataset is designed to train models that convert natural language questions in Korean into MySQL queries. The data generation process was inspired by the Self-Instruct method and followed the steps outlined below.
Data Generation Process
Approximately 100 SEED samples were… See the full description on the dataset page: https://huggingface.co/datasets/won75/text_to_sql_ko.
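To give a sense of the task format, here is a hypothetical illustration of the kind of question/query pair such a dataset contains; it is not an actual sample, and the users table and created_at column are invented for the example:

-- Question (Korean): "2023년에 가입한 사용자는 몇 명인가요?"
-- ("How many users signed up in 2023?")
SELECT COUNT(*) AS user_count
FROM users
WHERE YEAR(created_at) = 2023;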
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Explore our public data on competitions, datasets, kernels (code / notebooks) and more. Meta Kaggle may not be the Rosetta Stone of data science, but we do think there's a lot to learn (and plenty of fun to be had) from this collection of rich data about Kaggle’s community and activity.
Strategizing to become a Competitions Grandmaster? Wondering who, where, and what goes into a winning team? Choosing evaluation metrics for your next data science project? The kernels published using this data can help. We also hope they'll spark some lively Kaggler conversations and be a useful resource for the larger data science community.
https://i.imgur.com/2Egeb8R.png
This dataset is made available as CSV files through Kaggle Kernels. It contains tables on public activity from Competitions, Datasets, Kernels, Discussions, and more. The tables are updated daily.
Please note: This data is not a complete dump of our database. Rows, columns, and tables have been filtered out and transformed.
In August 2023, we released Meta Kaggle for Code, a companion to Meta Kaggle containing public, Apache 2.0 licensed notebook data. View the dataset and instructions for how to join it with Meta Kaggle here
We also updated the license on Meta Kaggle from CC-BY-NC-SA to Apache 2.0.
Known quirks and processing steps: the UserId column in the ForumMessages table has values that do not exist in the Users table; some columns hold True/False values; and the Total columns are derived counts. For example, the DatasetCount is not the total number of datasets with the Tag according to the DatasetTags table. The database tables are created with the db_abd_create_tables.sql script, and the data is cleaned with the clean_data.py script.
The script does the following steps for each table: among other clean-up, it normalises missing values to NULL. Foreign keys are then added with the add_foreign_keys.sql script. Finally, the Total columns in the database tables are recomputed; I do that by running the update_totals.sql script.
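For illustration, here is a sketch of the kind of recomputation update_totals.sql might perform; the Tags and DatasetTags table and column names follow Meta Kaggle conventions, but the actual script contents are an assumption:

-- Recompute each tag's DatasetCount from the DatasetTags link table (sketch).
UPDATE Tags t
SET t.DatasetCount = (
    SELECT COUNT(*)
    FROM DatasetTags dt
    WHERE dt.TagId = t.Id
);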
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
This dataset contains SQL injection attacks (SQLIA) as malicious NetFlow data. The attacks carried out are SQL injection by Union query and Blind SQL injection. To perform the attacks, the SQLMAP tool was used.
The NetFlow traffic was generated using DOROTHEA (DOcker-based fRamework fOr gaTHering nEtflow trAffic). NetFlow is a network protocol developed by Cisco for the collection and monitoring of network traffic flow data. A flow is defined as a unidirectional sequence of packets with some common properties that pass through a network device.
Datasets
The first dataset was collected to train the detection models (D1); the other was collected using different attacks than those used in training, in order to test the models and ensure their generalization (D2).
The datasets contain both benign and malicious traffic. All collected datasets are balanced.
The version of NetFlow used to build the datasets is 5.
| Dataset | Aim | Samples | Benign-malicious traffic ratio |
|---|---|---|---|
| D1 | Training | 400,003 | 50% |
| D2 | Test | 57,239 | 50% |
Infrastructure and implementation
Two sets of flow data were collected with DOROTHEA. DOROTHEA is a Docker-based framework for NetFlow data collection. It allows you to build interconnected virtual networks to generate and collect flow data using the NetFlow protocol. In DOROTHEA, network traffic packets are sent to a NetFlow generator that has a sensor ipt_netflow installed. The sensor consists of a module for the Linux kernel using Iptables, which processes the packets and converts them to NetFlow flows.
DOROTHEA is configured to use NetFlow v5 and to export a flow after it has been inactive for 15 seconds, or after it has been active for 1800 seconds (30 minutes).
Benign traffic generation nodes simulate network traffic generated by real users, performing tasks such as searching in web browsers, sending emails, or establishing Secure Shell (SSH) connections. Such tasks run as Python scripts. Users may customize them or even incorporate their own. The network traffic is managed by a gateway that performs two main tasks. On the one hand, it routes packets to the Internet. On the other hand, it sends it to a NetFlow data generation node (this process is carried out similarly to packets received from the Internet).
The malicious traffic collected (the SQLIA) was generated using SQLMAP, a penetration testing tool that automates the process of detecting and exploiting SQL injection vulnerabilities.
The attacks were executed on 16 nodes, each launching SQLMAP with the parameters in the following table.
| Parameters | Description |
|---|---|
| --banner, --current-user, --current-db, --hostname, --is-dba, --users, --passwords, --privileges, --roles, --dbs, --tables, --columns, --schema, --count, --dump, --comments | Enumerate users, password hashes, privileges, roles, databases, tables and columns |
| --level=5 | Increase the probability of a false positive identification |
| --risk=3 | Increase the probability of extracting data |
| --random-agent | Select the User-Agent randomly |
| --batch | Never ask for user input, use the default behavior |
| --answers="follow=Y" | Predefined answers to yes |
Every node executed SQLIA on 200 victim nodes. The victim nodes deployed a web form vulnerable to Union-type injection attacks, connected to either the MySQL or the SQL Server database engine (50% of the victim nodes deployed MySQL and the other 50% deployed SQL Server).
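For reference, a Union-type injection appends an attacker-controlled SELECT to the application's own query. A schematic example follows; the products and users tables and their columns are illustrative only, not part of the dataset:

-- The application builds: SELECT name, price FROM products WHERE id = '<input>';
-- Injected input: ' UNION SELECT username, password FROM users --
-- The database then executes:
SELECT name, price FROM products WHERE id = ''
UNION SELECT username, password FROM users -- ';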
The web service was accessible from ports 443 and 80, which are the ports typically used to deploy web services. The IP address space was 182.168.1.1/24 for the benign and malicious traffic-generating nodes. For victim nodes, the address space was 126.52.30.0/24.
The malicious traffic in the test sets was collected under different conditions. For D1, SQLIA was performed using Union attacks on the MySQL and SQLServer databases.
However, for D2, Blind SQL injection attacks were performed against the web form connected to a PostgreSQL database. The IP address spaces of the networks were also different from those of D1. In D2, the IP address space was 152.148.48.1/24 for benign and malicious traffic-generating nodes and 140.30.20.1/24 for victim nodes.
To run the MySQL server, MariaDB version 10.4.12 was used. Microsoft SQL Server 2017 Express and PostgreSQL version 13 were used for the other engines.
As of June 2024, the most popular database management system (DBMS) worldwide was Oracle, with a ranking score of *******; MySQL and Microsoft SQL Server rounded out the top three. Although the database management industry contains some of the largest companies in the tech industry, such as Microsoft, Oracle and IBM, a number of free and open-source DBMSs such as PostgreSQL and MariaDB remain competitive.
Database Management Systems
As the name implies, DBMSs provide a platform through which developers can organize, update, and control large databases. Given the business world’s growing focus on big data and data analytics, knowledge of SQL programming languages has become an important asset for software developers around the world, and database management skills are seen as highly desirable. In addition to providing developers with the tools needed to operate databases, DBMSs are also integral to the way that consumers access information through applications, which further illustrates the importance of the software.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
DACOS - DAtaset of COde Smells
The dataset offers annotated code snippets for three code smells: multifaceted abstraction, complex method, and long parameter list.
In addition to a manually annotated dataset of potentially subjective snippets, we offer a larger set containing snippets that are either definitely benign or definitely smelly.
The upload contains three files:
Required Software
The dataset is created in MySQL. Hence a local or remote installation of MySQL is needed with privileges to create and modify schemas.
Importing the Dataset
Each dataset is a self-contained SQL file. To import the datasets, run the following commands:
mysql -u username -p database_name < DACOSMain.sql
mysql -u username -p database_name < DACOSExtended.sql
Understanding the Datasets
The two datasets differ in architecture. The main dataset contains a table named annotations, which holds every annotation collected from users. The sample table contains the samples presented to the users for annotation. The class_metrics and method_metrics tables contain class and method metrics, respectively. These were used to filter samples that are likely to contain smells and hence can be shown to users.
The extended dataset is created by selecting samples that are below or above the selected metric range for each smell; hence, these samples are definitely smelly or benign. The extended version of the dataset does not contain a table for annotations, since its samples were not presented to users. It instead has an 'entry' table where each sample is classified according to the smell it contains. The codes for identifying smells are as below:
| Condition | Smell ID |
|---|---|
| Multifaceted Abstraction Present | 1 |
| Multifaceted Abstraction not detected | 4 |
| Long Parameter List Present | 2 |
| Long Parameter List Absent | 5 |
| Complex Method Present | 3 |
| Complex Method Absent | 6 |
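As a quick orientation query on the extended dataset, the sketch below counts samples per smell code from the 'entry' table; the smell_id column name is an assumption based on the description above, not a documented field name:

-- Count samples per smell code in the extended DACOS dataset (sketch).
SELECT smell_id,
       COUNT(*) AS sample_count
FROM entry
GROUP BY smell_id
ORDER BY smell_id;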
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset comprises two .csv files used within workstream 2 of the Wellcome Trust-funded ‘Orphan drugs: High prices, access to medicines and the transformation of biopharmaceutical innovation’ project (219875/Z/19/Z). They appear in various outputs, e.g. publications and presentations.
The deposited data were gathered using the University of Amsterdam Digital Methods Institute’s ‘Twitter Capture and Analysis Toolset’ (DMI-TCAT) before being processed and extracted from Gephi. DMI-TCAT queries Twitter’s STREAM Application Programming Interface (API) using SQL and retrieves data on a pre-set text query. It then sends the returned data for storage on a MySQL database. The tool allows for output of that data in various formats. This process aligns fully with Twitter’s service user terms and conditions. The query for the deposited dataset gathered a 1% random sample of all public tweets posted between 10-Feb-2021 and 10-Mar-2021 containing the text ‘Rare Diseases’ and/or ‘Rare Disease Day’, storing it on a local MySQL database managed by the University of Sheffield School of Sociological Studies (http://dmi-tcat.shef.ac.uk/analysis/index.php), accessible only via a valid VPN such as FortiClient and through a permitted active directory user profile. The dataset was output from the MySQL database raw as a .gexf format file, suitable for social network analysis (SNA). It was then opened using Gephi (0.9.2) data visualisation software and anonymised/pseudonymised in Gephi as per the ethical approval granted by the University of Sheffield School of Sociological Studies Research Ethics Committee on 02-Jun-201 (reference: 039187). The deposited dataset comprises two anonymised/pseudonymised social network analysis .csv files extracted from Gephi, one containing node data (Issue-networks as excluded publics – Nodes.csv) and another containing edge data (Issue-networks as excluded publics – Edges.csv). Where participants explicitly provided consent, their original username has been provided. Where they have provided consent on the basis that they not be identifiable, their username has been replaced with an appropriate pseudonym. All other usernames have been anonymised with a randomly generated 16-digit key. The level of anonymity for each Twitter user is provided in column C of the deposited file ‘Issue-networks as excluded publics – Nodes.csv’.
This dataset was created and deposited onto the University of Sheffield Online Research Data repository (ORDA) on 26-Aug-2021 by Dr. Matthew S. Hanchard, Research Associate at the University of Sheffield iHuman institute/School of Sociological Studies. ORDA has full permission to store this dataset and to make it open access for public re-use without restriction under a CC BY license, in line with the Wellcome Trust commitment to making all research data Open Access.
The University of Sheffield are the designated data controller for this dataset.
CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
This file contains the database and table creation scripts used in the Daily Sales & Expenses System project. It defines the structure of the MySQL database where all sales, expenses, inventory, and login records are stored before being accessed, analyzed, and visualized using Python.
The file includes: - Commands to create the database - SQL statements to create the users, sales, expenses and inventory tables - (Optionally) Sample INSERT statements to populate the tables with test data
This file is essential for anyone who wants to replicate or test the Python-MySQL integration in the project.
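As an indication of what such a creation script looks like, here is a minimal sketch; the actual database name and column definitions in the project file may differ:

-- Create the project database and a sales table (illustrative sketch).
CREATE DATABASE IF NOT EXISTS daily_sales;
USE daily_sales;

CREATE TABLE IF NOT EXISTS sales (
    id         INT AUTO_INCREMENT PRIMARY KEY,
    item_name  VARCHAR(100) NOT NULL,
    quantity   INT NOT NULL,
    unit_price DECIMAL(10, 2) NOT NULL,
    sale_date  DATE NOT NULL
);

-- Optional sample row for testing:
INSERT INTO sales (item_name, quantity, unit_price, sale_date)
VALUES ('Notebook', 3, 2.50, '2024-01-15');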
As of June 2024, the most popular relational database management system (RDBMS) worldwide was Oracle, with a ranking score of *******. Oracle was also the most popular DBMS overall. MySQL and Microsoft SQL Server rounded out the top three.
The Radiocarbon dating laboratory of IRPA/KIK was founded in the 1960s. Initially, dates were reported at more or less regular intervals in the journal Radiocarbon (Schreurs 1968). Since the advent of radiocarbon dating in the 1950s it had been a common practice amongst radiocarbon laboratories to publish their dates in so-called ‘date-lists’ that were arranged per laboratory. This was first done in the Radiocarbon Supplement of the American Journal of Science and later in the specialised journal Radiocarbon. In the course of time the latter, with the added subtitle An International Journal of Cosmogenic Isotope Research, became a regular scientific journal, shifting focus from date-lists to articles. Furthermore, the world-wide exponential increase of radiocarbon dates made it almost impossible to publish them all in the same journal, even more so because of the broad range of applications that use radiocarbon analysis, ranging from archaeology and art history to geology and oceanography, and recently also biomedical studies.
The IRPA/KIK database
From 1995 onwards, IRPA/KIK’s radiocarbon laboratory started to publish its dates in small publications, continuing the numbering of the preceding lists in Radiocarbon. The first booklet in this series was “Royal Institute for Cultural Heritage Radiocarbon dates XV” (Van Strydonck et al. 1995), followed by three more volumes (XVI, XVII, XVIII). The next list (XIX, 2005) was no longer printed but instead handed out as a PDF file on CD-rom. The ever increasing number of dates and the difficulties in handling all the data, however, made us look for a more permanent and easier solution. In order to improve data management and consultation, it was thus decided to gather all our dates in a web-based database. List XIX was in fact already a Microsoft Access database that was converted into a reader-friendly style and could also be printed as a PDF file. However, a Microsoft Access database is not the most practical solution for making information publicly available. Hence the structure of the database was recreated in MySQL and the existing content was transferred into the corresponding fields. To display the records, a web-based front-end was programmed in PHP/Apache. It features a full-text search function that allows for partial word-matching. In addition, the records can be consulted in PDF format. Old records from the printed date-lists as well as new records are now added using the same Microsoft Access back-end, which is now connected directly to the MySQL database.
The main problem with introducing the old data was that not all the current criteria were available in the past (e.g. stable isotope measurements). Furthermore, since all the sample information is given by the submitter, its quality largely depends on the person's willingness to contribute as well as on the accuracy and correctness of the information he provides. Sometimes problems arise from the fact that a certain investigation (like an excavation) is carried out over a relatively long period (sometimes even more than ten years) and is directed by different people or even institutions. This can lead to differences in the labelling procedure of the samples, but also in the interpretation of structures and artifacts and in the orthography of the site's name. Finally, the submitter might change address, while the names of institutions or even regions and countries might change as well (e.g. Zaire - Congo).
README 2025-09-10
Introduction
The peatland mid infrared database (pmird) stores data from peat, vegetation, litter, and dissolved organic matter samples, in particular mid infrared spectra and other variables, from previously published and unpublished data sources. The majority of samples in the database are peat samples from northern bogs. Currently, the database contains entries from 26 studies, 11216 samples, and 3877 mid infrared spectra. The aim is to provide a harmonized data source that can be useful to re-analyse existing data, analyze peat chemistry, develop and test spectral prediction models, and provide data on various peat properties.
Usage notes
Download and Setup
The peatland mid infrared database can be downloaded from https://doi.org/10.5281/zenodo.17092587. The publication contains the following files and folders:
- pmird-backup-2025-09-10.sql: a mysqldump backup of the pmird database.
- pmird_prepared_data: a folder that contains folders like c00001-2020-08-17-Hodgkins with the raw spectra for samples from each dataset in the pmird database (see below for how to import the spectra), and files like pmird_prepare_data_c00001-2020-08-17-Hodgkins.Rmd that contain the R code used to process and import the data from each dataset into the database. Corresponding html files contain the compiled scripts.
- pmird_prepare_data.Rmd: an Rmarkdown script that was used to run the scripts that created the database (the top-level script).
- mysql_scripts: a folder that contains pmird_mysql_initialization.sql (a MariaDB script to initialize the database), 001-db-initialize.Rmd (an Rmarkdown script that executes pmird_mysql_initialization.sql and populates dataset-independent tables), add-citations.Rmd (adds information on references to the database), add-licenses.Rmd (adds information on licenses to the database), and add-mir-metadata-quality.Rmd (adds information on the quality of the infrared spectra to the database).
- Dockerfile: a Dockerfile that defines the computing environment used to create the database.
- renv.lock: a renv.lock file that lists the R packages used to create the database.
The database can be set up as follows: the downloaded database needs to be imported into a running MariaDB instance. In a linux terminal, the downloaded sql file can be imported like so: mysql -u
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 17, 2013. A database which contains the information of heparin-binding proteins of E. coli K-12 MG1655 cells. Heparin affinity columns were applied to enrich and fractionate proteins. Identification of proteins was done via collaboration with David Russell's lab. Because heparin is a negatively charged sulfated glucosaminoglycan, polyanion-binding proteins, which include nucleic acid-binding proteins, are expected to bind to heparin columns. Studying the expression pattern of heparin-binding proteins will help the study of nucleic acid-binding proteins, most of which are related to regulation. Moreover, heparin affinity columns will also enrich low-abundance proteins. The Heparome database is constructed using MySQL. The website interface is built using HTML and PHP. Queries between the MySQL database and the website interface are executed using PHP. Besides information on identified proteins, such as Swiss-Prot accession number, gene name, molecular weight, isoelectric point, codon adaptation index (CAI), functional classification, and so on, it also includes information on experiments, such as sample preparation, heparin-HPLC chromatography, SDS-PAGE gel separation and MALDI-MS.
The datasets contain the attribute data of 4,2000 movies and 5,000 users, as well as user ratings on the movies, which were extracted from the Douban website and CN-DBpedia. Anyone wanting to use the .sql file should run it in MySQL.