License: CC0 1.0 (https://creativecommons.org/publicdomain/zero/1.0/)
This is a relational database schema for a sales and order management system, designed to track customers, employees, products, orders, and payments. Below is a detailed breakdown of each table and their relationships:
productlines Table (Product Categories)
- productLine: Primary key.
- textDescription: A short description of the product line.
- htmlDescription: A detailed HTML-based description.
- image: Associated image (if applicable).
Relationships:
- products: Each product belongs to one productLine.

products Table (Product Information)
- productCode: Primary key.
- productName: Name of the product.
- productLine: Foreign key linking to productlines.
- productScale, productVendor, productDescription: Additional product details.
- quantityInStock: Number of units available.
- buyPrice: Cost price per unit.
- MSRP: Manufacturer's Suggested Retail Price.
Relationships:
- productlines (each product belongs to one category).
- orderdetails (a product can be part of many orders).

orderdetails Table (Line Items in an Order)
- Primary key: (orderNumber, productCode).
- quantityOrdered: Number of units in the order.
- priceEach: Price per unit.
- orderLineNumber: The sequence number in the order.
Relationships:
- orders (each order has multiple products).
- products (each product can appear in multiple orders).

orders Table (Customer Orders)
- orderNumber: Primary key.
- orderDate: Date when the order was placed.
- requiredDate: Expected delivery date.
- shippedDate: Actual shipping date (can be NULL if not shipped).
- status: Order status (e.g., "Shipped", "In Process", "Cancelled").
- comments: Additional remarks about the order.
- customerNumber: Foreign key linking to customers.
Relationships:
- orderdetails (an order contains multiple products).
- customers (each order is placed by one customer).

customers Table (Customer Details)
- customerNumber: Primary key.
- customerName: Name of the customer.
- contactLastName, contactFirstName: Contact person.
- phone: Contact number.
- addressLine1, addressLine2, city, state, postalCode, country: Address details.
- salesRepEmployeeNumber: Foreign key linking to employees, representing the sales representative.
- creditLimit: Maximum credit limit assigned to the customer.
Relationships:
- orders (a customer can place multiple orders).
- payments (a customer can make multiple payments).
- employees (each customer has a sales representative).

payments Table (Customer Payments)
- Primary key: (customerNumber, checkNumber).
- paymentDate: Date of payment.
- amount: Payment amount.
Relationships:
- customers (each payment is linked to a customer).

employees Table (Employee Information)
- employeeNumber: Primary key.
- lastName, firstName: Employee's name.
- extension, email: Contact details.
- officeCode: Foreign key linking to offices, representing the employee's office.
- reportsTo: References another employeeNumber, establishing a hierarchy.
- jobTitle: Employee's role (e.g., "Sales Rep", "Manager").
Relationships:
- offices (each employee works in one office).
- employees (self-referential, representing the reporting structure).
- customers (each employee manages multiple customers).

offices Table (Office Locations)
- officeCode: Primary key.
- city, state, country: Location details.
- phone: Office contact number.
- addressLine1, addressLine2, postalCode, territory: Address details.
Relationships:
- employees (each office has multiple employees).

This schema provides a well-structured design for managing a sales and order system, covering:
✅ Product inventory
✅ Order and payment tracking
✅ Customer and employee management
✅ Office locations and hierarchical reporting
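As a quick illustration of how these tables join, here is a minimal sketch of querying the schema once it has been loaded into SQLite; the database file name (classicmodels.db) is an assumption, while the table and column names follow the breakdown above:

```python
import sqlite3

# Assumes the schema above has been loaded into a local SQLite file.
conn = sqlite3.connect("classicmodels.db")

# Total revenue per product line: orderdetails -> products -> productlines.
query = """
SELECT p.productLine,
       SUM(od.quantityOrdered * od.priceEach) AS revenue
FROM orderdetails AS od
JOIN products AS p ON p.productCode = od.productCode
GROUP BY p.productLine
ORDER BY revenue DESC;
"""
for product_line, revenue in conn.execute(query):
    print(f"{product_line}: {revenue:.2f}")
conn.close()
```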
License: MIT (https://opensource.org/licenses/MIT)
This dataset is synthetically generated fake data designed to simulate a realistic e-commerce environment.
Its purpose is to provide large-scale relational datasets for practicing database operations and analytics, and for testing tools like DuckDB, Pandas, and SQL engines. It is ideal for benchmarking, educational projects, and data engineering experiments.
Customers table:
- (int): Unique identifier for each customer
- (string): Customer full name
- (string): Customer email address
- (string): Customer gender ('Male', 'Female', 'Other')
- (date): Date customer signed up
- (string): Customer country of residence

Products table:
- (int): Unique identifier for each product
- (string): Name of the product
- (string): Product category (e.g., Electronics, Books)
- (float): Price per unit
- (int): Available stock count
- (string): Product brand name

Orders table:
- (int): Unique identifier for each order
- (int): ID of the customer who placed the order (foreign key to Customers)
- (date): Date when order was placed
- (float): Total amount for the order
- (string): Payment method used (Credit Card, PayPal, etc.)
- (string): Country where the order is shipped

Order items table:
- (int): Unique identifier for each order item
- (int): ID of the order this item belongs to (foreign key to Orders)
- (int): ID of the product ordered (foreign key to Products)
- (int): Number of units ordered
- (float): Price per unit at order time

Reviews table:
- (int): Unique identifier for each review
- (int): ID of the reviewed product (foreign key to Products)
- (int): ID of the customer who wrote the review (foreign key to Customers)
- (int): Rating score (1 to 5)
- (string): Text content of the review
- (date): Date the review was written

ER diagram: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F9179978%2F7681afe8fc52a116ff56a2a4e179ad19%2FEDR.png?generation=1754741998037680&alt=media
The script saves two folders inside the specified output path:
csv/ # CSV files
parquet/ # Parquet files
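A short sketch of working with both output formats, in the spirit of the DuckDB/Pandas use cases above; the output path and the customers file name are assumptions based on the folder layout:

```python
import duckdb
import pandas as pd

# Assumed output layout from the generator: csv/ and parquet/ folders.
customers = pd.read_csv("output/csv/customers.csv")
print(customers.head())

# DuckDB can query the Parquet files in place, without loading them first.
con = duckdb.connect()
top_countries = con.execute("""
    SELECT country, COUNT(*) AS n_customers
    FROM 'output/parquet/customers.parquet'
    GROUP BY country
    ORDER BY n_customers DESC
    LIMIT 5
""").fetchdf()
print(top_countries)
```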
This is the sample database from sqlservertutorial.net, a great dataset for learning SQL and practicing queries against a relational database.
Database Diagram:
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F4146319%2Fc5838eb006bab3938ad94de02f58c6c1%2FSQL-Server-Sample-Database.png?generation=1692609884383007&alt=media
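As a starting point for practice, here is a hedged sketch of one query against this database from Python; the connection details are placeholders, and the sales.customers/sales.orders table names are those commonly distributed with the sqlservertutorial.net sample, so verify them against the diagram above:

```python
import pyodbc

# Connection details are placeholders; adjust them to your SQL Server setup.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=BikeStores;UID=user;PWD=password"
)
cursor = conn.cursor()

# Order count per customer, assuming the commonly distributed table names.
cursor.execute("""
    SELECT c.first_name, c.last_name, COUNT(o.order_id) AS order_count
    FROM sales.customers AS c
    JOIN sales.orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.first_name, c.last_name
    ORDER BY order_count DESC;
""")
for first_name, last_name, order_count in cursor.fetchmany(5):
    print(first_name, last_name, order_count)
conn.close()
```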
The sample database is copyrighted and cannot be used for commercial purposes. This includes, but is not limited to: selling it, or including it in paid courses.
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
The address register authentically reproduces all addresses officially assigned by the municipalities throughout Austria. It is thus the reference for the addresses of Austria with regard to addressability, spelling, order number assignment and spatial assignment. This central address database is part of the boundary register (§9a VermG) and is maintained and updated by the municipalities and cities, with the characteristics listed in the Surveying Act, via a central reporting channel. The BEV (Bundesamt für Eich- und Vermessungswesen) provides the addresses in various forms. In addition to much other information, each address has a unique key (the address code) and an exact spatial coordinate assignment (geocoding).
The address code is a non-descriptive seven-digit unique key for all addresses; no conclusions about how it was assigned can be drawn from the digit sequence. If buildings are located at an address, a 3-digit subcode is assigned for each. Address code and subcode together form the address number.
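As a small illustration of the key structure, a sketch composing an address number from its two parts; the zero-padded concatenation is an assumption for illustration, and the authoritative format is given in the BEV interface description:

```python
def address_number(address_code: int, subcode: int | None = None) -> str:
    """Compose an address number from the 7-digit address code and,
    where a building exists at the address, its 3-digit subcode.

    Zero-padded concatenation is an illustrative assumption; consult
    the BEV interface description for the authoritative format.
    """
    code = f"{address_code:07d}"
    return code if subcode is None else f"{code}{subcode:03d}"

print(address_number(1234567))      # address only   -> '1234567'
print(address_number(1234567, 42))  # with building  -> '1234567042'
```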
The data as of the cut-off date of 3 April 2022 contain additional attribute fields. For more details, see the interface description (included in the download file).
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
This repository contains historical data collected in the digital humanities project Dhimmis & Muslims – Analysing Multireligious Spaces in the Medieval Muslim World. The project was funded by the VolkswagenFoundation within the scope of the Mixed Methods initiative. It was a collaboration between the Institute for Medieval History II of the Goethe University in Frankfurt/Main, Germany, and the Institute for Visualization and Interactive Systems at the University of Stuttgart, and took place there from 2018 to 2021. The objective of this joint project was to develop a novel visualization approach in order to gain new insights into the multi-religious landscapes of the Middle East under Muslim rule during the Middle Ages (7th to 14th century). In particular, information on multi-religious communities was researched and made available in a database accessible through interactive visualization, as well as through a pilot web-based geo-temporal multi-view system to analyze and compare information from multiple sources. The code for this visualization system is publicly available on GitHub under the MIT license.

The data in this repository is a curated database dump containing data collected from a predetermined set of primary historical sources and literature. The core objective of the data entry was to record historical evidence for religious groups in cities of the medieval Middle East. In the project, data was collected in a relational PostgreSQL database, the structure of which can be reconstructed from the file schema.sql. An entire database dump, including both the database schema and the table contents, is located in database.sql. The PDF file database-structure.pdf describes the relationships between tables in a graphical schematic. In the database.json file, the contents of the individual tables are stored in JSON format. At the top level, the JSON file is an object. Each table is stored as a key-value pair, where the key is the table name and the value is an array of table records. Each table record is itself an object of key-value pairs, where the keys are the table columns and the values are the corresponding values in the record.

The dataset is centered around the evidence, which represents one piece of historical evidence as extracted from one or more sources. An evidence must contain a reference to a place and a religion, and may reference a person and one or more time spans. Instances are used to connect evidences to places, persons, and religions; additional metadata are stored individually in the instances. Time instances are connected to the evidence via a time group, to allow for more than one time span per evidence. An evidence is connected via one or more source instances to one or more sources. Evidences can also be tagged with one or more tags via the tag_evidence table. Places and persons have a type, which are defined in the place type and person type tables. Alternative names for places are stored in the name_var table, with a reference to the respective language. For places and persons, references to URIs in other data collections (such as Syriaca.org or the Digital Atlas of the Roman Empire) are also stored, in the external_place_uri and external_person_uri tables. Rules for how to construct the URIs from the fragments stored in the last-mentioned tables are controlled via the uri_namespace and external_database tables. Part of the project was to extract historical evidence from digitized texts, via annotations.
Annotations are placed in a document, which is a digital version of a source. An annotation can be one of the four instance types, thereby referencing a place, person, religion, or time group. A reference to the annotation is stored in the instance, and evidences are constructed from annotations by connecting the respective instances in an evidence tuple.
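Given the documented layout of database.json (one key-value pair per table, each value an array of column-to-value records), a minimal sketch for loading and inspecting it; only file and table names mentioned above are used, and the evidence lookup is guarded in case the table name differs:

```python
import json

# database.json layout: {table_name: [ {column: value, ...}, ... ], ...}
with open("database.json", encoding="utf-8") as f:
    tables = json.load(f)

# List tables and their record counts.
for name, records in tables.items():
    print(f"{name}: {len(records)} records")

# Inspect the columns of the central 'evidence' table, if present.
evidence = tables.get("evidence", [])
if evidence:
    print("evidence columns:", sorted(evidence[0].keys()))
```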
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
This data corpus was produced during the RoBivaL project, by robotics and agriculture researchers from DFKI (German Research Center for Artificial Intelligence, Robotics Innovation Center) and HSO (Hochschule Osnabrück, University of Applied Sciences, Agro-Technicum), between August 2021 and October 2023.
The RoBivaL project compared different robot locomotion concepts from both space research and agricultural applications on the basis of experiments conducted under agricultural conditions. Four robot systems were used, two of which (ARTEMIS & SherpaTT) have their origin in futuristic space applications, while the other two (Naio Oz & BoniRob) were developed specifically for agriculture.
The robots were subjected to six experiments, addressing different challenges and requirements for agricultural applications. Since real-world soil conditions usually change with the seasons and can be expected to have a crucial impact on robot performance, the experimental soil conditions were controlled and varied on the two dimensions moisture (dry, moist, wet) and density (tilled, compacted), resulting in six soil condition options. Depending on the specific objectives, each experiment was conducted either on a subset or on all available soil conditions. The experiments were:
Straight travel: Determine variations of travel speed and directional stability under different soil moisture and density levels, and determine the soil deformation and compaction caused by a traverse under given initial soil conditions.
Turn around: Examine the effect of steering on soil deformation with moist and tilled soil.
Repeated rollover: Investigate the effects of repeated axle rollovers on soil compaction, determined by measuring the soil penetration resistance.
Tensile force: Compare the maximum exerted tractive force under different soil moisture and density levels, and gain insights into how varying soil conditions affect the performance of each system during traction.
Sill crossing: Determine the ability to overcome different types of obstacles, and compare relevant system characteristics, e.g. ground clearance, or center of mass.
Obstacle avoidance: Demonstrate SherpaTT's ability to step over an obstacle without contact, thanks to its actively controlled suspension.
Field conditions and robot behavior were monitored with various sensors and measuring devices, partly on the robots and partly in the field, in order to document the experiment execution, and to determine the robot performance. The data capturing devices, their roles and deployments are summarized in Table 1.
Table 1: Overview of data capturing devices
Device on system (system monitoring): IMU, Force logger, RTK-GPS, Stopwatch, Compass
Device on system and in field (system and field monitoring): Video camera, Ruler
Device in field (field monitoring): Tilt laser scanner, Penetrometer, Moisture meter
The data corpus is stored in a file tree, which is divided into three main sections:
Logbook
Data
Specification
Each section is described in detail in the following chapters.
Here is a complete overview of the file tree:
logbook/
  csv/
    experiment.csv
    parameter.csv
    possible_robot.csv
    possible_value.csv
    robot.csv
    run_${experiment}.csv
  database/
    logbook.sqlite
  schema/
    logbook_entities.png
    logbook_schema.sqlite
  src/
    create_sqlite_database.sh
data/
  ${experiment}/
    ${robot}/
      ${run}/
        ${datafile}
specification/
  experiment/
    experiment.md
    ${experiment}/
      img/
        ${experiment}.png
      ${experiment}-description.md
      ${experiment}.json
  parameter/
    parameter.json
  datafile/
    ${datafile_stem}.json
  robot/
    ${robot}/
      system_properties.json
    robots.png
  sensor/
    ${sensor}.json
  software/
    ${software}.json
The variables ${experiment}, ${robot}, ${run}, ${datafile}, ${datafile_stem}, ${sensor}, and ${software} are explained in the context of each respective section.
2.1. Logbook
The Logbook is a small relational database. Primarily, it contains one table for every experiment, where each row represents an experiment run. These tables capture all facts and measurements about a run that can be expressed as scalar values, including
start and end times,
independent variables (e.g. run track length, soil moisture and density level, name of the tested robot, commanded speed, etc.),
dependent variables (e.g. wheel track depth and width, heading and offset of the robot after a run, etc.),
comments about unforeseen events.
Additional measurements that are better managed in separate data files are stored in the Data section of the corpus, which is discussed in Chapter 2.2.
Besides the run tables, the Logbook contains tables to specify the experiments and the available robots, as well as the parameters that are present in the run tables. These additional tables have some overlap with the Specification section of the data corpus, which is discussed in Chapter 2.3. In the Logbook, the specifying tables were used during the run data acquisition in the field, in order to facilitate and live-validate the data entry.
The entire Logbook is stored in the SQLite file logbook/database/logbook.sqlite. Users who prefer tools other than SQLite can find the constituent tables as CSV files in the directory logbook/csv/. The Logbook schema and entity-relationship diagram are in the logbook/schema/ directory. The database can be recreated from the schema and CSV files with the Bash script logbook/src/create_sqlite_database.sh.
The full Logbook file tree is as follows:
logbook/
  csv/
    experiment.csv
    parameter.csv
    possible_robot.csv
    possible_value.csv
    robot.csv
    run_${experiment}.csv
  database/
    logbook.sqlite
  schema/
    logbook_entities.png
    logbook_schema.sqlite
  src/
    create_sqlite_database.sh
The ${experiment} variable refers to the keys at the top level of the data tree, which is discussed in Chapter 2.2.1.
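A small sketch of opening the Logbook directly with Python's sqlite3 module; the database path comes from the file tree above, while run_straight_travel is a hypothetical ${experiment} key used only for illustration:

```python
import sqlite3

conn = sqlite3.connect("logbook/database/logbook.sqlite")

# List the tables in the Logbook, including the per-experiment run tables.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]
print(tables)

# Peek at one run table; 'run_straight_travel' is a hypothetical
# ${experiment} key used only for illustration.
if "run_straight_travel" in tables:
    for row in conn.execute("SELECT * FROM run_straight_travel LIMIT 3"):
        print(row)
conn.close()
```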
2.2. Data
The Data section of the corpus contains all measurements that would be impractical to store directly in a run table of the Logbook database, but are better managed as separate data files. In most cases, these are time series issued by a particular sensor, and/or by a software running on one of the robots.
In addition to the data strictly necessary for evaluation purposes in RoBivaL, there are some extra data streams that were routinely captured on the ARTEMIS robot which were not available for the other systems, as well as data from experiment runs that were considered invalid or performed for testing.
For data recording on the robots, two different approaches were used, due to different sensor availabilities. In the case of SherpaTT, Naio Oz, and BoniRob, a custom-built, stand-alone embedded PC in a battery-equipped box for autonomous operation (aka Sensor Box) was attached to the given robot. The Sensor Box includes IMU and GNSS sensors, which are of primary relevance for the experiments. In the case of ARTEMIS, built-in sensors and data-logging functionality could be used; these rely on sensors similar to those of the Sensor Box and employ the same software infrastructure for data recording, based on the Rock software framework.
Table 2 gives an overview of all available data files with a short description and possible sources, including sources outside of the robots (i.e. Force logger, Penetrometer, Tilt scanner, and Video camera). A thorough specification of the data files and their respective hard- and software resources is in the Specification section of the corpus, which is discussed in Chapter 2.3.
Table 2: Overview of file types in the Data section
bogie_dispatcher.motion_status.csv
Time series of the status of the joint of the mobile base. Sources: ARTEMIS

force.csv
Time series of momentary tractive force exerted by a robot, measured at regular intervals. Sources: Force logger

gnss.nwu_position_samples.csv
Time series of Cartesian positions measured by GPS in a North-West-Up coordinate system. Sources: ARTEMIS, Sensor Box

gnss.position_samples.csv
Time series of Cartesian positions measured by GPS in the robot coordinate system. Sources: ARTEMIS, Sensor Box

gnss.solution.csv
Time series of raw values from the GPS sensor. Sources: ARTEMIS, Sensor Box

joystick_converter.motion_command.csv
Time series of joystick commands interpreted as motion commands. Sources: ARTEMIS

motion_controller.actuators_command.csv
Time series of commands for the joints of the mobile base. Sources: ARTEMIS

odometry.odometry_samples.csv
Time series of the aggregated pose from the odometry component. Sources: ARTEMIS

penetrometer-after.json, penetrometer-before.json, penetrometer.json
Penetrometer measurements of the soil penetration resistance at multiple depth levels, before or after the experiment run. Sources: Penetrometer

tiltscan-before-front.asc, tiltscan-before-rear.asc, tiltscan-before-left.asc, tiltscan-before-right.asc, tiltscan-before.png, tiltscan-before.txt, tiltscan-after-front.asc, tiltscan-after-rear.asc, tiltscan-after-left.asc, tiltscan-after-right.asc, tiltscan-after.png, tiltscan-after.txt
Tilt scanner measurements of the track surface, before or after the experiment run, on the front, rear, left, or right side of the robot, in raw pointcloud (.asc) or rasterized and consolidated (.png, .txt) form. Sources: Tilt scanner

video.mp4.defaced.mp4
Video recordings of the robot performing the experiment run, postprocessed to remove faces for privacy protection. Sources: Video camera

xsens.calibrated_sensors.csv
Raw readings of the inertial unit. Sources: ARTEMIS, Sensor Box

xsens.orientation_samples.csv
Integrated Cartesian pose measured by
License: CC BY 3.0 (https://creativecommons.org/licenses/by/3.0/)
Summary of relational tables in the TSCEvolTree_Aze&2011_CorrJul2018 database
MorphospeciesAze_TableS3
Details for the 339 morphospecies of the Aze & others paper [1], augmented from [1, Appendix S1, Table S3 and Appendix S5, worksheet aM]. The main focus is on clarifying the choice of stratigraphic ranges and ancestry, and incorporating post-publication corrections by the authors of Aze & others or selective corrections/amendments during conversion to TimeScale Creator.
Stratigraphic ranges are given in Ma values; the time scales of the sources for the Ma values are made explicit (via links to table, MorphospeciesAze_TableS3DateRef). Almost all ranges are simple, as per those provided by the 2011 paper, delineated by lowest (start date) and highest occurrence (end date). However, a small number of ranges more closely represent those given by the nominated sources by also including range extensions: “questioned” or “questioned (rare)” for less confident stratigraphic occurrences; and “conjectured”, where a range extension is hypothesized, usually to support an ancestry proposal lacking contiguous stratigraphic occurrences. A proportion (~15 %) of Ma values are corrected where minor differences in Ma values were found between the 2011 paper and the nominated source; however, a systematic check was not conducted across the dataset. A further proportion (~15 %) of Ma values are amended where alternative sources appear to better represent the intention of the 2011 paper; these include a few instances where there would be a conflict with the index (marker) datum sequence of the Wade & others [2] zonation. Corrections to Ma values are accompanied by brief explanatory comments. Minor changes to Ma values were also made by one of us (TA) for a proportion (~17 %) of entries; most of these corresponded to the already invoked corrections or amendments.
Entries for ancestors follow the 2011 paper, with two exceptions in which adjustments to Ma values have removed the overlap in range between ancestor and descendant: a correction made by Tracy Aze (for Pulleniatina finalis, P. obliquiloculata replaced P. spectabilis); and an amendment (for Paragloborotalia pseudokugleri, Dentoglobigerina galavisi is amended to D. globularis). Levels of evidential support for the ancestor–descendant proposals were not critically appraised as part of the TimeScale Creator conversion. However, column [PhylogenyMethod] was employed to distinguish a small number of proposals which were distinctly less (“not well”) or better (“strongly”) supported than the typical “well supported” proposals presumed for this group.
All other information given in [1, Table S3] was incorporated, including indications of morphology, ecology, geography, and analyses made using the Neptune database. This information from Table S3 also included the lists of segments from both morphospecies (ID) and lineage (LID) trees within which each morphospecies occurred; in terms of relational logic, these could be supplanted by a single entry, the code for the lineage containing the highest occurrence of the morphospecies, and this was added manually for the TimeScale Creator conversion.
BiospeciesAze_aL
Details for the 210 lineages of the 2011 paper, augmented from [1, Appendix S5, worksheet aL]. The main focus is to maximize and maintain consistency and transparency between morphospecies and lineages for Ma values of their stratigraphic ranges. This is achieved for the TimeScale Creator conversion by nominating a morphospecies whose Ma value (start or end date) potentially defines the date (start or end) for a lineage; each morphospecies chosen for this is based on the apparent link between morphospecies and lineage dates used in the 2011 paper; this morphospecies is given by column [StartDateOrigLinkMph]. For start dates, ~40 % of lineages could be linked in this way; for end dates, almost all (93 %) were. Where a lineage range point of the 2011 study did not correspond to a morphospecies range point, then this morphospecies is at least used to provide the time scale applied to the date for the lineage.
Entries for ancestral lineages follow the 2011 paper, with two exceptions necessitated by changes in Ma values which place the ancestral lineage outside the date of origin of the descendant lineage: N150-N151-T153, involving the origin of morphospecies Paragloborotalia pseudokugleri; and N52-N54-T53, involving the origin of morphospecies Hirsutella cibaoensis. Levels of evidential support for the ancestor–descendant proposals were not critically appraised as part of the TimeScale Creator conversion. However, column [PhylogenyMethod] was employed to distinguish two proposals that were distinctly less (“not well”) or better (“strongly”) supported than the typical “well supported” proposals presumed for this group. The assignment of branching type as bifurcating or budding in the 2011 paper is incorporated.
Ecogroup and morphogroup allocations follow the 2011 paper (these data were not provided with the 2011 paper, but were indicated by colours employed in [1, Appendices S2, S3]; some colours for lineage morphogroups needed to be corrected; the ecogroup and morphogroup data for lineages were provided for the TimeScale Creator conversion by one of us [TA]). Some minor exceptions to these ecogroup and morphogroups were invoked for the TimeScale Creator conversion, in order to better match those of the contained morphospecies.
MorphospeciesAze_TableS1_Morphogroup
Details for morphogroups used for morphospecies and lineages; as for [1, Appendix 1, Table S1, "Morphogroup"], with explicit colour codes.
MorphospeciesAze_TableS1_Ecogroup
Details for ecogroups used for morphospecies and lineages; as for [1, Appendix 1, Table S1, "Ecogroup"], with explicit colour codes.
MorphospeciesAze_TableS3_EcogroupReference
Sources for ecogroups assigned to morphospecies; as for "Ecogroup reference", taken from [1, Appendix 1, Table S3]; multiple references in the original entries are accorded a row each.
MorphospeciesAze_TableS3_AppendixS1C_References
References for [1, Appendix 1, Table S3].
MorphospeciesAze_TableS3DateRef
Sources, and their time scales, used for Ma values (sources from [1, Appendix 1, Table S3, "Date reference"]). The key purpose is to make explicit the time scale against which the source has (apparently) provided the Ma value; this is essential in order to appropriately recalibrate to the current GTS time scale, and also to maintain the capability to recalibrate to future time scales. An important example of this need is where dates from the Paleocene Atlas [3] have here been remeasured directly from the Atlas and so are against the time scale of Berggren & others [4], rather than calibrated to Wade & others [2] as in the 2011 study.
In the interests of transparency and to provide a pointer to recalibration steps needed, a further level of specificity is needed for those sources which imply more than one time scale for Ma values used. For the TimeScale Creator conversion, references to these sources also have the time scale specified. Examples include chapters from the Eocene Atlas [5]. For instance, in order for the TimeScale Creator conversion to record the questionable parts of the stratigraphic ranges given for some Clavigerinella morphospecies by Coxall & Pearson [6], additional start dates for these morphospecies have been measured directly from their Figure 8.1, drawn against the scale of Berggren & Pearson [7]. However, these dates need to be integrated with the Ma values from Coxall & Pearson already used in the 2011 paper, which were presented recalibrated by them to the scale of Wade & others. These two sets of sources are given as, respectively, “Coxall & Pearson (2006: BP05)” (against Berggren & Pearson) and “Coxall & Pearson (2006)” (against the time-scale option of Wade & others which was calibrated to Cande & Kent [8]). Analogous examples came from sources such as Berggren & others, which include some dates for which the usual recalibration is not applicable (reasons are specific to each instance and are indicated in comments fields in table, MorphospeciesAze_TableS3; Appendix S1b includes descriptions of these fields in worksheet, DesignMorphospeciesAze_TableS3, and corresponding data in worksheet, MorphospeciesAze_TableS3).
MorphospeciesAze_TableS3DateRef_DateScale
This simply gives full names for the four time scales requiring recalibration:
BKSA95: Berggren & others, 1995 [4]
BP05: Berggren & Pearson, 2005 [7]
WPBP11(CK95): Wade & others, 2011 [2]; calibrated to Cande & Kent, 1995 [8]
WPBP11(GTS04): Wade & others, 2011 [2]; calibrated to Gradstein & others, 2004 (GTS2004) [9]
Wade & others, 2011 Datum
Details for datums relative to zonations, compiled from [2, Tables 1, 3, 4].
Zonal (marker) datums are indicated, but other datums are also included, almost all of which provide intrazonal intervals employed for calibration between time scales. Datums specific to the BKSA95 zonation are separately tabulated from those of BP05, allowing calibration between zonations BKSA95, BP05, WPBP11(CK95), and WPBP11(GTS04) (see MorphospeciesAze_TableS3DateRef_DateScale, above). The WPBP11(GTS04) zonation corresponds to GTS2004 and so allows calibration to later GTS time scales (GTS2012, GTS2016).
Additional columns provide brief indications of adjustments needed for calibration, including a small number of alternative datums resulting from revised definitions of zonations. Nomenclatural links are provided for datum-naming taxa.
Global tables:
SpeciesGroupName, GenusGroupName, ChronosPortal, ColoursClofordWebSafeByHue

Tables augmented from TimeScale Creator spreadsheet data:
TimeUnit_ReferenceUnit, TimeUnit, TSCPlanktonicForaminifersDatum, TSCPlanktonicForaminifersDatumMorphospecies
Datapack
LepChorionDB is a relational database of Lepidoptera chorion proteins. The proteinaceous lepidopteran chorions are used in our lab as a model system for unraveling the routes and rules of formation of natural protective amyloids. We therefore constructed LepChorionDB, a relational database containing all Lepidoptera chorion proteins identified to date. Lepidoptera chorion proteins can be classified into two major protein families, A and B. This classification was based on multiple sequence alignments of conserved key residues in the central domain of well-characterized silkmoth chorion proteins. These alignments were used to build hidden Markov models in order to search various databases. This work was a collaboration of the Department of Cell Biology and Biophysics, University of Athens, and the Centre of Immunology & Transplantation, Biomedical Research Foundation, Academy of Athens.
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
This module is part of the AGSO-APIRA Australian Petroleum Systems Project. Eight basin modules were examined, which covered almost the entire North West Shelf and the Petrel Sub-basin, as well as the Papuan Basin in PNG. Two relational databases were established, containing the biostratigraphic data (STRATDAT) and the reservoir, facies and hydrocarbon shows data (RESFACS). These databases were linked by application programs which allow time-series searching using geologically intelligent routines. Petroleum systems analyses were conducted on each area, with key results focussing upon the comparison of source quality and timing of generation between similar systems in different areas.
You can also purchase hard copies of Geoscience Australia data and other products at http://www.ga.gov.au/products-services/how-to-order-products/sales-centre.html
License: CC0 1.0 (https://creativecommons.org/publicdomain/zero/1.0/)
The Hotel Room Booking & Customer Orders Dataset

This is a rich, synthetic dataset meticulously designed for data analysts, data scientists, and machine learning practitioners to practice their skills on realistic e-commerce data. It models a hotel booking platform, providing a comprehensive and interconnected environment to analyze booking trends, customer behavior, and operational patterns. It is an ideal resource for building a professional portfolio project, from initial exploratory data analysis to advanced predictive modeling.
The dataset is structured as a relational database, consisting of three core tables that can be easily joined:
rooms.csv: This table serves as the hotel's inventory, containing a catalog of unique rooms with essential attributes such as room_id, type, capacity, and price_per_night.
customers.csv: This file provides a list of unique customers, offering demographic insights with columns like customer_id, name, country, and age. This data can be used to segment customers and personalize marketing strategies.
orders.csv: As the central transactional table, it links rooms and customers, capturing the details of each booking. Key columns include order_id, customer_id, room_id, booking_date, and the order_total, which can be derived from the room price and the duration of the stay.
This dataset is valuable because its structure enables a wide range of analytical projects. The relationships between tables are clearly defined, allowing you to practice complex SQL joins and data manipulation with Pandas. The presence of both categorical data (room_type, country) and numerical data (age, price) makes it versatile for different analytical approaches.
Use Cases for Data Exploration & Modeling This dataset is a versatile tool for a wide range of analytical projects:
Data Visualization: Create dashboards to analyze booking trends over time, identify the most popular room types, or visualize the geographical distribution of your customer base.
Machine Learning: Build a regression model to predict the order_total based on room type and customer characteristics. Alternatively, you could develop a model to recommend room types to customers based on their past orders.
SQL & Database Skills: Practice complex queries to find the average order value per country, or identify the most profitable room types by month.
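As a concrete starting point for these use cases, a minimal Pandas sketch joining the three files and computing the average order value per country; the file and column names follow the descriptions above:

```python
import pandas as pd

rooms = pd.read_csv("rooms.csv")
customers = pd.read_csv("customers.csv")
orders = pd.read_csv("orders.csv")

# Join the transactional table to both dimension tables.
bookings = (
    orders
    .merge(customers, on="customer_id")
    .merge(rooms, on="room_id")
)

# Average order value per country, as suggested in the use cases above.
avg_order_by_country = (
    bookings.groupby("country")["order_total"]
    .mean()
    .sort_values(ascending=False)
)
print(avg_order_by_country.head())
```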
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Dataset summary: The dataset contains underwater images taken in the Mediterranean Sea between 1980 and 2005. These images provide essential qualitative and quantitative data for assessing and monitoring the coastal marine environment by documenting organisms and their natural habitats. The image inventory assembles pictures illustrating the most representative aspects of the benthic domain. Each underwater image is temporally and spatially referenced; however, despite the availability of detailed coordinates, the location name is designated as the principal reference and primary identifier for the survey site.

The dataset includes underwater images that help describe and analyse three distinct aspects of marine ecosystem studies, collected in a GeoPackage (GPKG), a standard, open, and portable format for storing geospatial data. The GeoPackage file is organised into three layers, reflecting and classifying the content of the three main types of images:

Species: refers to the distinct groups of living organisms within the biological/taxonomic classification system.
Environment: refers to the habitat in which an organism lives, or the context where an image has been taken. The environment section describes the type of biocenosis, the zone, and the type of substrate.
Method: refers to the mode of execution of the survey, defining its primary functional objective and scope. The information specifies the actual scientific action being performed and often determines the nature of the data collected. A full inventory of the instruments, sensors, and equipment utilised, detailing their model numbers and technical specifications, is presented to ensure the reproducibility of the procedure and to validate the technical context of the recorded data.

Dataset structure: The current dataset, structured as a GeoPackage, represents an upgrade of an earlier version based on unstructured data. This upgrade reflects comprehensive enhancements spanning both the IT implementation and the precision of the taxonomic classification. Concerning the IT aspect, a dedicated re-engineering initiative was undertaken to restructure the existing data entities and implement them within a relational database, ensuring standardized data representation and greatly simplifying future updates and management. Furthermore, the relational database was used to build the GeoPackage file, constituted by three layers that are linked to the three main image typologies of the dataset.

Concurrently, the taxonomic completeness has been substantially improved. Initially, the completeness of the taxonomic data at the time of the catalogue's publication was limited. This gap has been addressed by updating the fields related to species classification. As a result of this revision, the classification accuracy has been significantly improved; the catalogue now reflects the most recent and authoritative taxonomy, fully aligning with the scientific standard provided by the World Register of Marine Species (WoRMS).

The "images" directory contains all pictures. In each table of the GeoPackage, the field named "img_code" serves as the unique identifier (or key) referencing the respective image. The GeoPackage tables include the following further information:

img_code (image identifier code)
date
photographer_name
context (the activity during which the image was taken)
mode (the gear used to take the image)
location_name (toponym)
position (specifies if the survey point is identified by coordinates or toponym)
lat
lon
depth
notes (operator's comment)
caption (additional description)

Further information is provided according to the three main image typologies:

1. Species: flora and fauna species are inventoried according to the usual hierarchical order of the systematic categories. As a result of the dataset upgrade, two fields related to species names are provided: the first refers to the species name in the older version of the dataset, and the second to the updated species name. The GeoPackage table for the species layer therefore includes the following fields:
previous_species_name: the specific nomenclature according to the earlier classification
updated_species_name: the specific name according to the updated classification
species_name_author
phylum/division
superclass
class
order
family
species

2. Environment: provides a description of the operational environment, the type of substrate, and the zone. The latter is defined according to the Benthos Committee of the International Commission for the Scientific Exploration of the Mediterranean (CIESM). The provided fields are the following:
description
substrate: refers to the kind of substratum (hard or soft)
zone: the catalogue details four zones that are accessible to scientific diving operators (supralittoral, mid-littoral, infralittoral, circumlittoral)

3. Method: refers to the methodology[...]
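A minimal sketch of reading the three layers with GeoPandas; the GeoPackage file name is an assumption, while the layer names follow the description above:

```python
import geopandas as gpd

# File name is assumed; layer names follow the dataset description.
GPKG = "underwater_images.gpkg"

for layer in ("species", "environment", "method"):
    gdf = gpd.read_file(GPKG, layer=layer)
    print(layer, len(gdf), "records")

# img_code is the key linking each record to its picture in images/.
species = gpd.read_file(GPKG, layer="species")
print(species[["img_code", "updated_species_name"]].head())
```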
The Adventure Works dataset is a comprehensive and widely used sample database provided by Microsoft for educational and testing purposes. It's designed to represent a fictional company, Adventure Works Cycles, which is a global manufacturer of bicycles and related products. The dataset is often used for learning and practicing various data management, analysis, and reporting skills.
1. Company Overview:
- Industry: Bicycle manufacturing
- Operations: Global presence with various departments such as sales, production, and human resources.

2. Data Structure:
- Tables: The dataset includes a variety of tables, typically organized into categories such as:
  - Sales: Information about sales orders, products, and customer details.
  - Production: Data on manufacturing processes, inventory, and product specifications.
  - Human Resources: Employee details, departments, and job roles.
  - Purchasing: Vendor information and purchase orders.

3. Sample Tables:
- Sales.SalesOrderHeader: Contains information about sales orders, including order dates, customer IDs, and total amounts.
- Sales.SalesOrderDetail: Details of individual items within each sales order, such as product ID, quantity, and unit price.
- Production.Product: Information about the products being manufactured, including product names, categories, and prices.
- Production.ProductCategory: Data on product categories, such as bicycles and accessories.
- Person.Person: Contains personal information about employees and contacts, including names and addresses.
- Purchasing.Vendor: Information on vendors that supply the company with materials.

4. Usage:
- Training and Education: It's widely used for teaching SQL, data analysis, and database management.
- Testing and Demonstrations: Useful for testing software features and demonstrating data-related functionalities.

5. Tools:
- The dataset is often used with Microsoft SQL Server, but it's also compatible with other relational database systems.
The Adventure Works dataset provides a rich and realistic environment for practicing a range of data-related tasks, from querying and reporting to data modeling and analysis.
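As an illustration of a typical practice query against the sample tables listed above, a sketch using pandas with a SQLAlchemy connection; the connection string is a placeholder to adapt to your SQL Server instance:

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string; point it at your AdventureWorks instance.
engine = create_engine(
    "mssql+pyodbc://user:password@localhost/AdventureWorks"
    "?driver=ODBC+Driver+17+for+SQL+Server"
)

# Revenue per product, joining the order detail lines to the product table.
revenue = pd.read_sql(
    """
    SELECT p.Name,
           SUM(d.OrderQty * d.UnitPrice) AS Revenue
    FROM Sales.SalesOrderDetail AS d
    JOIN Production.Product AS p ON p.ProductID = d.ProductID
    GROUP BY p.Name
    ORDER BY Revenue DESC;
    """,
    engine,
)
print(revenue.head())
```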
License: CC BY 3.0 (https://creativecommons.org/licenses/by/3.0/)
Timescale Creator–database customization
Features provided by Timescale Creator enhance the information which can be gleaned from the 2011 trees. These features can be provided either from functions already built into Timescale Creator, or via “in-house” programming within the database which has exploited the built-in functions to provide data and information on key issues of interest to the case study. It is this flexibility provided by the combination of Timescale Creator functions and datapacks programmed from the back-end relational database which is showcased below.
Groups
Colours were used in the original 2011 trees [1, Appendices 2, 3], and now in the Timescale Creator trees, to display eco- and morpho-groups (respectively). The Timescale Creator trees also add coloured group labels (rather than colouring the range labels as in the original trees), which allows identification of groups without recourse to the legend. These group labels are positioned on ancestor–descendant branches, but have here been programmed to display only when the group membership changes from ancestor to descendant. As a result, they have the added advantage of highlighting origins and reappearances of the selected groups or properties in a phylogenetic context. A handy use of this feature is to program it to apply to the generic assignment of morphospecies, making polyphyletic morphogenera, intentional or otherwise, easy to spot.
Lineage labels
To label range lines on the lineage tree, the Timescale Creator version has been programmed to augment each lineage code with its list of contained morphospecies; e.g., the listing appended to Lineage N1-N3 is "H. holmdelensis > G. archeocompressa > G. planocompressa > G. compressa". The morphospecies series in these listings is ordered by lowest occurrence, so the >'s denote stratigraphic succession. (The >'s do not necessarily represent ancestor–descendant relationships; of course, only a single line of descent could be expressed in such a format.) This allows the lineage and its proposed morphological succession to be grasped much more easily, including a ready comparison with the morphospecies tree.
Pop-ups
Pop-ups provide the most ample opportunity within Timescale Creator to provide access to supporting information for trees. Because pop-up windows are flexibly resizable and are coded in html, textual content has in effect few quota limitations and, in fact, can be employed to view external sources such as Internet sites and image files without the need to store them in the pop-up itself. They can also be programmed to follow a format tailored for the subject matter, as is done here.
Pop-ups for the morphospecies tree display the contents of the 2011 paper’s summary table [1, Appendix S1, Table S3], including decoding of eco- and morpho-group numbers, range statistics from the Neptune portal, and tailoring the reference list to each morphospecies. They also incorporate the ancestor [from 1, Appendix S5, worksheet aM], specify the type of cladogenetic event (all are, in fact, budding for this budding/bifurcating topology [2]), and level of support for the ancestor–descendant proposal (see § Branches). Lineages containing the morphospecies are listed, along with their morphospecies content and age range (for details, see § Linkages between morphospecies and lineage trees [3]). Also included are the binomen’s original assignation and, where available, links to portals, Chronos [4][5-7] and the World Register of Marine Species (WoRMS) [8].
Range lines
Range-line styles have been used for the Timescale Creator version of the 2011 trees to depict four levels of confidence for ranges. Apart from accepted ranges (lines of usual thickness), two less-confident records of stratigraphic occurrence are depicted: “questioned” (thin line) and “questioned-and-rare” (broken line). For extensions to ranges that are not based on stratigraphic occurrences but are hypothesized (for various reasons), a “conjectured” range is separately recognised (dotted line) to ensure that stratigraphic and hypothesized categories are not conflated. There is an option to attach age labels (in Ma) to range lines, providing the chart with an explicit deep-time positioning throughout.
Branches
Similarly to ranges, branch-line styles have been used to depict three levels of stratophenetic support for ancestry. Almost all ancestor–descendant proposals for the 2011 study are presumed to be “Well Supported” (correspondence between detailed stratigraphic sequences and plausible phyletic series; drawn as a broken line). A small number have been categorised as less or better supported than the usual: “Not Well Supported” (only broad correspondence between stratigraphic order and suggestive phyletic series; drawn as a dotted line); or “Strongly Supported” (detailed morphometric–stratigraphic sequences from ancestor to descendant; continuous line).
Linkages between morphospecies and lineage trees
Many range points of the lineages of the 2011 study are herein directly linked to those of included morphospecies: not quite half of start dates and almost all of end dates. Brief details of this linkage are displayed in the “Stratigraphic Range (continued)” section of the pop-up, where the linkage will usually result in the same precalibrated Ma value between lineage and morphospecies range points, but these values will differ where there has been a correction or amendment of the original Ma value. The reason for choosing the morphospecies range point is usually briefly indicated. Where the original Ma value of the lineage range point is retained and not directly linked to a morphospecies point, the morphospecies and its time scale that are employed nonetheless for calibration are indicated.
Pop-ups are also employed to more easily appreciate the linkages between morphospecies and lineages, following from the morphospecies content of lineages. These are displayed both in terms of the lineages in which a morphospecies occurs and in terms of the morphospecies included in a lineage, along with other information to help track these interrelationships.
License: Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0)
📑 The structure of the online_shop dataset consists of interconnected tables that simulate a real-world e-commerce platform. Each table represents a key aspect of the business, such as products, orders, customers, suppliers, and reviews. Below is a detailed breakdown of each table and its columns:
orders table:
- order_id: A unique identifier for each order.
- order_date: The date when the order was placed.
- customer_id: A reference to the customer who placed the order (linked to the customers table).
- total_price: The total cost of the order, calculated as the sum of all items in the order.

customers table:
- customer_id: A unique identifier for each customer.
- first_name: The customer's first name.
- last_name: The customer's last name.
- address: The address of the customer.
- email: The email address of the customer (unique for each customer).
- phone_number: The phone number of the customer.

products table:
- product_id: A unique identifier for each product.
- product_name: The name of the product.
- category: The category to which the product belongs (e.g., Electronics, Home & Kitchen).
- price: The price of the product.
- supplier_id: A reference to the supplier providing the product (linked to the suppliers table).

order_items table:
- order_item_id: A unique identifier for each item in an order.
- order_id: A reference to the order containing the item (linked to the orders table).
- product_id: A reference to the product being ordered (linked to the products table).
- quantity: The quantity of the product ordered.
- price_at_purchase: The price of the product at the time of the order.

suppliers table:
- supplier_id: A unique identifier for each supplier.
- supplier_name: The name of the supplier.
- contact_name: The name of the contact person at the supplier.
- address: The address of the supplier.
- phone_number: The phone number of the supplier.
- email: The email address of the supplier.

reviews table:
- review_id: A unique identifier for each product review.
- product_id: A reference to the product being reviewed (linked to the products table).
- customer_id: A reference to the customer who wrote the review (linked to the customers table).
- rating: The rating given to the product (1-5, where 5 is the best).
- review_text: The text content of the review.
- review_date: The date when the review was written.

payments table:
- payment_id: A unique identifier for each payment.
- order_id: A reference to the order being paid for (linked to the orders table).
- payment_method: The method of payment (e.g., Credit Card, PayPal).
- payment_date: The date when the payment was made.
- amount: The amount of the payment.
- transaction_status: The status of the payment (e.g., Pending, Completed, Failed).

shipments table:
- shipment_id: A unique identifier for each shipment.
- order_id: A reference to the order being shipped (linked to the orders table).
- shipment_date: The date when the shipment was dispatched.
- carrier: The company responsible for delivering the shipment.
- tracking_number: The tracking number for the shipment.
- delivery_date: The date when the shipment was delivered (if applicable).
- shipment_status: The status of the shipment (e.g., Pending, Shipped, Delivered, Cancelled).

This dataset provides a comprehensive simulation of an e-commerce platform, covering everything from customer orders to supplier relationships, payments, shipments, and customer reviews. It is an excellent resource for practicing SQL, understanding relational databases, or performing data analysis and machine learning tasks.
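Since the dataset is aimed at SQL practice, here is a minimal sketch of one such exercise, assuming the tables above have been loaded into a local SQLite file (the file name is an assumption):

```python
import sqlite3

conn = sqlite3.connect("online_shop.db")  # file name is an assumption

# Average review rating and review count per product category.
query = """
SELECT p.category,
       COUNT(r.review_id) AS n_reviews,
       AVG(r.rating)      AS avg_rating
FROM reviews AS r
JOIN products AS p ON p.product_id = r.product_id
GROUP BY p.category
ORDER BY avg_rating DESC;
"""
for category, n_reviews, avg_rating in conn.execute(query):
    print(f"{category}: {avg_rating:.2f} over {n_reviews} reviews")
conn.close()
```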
A centralized sequence database and community resource for Tribolium genetics, genomics and developmental biology, containing genomic sequence scaffolds mapped to 10 linkage groups, genetic linkage maps, the official gene set, Reference Sequences from NCBI (RefSeq), predicted gene models, ESTs and whole-genome tiling array data representing several developmental stages. The current version of Beetlebase is built on the Tribolium castaneum 3.0 Assembly (Tcas 3.0) released by the Human Genome Sequencing Center at the Baylor College of Medicine. The database is constructed using the upgraded Generic Model Organism Database (GMOD) modules. The genomic data is stored in a PostgreSQL relational database using the Chado schema and visualized as tracks in GBrowse. The genetic map is visualized using the comparative genetic map viewer CMAP. To enhance search capabilities, the BLAST search tool has been integrated with the GMOD tools. Tribolium castaneum is a highly sophisticated genetic model organism among higher eukaryotes. As a member of a primitive order of holometabolous insects, Coleoptera, Tribolium is in a key phylogenetic position for understanding the genetic innovations that accompanied the evolution of higher forms with more complex development. Coleoptera is also the largest and most species-diverse of all eukaryotic orders, and Tribolium offers the only genetic model for the profusion of medically and economically important species therein. The genome sequences may be downloaded.