5 datasets found
  1. ⚙️ SQL Tutorial Exercise Data

    • kaggle.com
    zip
    Updated Oct 2, 2023
    Cite
    mexwell (2023). ⚙️ SQL Tutorial Exercise Data [Dataset]. https://www.kaggle.com/datasets/mexwell/sql-tutorial-exercise-data
    Explore at:
    Available download formats: zip (3701453 bytes)
    Dataset updated
    Oct 2, 2023
    Authors
    mexwell
    Description

    This dataset was created to be the base of the data.world SQL tutorial exercises. The data was generated using Synthea, a synthetic patient generator that models the medical history of synthetic patients. Synthea's mission is to output high-quality synthetic (realistic but not real) patient data and associated health records covering every aspect of healthcare. The resulting data is free from cost, privacy, and security restrictions, enabling research with Health IT data that is otherwise legally or practically unavailable. De-identified real data still presents a challenge in the medical field because some people excel at re-identifying such data. For that reason, the average medical center will not share its patient data. Most governmental data is at the hospital level; NHANES data is an exception.

    You can read Synthea's first academic paper here.

    Original Data

    Acknowledgement

    Photo by Rubaitul Azad on Unsplash

  2. synthetic_text_to_sql

    • huggingface.co
    Cite
    Gretel.ai, synthetic_text_to_sql [Dataset]. https://huggingface.co/datasets/gretelai/synthetic_text_to_sql
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset provided by
    Gretel.ai
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    gretelai/synthetic_text_to_sql is a rich dataset of high quality synthetic Text-to-SQL samples, designed and generated using Gretel Navigator, and released under Apache 2.0. Please see our release blogpost for more details. The dataset includes:

    • 105,851 records partitioned into 100,000 train and 5,851 test records
    • ~23M total tokens, including ~12M SQL tokens
    • Coverage across 100 distinct…

    See the full description on the dataset page: https://huggingface.co/datasets/gretelai/synthetic_text_to_sql.
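
As a quick sanity check on the figures quoted above, the following minimal sketch confirms that the train and test counts add up to the stated total and works out the share of records held out for testing. The constant names are illustrative only and are not part of the dataset; the approximate token counts are taken at face value from the description.

```python
# Record counts as quoted in the dataset description.
TRAIN_RECORDS = 100_000
TEST_RECORDS = 5_851
TOTAL_RECORDS = 105_851

# Approximate token counts as quoted ("~23M total", "~12M SQL").
TOTAL_TOKENS_APPROX = 23_000_000
SQL_TOKENS_APPROX = 12_000_000

# The two partitions should account for every record.
partitions_match = TRAIN_RECORDS + TEST_RECORDS == TOTAL_RECORDS

# Fraction of records in the test partition, and rough share of SQL tokens.
test_share = TEST_RECORDS / TOTAL_RECORDS
sql_token_share = SQL_TOKENS_APPROX / TOTAL_TOKENS_APPROX

print(f"partitions match total: {partitions_match}")
print(f"test share of records: {test_share:.2%}")
print(f"approximate SQL share of tokens: {sql_token_share:.0%}")
```

The token share is only as precise as the rounded figures it is computed from.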

  3. Comparison of Database Documentation Tools

    • blog.devart.com
    html
    Updated May 13, 2024
    Cite
    Devart (2024). Comparison of Database Documentation Tools [Dataset]. https://blog.devart.com/best-database-documentation-tools.html
    Explore at:
    Available download formats: html
    Dataset updated
    May 13, 2024
    Dataset authored and provided by
    Devart
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    Tool/Criteria, Supported DBMS, Pricing starts from, Documentation format, Ease of use (max. 4), Customization options (max. 4)
    Description

    A comparison table of popular database documentation tools, including supported DBMS, documentation formats, ease of use, customization options, and pricing.

  4. E-Commerce Data

    • kaggle.com
    zip
    Updated Aug 17, 2017
    Cite
    Carrie (2017). E-Commerce Data [Dataset]. https://www.kaggle.com/datasets/carrie1/ecommerce-data
    Explore at:
    Available download formats: zip (7548686 bytes)
    Dataset updated
    Aug 17, 2017
    Authors
    Carrie
    Description

    Context

    Typically, e-commerce datasets are proprietary and consequently hard to find among publicly available data. However, the UCI Machine Learning Repository has made available this dataset, which contains actual transactions from 2010 and 2011. The dataset is maintained on their site, where it can be found under the title "Online Retail".

    Content

    "This is a transnational data set which contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retail. The company mainly sells unique all-occasion gifts. Many customers of the company are wholesalers."
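
The date range in the quote is ambiguous on its face. A minimal sketch, assuming the dates are written day-first (DD/MM/YYYY), which is the usual convention for a UK source and fits the "2010 and 2011" coverage mentioned above, shows how to parse them explicitly rather than relying on a parser's default. The helper name `parse_day_first` is hypothetical.

```python
from datetime import date, datetime

# Assumption: the quoted dates are day-first (DD/MM/YYYY).
DAY_FIRST_FORMAT = "%d/%m/%Y"


def parse_day_first(text: str) -> date:
    """Parse a DD/MM/YYYY string into a date, failing loudly on bad input."""
    return datetime.strptime(text, DAY_FIRST_FORMAT).date()


start = parse_day_first("01/12/2010")
end = parse_day_first("09/12/2011")

# Number of days between the first and last transaction dates.
span_days = (end - start).days

print(start.isoformat(), end.isoformat(), span_days)
```

Under this reading the data runs from 1 December 2010 to 9 December 2011. A month-first parser would silently read the same strings as January and September, so stating the format explicitly is the safer choice when loading the file.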

    Acknowledgements

    Per the UCI Machine Learning Repository, this data was made available by Dr Daqing Chen, Director: Public Analytics group. chend '@' lsbu.ac.uk, School of Engineering, London South Bank University, London SE1 0AA, UK.

    Image from stocksnap.io.

    Inspiration

    Analyses for this dataset could include time series, clustering, classification and more.

  5. Supply Chain DataSet

    • kaggle.com
    zip
    Updated Jun 1, 2023
    Cite
    Amir Motefaker (2023). Supply Chain DataSet [Dataset]. https://www.kaggle.com/datasets/amirmotefaker/supply-chain-dataset
    Explore at:
    Available download formats: zip (9340 bytes)
    Dataset updated
    Jun 1, 2023
    Authors
    Amir Motefaker
    Description

    Supply chain analytics is a valuable part of data-driven decision-making in various industries such as manufacturing, retail, healthcare, and logistics. It is the process of collecting, analyzing and interpreting data related to the movement of products and services from suppliers to customers.

