46 datasets found
  1. SQL Practice

    • dune.com
    Updated Feb 3, 2023
    Cite
    wuwe1 (2023). SQL Practice [Dataset]. https://dune.com/discover/content/relevant?q=author:wuwe1&resource-type=queries
    Explore at:
    Dataset updated
    Feb 3, 2023
    Authors
    wuwe1
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Blockchain data query: SQL Practice

  2. 🎓 365DS Practice Exams • People Analytics Dataset

    • kaggle.com
    zip
    Updated May 20, 2025
    Cite
    Ísis Santos Costa (2025). 🎓 365DS Practice Exams • People Analytics Dataset [Dataset]. https://www.kaggle.com/datasets/isissantoscosta/365ds-practice-exams-people-analytics-dataset
    Explore at:
    Available download formats: zip (61775349 bytes)
    Dataset updated
    May 20, 2025
    Authors
    Ísis Santos Costa
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0) https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    This dataset was uploaded to Kaggle on the occasion of solving the questions of the 365 Data Science • Practice Exams: SQL curriculum, a set of free resources designed to help test and elevate data science skills. It is a synthetic, relational collection of data structured to simulate common employee and organizational data scenarios, ideal for practicing SQL queries and data analysis skills in a People Analytics context.

    The dataset contains the following tables:

    • departments.csv: List of all company departments.
    • dept_emp.csv: Historical and current assignments of employees to departments.
    • dept_manager.csv: Historical and current assignments of employees as department managers.
    • employees.csv: Core employee demographic information.
    • employees.db: A SQLite database containing all the relational tables from the CSV files.
    • salaries.csv: Historical salary records for employees.
    • titles.csv: Historical job titles held by employees.

    Usage

    The dataset is ideal for practicing SQL queries and data analysis skills in a People Analytics context. It suits both general data analytics and time series analysis.

    A practical application is presented in the 🎓 365DS Practice Exams • SQL notebook, which covers in detail the answers to the questions of SQL Practice Exams 1, 2, and 3 on the 365DS platform, illustrating in particular the usage and value of SQL procedures and functions.

    Acknowledgements & Data Origin

    This dataset has a rich lineage, originating from academic research and evolving through various formats to its current relational structure:

    Original Authors

    The foundational dataset was authored by Prof. Dr. Fusheng Wang 🔗 (then a PhD student at the University of California, Los Angeles - UCLA) and his advisor, Prof. Dr. Carlo Zaniolo 🔗 (UCLA). This work is primarily described in their paper:

    Relational Conversion

    It was originally distributed as an .xml file. Giuseppe Maxia (known as @datacharmer on GitHub🔗 and LinkedIn🔗, as well as here on Kaggle) converted it into its relational form and subsequently distributed it as a .sql file, making it accessible for relational database use.

    Kaggle Upload

    This .sql version was then loaded to Kaggle as the « Employees Dataset » by Mirza Huzaifa🔗 on February 5th, 2023.

  3. Bike Store Relational Database | SQL

    • kaggle.com
    zip
    Updated Aug 21, 2023
    Cite
    Dillon Myrick (2023). Bike Store Relational Database | SQL [Dataset]. https://www.kaggle.com/datasets/dillonmyrick/bike-store-sample-database
    Explore at:
    Available download formats: zip (94412 bytes)
    Dataset updated
    Aug 21, 2023
    Authors
    Dillon Myrick
    Description

    This is the sample database from sqlservertutorial.net. This is a great dataset for learning SQL and practicing querying relational databases.

    Database Diagram:

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F4146319%2Fc5838eb006bab3938ad94de02f58c6c1%2FSQL-Server-Sample-Database.png?generation=1692609884383007&alt=media

    Terms of Use

    The sample database is copyrighted and cannot be used for commercial purposes. For example, it cannot be used for the following purposes, among others:
    • Selling
    • Including in paid courses

  4. SQL Databases for Students and Educators

    • zenodo.org
    • data-staging.niaid.nih.gov
    • +1 more
    bin, html
    Updated Oct 28, 2020
    Cite
    Mauricio Vargas Sepúlveda; Mauricio Vargas Sepúlveda (2020). SQL Databases for Students and Educators [Dataset]. http://doi.org/10.5281/zenodo.4136985
    Explore at:
    Available download formats: bin, html
    Dataset updated
    Oct 28, 2020
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Mauricio Vargas Sepúlveda; Mauricio Vargas Sepúlveda
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Publicly accessible databases often impose query limits or require registration. Even though I maintain public, limit-free APIs, I never wanted to host a public database, because I tend to think that connection strings are a problem for the user.

    I’ve decided to host several light/medium-size databases on PostgreSQL, MySQL and SQL Server backends (in strict descending order of preference!).

    Why 3 database backends? I think there are a ton of small edge cases when moving between DB backends, so testing extensively against live databases is quite valuable. With this resource you can benchmark speed, compression, and DDL types.

    Please send me a tweet if you need the connection strings for your lectures or workshops. My Twitter username is @pachamaltese. See the SQL dumps in each section to get the data locally.

  5. SQL Case Study for Data Analysts

    • kaggle.com
    zip
    Updated Jan 29, 2025
    Cite
    ShravyaShetty1 (2025). SQL Case Study for Data Analysts [Dataset]. https://www.kaggle.com/datasets/shravyashetty1/sql-basic-case-study
    Explore at:
    Available download formats: zip (59519 bytes)
    Dataset updated
    Jan 29, 2025
    Authors
    ShravyaShetty1
    Description

    This dataset is a practical SQL case study designed for learners who are looking to enhance their SQL skills in analyzing sales, products, and marketing data. It contains several SQL queries related to a simulated business database for product sales, marketing expenses, and location data. The database consists of three main tables: Fact, Product, and Location.

    Objective of the Case Study: The purpose of this case study is to provide learners with a variety of practical SQL exercises that involve real-world business problems. The queries explore topics such as:

    • Aggregating data (e.g., sum, count, average)
    • Filtering and sorting data
    • Grouping and joining multiple tables
    • Using SQL functions like AVG(), COUNT(), SUM(), and MIN/MAX()
    • Handling advanced SQL features such as row numbering, transactions, and stored procedures
  6. Simple Employee Dataset for beginners

    • kaggle.com
    zip
    Updated Aug 8, 2024
    Cite
    Sahar Jamal (2024). Simple Employee Dataset for beginners [Dataset]. https://www.kaggle.com/datasets/saharsyed/simple-employee-dataset-for-beginners/data
    Explore at:
    Available download formats: zip (514 bytes)
    Dataset updated
    Aug 8, 2024
    Authors
    Sahar Jamal
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset has been created to practice simple SQL queries. For example:
    • Find the average salary of each department.
    • Find the employee with the highest salary.
    • Find employees with a salary between 5000 and 58000.
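
    The three example queries in the description can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name employees, its columns (name, department, salary) and the four rows are assumptions made up here, not the dataset's actual schema or contents.

```python
import sqlite3

# Hypothetical schema and rows; the real dataset's columns may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ava", "Sales", 4000),
        ("Ben", "Sales", 6000),
        ("Cy", "IT", 58000),
        ("Dee", "IT", 62000),
    ],
)

# Average salary of each department.
avg_by_dept = conn.execute(
    "SELECT department, AVG(salary) FROM employees "
    "GROUP BY department ORDER BY department"
).fetchall()

# Employee with the highest salary.
top_earner = conn.execute(
    "SELECT name, salary FROM employees ORDER BY salary DESC LIMIT 1"
).fetchone()

# Employees whose salary lies between 5000 and 58000 (BETWEEN is inclusive).
in_range = conn.execute(
    "SELECT name FROM employees "
    "WHERE salary BETWEEN 5000 AND 58000 ORDER BY name"
).fetchall()

print(avg_by_dept)
print(top_earner)
print(in_range)
```

    Note that BETWEEN includes both endpoints, so an employee earning exactly 58000 is part of the result.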

  7. w3school SQL practice Beginner

    • kaggle.com
    zip
    Updated Jan 10, 2023
    Cite
    Luis Fernandez (2023). w3school SQL practice Beginner [Dataset]. https://www.kaggle.com/datasets/analytic4pinguino/w3school-sql-practice-beginner
    Explore at:
    Available download formats: zip (2884 bytes)
    Dataset updated
    Jan 10, 2023
    Authors
    Luis Fernandez
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This is a database that collects all of the tables used in the following exercises:

    • 33 SQL exercises from the page https://www.w3resource.com/sql-exercises/sql-retrieve-from-table.php
    • 12 exercises on Boolean and Relational Operators from the page https://www.w3resource.com/sql-exercises/sql-boolean-operators.php
    • 22 exercises on Wildcard and Special operators from the page https://www.w3resource.com/sql-exercises/sql-wildcard-
    • 25 exercises on Aggregate Functions from the page https://www.w3resource.com/sql-exercises/sql-aggregate-functions.php
    • 10 exercises on Formatting query output from the page https://www.w3resource.com/sql-exercises/sql-fromatting-output-exercises.php
    • 8 exercises on Query on Multiple Tables from the page https://www.w3resource.com/sql-exercises/sql-exercises-quering-on-multiple-table.php
    • 29 exercises on SQL JOINS from the page https://www.w3resource.com/sql-exercises/sql-joins-exercises.php

    TOTAL: 139 exercises for now. This database is updated as I need it, but I think that it is complete.

  8. Nvidia Database

    • kaggle.com
    zip
    Updated Jan 30, 2025
    Cite
    Ajay Tom (2025). Nvidia Database [Dataset]. https://www.kaggle.com/datasets/ajayt0m/nvidia-database
    Explore at:
    Available download formats: zip (8712 bytes)
    Dataset updated
    Jan 30, 2025
    Authors
    Ajay Tom
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This is a beginner-friendly SQLite database designed to help users practice SQL and relational database concepts. The dataset represents a basic business model inspired by NVIDIA and includes interconnected tables covering essential aspects like products, customers, sales, suppliers, employees, and projects. It's perfect for anyone new to SQL or data analytics who wants to learn and experiment with structured data.

    Tables and Their Contents:

    Products:

    Includes details of 15 products (e.g., GPUs, AI accelerators). Attributes: product_id, product_name, category, release_date, price.

    Customers:

    Lists 20 fictional customers with their industry and contact information. Attributes: customer_id, customer_name, industry, contact_email, contact_phone.

    Sales:

    Contains 100 sales records tied to products and customers. Attributes: sale_id, product_id, customer_id, sale_date, region, quantity_sold, revenue.

    Suppliers:

    Features 50 suppliers and the materials they provide. Attributes: supplier_id, supplier_name, material_supplied, contact_email.

    Supply Chain:

    Tracks materials supplied to produce products, proportional to sales. Attributes: supply_chain_id, supplier_id, product_id, supply_date, quantity_supplied.

    Departments:

    Lists 5 departments within the business. Attributes: department_id, department_name, location.

    Employees:

    Contains data on 30 employees and their roles in different departments. Attributes: employee_id, first_name, last_name, department_id, hire_date, salary.

    Projects:

    Describes 10 projects handled by different departments. Attributes: project_id, project_name, department_id, start_date, end_date, budget.

    Why Use This Dataset?

    • Perfect for Beginners: The dataset is simple and easy to understand.
    • Interconnected Tables: Provides a basic introduction to relational database concepts like joins and foreign keys.
    • SQL Practice: Run basic queries, filter data, and perform simple aggregations or calculations.
    • Learning Tool: Great for small projects and understanding business datasets.

    Potential Use Cases:

    • Practice SQL queries (SELECT, INSERT, UPDATE, DELETE, JOIN).
    • Understand how to design and query relational databases.
    • Analyze basic sales and supply chain data for patterns and trends.
    • Learn how to use databases in analytics tools like Excel, Power BI, or Tableau.

    Data Size:

    Number of Tables: 8
    Total Rows: Around 230 across all tables, ensuring quick queries and easy exploration.
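
    As a rough illustration of the join and aggregation practice this dataset is meant for, the sketch below rebuilds two of the described tables in an in-memory SQLite database and totals revenue per product. The column names follow the attribute lists above; the table names products and sales, the column types, and every row are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Columns follow the dataset description; types and rows are made up.
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        product_name TEXT,
        category TEXT,
        release_date TEXT,
        price REAL
    );
    CREATE TABLE sales (
        sale_id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES products(product_id),
        customer_id INTEGER,
        sale_date TEXT,
        region TEXT,
        quantity_sold INTEGER,
        revenue REAL
    );
    INSERT INTO products VALUES
        (1, 'GPU A', 'GPU', '2023-01-01', 500.0),
        (2, 'Accelerator B', 'AI accelerator', '2024-01-01', 2000.0);
    INSERT INTO sales VALUES
        (1, 1, 10, '2024-02-01', 'EMEA', 2, 1000.0),
        (2, 1, 11, '2024-03-01', 'APAC', 1, 500.0),
        (3, 2, 10, '2024-03-15', 'EMEA', 1, 2000.0);
    """
)

# Join each sale to its product via the foreign key, then aggregate per product.
revenue_by_product = conn.execute(
    """
    SELECT p.product_name,
           SUM(s.quantity_sold) AS units_sold,
           SUM(s.revenue) AS total_revenue
    FROM sales AS s
    JOIN products AS p ON p.product_id = s.product_id
    GROUP BY p.product_id, p.product_name
    ORDER BY total_revenue DESC
    """
).fetchall()

for row in revenue_by_product:
    print(row)
```

    The same pattern extends to the other foreign keys in the description, for example joining employees or projects to departments on department_id.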

  9. Supply Chain Management SQL Case Study

    • kaggle.com
    zip
    Updated Jan 21, 2024
    Cite
    Sreelakshmi Sivan (2024). Supply Chain Management SQL Case Study [Dataset]. https://www.kaggle.com/datasets/sreelakshmisivan/supply-chain-management-sql-case-study
    Explore at:
    Available download formats: zip (2830 bytes)
    Dataset updated
    Jan 21, 2024
    Authors
    Sreelakshmi Sivan
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Sreelakshmi Sivan

    Released under MIT

    Contents

  10. Additional file 1: of Examining database persistence of ISO/EN 13606...

    • springernature.figshare.com
    • figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Ricardo Sánchez-de-Madariaga; Adolfo Muñoz; Raimundo Lozano-Rubí; Pablo Serrano-Balazote; Antonio Castro; Oscar Moreno; Mario Pascual (2023). Additional file 1: of Examining database persistence of ISO/EN 13606 standardized electronic health record extracts: relational vs. NoSQL approaches [Dataset]. http://doi.org/10.6084/m9.figshare.c.3858004_D1.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    Ricardo Sánchez-de-Madariaga; Adolfo Muñoz; Raimundo Lozano-Rubí; Pablo Serrano-Balazote; Antonio Castro; Oscar Moreno; Mario Pascual
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    SQL program. Program written in SQL performing the six queries on the MySQL database. (SQL 15.3 kb)

  11. Example code list definition in csv format.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    David A. Springate; Rosa Parisi; Ivan Olier; David Reeves; Evangelos Kontopantelis (2023). Example code list definition in csv format. [Dataset]. http://doi.org/10.1371/journal.pone.0171784.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS http://plos.org/
    Authors
    David A. Springate; Rosa Parisi; Ivan Olier; David Reeves; Evangelos Kontopantelis
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Example code list definition in csv format.

  12. Available functions in rEHR.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    David A. Springate; Rosa Parisi; Ivan Olier; David Reeves; Evangelos Kontopantelis (2023). Available functions in rEHR. [Dataset]. http://doi.org/10.1371/journal.pone.0171784.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS http://plos.org/
    Authors
    David A. Springate; Rosa Parisi; Ivan Olier; David Reeves; Evangelos Kontopantelis
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Available functions in rEHR.

  13. OpenAIRE Graph Training for Scientometrics Research

    • data.europa.eu
    unknown
    Updated May 7, 2025
    Cite
    Zenodo (2025). OpenAIRE Graph Training for Scientometrics Research [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-13981535?locale=no
    Explore at:
    Available download formats: unknown (4694366)
    Dataset updated
    May 7, 2025
    Dataset authored and provided by
    Zenodo http://zenodo.org/
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Presentation for a hands-on training session designed to help participants learn or refine their skills in analysing OpenAIRE Graph data from the Google Cloud with BigQuery. The workshop lasted 4 hours and alternated between presentations and hands-on practice with guidance from trainers. The training covered:
    • Introduction to Google Cloud and BigQuery
    • Introduction to the OpenAIRE Graph on BigQuery
    • Gentle introduction to SQL
    • Simple queries walkthrough and exercises
    • Advanced queries (e.g., with JOINS and BigQuery functions) walkthrough and exercises
    • Data takeout + Python notebooks on Google BigQuery

  14. IMDB Movies Analysis - SQL

    • kaggle.com
    zip
    Updated Feb 21, 2023
    Cite
    Gaurav B R (2023). IMDB Movies Analysis - SQL [Dataset]. https://www.kaggle.com/datasets/gauravbr/imdb-movies-data-erd
    Explore at:
    Available download formats: zip (3818401 bytes)
    Dataset updated
    Feb 21, 2023
    Authors
    Gaurav B R
    Description

    SQL IMDB Movies Analysis for RSVP (Film Production Company)

    RSVP Movies is an Indian film production company which has produced many super-hit movies. They have usually released movies for the Indian audience but for their next project, they are planning to release a movie for the global audience in 2022.

    The production company wants to plan their every move analytically based on data. We took the last three years of IMDB movie data and analysed it using SQL, drawing meaningful insights that could help them start their new project.

    For our convenience, the entire analytics process has been divided into four segments, where each segment leads to significant insights from different combinations of tables. The questions in each segment with business objectives are written in the script given below. We have written the solution code below every question.

  15. Annual Trends of Entries into Out-of-Home Care, 2010-2022

    • splitgraph.com
    • healthdata.gov
    • +4 more
    Updated May 17, 2024
    Cite
    wa-gov (2024). Annual Trends of Entries into Out-of-Home Care, 2010-2022 [Dataset]. https://www.splitgraph.com/wa-gov/annual-trends-of-entries-into-outofhome-care-jq8n-5me2
    Explore at:
    Available download formats: json, application/vnd.splitgraph.image, application/openapi+json
    Dataset updated
    May 17, 2024
    Authors
    wa-gov
    Description

    These data are related to DCYF’s Office of Innovation, Alignment, and Accountability (OIAA) prevention dashboards, published to support the agency’s efforts to prevent child maltreatment. Those dashboards can be found here: https://www.dcyf.wa.gov/practice/oiaa/reports/prevention-dashboard

    Much of the data requested by the Strengthen Families Locally communities to inform their planning, and thus contained in these initial dashboards and datasets, are what we know about children entering out-of-home care (OOH care) – age distribution, counts, rates, trends over time, and race/ethnicity. In 2022, about 3,370 children entered out of home care statewide, a record low for Washington State.

    The prevention dashboards and datasets also include descriptive data on children in Child Protection Services (CPS) intakes – rates of intakes “screened-in” for a CPS response, as well as the types of referents referring to CPS. In 2022, DCYF received CPS intakes involving over 89,000 children statewide, and 46,000 total children in intakes screened in for a CPS response.

    Some of the data focus on children aged 0 to 1 (or birth to just under 2 years old). This group of children enters out-of-home care at a high rate, and the Strengthen Families Locally communities have identified that early intervention with this group of children and their families can be especially impactful.

    OIAA expects to update these dashboards and datasets annually. In addition, we will be working to develop additional dashboards to support other related DCYF prevention efforts.

    Splitgraph serves as an HTTP API that lets you run SQL queries directly on this data to power Web applications.

    See the Splitgraph documentation for more information.

  16. Watershed Protection Fee Credit Status

    • splitgraph.com
    • opendata.howardcountymd.gov
    • +1 more
    Updated May 5, 2015
    Cite
    opendata-howardcountymd-gov (2015). Watershed Protection Fee Credit Status [Dataset]. https://www.splitgraph.com/opendata-howardcountymd-gov/watershed-protection-fee-credit-status-brpx-j859
    Explore at:
    Available download formats: application/vnd.splitgraph.image, application/openapi+json, json
    Dataset updated
    May 5, 2015
    Authors
    opendata-howardcountymd-gov
    License

    U.S. Government Works https://www.usa.gov/government-works
    License information was derived automatically

    Description

    Residential credits of 20% against the annual Watershed Protection Fee for installation of a recognized Best Management Practice (BMP), which meets minimum treatment criteria

    Splitgraph serves as an HTTP API that lets you run SQL queries directly on this data to power Web applications.

    See the Splitgraph documentation for more information.

  17. PostgreSQL Dump of IMDB Data for JOB Workload

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Marcus, Ryan (2023). PostgreSQL Dump of IMDB Data for JOB Workload [Dataset]. http://doi.org/10.7910/DVN/2QYZBT
    Explore at:
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Marcus, Ryan
    Description

    This is a dump generated by pg_dump -Fc of the IMDb data used in the "How Good are Query Optimizers, Really?" paper. PostgreSQL compatible SQL queries and scripts to automatically create a VM with this dataset can be found here: https://git.io/imdb

  18. AdventureWorks Sample Mfg Database Tables

    • kaggle.com
    zip
    Updated Feb 24, 2023
    Cite
    Michael Brown (2023). AdventureWorks Sample Mfg Database Tables [Dataset]. https://www.kaggle.com/datasets/universalanalyst/adventureworks-sample-mfg-database-tables
    Explore at:
    Available download formats: zip (3689556 bytes)
    Dataset updated
    Feb 24, 2023
    Authors
    Michael Brown
    Description

    In order to practice writing SQL queries in a semi-realistic database, I discovered and imported Microsoft's AdventureWorks sample database into Microsoft SQL Server Express. The Adventure Works [fictitious] company represents a bicycle manufacturer that sells bicycles and accessories to global markets. Queries were written for developing and testing a Tableau dashboard.

    The dataset presented here represents a fraction of the entire manufacturing relational database. Tables within the dataset include product, purchasing, work order, and transaction data.

    The full database sample can be found on the Microsoft SQL Docs website: https://learn.microsoft.com/en-us/sql/samples/ and additionally on GitHub: https://github.com/microsoft/sql-server-samples

  19. Feed the Future Ethiopia Value Chain Activity FY 2018 Annual Performance...

    • splitgraph.com
    Updated Feb 14, 2022
    Cite
    usaid-gov (2022). Feed the Future Ethiopia Value Chain Activity FY 2018 Annual Performance Monitoring Survey - Chickpea [Dataset]. https://www.splitgraph.com/usaid-gov/feed-the-future-ethiopia-value-chain-activity-fy-4j8j-zd2d
    Explore at:
    Available download formats: application/vnd.splitgraph.image, json, application/openapi+json
    Dataset updated
    Feb 14, 2022
    Authors
    usaid-gov
    Area covered
    Ethiopia
    Description

    Input and labor costs, production, technology and good agricultural practice application, and sales variables related to chickpea during the December 2017-April 2018 production season. Data is long format.

    Splitgraph serves as an HTTP API that lets you run SQL queries directly on this data to power Web applications.

    See the Splitgraph documentation for more information.

  20. Table of rcprd functions.

    • plos.figshare.com
    xls
    Updated Aug 19, 2025
    Cite
    Alexander Pate; Rosa Parisi; Evangelos Kontopantelis; Matthew Sperrin (2025). Table of rcprd functions. [Dataset]. http://doi.org/10.1371/journal.pone.0327229.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Aug 19, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Alexander Pate; Rosa Parisi; Evangelos Kontopantelis; Matthew Sperrin
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Clinical Practice Research Datalink (CPRD) is a large and widely used resource of electronic health records from the UK, linking primary care data to hospital data, death registration data, cancer registry data, deprivation data and mental health services data. Extraction and management of CPRD data is a computationally demanding process and requires a significant amount of work, in particular when using R. The rcprd package simplifies the process of extracting and processing CPRD data in order to build datasets ready for statistical analysis. Raw CPRD data is provided in thousands of .txt files, making querying this data cumbersome and inefficient. rcprd saves the relevant information into an SQLite database stored on the hard drive which can then be queried efficiently to extract required information about individuals. rcprd follows a four-stage process: 1) Definition of a cohort, 2) Read in medical/prescription data and save into an SQLite database, 3) Query this SQLite database for specific codes and tests to create variables for each individual in the cohort, 4) Combine extracted variables into a dataset ready for statistical analysis. Functions are available to extract common variable types (e.g., history of a condition, or time until an event occurs, relative to an index date), and more general functions for database queries, allowing users to define their own variables for extraction. The entire process can be done from within R, with no knowledge of SQL required. This manuscript showcases the functionality of rcprd by running through an example using simulated CPRD Aurum data. rcprd will reduce the duplication of time and effort among those using CPRD data for research, allowing more time to be focused on other aspects of research projects.

Cite
wuwe1 (2023). SQL Practice [Dataset]. https://dune.com/discover/content/relevant?q=author:wuwe1&resource-type=queries

SQL Practice

Explore at:
117 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Feb 3, 2023
Authors
wuwe1
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Blockchain data query: SQL Practice
