61 datasets found

Data from: Data Dictionary Template
data.tempe.gov
data-academy.tempe.gov
+8more
Updated Jun 5, 2020
Cite
City of Tempe (2020). Data Dictionary Template [Dataset]. https://data.tempe.gov/documents/f97e93ac8d324c71a35caf5a295c4c1e
Dataset updated
Jun 5, 2020
Dataset authored and provided by
City of Tempe
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data Dictionary template for Tempe Open Data.
Database Creation Description and Data Dictionaries
figshare.com
txt
Updated Aug 11, 2016
Cite
Jordan Kempker; John David Ike (2016). Database Creation Description and Data Dictionaries [Dataset]. http://doi.org/10.6084/m9.figshare.3569067.v3
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.3569067.v3
Dataset updated
Aug 11, 2016
Dataset provided by
Figshare (http://figshare.com/)
Authors
Jordan Kempker; John David Ike
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
There are several Microsoft Word documents here detailing data creation methods, along with various dictionaries describing the included and derived variables. The Database Creation Description is meant to walk a user through some of the steps detailed in the SAS code with this project. The alphabetical list of variables is intended to make some coding steps easier, since users can copy and paste variable names from the list instead of retyping them. The NIS Data Dictionary contains a general description of the dataset as well as each variable's responses.
Data from: Pesticide Data Program (PDP)
agdatacommons.nal.usda.gov
txt
Updated Dec 2, 2025
+ more versions
Cite
U.S. Department of Agriculture (USDA), Agricultural Marketing Service (AMS) (2025). Pesticide Data Program (PDP) [Dataset]. http://doi.org/10.15482/USDA.ADC/1520764
Available download formats: txt
Unique identifier
https://doi.org/10.15482/USDA.ADC/1520764
Dataset updated
Dec 2, 2025
Dataset provided by
Ag Data Commons
Authors
U.S. Department of Agriculture (USDA), Agricultural Marketing Service (AMS)
License
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Description
The Pesticide Data Program (PDP) is a national pesticide residue database program. Through cooperation with State agriculture departments and other Federal agencies, PDP manages the collection, analysis, data entry, and reporting of pesticide residues on agricultural commodities in the U.S. food supply, with an emphasis on those commodities highly consumed by infants and children.

This dataset provides information on where each tested sample was collected, where the product originated from, what type of product it was, and what residues were found on the product, for calendar years 1992 through 2023. The data can measure residues of individual compounds and classes of compounds, as well as provide information about the geographic distribution of the origin of samples, from growers, packers and distributors. The dataset also includes information on where the samples were taken, what laboratory was used to test them, and all testing procedures (by sample, so can be linked to the compound that is identified). The dataset also contains a reference variable for each compound that denotes the limit of detection for a pesticide/commodity pair (LOD variable). The metadata also includes EPA tolerance levels or action levels for each pesticide/commodity pair. The dataset will be updated on a continual basis, with a new resource data file added annually after the PDP calendar-year survey data is released.

Resources in this dataset:

Resource Title: CSV Data Dictionary for PDP. File Name: PDP_DataDictionary.csv. Resource Description: Machine-readable Comma Separated Values (CSV) format data dictionary for PDP Database Zip files. Defines variables for the sample identity and analytical results data tables/files. The ## characters in the Table and Text Data File name refer to the 2-digit year for the PDP survey, like 97 for 1997 or 01 for 2001. For details on table linking, see PDF. Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel

Resource Title: Data dictionary for Pesticide Data Program. File Name: PDP DataDictionary.pdf. Resource Description: Data dictionary for PDP Database Zip files. Resource Software Recommended: Adobe Acrobat, url: https://www.adobe.com

Resource Title: 2023 PDP Database Zip File. File Name: 2023PDPDatabase.zip
Resource Title: 2022 PDP Database Zip File. File Name: 2022PDPDatabase.zip
Resource Title: 2021 PDP Database Zip File. File Name: 2021PDPDatabase.zip
Resource Title: 2020 PDP Database Zip File. File Name: 2020PDPDatabase.zip
Resource Title: 2019 PDP Database Zip File. File Name: 2019PDPDatabase.zip
Resource Title: 2018 PDP Database Zip File. File Name: 2018PDPDatabase.zip
Resource Title: 2017 PDP Database Zip File. File Name: 2017PDPDatabase.zip
Resource Title: 2016 PDP Database Zip File. File Name: 2016PDPDatabase.zip
Resource Title: 2015 PDP Database Zip File. File Name: 2015PDPDatabase.zip
Resource Title: 2014 PDP Database Zip File. File Name: 2014PDPDatabase.zip
Resource Title: 2013 PDP Database Zip File. File Name: 2013PDPDatabase.zip
Resource Title: 2012 PDP Database Zip File. File Name: 2012PDPDatabase.zip
Resource Title: 2011 PDP Database Zip File. File Name: 2011PDPDatabase.zip
Resource Title: 2010 PDP Database Zip File. File Name: 2010PDPDatabase.zip
Resource Title: 2009 PDP Database Zip File. File Name: 2009PDPDatabase.zip
Resource Title: 2008 PDP Database Zip File. File Name: 2008PDPDatabase.zip
Resource Title: 2007 PDP Database Zip File. File Name: 2007PDPDatabase.zip
Resource Title: 2006 PDP Database Zip File. File Name: 2006PDPDatabase.zip
Resource Title: 2005 PDP Database Zip File. File Name: 2005PDPDatabase.zip
Resource Title: 2004 PDP Database Zip File. File Name: 2004PDPDatabase.zip
Resource Title: 2003 PDP Database Zip File. File Name: 2003PDPDatabase.zip
Resource Title: 2002 PDP Database Zip File. File Name: 2002PDPDatabase.zip
Resource Title: 2001 PDP Database Zip File. File Name: 2001PDPDatabase.zip
Resource Title: 2000 PDP Database Zip File. File Name: 2000PDPDatabase.zip
Resource Title: 1999 PDP Database Zip File. File Name: 1999PDPDatabase.zip
Resource Title: 1998 PDP Database Zip File. File Name: 1998PDPDatabase.zip
Resource Title: 1997 PDP Database Zip File. File Name: 1997PDPDatabase.zip
Resource Title: 1996 PDP Database Zip File. File Name: 1996PDPDatabase.zip
Resource Title: 1995 PDP Database Zip File. File Name: 1995PDPDatabase.zip
Resource Title: 1994 PDP Database Zip File. File Name: 1994PDPDatabase.zip
Resource Title: 1993 PDP Database Zip File. File Name: 1993PDPDatabase.zip
Resource Title: 1992 PDP Database Zip File. File Name: 1992PDPDatabase.zip
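The naming conventions in the PDP resource list can be captured in a small helper. The following is a minimal Python sketch, assuming the archives follow the YYYYPDPDatabase.zip pattern shown in the resource list and that the ## placeholder is the last two digits of the survey year. The function names pdp_zip_name and two_digit_year are hypothetical and are not part of any PDP tooling.

```python
FIRST_YEAR = 1992  # first survey year listed in the resources
LAST_YEAR = 2023   # last survey year listed in the resources


def pdp_zip_name(year: int) -> str:
    """Return the archive name for a survey year, e.g. 2023PDPDatabase.zip."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"no PDP archive listed for {year}")
    return f"{year}PDPDatabase.zip"


def two_digit_year(year: int) -> str:
    """Return the 2-digit year used for the ## placeholder (97 for 1997, 01 for 2001)."""
    return f"{year % 100:02d}"


# All archive names listed for this dataset, oldest first.
all_archives = [pdp_zip_name(y) for y in range(FIRST_YEAR, LAST_YEAR + 1)]
```

Zero-padding in two_digit_year matters for the years 2000 through 2009, where a plain modulo would yield a single digit.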
data dictionary
health.data.ny.gov
csv, xlsx, xml
Updated Aug 23, 2022
+ more versions
Cite
Center for Environmental Health (2022). data dictionary [Dataset]. https://health.data.ny.gov/Health/data-dictionary/3tsn-2bah
Available download formats: xlsx, csv, xml
Dataset updated
Aug 23, 2022
Authors
Center for Environmental Health
Description
This data includes the location of cooling towers registered with New York State. The data is self-reported by owners/property managers of cooling towers in service in New York State. In August 2015 the New York State Department of Health released emergency regulations requiring the owners of cooling towers to register them with New York State. In addition the regulation includes requirements: regular inspection; annual certification; obtaining and implementing a maintenance plan; record keeping; reporting of certain information; and sample collection and culture testing. All cooling towers in New York State, including New York City, need to be registered in the NYS system. Registration is done through an electronic database found at: www.ny.gov/services/register-cooling-tower-and-submit-reports. For more information, check http://www.health.ny.gov/diseases/communicable/legionellosis/, or go to the “About” tab.
DOE Legacy Management Sample Locations
catalog.data.gov
s.cnmilf.com
Updated May 2, 2025
Cite
Office of Legacy Management (2025). DOE Legacy Management Sample Locations [Dataset]. https://catalog.data.gov/dataset/doe-legacy-management-sample-locations
Dataset updated
May 2, 2025
Dataset provided by
Office of Legacy Management
Description
Each feature within this dataset is the authoritative representation of the location of a sample within the U.S. Department of Energy (DOE) Office of Legacy Management (LM) Environmental Database. The dataset includes sample locations from Puerto Rico to Alaska, with point features representing different types of sample locations such as boreholes, wells, geoprobes, etc. All sample locations are maintained within the LM Environmental Database, with feature attributes defined within the associated data dictionary.
U.S. Geological Survey National Produced Waters Geochemical Database v2.3
catalog.data.gov
catalog.newmexicowaterdata.org
+1more
Updated Nov 20, 2025
+ more versions
Cite
U.S. Geological Survey (2025). U.S. Geological Survey National Produced Waters Geochemical Database v2.3 [Dataset]. https://catalog.data.gov/dataset/u-s-geological-survey-national-produced-waters-geochemical-database-v2-3
Dataset updated
Nov 20, 2025
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Description
During hydrocarbon production, water is typically co-produced from the geologic formations producing oil and gas. Understanding the composition of these produced waters is important to help investigate the regional hydrogeology, the source of the water, the efficacy of water treatment and disposal plans, potential economic benefits of mineral commodities in the fluids, and the safety of potential sources of drinking or agricultural water. In addition to waters co-produced with hydrocarbons, geothermal development or exploration brings deep formation waters to the surface for possible sampling. This U.S. Geological Survey (USGS) Produced Waters Geochemical Database, which contains geochemical and other information for 114,943 produced water and other deep formation water samples of the United States, is a provisional, updated version of the 2002 USGS Produced Waters Database (Breit and others, 2002). In addition to the major element data presented in the original, the new database contains trace elements, isotopes, and time-series data, as well as nearly 100,000 additional samples that provide greater spatial coverage from both conventional and unconventional reservoir types, including geothermal. The database is a compilation of 40 individual databases, publications, or reports. The database was created in a manner to facilitate addition of new data and correct any compilation errors, and is expected to be updated over time with new data as provided and needed. Table 1, USGSPWDBv2.3 Data Sources.csv, shows the abbreviated ID of each input database (IDDB), the number of samples from each, and its reference. Table 2, USGSPWDBv2.3 Data Dictionary.csv, defines the 190 variables contained in the database and their descriptions. The database variables are organized first with identification and location information, followed by well descriptions, dates, rock properties, physical properties of the water, and then chemistry. 
The chemistry is organized alphabetically by elemental symbol. Each element is followed by any associated compounds (e.g. H2S is found after S). After Zr, molecules containing carbon, organic compounds, and dissolved gases follow. Isotopic data are found at the end of the dataset, just before the culling parameters.
Statistics of the sample of 6567 words from the website database.
plos.figshare.com
xls
Updated May 31, 2023
Cite
Arnaud Rey; Jean-Luc Manguin; Chloé Olivier; Sébastien Pacton; Pierre Courrieu (2023). Statistics of the sample of 6567 words from the website database. [Dataset]. http://doi.org/10.1371/journal.pone.0226647.t004
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0226647.t004
Dataset updated
May 31, 2023
Dataset provided by
PLOS ONE
Authors
Arnaud Rey; Jean-Luc Manguin; Chloé Olivier; Sébastien Pacton; Pierre Courrieu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Statistics of the sample of 6567 words from the website database.

AdventureWorks 2022 Denormalized

kaggle.com

Updated Nov 25, 2024

Cite

Bhavesh J (2024). AdventureWorks 2022 Denormalized [Dataset]. https://www.kaggle.com/datasets/bjaising/adventureworks-2022-denormalized

Explore at:

Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.

Dataset updated

Nov 25, 2024

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

Bhavesh J

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Adventure Works 2022 Denormalized dataset

How was this dataset created?

The CSV data was sourced from the existing Kaggle dataset titled "Adventure Works 2022" by Algorismus. This data was normalized and consisted of seven individual CSV files. The Sales table served as a fact table that connected to other dimensions. To consolidate all the data into a single table, it was loaded into a SQLite database and transformed accordingly. The final denormalized table was then exported as a single CSV file (delimited by | ), and the column names were updated to follow snake_case style.

DOI

doi.org/10.6084/m9.figshare.27899706

Data Dictionary

Column Name	Description
sales_order_number	Unique identifier for each sales order.
sales_order_date	The date and time when the sales order was placed. (e.g., Friday, August 25, 2017)
sales_order_date_day_of_week	The day of the week when the sales order was placed (e.g., Monday, Tuesday).
sales_order_date_month	The month when the sales order was placed (e.g., January, February).
sales_order_date_day	The day of the month when the sales order was placed (1-31).
sales_order_date_year	The year when the sales order was placed (e.g., 2022).
quantity	The number of units sold in the sales order.
unit_price	The price per unit of the product sold.
total_sales	The total sales amount for the sales order (quantity * unit price).
cost	The total cost associated with the products sold in the sales order.
product_key	Unique identifier for the product sold.
product_name	The name of the product sold.
reseller_key	Unique identifier for the reseller.
reseller_name	The name of the reseller.
reseller_business_type	The type of business of the reseller (e.g., Warehouse, Value Reseller, Specialty Bike Shop).
reseller_city	The city where the reseller is located.
reseller_state	The state where the reseller is located.
reseller_country	The country where the reseller is located.
employee_key	Unique identifier for the employee associated with the sales order.
employee_id	The ID of the employee who processed the sales order.
salesperson_fullname	The full name of the salesperson associated with the sales order.
salesperson_title	The title of the salesperson (e.g., North American Sales Manager, Sales Representative).
email_address	The email address of the salesperson.
sales_territory_key	Unique identifier for the sales territory for the actual sale. (e.g. 3)
assigned_sales_territory	List of sales_territory_key separated by comma assigned to the salesperson. (e.g., 3,4)
sales_territory_region	The region of the sales territory. US territory broken down in regions. International regions listed as country name (e.g., Northeast, France).
sales_territory_country	The country associated with the sales territory.
sales_territory_group	The group classification of the sales territory. (e.g., Europe, North America, Pacific)
target	The ...
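As an illustration of working with a pipe-delimited export such as the one described above, here is a minimal Python sketch. It assumes the file has a header row using the snake_case column names from the data dictionary. The two sample rows are invented for the example and do not come from the dataset, and the helper names read_pipe_delimited and total_matches are hypothetical.

```python
import csv
import io

# Invented sample rows; only the column names are taken from the data dictionary.
SAMPLE = (
    "sales_order_number|quantity|unit_price|total_sales\n"
    "SO-0001|2|10.50|21.00\n"
    "SO-0002|1|99.99|99.99\n"
)


def read_pipe_delimited(text: str) -> list[dict[str, str]]:
    """Parse pipe-delimited text with a header row into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="|"))


def total_matches(row: dict[str, str], tolerance: float = 0.005) -> bool:
    """Check the documented relationship total_sales = quantity * unit_price."""
    expected = float(row["quantity"]) * float(row["unit_price"])
    return abs(expected - float(row["total_sales"])) <= tolerance


rows = read_pipe_delimited(SAMPLE)
inconsistent = [r["sales_order_number"] for r in rows if not total_matches(r)]
```

For a real file, the same reader could be given an open file handle in place of the in-memory string. The tolerance is an assumption to absorb rounding in the exported values.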

New Oxford Dictionary of English, 2nd Edition
live.european-language-grid.eu
catalog.elra.info
Updated Dec 6, 2005
Cite
(2005). New Oxford Dictionary of English, 2nd Edition [Dataset]. https://live.european-language-grid.eu/catalogue/lcr/2276
Dataset updated
Dec 6, 2005
License
http://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
Description
This is Oxford University Press's most comprehensive single-volume dictionary, with 170,000 entries covering all varieties of English worldwide. The NODE data set constitutes a fully integrated range of formal data types suitable for language engineering and NLP applications. It is available in XML or SGML.

- Source dictionary data. The NODE data set includes all the information present in the New Oxford Dictionary of English itself, such as definition text, example sentences, grammatical indicators, and encyclopaedic material.
- Morphological data. Each NODE lemma (both headwords and subentries) has a full listing of all possible syntactic forms (e.g. plurals for nouns, inflections for verbs, comparatives and superlatives for adjectives), tagged to show their syntactic relationships. Each form has an IPA pronunciation. Full morphological data is also given for spelling variants (e.g. typical American variants), and a system of links enables straightforward correlation of variant forms to standard forms. The data set thus provides robust support for all look-up routines, and is equally viable for applications dealing with American and British English.
- Phrases and idioms. The NODE data set provides a rich and flexible codification of over 10,000 phrasal verbs and other multi-word phrases. It features comprehensive lexical resources enabling applications to identify a phrase not only in the form listed in the dictionary but also in a range of real-world variations, including alternative wording, variable syntactic patterns, inflected verbs, optional determiners, etc.
- Subject classification. Using a categorization scheme of 200 key domains, over 80,000 words and senses have been associated with particular subject areas, from aeronautics to zoology. As well as facilitating the extraction of subject-specific sub-lexicons, this also provides an extensive resource for document categorization and information retrieval.
- Semantic relationships. The relationships between every noun and noun sense in the dictionary are being codified using an extensive semantic taxonomy on the model of the Princeton WordNet project. (Mapping to WordNet 1.7 is supported.) This structure allows elements of the basic lexical database to function as a formal knowledge database, enabling functionality such as sense disambiguation and logical inference.

Derived from the detailed and authoritative corpus-based research of Oxford University Press's lexicographic team, the NODE data set is a powerful asset for any task dealing with real-world contemporary English usage. By integrating a number of different data types into a single structure, it creates a coherent resource which can be queried along numerous axes, allowing open-ended exploitation by many kinds of language-related applications.
Data from: US Federal LCA Commons Life Cycle Inventory Unit Process Template...
catalog.data.gov
Updated Apr 21, 2025
Cite
Agricultural Research Service (2025). US Federal LCA Commons Life Cycle Inventory Unit Process Template [Dataset]. https://catalog.data.gov/dataset/us-federal-lca-commons-life-cycle-inventory-unit-process-template-3cc7d
Dataset updated
Apr 21, 2025
Dataset provided by
Agricultural Research Service (https://www.ars.usda.gov/)
Area covered
United States
Description
An Excel template with data elements and conventions corresponding to the openLCA unit process data model. Includes LCA Commons data and metadata guidelines and definitions.

Resources in this dataset:

Resource Title: READ ME - data dictionary. File Name: lcaCommonsSubmissionGuidelines_FINAL_2014-09-22.pdf

Resource Title: US Federal LCA Commons Life Cycle Inventory Unit Process Template. File Name: FedLCA_LCI_template_blank EK 7-30-2015.xlsx. Resource Description: Instructions: This template should be used for life cycle inventory (LCI) unit process development and is associated with an openLCA plugin to import these data into an openLCA database. See www.openLCA.org to download the latest release of openLCA for free, and to access available plugins.
Data from: Environmental and Quality-Control Data for Per- and...
catalog.data.gov
data.usgs.gov
+1more
Updated Nov 27, 2025
+ more versions
Cite
U.S. Geological Survey (2025). Environmental and Quality-Control Data for Per- and Polyfluoroalkyl Substances (PFAS) Measured in Selected Rivers and Streams in Massachusetts, 2020 (ver. 2.0, May 2023) [Dataset]. https://catalog.data.gov/dataset/environmental-and-quality-control-data-for-per-and-polyfluoroalkyl-substances-pfas-measure
Dataset updated
Nov 27, 2025
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Massachusetts
Description
This data release includes concentrations of 24 per- and polyfluoroalkyl substances (PFAS) and physical properties of water-quality samples collected by the U.S. Geological Survey (USGS) at 64 selected sites in rivers and streams in Massachusetts over three rounds of sampling. The samples were collected from August to November 2020 when streamflow conditions were below normal (also considered to be base-flow conditions) at rivers and streams in urban areas that receive treated wastewater from municipal wastewater-treatment facilities, and in rural rivers and streams that are not associated with municipal wastewater discharges and may have other source inputs of PFAS. The measured physical properties include water temperature, specific conductance, pH, dissolved oxygen, and turbidity; quality-control data from blanks, replicates, laboratory control samples, and laboratory spike samples are also provided. The physical properties, along with all of the discrete water-quality PFAS data, except the quality-control data, are also available online from the U.S. Geological Survey's National Water Information System (NWIS) database (https://nwis.waterdata.usgs.gov/nwis). This data release is structured as a set of tab-delimited (.txt) files. The metadata includes descriptions of files: Site_Information.txt, Abbreviations_and_Remark_Codes.txt, and Analysis_Information.txt. This data release also includes a Data Dictionary (Data_Dictionary.txt) that is used to describe environmental sample data (Environmental_Data.txt), and Quality Control field and laboratory blank data (QC_Blanks.txt), field and laboratory replicate data (QC_Replicates.txt), and laboratory control sample and spike data (QC_Laboratory_Control_Samples_and_Spikes.txt).
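One way a tab-delimited data dictionary like this might be used is to check that every column in a data file is documented. The sketch below is a rough Python illustration under stated assumptions: the layout of Data_Dictionary.txt is not given in the description, so the column_name field and all sample values here are invented stand-ins, and the helper names are hypothetical.

```python
import csv
import io

# Invented stand-ins for Data_Dictionary.txt and Environmental_Data.txt;
# the real files may use different headers and columns.
DICTIONARY_TEXT = (
    "column_name\tdescription\n"
    "site_id\tSite identifier\n"
    "result\tMeasured value\n"
)
DATA_TEXT = (
    "site_id\tresult\tremark\n"
    "A1\t1.5\t<\n"
    "B2\t2.0\t\n"
)


def read_tab_delimited(text: str) -> list[dict[str, str]]:
    """Parse tab-delimited text with a header row into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def undocumented_columns(data_rows, dictionary_rows, name_field: str) -> list[str]:
    """Return data columns that have no entry in the data dictionary."""
    documented = {row[name_field] for row in dictionary_rows}
    columns = list(data_rows[0].keys()) if data_rows else []
    return [col for col in columns if col not in documented]


dictionary_rows = read_tab_delimited(DICTIONARY_TEXT)
data_rows = read_tab_delimited(DATA_TEXT)
missing = undocumented_columns(data_rows, dictionary_rows, "column_name")
```

In this toy input the remark column is flagged because the invented dictionary does not describe it.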
Database and Biobank of the Quebec Longitudinal Study on Nutrition and...
gaaindata.org
Updated Mar 16, 2021
Cite
Nancy Presse, Pierrette Gaudreau, José A. Morais, Stéphanie Chevalier (2021). Database and Biobank of the Quebec Longitudinal Study on Nutrition and Successful Aging [Dataset]. https://www.gaaindata.org/partner/NUAGE
Dataset updated
Mar 16, 2021
Dataset provided by
The Global Alzheimer's Association Interactive Network
Authors
Nancy Presse, Pierrette Gaudreau, José A. Morais, Stéphanie Chevalier
Description
The NuAge Study recruited 1,793 men and women aged 67-84 years in the regions of Montreal and Sherbrooke (QC, Canada) and followed them annually for 3 years. A total of 1,753 participants are part of the NuAge Database and Biobank containing exhaustive data (demography, social, lifestyle, nutrition, functional, clinical, anthropometry, cognition, biomarkers) and biological samples to be shared with the scientific community to carry out research projects characterizing the trajectories of aging.
Dictionary of English Words and Definitions
kaggle.com
zip
Updated Sep 22, 2024
+ more versions
Cite
AnthonyTherrien (2024). Dictionary of English Words and Definitions [Dataset]. https://www.kaggle.com/datasets/anthonytherrien/dictionary-of-english-words-and-definitions
Available download formats: zip (6401928 bytes)
Dataset updated
Sep 22, 2024
Authors
AnthonyTherrien
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Dataset Overview

This dataset consists of 42,052 English words and their corresponding definitions. It is a comprehensive collection of words ranging from common terms to more obscure vocabulary. The dataset is ideal for Natural Language Processing (NLP) tasks, educational tools, and various language-related applications.

Key Features:

Words: A diverse set of English words, including both rare and frequently used terms.

Definitions: Each word is accompanied by a detailed definition that explains its meaning and contextual usage.

Total Number of Words: 42,052

Applications

This dataset is well-suited for a range of use cases, including:

Natural Language Processing (NLP): Enhance text understanding models by providing contextual meaning and word associations.

Vocabulary Building: Create educational tools or games that help users expand their vocabulary.

Lexical Studies: Perform academic research on word usage, trends, and lexical semantics.

Dictionary and Thesaurus Development: Serve as a resource for building dictionary or thesaurus applications, where users can search for words and definitions.

Data Structure

Word: The column containing the English word.

Definition: The column providing a comprehensive definition of the word.

Potential Use Cases

Language Learning: This dataset can be used to develop applications or tools aimed at enhancing vocabulary acquisition for language learners.

NLP Model Training: Useful for tasks such as word embeddings, definition generation, and contextual learning.

Research: Analyze word patterns, rare vocabulary, and trends in the English language.

Database Infrastructure for Mass Spectrometry - Per- and Polyfluoroalkyl...
nist.gov
data.nist.gov
+1more
Updated Jul 5, 2023
+ more versions
Cite
National Institute of Standards and Technology (2023). Database Infrastructure for Mass Spectrometry - Per- and Polyfluoroalkyl Substances [Dataset]. http://doi.org/10.18434/mds2-2905
Unique identifier
https://doi.org/10.18434/mds2-2905, https://identifiers.org/ark:/88434/mds2-2905
Dataset updated
Jul 5, 2023
Dataset provided by
National Institute of Standards and Technology (http://www.nist.gov/)
License
https://www.nist.gov/open/license
Description
Data here contain and describe an open-source structured query language (SQLite) portable database containing high resolution mass spectrometry data (MS1 and MS2) for per- and polyfluorinated alkyl substances (PFAS) and associated metadata regarding their measurement techniques, quality assurance metrics, and the samples from which they were produced. These data are stored in a format adhering to the Database Infrastructure for Mass Spectrometry (DIMSpec) project. That project produces and uses databases like this one, providing a complete toolkit for non-targeted analysis. See more information about the full DIMSpec code base - as well as these data for demonstration purposes - at GitHub (https://github.com/usnistgov/dimspec) or view the full User Guide for DIMSpec (https://pages.nist.gov/dimspec/docs). Files of most interest contained here include the database file itself (dimspec_nist_pfas.sqlite) as well as an entity relationship diagram (ERD.png) and data dictionary (DIMSpec for PFAS_1.0.1.20230615_data_dictionary.json) to elucidate the database structure and assist in interpretation and use.
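A portable SQLite file such as this can be inspected with Python's standard sqlite3 module before consulting the data dictionary. The following is a minimal sketch, not the DIMSpec toolkit's own interface: it builds a small in-memory database so that it runs without the real file, and the table names compounds and samples are invented for illustration. The actual schema is the one documented in the ERD and data dictionary.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all tables in the database, sorted alphabetically."""
    cursor = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in cursor.fetchall()]


def column_names(conn: sqlite3.Connection, table: str) -> list[str]:
    """Return column names for a table; PRAGMA table_info puts the name at index 1."""
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]


# In-memory stand-in with invented tables; to inspect the real file one would
# connect to its path (for example "dimspec_nist_pfas.sqlite") instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE compounds (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE samples (id INTEGER PRIMARY KEY, description TEXT)")

tables = list_tables(conn)
schema = {table: column_names(conn, table) for table in tables}
```

The resulting schema mapping can then be compared against the entries in the JSON data dictionary.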
Users' guide to PETROG: AGSO's petrography database
data.wu.ac.at
dev.ecat.ga.gov.au
pdf
Updated Jun 26, 2018
Cite
Corp (2018). Users' guide to PETROG: AGSO's petrography database [Dataset]. https://data.wu.ac.at/schema/data_gov_au/Yjc2NjIwYTgtOTNiZC00ZTI0LTlkOTctNzQ1YjJhMzIzZDJh
Available download formats: pdf
Dataset updated
Jun 26, 2018
Dataset provided by
Corp
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
PETROG, AGSO's Petrography Database, is a relational computer database of petrographic data obtained from microscopic examination of thin sections of rock samples. The database is designed for petrographic descriptions of crystalline igneous and metamorphic rocks, and also for sedimentary petrography. A variety of attributes pertaining to thin sections can be recorded, as can the volume proportions of component minerals, clasts and matrix.

PETROG is one of a family of field and laboratory databases that include mineral deposits, regolith, rock chemistry, geochronology, stream-sediment geochemistry, geophysical rock properties and ground spectral properties for remote sensing. All these databases rely on a central Field Database for information on geographic location, outcrops and rock samples. PETROG depends, in particular, on the Field Database's SITES and ROCKS tables, as well as a number of lookup tables of standard terms. ROCKMINSITES, a flat view of PETROG's tables combined with the SITES and ROCKS tables, allows thin-section and mineral data to be accessed from geographic information systems and plotted on maps.

This guide presents an overview of PETROG's infrastructure and describes in detail the menus and screen forms used to input and view the data. In particular, the definitions of most fields in the database are given in some depth under descriptions of the screen forms - providing, in effect, a comprehensive data dictionary of the database. The database schema, with all definitions of tables, views and indexes is contained in an appendix to the guide.
Steam Dataset 2025: Multi-Modal Gaming Analytics
kaggle.com
zip
Updated Oct 7, 2025
Cite
CrainBramp (2025). Steam Dataset 2025: Multi-Modal Gaming Analytics [Dataset]. https://www.kaggle.com/datasets/crainbramp/steam-dataset-2025-multi-modal-gaming-analytics
Available download formats: zip (12478964226 bytes)
Dataset updated
Oct 7, 2025
Authors
CrainBramp
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Steam Dataset 2025: Multi-Modal Gaming Analytics Platform

The first multi-modal Steam dataset with semantic search capabilities. 239,664 applications collected from official Steam Web APIs with PostgreSQL database architecture, vector embeddings for content discovery, and comprehensive review analytics.

Made by a lifelong gamer for the gamer in all of us. Enjoy!🎮

GitHub Repository https://github.com/vintagedon/steam-dataset-2025

[Figure: 1024-dimensional game embeddings projected to 2D via UMAP reveal natural genre clustering in semantic space]

What Makes This Different

Unlike traditional flat-file Steam datasets, this is built as an analytically-native database optimized for advanced data science workflows:

☑️ Semantic Search Ready - 1024-dimensional BGE-M3 embeddings enable content-based game discovery beyond keyword matching

☑️ Multi-Modal Architecture - PostgreSQL + JSONB + pgvector in unified database structure

☑️ Production Scale - 239K applications vs typical 6K-27K in existing datasets

☑️ Complete Review Corpus - 1,048,148 user reviews with sentiment and metadata

☑️ 28-Year Coverage - Platform evolution from 1997-2025

☑️ Publisher Networks - Developer and publisher relationship data for graph analysis

☑️ Complete Methodology & Infrastructure - Full work logs document every technical decision and challenge encountered, while my API collection scripts, database schemas, and processing pipelines enable you to update the dataset, fork it for customized analysis, learn from real-world data engineering workflows, or critique and improve the methodology
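
As a rough illustration of what embedding-based discovery involves, the sketch below ranks items by cosine similarity to a query vector. It is pure Python with tiny made-up 3-dimensional vectors and invented titles; the dataset's actual embeddings are 1024-dimensional BGE-M3 vectors, and none of the names here (`cosine_similarity`, `most_similar`, the sample catalogue) come from the dataset or its repository.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query, catalogue, k=2):
    """Return the k titles whose embeddings are closest to the query vector."""
    scored = [(cosine_similarity(query, vec), title) for title, vec in catalogue.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:k]]

# Invented toy embeddings (hypothetical titles, 3 dimensions instead of 1024).
catalogue = {
    "Space Trader": [0.9, 0.1, 0.0],
    "Star Merchant": [0.8, 0.2, 0.1],
    "Farm Life": [0.0, 0.1, 0.9],
}

print(most_similar([1.0, 0.0, 0.0], catalogue))
```

Inside PostgreSQL, the same kind of ranking would normally be left to pgvector (for example with its cosine-distance operator `<=>`) rather than computed in application code.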

[Figure: Market segmentation and pricing strategy analysis across top 10 genres]

What's Included

Core Data (CSV Exports):
- 239,664 Steam applications with complete metadata
- 1,048,148 user reviews with scores and statistics
- 13 normalized relational tables for pandas/SQL workflows
- Genre classifications, pricing history, platform support
- Hardware requirements (min/recommended specs)
- Developer and publisher portfolios

Advanced Features (PostgreSQL):
- Full database dump with optimized indexes
- JSONB storage preserving complete API responses
- Materialized columns for sub-second query performance
- Vector embeddings table (pgvector-ready)

Documentation:
- Complete data dictionary with field specifications
- Database schema documentation
- Collection methodology and validation reports

Example Analysis: Published Notebooks (v1.0)

Three comprehensive analysis notebooks demonstrate dataset capabilities. All notebooks render directly on GitHub with full visualizations and output:

📊 Platform Evolution & Market Landscape

28 years of Steam's growth, genre evolution, and pricing strategies.

🔍 Semantic Game Discovery

Content-based recommendations using vector embeddings across genre boundaries.

🎯 The Semantic Fingerprint

Genre prediction from game descriptions - demonstrates text analysis capabilities.

Notebooks render with full output on GitHub. Kaggle-native versions planned for v1.1 release. CSV data exports included in dataset for immediate analysis.

[Figure: Steam platfor...]
European Soccer Database Supplementary
kaggle.com
zip
Updated Sep 10, 2017
willinghorse (2017). European Soccer Database Supplementary [Dataset]. https://www.kaggle.com/datasets/jiezi2004/soccer/code
Available download formats: zip (13757870 bytes)
Dataset updated
Sep 10, 2017
Authors
willinghorse
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
Context

This dataset was built as a supplement to the "[European Soccer Database][1]". It includes a data dictionary and extractions of the detailed match information previously contained in XML columns.

Content

PositionReference.csv: A reference that maps position x, y values to actual positions on the pitch.

DataDictionary.xlsx: Data dictionary for all XML columns in "Match" data table.

card_detail.csv: Detailed XML information extracted from the "card" column in the "Match" data table.

corner_detail.csv: Detailed XML information extracted from the "corner" column in the "Match" data table.

cross_detail.csv: Detailed XML information extracted from the "cross" column in the "Match" data table.

foulcommit_detail.csv: Detailed XML information extracted from the "foulcommit" column in the "Match" data table.

goal_detail.csv: Detailed XML information extracted from the "goal" column in the "Match" data table.

possession_detail.csv: Detailed XML information extracted from the "possession" column in the "Match" data table.

shotoff_detail.csv: Detailed XML information extracted from the "shotoff" column in the "Match" data table.

shoton_detail.csv: Detailed XML information extracted from the "shoton" column in the "Match" data table.
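
The extraction behind the *_detail.csv files can be pictured roughly as follows. This is a sketch only: the XML snippet, its element names and the `flatten_events` helper are invented for illustration and may not match the real contents of the "Match" table's XML columns or the author's actual extraction code.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML as it might appear in one cell of an XML column.
SAMPLE_XML = """
<card>
  <value><elapsed>35</elapsed><card_type>y</card_type><team>9825</team></value>
  <value><elapsed>78</elapsed><card_type>r</card_type><team>8650</team></value>
</card>
"""

def flatten_events(match_id, xml_text):
    """Turn each <value> child into a flat dict, one per event."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for event in root.findall("value"):
        row = {"match_id": match_id}
        for field in event:
            row[field.tag] = field.text  # values stay as strings
        rows.append(row)
    return rows

rows = flatten_events(1, SAMPLE_XML)

# Write the flattened rows as CSV (to a string here; a file works the same way).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["match_id", "elapsed", "card_type", "team"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Keeping the match identifier on every row is what lets a detail file be joined back to the "Match" table later.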

Acknowledgements

Original data comes from [European Soccer Database][1] by Hugo Mathien. I personally thank him for all his efforts.

Inspiration

Since this is an open dataset with no specific goals / objectives, I would like to explore the following aspects by data analytics / data mining:

Team Statistics: Overall team ranking, team points, winning probability, team lineup, etc. Mostly descriptive analysis.

Team Transfers: Track and study player transfers in the market. Study each team's strengths and weaknesses, and construct models to suggest the best-fit players for the team.

Player Statistics: Summarize players' performance (goal, assist, cross, corner, pass, block, etc.). Identify key factors of players by position. Based on these factors, evaluate players' characteristics.

Player Evolution: Construct a model to predict a player's future rating.

New Player's Template: Identify a template and a model player for young players, suited to their positions and characteristics.

Market Value Prediction: Predict a player's market value based on the player's capacity and performance.

The Winning Eleven: Given a season / league / other criteria, propose the best 11 players as a team based on their capacity and performance.
LScDC Word-Category RIG Matrix
figshare.le.ac.uk
pdf
Updated Apr 28, 2020
Neslihan Suzen (2020). LScDC Word-Category RIG Matrix [Dataset]. http://doi.org/10.25392/leicester.data.12133431.v2
Available download formats: pdf
Unique identifier
https://doi.org/10.25392/leicester.data.12133431.v2
Dataset updated
Apr 28, 2020
Dataset provided by
University of Leicester
Authors
Neslihan Suzen
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
LScDC Word-Category RIG Matrix

April 2020, by Neslihan Suzen, PhD student at the University of Leicester (ns433@leicester.ac.uk / suzenneslihan@hotmail.com). Supervised by Prof Alexander Gorban and Dr Evgeny Mirkes.

Getting Started

This file describes the Word-Category RIG Matrix for the Leicester Scientific Corpus (LSC) [1] and the procedure used to build the matrix, and introduces the Leicester Scientific Thesaurus (LScT) together with its construction process. The Word-Category RIG Matrix is a 103,998 by 252 matrix, where rows correspond to words of the Leicester Scientific Dictionary-Core (LScDC) [2] and columns correspond to 252 Web of Science (WoS) categories [3, 4, 5]. Each entry in the matrix corresponds to a pair (category, word). Its value shows the Relative Information Gain (RIG) on the belonging of a text from the LSC to the category from observing the word in this text. The CSV file of the Word-Category RIG Matrix in the published archive has two additional columns: the sum of RIGs in categories and the maximum of RIGs over categories (the last two columns of the matrix). So the file ‘Word-Category RIG Matrix.csv’ contains a total of 254 columns.

This matrix is created for use in future research on quantifying meaning in scientific texts, under the assumption that words have scientifically specific meanings in subject categories and that the meaning can be estimated by information gains from word to categories.

LScT (Leicester Scientific Thesaurus) is a scientific thesaurus of English. The thesaurus includes a list of 5,000 words from the LScDC. We order the words of the LScDC by the sum of their RIGs in categories; that is, words are arranged by their informativeness in the scientific corpus LSC. The meaningfulness of words is therefore evaluated by the words' average informativeness in the categories. We have decided to include the most informative 5,000 words in the scientific thesaurus.

Words as a Vector of Frequencies in WoS Categories

Each word of the LScDC is represented as a vector of frequencies in WoS categories. Given the collection of the LSC texts, each entry of the vector is the number of texts containing the word in the corresponding category.

It is noteworthy that texts in a corpus do not necessarily belong to a single category, as they are likely to correspond to multidisciplinary studies, specifically in a corpus of scientific texts. In other words, categories may not be exclusive. There are 252 WoS categories, and a text can be assigned to at least 1 and at most 6 categories in the LSC. Using the binary calculation of frequencies, we introduce the presence of a word in a category. We create a vector of frequencies for each word, where dimensions are categories in the corpus.

The collection of vectors, with all words and categories in the entire corpus, can be shown in a table, where each entry corresponds to a pair (word, category). This table is built for the LScDC with 252 WoS categories and presented in the published archive with this file. The value of each entry in the table shows how many times a word of the LScDC appears in a WoS category. The occurrence of a word in a category is determined by counting the number of LSC texts containing the word in that category.

Words as a Vector of Relative Information Gains Extracted for Categories

In this section, we introduce our approach to representing a word as a vector of relative information gains for categories, under the assumption that the meaning of a word can be quantified by the information it provides about categories.

For each category, a function is defined on texts that takes the value 1 if the text belongs to the category, and 0 otherwise. For each word, a function is defined on texts that takes the value 1 if the word belongs to the text, and 0 otherwise. Consider the LSC as a probabilistic sample space (the space of equally probable elementary outcomes).
For the Boolean random variables, the joint probability distribution, the entropy and the information gains are defined.

The information gain about the category from the word is the amount of information on the belonging of a text from the LSC to the category from observing the word in the text [6]. We used the Relative Information Gain (RIG), which provides a normalised measure of the information gain and makes it possible to compare information gains across different categories. The calculations of entropy, information gains and relative information gains can be found in the README file in the published archive.

Given a word, we created a vector where each component corresponds to a category. Therefore, each word is represented as a vector of relative information gains, and the dimension of the vector for each word is the number of categories. The set of vectors is used to form the Word-Category RIG Matrix, in which each column corresponds to a category, each row corresponds to a word and each component is the relative information gain from the word to the category. In the Word-Category RIG Matrix, a row vector represents the corresponding word as a vector of RIGs in categories, and a column vector represents the RIGs of all words in an individual category. If we choose an arbitrary category, words can be ordered by their RIGs from the most informative to the least informative for the category. As well as ordering words in each category, words can be ordered by two criteria: the sum and the maximum of RIGs in categories. The top n words in this list can be considered the most informative words in the scientific texts. For a given word, the sum and maximum of RIGs are calculated from the Word-Category RIG Matrix.

RIGs for each word of the LScDC in 252 categories are calculated and vectors of words are formed. We then form the Word-Category RIG Matrix for the LSC.
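
To make the construction concrete, here is a minimal sketch of how a single RIG value could be computed from document counts. It assumes the common definitions IG(c; w) = H(c) - H(c | w) and RIG = IG / H(c) for Boolean variables over equally probable texts; the README in the published archive remains the authoritative source for the exact formulas. The function names and the toy counts are invented for illustration.

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a Boolean variable with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def relative_information_gain(n_texts, n_cat, n_word, n_both):
    """RIG about category membership from observing word presence.

    n_texts: texts in the corpus; n_cat: texts in the category;
    n_word: texts containing the word; n_both: texts in both sets.
    """
    h_cat = entropy(n_cat / n_texts)
    if h_cat == 0.0:
        return 0.0  # category is constant, so there is nothing to gain
    p_word = n_word / n_texts
    # Conditional entropy H(category | word), averaged over word present/absent.
    h_given_present = entropy(n_both / n_word) if n_word else 0.0
    n_absent = n_texts - n_word
    h_given_absent = entropy((n_cat - n_both) / n_absent) if n_absent else 0.0
    h_cond = p_word * h_given_present + (1 - p_word) * h_given_absent
    return (h_cat - h_cond) / h_cat

# Toy counts for one word across two categories (invented numbers).
n_texts = 1000
word_total = 100
rig_row = [
    relative_information_gain(n_texts, 200, word_total, 90),  # word concentrated here
    relative_information_gain(n_texts, 200, word_total, 20),  # word at the base rate
]
s, m = sum(rig_row), max(rig_row)  # sum and maximum of RIGs for this word
print(rig_row, s, m)
```

In the second category the word occurs at exactly the category's base rate, so observing it tells us essentially nothing and the RIG is zero up to rounding, whereas the first category yields a clearly positive value.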
For each word, the sum (S) and maximum (M) of RIGs in categories are calculated and added at the end of the matrix (the last two columns of the matrix). The Word-Category RIG Matrix for the LScDC with 252 categories, the sum of RIGs in categories and the maximum of RIGs over categories can be found in the database.

Leicester Scientific Thesaurus (LScT)

Leicester Scientific Thesaurus (LScT) is a list of 5,000 words from the LScDC [2]. Words of the LScDC are sorted in descending order by the sum (S) of RIGs in categories, and the top 5,000 words are selected for inclusion in the LScT. We consider these 5,000 words to be the most meaningful words in the scientific corpus. In other words, the meaningfulness of words is evaluated by the words' average informativeness in the categories, and the list of these words is considered a ‘thesaurus’ for science. The LScT with the values of the sum can be found as a CSV file in the published archive.

The published archive contains the following files:

1) Word_Category_RIG_Matrix.csv: A 103,998 by 254 matrix where columns are 252 WoS categories, the sum (S) and the maximum (M) of RIGs in categories (last two columns of the matrix), and rows are words of LScDC. Each entry in the first 252 columns is the RIG from the word to the category. Words are ordered as in the LScDC.
2) Word_Category_Frequency_Matrix.csv: A 103,998 by 252 matrix where columns are 252 WoS categories and rows are words of LScDC. Each entry of the matrix is the number of texts containing the word in the corresponding category. Words are ordered as in the LScDC.
3) LScT.csv: List of words of LScT with sum (S) values.
4) Text_No_in_Cat.csv: The number of texts in categories.
5) Categories_in_Documents.csv: List of WoS categories for each document of the LSC.
6) README.txt: Description of Word-Category RIG Matrix, Word-Category Frequency Matrix and LScT and forming procedures.
7) README.pdf (same as 6 in PDF format)

References

[1] Suzen, Neslihan (2019): LSC (Leicester Scientific Corpus). figshare. Dataset. https://doi.org/10.25392/leicester.data.9449639.v2
[2] Suzen, Neslihan (2019): LScDC (Leicester Scientific Dictionary-Core). figshare. Dataset. https://doi.org/10.25392/leicester.data.9896579.v3
[3] Web of Science. (15 July). Available: https://apps.webofknowledge.com/
[4] WoS Subject Categories. Available: https://images.webofknowledge.com/WOKRS56B5/help/WOS/hp_subject_category_terms_tasca.html
[5] Suzen, N., Mirkes, E. M., & Gorban, A. N. (2019). LScDC-new large scientific dictionary. arXiv preprint arXiv:1912.06858.
[6] Shannon, C. E. (1948). A mathematical theory of communication. Bell system technical journal, 27(3), 379-423.
Soil Survey Geographic Database (SSURGO)
agdatacommons.nal.usda.gov
pdf
Updated Nov 21, 2025
USDA Natural Resources Conservation Service (2025). Soil Survey Geographic Database (SSURGO) [Dataset]. http://doi.org/10.15482/USDA.ADC/1242479
Available download formats: pdf
Unique identifier
https://doi.org/10.15482/USDA.ADC/1242479
Dataset updated
Nov 21, 2025
Dataset provided by
Natural Resources Conservation Service http://www.nrcs.usda.gov/
United States Department of Agriculture http://usda.gov/
Authors
USDA Natural Resources Conservation Service
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The SSURGO database contains information about soil as collected by the National Cooperative Soil Survey over the course of a century. The information can be displayed in tables or as maps and is available for most areas in the United States and the Territories, Commonwealths, and Island Nations served by the USDA-NRCS (Natural Resources Conservation Service). The information was gathered by walking over the land and observing the soil. Many soil samples were analyzed in laboratories.

The maps outline areas called map units. The map units describe soils and other components that have unique properties, interpretations, and productivity. The information was collected at scales ranging from 1:12,000 to 1:63,360. More details were gathered at a scale of 1:12,000 than at a scale of 1:63,360. The mapping is intended for natural resource planning and management by landowners, townships, and counties. Some knowledge of soils data and map scale is necessary to avoid misunderstandings.

The maps are linked in the database to information about the component soils and their properties for each map unit. Each map unit may contain one to three major components and some minor components. The map units are typically named for the major components. Examples of information available from the database include available water capacity, soil reaction, electrical conductivity, and frequency of flooding; yields for cropland, woodland, rangeland, and pastureland; and limitations affecting recreational development, building site development, and other engineering uses.

SSURGO datasets consist of map data, tabular data, and information about how the maps and tables were created. The extent of a SSURGO dataset is a soil survey area, which may consist of a single county, multiple counties, or parts of multiple counties. SSURGO map data can be viewed in the Web Soil Survey or downloaded in ESRI® Shapefile format. The coordinate systems are Geographic. Attribute data can be downloaded in text format that can be imported into a Microsoft® Access® database. A complete SSURGO dataset consists of:

- GIS data (as ESRI® Shapefiles)
- attribute data (dbf files - a multitude of separate tables)
- database template (MS Access format - this helps with understanding the structure and linkages of the various tables)
- metadata
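
For a sense of what the two map scales quoted in the description mean on the ground, the arithmetic can be sketched as follows. The helper name is made up for illustration; the only facts used are the scale ratios from the description and the standard conversions of 12 inches per foot and 5,280 feet per mile.

```python
INCHES_PER_FOOT = 12
FEET_PER_MILE = 5280

def ground_feet_per_map_inch(scale_denominator):
    """Ground distance in feet represented by one inch on a 1:N map."""
    return scale_denominator / INCHES_PER_FOOT

for denominator in (12_000, 63_360):
    feet = ground_feet_per_map_inch(denominator)
    miles = feet / FEET_PER_MILE
    print(f"1:{denominator:,} -> 1 map inch = {feet:,.0f} ft = {miles:.3f} mi")
```

One map inch therefore covers 1,000 feet at 1:12,000 and exactly one mile at 1:63,360, which is consistent with the description's remark that more detail was gathered at the 1:12,000 scale.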

Resources in this dataset:

Resource Title: SSURGO Metadata - Tables and Columns Report.
File Name: SSURGO_Metadata_-_Tables_and_Columns.pdf
Resource Description: This report contains a complete listing of all columns in each database table. Please see SSURGO Metadata - Table Column Descriptions Report for more detailed descriptions of each column. Find the Soil Survey Geographic (SSURGO) web site at https://www.nrcs.usda.gov/wps/portal/nrcs/detail/vt/soils/?cid=nrcs142p2_010596#Datamart

Resource Title: SSURGO Metadata - Table Column Descriptions Report.
File Name: SSURGO_Metadata_-_Table_Column_Descriptions.pdf
Resource Description: This report contains the descriptions of all columns in each database table. Please see SSURGO Metadata - Tables and Columns Report for a complete listing of all columns in each database table. Find the Soil Survey Geographic (SSURGO) web site at https://www.nrcs.usda.gov/wps/portal/nrcs/detail/vt/soils/?cid=nrcs142p2_010596#Datamart

Resource Title: SSURGO Data Dictionary.
File Name: SSURGO 2.3.2 Data Dictionary.csv
Resource Description: CSV version of the data dictionary
MIMIC-IV Clinical Database Demo
physionet.org
registry.opendata.aws
Updated Jan 31, 2023
Alistair Johnson; Lucas Bulgarelli; Tom Pollard; Steven Horng; Leo Anthony Celi; Roger Mark (2023). MIMIC-IV Clinical Database Demo [Dataset]. http://doi.org/10.13026/dp1f-ex47
Unique identifier
https://doi.org/10.13026/dp1f-ex47
Dataset updated
Jan 31, 2023
Authors
Alistair Johnson; Lucas Bulgarelli; Tom Pollard; Steven Horng; Leo Anthony Celi; Roger Mark
License
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Description
The Medical Information Mart for Intensive Care (MIMIC)-IV database is comprised of deidentified electronic health records for patients admitted to the Beth Israel Deaconess Medical Center. Access to MIMIC-IV is limited to credentialed users. Here, we have provided an openly-available demo of MIMIC-IV containing a subset of 100 patients. The dataset includes similar content to MIMIC-IV, but excludes free-text clinical notes. The demo may be useful for running workshops and for assessing whether the MIMIC-IV is appropriate for a study before making an access request.
