Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
As researchers embrace open and transparent data sharing, they will need to provide information about their data that effectively helps others understand its contents. Without proper documentation, data stored in online repositories such as OSF will often be rendered unfindable and unreadable by other researchers and indexing search engines. Data dictionaries and codebooks provide a wealth of information about variables, data collection, and other important facets of a dataset. This information, called metadata, provides key insights into how the data might be further used in research and facilitates search engine indexing to reach a broader audience of interested parties. This tutorial first explains the terminology and standards surrounding data dictionaries and codebooks. We then present a guided workflow of the entire process from source data (e.g., survey answers on Qualtrics) to an openly shared dataset accompanied by a data dictionary or codebook that follows an agreed-upon standard. Finally, we explain how to use freely available web applications to assist this process of ensuring that psychology data are findable, accessible, interoperable, and reusable (FAIR; Wilkinson et al., 2016).
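To make the idea concrete, the sketch below derives a bare-bones data dictionary from a tabular dataset using pandas: one row per variable with its type, missingness, and a description to be filled in by the researcher. This is an illustration only, not code from the tutorial; the variable names are hypothetical.

```python
# Minimal sketch (not from the tutorial) of building a starter data dictionary
# from a tabular dataset with pandas. Column names below are hypothetical.
import pandas as pd

data = pd.DataFrame({
    "participant_id": [1, 2, 3],
    "age": [24, 31, None],
    "condition": ["control", "treatment", "treatment"],
})

dictionary = pd.DataFrame({
    "variable": data.columns,
    "type": [str(t) for t in data.dtypes],
    "percent_missing": (data.isna().mean() * 100).round(1).values,
    "description": ["" for _ in data.columns],   # to be written by the researcher
})

dictionary.to_csv("data_dictionary.csv", index=False)
print(dictionary)
```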
An in-depth description of the Building Footprint GIS data layer outlining terms of use, update frequency, attribute explanations, and more.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This file contains the data dictionary for the Street Centerline (Native) dataset. To leave feedback or ask a question about this dataset, please fill out the following form: Street Centerline (Native) Schema Data Dictionary feedback form.
Tags: buildings, data-dictionary, lake-county-illinois, planimetrics, planimetrics-and-landmarks, readme
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Very common form names and the number and percentage of studies they are used in.
A data dictionary for the TSS Summarized Reports at the building and individual levels.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The LScDC (Leicester Scientific Dictionary-Core), April 2020, by Neslihan Suzen, PhD student at the University of Leicester (ns433@leicester.ac.uk / suzenneslihan@hotmail.com), supervised by Prof Alexander Gorban and Dr Evgeny Mirkes.

[Version 3] The third version of LScDC (Leicester Scientific Dictionary-Core) is formed using the updated LScD (Leicester Scientific Dictionary), Version 3*. All steps applied to build the new version of the core dictionary are the same as in Version 2** and can be found in the description of Version 2 below; we do not repeat the explanation here. The files provided with this description are also the same as those described for LScDC Version 2. The numbers of words in the third versions of LScD and LScDC are summarized below.

LScD (v3): 972,060 words
LScDC (v3): 103,998 words

* Suzen, Neslihan (2019): LScD (Leicester Scientific Dictionary). figshare. Dataset. https://doi.org/10.25392/leicester.data.9746900.v3
** Suzen, Neslihan (2019): LScDC (Leicester Scientific Dictionary-Core). figshare. Dataset. https://doi.org/10.25392/leicester.data.9896579.v2

[Version 2] Getting Started. This file describes a sorted and cleaned list of words from the LScD (Leicester Scientific Dictionary), explains the steps for sub-setting the LScD, and gives basic statistics of words in the LSC (Leicester Scientific Corpus), to be found in [1, 2]. The LScDC (Leicester Scientific Dictionary-Core) is a list of words ordered by the number of documents containing them, and is available in the published CSV file. There are 104,223 unique words (lemmas) in the LScDC. This dictionary was created to be used in future work on the quantification of the sense of research texts. The objective of sub-setting the LScD is to discard words that appear too rarely in the corpus. In text mining algorithms, using an enormous amount of text data challenges the performance and accuracy of data mining applications. The performance and accuracy of models depend heavily on the type of words (such as stop words and content words) and the number of words in the corpus. Rarely occurring words are not useful for discriminating texts in large corpora, as rare words are likely to be non-informative signals (or noise) and redundant in the collection of texts. Selecting relevant words also holds out the possibility of more effective and faster operation of text mining algorithms. To build the LScDC, we applied the following process to the LScD: removing words that appear in no more than 10 documents (
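As a hedged illustration of the document-frequency cut described above (keep only words appearing in more than 10 documents), here is a minimal sketch. The CSV layout (word, document_count) and file names are assumptions for the example, not the actual LScD/LScDC formats.

```python
# Minimal sketch of the document-frequency cut: discard words that appear in
# no more than 10 documents, then order the rest by document count, descending.
# Input/output column names and file names are illustrative assumptions.
import csv

DOC_FREQ_THRESHOLD = 10  # words in <= 10 documents are discarded

def build_core_dictionary(lscd_csv_path, core_csv_path):
    with open(lscd_csv_path, newline="", encoding="utf-8") as f:
        rows = [(r["word"], int(r["document_count"])) for r in csv.DictReader(f)]

    core = [r for r in rows if r[1] > DOC_FREQ_THRESHOLD]
    core.sort(key=lambda r: r[1], reverse=True)

    with open(core_csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["word", "document_count"])
        writer.writerows(core)

build_core_dictionary("LScD.csv", "LScDC.csv")
```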
Overview: The Lower Nooksack Water Budget Project involved assembling a wide range of existing data related to WRIA 1 and specifically the Lower Nooksack Subbasin, updating existing data sets and generating new data sets. This Data Management Plan provides an overview of the data sets, formats and collaboration environment that was used to develop the project. Use of a plan during development of the technical work products provided a forum for the data development and management to be conducted with transparent methods and processes. At project completion, the Data Management Plan provides an accessible archive of the data resources used and supporting information on the data storage, intended access, sharing and re-use guidelines.
One goal of the Lower Nooksack Water Budget project is to make this “usable technical information” as accessible as possible across technical, policy and general public users. The project data, analyses and documents will be made available through the WRIA 1 Watershed Management Project website http://wria1project.org. This information is intended for use by the WRIA 1 Joint Board and partners working to achieve the adopted goals and priorities of the WRIA 1 Watershed Management Plan.
Model outputs for the Lower Nooksack Water Budget are summarized by sub-watersheds (drainages) and point locations (nodes). In general, due to changes in land use over time and changes to available streamflow and climate data, the water budget for any watershed needs to be updated periodically. Further detailed information about data sources is provided in review packets developed for specific technical components including climate, streamflow and groundwater level, soils and land cover, and water use.
Purpose: This project involves assembling a wide range of existing data related to WRIA 1 and specifically the Lower Nooksack Subbasin, updating existing data sets and generating new data sets. Data will be used as input to various hydrologic, climatic and geomorphic components of the Topnet-Water Management (WM) model, but will also be available to support other modeling efforts in WRIA 1. Much of the data used as input to the Topnet model is publicly available and maintained by others (e.g., USGS DEMs and streamflow data, SSURGO soils data, University of Washington gridded meteorological data). Pre-processing is performed to convert these existing data into a format that can be used as input to the Topnet model. Post-processed Topnet model ASCII-text outputs are subsequently combined with spatial data to generate GIS data that can be used to create maps and illustrations of the spatial distribution of water information. Other products generated during this project include documentation of methods, input by the WRIA 1 Joint Board Staff Team during review and comment periods, and communication tools developed for public engagement and public comment on the project.
To maintain an organized system for developing and distributing data, Lower Nooksack Water Budget project collaborators should be familiar with the standards for data management described in this document and with the following issues related to generating and distributing data:
1. Standards for metadata and data formats
2. Plans for short-term storage and data management (e.g., file formats, local storage and backup procedures, and security)
3. Legal and ethical issues (e.g., intellectual property, confidentiality of study participants)
4. Access policies and provisions (e.g., how the data will be made available to others, any restrictions needed)
5. Provisions for long-term archiving and preservation (e.g., establishment of a new data archive or use of an existing archive)
6. Assigned data management responsibilities (e.g., persons responsible for ensuring data management and monitoring compliance with the Data Management Plan)
This resource is a subset of the Lower Nooksack Water Budget (LNWB) Collection Resource.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset consists of closed cases that resulted in penalty assessments by EBSA since 2000. It provides information on EBSA's enforcement of ERISA's Form 5500 Annual Return/Report filing requirement, focusing on deficient filers, late filers, and non-filers.
Dataset tables: EBSA Data Dictionary, EBSA Metadata, and EBSA OCATS.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
analyze the survey of income and program participation (sipp) with r. if the census bureau's budget was gutted and only one complex sample survey survived, pray it's the survey of income and program participation (sipp). it's giant. it's rich with variables. it's monthly. it follows households over three, four, now five year panels. the congressional budget office uses it for their health insurance simulation. analysts read that sipp has person-month files, get scurred, and retreat to inferior options. the american community survey may be the mount everest of survey data, but sipp is most certainly the amazon. questions swing wild and free through the jungle canopy i mean core data dictionary. legend has it that there are still species of topical module variables that scientists like you have yet to analyze. ponce de león would've loved it here. ponce. what a name. what a guy.

the sipp 2008 panel data started from a sample of 105,663 individuals in 42,030 households. once the sample gets drawn, the census bureau surveys one-fourth of the respondents every four months, over four or five years (panel durations vary). you absolutely must read and understand pdf pages 3, 4, and 5 of this document before starting any analysis (start at the header 'waves and rotation groups'). if you don't comprehend what's going on, try their survey design tutorial. since sipp collects information from respondents regarding every month over the duration of the panel, you'll need to be hyper-aware of whether you want your results to be point-in-time, annualized, or specific to some other period. the analysis scripts below provide examples of each. at every four-month interview point, every respondent answers every core question for the previous four months. after that, wave-specific addenda (called topical modules) get asked, but generally only regarding a single prior month. to repeat: core wave files contain four records per person, topical modules contain one. if you stacked every core wave, you would have one record per person per month for the duration of the panel. mmmassive. ~100,000 respondents x 12 months x ~4 years. have an analysis plan before you start writing code so you extract exactly what you need, nothing more. better yet, modify something of mine. cool? this new github repository contains eight, you read me, eight scripts:

1996 panel - download and create database.R
2001 panel - download and create database.R
2004 panel - download and create database.R
2008 panel - download and create database.R
- since some variables are character strings in one file and integers in another, initiate an r function to harmonize variable class inconsistencies in the sas importation scripts
- properly handle the parentheses seen in a few of the sas importation scripts, because the SAScii package currently does not
- create an rsqlite database, initiate a variant of the read.SAScii function that imports ascii data directly into a sql database (.db)
- download each microdata file - weights, topical modules, everything - then read 'em into sql

2008 panel - full year analysis examples.R
- define which waves and specific variables to pull into ram, based on the year chosen
- loop through each of twelve months, constructing a single-year temporary table inside the database
- read that twelve-month file into working memory, then save it for faster loading later if you like
- read the main and replicate weights columns into working memory too, merge everything
- construct a few annualized and demographic columns using all twelve months' worth of information
- construct a replicate-weighted complex sample design with a fay's adjustment factor of one-half, again save it for faster loading later, only if you're so inclined
- reproduce census-published statistics, not precisely (due to topcoding described here on pdf page 19)

2008 panel - point-in-time analysis examples.R
- define which wave(s) and specific variables to pull into ram, based on the calendar month chosen
- read that interview point (srefmon)- or calendar month (rhcalmn)-based file into working memory
- read the topical module and replicate weights files into working memory too, merge it like you mean it
- construct a few new, exciting variables using both core and topical module questions
- construct a replicate-weighted complex sample design with a fay's adjustment factor of one-half
- reproduce census-published statistics, not exactly cuz the authors of this brief used the generalized variance formula (gvf) to calculate the margin of error - see pdf page 4 for more detail - the friendly statisticians at census recommend using the replicate weights whenever possible. oh hayy, now it is.

2008 panel - median value of household assets.R
- define which wave(s) and specific variables to pull into ram, based on the topical module chosen
- read the topical module and replicate weights files into working memory too, merge once again
- construct a replicate-weighted complex sample design with a...
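The replicate-weighted designs mentioned above rely on Fay's balanced repeated replication with a Fay coefficient of one-half. As a hedged illustration of the underlying variance formula only (simulated data, an assumed weighted-mean estimator, and a made-up replicate count; not code from the repository above):

```python
# Minimal sketch of Fay's balanced repeated replication variance estimate,
# Var(theta) = 1 / (R * (1 - k)^2) * sum_r (theta_r - theta)^2, with Fay
# coefficient k = 0.5. Data, weights, and the weighted-mean estimator are
# simulated placeholders, not the actual SIPP file layout.
import numpy as np

def fay_variance(theta_full, theta_reps, fay_k=0.5):
    theta_reps = np.asarray(theta_reps, dtype=float)
    r = theta_reps.size
    return np.sum((theta_reps - theta_full) ** 2) / (r * (1.0 - fay_k) ** 2)

def weighted_mean(values, weights):
    return np.average(values, weights=weights)

rng = np.random.default_rng(0)
y = rng.normal(50_000, 20_000, size=1_000)                    # analysis variable
w_full = rng.uniform(500, 1_500, size=1_000)                  # main weight
w_reps = w_full[:, None] * rng.uniform(0.5, 1.5, size=(1_000, 120))  # replicate weights

estimate = weighted_mean(y, w_full)
replicate_estimates = [weighted_mean(y, w_reps[:, r]) for r in range(w_reps.shape[1])]
se = np.sqrt(fay_variance(estimate, replicate_estimates, fay_k=0.5))
print(f"estimate = {estimate:,.0f}, standard error = {se:,.0f}")
```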
The downloadall extension for CKAN enhances dataset accessibility by adding a "Download all" button to dataset pages. This feature enables users to download a single zip file containing all resource files associated with a dataset, along with a datapackage.json file that provides machine-readable metadata. The extension streamlines the data packaging and distribution process, ensuring data and its documentation are kept together.

Key Features:
- Single-Click Download: Adds a "Download all" button to dataset pages, allowing users to download all resources and metadata in one go.
- Data Package Creation: Generates a datapackage.json file conforming to the Frictionless Data standard, including dataset metadata.
- Comprehensive Data Packaging: Packages all data files and datapackage.json into a single zip file to ensure usability.
- Data Dictionary Inclusion: If resources are stored in the DataStore (using xloader or datapusher), the datapackage.json will include the data dictionary (schema) of the data, specifying column types.
- Background Zip Creation: Uses a CKAN background job to (re)create the zip file when a dataset is created or updated, or when the data dictionary changes. Changes to uploaded data are detected only when the dataset itself is updated.
- Command-Line Interface: Includes a command-line interface for various operations.

Technical Integration: The downloadall extension integrates into CKAN as a plugin, adding a new button to the dataset view. It depends on the CKAN background job worker to generate the zip files, and if used with DataStore and xloader (or datapusher), incorporates the data dictionary into the datapackage.json. The extension requires activation in the CKAN configuration file (production.ini). Specific CKAN versions are supported, primarily 2.7 and 2.8.

Benefits & Impact: Implementing the downloadall extension can improve data accessibility and usability by providing a convenient way to download datasets and their associated metadata. It streamlines workflows for data analysts, researchers, and others who need comprehensive access to datasets and their documentation. The inclusion of machine-readable metadata in the form of a datapackage.json facilitates automation and standardisation in data processing and validation.
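For orientation, here is a minimal sketch of the kind of bundle the extension produces: a zip containing the data files plus a Frictionless-style datapackage.json whose resource "schema" acts as the data dictionary. This is not the extension's own code, and all names, paths, and field types below are invented for the example.

```python
# Illustrative sketch (not downloadall's code) of a zip bundle with a
# Frictionless-style datapackage.json. All names and types are placeholders.
import json
import zipfile

with open("observations.csv", "w", encoding="utf-8") as f:
    f.write("site,price\nA,1.25\nB,2.50\n")          # stand-in resource file

datapackage = {
    "name": "example-dataset",                        # hypothetical dataset name
    "resources": [
        {
            "name": "observations",
            "path": "observations.csv",
            "format": "csv",
            "schema": {                               # the data dictionary: column names and types
                "fields": [
                    {"name": "site", "type": "string"},
                    {"name": "price", "type": "number"},
                ]
            },
        }
    ],
}

with zipfile.ZipFile("example-dataset.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    zf.write("observations.csv")
    zf.writestr("datapackage.json", json.dumps(datapackage, indent=2))
```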
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Resource Description: The dataset contains variables corresponding to availability, source (country, state and town if country is the United States), quality, and price (by weight or volume) of 13 fresh fruits and 32 fresh vegetables sold in farmers markets and grocery stores located in 5 Lower Mississippi Delta towns.
Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel
Resource Title: Delta Produce Sources Study data dictionary. File Name: DPS Data Dictionary Public.csv
Resource Description: This file is the data dictionary corresponding to the Delta Produce Sources Study dataset.
Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel
https://ora.ox.ac.uk/terms_of_use
The DMLBS is distinctive not only for the breadth of its coverage but also for the fact that it is wholly based on original research, i.e. on a fresh reading of medieval Latin texts for this specific purpose, where possible in the best available source, whether that be original manuscripts or modern critical editions. (The method is that used by other major dictionaries, such as the monumental Oxford English Dictionary and, for Latin, the Oxford Latin Dictionary and the Thesaurus Linguae Latinae.) In the nearly 50 years of drafting the Dictionary, different editorial practices and conventions have inevitably created a text that varies significantly from the earliest fascicules to the final ones while remaining recognizably the same underlying work. Many of these variations have been the result of conscious decisions, others simply the result of the Dictionary being the work of many people over many years.
Work on digitizing the Dictionary began in earnest in 2009, with a move from a traditional print-based workflow to an electronic XML-based workflow, first for material already drafted on paper slips but not yet keyed as electronic data, and subsequently with the introduction of full ab initio electronic drafting.
However, even then the majority of the dictionary's content still existed only in print — in the thirteen fascicules (more than 2,500 three-column pages containing nearly 65,000 entries) published since 1965. Once the new workflow for the remaining material to be published was fully established within the project, work began on digitizing earlier fascicules; this work was undertaken by a specialist outside contractor, which captured these printed pages and tagged the material in accordance with the Dictionary schema. The captured material was then evaluated and corrected within the project. Plans for the project itself developing and hosting an online platform for the full dataset were discontinued in 2014 due to lack of technical support and funding, but partnerships have been established to ensure that online publication is achieved.
Technical Overview:
The DMLBS is held in XML according to customized XSD schemas. All data is held in Unicode encoding.
Data structure: At the heart of the DMLBS XML workflow sit the data schemas which describe and are used to constrain the structure of the data. The DMLBS uses XSD schemas. The Dictionary data is represented essentially in the form in which it has been published in print. In addition to the schema for the Dictionary text, there is a further schema for the Dictionary's complex bibliography, which is also held in XML form. The schemas in use were custom-built for the DMLBS in order to match the project's very specific needs, ensuring that the drafted or captured text always complies with the long-standing structures and conventions of the printed dictionary by requiring, allowing, or prohibiting elements as necessary. (Although the use of TEI encoding was seriously considered, it was clear from initial exploration that the level of customization and optimization required to bring the TEI in line with the practical production needs of the dictionary was too great to be feasible.)
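As a hedged illustration of schema-constrained drafting in general, the sketch below validates an XML entry against an XSD schema using lxml. The DMLBS project's actual tooling is not described here, and the file names are placeholders.

```python
# Minimal sketch of validating an XML document against an XSD schema with
# lxml (an assumption for illustration; not the DMLBS production toolchain).
# File names are placeholders.
from lxml import etree

schema = etree.XMLSchema(etree.parse("dictionary-entry.xsd"))
entry = etree.parse("entry.xml")

if schema.validate(entry):
    print("entry.xml conforms to the schema")
else:
    for error in schema.error_log:
        print(error.line, error.message)   # e.g. a prohibited or missing element
```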
Data encoding and entry: The encoding chosen for all DMLBS data is Unicode. In addition to the Roman alphabet, with the full range of diacritics (including the macron and breve to mark vowel length), the Dictionary regularly uses Anglo-Saxon letters (such as thorn, wynn, and yogh) and polytonic Greek, along with assorted other letters and symbols.

The ‘Dictionary of Medieval Latin from British Sources’ (DMLBS) was prepared by a project team of specialist researchers as a research project of the British Academy, overseen by a committee appointed by the Academy to direct its work. Initially based in London at the Public Record Office, the editorial team moved to Oxford in the early 1980s and since the late 1990s has formed part of the Faculty of Classics at Oxford University. The main aim of the DMLBS project has been to create a successor to the previous standard dictionary of medieval Latin, the Glossarium ... mediae et infimae Latinitatis, first compiled in the seventeenth century by the French scholar Du Cange (Charles du Fresne); a history of the project is available at http://www.dmlbs.ox.ac.uk/about-us/history-of-the-project and in Richard Ashdowne ‘Dictionary of Medieval Latin from British Sources’, British Academy Review 24 (2014), 46–53. The project has been supported financially by major research grants from the Arts & Humanities Research Council, the Packard Humanities Institute, and the OUP John Fell Research Fund, and by a small annual grant from the British Academy. It also received institutional support from the British Academy and the University of Oxford.
The Delta Produce Sources Study was an observational study designed to measure and compare food environments of farmers markets (n=3) and grocery stores (n=12) in 5 rural towns located in the Lower Mississippi Delta region of Mississippi. Data were collected via electronic surveys from June 2019 to March 2020 using a modified version of the Nutrition Environment Measures Survey (NEMS) Farmers Market Audit tool. The tool was modified to collect information pertaining to the source of fresh produce and for use with both farmers markets and grocery stores. Availability, source, quality, and price information were collected and compared between farmers markets and grocery stores for 13 fresh fruits and 32 fresh vegetables via SAS software programming. Because the towns were not randomly selected and the sample sizes are relatively small, the data may not be generalizable to all rural towns in the Lower Mississippi Delta region of Mississippi.

Resources in this dataset:
Resource Title: Delta Produce Sources Study dataset. File Name: DPS Data Public.csv
Resource Description: The dataset contains variables corresponding to availability, source (country, state and town if country is the United States), quality, and price (by weight or volume) of 13 fresh fruits and 32 fresh vegetables sold in farmers markets and grocery stores located in 5 Lower Mississippi Delta towns.
Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel
Resource Title: Delta Produce Sources Study data dictionary. File Name: DPS Data Dictionary Public.csv
Resource Description: This file is the data dictionary corresponding to the Delta Produce Sources Study dataset.
Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
The open data portal catalogue is a downloadable dataset containing some key metadata for the general datasets available on the Government of Canada's Open Data portal. Resource 1 is generated using the ckanapi tool (external link). Resources 2-8 are generated using the Flatterer (external link) utility.

Description of resources:
1. Dataset is a JSON Lines (external link) file where the metadata of each Dataset/Open Information Record is one line of JSON. The file is compressed with GZip. The file is heavily nested and recommended for users familiar with working with nested JSON.
2. Catalogue is an XLSX workbook where the nested metadata of each Dataset/Open Information Record is flattened into worksheets for each type of metadata.
3. Datasets Metadata contains metadata at the dataset level. This is also referred to as the package in some CKAN documentation. This is the main table/worksheet in the SQLite database and XLSX output.
4. Resources Metadata contains the metadata for the resources contained within each dataset.
5. Resource Views Metadata contains the metadata for the views applied to each resource, if a resource has a view configured.
6. Datastore Fields Metadata contains the DataStore information for CSV datasets that have been loaded into the DataStore. This information is displayed in the Data Dictionary for DataStore-enabled CSVs.
7. Data Package Fields contains a description of the fields available in each of the tables within the Catalogue, as well as the count of the number of records each table contains.
8. Data Package Entity Relation Diagram displays the title and format for each column, in each table in the Data Package, in the form of an ERD diagram. The Data Package resource offers a text-based version.
9. SQLite Database is a .db database, similar in structure to Catalogue. This can be queried with database or analytical software tools for doing analysis.
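For the gzipped JSON Lines resource described in item 1, the sketch below shows one way to stream the records in Python: one JSON object per line. The file name is a placeholder; the actual download name may differ.

```python
# Minimal sketch of reading a gzipped JSON Lines file: one JSON object per
# line, each a Dataset/Open Information Record. File name is a placeholder.
import gzip
import json

def iter_records(path):
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)

records = iter_records("open-data-catalogue.jsonl.gz")   # hypothetical file name
first = next(records)
print(sorted(first.keys())[:10])                         # a few top-level metadata keys
print("remaining records:", sum(1 for _ in records) + 1)
```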
https://hdl.handle.net/20.500.14106/licence-ota
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Resource Title: FDH Data Dictionary. File Name: FDH_Data_Dictionary.csv
Resource Description: Data dictionary for the data compiled as a result of the efforts described in Ashworth et al. (2023) - Framework to Develop an Open-Source Forage Data Network to Improve Primary Productivity and Enhance System Resiliency (in review). Includes descriptions for the data fields in the FDH Data data file.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description: This is an online edition of An Anglo-Saxon Dictionary, or a dictionary of "Old English". The dictionary records the state of the English language as it was used between ca. 700-1100 AD by the Anglo-Saxon inhabitants of the British Isles.

This project is based on a digital edition of An Anglo-Saxon Dictionary, based on the manuscript collections of the late Joseph Bosworth (the so-called Main Volume, first edition 1898) and its Supplement (first edition 1921), edited by Joseph Bosworth and T. Northcote Toller, today the largest complete dictionary of Old English (one day, it is hoped, to be supplanted by the DOE). Alistair Campbell's "enlarged addenda and corrigenda" from 1972 are not public domain and are therefore not part of the online dictionary. Please see the front and back matter of the paper dictionary for further information, prefaces, and lists of references and contractions.

The digitization project was initiated by Sean Crist in 2001 as a part of his Germanic Lexicon Project, and many individuals and institutions have contributed to it. Check out the original GLP webpage and the old Bosworth-Toller offline application webpage (to be updated). Currently the project is hosted by the Faculty of Arts, Charles University. In 2010, the data from the GLP were converted to create the current site. Care was taken to preserve the typography of the original dictionary, but also to provide a modern, user-friendly interface for contemporary users. In 2013, the entries were structurally re-tagged and the original typography was abandoned, though immediate access to the scans of the paper dictionary was preserved.

Our aim is to reach beyond a simple digital edition and create an online environment dedicated to all interested in Old English and Anglo-Saxon culture. Feel free to join in the editing of the Dictionary, commenting on its numerous entries, or participating in the discussions at our forums. We hope that by drawing the attention of the community of Anglo-Saxonists to our site and joining our resources, we may create a more useful tool for everybody. The most immediate project to draw on the corrected and tagged data of the Dictionary is a Morphological Analyzer of Old English (currently under development). We are grateful for the generous support of the Charles University Grant Agency and for the free hosting at the Faculty of Arts at Charles University. The site is currently maintained and developed by Ondrej Tichy et al. at the Department of English Language and ELT Methodology, Faculty of Arts, Charles University in Prague (Czech Republic).
The Department of Housing Preservation and Development (HPD) reports on buildings, units, and projects that began after January 1, 2014 and are counted towards the Housing New York plan. The Housing New York Units by Building file presents this data by building, and includes building-level data, such as house number, street name, BBL, and BIN for each building in a project. The unit counts are provided by building. For additional documentation, including a data dictionary, review the attachments in the “About this Dataset” section of the Primer landing page.
This digital data release presents contour data from multiple subsurface geologic horizons as presented in previously published summaries of the regional subsurface configuration of the Michigan and Illinois Basins. The original maps that served as the source of the digital data within this geodatabase are from the Geological Society of America’s Decade of North American Geology project series, “The Geology of North America” volume D-2, chapter 13 “The Michigan Basin” and chapter 14 “Illinois Basin Region”. Contour maps in the original published chapters were generated from geophysical well logs (generally gamma-ray) and adapted from previously published contour maps. The published contour maps illustrated the distribution of sedimentary strata within the Illinois and Michigan Basins in the context of the broad 1st-order supercycles of L.L. Sloss, including the Sauk, Tippecanoe, Kaskaskia, Absaroka, Zuni, and Tejas supersequences. Because these maps represent time-transgressive surfaces, contours frequently delineate the composite of multiple named sedimentary formations at once. Structure contour maps on the top of the Precambrian basement surface in both the Michigan and Illinois Basins illustrate the general structural geometry which undergirds the sedimentary cover. Isopach maps of the Sauk 2 and 3, Tippecanoe 1 and 2, Kaskaskia 1 and 2, Absaroka, and Zuni sequences illustrate the broad distribution of sedimentary units in the Michigan Basin, as do isopach maps of the Sauk, Upper Sauk, Tippecanoe 1 and 2, Lower Kaskaskia 1, Upper Kaskaskia 1-Lower Kaskaskia 2, Kaskaskia 2, and Absaroka supersequences in the Illinois Basin.

Isopach contours and structure contours were formatted and attributed as GIS data sets for use in digital form as part of the U.S. Geological Survey’s ongoing effort to inventory, catalog, and release subsurface geologic data in geospatial form. This effort is part of a broad directive to develop 2D and 3D geologic information at detailed, national, and continental scales. This data approximates, but does not strictly follow, the USGS National Cooperative Geologic Mapping Program's GeMS data structure schema for geologic maps. Structure contour lines and isopach contours for each supersequence are stored within separate “IsoValueLine” feature classes. These are distributed within a geographic information system geodatabase and are also saved as shapefiles. Contour data are provided in both feet and meters to maintain consistency with the original publication and for ease of use. Nonspatial tables define the data sources used, define terms used in the dataset, and describe the geologic units referenced herein. A tabular data dictionary describes the entity and attribute information for all attributes of the geospatial data and accompanying nonspatial tables.
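As a hedged sketch of working with the shapefile form of these contour layers, the example below uses geopandas (an assumption; any GIS package that reads shapefiles would do). The file name and attribute field name are placeholders, not the actual schema of the published geodatabase.

```python
# Minimal sketch of loading a contour shapefile with geopandas.
# File name and attribute field are hypothetical placeholders.
import geopandas as gpd

contours = gpd.read_file("sauk_isopach_contours_ft.shp")   # hypothetical file name
print(contours.crs)          # coordinate reference system of the layer
print(contours.columns)      # attribute fields, e.g. the contour value in feet

# Select thick intervals by an assumed thickness attribute (feet)
thick = contours[contours["CONTOUR_FT"] >= 1000]
print(len(thick), "contour lines at or above 1000 ft")
```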